Hello,
I have a table with two columns. In one column there are the countries and in the other column some values (as below).
COLUMN A COLUMN B
Afghanistan 1209
Afghanistan 300
Afghanistan 214
Afghanistan 134
Afghanistan 95
Afghanistan 64
Afghanistan 50
Afghanistan 48
Afghanistan 48
Afghanistan 44
Afghanistan 44
Afghanistan 38
Afghanistan 36
Afghanistan 31
Afghanistan 29
Afghanistan 29
Afghanistan 29
Afghanistan 28
Afghanistan 25
Afghanistan 24
Afghanistan 22
Afghanistan 22
Afghanistan 21
Afghanistan 20
Afghanistan 20
Afghanistan 16
Afghanistan 15
Afghanistan 14
Afghanistan 13
Afghanistan 13
Afghanistan 13
Afghanistan 13
Afghanistan 13
Afghanistan 12
Afghanistan 12
Afghanistan 12
Afghanistan 11
Afghanistan 9
Afghanistan 9
Afghanistan 8
Afghanistan 7
Afghanistan 7
Afghanistan 6
Afghanistan 4
Afghanistan 4
Afghanistan 3
Afghanistan 2
Afghanistan 1
Afghanistan 1
Afghanistan 1
Albania 352
Albania 351
Albania 245
Albania 240
Albania 168
Albania 166
Albania 161
Albania 147
Albania 141
Albania 138
Albania 133
Albania 127
Albania 101
Albania 94
Albania 83
Albania 78
Albania 71
Albania 61
Albania 54
Albania 44
Albania 39
Albania 16
Algeria 1882
Algeria 726
Algeria 646
Algeria 528
Algeria 524
Algeria 517
Algeria 498
Algeria 427
Algeria 422
Algeria 372
Algeria 361
Algeria 319
Algeria 303
Algeria 298
Algeria 290
Algeria 288
Algeria 268
Algeria 267
Algeria 239
Algeria 238
Algeria 238
Algeria 231
Algeria 231
Algeria 222
Algeria 217
Algeria 216
Algeria 209
Algeria 191
Algeria 176
Algeria 174
Algeria 166
Algeria 164
Algeria 163
Algeria 162
Algeria 147
Algeria 139
Algeria 131
Algeria 122
Algeria 121
Algeria 120
Algeria 119
Algeria 119
Algeria 115
Algeria 111
Algeria 106
Algeria 105
Algeria 103
Algeria 98
Algeria 94
Algeria 93
Algeria 88
Algeria 87
Algeria 85
Algeria 82
Algeria 78
Algeria 78
Algeria 69
Algeria 64
Algeria 63
Algeria 62
Algeria 60
Algeria 59
Algeria 59
Algeria 56
Algeria 40
Algeria 9
Andorra 264
Angola 234
Angola 100
Angola 58
Angola 52
Angola 49
Angola 44
Angola 37
Angola 28
Angola 28
Angola 27
Angola 27
Angola 20
Angola 10
Angola 10
Angola 9
Angola 3
Anguilla 191
Antigua and Barbuda 557
Antigua and Barbuda 268
Argentina 9495
Argentina 7807
Argentina 4206
Argentina 2875
Argentina 2812
Argentina 2753
Argentina 2645
Argentina 1798
Argentina 1780
Argentina 1768
Argentina 1750
Argentina 1711
Argentina 1618
Argentina 1538
Argentina 1475
Argentina 1246
Argentina 1178
Argentina 1178
Argentina 1168
Argentina 1111
Argentina 1032
Argentina 1030
Argentina 1028
Argentina 1017
Argentina 941
Argentina 922
Argentina 920
Argentina 903
Argentina 892
Argentina 825
Argentina 768
Argentina 767
Argentina 764
Argentina 759
Argentina 742
Argentina 697
Argentina 679
Argentina 648
Argentina 628
Argentina 613
Argentina 596
Argentina 595
Argentina 515
Argentina 512
Argentina 507
Argentina 506
Argentina 487
Argentina 465
Argentina 460
Argentina 446
Argentina 438
Argentina 432
Argentina 406
The columns are sorted by country first (column A) and by descending value (Column B) secondly. The above dataset is just a subset.
I need to detect the upper outlier(s) for the dataset of each country. So for example in Afghanistan the values 1209 and possibly also 300 and 214 might qualify as outliers in the country dataset. I would like to apply a formula that detects those upper outliers for each country maybe by showing in a third column the string OUTLIER if the condition is true as below.
I could create a formula for each country but there are 200 countries and thousands of value and finding a formula that can be just copied and pasted in the third column would be much easier.
COLUMN A COLUMN B COLUMN C (formula to be copied and pasted)
Afghanistan 1209 OUTLIER
Afghanistan 300 OUTLIER
Afghanistan 214 OUTLIER
Afghanistan 134
Afghanistan 95
Afghanistan 64
Afghanistan 50
.
.
Albania 352 OUTLIER
Albania 351 OUTLIER
Albania 245
Albania 240
Albania 168
.
etc.
I guess the problems are two:
1) find the right formula to detect the outliers (first quartile and third quartile ...)
2) apply the formula to each country subgroups not to the entire column
I searched on google for "excel apply formula by group" but I could not find a suitable answer to my problem. Mostly the results show how to use pivot tables or create simple Sum formulas by group. Nothing to show how to use a complex formula able to detect outliers.
Can anybody please help me with this challenging problem?
I would really appreciate it.
World 33
PS: I would have attached an excel file with the dataset but unfortunately I do not seem to be able to by the system.
Bookmarks