Recently I have continued to research the centralisation and diversity of Pokemon metagames from their usage statistics, and the results so far confirm my previous research.
First of all, a little history. A few months ago, I defined the diversity of a metagame as follows. I first find the minimum number of Pokemon having a nonzero probability of any two of them being together in a team. This means, for example, that if Scizor, Metagross, Tyranitar and Lucario satisfy this property, then this number would be 4. I repeat this for three, four, five and six Pokemon, thus extracting five numbers in all. The diversity is the last of these numbers, written as the sum of the first number and the successive differences between the numbers.
The way these numbers are found is rather simple. You first take the percentage usages, from the most used Pokemon downwards, and sum them cumulatively. This is called a cumulative frequency distribution. At the points where the cumulative frequency first exceeds 1, 2, 3, 4 and 5 (that is, 100%, 200% and so on), check the number of Pokemon used up until that summation, and that would be the minimum number of Pokemon having a nonzero probability of any 2, 3, 4, 5 and 6 of them being together in a team respectively.
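The counting procedure just described can be sketched in a few lines of Python. This is only a minimal illustration: the function name `diversity_numbers` and the toy usage list are my own assumptions (the toy numbers are made up, not real usage statistics), and usages are taken as fractions, so 25% is written as 0.25.

```python
from itertools import accumulate

def diversity_numbers(usages, thresholds=(1, 2, 3, 4, 5)):
    """For each threshold, count how many of the most used Pokemon are
    needed before the running total of usage first exceeds it.

    `usages` holds usage fractions (0.25 means 25%). Since each team has
    six Pokemon, the fractions of a whole metagame sum to about 6.
    """
    ranked = sorted(usages, reverse=True)   # most used first
    running = list(accumulate(ranked))      # cumulative frequency
    counts = []
    for threshold in thresholds:
        # First position where the running total exceeds the threshold,
        # or None if the list never gets that far.
        count = next(
            (i + 1 for i, total in enumerate(running) if total > threshold),
            None,
        )
        counts.append(count)
    return counts

# Made-up toy usages (sum to 6), purely for illustration.
toy_usages = [0.75, 0.75] + [0.5] * 6 + [0.25] * 6
print(diversity_numbers(toy_usages))
```

With a real usage list, the five numbers returned would be the diversity numbers discussed here, and the last one would be the diversity itself.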
Now this cumulative frequency distribution can be plotted graphically. Let's do this for the Standard, UU, Uber and Suspect metagames of the previous month, May:
(Note that the above aren't strictly cumulative frequency distributions because they add up to 6 instead of to 1, but this can be easily rectified by dividing each percentage usage by 6.)
Let's look at the graphs above more closely, in particular where the number 1 on the vertical axis is. The violet graph (Suspect) becomes 1 when the x-value (number of Pokemon) is about 2. For Uber, this is about 2.5, for Standard this is about 5 and for UU this is about 5.5. These numbers are none other than the first of the numbers I defined at the start. The other numbers are found where the graphs become 2, 3, 4 and 5 respectively. So we conclude that the Suspect metagame was slightly less diverse than Uber was in May, while Standard and UU were much more diverse.
To illustrate, from the graphs above, the diversity numbers for Standard would be 5, 11, 21, 35 and 63, which can be written as 5 + 6 + 10 + 14 + 28 = 63. The other metagames would have the following diversity numbers:
UU: 6, 15, 27, 45, 79, which can be written as 6 + 9 + 12 + 18 + 34 = 79.
Uber: 3, 6, 10, 16, 27, which can be written as 3 + 3 + 4 + 6 + 11 = 27.
Suspect: 2, 5, 9, 14, 26, which can be written as 2 + 3 + 4 + 5 + 12 = 26.
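Each term in these sums is just the difference between two consecutive diversity numbers, so the decompositions can be checked mechanically. The helper name `increments` below is my own, and the numbers are the ones read off the graphs above.

```python
def increments(cumulative):
    """Successive differences of the diversity numbers, starting from 0."""
    previous = [0] + list(cumulative[:-1])
    return [c - p for c, p in zip(cumulative, previous)]

# Diversity numbers as read off the May graphs.
diversity = {
    "Standard": [5, 11, 21, 35, 63],
    "UU": [6, 15, 27, 45, 79],
    "Uber": [3, 6, 10, 16, 27],
    "Suspect": [2, 5, 9, 14, 26],
}

for name, numbers in diversity.items():
    steps = increments(numbers)
    # The steps must add back up to the last diversity number.
    assert sum(steps) == numbers[-1]
    print(name, ":", " + ".join(map(str, steps)), "=", numbers[-1])
```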
More than a year ago, I had already commented that the usages of Pokemon seem to follow what is called an exponential distribution. This can be readily confirmed by looking at the graphs above. What is interesting is that I have never encountered Pokemon usages whose graph shape differed from the above. This seems to suggest that Pokemon usages following an exponential distribution is no coincidence but must follow from how the players choose their Pokemon to be on a team.
What's even more interesting is that an exponential distribution has only one parameter, called lambda (the Greek letter λ). In a nutshell, the larger lambda is, the more steeply its graph rises at the start (like the Suspect and Uber graphs, which seem to have very similar lambda values).
As we said before, the shape of the graphs alone gives us this information about diversity, and the shape of each graph is governed by just a single value (lambda). But how can this lambda be found? One simple approximation involves the diversity value we found. The inverse of the cumulative frequency distribution is -ln(1-x) / lambda, where x is the cumulative frequency as a fraction of the total. Since the diversity value is read off at 5 out of 6, that is, at x = 5/6, we can use it to find an approximate value for lambda:
Diversity ~ -ln(1-(5/6)) / lambda
= -ln(1/6) / lambda
= ln(6) / lambda
lambda ~ ln(6) / Diversity
lambda ~ 1.792 / Diversity
This confirms that the lambda value is a measure of centralisation, as it is inversely proportional to the measure of diversity by its definition above, and, intuitively, centralisation and diversity are inversely proportional.
Hence, the Suspect lambda value is about 1.792 / 26 = 0.0689, the Uber lambda value is about 1.792 / 27 = 0.0664, the Standard lambda value is about 1.792 / 63 = 0.0284 while the UU lambda value is about 1.792 / 79 = 0.0227.
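The arithmetic above is easy to reproduce. The sketch below assumes the exponential model holds exactly and treats the number of Pokemon as a continuous quantity. The function names are my own, and `model_cumulative` is simply the exponential cumulative distribution scaled to the 0 to 6 range of the graphs.

```python
import math

def lambda_from_diversity(diversity):
    """Approximate lambda from the diversity value: ln(6) / Diversity."""
    return math.log(6) / diversity

def model_cumulative(n, lam):
    """Cumulative usage after n Pokemon under the exponential model,
    scaled so that it runs from 0 to 6 like the graphs."""
    return 6 * (1 - math.exp(-lam * n))

# Diversity values for May, as read off the graphs.
may_diversity = {"Suspect": 26, "Uber": 27, "Standard": 63, "UU": 79}

for name, d in may_diversity.items():
    lam = lambda_from_diversity(d)
    # By construction the model curve passes through 5 at n = Diversity.
    print(f"{name}: lambda ~ {lam:.4f}, model at n={d}: {model_cumulative(d, lam):.2f}")
```

Note that this fits lambda from a single point on the curve (where it reaches 5), so it is only a rough estimate and not a fit to the whole usage list.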
One final thing. Since the above graphs vary between 0 and 6, 3 is the value from which we can find the median. The median is the minimum number of Pokemon that contribute to half the usages (this is also equal to the third of our diversity numbers). From the graphs above, the Suspect median is 9, the Uber median is 10, the Standard median is 21 and the UU median is 27. This means that, for example, in the May Suspect metagame, the top 9 Pokemon in the usages list contributed to more than half of the total usages, which further means that a randomly picked team member was more likely to be one of the top 9 Pokemon than one of the remaining 490 or so Pokemon. The same holds for the other metagames if the number 9 is replaced by their respective medians.