Further Notes on Centralisation and Diversity

X-Act · Jun 5, 2009

Recently I have continued to research centralisation and diversity of Pokemon metagames from their usages, and am confirming that my previous research was correct.

First of all, a little history. A few months ago, I defined the diversity of a metagame as follows. I first find the minimum number of Pokemon having a nonzero probability of any two of them being together in a team. This means, for example, that if Scizor, Metagross, Tyranitar and Lucario satisfy this property, then this number would be 4. I repeat this for three, four, five and six Pokemon, thus extracting five numbers in all. The diversity is the last of these numbers, written as the sum of the increments from each number to the next.

The way these numbers are found is rather simple. You first take the percentage usages, and start summing them up together cumulatively. This is called a cumulative frequency distribution. At the points where the cumulative frequency first exceeds 1, 2, 3, 4 and 5, check the number of Pokemon used up until that summation, and that would be the minimum number of Pokemon having a nonzero probability of any 2, 3, 4, 5 and 6 of them being together in a team respectively.
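The procedure above can be sketched in a few lines of Python. This is only an illustration: the function name `diversity_numbers` and the usage figures are made up for the example and are not the real statistics. Usages are assumed to be given as fractions (0.25 for 25%), sorted from most to least used.

```python
def diversity_numbers(usages, team_size=6):
    """Return the ranks at which cumulative usage first exceeds 1, 2, ..., team_size - 1.

    `usages` is assumed to be sorted in descending order and to sum to about team_size.
    """
    numbers = []
    cumulative = 0.0
    threshold = 1
    for rank, usage in enumerate(usages, start=1):
        cumulative += usage
        # A single Pokemon may push the total past more than one threshold.
        while threshold < team_size and cumulative > threshold:
            numbers.append(rank)
            threshold += 1
    return numbers


# Made-up usages for a tiny 12-Pokemon metagame (they sum to 6).
example = [0.95, 0.85, 0.7, 0.6, 0.55, 0.5, 0.45, 0.35, 0.3, 0.3, 0.25, 0.2]
print(diversity_numbers(example))  # [2, 3, 4, 6, 9]
```

With real data, the list would simply be the month's usage column divided by 100.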

Now this cumulative frequency distribution can be plotted graphically. Let's do this for the Standard, UU, Uber and Suspect metagames of the previous month, May:

(Note that the above aren't strictly cumulative frequency distributions because they add up to 6 instead of to 1, but this can be easily rectified by dividing each percentage usage by 6.)

Let's look at the graphs above more closely, in particular where the number 1 on the vertical axis is. The violet graph (Suspect) becomes 1 when the x-value (number of Pokemon) is about 2. For Uber, this is about 2.5, for Standard this is about 5 and for UU this is about 5.5. These numbers are none other than the first of the numbers I defined at the start. The other numbers are found where the graphs become 2, 3, 4 and 5 respectively. So we conclude that the Suspect metagame was slightly less diverse than Uber was in May, while Standard and UU were much more diverse.

To illustrate, from the graphs above, the diversity numbers for Standard would be 5, 11, 21, 35 and 63, which can be written as 5 + 6 + 10 + 14 + 28 = 63. The other metagames would have the following diversity numbers:

UU: 6, 15, 27, 45, 79, which can be written as 6 + 9 + 12 + 18 + 34 = 79.
Uber: 3, 6, 10, 16, 27, which can be written as 3 + 3 + 4 + 6 + 11 = 27.
Suspect: 2, 5, 9, 14, 26, which can be written as 2 + 3 + 4 + 5 + 12 = 26.
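The decomposition into a sum is just the list of differences between consecutive diversity numbers. A small check, using the four lists of diversity numbers quoted above (the helper name `increments` is made up for this sketch):

```python
# Diversity numbers as read off the May graphs.
diversity = {
    "Standard": [5, 11, 21, 35, 63],
    "UU": [6, 15, 27, 45, 79],
    "Uber": [3, 6, 10, 16, 27],
    "Suspect": [2, 5, 9, 14, 26],
}


def increments(numbers):
    """Split cumulative diversity numbers into the step added at each threshold."""
    previous = [0] + numbers[:-1]
    return [n - p for n, p in zip(numbers, previous)]


for name, numbers in diversity.items():
    steps = increments(numbers)
    # The steps always add back up to the last diversity number.
    assert sum(steps) == numbers[-1]
    print(name + ":", " + ".join(map(str, steps)), "=", numbers[-1])
```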

More than a year ago, I had already commented that the usages of Pokemon seem to follow what is called an exponential distribution. This can be readily confirmed by looking at the graphs above. What is interesting is that I have never encountered Pokemon usages whose graph shape differed from the above. This seems to suggest that Pokemon usages following an exponential distribution is no coincidence but must follow from how the players choose their Pokemon to be on a team.

What's even more interesting is that an exponential distribution has only one parameter, called lambda (the Greek letter l). In a nutshell, the larger lambda is, the steeper its graph starts (like the Suspect and Uber graphs, which seem to have a very similar lambda value).

As we said before, the shape of the graphs alone can give us this information about diversity, and the shape of the graphs is governed by just a single value (lambda). But how can this lambda be found? One simple approximation involves the diversity value we found. The cumulative frequency distribution has its inverse equal to -ln(1-x) / lambda. Since the diversity value is at 5/6 of the cumulative frequency distribution, we can use it to find an approximate value for lambda:

Code:

Diversity ~ -ln(1-(5/6)) / lambda
          = -ln(1/6) / lambda
          = ln(6) / lambda
   lambda = ln(6) / Diversity
   lambda ~ 1.792 / Diversity

This confirms that the lambda value is a measure of centralisation, as it is inversely proportional to the measure of diversity by its definition above, and, intuitively, centralisation and diversity are inversely proportional.

Hence, the Suspect lambda value is about 1.792 / 26 = 0.0689, the Uber lambda value is about 1.792 / 27 = 0.0664, the Standard lambda value is about 1.792 / 63 = 0.0284 while the UU lambda value is about 1.792 / 79 = 0.0227.
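The arithmetic above is easy to reproduce. A minimal sketch, where the function name `lambda_from_diversity` is made up and the diversity values are the ones quoted in this post:

```python
import math

# Final diversity numbers for May, as quoted above.
diversity = {"Suspect": 26, "Uber": 27, "Standard": 63, "UU": 79}


def lambda_from_diversity(d):
    """Approximate lambda from the rank at which cumulative usage reaches 5/6."""
    return math.log(6) / d


for name, d in diversity.items():
    print(f"{name}: lambda ~ {lambda_from_diversity(d):.4f}")
```

Rounded to four decimal places, this gives the same values as the hand calculation.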

One final thing. Since the above graphs vary between 0 and 6, 3 is the value from which we can find the median. The median is the minimum number of Pokemon that contribute to half the usages (this is also equal to our third number in the diversity numbers). From the graphs above, the Suspect median is 9, the Uber median is 10, the Standard median is 21 and the UU median is 27. This means that, for example, in the May Suspect metagame, the top 9 Pokemon in the usages list contributed to more than half of the total usages, which further means that one of the top 9 Pokemon was more probable to be used than one of the remaining 490 or so Pokemon. The same statement holds for the other metagames with 9 replaced by their respective medians.
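As a rough cross-check, the inverse formula -ln(1-x) / lambda can also be evaluated at x = 1/2 instead of 5/6, which gives the median an exact exponential curve would have: ln(2) / lambda, or Diversity × ln(2) / ln(6). The sketch below compares that prediction with the third diversity number of each metagame as listed earlier. The helper name is made up, and the comparison is only an illustration of how the formulas fit together, not part of the original analysis.

```python
import math

# Diversity numbers listed earlier; the third entry is the median.
diversity = {
    "Suspect": [2, 5, 9, 14, 26],
    "Uber": [3, 6, 10, 16, 27],
    "Standard": [5, 11, 21, 35, 63],
    "UU": [6, 15, 27, 45, 79],
}


def predicted_median(final_diversity):
    """Median of an exact exponential curve whose 5/6 point is final_diversity."""
    lam = math.log(6) / final_diversity
    return math.log(2) / lam


for name, numbers in diversity.items():
    observed = numbers[2]
    print(f"{name}: observed {observed}, predicted {predicted_median(numbers[-1]):.1f}")
```

For all four metagames the predicted median comes out somewhat above the observed one, so the real curves seem to rise a little faster at the start than a single exponential pinned at the 5/6 point would.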

wildfire393 · Jun 5, 2009

Cool stuff.

One could probably argue that the suspect metagame being of a similar level of centralization to Uber means that the suspects are too centralizing, but then again anyone playing on the suspect ladder is required to use them to gain voting rights so that wouldn't stand up. It's quite interesting how close they are, though. I wonder if you could show the centralization curves for previous suspect metagames, as well as possibly the centralization curve for the last few months of Garchomp's standard usage, to get a picture of what overcentralization looks like.

Also do you have the centralization tables from May? (x y z - two of these pokemon were used on a standard team, etc). I can't seem to find them.

jimmyolsen · Jun 5, 2009

I'm unsure if it's an error in the program you used to draw the graphs, or an anomaly in the data itself, but the suspect ladder graph appears to have a different derivative than expected between x = 20 and x = 35. While I agree with you 100% that the other three curves all follow an exponential distribution, the suspect ladder seems to have a second term that is causing some sort of miscalculation. There is no apparent reason that the concavity in that interval should be close to zero, as in fact it should be at its minimum concavity in that interval.

Again, this could be attributed entirely to the program drawing the graph, however you may wish to consider the possibility of a second term.

d2m · Jun 5, 2009

Try doing this for only the top 25 and it will tell a different story, methinks. The standard and UU ladders are dominated by a small number of team configurations, and, similarly, they become extremely centralized the better the teams get.

The Uber metagame is more centralized because Ubers are few but extremely powerful; of course, Kyogre is still extremely widely used.

X-Act · Jun 5, 2009

@jimmyolsen: Actually, none of the 4 graphs above is an exact exponential distribution; they only follow it approximately. The Suspect curve happens to be the one that strays the furthest from the exact exponential curve. EDIT: Maybe the exclusion of Deoxys-S in the middle of the month period could have contributed to this.

@d2m: Was yours an answer to jimmyolsen or to me? If it was to me, I don't understand. The UU metagame contains the fewest Pokemon to choose from but is still the one having the highest diversity, while Ubers contains the most Pokemon to choose from but is one of the least diverse metagames.

Aiklap · Jun 5, 2009

Great job here, I appreciate the time and effort put into this.
Excellent to see a chart, it makes information easy to read and study.
Obviously, Uber and suspect are going to be the least diverse, otherwise it wouldn't be suspect and uber.
In fact, on suspect, you can see the huge steep slope, where the "suspect(s)" is/are, and from 2 upwards it follows the standard line (roughly) from the start.
Am I right in saying this is May's statistics?

Revolution.Z · Jun 5, 2009

Hmm, that's pretty smart, X-Act. Is this from May's or April's stats? And cheers, this is some pretty interesting and intriguing stuff. Wow, the ubers and suspect ladders are a lot less used than I thought.

X-Act · Jun 5, 2009

Yes, it is from May. I said that also in the original post.

d2m · Jun 5, 2009

X-Act said:
@jimmyolsen: Actually, none of the 4 graphs above is an exact exponential distribution; they only follow it approximately. The Suspect curve happens to be the one that strays the furthest from the exact exponential curve. EDIT: Maybe the exclusion of Deoxys-S in the middle of the month period could have contributed to this.

@d2m: Was yours an answer to jimmyolsen or to me? If it was to me, I don't understand. The UU metagame contains the fewest Pokemon to choose from but is still the one having the highest diversity, while Ubers contains the most Pokemon to choose from but is one of the least diverse metagames.

No, that's a misconception. When you have access to Ubers, there are few non-Uber pokemon that can be completely viable in the tier thanks to the large stat difference between OU and Ubers, whereas UU and NU (the largest tier) stats are extremely close, meaning many NUs are viable in UU. In OU, there are also very few UUs and NUs that are viable.

Just because there's a larger number available doesn't mean there's any incentive to use the vast majority. My other point was that the farther up the leaderboard you go, the more centralized it gets, bad and random teams skew the actual ranking.

eric the espeon · Jun 5, 2009

Brilliant, I agree with this being an effective measure (though checking lambda only from one data point at 5/6 could lead to slight irregularities, this may not be easily remedied, and so long as the graphs stick close to exponential it should be fine).

Do you intend to produce these each time the stats come out/OU list is remade, or is this a one-off?

Suspect being more centralised may have something to do with the need for using the suspects on each team to qualify, however once you get out of the top 10 where the suspects should generally lie I think that something else may take over. People are likely to be less inclined to try out rarer Pokemon if they are specifically working towards a rating goal, making the ladder more competitive and in this case more centralised around Pokemon known to be very effective.

In short, more incentive to win means people stick with the best of the best causing centralisation in Suspect.

A.P. · Jun 5, 2009

Interesting stuff.

Thanks for the insight, X-Act!

By the way, ln is log of e, right? I'm already forgetting what I learnt in Pre-Calc earlier this year...

wildfire393 · Jun 5, 2009

ap13095 said:
Interesting stuff.

Thanks for the insight, X-Act!

By the way, ln is log of e, right? I'm already forgetting what I learnt in Pre-Calc earlier this year...

ln is log base e, also known as the natural log.

Imran · Jun 5, 2009

Alternatively known as the Napierian Logarithm, which I believe was the original name.

petrie911 · Jun 5, 2009

Strictly speaking, wouldn't it be better to simply do a least-squares fit to the data to find lambda? ie, instead of using the 5/6 method on the cumulative distribution function, do an exponential fit to the probability distribution function (or a linear fit to the logarithm of said function). That should give better values for lambda. It can also give you an idea of how good the exponential fit is.

Additionally, I wonder what the fact that usage distributions are nearly always exponential tells us about how people choose Pokemon for their teams.
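The "linear fit to the logarithm" idea could look roughly like the sketch below. It assumes the individual usages follow usage ≈ A · exp(-lambda · rank), so that ln(usage) is a straight line in the rank with slope -lambda. The function name `fit_exponential` is made up, the data is synthetic rather than real ladder statistics, and the fit is an ordinary least-squares line written out by hand to keep the example dependency-free.

```python
import math


def fit_exponential(usages):
    """Least-squares line through (rank, ln(usage)); returns (lambda, r_squared).

    All usages must be positive, since their logarithm is taken.
    """
    xs = list(range(1, len(usages) + 1))
    ys = [math.log(u) for u in usages]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    syy = sum((y - mean_y) ** 2 for y in ys)
    slope = sxy / sxx
    r_squared = (sxy * sxy) / (sxx * syy)  # how well a straight line fits
    return -slope, r_squared


# Synthetic usages generated from an exact exponential with lambda = 0.03,
# so the fit should recover that value; real usage data would be noisier.
synthetic = [0.2 * math.exp(-0.03 * rank) for rank in range(1, 101)]
lam, r2 = fit_exponential(synthetic)
print(f"lambda ~ {lam:.4f}, R^2 = {r2:.4f}")
```

On real data, an R^2 noticeably below 1 would be one sign that the exponential model is only approximate.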

Tagrineth · Jun 5, 2009

d2m said:
No, that's a misconception. When you have access to Ubers, there are few non-Uber pokemon that can be completely viable in the tier thanks to the large stat difference between OU and Ubers, whereas UU and NU (the largest tier) stats are extremely close, meaning many NUs are viable in UU. In OU, there are also very few UUs and NUs that are viable.

Just because there's a larger number available doesn't mean there's any incentive to use the vast majority. My other point was that the farther up the leaderboard you go, the more centralized it gets, bad and random teams skew the actual ranking.

That's the whole point of these graphs. Ubers has the largest available pool of pokemon (everything except Arceus because he hasn't been released yet, right?) but the least diversity because the game is centralised around the few 680s and the tiny number of pokemon that can stand up to them on some level.

d2m · Jun 5, 2009

Tagrineth said:
That's the whole point of these graphs. Ubers has the largest available pool of pokemon (everything except Arceus because he hasn't been released yet, right?) but the least diversity because the game is centralised around the few 680s and the tiny number of pokemon that can stand up to them on some level.

Exactly my point. But I'm pointing out they are skewed towards more diversity by accepting every team, even the fringe, joke, and just plain bad teams. If you start to restrict it up the ladder, it gets ridiculously centralized, and I think that's what needs more attention.

Arcaseven · Jun 5, 2009

I appreciate the info, X-Act, you're fantastic as always. But for people like me who are "C average" math students and couldn't read a graph if their lives depended on it - what does the data show?

jimmyolsen · Jun 5, 2009

petrie911 said:
Strictly speaking, wouldn't it be better to simply do a least-squares fit to the data to find lambda? ie, instead of using the 5/6 method on the cumulative distribution function, do an exponential fit to the probability distribution function (or a linear fit to the logarithm of said function). That should give better values for lambda. It can also give you an idea of how good the exponential fit is.

Additionally, I wonder what the fact that usage distributions are nearly always exponential tells us about how people choose Pokemon for their teams.

The fact that it's exponential simply shows that people acknowledge the more obvious synergy of various combinations. There's absolutely no way that such a correlation could be expressed as a Normal, Laplacian, or Maxwell distribution.

I'm somewhat surprised that an exponential curve fits the data better than a Zeta or various other power-based distributions. I would have expected the cumulative distribution to have scaled much faster than it did. The fact that it didn't fit Zipf's model (which was originally based on what words get used together) is incredibly surprising.

X-Act · Jun 6, 2009

petrie911 said:
Strictly speaking, wouldn't it be better to simply do a least-squares fit to the data to find lambda? ie, instead of using the 5/6 method on the cumulative distribution function, do an exponential fit to the probability distribution function (or a linear fit to the logarithm of said function). That should give better values for lambda. It can also give you an idea of how good the exponential fit is.

Additionally, I wonder what the fact that usage distributions are nearly always exponential tells us about how people choose Pokemon for their teams.

Yes, you could do that. You could also reduce the cumulative frequency distribution equation to linear form and apply linear regression. These would probably find a better value for lambda, but the method to find lambda from the diversity has the merit of being much quicker and, more importantly, shows that diversity and lambda are inversely proportional.
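One way to read "reduce to linear form" is this: if the normalised cumulative usage is F(x) = 1 - exp(-lambda · x), then -ln(1 - F(x)) = lambda · x, a straight line through the origin. The sketch below fits that line by least squares to the only five points quoted in this thread for Standard (cumulative usage reaching 1/6, 2/6, ..., 5/6 at the diversity numbers). The function name is made up, and treating those ranks as exact crossing points is an approximation.

```python
import math


def lambda_from_cdf_points(ranks, team_size=6):
    """Fit -ln(1 - F) = lambda * rank through the origin by least squares.

    ranks[j - 1] is taken as the rank at which cumulative usage reaches j / team_size.
    """
    ys = [-math.log(1 - j / team_size) for j in range(1, len(ranks) + 1)]
    return sum(x * y for x, y in zip(ranks, ys)) / sum(x * x for x in ranks)


standard = [5, 11, 21, 35, 63]  # Standard's diversity numbers for May
print(f"regression: {lambda_from_cdf_points(standard):.4f}")
print(f"5/6 method: {math.log(6) / standard[-1]:.4f}")
```

The two estimates land close to each other here. Note that a fit through the origin gives the larger ranks more weight, so it is still influenced most by the 5/6 point.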

Miao · Jun 7, 2009

I'm curious: given past data for the metagame, this looks like a solid way to calculate the change in diversity over time. I'm not sure how much data you have archived, but seeing how the diversity of the metagame has changed (especially with the addition or removal of suspect pokemon) would be very interesting.

BaldWombat · Jun 7, 2009

I would find it interesting to see this with types instead of pokemon. People complain that OU is only Steel and Dragon. I would like to see what percentage of teams have 1, 2, etc of a certain type.

petrie911 · Jun 7, 2009

So, X-Act, in order to make that graph, I assume you have the usage data in some sort of spreadsheet or something. Any chance you could upload that somewhere so we could do some analysis of our own?

X-Act · Jun 8, 2009

Honestly, yesterday I was thinking more about this and I'm not entirely sure that this is an exponential distribution anymore. Or if it is, I'm not seeing what the lambda value signifies exactly.

Also, since this is strictly speaking a discrete distribution and not a continuous one, this is more akin to a geometric distribution rather than an exponential one, which has the cumulative distribution function 1 - (1-p)^k, but that's not the problem. The problem is the following. Since the cumulative usages c_1, c_2, c_3, ... should equal 1 - (1-p)^1, 1 - (1-p)^2, 1 - (1-p)^3, etc., this can be used to find what p is... but, unfortunately, p varies a bit too much to be supposed to be constant.

So, while what I wrote in the original post is not incorrect as such, it wouldn't be valid if the distribution is not exponential (or geometric).
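The check on p can be illustrated with the few cumulative points quoted earlier in the thread. Solving c_k = 1 - (1-p)^k for p gives p = 1 - (1 - c_k)^(1/k). The sketch below applies this to Standard's diversity numbers, taking them as the ranks where normalised cumulative usage reaches 1/6, 2/6, ..., 5/6, which is only approximately true since each is the first rank past the threshold. The function name `implied_p` is made up.

```python
def implied_p(rank, cumulative):
    """Solve cumulative = 1 - (1 - p) ** rank for p."""
    return 1 - (1 - cumulative) ** (1 / rank)


standard = [5, 11, 21, 35, 63]  # ranks where cumulative usage passes 1/6, ..., 5/6
for j, rank in enumerate(standard, start=1):
    print(f"rank {rank}: p ~ {implied_p(rank, j / 6):.4f}")
```

For these points p drifts from roughly 0.036 at the top of the list down to roughly 0.028 at rank 63, rather than staying constant.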

Blue_Tornado · Jun 8, 2009

Sorry for this sounding a bit dumb, but I didn't understand the graph. I did understand that the number means 'more versatile' when it's higher, but I didn't get what the numbers actually stand for. 1 pokemon for each 5? etc...
I would really appreciate an explanation, I'm only in middle school x_x

X-Act · Jun 9, 2009

1) Take the percentage usages from Doug's statistics.
2) Add them up cumulatively.
3) Plot the points and join.

A point (x,y) on the graph would mean "The sum of the percentage usages of all Pokemon from #1 to #x on the ladder is y".
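In code, steps 2 and 3 amount to a running sum paired with the rank. A minimal sketch, with a made-up function name and made-up usages given as fractions (the real figures would come from the usage statistics):

```python
from itertools import accumulate


def cumulative_points(usages):
    """Return (rank, cumulative usage) pairs for usages sorted from most to least used."""
    return list(enumerate(accumulate(usages), start=1))


# Made-up usages as fractions, not real ladder data.
for x, y in cumulative_points([0.5, 0.25, 0.125]):
    print(x, y)
```

The resulting pairs are the points that get joined up to draw the curve.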
