eric the espeon
maybe I just misunderstood
Background reading: X-Act's threads here, here, here, here, and here.
The early measures were reasonably good within a single metagame, but each either had some other flaw or had to include the total number of Pokemon, which made the resulting numbers of little use for comparing diversity across tiers. The most recent estimate of diversity (the last "here") is only accurate if cumulative usage follows an exponential distribution very closely, which has been shown not to always be the case. It also uses only one data point (at 5/6 of total cumulative usage), so it can never take into account all of the little kinks in the usage graph.
It has always seemed to me that using all of the usage data (like the graphs here), rather than one specific point, should be more accurate: even if the trend were extremely predictable and depended almost entirely on a single constant that could be taken as a measure of diversity, there would always be quirks in the stats. After doing some research I came across Simpson's Diversity Index, which is commonly used to measure the diversity of species in a real ecosystem and can be applied almost directly to Pokemon usage stats. Basically, you take each Pokemon's share of total usage as a fraction (not the % of teams using it, which Doug's stats give directly, but its actual share of all Pokemon used, so the shares add up to one), square it, add up the squares, and subtract that sum from one.
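As a rough sketch of the calculation just described, in Python. The function name and the example numbers are mine and made up for illustration, not real stats, and normalising the input assumes each Pokemon appears at most once per team:

```python
def simpson_diversity(usage_counts):
    """Return 1 - sum(p_i ** 2), where p_i is each Pokemon's share of usage.

    usage_counts: non-negative usage figures, one per Pokemon (raw counts or
    per-team percentages). They are normalised here so the shares sum to 1,
    which assumes a Pokemon appears at most once per team.
    """
    total = sum(usage_counts)
    if total <= 0:
        raise ValueError("need at least one Pokemon with non-zero usage")
    shares = [count / total for count in usage_counts]
    return 1.0 - sum(share ** 2 for share in shares)


# Hypothetical usage counts for a tiny four-Pokemon metagame (not real data).
example_counts = [50, 30, 15, 5]
example_index = simpson_diversity(example_counts)
print(round(example_index, 4))
```

Here the shares are 0.5, 0.3, 0.15 and 0.05, the squares add up to 0.365, and the index is 1 - 0.365 = 0.635.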
This measure does not need a constant for metagame size added in artificially, so comparisons across metagames with different numbers of Pokemon are entirely valid. It should not run into problems with extremely large metagames or small ones, or rely on data perfectly fitting any pre-determined trend.
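A quick sanity check of how the index behaves for small and large metagames, as a sketch only. The helper name, the sizes 10, 50 and 400 and the lopsided example are illustrative assumptions, not real tiers. One thing it shows: with perfectly even usage over N Pokemon the index works out to 1 - 1/N, so the ceiling sits very close to one for metagames of hundreds of Pokemon and a little lower for very small ones.

```python
def simpson_diversity(usage_counts):
    """1 minus the sum of squared usage shares (inputs are normalised)."""
    total = sum(usage_counts)
    return 1.0 - sum((count / total) ** 2 for count in usage_counts)


# Perfectly even usage over N Pokemon gives 1 - 1/N (hypothetical sizes).
even_results = {size: simpson_diversity([1] * size) for size in (10, 50, 400)}

# One Pokemon taking almost all of the usage drives the index toward 0.
dominated = simpson_diversity([997, 1, 1, 1])

for size, value in even_results.items():
    print(f"even usage over {size} Pokemon: {value:.4f}")
print(f"one dominant Pokemon: {dominated:.4f}")
```

No metagame-size constant goes in anywhere; the only inputs are the usage figures themselves.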
Anyway, what most of you will be more interested in:
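Higher value = more diverse: 1 would be perfectly diverse (a limit that is approached as usage is spread evenly over more and more Pokemon), and 0 would be the least diverse possible (every slot filled by the same Pokemon).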
Any surprises there? Explanations for the larger shifts? Is it interesting enough to be worth generating stats over longer timescales, or even graphs of comparative diversity?
Code:
Tier         April        March        February     January
OU           0.982237088  0.982308722  0.981674694  0.980915579
OU Lead      0.97101384   0.97198333   0.971747105  0.969492976
UU           0.986095532  0.985346613  0.98657257   0.985764676
UU Lead      0.967771738  0.963061923  0.964481434  0.956267244
Ubers        0.96793781   0.966426838  0.963107316  0.961904362
Ubers Lead   0.933500605  0.927127416  0.920775591  0.904700124
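To put numbers on the shifts in the table above, here is a small sketch that computes the January to April change for each tier. The values are copied exactly as posted; the variable names are just for illustration.

```python
# Diversity values copied from the table, reordered January -> April.
diversity = {
    "OU":         [0.980915579, 0.981674694, 0.982308722, 0.982237088],
    "OU Lead":    [0.969492976, 0.971747105, 0.97198333, 0.97101384],
    "UU":         [0.985764676, 0.98657257, 0.985346613, 0.986095532],
    "UU Lead":    [0.956267244, 0.964481434, 0.963061923, 0.967771738],
    "Ubers":      [0.961904362, 0.963107316, 0.966426838, 0.96793781],
    "Ubers Lead": [0.904700124, 0.920775591, 0.927127416, 0.933500605],
}

# Change in the index from the first month (January) to the last (April).
changes = {tier: values[-1] - values[0] for tier, values in diversity.items()}

# Print the tiers from the largest shift to the smallest.
for tier, change in sorted(changes.items(), key=lambda item: abs(item[1]), reverse=True):
    print(f"{tier:<11} {change:+.6f}")
```

On these numbers every tier is slightly more diverse in April than in January, with Ubers leads moving the most (about +0.029) and UU the least.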
Still working on the big pokestats sheet, which depending on interest may or may not include diversity numbers from the start of shoddy stats to now.