# New (and hopefully correct) method of measuring centralisation in a metagame

Discussion in 'Pokémetrics' started by X-Act, Dec 12, 2008.

1. ### X-Act

Joined:
Feb 17, 2006
Messages:
4,675
This is a continuation of this thread. There, I tried to measure the centralisation of a metagame using just two numbers. This was overly simplistic, unfortunately, and, while that formula produced pretty adequate numbers, it wasn't good enough (see my penultimate post in that thread and the post before it to see why).

One of the problems was that I didn't take the number of legal Pokemon in a particular metagame into account. This is actually very important. If a metagame has 100 Pokemon and each were used, say, 1000 times, that metagame has zero centralisation. However, if another metagame has 100 Pokemon where 10 of them were used 10000 times and the other 90 weren't used at all, that metagame is much more centralised. This very simple thing is something that the previous formula didn't address at all, which is something that kind of makes me ashamed when I think about it... :|

Anyway, back to the new formula. As I said before, if all Pokemon were used equally, then the centralisation would be zero. Thus, suppose the total number of usages is U and the number of Pokemon in a metagame is P. For a completely uncentralised metagame, each of the P Pokemon would be expected to be used U/P times.

Of course, in reality, there would be Pokemon that would be used more than U/P times and others that would be used less than U/P times, and this would give us an element of centralisation depending on how far the deviation from the U/P is for each Pokemon. So I defined the centralisation of a metagame to be this deviation.

So basically, the centralisation of a metagame is almost the same as the standard deviation of all usages, except that, instead of finding the deviation from the mean of the usages, we find the deviation from the number U/P. Rearranging this equation to simplify it and multiplying it by 100 to get a more manageable number, we get:

Code:
```Central = 100 x sqrt(Sum_i(((U_i/U) - (1/P))^2))
where U_i is the usage of Pokemon i
U is the sum of all usages
P is the number of Pokemon in the metagame```
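The formula above can be sketched in Python as follows. This is only a minimal illustration: the function name `centralisation` and the two sample usage lists are assumptions made for the example (they mirror the two hypothetical 100-Pokemon metagames described earlier in this post), not actual ladder data.

```python
from math import sqrt

def centralisation(usages):
    """Return 100 * sqrt(sum_i (U_i/U - 1/P)^2), rounded to 1 decimal place.

    `usages` must contain one entry per legal Pokemon, including
    zeros for legal Pokemon that were never used.
    """
    total = sum(usages)   # U: the sum of all usages
    count = len(usages)   # P: the number of Pokemon in the metagame
    squared = sum((u / total - 1 / count) ** 2 for u in usages)
    return round(100 * sqrt(squared), 1)

# Hypothetical metagame 1: 100 Pokemon, each used 1000 times.
even = [1000] * 100
# Hypothetical metagame 2: 10 Pokemon used 10000 times each, 90 never used.
skewed = [10000] * 10 + [0] * 90

print(centralisation(even))    # 0.0
print(centralisation(skewed))  # 30.0
```

Note that unused Pokemon still contribute a (1/P)^2 term each, which is why the list has to include them as zeros.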
Using this formula, here are the centralisation numbers for all metagames from July to November:

Code:
```
Ladder        Jul   Aug   Sep   Oct   Nov
------------------------------------------
Standard     13.0  12.6  11.7  12.4  11.9
UU           12.2  11.2  10.7  11.4  12.4
Uber         21.3  19.9  19.7  20.3  20.2
Suspect        --  13.2  13.4  15.5    --
CAP            --    --  12.3    --  15.4
Little Cup     --    --  17.9    --  16.4
```
2. ### wildfire393

Joined:
Oct 3, 2008
Messages:
648
Interesting.

These numbers are quite close to the ones on the other thread.

Tell me, does this formula count ALL legal Pokemon, even NFEs? It seems to me it would make sense to discount those NFEs (and perhaps some UUs) that are truly Never Used, as they only serve to dilute the pool.

Or maybe that's just me. Clearly a 100% decentralized metagame is not possible or even anywhere near probable. A better measure of centralization might be calculated by figuring out, of some predetermined number of "viable" Pokemon, how many uses each one gets.
3. ### X-Act

Joined:
Feb 17, 2006
Messages:
4,675
It depends on whether the NFEs are legal in that metagame or not. In UU, the NFEs aren't legal (so far), so they weren't taken into account. In the other metagames, they were legal, so they were taken into account.

A 100% decentralised metagame is obviously not possible, but from these numbers, it is obvious that OU is much more centralised than Ubers even though they have almost the same amount of legal Pokemon.

Also the formula in the other thread can be used as a quick, rough way of measuring centralisation, I guess.
4. ### Griffin

Joined:
Sep 12, 2008
Messages:
332
Pokémon in Little Cup (taken from here):

Ubers/Banned (5)
OU (31 + Murkrow)
BL (58)
UU/NU (75)

32 + 58 + 75 = 165 Pokémon allowed.
5. ### X-Act

Joined:
Feb 17, 2006
Messages:
4,675
Thank you. :) I'll edit the table in the original post shortly with Little Cup centralisation information.

EDIT: Updated.
6. ### Kataphraktoi

Joined:
Mar 21, 2006
Messages:
74
I haven't closely examined the workings of this formula, but that's almost hard to believe when looking at the Uber usage stats and based on experience.
7. ### cim

Joined:
Jun 3, 2007
Messages:
5,413
I certainly hope people don't go "this month's number was bigger / smaller than before thus that pokemon centralized / decentralized"

Other than that, yay more data! Thanks!
8. ### eric the espeon

Joined:
Aug 7, 2007
Messages:
3,694
Damn, I wrote a massive thing detailing what I was trying to say and hit the "back" button by accident and it went...
But yeah, here again...

This is probably wrong but:
If you added a large number of useless and therefore virtually never used Pokemon to OU, would this formula give a much higher centralization figure, even if for the players the two metagames (OU and OU + useless Pokemon) would seem almost exactly the same?

This may seem like an extreme example, but when comparing OU and UU you see that in OU there are more levels of power below the top tier Pokemon. So the bottom of the barrel Pokemon from UU become nonviable, which is very similar to the situation outlined above. Basically, even if OU and UU had the same number of OUs, the same "usage gradient" for the most part (not at the lower end), and for the average player seemed equally centralized, this formula would show OU as more centralized because a smaller % of the legal Pokemon would be used.

This may be what you are aiming for but... IMO a better way would be to "chop off" the Pokemon below a certain % usage. This means that adding many useless Pokemon that do not really affect the environment for players would not mess with the centralization figures. It would also help when comparing things like LC to OU with very large difference in the number of Pokemon that are allowed.

Edit: hmm rereading your OP maybe that's not the best way for some odd situations but maybe there is something else?

Ok well its not as good as the first one, but rewriting stuff is frustrating for me.. What do you think?

And is there any way to account for the "statistical noise" that comes with a smaller player base (which means fewer teams, etc.)? Possibly with a separate figure from the standard centralization one.

Also I like the idea of this and hope you continue working to perfect the formula.
9. ### Antman

Joined:
Aug 7, 2008
Messages:
20
nice, but I'm confused at one part

Central = 100 x sqrt(Sum_i((U_i/U) - (1/P))^2)

Am I reading the formula wrong, or are you square rooting a square?

Also, on a side note, what is Sum_i?

I guess, are you doing this:

U_i/U - 1/P
sum_i of that
square that

or

U_i/U - 1/P
square that
then do sum_i of that
10. ### petrie911

Joined:
Aug 27, 2005
Messages:
861
I think you mean

Central = 100 x sqrt(Sum_i((U_i/U - 1/P)^2))
11. ### X-Act

Joined:
Feb 17, 2006
Messages:
4,675
I'm doing the latter, but I wrote it incorrectly in the original post of this thread. Sorry for the confusion; it's been corrected now.

To be clear, here's what I'm doing:

1) For every Pokemon usage, divide it by U, subtract 1/P from the result, and square the resulting number.
2) Sum up all these numbers together.
3) Take the square root.
4) Multiply by 100 and round it to 1 decimal place.

Indeed. Thanks, I corrected it.
12. ### FastHippo

Joined:
Dec 26, 2007
Messages:
153
Isn't OU less centralized? Doesn't zero equal no centralization? So, wouldn't a smaller value equal less centralization?
13. ### david stone

Joined:
Aug 3, 2005
Messages:
5,152
I think he just misspoke.

I agree with eric the espeon. Your method works for comparing changes in centralization within one tier (and thus with a relatively stable list of legal Pokemon), but I disagree with the methodology if it's to be used to compare between tiers.

Imagine a tier with 500 legal Pokemon. There are 60 Pokemon that make up the "OU" part of that tier, and the remaining 440 are not used at all. Imagine another tier with 100 legal Pokemon. There are 50 Pokemon that make up the "OU" part of that tier, and the remaining 50 are not used at all. Unless I've made a mistake (and I haven't actually calculated this out, but I picked numbers that ought to make this the case if I'm understanding your formula properly), the second tier would be less centralized than the first.
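The two hypothetical tiers above can be checked numerically. The sketch below assumes, purely for illustration, that usage is split evenly among the Pokemon that are used at all (the post does not specify this); the function name and the placeholder usage value of 100 are likewise assumptions.

```python
from math import sqrt

def centralisation(usages):
    """100 * sqrt(sum_i (U_i/U - 1/P)^2), with unused Pokemon listed as 0."""
    total, count = sum(usages), len(usages)
    return 100 * sqrt(sum((u / total - 1 / count) ** 2 for u in usages))

# Assumption: every used Pokemon has the same usage (100 is arbitrary).
tier_a = [100] * 60 + [0] * 440   # 500 legal, 60 used
tier_b = [100] * 50 + [0] * 50    # 100 legal, 50 used

print(round(centralisation(tier_a), 1))  # 12.1
print(round(centralisation(tier_b), 1))  # 10.0
```

Under that even-split assumption the formula reduces to 100 x sqrt(1/k - 1/P) for k used Pokemon out of P legal ones, so the second tier does come out as less centralised even though fewer distinct Pokemon see play in it.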

This is fine if we do not attempt to compare centralization between tiers. The problem is that this is exactly what people will (quite naturally) do. The first tier is more centralized, but is in fact more diverse. I think that it is actually diversity that's important, not any measure of centralization.
14. ### Luxormaniac

Joined:
Dec 1, 2008
Messages:
265
all this math makes my head hurt... and im a freaking math genius

So if centralization is the measure of a select few's usage as a proportion of the whole, why not do something like that? If something like the top 10 OU pokes are selected, their usage percentages added, and that divided by the total of the percentage use of all the other OUs, wouldn't that show basically how much those Pokemon are used in relation to the whole?

C = 100(TopTenSum/OtherSum)
Or use any other number; it doesn't have to be 10. The 100 is to clean up decimals.

I haven't really thought this through, but it makes sense to me. Comments?
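A minimal sketch of this ratio, assuming the input is simply a list of usage figures for the OU Pokemon; the function name `top_n_ratio` and the sample numbers are made up for illustration.

```python
def top_n_ratio(usages, n=10):
    """Return 100 * (sum of the n largest usages) / (sum of all the others)."""
    ranked = sorted(usages, reverse=True)
    top, rest = sum(ranked[:n]), sum(ranked[n:])
    if rest == 0:
        # With nothing outside the top n, the ratio is undefined.
        raise ValueError("no usage outside the top n")
    return 100 * top / rest

# Hypothetical data: ten Pokemon with usage 10 and twenty with usage 5.
print(top_n_ratio([10] * 10 + [5] * 20))  # 100.0
```

One thing the sketch makes visible is that the ratio has no upper bound: it grows without limit as the usage outside the top n shrinks.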

ps. when x-act has math trouble, the smogon server crashes.

i should put that in my signature.
15. ### Hipmonlee

Joined:
Dec 19, 2004
Messages:
7,376
Yeah, I am unsure about the value of including the total number of pokemon. It seems to be unnecessary information.

If you are using it to study the effect of banning or unbanning Pokemon, it seems to be an unnecessary bias toward having as few Pokemon as possible. Whereas if you are using it to compare trends in a metagame that isn't having the total number of Pokemon changed, then it is a constant and adds nothing.

Have a nice day.
16. ### X-Act

Joined:
Feb 17, 2006
Messages:
4,675
Okay, thanks for the feedback. Let me think more about this so that we get to a good method together.

From the feedback received, it seems like the first method I posted in the other thread is actually better than this?
17. ### NextDimension

Joined:
Dec 9, 2008
Messages:
6
I don't really understand it :$
18. ### Luxormaniac

Joined:
Dec 1, 2008
Messages:
265
I don't quite understand the problems here, but...
Wouldn't the sum of the usage percentages of the top ten Pokemon used for any given month be at least a part of a formula? And if we're measuring how a few are overwhelmingly more popular than others, shouldn't we compare it to something?

Maybe just the usage percentage sum of the top 10 pokes for any month. If they were used very often, it would be higher. If only they were used, it would be 100%. If it is closer to about 20%, then there would be no centralization: with roughly 51 OU pokes, one poke's usage would be about 1/51, or roughly 2%, and 2 x 10 = 20%. (I'm only taking this out of OU and not including others, like UU, since they are not used often in OU for a reason.) If it got lower than that, or near it, it would probably be time to bump stuff up to OU.

So C = usage_1 + usage_2 + ... + usage_10 (the usage percentages of the ten most used Pokemon)
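The sum above could be sketched like this, under the assumption that "usage percentage" means each Pokemon's share of the total usage within the list being considered (which is how the 1/51 figure in this post is reached). The function name `top_share` is hypothetical.

```python
def top_share(usages, n=10):
    """Return the combined share (in %) of the n most used Pokemon."""
    ranked = sorted(usages, reverse=True)
    return 100 * sum(ranked[:n]) / sum(ranked)

# Assumed baseline from the post: about 51 OU Pokemon, all used equally.
print(top_share([1] * 51))   # about 19.6, the "roughly 20%" baseline
# If only ten Pokemon were ever used, the share is 100%.
print(top_share([5] * 10))   # 100.0
```

Unlike the earlier ratio, this figure is bounded between 100 x n / P (all usages equal) and 100.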

This is what we're measuring, right? A quantification of the recurring usage of common pokes?

I'd at least like the data to try this way...
In any case, i think a simpler formula is better.
Squares and square roots and multiplication? sheesh, too hard.
19. ### X-Act

Joined:
Feb 17, 2006
Messages:
4,675
Yeah that would be another way.

I think I'm going to revert to something similar to the first method posted in the other thread. I'm working on it at this very moment.
20. ### Luxormaniac

Joined:
Dec 1, 2008
Messages:
265
Looking at the old formula, i made a fictional example with 100 OU pokemon out of 500 total.

If 10 are frequently used, then C=500/(100-10)=500/90=5.56
If 50 are frequently used, then C=500/(100-50)=500/50=10

If the latter is less centralized, why does it have a higher number?

Perhaps a discussion on centralization formula philosophy might help us all.
I'm no Philosopher, go to shoddy if you want to talk to him.

~It could be some sort of joint variation (C=kxy)
~it could be an inverse variation (C=kx/y)
~It could be something totally unexpected (C=k(x^2-xy))

But i think we should go for some type of linear function or direct variation.
Then let us think what the variables are:
~we need some variable carrying data regarding common pokemon usage
~Then we need to decide whether to just weigh that against the whole 500 or just to OU 50.

Hopefully this will help someone come up with a formula.
But if we want C to go up as F(requently used) goes down, F should probably be on the bottom of some fraction. it could be as simple as
C=k/F

Or perhaps we need some advanced statistical knowledge involving standard deviation and whatnot. I didn't bring my math book, so I can't help.
21. ### Jiggy-Ninja

Joined:
Jan 20, 2008
Messages:
401
The main beef I have with your other formulation is that it used 2 values that were attained by arbitrary means. OU was defined as Pokemon in roughly 1 in 20 teams, and frequently used as 1 in 4. I don't think there is any logical formulation you could use to derive those numbers, and with appropriate tweaking of the numbers you might be able to change how the metagame's centralizing numbers looked.

What about that thing you mentioned before that Pokemon usage closely follows a power function? Did that not work out as you had hoped?
22. ### X-Act

Joined:
Feb 17, 2006
Messages:
4,675
Okay, here's what I've been doing so far. I haven't yet found a formula, but this is still good enough to post.

I basically answered the question "how many teams do I need to look at in order for Pokemon X to have more than a 50% chance of appearing in at least one of them?" I worked this out for every Pokemon in the November list, and here are the results for the first 50 Pokemon in each metagame:

Code:
```Standard Ladder           Uber Ladder               UU Ladder
----------------------------------------------------------------------------
Scizor      2.537685298   Kyogre      1.061093162   Hitmontop   1.908564325
Heatran     2.628249958   Rayquaza    1.083068521   Clefable    2.173988733
Salamence   2.693460988   Groudon     1.159466521   Claydol     2.423209404
Tyranitar   3.466635525   Darkrai     1.443815643   Rotom       3.055073026
Zapdos      3.521405782   Dialga      1.534303703   Steelix     3.126797396
Blissey     3.86395347    Mewtwo      1.563613715   Ninetales   3.203655872
Gyarados    4.026390589   Palkia      1.723493733   Froslass    3.490355698
Lucario     4.032952016   Lugia       2.436619865   Venusaur    4.262107929
Infernape   4.233378538   Scizor      2.547991561   Absol       4.597109064
Gengar      4.317822228   Deoxys-e    2.589065218   Weezing     4.599791104
Metagross   4.553594368   Deoxys-f    3.155284931   Drifblim    5.281797513
Swampert    5.115591059   Giratina    3.850437172   Lanturn     5.772032138
Bronzong    5.402953579   Blissey     3.889977625   Altaria     5.897488981
Azelf       5.712249499   Latias      3.958602987   Toxicroak   6.197071657
Celebi      5.842242429   Mew         5.906335413   Mantine     6.650364688
Vaporeon    6.76588375    Metagross   6.287832176   Hitmonlee   6.748091827
Shaymin-s   6.787114413   Garchomp    6.80473042    Nidoking    6.784054663
Starmie     7.766964334   Bronzong    7.840761244   Swellow     6.928722296
Skarmory    7.935805322   Tyranitar   7.944675846   Blastoise   7.058321971
Suicune     7.9694465     Giratina-o  8.546574202   Hypno       7.177106615
Kingdra     8.165947352   Forretress  9.666912143   Leafeon     7.476313966
Machamp     8.533503864   Ho-oh       12.07704123   Glaceon     7.560839088
Jirachi     8.654240511   Manaphy     12.39990493   Drapion     7.814781663
Gliscor     8.699431812   Shedinja    12.79843381   Hitmonchan  8.964887075
Togekiss    8.885419169   Latios      13.03754819   Sharpedo    9.385614957
Rotom-h     9.368004491   Deoxys-l    14.63889693   Scyther     9.516680877
Snorlax     9.741070407   Shaymin-s   15.52784196   Primeape    9.602429283
Weavile     9.835937695   Wobbuffet   16.15056858   Miltank     9.700693601
Forretress  10.11114933   Jirachi     20.99342052   Kabutops    9.949258117
Electivire  10.21391429   Kingdra     22.22007185   Lapras      10.33978361
Mamoswine   10.32671058   Azelf       23.40318638   Poliwrath   10.40250624
Breloom     10.73751892   Celebi      28.66860493   Manectric   10.76780597
Hippowdon   10.84201733   Gengar      30.172124     Jumpluff    11.00898619
Tentacruel  11.26103919   Heatran     33.50490236   Probopass   11.10125223
Heracross   12.1205878    Infernape   40.84380797   Nidoqueen   11.11558012
Magnezone   12.33028754   Cloyster    44.27649745   Camerupt    11.15877958
Aerodactyl  13.41068378   Lucario     46.77548838   Lopunny     11.40985531
Rhyperior   13.45065463   Zapdos      47.93426935   Electrode   11.88887264
Flygon      13.48817075   Ninjask     49.1514752    Articuno    12.05453209
Dusknoir    13.68368609   Skarmory    54.70283226   Kangaskhan  13.28144762
Cresselia   13.70860115   Registeel   55.22218006   Persian     13.38357575
Dragonite   13.75939583   Salamence   56.84072921   Meganium    15.07023395
Jolteon     15.10485085   Heracross   57.40140283   Venomoth    15.17470246
Porygonz    15.87812346   Swampert    59.15138328   Linoone     15.29393398
Dugtrio     16.37408703   Starmie     61.01073622   Jynx        15.36096822
Rotom-w     16.74102048   Weavile     65.83669176   Grumpig     15.70494273
Ninjask     17.14478551   Magnezone   69.77624238   Golduck     16.12330888
Alakazam    18.03961269   Charizard   74.21446592   Omastar     17.02903071
Donphan     18.78762868   Gyarados    80.34285602   Shedinja    17.48551698
Empoleon    19.94265998   Electivire  88.90092666   Kingler     17.64305403```
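The post does not spell out how the figures in this table were computed. One way to get numbers of this kind, offered here only as an assumption, is to treat teams as independent and solve 1 - (1 - p)^n = 0.5 for n, where p is the probability that the Pokemon appears on a single team. The function names below are hypothetical.

```python
from math import log

def teams_needed(p):
    """Number of teams n at which the chance of seeing the Pokemon
    at least once reaches 50%, assuming independent teams:
    1 - (1 - p)**n = 0.5  =>  n = ln(0.5) / ln(1 - p)."""
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    return log(0.5) / log(1 - p)

def implied_probability(n):
    """Inverse of teams_needed: the per-team probability implied by n."""
    return 1 - 0.5 ** (1 / n)

print(teams_needed(0.5))                 # 1.0
print(implied_probability(1.061093162))  # roughly 0.48 (Kyogre's table value)
```

Under this reading, a value close to 1 corresponds to a Pokemon that is on nearly half of all teams, while a value of 4 corresponds to roughly one team in six.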
Looking at the numbers for Standard and UU, they look similar. Then look at the Ubers numbers; they're very different from those of the other two. In particular, look at the top 10 or so Uber Pokemon and the bottom few Uber Pokemon compared to the others. I haven't translated these numbers into a single centralisation factor yet, but I'm sure that these are the numbers I need to start with.
23. ### Hipmonlee

Joined:
Dec 19, 2004
Messages:
7,376
I'd like to see these numbers in graph form. I don't have any way of doing that myself, but I think it would be pretty useful.

I think intuitively we can say that Uber is an example of a metagame with a lot of centralisation. So the stats of Uber can be used as a guideline of what a more centralised metagame might look like when compared with OU or UU.

A decentralised metagame is one where the most used pokemon are not used much and the least used pokemon are used a lot. But at what point you want that changeover to be made is not clear. Like if you say "we want the top 8 pokes to not be used much but the next 7 pokes to be used lots" then obviously OU is a less centralised metagame than UU, but those figures are kinda arbitrary.

Have a nice day.
24. ### X-Act

Joined:
Feb 17, 2006
Messages:
4,675
Here they are: (the graphs were posted as images and are not reproduced here)

This is exactly why my previous definition took the number of frequently-used Pokemon F and the number of OU Pokemon O into account. As I said there, intuitively I thought that a centralised metagame has a high F and a low O while a decentralised metagame has a low F and a high O... exactly what you're saying above.

If you look at the stats above, the Standard metagame's frequently-used Pokemon (those that have a number that is 4 or less according to my definition) aren't very numerous. In fact, the UU metagame has one more Pokemon than Standard that is frequently-used. Uber, though, has a lot of frequently-used Pokemon; suffice to say that Kyogre, Rayquaza and Groudon have almost a 50% chance of being in any one Uber team (their numbers are very close to 1).

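I'm also going to say that diversity and centralisation aren't always related. I'm going to propose the following; let me quote your sentence again: "A decentralised metagame is one where the most used pokemon are not used much and the least used pokemon are used a lot."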

I'm going to split that sentence in two. I'm going to say that a decentralised metagame is one where the most used Pokemon are not used much (or, conversely, a centralised metagame is one where the most used Pokemon are used a lot), and a diverse metagame is one where the least used Pokemon are used a lot.

For example, it is clear from above that UU is the most diverse metagame of the three; however, Standard is the least centralised. This means that having a metagame that is more diverse doesn't necessarily mean that it is less centralised.

To this end, I'm going to propose that centralisation and diversity be defined as two separate things.
25. ### X-Act

Joined:
Feb 17, 2006
Messages:
4,675
I finally had a eureka moment!

I take back what I said earlier: centralisation and diversity ARE related with one another.

Suppose I look at 50 different teams of a particular metagame. The question is: how many different Pokemon would there be in those teams? If there are only 6 different Pokemon, then the metagame is the least diverse it can be (all teams are the same!). If there are 300 different Pokemon, then the metagame is the most diverse possible (all teams have different Pokemon). Numbers in between provide different levels of diversity.

We can view this question differently as follows. Suppose we look at a random team used in the Standard metagame, say. From which pool of Pokemon is it likely to be constructed?

I then remembered that I have already done this before: in the typical movesets algorithm. There, the question was: suppose we look at a random moveset of a Pokemon. From which pool of moves is it likely to be constructed? The only difference is that a moveset has 4 moves, while a Pokemon team has 6 Pokemon.

Hence we apply the same algorithm used for the movepool to the Pokemon usages, and count the Pokemon used. The algorithm for 4 moves was to list the smallest number of moves whose sum of probabilities to be in a moveset exceeds 3. Here, we count (instead of list) the smallest number of Pokemon whose sum of probabilities to be used exceeds 5.

To do this, we first need to convert every Pokemon usage to a probability of it being in a team. This is easily done: divide the usage of every Pokemon by the total number of teams used. The total number of teams is equal to the sum of all usages divided by 6, and hence the probability that Pokemon i with usage U_i belongs to a team is 6 x U_i / U, where U is the sum of all usages.
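The counting procedure described above can be sketched as follows. The function name `typical_pool_size` and the sample inputs are assumptions for illustration; the comparison is kept in whole numbers (6 x running sum against 5 x U) so that it is equivalent to checking whether the summed probabilities 6 x U_i / U exceed 5 without any rounding issues.

```python
def typical_pool_size(usages, team_size=6):
    """Count the smallest number of Pokemon, taken from most to least used,
    whose summed team probabilities (team_size * U_i / U) exceed team_size - 1."""
    total = sum(usages)   # U: the sum of all usages
    running = 0
    for count, usage in enumerate(sorted(usages, reverse=True), start=1):
        running += usage
        # Same test as team_size * running / total > team_size - 1.
        if team_size * running > (team_size - 1) * total:
            return count
    return len(usages)    # only reached if every usage is zero

# Hypothetical check: 12 equally used Pokemon each have probability 0.5,
# so ten of them sum to exactly 5 and the eleventh pushes the sum past it.
print(typical_pool_size([100] * 12))  # 11
# If only six Pokemon are ever used, every team is the same six.
print(typical_pool_size([100] * 6))   # 6
```

Because all the probabilities add up to 6, the threshold of 5 is always crossed before the list runs out as long as at least one Pokemon has non-zero usage.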

When I did this to the November Standard, Underused and Uber ladders, summed the probabilities up, stopping exactly when the sum exceeded 5, and counted the number of probabilities I summed up, the results were: