
# Correlation Analysis of Pokemon Usages

Discussion in 'Pokémetrics' started by X-Act, May 17, 2009.

1. ### X-Act
np: Biffy Clyro - Shock Shock

Joined:
Feb 17, 2006
Messages:
4,675
Here is a further statistic that I like to crank out from time to time. This time it regards looking for correlation in Pokemon usages.

What is correlation, though? Two Pokemon's usages are said to correlate if an increase or decrease in the usage of one of them is accompanied by an increase or decrease in the usage of the other. There are two types of correlation: direct correlation and inverse correlation.

Two Pokemon's usages are said to exhibit direct correlation if, when the usage of the first Pokemon increases, the usage of the second one increases as well by roughly the same margin, and vice-versa. Two Pokemon's usages are said to exhibit inverse correlation if, when the usage of the first Pokemon increases, the usage of the second one decreases by roughly the same margin, and vice-versa.

In statistics, there is an important measure of correlation called Pearson's product-moment correlation coefficient. I didn't use this method of correlation, however; I used a simpler method. What I did was simply to subtract the month-to-month increase or decrease of one Pokemon's usage over the last 6 months from the corresponding increase or decrease in usage of the other Pokemon. If these differences are quite small, the Pokemon exhibit usage correlation.

I analysed the top 100 Pokemon used in the Standard metagame and the top 50 Pokemon used in the Uber metagame. The UU metagame was left out because we don't yet have enough statistics for it to support a good correlation analysis.

Here are the results.

Standard Metagame Correlation:
Code:
```Abomasnow
Direct: Froslass, Moltres
Inverse: Rotom-H

Absol
Direct: None
Inverse: None

Aerodactyl
Direct: None
Inverse: Crobat, Forretress, Skarmory

Alakazam
Direct: Mismagius
Inverse: Umbreon

Ambipom
Direct: None
Inverse: None

Arcanine
Direct: Honchkrow, Registeel, Sceptile
Inverse: Dugtrio, Gardevoir, Magmortar

Azelf
Direct: Weavile
Inverse: None

Azumarill
Direct: Blastoise
Inverse: None

Blastoise
Direct: Azumarill, Porygon2
Inverse: None

Blaziken
Direct: Rampardos
Inverse: None

Blissey
Direct: None
Inverse: None

Breloom
Direct: None
Inverse: None

Bronzong
Direct: Heatran, Suicune
Inverse: None

Cacturne
Direct: Froslass
Inverse: Crobat

Celebi
Direct: None
Inverse: None

Charizard
Direct: None
Inverse: None

Claydol
Direct: None
Inverse: None

Clefable
Direct: Walrein
Inverse: None

Cresselia
Direct: None
Inverse: None

Crobat
Direct: None
Inverse: Aerodactyl, Cacturne, Electrode, Kabutops

Donphan
Direct: None
Inverse: None

Dragonite
Direct: None
Inverse: None

Dugtrio
Direct: Magmortar
Inverse: Arcanine, Staraptor

Dusknoir
Direct: None
Inverse: None

Electivire
Direct: None
Inverse: None

Electrode
Direct: Kabutops
Inverse: Crobat

Empoleon
Direct: Jolteon
Inverse: Rhyperior, Rotom-W

Espeon
Direct: None
Inverse: None

Flygon
Direct: None
Inverse: None

Forretress
Direct: Skarmory
Inverse: Aerodactyl

Froslass
Direct: Abomasnow, Cacturne
Inverse: None

Direct: None
Inverse: None

Gardevoir
Direct: Magmortar
Inverse: Arcanine

Gengar
Direct: Skarmory
Inverse: None

Glaceon
Direct: None
Inverse: None

Gliscor
Inverse: None

Direct: Starmie
Inverse: None

Heatran
Direct: Bronzong
Inverse: Infernape

Heracross
Direct: None
Inverse: None

Hippowdon
Direct: None
Inverse: None

Hitmontop
Direct: None
Inverse: None

Honchkrow
Direct: Arcanine, Registeel, Sceptile, Typhlosion
Inverse: None

Houndoom
Direct: None
Inverse: None

Infernape
Direct: None
Inverse: Heatran

Jirachi
Direct: None
Inverse: None

Jolteon
Direct: Empoleon
Inverse: Rotom-W, Zapdos

Kabutops
Direct: Electrode
Inverse: Crobat

Kingdra
Direct: None
Inverse: None

Lanturn
Direct: None
Inverse: None

Latias
Direct: None
Inverse: None

Lucario
Direct: None
Inverse: Magnezone

Ludicolo
Direct: Tangrowth

Machamp
Direct: None
Inverse: None

Magmortar
Direct: Dugtrio, Gardevoir, Medicham
Inverse: Arcanine

Magnezone
Direct: None
Inverse: Lucario

Mamoswine
Direct: None
Inverse: None

Medicham
Direct: Magmortar, Spiritomb
Inverse: None

Metagross
Direct: None
Inverse: None

Milotic
Direct: None
Inverse: None

Mismagius
Direct: Alakazam
Inverse: Umbreon

Moltres
Direct: Abomasnow
Inverse: None

Direct: Gliscor, Smeargle
Inverse: None

Porygon2
Direct: Blastoise, Sceptile
Inverse: None

Porygon-Z
Direct: None
Inverse: Ludicolo

Raikou
Direct: Rampardos
Inverse: None

Rampardos
Direct: Blaziken, Raikou
Inverse: None

Registeel
Direct: Arcanine, Honchkrow
Inverse: None

Rhyperior
Direct: None
Inverse: Empoleon

Direct: None
Inverse: Ludicolo, Rotom-C

Rotom-C
Direct: None

Rotom-H
Direct: None
Inverse: Abomasnow

Rotom-W
Direct: None
Inverse: Empoleon, Jolteon

Salamence
Direct: None
Inverse: None

Sceptile
Direct: Arcanine, Honchkrow, Porygon2
Inverse: None

Scizor
Direct: None
Inverse: None

Shaymin
Direct: None
Inverse: Snorlax, Vaporeon

Shuckle
Direct: Walrein
Inverse: None

Skarmory
Direct: Forretress, Gengar
Inverse: Aerodactyl

Slowbro
Direct: None
Inverse: None

Smeargle
Inverse: None

Snorlax
Direct: None
Inverse: Shaymin

Spiritomb
Direct: Medicham
Inverse: None

Staraptor
Direct: None
Inverse: Dugtrio

Starmie
Inverse: None

Suicune
Direct: Bronzong
Inverse: None

Swampert
Direct: None
Inverse: None

Tangrowth
Direct: Ludicolo
Inverse: None

Tentacruel
Direct: None
Inverse: None

Togekiss
Direct: None
Inverse: None

Torterra
Direct: None
Inverse: None

Typhlosion
Direct: Honchkrow
Inverse: None

Tyranitar
Direct: None
Inverse: None

Umbreon
Direct: None
Inverse: Alakazam, Mismagius, Weezing

Uxie
Direct: None
Inverse: None

Vaporeon
Direct: None
Inverse: Shaymin

Walrein
Direct: Shuckle, Clefable
Inverse: None

Weavile
Direct: Azelf
Inverse: None

Weezing
Direct: None
Inverse: Umbreon

Yanmega
Direct: None
Inverse: None

Zapdos
Direct: None
Inverse: Jolteon```
Uber Metagame Correlation:
Code:
```Blissey
Direct: None
Inverse: None

Bronzong
Direct: None
Inverse: None

Celebi
Direct: None
Inverse: None

Darkrai
Direct: None
Inverse: None

Deoxys-S
Direct: Groudon
Inverse: None

Deoxys-A
Direct: None
Inverse: None

Deoxys-D
Direct: Gengar
Inverse: None

Dialga
Direct: None
Inverse: None

Forretress
Direct: None
Inverse: None

Garchomp
Direct: None
Inverse: None

Gengar
Direct: Deoxys-D
Inverse: None

Giratina
Direct: None
Inverse: None

Giratina-O
Direct: Weavile
Inverse: Metagross

Groudon
Direct: Deoxys-S
Inverse: None

Direct: Manaphy
Inverse: None

Heatran
Direct: None
Inverse: None

Heracross
Direct: None
Inverse: None

Ho-oh
Direct: Quagsire
Inverse: Primeape, Toxicroak

Infernape
Direct: Manaphy
Inverse: None

Jirachi
Direct: None
Inverse: None

Jumpluff
Direct: None
Inverse: None

Kingdra
Direct: None
Inverse: None

Kyogre
Direct: None
Inverse: None

Latias
Direct: Lugia
Inverse: None

Latios
Direct: None
Inverse: None

Lucario
Direct: None
Inverse: None

Ludicolo
Direct: None
Inverse: None

Lugia
Direct: Latias
Inverse: None

Manaphy
Inverse: Primeape, Smeargle

Metagross
Direct: None
Inverse: Giratina-O

Mew
Direct: None
Inverse: None

Mewtwo
Direct: None
Inverse: None

Palkia
Direct: None
Inverse: None

Primeape
Direct: Smeargle
Inverse: Ho-oh, Manaphy

Quagsire
Direct: Ho-oh
Inverse: None

Rayquaza
Direct: None
Inverse: None

Registeel
Direct: None
Inverse: None

Direct: None
Inverse: None

Salamence
Direct: None
Inverse: None

Scizor
Direct: None
Inverse: None

Shaymin-S
Direct: None
Inverse: None

Shedinja
Direct: None
Inverse: None

Skarmory
Direct: None
Inverse: None

Smeargle
Direct: Primeape
Inverse: Manaphy

Starmie
Direct: None
Inverse: None

Toxicroak
Direct: None
Inverse: Ho-oh

Tyranitar
Direct: None
Inverse: None

Weavile
Direct: Giratina-O
Inverse: None

Wobbuffet
Direct: None
Inverse: None```
Finally, here are some graphs that confirm a few of the above correlations:

Heatran and Bronzong exhibit direct correlation in the Standard metagame.

Empoleon and Rhyperior exhibit inverse correlation in the Standard metagame.

Jolteon and Zapdos exhibit inverse correlation in the Standard metagame.

Manaphy and Infernape exhibit direct correlation in the Uber metagame.

Deoxys-D and Gengar exhibit direct correlation in the Uber metagame.

Metagross and Giratina-O exhibit inverse correlation in the Uber metagame.
2. ### Seymor

Joined:
Apr 28, 2009
Messages:
15
Graphs could use axis labels.
3. ### eric the espeon
maybe I just misunderstood

Joined:
Aug 7, 2007
Messages:
3,694
Could you show how strong or otherwise the correlations are? This could allow your system to detect more slight correlations without exaggerating them.

Very cool stats. It's interesting to see that, because of the massive amount of sample data, you can see correlations between some rare Pokemon that would only actually encounter each other in a small % of matches (like Umbreon and Zam).

It also seems that a lot of the Pokemon with noticeable correlations are leads. Could you possibly do a lead correlation analysis?
4. ### X-Act

About axis labels: the vertical axis is percentage usage, while the horizontal axis is:

1 - Nov, 2 - Dec, 3 - Jan, 4 - Feb, 5 - Mar, 6 - Apr

The correlation formula indicates how much they correlate as a number. The nearer the number is to zero, the better the correlation is. I decided not to show any numbers so that I don't confuse the new Smogon user. I could provide the formulae though.

I might do a lead correlation analysis, yes - however, not in the near future. It didn't take me long to implement this (in Excel), but it did take me a good deal of time to perfect the formulae.
5. ### eric the espeon

Then how about a "simple" table for the less math inclined (what you have done so far), as well as one with the correlation displayed as a number for those who want more in depth info (and more sensitive to slight correlations)?
6. ### X-Act

eric, I could just as well post the 25,000 numbers (that's twenty five thousand) that signify the correlation of each Pokemon with each other. There will basically be two 100x100 tables for OU, and two 50x50 tables for Ubers. I don't know if people _really_ want to browse through a 100x100 table of numbers, which is why I didn't post it (and why Excel is excellent for doing this).

I guess that if people want to see it, I'd just upload the Excel sheet for people to download if they feel like. I'll do that after I return home from work, though.
7. ### eric the espeon

mm, that would be over the top (though I suppose some people may like it) but a more sensitive readout than given in the OP would be nice. I mean, looking at the top 50 Ubers a vast majority of them are:
Direct: None
Inverse: None
And the same can be said for a good portion of OU, when probably most of them have some weaker correlation with at least a few Pokemon that has not shown up. Making the tables less sensitive could mean that statistical noise creates some odd results, but so long as you can see how strong the correlation is you can judge for yourself which to pay the most attention to.
8. ### X-Act

That's why I might as well provide the correlation for every Pokemon with every Pokemon... because I don't know how much I am going to slacken the criterion to allow two Pokemon to show correlation.
9. ### eric the espeon

The problem with just having the entire table could be that most Pokemon have a level of correlation with each other that is not separable from the statistical noise. The ideal situation would be to have a table of all correlations that you can set a "threshold" for notable correlation yourself, and it shows you all the correlations stronger than that value. But having you pick a single (more lenient) value and including the strength of correlation would be awesome.

Also, why so few comments? This seems like a pretty interesting set of data already for OU/Ubers players.
10. ### Arzamo

Joined:
Aug 15, 2007
Messages:
317
I find it strange that none of these are exhibiting the predator-prey relationships that occur in real-life ecology. If you have the time, X-Act, maybe you could account for some time delay for some of the obvious ones?

The statistics are interesting, but they make me want to form possibly nonexistent reasons for the correlations. The only reason I can come up with right now for the inverse correlations without time delay is that two Pokémon fill the same niche in the metagame, and that explanation only works for a few, e.g. Jolteon and Zapdos, Weezing and Umbreon, Electrode/Aerodactyl/Crobat. And even those only very roughly fill the same niche.
11. ### zarator
^_^ Moderator

Joined:
Mar 12, 2008
Messages:
5,006
It's all very interesting, although I think that - as every statistician should remember^^ - correlation is not causation. For example, there's little to no conceptual link (counter/team partner and so on) between Blaziken and Rampardos IMO
12. ### Arzamo

Even though correlation is not necessarily causation, a true correlation always has some explanation (at least in the vast majority of cases). It might be the predator-prey relationship I discussed in my previous post, which is a direct causation, or it may just be a third, unknown Pokémon or set of Pokémon causing the correlation. The confounding factor - the third Pokémon - could also explain correlations between two Pokémon that fill the same niche.
13. ### X-Act

Okay, I'm going to post the criterion I used for two Pokemon to show 'enough' correlation.

Say Pokemon A has percentage usages a_1, a_2, a_3, ..., a_n from months 1, 2, 3, ..., n in chronological order, while Pokemon B has percentage usages b_1, b_2, b_3, ..., b_n. My definitions for direct and inverse correlation were:

Code:
```Direct Correlation = abs((a_1 + b_2) - (a_2 + b_1)) + abs((a_2 + b_3) - (a_3 + b_2)) + ... + abs((a_(n-1) + b_n) - (a_n + b_(n-1)))

Inverse Correlation = abs((a_2 + b_2) - (a_1 + b_1)) + abs((a_3 + b_3) - (a_2 + b_2)) + ... + abs((a_n + b_n) - (a_(n-1) + b_(n-1)))

where abs(x) is the absolute value of x```
The nearer the correlation is to 0, the more correlation is exhibited. A value of 0 shows perfect correlation.
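For anyone who wants to try this outside a spreadsheet, here is a minimal Python sketch of the two sums above. The function names `direct_score` and `inverse_score` and the usage figures are made up for illustration; they are not taken from the actual usage statistics.

```python
def _check(a, b):
    # Both series must cover the same months, and at least two of them.
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length series of at least 2 months")


def direct_score(a, b):
    """Sum of abs((a_i + b_(i+1)) - (a_(i+1) + b_i)) over consecutive months."""
    _check(a, b)
    return sum(abs((a[i] + b[i + 1]) - (a[i + 1] + b[i]))
               for i in range(len(a) - 1))


def inverse_score(a, b):
    """Sum of abs((a_(i+1) + b_(i+1)) - (a_i + b_i)) over consecutive months."""
    _check(a, b)
    return sum(abs((a[i + 1] + b[i + 1]) - (a[i] + b[i]))
               for i in range(len(a) - 1))


# Hypothetical monthly usage percentages (not real data).
a = [10.0, 11.0, 12.5, 12.0]
b = [5.0, 6.0, 7.5, 7.0]   # rises and falls in lockstep with a
c = [8.0, 7.0, 5.5, 6.0]   # mirrors a: falls when a rises

print(direct_score(a, b))   # 0.0, perfect direct correlation
print(inverse_score(a, c))  # 0.0, perfect inverse correlation
print(direct_score(a, c))   # 6.0, nowhere near direct correlation
```

Note that each term of the direct sum is just the difference between the two monthly changes, and each term of the inverse sum is their total, which is why lockstep series score 0 on the first and mirrored series score 0 on the second.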

The average change between a month usage and the next, m_a, for Pokemon A is defined as follows:
Code:
`m_a = (abs(a_2 - a_1) + abs(a_3 - a_2) + ... + abs(a_n - a_(n-1))) / (n-1)`
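As a quick sketch of that average in Python (the function name `mean_monthly_change` and the sample numbers are assumptions for illustration only):

```python
def mean_monthly_change(usage):
    """Average absolute change between one month's usage and the next."""
    if len(usage) < 2:
        raise ValueError("need at least 2 months of usage")
    steps = [abs(usage[i + 1] - usage[i]) for i in range(len(usage) - 1)]
    return sum(steps) / len(steps)


# Hypothetical usage percentages over four months (not real data).
a = [10.0, 11.0, 12.5, 12.0]
print(mean_monthly_change(a))  # (1.0 + 1.5 + 0.5) / 3 = 1.0
```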
Finally, Pokemon A was said to show 'enough' correlation with Pokemon B if
Code:
`[Direct/Inverse] Strong Correlation < max(m_a / 2, m_b / 2)`
Now what I could do to slacken the above definition is
Code:
`[Direct/Inverse] Weak Correlation < max(m_a, m_b)`
I'll try this and see what happens.
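Putting the pieces together, here is a rough Python sketch of how the strong and weak tests could be run over every pair of Pokemon. It assumes a single `factor` parameter (0.5 for the strong test, since max(m_a / 2, m_b / 2) equals 0.5 * max(m_a, m_b), and 1.0 for the weak one); the names `classify` and `correlated_pairs` and the usage figures are invented for the example and are not from the Excel sheet.

```python
from itertools import combinations


def monthly_changes(usage):
    """Month-to-month differences of a usage series."""
    return [later - earlier for earlier, later in zip(usage, usage[1:])]


def classify(a, b, factor=0.5):
    """Return (kind, score), where kind is 'direct', 'inverse' or None."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length series of at least 2 months")
    da, db = monthly_changes(a), monthly_changes(b)
    # Rearranged forms of the two sums: each direct term is |da - db|,
    # each inverse term is |da + db|.
    direct = sum(abs(x - y) for x, y in zip(da, db))
    inverse = sum(abs(x + y) for x, y in zip(da, db))
    m_a = sum(abs(x) for x in da) / len(da)
    m_b = sum(abs(y) for y in db) / len(db)
    threshold = factor * max(m_a, m_b)
    if direct < threshold:
        return "direct", direct
    if inverse < threshold:
        return "inverse", inverse
    return None, min(direct, inverse)


def correlated_pairs(usages, factor=0.5):
    """List (name_a, name_b, kind, score) for every pair passing the test."""
    found = []
    for (name_a, a), (name_b, b) in combinations(usages.items(), 2):
        kind, score = classify(a, b, factor)
        if kind is not None:
            found.append((name_a, name_b, kind, score))
    return sorted(found, key=lambda row: row[3])  # strongest first


# Hypothetical usage percentages (not real data).
usages = {
    "A": [10.0, 11.0, 12.5, 12.0],
    "B": [5.0, 6.0, 7.5, 7.0],   # lockstep with A
    "C": [5.0, 6.3, 7.5, 7.2],   # close to A, but not exact
    "D": [8.0, 7.0, 5.5, 6.0],   # mirror image of A
}

for row in correlated_pairs(usages, factor=0.5):
    print("strong:", row)
for row in correlated_pairs(usages, factor=1.0):
    print("weak:  ", row)
```

With these made-up numbers the strong test picks up only the exact matches, while the weak test also lets C through. Keeping the score in each row means a reader can judge how much weight to give each pair.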
14. ### Imran

Joined:
May 3, 2008
Messages:
1,318
X-Act, you say that you did not provide a PMCC value. I do not want to pile more onto your plate, but would it be too hard to generate a list of "correlation partners" for each Pokemon? If we could find the |PMCC| values of each Pokemon with respect to the top 30 or 40 Pokemon in each tier, and then order them from largest to smallest (displaying the sign of each PMCC value in the list, of course), this would give us an easier way to quantify how well each Pokemon correlates with another, and I believe it would be very useful for analysis writing and suspect nominations. From what I remember, PMCC is just a function of three "variance-type" calculations, but I do not know how complicated it would be to write a script that could calculate these and order them for you. It might be worth looking into. At the very least, it would mean that we wouldn't necessarily need this arbitrary threshold value, and the PMCC is easy to interpret (being simply a value where |x| ≤ 1).
15. ### X-Act

First of all, PMCC is not simpler to interpret than my method. For PMCC, the nearer the number is to 1 or -1, the stronger the correlation, but where are you going to take the cut-off? Is it +/-0.9? +/-0.8?

Secondly, I did apply PMCC to the Pokemon usages, and found that the correlation generated wasn't as good as I wanted it to be. Basically, my idea of perfect correlation is: if a Pokemon increased by p% usage from one month to the next, the other Pokemon usage should increase or decrease by that same amount. I found that PMCC doesn't cater for this enough, so I discarded it.
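To illustrate that last point with made-up numbers (not the real usage data), here is a small Python sketch in which one Pokemon rises by 1% a month and another by 2% a month. PMCC calls this perfect correlation, while the sum-of-differences criterion rejects it because the changes are not the same size. The helper names are assumptions for the example.

```python
from math import sqrt


def pearson(a, b):
    """Pearson's product-moment correlation coefficient of two series."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def direct_score(a, b):
    """Sum of abs((a_i + b_(i+1)) - (a_(i+1) + b_i)) over consecutive months."""
    return sum(abs((a[i] + b[i + 1]) - (a[i + 1] + b[i]))
               for i in range(len(a) - 1))


def mean_monthly_change(usage):
    """Average absolute change between one month's usage and the next."""
    steps = [abs(usage[i + 1] - usage[i]) for i in range(len(usage) - 1)]
    return sum(steps) / len(steps)


# Hypothetical usage percentages (not real data).
a = [10.0, 11.0, 12.0, 13.0]  # +1 each month
b = [5.0, 7.0, 9.0, 11.0]     # +2 each month

r = pearson(a, b)
score = direct_score(a, b)
strong_cutoff = max(mean_monthly_change(a) / 2, mean_monthly_change(b) / 2)

print(r)                      # 1.0: PMCC sees a perfect straight-line fit
print(score, strong_cutoff)   # 3.0 1.0
print(score < strong_cutoff)  # False: the changes differ in size
```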