Data Official Smogon University Usage Statistics Discussion Thread, mk.2

Status
Not open for further replies.

Antar

is a Battle Server Administrator, is a Programmer, is a Super Moderator, is a Community Contributor
Official Data Miner
#1

I've cut back on the number of baselines at which I'm calculating the stats, but it's still far too much to put into one thread, so the stats will continue to be on sim.smogon.com (see top link).

The previous rules still apply, namely:

NOTE: DISCUSSION IN THIS THREAD WILL BE LIMITED TO STATISTICS CALCULATIONS, CLARIFICATIONS AND OVERALL TRENDS. DISCUSSIONS OF INDIVIDUAL POKEMON WILL BE DELETED (each Pokemon has its own thread--discuss there). POSTS THAT SIMPLY QUOTE OR REFERENCE STATISTICS WITH PERSONAL COMMENTARY WILL BE DELETED. POSTS DISCUSSING HYPOTHETICAL LOWER TIERS WILL BE DELETED.
I'll announce each month when the stats are "up."

Feel free to ask any questions you have about how things are calculated, but be sure to first check the FAQ directly below this post.

Enjoy, data junkies!

Link to previous stats discussion thread
 

Antar

Official Data Miner
#2
Frequently Asked Questions
  1. I can't load that link!
    That stinks! You must be on a network that blocks port 8080. Unfortunately, there's nothing I can do about that right now. Eventually, we might move the stats to their own dedicated web server, and that will probably fix your problem.

  2. Where are the moveset stats / metagame stats / lead stats / mega stats / changes since last month?
    Respectively, in the "moveset," "metagame," "lead," "mega" and "changes" subfolders of each month.

  3. What's this business with "Raw" and "Real?"
    Jimera0, yeah--I should really include just a bit of text at the top of each Standard Stats post...
    • Usage % : Weighted
    • Raw: Unweighted
    • "Real": Only counts the Pokemon which actually appear in battle (Doubles not supported)
    The reason for the name "real" is historic--back when I first took over the stats and then the running of PO, only the Pokemon that appeared in battle were recorded in the logs, so there was no way to actually *get* the full team stats. When I modified PO to generate logs with full team info in them, we were left with a decision regarding which stats to use, and the argument was that counting only Pokemon appearing in battle was somewhat more legit, because that corresponded to actual, or "real" usage (that argument lost out in the end).
  4. How are usage stats weighted?
    Every player on Pokemon Showdown has a skill rating for each metagame they participate in. This rating--which is different from your ladder score--is calculated using an algorithm called Glicko and consists of an estimated skill value R and an uncertainty in that estimate RD. Based on these two values, we calculate the likelihood that a given player has a "true" skill value above a certain baseline (the conventional baseline was 1500, corresponding to the "average" player). For more about ratings, read here. For more about weightings, read here. Note that, starting with the May stats, if a player has an RD greater than 100, and the baseline is above 1500, then their team is not counted in the stats. Note further that it typically only takes about 5 or 6 battles to get one's RD below 100.
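For the programmers in the thread, here is a minimal sketch of the weighting described in (4). It assumes the "true" skill is normally distributed around R with standard deviation RD; that distribution, and the function names `weight` and `team_weight`, are my own illustration rather than the exact script used to generate the stats.

```python
import math


def weight(r, rd, baseline):
    """Likelihood that a player's "true" skill is above the baseline.

    Assumption: true skill ~ Normal(mean=r, sd=rd).
    """
    return 0.5 * (1.0 + math.erf((r - baseline) / (rd * math.sqrt(2.0))))


def team_weight(r, rd, baseline):
    """Apply the RD rule from the FAQ on top of the basic weight."""
    # Teams from players with RD > 100 are not counted when the
    # baseline is above 1500.
    if rd > 100 and baseline > 1500:
        return 0.0
    return weight(r, rd, baseline)
```

Under this assumption a player sitting exactly at the baseline gets a weight of 0.5, and with a baseline of 0 every weight comes out at essentially 1, which matches the "basically unweighted" description of the 0.0 files.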

  5. How are tiers determined from usage?
    Tiers are based on a predictive algorithm designed to estimate how often a Pokemon will appear in the next month's usage statistics, based on the usage stats for the past three months (we update our standard tiers every three months). So we start by weighting the last three months' stats like this:
    Code:
    Three month usage= (20x last month + 3x month before that + 1x month before that)/24
    then the "OU" list for that metagame consists of all the Pokemon that appear on at least ~3.41% of teams, which is not as random a number as it might seem. Note that suspect tests are designed to move Pokemon into the Borderline ("BL") tiers, which, like Ubers, are not based on usage statistics.

    As for which stats are used to determine the tiers, we're currently using a baseline of 1695 for OU, 1630 for all other tiers.
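A rough sketch of the calculation in (5). The weights 20/3/1 come straight from the formula above. The derivation of the cutoff is an assumption on my part (the FAQ only says the number is not random): it is the usage at which you would have a 50% chance of meeting a Pokemon at least once in 20 battles. The names `three_month_usage`, `OU_CUTOFF` and `is_ou_by_usage` are made up for this example.

```python
def three_month_usage(last_month, month_before, two_months_before):
    """Weighted three-month usage, using the 20/3/1 weights from the FAQ."""
    return (20 * last_month + 3 * month_before + 1 * two_months_before) / 24


# Assumed rationale for the ~3.41% figure:
# solve 1 - (1 - p)**20 = 0.5 for p.
OU_CUTOFF = 1 - 0.5 ** (1 / 20)


def is_ou_by_usage(last_month, month_before, two_months_before):
    """True if the weighted usage (as a fraction of teams) meets the cutoff."""
    usage = three_month_usage(last_month, month_before, two_months_before)
    return usage >= OU_CUTOFF
```

Usage values here are fractions (0.04 means 4% of teams), so a Pokemon at 4%, 3% and 2% over the last three months lands at roughly 3.79% and makes the cut.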

  6. Why does "Illuminate" sometimes show up in the abilities section of the moveset stats for Pokemon that can't have Illuminate as an ability?
    "Illuminate" is my placeholder for "no ability," or an ability that simply isn't recognized. This kind of situation happens when Showdown glitches out and should be exceedingly rare. Note that the nature equivalent is Hardy (though all five neutral natures are also aliased to Hardy) and the item equivalent is "nothing" (though that could also correspond to no item).

  7. What's the deal with the file names?
    You'll notice that for each tier and type of analysis, there are a bunch of different files, most with names like uu-1630.0.txt. The first part of the filename is the tier, the second part is the weighting baseline (see (4)). If there's no number following the tier name, then the baseline is 1500. Also note that a baseline of 0.0 means that the stats are basically unweighted.
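If you are scripting against the stats folders, the naming rule in (7) can be sketched like this. `parse_stats_filename` is a hypothetical helper, and it expects a full filename including the extension.

```python
import os


def parse_stats_filename(name):
    """Split a stats filename into (tier, baseline).

    Follows the FAQ: "<tier>-<baseline>.<ext>", and a missing baseline
    means 1500.
    """
    stem, _ext = os.path.splitext(name)
    tier, sep, tail = stem.rpartition("-")
    if sep:
        try:
            return tier, float(tail)
        except ValueError:
            # The part after the last hyphen was not a number, so treat
            # the whole stem as the tier name.
            pass
    return stem, 1500.0
```

For example, `parse_stats_filename("uu-1630.0.txt")` gives `("uu", 1630.0)` and `parse_stats_filename("ou.txt")` gives `("ou", 1500.0)`.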

  8. How should I think about Baseline-0 vs. 1500 vs. 1630/1695 vs. 1760/1825 stats?
    • Baseline-0 (unweighted) stats represent everything in the format, no matter how lulzy the player or team. This is what you'd expect to encounter if we stopped doing matchmaking.
    • 1500 (no extension) stats represent what the average player in the metagame sees. Since Showdown's playerbase is more than just Smogonites, this is considerably "below" what the average person reading this thread sees.
    • 1630 (1695 for OU) stats represent "standard" stats, what the typical competitive player should see and be prepared for.
    • 1760 (1825 for OU) stats represent "1337" stats, what the best-of-the-best in the metagame are doing. To some extent, this is what all players should strive to be doing, but there are some Pokemon and strategies that are difficult to pull off and might require a greater amount of skill than the typical competitive player possesses.

  9. Why are the OU stats for 1695 and 1825 instead of for 1630 and 1760?
    OU, aka "Standard," is, well, our standard tier. It sees more battles than any other format and has the largest playerbase apart from randbats. It also has the smallest fraction of "competitive players" of all non-random formats, due to its prominence and easy accessibility. Since our rating systems are percentile-based (that is, a rating of x roughly corresponds to being better than y% of the ladder, rather than indicating that the player is the nth best in the metagame), that means that it's a lot easier to get a rating of 1630 in OU than it is in UU or LC. Because of that, and because OU has a larger pool of battles to work with, we can up our baseline to 1695 for the "standard" stats. Similarly, while 1760 is the usual value we use for "elite" stats (the best of the best), the number that works better for OU is 1825.

  10. What's the best way to make use of the moveset stats?
    • If you're trying to figure out what's good in a tier (in terms of movesets), 1760/1825 is probably the way to go, since that tells you what the very top players use on their Pokemon.
    • If you want to determine what the likelihood is that your opponent's Pokemon carries X move or Y item, consult the moveset stats closest to your own Glicko R rating.
    • If you're having trouble dealing with a certain Pokemon and are looking for checks/counters, consult the 1500 (or even possibly the 0) stats: the lack of "1337"ness is vastly preferred to the sheer lack of data you encounter when you get that high.
  11. Can I perform my own analyses?
    Due to privacy concerns, I can't give you access to the raw logs, but if you have background with a programming language that can parse json, take a look in the "chaos" folder of each month's stats. Those files contain all the information used to generate the moveset statistics and include a lot more data than I could feasibly put into a file.
More to come!
 

The Immortal

They Don't Want None
is a member of the Site Staff, is a Battle Server Administrator, is a Smogon Social Media Contributor, is a Community Leader, is a Programmer, is a Live Chat Contributor Alumnus, is a Tiering Contributor Alumnus
Other Metas Leader
#7
Would it be possible to combine NU alpha and NU beta? They were the same tier but for roughly half a month each.
 

Antar

Official Data Miner
#8
Why are you showing the 1825 ou stats instead of the 1760 ones
Could have sworn I said this somewhere: treat 1630 stats as "standard" stats and 1760 stats as "1337" stats for all metagames, except OU where replace those numbers with 1695 and 1825, respectively. Edit: Added as entries (8) and (9) of the FAQ.

which are what actually affect the tiers?
Which UU/RU stats are used for tiering?
tennisace, thanks. Also said this in the OP: 1695 for OU, 1630 for everything else.
Would it be possible to combine NU alpha and NU beta? They were the same tier but for roughly half a month each.
Yeah... yeah... (PITA)
 
#9
Question about the JSON files in chaos - for checks & counters, you have an array of 3 numbers for each pokemon. What do they represent?
 

Antar

Official Data Miner
#10
QxC4eva,
  • Number n of times the matchup occurred (don't count U-Turn KOs or force-outs)
  • The fraction p of times the counter got the KO or caused a switch
  • Standard deviation for the previous value: sqrt(p*(1.0-p)/n)
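To make the three numbers concrete, here is a small sketch that rebuilds them from raw counts. The function name and the idea of starting from a "successes" count are assumptions for illustration; the chaos files already store the finished triple.

```python
import math


def matchup_summary(n, ko_or_switch_count):
    """Return (n, p, sd) in the same order as the checks & counters array.

    n                  -- number of times the matchup occurred
    ko_or_switch_count -- times the counter got the KO or caused a switch
    """
    p = ko_or_switch_count / n
    sd = math.sqrt(p * (1.0 - p) / n)
    return n, p, sd
```

So a matchup seen 100 times, with 50 KOs or forced switches, gives p = 0.5 with a standard deviation of about 0.05. The same p from only 4 matchups has a standard deviation of 0.25, which is why thin data at high baselines is hard to trust.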
 
#11
In that case, what method was used to calculate the separate percentages for KO and switch out?

Also curious about the metric for teammates - in ou-1695 it says one of Aegislash's teammates is Thundurus with a score of +4.865%. What does that mean in context?
 

Antar

Official Data Miner
#12
In that case, what method was used to calculate the separate percentages for KO and switch out?
Well *I* know how many were KOs vs. switch-outs. It was just an oversight that I didn't provide it in the json. I'll try to remember to rectify that before the next update (might as well provide all the matchup data).

Also curious about the metric for teammates - in ou-1695 it says one of Aegislash's teammates is Thundurus with a score of +4.865%. What does that mean in context?
P(X|Y)-P(X), where Y is the Pokemon whose entry you're reading and X is the teammate. If Thundurus' overall usage was, say, 10%, then that means, of all teams that had Aegislash, 14.865% of them had Thundurus.
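The same arithmetic as a short sketch, with all values as fractions rather than percentages. The function names are made up for this example.

```python
def teammate_score(p_x_given_y, p_x):
    """Teammate score as defined above: P(X|Y) - P(X)."""
    return p_x_given_y - p_x


def usage_alongside(score, p_x):
    """Invert the score: the share of Y's teams that also carry X."""
    return p_x + score
```

With the made-up 10% usage from above, `usage_alongside(0.04865, 0.10)` comes out at about 0.14865, that is, 14.865% of Aegislash teams. A negative score would mean X shows up on Y's teams less often than it does overall.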
 

Antar

Official Data Miner
#14
QxC4eva, I have a whole thread about Checks vs. Counters. This would be a good thing to add to that discussion. Keep in mind, though: the more specific the definition, the less data is available. Case in point: very few Pokemon have C&C lists in the 1825 stats. That's simply because there were so few battles between two players with Glicko R above 1825.
 
#15
First of all, this is super cool and I'm glad all this data is available.

Secondly, I have a few questions about the data in the chaos folder.

1. Can't seem to figure out the scale of the number values for items/moves/abilities/etc. For instance, in items, the highest usage value I see might be 0.8ish for an uncommonly used 'mon like Lunatone, whereas for something like M-Kangaskhan (talking VGC here), I see Kangaskhanite has a value about 800. I assume that's because there are far more instances of M-Kangaskhan being used than Lunatone, but I can't figure out the exact way this is being calculated.

2. For the raw count, I assume that's how many times that Pokemon has been used in battle. If both players have the same Pokemon on their team, will that number be incremented once (because a battle includes that Pokemon), or twice (because that Pokemon is being used by two different people)?

3. I'm looking to get this data into a Java program I've been working on. Anyone have any sort of experience with that? I didn't know about json until I started looking into this, and I'd love to ask them how they went about it.
 

Antar

Official Data Miner
#16
1. Can't seem to figure out the scale of the number values for items/moves/abilities/etc.
See item (4) of the FAQ (2nd post). Short answer: values are weighted.
2. For the raw count, I assume that's how many times that Pokemon has been used in battle. If both players have the same Pokemon on their team, will that number be incremented once (because a battle includes that Pokemon), or twice (because that Pokemon is being used by two different people)?
See item (3) of the FAQ. Short answer: yes, twice.
3. I'm looking to get this data into a Java program I've been working on. Anyone have any sort of experience with that? I didn't know about json until I started looking into this, and I'd love to ask them how they went about it.
Gson is the alpha and the omega of json parsing in Java.
 
#17
Thanks for the quick answers. Managed to get everything working using Gson like you said, thanks a ton for that.

So for the weighted values, I just want to make sure I get this.
Let's say I'm looking at the value for Gale Wings on Talonflame in 1825 OU, which is approximately 2746. I also look at the value for Guts on Heracross, also in 1825 OU, about 359.
Does this mean I'm 7.6ish times more likely to encounter Talonflame with Gale Wings than I am a Heracross with Guts in a battle in that tier?

If that is the case, what would I have to do to change those two numbers to give me the percentage of that Pokemon with that ability? (So, for Talonflame, the 2746 would become something damn well near 100%, or 1). Would I just add the numbers for Gale Wings and Flame Body, and divide the number for Gale Wings by the total? How would this work for moves, where the total will be roughly (exactly?) four times higher than whatever total I'd be looking for?

Another question: Why does it seem like only some Pokemon are listed in the teammates section? Are they just not far away enough from the normal value? (Wasn't it like 10%? I remember reading about it somewhere else, can't recall now).
 

Antar

Official Data Miner
#18
Let's say I'm looking at the value for Gale Wings on Talonflame in 1825 OU, which is approximately 2746. I also look at the value for Guts on Heracross, also in 1825 OU, about 359.
Does this mean I'm 7.6ish times more likely to encounter Talonflame with Gale Wings than I am a Heracross with Guts in a battle in that tier?
If your Glicko rating is at-or-above 1825, yes.

Would I just add the numbers for Gale Wings and Flame Body, and divide the number for Gale Wings by the total?
Yes.

How would this work for moves, where the total will be roughly (exactly?) four times higher than whatever total I'd be looking for?
Sum the abilities to get the total.
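A sketch of that normalization, taking plain dicts of weighted values as they might be read out of a chaos file. The Gale Wings figure is the one quoted above; the Flame Body and move values in the usage note are invented for the example, and the function names are hypothetical.

```python
def ability_shares(abilities):
    """Turn weighted ability values into fractions of that Pokemon's usage."""
    total = sum(abilities.values())
    return {name: value / total for name, value in abilities.items()}


def move_shares(moves, abilities):
    """Turn weighted move values into fractions of that Pokemon's usage.

    Each Pokemon has one ability but up to four moves, so the ability
    total (not the move total) is the weighted count of that Pokemon.
    """
    total = sum(abilities.values())
    return {name: value / total for name, value in moves.items()}
```

For example, with `{"Gale Wings": 2746.0, "Flame Body": 54.0}` the Gale Wings share is 2746 / 2800, or about 98.1%. Move shares computed this way add up to roughly 4 across the whole moveset, not 1.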

Another question: Why does it seem like only some Pokemon are listed in the teammates section?
They should all be there. You do mean in the json, right, rather than in the txt file?
 
#19
Another one for you Antar. What do the json teammate numbers mean? They are so different to the ones on your reports I cannot make sense of them. Aegislash teammates ou-1695:

JSON
Thundurus: 2684.1450260005295
Keldeo: 1540.784457932271

Usage
Aegislash: 21.73938%
Thundurus: 18.45916%
Keldeo: 15.26924%

TXT
Thundurus: +4.865%
Keldeo: +3.954%

I think the question is how you find what P(X|Y) and P(X) is
 

Antar

Official Data Miner
#20
#22
I had a go parsing the new ou-1825 json. The only error I got was that Smeargle has "magicbounce" as one of its moves... but that aside, it seems the format was exactly the same as last month's. Did you include the extra matchup data that you said you would?
 

Antar

Official Data Miner
#23
QxC4eva, when did I say I'd include any extra data? Unless I'm thinking of something else, the only thing I ever promised was to "think about" what you proposed. As for "magicbounce" being a move, that's a PS bug, pure and simple.
 
#24
Oh, not the lure data lol. This:
In that case, what method was used to calculate the separate percentages for KO and switch out?
Well *I* know how many were KOs vs. switch-outs. It was just an oversight that I didn't provide it in the json. I'll try to remember to rectify that before the next update (might as well provide all the matchup data).
The C&C outcomes are still crammed into a single number.
 

The Immortal

They Don't Want None
Other Metas Leader
#25
For formats that cut down Pokemon, such as Battle Spot Singles, do the stats include all the Pokemon or only those selected?
 