
Data Official Smogon University Usage Statistics Discussion Thread, mk.2

Discussion in 'Smogon Metagames' started by Antar, Jun 4, 2014.

Thread Status:
Not open for further replies.
  1. Antar

    Antar
    is a Battle Server Administrator, a Programmer, a Super Moderator, and a Community Contributor
    Official Data Miner

    Joined:
    Feb 17, 2010
    Messages:
    3,885

    I've cut back on the number of baselines at which I'm calculating the stats, but it's still far too much to put into one thread, so the stats will continue to be on sim.smogon.com (see top link).

    The previous rules still apply, namely:

    and I'll announce each month when the stats are "up."

    Feel free to ask any questions you have about how things are calculated, but be sure to first check the FAQ directly below this post.

    Enjoy, data junkies!

    Link to previous stats discussion thread
    Last edited: Dec 1, 2014
  2. Antar

    Frequently Asked Questions
    1. I can't load that link!
      That stinks! You must be on a network that blocks port 8080. Unfortunately, there's nothing I can do about that right now. Eventually, we might move the stats to their own dedicated web server, and that will probably fix your problem.

    2. Where are the moveset stats / metagame stats / lead stats / mega stats / changes since last month?
      Respectively, in the "moveset," "metagame," "lead," "mega" and "changes" subfolders of each month.

    3. What's this business with "Raw" and "Real?"
    4. How are usage stats weighted?
      Every player on Pokemon Showdown has a skill rating for each metagame they participate in. This rating--which is different from your ladder score--is calculated using an algorithm called Glicko and consists of an estimated skill value R and an uncertainty in that estimate RD. Based on these two values, we calculate the likelihood that a given player has a "true" skill value above a certain baseline (the conventional baseline was 1500, corresponding to the "average" player). For more about ratings, read here. For more about weightings, read here. Note that, starting with the May stats, if a player has an RD greater than 100, and the baseline is above 1500, then their team is not counted in the stats. Note further that it typically only takes about 5 or 6 battles to get one's RD below 100.
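A minimal sketch of the weighting described in (4), assuming the Glicko estimate is treated as a normal distribution with mean R and standard deviation RD. The function name `weighting` and the normal-CDF form are assumptions made for illustration; the post only says that the weight is the likelihood of the "true" skill being above the baseline.

```python
import math

def weighting(r, rd, baseline):
    """Chance that a player's true skill is above `baseline`.

    Assumption: the skill estimate is normal with mean r and std dev rd.
    """
    # Rule from the FAQ: teams from players with RD > 100 are not counted
    # when the baseline is above 1500.
    if rd > 100 and baseline > 1500:
        return 0.0
    return 0.5 * (1.0 + math.erf((r - baseline) / (rd * math.sqrt(2.0))))

print(weighting(1500, 50, 1500))   # player sitting exactly at the baseline
print(weighting(1700, 130, 1695))  # RD too high for a baseline above 1500
print(weighting(1500, 130, 0))     # baseline 0: effectively unweighted
```

Under this reading, each team's contribution to a usage count is multiplied by its player's weight, which is why a baseline of 0 behaves like raw, unweighted counting.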

    5. How are tiers determined from usage?
      Tiers are based off a predictive algorithm designed to estimate how often a Pokemon will appear in the next month's usage statistics, based on the usage stats for the past three months (we update our standard tiers every three months). So we start by weighting the last three months' stats like this:
      Code:
      Three-month usage = (20 x last month + 3 x the month before that + 1 x the month before that one) / 24
      Then the "OU" list for that metagame consists of all the Pokemon that appear on at least ~3.41% of teams, which is not as random a number as it might seem. Note that suspect tests are designed to move Pokemon into the Borderline ("BL") tiers, which, like Ubers, are not based on usage statistics.
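The weighted average and the cutoff can be sketched as below. The usage figures are made up for illustration. The expression for the cutoff is one reading of where ~3.41% comes from (the usage at which you have a 50% chance of meeting a Pokemon at least once in 20 battles); the post itself does not spell this out, so treat it as an assumption.

```python
def three_month_usage(last, previous, oldest):
    """Weighted usage over three months; the most recent month dominates."""
    return (20 * last + 3 * previous + 1 * oldest) / 24

# Assumed derivation of the ~3.41% cutoff: solve 1 - (1 - p)**20 = 0.5 for p.
cutoff = 1 - 0.5 ** (1 / 20)

# Hypothetical usage fractions for one Pokemon, newest month first.
usage = three_month_usage(0.040, 0.030, 0.020)
is_ou = usage >= cutoff

print(f"cutoff = {cutoff:.4%}, usage = {usage:.4%}, OU by usage: {is_ou}")
```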

      As for which stats are used to determine the tiers, we're currently using a baseline of 1695 for OU, 1630 for all other tiers.

    6. Why does "Illuminate" sometimes show up in the abilities section of the moveset stats for Pokemon that can't have Illuminate as an ability?
      "Illuminate" is my placeholder for "no ability," or an ability that simply isn't recognized. This kind of situation happens when Showdown glitches out and should be exceedingly rare. Note that the nature equivalent is Hardy (though all five neutral natures are also aliased to Hardy) and the item equivalent is "nothing" (though that could also correspond to no item).

    7. What's the deal with the file names?
      You'll notice that for each tier and type of analysis, there are a bunch of different files, most with names like uu-1630.0.txt. The first part of the filename is the tier, the second part is the weighting baseline (see (4)). If there's no number following the tier name, then the baseline is 1500. Also note that a baseline of 0.0 means that the stats are basically unweighted.
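A small sketch of the naming rule in (7). The helper name `parse_stats_filename` is hypothetical, and the handling of a `.json` extension is an assumption for files outside the txt folders.

```python
def parse_stats_filename(filename):
    """Split a stats filename into (tier, baseline).

    A missing baseline means 1500, as described in the FAQ.
    """
    stem = filename
    for ext in (".txt", ".json"):
        if stem.endswith(ext):
            stem = stem[: -len(ext)]
            break
    tier, sep, tail = stem.rpartition("-")
    if sep:
        try:
            return tier, float(tail)
        except ValueError:
            pass  # the part after the last hyphen was not a number
    return stem, 1500.0

print(parse_stats_filename("uu-1630.0.txt"))
print(parse_stats_filename("ou.txt"))
print(parse_stats_filename("ou-0.0.txt"))
```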

    8. How should I think about Baseline-0 vs. 1500 vs. 1630/1695 vs. 1760/1825 stats?
      • Baseline-0 (unweighted) stats represent everything in the format, no matter how lulzy the player or team. This is what you'd expect to encounter if we stopped doing matchmaking.
      • 1500 (no extension) stats represent what the average player in the metagame sees. Since Showdown's playerbase is more than just Smogonites, this is considerably "below" what the average person reading this thread sees.
      • 1630 (1695 for OU) stats represent "standard" stats, what the typical competitive player should see and be prepared for.
      • 1760 (1825 for OU) stats represent "1337" stats, what the best-of-the-best in the metagame are doing. To some extent, this is what all players should strive to be doing, but there are some Pokemon and strategies that are difficult to pull off and might require a greater amount of skill than the typical competitive player possesses.

    9. Why are the OU stats for 1695 and 1825 instead of for 1630 and 1760?
      OU, aka "Standard," is, well, our standard tier. It sees more battles than any other format and has the largest playerbase (second only to randbats). It also has the smallest fraction of "competitive players" of all non-random formats, due to its prominence and easy accessibility. Since our rating systems are percentile-based (that is, a rating of x roughly corresponds to being better than y% of the ladder, rather than indicating that the player is the nth best in the metagame), that means that it's a lot easier to get a rating of 1630 in OU than it is in UU or LC. Because of that, and because OU has a larger pool of battles to work with, we can up our baseline to 1695 for the "standard" stats. Similarly, while 1760 is the usual value we use for "elite" stats (the best of the best), the number that works better for OU is 1825.

    10. What's the best way to make use of the moveset stats?
      • If you're trying to figure out what's good in a tier (in terms of movesets), 1760/1825 is probably the way to go, since that tells you what the very top players use on their Pokemon.
      • If you want to determine what the likelihood is that your opponent's Pokemon carries X move or Y item, consult the moveset stats closest to your own Glicko R rating.
      • If you're having trouble dealing with a certain Pokemon and are looking for checks/counters, consult the 1500 (or even possibly the 0) stats: the lack of "1337"ness is vastly preferred to the sheer lack of data you encounter when you get that high.
    11. Can I perform my own analyses?
      Due to privacy concerns, I can't give you access to the raw logs, but if you have background with a programming language that can parse json, take a look in the "chaos" folder of each month's stats. Those files contain all the information used to generate the moveset statistics and include a lot more data than I could feasibly put into a file.
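For anyone starting from scratch with the chaos files, here is a minimal parsing sketch. The sample data is made up, and the layout (a top-level "data" object keyed by Pokemon name, with sections such as "Abilities" and "Teammates") is an assumption about the file format rather than something stated in this FAQ; check it against a real file before relying on it.

```python
import json

# Made-up sample. Real files are far larger, and these key names are assumptions.
sample = """
{"data": {
  "Talonflame": {"Abilities": {"galewings": 10.0}, "Teammates": {"Keldeo": 1.5}},
  "Heracross": {"Abilities": {"guts": 4.0}, "Teammates": {"Talonflame": -0.5}}
}}
"""

def sections_by_pokemon(text):
    """Parse a chaos-style JSON string and list the sections stored per Pokemon."""
    stats = json.loads(text)
    return {name: sorted(entry) for name, entry in stats["data"].items()}

sections = sections_by_pokemon(sample)
for name, keys in sections.items():
    print(name, keys)
```

To read a downloaded file instead of a string, `json.load(open(path))` returns the same kind of nested dictionary.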
    More to come!
    Last edited: Jun 4, 2014
  3. Antar

    As you might have guessed, May stats are now online.

    Note that the "changes since last month" likely look pretty drastic. That's because I changed the way that stats are calculated for baselines above 1500.

    It's a bit like comparing apples and oranges, but I put them online anyway.
  4. Delibird is Amazing

    Delibird is Amazing

    Joined:
    Jun 14, 2013
    Messages:
    421
    Why are you showing the 1825 ou stats instead of the 1760 ones, which are what actually affect the tiers?
  5. avocado

    avocado call me mr. worldwide
    is a member of the Site Staff, a Forum Moderator, a Community Contributor, a Tiering Contributor, and a Contributor to Smogon
    Moderator

    Joined:
    Jul 17, 2010
    Messages:
    4,027
    Which UU/RU stats are used for tiering?
  6. tennisace

    tennisace cardiac cats
    is a member of the Site Staff, a Community Contributor, a Contributor to Smogon, an Administrator, a Smogon Social Media Contributor Alumnus, a Researcher Alumnus, a CAP Contributor Alumnus, a Tiering Contributor Alumnus, and a Smogon Media Contributor Alumnus
    Administrator

    Joined:
    Dec 16, 2007
    Messages:
    8,400
  7. The Immortal

    The Immortal They Don't Want None
    is a member of the Site Staff, a Battle Server Administrator, a Smogon Social Media Contributor, a Programmer, a Forum Moderator, a Live Chat Contributor Alumnus, and a Tiering Contributor Alumnus
    Other Metas Leader

    Joined:
    Sep 27, 2010
    Messages:
    5,905
    Would it be possible to combine NU alpha and NU beta? They were the same tier but for roughly half a month each.
  8. Antar

    Could have sworn I said this somewhere: treat 1630 stats as "standard" stats and 1760 stats as "1337" stats for all metagames except OU, where you replace those numbers with 1695 and 1825, respectively. Edit: Added as entries (8) and (9) of the FAQ.

    tennisace, thanks. Also said this in the OP: 1695 for OU, 1630 for everything else.
    Yeah... yeah... (PITA)
    Last edited: Jun 4, 2014
  9. QxC4eva

    QxC4eva
    is a Pre-Contributor

    Joined:
    Feb 21, 2014
    Messages:
    273
    Question about the JSON files in chaos - for checks & counters, you have an array of 3 numbers for each pokemon. What do they represent?
  10. Antar

    QxC4eva,
    • Number n of times the matchup occurred (not counting U-turn KOs or force-outs)
    • The fraction p of times the counter got the KO or caused a switch
    • Standard deviation for the previous value: sqrt(p*(1.0-p)/n)
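To make the third number concrete, here is a short sketch that recomputes it from the first two. The function name `matchup_sd` and the sample values of n and p are made up for illustration.

```python
import math

def matchup_sd(n, p):
    """Standard deviation of the observed fraction p over n matchups."""
    return math.sqrt(p * (1.0 - p) / n)

# Same success fraction, very different sample sizes (hypothetical values).
well_sampled = matchup_sd(400, 0.6)
barely_sampled = matchup_sd(4, 0.6)

print(f"n=400: p=0.6 +/- {well_sampled:.4f}")
print(f"n=4:   p=0.6 +/- {barely_sampled:.4f}")
```

The point of carrying the standard deviation around is that a high p from a handful of matchups is much less trustworthy than the same p from hundreds of them.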
  11. QxC4eva

    In that case, what method was used to calculate the separate percentages for KO and switch out?

    Also curious about the metric for teammates - in ou-1695 it says one of Aegislash's teammates is Thundurus with a score of +4.865%. What does that mean in context?
  12. Antar

    Well *I* know how many were KOs vs. switch-outs. It was just an oversight that I didn't provide it in the json. I'll try to remember to rectify that before the next update (might as well provide all the matchup data).

    P(X|Y)-P(X). If Thundurus' overall usage was, say, 10%, then that means that, of all teams that had Aegislash, 14.865% of them had Thundurus.
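A toy illustration of P(X|Y)-P(X), computed from a handful of made-up teams. This sketch is unweighted, whereas the published stats weight each team by its player's rating, so it only shows the shape of the calculation; the function name `teammate_score` is hypothetical.

```python
def teammate_score(teams, x, y):
    """P(x on team | y on team) - P(x on team), from a list of teams (sets)."""
    teams_with_y = [team for team in teams if y in team]
    p_x = sum(x in team for team in teams) / len(teams)
    p_x_given_y = sum(x in team for team in teams_with_y) / len(teams_with_y)
    return p_x_given_y - p_x

# Four hypothetical (and tiny) teams.
teams = [
    {"Aegislash", "Thundurus"},
    {"Aegislash", "Thundurus"},
    {"Keldeo"},
    {"Keldeo"},
]

print(teammate_score(teams, "Thundurus", "Aegislash"))  # positive: seen together more than usual
print(teammate_score(teams, "Keldeo", "Aegislash"))     # negative: seen together less than usual
```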
  13. Antar

    It's done. I just called the combined stats "NU" (which is such bad form, since it's not an official tier, but oh well).
  14. Antar

    QxC4eva, I have a whole thread about Checks vs. Counters. This would be a good thing to add to that discussion. Keep in mind, though: the more specific the definition, the less data is available. Case in point: very few Pokemon have C&C lists in the 1825 stats. That's simply because there were so few battles between two players with Glicko R above 1825.
  15. Lightning Storm

    Lightning Storm

    Joined:
    Aug 3, 2008
    Messages:
    191
    First of all, this is super cool and I'm glad all this data is available.

    Secondly, I have a few questions about the data in the chaos folder.

    1. Can't seem to figure out the scale of the number values for items/moves/abilities/etc. For instance, in items, the highest usage value I see might be 0.8ish for an uncommonly used 'mon like Lunatone, whereas for something like M-Kangaskhan (talking VGC here), I see Kangaskhanite has a value about 800. I assume that's because there are far more instances of M-Kangaskhan being used than Lunatone, but I can't figure out the exact way this is being calculated.

    2. For the raw count, I assume that's how many times that Pokemon has been used in battle. If both players have the same Pokemon on their team, will that number be incremented once (because a battle includes that Pokemon), or twice (because that Pokemon is being used by two different people)?

    3. I'm looking to get this data into a Java program I've been working on. Anyone have any sort of experience with that? I didn't know about json until I started looking into this, and I'd love to ask them how they went about it.
  16. Antar

    1. See item (4) of the FAQ (2nd post). Short answer: values are weighted.
    2. See item (3) of the FAQ. Short answer: yes, twice.
    3. Gson is the alpha and the omega of json parsing in Java.
  17. Lightning Storm

    Thanks for the quick answers. Managed to get everything working using Gson like you said, thanks a ton for that.

    So for the weighted values, I just want to make sure I get this.
    Let's say I'm looking at the value for Gale Wings on Talonflame in 1825 OU, which is approximately 2746. I also look at the value for Guts on Heracross, also in 1825 OU, about 359.
    Does this mean I'm 7.6ish times more likely to encounter Talonflame with Gale Wings than I am a Heracross with Guts in a battle in that tier?

    If that is the case, what would I have to do to change those two numbers to give me the percentage of that Pokemon with that ability? (So, for Talonflame, the 2746 would become something damn well near 100%, or 1). Would I just add the numbers for Gale Wings and Flame Body, and divide the number for Gale Wings by the total? How would this work for moves, where the total will be roughly (exactly?) four times higher than whatever total I'd be looking for?

    Another question: Why does it seem like only some Pokemon are listed in the teammates section? Are they just not far away enough from the normal value? (Wasn't it like 10%? I remember reading about it somewhere else, can't recall now).
  18. Antar

    If your Glicko rating is at-or-above 1825, yes.

    Yes.

    Sum the abilities to get the total.

    They should all be there. You do mean in the json, right, rather than in the txt file?
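The arithmetic from this exchange, sketched out. The Gale Wings (2746) and Guts (359) values are the approximate ones quoted above; the Flame Body value and every move value are made up so the example has something to divide. The helper name `shares` is hypothetical.

```python
def shares(section, usage_count):
    """Convert weighted counts into fractions of the Pokemon's weighted usage."""
    return {name: value / usage_count for name, value in section.items()}

# Gale Wings is the approximate value from the thread; Flame Body is made up.
abilities = {"Gale Wings": 2746.0, "Flame Body": 14.0}
# Entirely made-up move counts, chosen to sum to four times the usage count.
moves = {"Brave Bird": 2700.0, "Flare Blitz": 2500.0, "U-turn": 2300.0,
         "Roost": 2000.0, "Swords Dance": 1540.0}

# Every Pokemon has exactly one ability, so the abilities sum to its usage count.
usage_count = sum(abilities.values())

ability_share = shares(abilities, usage_count)
# Moves are divided by the same usage count, not by the sum of the move values,
# so the move shares add up to roughly 4 (one per move slot).
move_share = shares(moves, usage_count)

print(ability_share)
print(move_share)
print(2746.0 / 359.0)  # Talonflame with Gale Wings vs. Heracross with Guts
```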
  19. QxC4eva

    Another one for you Antar. What do the json teammate numbers mean? They are so different to the ones on your reports I cannot make sense of them. Aegislash teammates ou-1695:

    JSON
    Thundurus: 2684.1450260005295
    Keldeo: 1540.784457932271

    Usage
    Aegislash: 21.73938%
    Thundurus: 18.45916%
    Keldeo: 15.26924%

    TXT
    Thundurus: +4.865%
    Keldeo: +3.954%

    I think the question is how you find what P(X|Y) and P(X) are
  20. Antar

    QxC4eva, the number in the JSON is the number in the TXT times Aegislash's usage:

    .04865 x 55176 = 2684

    (55176 is the 1695 usage count for Aegislash, which is most easily calculable by taking the sum of the Abilities values)

    P(X|Y) would be that value in the TXT plus the usage percentage. (So P(Thundurus|Aegislash)=23.324%).

    And P(X) is just the usage %.
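The conversion above, run on the Thundurus numbers quoted in this exchange (JSON value 2684.1450260005295, Aegislash usage count 55176, Thundurus usage 18.45916%). The function name `teammate_percentages` is hypothetical.

```python
def teammate_percentages(json_value, usage_count, teammate_usage_pct):
    """Return (TXT score in %, P(X|Y) in %) from the JSON teammate value."""
    score_pct = 100.0 * json_value / usage_count
    conditional_pct = score_pct + teammate_usage_pct
    return score_pct, conditional_pct

score, conditional = teammate_percentages(
    json_value=2684.1450260005295,  # Thundurus under Aegislash, ou-1695 JSON
    usage_count=55176,              # sum of Aegislash's Abilities values
    teammate_usage_pct=18.45916,    # Thundurus' overall usage
)

print(f"TXT score: +{score:.3f}%")
print(f"P(Thundurus|Aegislash): {conditional:.3f}%")
```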
  21. Antar

  22. QxC4eva

    I had a go parsing the new ou-1825 json. The only error I got was that Smeargle has "magicbounce" as one of its moves... but that aside, it seems the format was exactly the same as last month's. Did you include the extra matchup data that you said you would?
  23. Antar

    QxC4eva, when did I say I'd include any extra data? Unless I'm thinking of something else, the only thing I ever promised was to "think about" what you proposed. As for "magicbounce" being a move, that's a PS bug, pure and simple.
  24. QxC4eva

    Oh, not the lure data lol. This:
    The C&C outcomes are still crammed into a single number.
  25. The Immortal

    For formats that cut down Pokemon, such as Battle Spot Singles, do the stats include all the Pokemon or only those selected?