Data Official Smogon University Simulator Statistics — January 2014

Discussion in 'Competitive Discussion' started by Antar, Feb 1, 2014.

Thread Status:
Not open for further replies.
  1. Jimera0

    Jimera0 You don't understand, Edgar is the one in the hole!

    Joined:
    Jul 24, 2010
    Messages:
    1,685
    While I'm sure you've posted explanations before, Antar, I feel the OP of the topic could use an explanation of what the "real" vs. "raw" percentages mean and why they're included.

    I know this has almost certainly been addressed before, but it's somewhat difficult to track down exactly where this was explained. Hell, if you can find a post where it was explained satisfactorily you could just link to it in the OP and call it a day. I feel this would help a lot of people (including myself :P) to better understand what all the numbers mean and how to interpret them, and also stop some people from thinking that the fact that the columns don't all match up means there are errors in the data collection.
  2. Antar

    Antar Self-anointed Czar of LC UU
    is a Battle Server Administrator, is a Programmer, is a Super Moderator, is a Community Contributor
    Official Data Miner

    Joined:
    Feb 17, 2010
    Messages:
    3,168
    Counted as a KO.

    Jimera0, yeah--I should really include just a bit of text at the top of each Standard Stats post...
    • Usage % : Weighted
    • Raw: Unweighted
    • "Real": Only counts the Pokemon which actually appear in battle (Doubles not supported)
    The reason for the name "real" is historic--back when I first took over the stats and then the running of PO, only the Pokemon that appeared in battle were recorded in the logs, so there was no way to actually *get* the full team stats. When I modified PO to generate logs with full team info in them, we were left with a decision regarding which stats to use, and the argument was that counting only Pokemon appearing in battle was somewhat more legit, because that corresponded to actual, or "real" usage (that argument lost out in the end).
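
    To make the three columns concrete, here is a minimal sketch of how they could be tallied. The record layout (`weight`, `team`, `appeared`), the weights themselves, and the choice of denominators are all assumptions made for illustration; this is not the actual stats script, and the real weighting function is described in the Weighting FAQ.

```python
from collections import defaultdict


def usage_table(teams):
    """Return {pokemon: (usage, raw, real)} as fractions of teams.

    Each entry in `teams` is a hypothetical record:
      weight   - the player's rating-based weight (between 0 and 1)
      team     - every Pokemon on the team sheet
      appeared - the subset that was actually sent into battle
    """
    weighted = defaultdict(float)
    raw = defaultdict(int)
    real = defaultdict(int)
    total_weight = sum(t["weight"] for t in teams)

    for t in teams:
        for mon in t["team"]:
            weighted[mon] += t["weight"]  # "Usage %": weighted by player
            raw[mon] += 1                 # "Raw": every team counts once
        for mon in t["appeared"]:
            real[mon] += 1                # "Real": only if it hit the field

    n = len(teams)
    return {
        mon: (weighted[mon] / total_weight, raw[mon] / n, real[mon] / n)
        for mon in raw
    }


# Two made-up teams: a highly weighted player and a lightly weighted one.
teams = [
    {"weight": 0.9, "team": ["Rotom-Wash", "Pinsir"],
     "appeared": ["Rotom-Wash", "Pinsir"]},
    {"weight": 0.1, "team": ["Rotom-Wash", "Smeargle"],
     "appeared": ["Smeargle"]},
]
table = usage_table(teams)
for mon, (usage, raw, real) in sorted(table.items()):
    print(f"{mon:12s} usage={usage:.0%} raw={raw:.0%} real={real:.0%}")
```

    The three numbers for one Pokemon differ whenever weights are unequal or a Pokemon stays on the bench, which is why the columns are not expected to match.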
  3. Calm_Mind_Latias

    Calm_Mind_Latias

    Joined:
    Aug 20, 2013
    Messages:
    431
    Thank you for the weighting link; I had not bothered to read it before, and it was rather straightforward. That explains so much about how a Pokemon's usage is weighted in tiering, but this seems to depend on what the "average player" is, and trolls who perform worse in their battles relative to the average player are given less weight than more skilled players.

    A standard score of 1 is a standard score of 1. But what I meant was that anything based on one's relative ranking depends on the composition of people in that sample. The presence of trolls certainly does affect the mean skill of the ladder by depressing it (and skill is not something with a hard quantity that possesses a true zero and can be easily measured, like height, but something that can be normalized). For the SAT, there are few "trolls" who take it just to score in the 200s on the subtests, for obvious reasons, so it possesses a natural mechanism to exclude "trolls" without any statistical filters. Pokemon battling does not have a strong disincentive against trolling, and trolls can collectively influence the definition of "average".

    How does the system prevent trolls from influencing the definition of the average player? Although I doubt trolls have much to do with the highly kurtotic distributions at the high end on the old system.

    Edit: I do not think this post belongs here since it does not concern metagame trends or usage. But I still believe it is a legitimate question.
    Last edited: Feb 3, 2014
  4. Antar

    Antar Self-anointed Czar of LC UU
    is a Battle Server Administrator, is a Programmer, is a Super Moderator, is a Community Contributor
    Official Data Miner

    Joined:
    Feb 17, 2010
    Messages:
    3,168
    Eh, it's fine.

    Regarding trolls, it's true that much of what we've built is based on the assumption that players like to win, but I haven't noticed the rating distribution being at all lopsided. I've actually been pleasantly surprised by how Gaussian everything looks.

    [Image: distribution of Elo ratings on the OU ladder]

    This is the distribution of Elo* ratings for the OU ladder for most of January, IIRC. It's borked around 1000 because most alts on PS are only associated with 1 or 2 games (I think 80% of all alts only play one game). But other than that, it's pretty good. The right tail is a little heavier, which makes sense, since players are more likely to "reset" a low alt than a high one, but even that's not totally skewed.

    So bottom line: I get that trolls *could* be a problem, but given that we don't see any peaks on this distribution around extremely low ratings, I think we're okay.

    *This is not the Elo rating currently deployed on PS. This is Elo calculated strictly using the standard Elo formula, with a K factor of 50, no modifications, no hacks, no nothing.
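
    For anyone unfamiliar with it, the standard Elo update with K = 50 can be sketched as below. The function names are made up for this example, draws are ignored, and the 1000 starting rating in the demo is only an assumption based on the spike in the plot.

```python
def expected_score(rating, opponent):
    """Standard Elo expectation: probability-like score for `rating`."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400))


def elo_update(winner, loser, k=50):
    """Return the new (winner, loser) ratings after one decisive game."""
    # The winner scored 1 and was expected to score expected_score(...);
    # the loser's change is the exact mirror image.
    gain = k * (1.0 - expected_score(winner, loser))
    return winner + gain, loser - gain


# Two fresh accounts at the assumed starting rating play one game.
print(elo_update(1000, 1000))  # (1025.0, 975.0)
```

    An alt that plays a single game lands exactly 25 points from where it started, which fits the pile-up of one-game and two-game alts near 1000.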
    Calm_Mind_Latias likes this.
  5. Shinji Mimura

    Shinji Mimura

    Joined:
    Jul 9, 2010
    Messages:
    62
    So what is the list of tier changes? I think I only heard Jirachi, Landorus, and Terrakion to UU, and I think Kangaskhan.

    Anyone else?
  6. Calm_Mind_Latias

    Calm_Mind_Latias

    Joined:
    Aug 20, 2013
    Messages:
    431
    Does this explain the counterintuitive data that Genesect doesn't "check" or "counter" anything in OU, except Pinsir? Obviously Latios seems to be checked by it, since (Scarf) Genesect has two moves it can choose from to KO it.
  7. Antar

    Antar Self-anointed Czar of LC UU
    is a Battle Server Administrator, is a Programmer, is a Super Moderator, is a Community Contributor
    Official Data Miner

    Joined:
    Feb 17, 2010
    Messages:
    3,168
  8. migetno1

    migetno1 bRMT Developer
    is a Programmer

    Joined:
    Apr 17, 2012
    Messages:
    36
    I've formatted the OU moveset statistics into a more accessible format at http://sweepercalc.com/stats/

    I'll add in Ubers / UU / VGC when I get some free time.
  9. ArcFurnace

    ArcFurnace

    Joined:
    Dec 20, 2013
    Messages:
    12
    I was wondering, how is the data for spreads (nature and EVs) stored in the raw data? I ask because for quite a few Pokemon, they have a diversity of spreads, so listing only the most common ones winds up with 50% or more of the spreads listed under "other". Obviously you can't display every spread, but would it be possible to display, say, the top 2-4 most common natures (with percentages) separately from the EVs, so that you can at least get an idea of which natures are most popular (and by what margin)?
  10. Calm_Mind_Latias

    Calm_Mind_Latias

    Joined:
    Aug 20, 2013
    Messages:
    431
    Well, I think the most important stat people are interested in is the amount of Speed investment a Pokemon has received.

    It would be helpful if there was some data that shows how a given Pokemon's speed is distributed among the players. People may use spreads to speed creep other Pokemon that try to speed creep it.

    One, for instance, may run a Landorus-T that creeps 44 Speed Rotom-W (just 8 EVs needed) (and 44 Speed Rotom-W doesn't show up in the stats). And some Rotom-W might try to creep that, to move before it U-turns out, by adding 8 extra EVs. I do not think this type of spread would show up in the usage statistics.

    It also makes you wonder how many, let's say, (Mega) Scizor are trying to outspeed minimum-Speed Heatran (and/or Rotom-W before it is burned) to hit it with Superpower. That seems a worthy investment if one wants to get through Heatran (and/or Rotom-W) rather than lose momentum by manually switching out. This doesn't seem to show up in the statistics either.
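
    As a rough sketch of what such a Speed distribution could look like, the snippet below turns spread counts into a distribution over final Speed stats. It assumes spread keys shaped like "Nature:HP/Atk/Def/SpA/SpD/Spe", level 100, and 31 IVs; the counts in the example are invented, and items such as Choice Scarf are ignored.

```python
from collections import defaultdict

SPEED_UP = {"Timid", "Jolly", "Hasty", "Naive"}
SPEED_DOWN = {"Brave", "Relaxed", "Quiet", "Sassy"}


def speed_stat(base, ev, nature, level=100, iv=31):
    """Speed stat from the standard (Gen 3 onward) stat formula."""
    raw = (2 * base + iv + ev // 4) * level // 100 + 5
    if nature in SPEED_UP:
        return raw * 11 // 10
    if nature in SPEED_DOWN:
        return raw * 9 // 10
    return raw


def speed_distribution(spreads, base_speed):
    """Map each final Speed stat to its share of the weighted counts."""
    dist = defaultdict(float)
    for key, weight in spreads.items():
        nature, evs = key.split(":")      # assumed key layout
        speed_ev = int(evs.split("/")[-1])
        dist[speed_stat(base_speed, speed_ev, nature)] += weight
    total = sum(dist.values())
    return {speed: w / total for speed, w in sorted(dist.items())}


# Invented Rotom-W counts (base Speed 86), purely for illustration.
rotom = {
    "Bold:248/0/252/0/8/0": 50.0,
    "Modest:248/0/0/252/0/8": 20.0,
    "Bold:248/0/216/0/0/44": 30.0,
}
for speed, share in speed_distribution(rotom, 86).items():
    print(speed, f"{share:.0%}")

# The creep described above: 8-EV Landorus-T (base 91) vs 44-EV Rotom-W.
print(speed_stat(91, 8, "Adamant"), ">", speed_stat(86, 44, "Bold"))
```

    Grouping by final stat rather than by EV number puts differently built spreads that reach the same Speed in the same bucket, which is what matters for creeping.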
    Last edited: Feb 13, 2014
  11. ArcFurnace

    ArcFurnace

    Joined:
    Dec 20, 2013
    Messages:
    12
    You are correct that it wouldn't show Speed investment; I will admit I wasn't thinking of that at the time. My ulterior motive is that I breed Pokemon in-game, so it helps to know which natures to give them (since natures can't be adjusted after hatching, while EV spreads can be set later). Obviously the situation is different for people using simulators. I'm not sure how you would want to set up the display of spreads if you wanted to focus on Speed, especially since EVs are much more finely adjustable. Trying to ensure you displayed the majority of levels of investment seems like it might require a lot more slots.
    Last edited: Feb 13, 2014
  12. migetno1

    migetno1 bRMT Developer
    is a Programmer

    Joined:
    Apr 17, 2012
    Messages:
    36
    As far as I know, the json data has ALL the spreads used if the player meets the 1500 cutoff. This leads to common pokemon like Rotom-Wash having about 10000 different spreads listed with many of them having a count under 5. If you wanted to, you could parse this data to get a percentage of the natures used.
  13. Antar

    Antar Self-anointed Czar of LC UU
    is a Battle Server Administrator, is a Programmer, is a Super Moderator, is a Community Contributor
    Official Data Miner

    Joined:
    Feb 17, 2010
    Messages:
    3,168
    ArcFurnace -- yes, the data for spreads is collected by counting the occurrence of each and every individual spread. migetno1, the 1500 cutoff is not a "hard" cutoff. See my Weighting FAQ for more details.

    One note: if the Pokemon's spread contains useless EVs (255 EVs in one stat, improperly optimized LC spreads), my scripts round that down and bin it with the equivalent spread that contains no useless EVs.
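
    A minimal sketch of that kind of binning, for level 100 only, where a stat gains a point per 4 EVs so anything past a multiple of 4 is wasted. The function names and the (nature, EV tuple) key layout are hypothetical, and the different rule needed for LC (level 5) is not attempted here; the real scripts may well do this differently.

```python
def strip_useless_evs(evs, cap=252):
    """Round each EV down to the largest value that still changes the stat.

    Level-100 assumption: one stat point per 4 EVs, so 255 becomes 252
    and 6 becomes 4.
    """
    return tuple(min(ev, cap) // 4 * 4 for ev in evs)


def bin_spreads(spreads):
    """Merge counts of spreads that are equivalent after rounding down."""
    binned = {}
    for (nature, evs), count in spreads.items():
        key = (nature, strip_useless_evs(evs))
        binned[key] = binned.get(key, 0.0) + count
    return binned


# Invented counts: the first spread wastes EVs, the second does not.
spreads = {
    ("Jolly", (6, 255, 0, 0, 0, 255)): 1.5,
    ("Jolly", (4, 252, 0, 0, 0, 252)): 10.0,
}
print(bin_spreads(spreads))
```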

    Calm_Mind_Latias -- it's at the top of my "to-do" list to start generating "speed tier" info from usage data (throwing in Choice Scarf and speed-boosting moves as well). But I just haven't had the time recently.
  14. Leer

    Leer

    Joined:
    Jan 1, 2014
    Messages:
    351
    Quick question, not sure if it's been answered before, but are unrated battles (challenges) counted?
  15. Antar

    Antar Self-anointed Czar of LC UU
    is a Battle Server Administrator, is a Programmer, is a Super Moderator, is a Community Contributor
    Official Data Miner

    Joined:
    Feb 17, 2010
    Messages:
    3,168
    Unrated battles aren't even logged, so no.
  16. ArcFurnace

    ArcFurnace

    Joined:
    Dec 20, 2013
    Messages:
    12
    Working on analyzing the raw data myself to get the information I want. It's actually going pretty well (hooray for Python), but I have a question about the data format- in the raw data, each unique spread for a Pokemon is paired with a number. What exactly does that number represent? I was assuming it was something along the lines of "number of times this spread appeared", but there has to be something else adjusting it, since it's not necessarily an integer and if you add them all up it doesn't add up to the 'Raw count' variable for that Pokemon. Is it being adjusted by the weighting function intended to reduce the impact of bad players on the stats?
  17. Leer

    Leer

    Joined:
    Jan 1, 2014
    Messages:
    351
    Oh duh, should have known '~'

    Also this has been a good reminder of why I really need to start learning Python...
  18. Antar

    Antar Self-anointed Czar of LC UU
    is a Battle Server Administrator, is a Programmer, is a Super Moderator, is a Community Contributor
    Official Data Miner

    Joined:
    Feb 17, 2010
    Messages:
    3,168
  19. ArcFurnace

    ArcFurnace

    Joined:
    Dec 20, 2013
    Messages:
    12
    It's working. Excellent.

    [IMG]

    Python code

    Code:
    import json

    # Used for analyzing data posted by Antar from Smogon simulators
    # Server address: http://sim.smogon.com:8080/Stats/
    filename = input('Which file do you want to analyze?\n')
    # The with-block closes the file even if parsing fails
    with open(filename) as f:
        stats = json.load(f)

    # Data becomes a nested dictionary
    # First layer keys are 'info', 'data'
    data = stats['data']

    # Second layer keys are Pokemon names (capitalized)
    name = input('Which Pokemon do you want to analyze?\n')

    # Third layer keys are 'Abilities', 'Checks and Counters', 'Items', 'Moves',
    # 'Raw count', 'Spreads', 'Teammates'
    spreads = data[name]['Spreads']
    # In the Spreads dictionary for a specific Pokemon, every single unique spread
    # is a key, and the weighted count is the number associated with it.

    naturestats = dict(Adamant=0, Bashful=0, Bold=0, Brave=0, Calm=0, Careful=0,
    Docile=0, Gentle=0, Hardy=0, Hasty=0, Impish=0, Jolly=0, Lax=0, Lonely=0,
    Mild=0, Modest=0, Naive=0, Naughty=0, Quiet=0, Quirky=0, Rash=0, Relaxed=0,
    Sassy=0, Serious=0, Timid=0)
    total = 0

    # Add each spread's weighted count to the nature named in its key
    for spread, count in spreads.items():
        total += count
        for nature in naturestats:
            if nature in spread:
                naturestats[nature] += count

    # Convert weighted counts to fractions of the total
    for nature in naturestats:
        naturestats[nature] /= total

    print('Nature usage (5% or greater):')

    for nature, percent in naturestats.items():
        # >= so that the cutoff matches the "5% or greater" heading
        if percent >= 0.05:
            print(nature, '({0:.1f}%)'.format(percent * 100))
    

    You'll need Python installed (this was created in Python 3.3). Save the code as a .py file, put it in a folder with the .json file you want to analyze, and run it from a command line window. No error handling, though, so make sure you spell things right.
    Last edited: Feb 15, 2014
  20. Agent Gibbs

    Agent Gibbs

    Joined:
    Dec 8, 2011
    Messages:
    840
    Hey Antar, I noticed that there are no usage stats for the Gen 3 OU ladder, which is the only old gen missing. Is there any way we could get those stats as well?
    TRC likes this.
  21. Antar

    Antar Self-anointed Czar of LC UU
    is a Battle Server Administrator, is a Programmer, is a Super Moderator, is a Community Contributor
    Official Data Miner

    Joined:
    Feb 17, 2010
    Messages:
    3,168
    Gen III didn't get a ladder on PS until Feb. 12...
  22. Agent Gibbs

    Agent Gibbs

    Joined:
    Dec 8, 2011
    Messages:
    840
    Oh, ok. I never thought to look for it until yesterday, so when I saw it, I just assumed it had been there for a while. My mistake!
  23. MicfiJasan

    MicfiJasan

    Joined:
    Nov 15, 2012
    Messages:
    135
    So I'm starting to look a bit deeper into the Checks and Counters data courtesy of the json files and I did borrow some of Antar's source code. The first thing my limited python knowledge managed to find me was what I'm calling "Average Opposing Success Rate". What this means, to oversimplify a bit, is what is the chance something good happens to the opponent when we have Pokemon X out. It's probably a bit easier to explain with actual numbers, so I'll get that up right below. These are taken from the top 100 most used Pokemon in OU last month.

    AOSR (open)

    1. Pinsir: 32.6606536624097
    2. Mawile: 32.6844481742917
    3. Manaphy: 34.3630632235175
    4. Heracross: 34.4424965021286
    5. Charizard: 35.0592024325037
    6. Medicham: 35.3023614950572
    7. Volcarona: 35.4067400910503
    8. Lucario: 35.4455291037521
    9. Gyarados: 35.5116863785033
    10. Conkeldurr: 36.516907127894
    11. Dragonite: 36.5545320428452
    12. Aegislash: 36.5681975927158
    13. Kyurem-Black: 37.8309717680315
    14. Bisharp: 38.3649220193829
    15. Kingdra: 38.531349203473
    16. Venusaur: 38.5553733987965
    17. Clefable: 38.8988327548322
    18. Garchomp: 39.1048053132335
    19. Breloom: 39.2098243163858
    20. Cloyster: 39.3413409294272
    21. Talonflame: 39.3922228568415
    22. Gardevoir: 39.6711615064359
    23. Alakazam: 39.9964141451897
    24. Gengar: 40.1997603794041
    25. Reuniclus: 40.2864221128222
    26. Crawdaunt: 40.6154542915553
    27. Salamence: 40.6454368190726
    28. Azumarill: 40.6689531224642
    29. Haxorus: 40.7513280330215
    30. Scizor: 41.2739257364425
    31. Keldeo: 41.4610637679135
    32. Greninja: 41.7516802360431
    33. Weavile: 42.5671333255041
    34. Landorus: 42.8043943463543
    35. Slowbro: 42.8043943463543
    36. Togekiss: 43.5658724202397
    37. Porygon2: 43.9430431099078
    38. Arcanine: 44.1548236761597
    39. Sableye: 44.3135961436258
    40. Gliscor: 44.3913692085438
    41. Metagross: 44.450457754148
    42. Terrakion: 44.4600453571302
    43. Blastoise: 44.5511315724735
    44. Latios: 44.7306280229763
    45. Mamoswine: 44.930485065953
    46. Hydreigon: 44.9459549417092
    47. Chandelure: 45.0101588925653
    48. Nidoking: 45.0129339855979
    49. Ferrothorn: 45.4970226366418
    50. Genesect: 45.6367708095554
    51. Sylveon: 45.7302719438663
    52. Infernape: 45.8951290756273
    53. Diggersby: 46.3248129428848
    54. Absol: 46.3320133131536
    55. Umbreon: 46.3743233324173
    56. Excadrill: 46.3759120080044
    57. Darmanitan: 46.4111232917662
    58. Heatran: 46.5177171463897
    59. Goodra: 46.6600627629875
    60. Zapdos: 46.6995220785531
    61. Thundurus: 46.9785487566663
    62. Thundurus-Therian: 47.4919887437474
    63. Trevenant: 47.498523525822
    64. Florges: 47.9143122421163
    65. Noivern: 47.9300544465411
    66. Quagsire: 48.0237927022329
    67. Whimsicott: 48.0444210975334
    68. Starmie: 48.1152167905302
    69. Aggron: 48.1960393308203
    70. Ditto: 48.3003550165925
    71. Latias: 48.3643931242995
    72. Tyranitar: 48.366004291485
    73. Manectric: 48.3852945812972
    74. Jirachi: 48.6828542642543
    75. Vaporeon: 49.0304566322915
    76. Klefki: 49.2993117538896
    77. Espeon: 49.3014704588863
    78. Mandibuzz: 49.3193762553036
    79. Gastrodon: 49.6502828137244
    80. Jellicent: 49.7605682005731
    81. Ambipom: 49.7643480320047
    82. Chansey: 50.6178222346396
    83. Skarmory: 50.9443778369248
    84. Magnezone: 51.5004764249148
    85. Celebi: 51.807078744494
    86. Ninetales: 52.0344370841584
    87. Blissey: 52.1084252145068
    88. Jolteon: 52.1130142076991
    89. Crobat: 53.1164262429457
    90. Donphan: 53.3124666216968
    91. Landorus-Therian: 53.3148566332438
    92. Tentacruel: 54.225819632441
    93. Rotom-Wash: 54.3354877933639
    94. Politoed: 55.0057811770138
    95. Deoxys-Speed: 55.543091182007
    96. Deoxys-Defense: 57.0743333619611
    97. Galvantula: 57.5611122966849
    98. Scolipede: 58.5615741449645
    99. Forretress: 61.3480515649239
    100. Smeargle: 63.4775826385627


    Let's start at the top of the list with Pinsir, with an AOSR of about 32.66. That means that once he got out on the field, the opponent only gained an advantage (Pinsir switched or was KOed) 32.66% of the time. Compare that with Smeargle, who gave the opponent an advantage nearly twice as often.

    As for my first impressions, there are a lot of Megas high on the list. Half of the Pokemon with scores below 40 had a Mega Evolution available to them. If Game Freak wanted these guys to be the powerhouses of their teams, they certainly succeeded. You may notice Rotom-W and Landorus-T being near the bottom of this list, which is strange for the super-standard bulky momentum core they are. My theory is that this is due to Antar's list counting U-turn and Volt Switch as a switch out. Still, in theory this would only affect the number of positive outcomes for the U-turner/Volt Switcher, since the opponent likely checks them if they stay in to tank the moves anyway.

    Some caveats/other observations about this data:
    • These numbers were taken from the Checks and Counters data, which coincidentally looked at what happened when a Pokemon was switched out or KOed. Thus, Pokemon who are often used as suicide leads, like Smeargle, Galvantula, and the Deoxys formes, will have innately worse scores than the rest, regardless of their ability to support the team. If you don't understand anything I'm saying, peruse through here. Antar explains the situations that make up this data quite well.
    • Generally, stallier Pokemon have worse scores. I would guess it is due to the offensive nature of the meta putting heavy pressure on stall teams, although Venusaur, who seems to be the best wall right now, has a good score for a defensive poke.
    • Chansey is currently performing about 2% better than Blissey.
    • I'm only calling it AOSR because it was the first thing that came to mind and I didn't want to keep writing "Average Opposing Success Rate" a bunch of times, so if anyone can think of something shorter/leads to a better acronym, I'll implement that as well.
    • I'm eventually trying to build up the Python knowledge to have a weighted average success rate for the Pokemon itself rather than its counters. You can approximate the unweighted version by subtracting these numbers from 100, but it will actually be slightly less due to double downs/double switches making up a small amount of these scores.
    • Most importantly, and perhaps the biggest problem, is that these stats don't weight by usage, they just need to pass the minimum encounters to get counted equally. I chose this because CRE has generally favored Pokemon lower in usage, simply because the higher deviation of their counters means the CREs of their counters will be lower. In addition, while the crap pokes lose more due to high deviation, they don't actually weight much at all. If anything, they'll attribute more to pokes who aren't matched up a lot. I'll apply the weighting when/if I get good enough at Python to do so.
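
    For anyone who wants to reproduce the idea, here is a minimal sketch of the averaging described above, with an optional encounter-weighted variant. It assumes each "Checks and Counters" entry looks like [encounters, fraction KOed or switched out, deviation]; that layout, the function name, and the 20-encounter threshold are assumptions for illustration, and the numbers in the example are made up.

```python
def aosr(checks, min_encounters=20, weighted=False):
    """Average Opposing Success Rate (in percent) for one Pokemon.

    `checks` maps an opposing Pokemon to [n, p, dev], where n is the
    number of encounters and p the fraction of them in which our
    Pokemon was KOed or switched out (assumed layout).
    """
    rows = [(n, p) for n, p, _dev in checks.values() if n >= min_encounters]
    if not rows:
        return None  # nothing passed the minimum-encounter threshold
    if weighted:
        # Each opponent counts in proportion to how often it was met
        return 100 * sum(n * p for n, p in rows) / sum(n for n, _ in rows)
    # Unweighted: every qualifying opponent counts equally
    return 100 * sum(p for _, p in rows) / len(rows)


# Made-up entries; Smeargle is dropped for having too few encounters.
checks = {
    "Talonflame": [400, 0.50, 0.02],
    "Heatran": [100, 0.20, 0.04],
    "Smeargle": [5, 0.90, 0.10],
}
print(aosr(checks), aosr(checks, weighted=True))
```

    In this toy case the weighted figure is pulled toward Talonflame because it accounts for most of the encounters, which is the effect the last caveat is about.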
    Jukain, TRC, Antar and 2 others like this.
  24. Antar

    Antar Self-anointed Czar of LC UU
    is a Battle Server Administrator, is a Programmer, is a Super Moderator, is a Community Contributor
    Official Data Miner

    Joined:
    Feb 17, 2010
    Messages:
    3,168
    I can give you the "Encounter Matrix" if you want it. That's the comprehensive table of what happens when X faces off with Y and should yield better results.
  25. asbdsp

    asbdsp

    Joined:
    Mar 8, 2013
    Messages:
    123
    Swagplay ought to be counted as a thing in the metagame analysis section. Just saying.
    Kingpoleon and Antar like this.