Data Official Smogon University Usage Statistics Discussion Thread, mk.3

Discussion in 'Smogon Metagames' started by Antar, Jan 5, 2017.

  1. kobo1d

    kobo1d

    Joined:
    May 30, 2007
    Messages:
    41
    If sampled stats are the proposed resolution, would they at least be published?
  2. david0895

    david0895

    Joined:
    Jun 3, 2015
    Messages:
    528
    So, what is going to happen this month? Will we get sampled stats, or will we wait until next month?
    avocado likes this.
  3. Zarel

    Zarel Not a Yuyuko fan
    is a member of the Site Staff, is a Battle Server Administrator, is a Programmer, is a Pokemon Researcher, is an Administrator
    Creator of PS

    Joined:
    Aug 16, 2011
    Messages:
    3,589
    I have no clue. I'm gathering May stats on a new server and everything is normal so far.

    I'm not usually involved in any of the rest of the process. I suspect Antar is busy with lots of stuff so it's possible we'd end up with no tier shifts for multiple months?
    Hilomilo likes this.
  4. Mijzelffan

    Mijzelffan

    Joined:
    Sep 27, 2010
    Messages:
    108
    Are the April stats even that important if we have the May stats? I'd personally advocate just treating April as the month that never was and continuing to tier using the other months' stats as normal.
  5. Dussky

    Dussky formerly UB-013

    Joined:
    Jan 21, 2017
    Messages:
    12
    When do you expect these tiers to be out? The reason we need new tiers monthly is so we can establish every tier. We should have NU atm, but we don't. :/
  6. Darvin

    Darvin

    Joined:
    Mar 3, 2015
    Messages:
    212
    Antar's last post on the matter indicated that he's currently working on extracting the stats from the old server now that the new servers are taking off the load. I'd imagine as soon as he's done and has stats to post the tiers can be created in very short order, but until such time it's one of those "it'll be done when it's done" sort of things.
  7. cosine180

    cosine180

    Joined:
    Sep 5, 2014
    Messages:
    179
    Is this data scraped from replays or is it based on the full teams people load when they start a battle?
  8. Zarel

    Zarel Not a Yuyuko fan
    is a member of the Site Staff, is a Battle Server Administrator, is a Programmer, is a Pokemon Researcher, is an Administrator
    Creator of PS

    Joined:
    Aug 16, 2011
    Messages:
    3,589
    The data is compiled from private logs, including team data not available in replays.
  9. cosine180

    cosine180

    Joined:
    Sep 5, 2014
    Messages:
    179
    Zarel, is there any way to get access to the raw sets? Because the stats are provided as independent features, they leave out a lot of interesting information. I'm trying to build a Bayesian network that can, for instance, determine the probability of Landorus-Therian having Stealth Rock given that it's holding Rocky Helmet.
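
To make the request concrete, here is a minimal sketch of the kind of conditional estimate that needs joint set data rather than per-feature totals. Everything in it is an assumption for illustration: the `RAW_SETS` rows are invented, and `move_given_item` is a hypothetical helper, not part of any Smogon or PS tooling.

```python
# Hypothetical raw sets as (species, item, moves) tuples.
# These rows are invented for illustration; they are not real usage data.
RAW_SETS = [
    ("Landorus-Therian", "Rocky Helmet",
     {"Stealth Rock", "Earthquake", "U-turn", "Hidden Power Ice"}),
    ("Landorus-Therian", "Rocky Helmet",
     {"Stealth Rock", "Earthquake", "U-turn", "Toxic"}),
    ("Landorus-Therian", "Rocky Helmet",
     {"Swords Dance", "Earthquake", "Stone Edge", "U-turn"}),
    ("Landorus-Therian", "Choice Scarf",
     {"Earthquake", "U-turn", "Stone Edge", "Explosion"}),
]


def move_given_item(sets, species, item, move):
    """Estimate P(move | species, item) by counting matching sets.

    Returns None when no set matches the species/item pair, since the
    conditional probability is undefined in that case.
    """
    matching = [moves for s, i, moves in sets if s == species and i == item]
    if not matching:
        return None
    hits = sum(1 for moves in matching if move in moves)
    return hits / len(matching)


p = move_given_item(RAW_SETS, "Landorus-Therian", "Rocky Helmet", "Stealth Rock")
print(p)
```

With only the separate move and item totals, this count cannot be recovered, which is the gap the raw sets would fill.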
  10. kobo1d

    kobo1d

    Joined:
    May 30, 2007
    Messages:
    41
    I would also be interested in said stats, because there are a lot of really cool tools that could be created with a little bit of data science knowledge, but AFAIK the restricted level of data released is a Smogon policy decision. With complete information, it would be trivial to scout individual players' teams before or during a match, which could theoretically have a negative impact on team-building creativity.
  11. cosine180

    cosine180

    Joined:
    Sep 5, 2014
    Messages:
    179
    I am only interested in individual Pokemon's stats, not full teams. Also, usernames wouldn't have to be included.
    Sleepless likes this.
  12. Zarel

    Zarel Not a Yuyuko fan
    is a member of the Site Staff, is a Battle Server Administrator, is a Programmer, is a Pokemon Researcher, is an Administrator
    Creator of PS

    Joined:
    Aug 16, 2011
    Messages:
    3,589
    Antar would be the one to ask about that sort of thing.
  13. Antar

    Antar
    is a Battle Server Administrator, is a Programmer, is a Super Moderator, is a Community Contributor
    Official Data Miner

    Joined:
    Feb 17, 2010
    Messages:
    3,885
    March stats are now up. As I understand the plan, March / April stats will not be used for tiering updates -- we'll just use May's once the month is over.

    If you spot any problems, please let me know (post here or send me a VM).
  14. Antar

    Antar
    is a Battle Server Administrator, is a Programmer, is a Super Moderator, is a Community Contributor
    Official Data Miner

    Joined:
    Feb 17, 2010
    Messages:
    3,885
    cosine180, kobo1d, no. I'm not making that data available. I have granted special access in the past for projects such as yours, but I'm not able to devote time to supporting such projects at present.
    NeverUsedTier and Sir Kay like this.
  15. Pearl

    Pearl
    is a Tournament Director Alumnus, is a Site Staff Alumnus, is a Forum Moderator Alumnus, is a Community Contributor Alumnus, is a Tiering Contributor Alumnus, is a Contributor Alumnus

    Joined:
    Apr 4, 2010
    Messages:
    1,644
  16. G-Luke

    G-Luke We Eat Losers
    is a Pre-Contributor

    Joined:
    Mar 22, 2015
    Messages:
    2,101
    I thought a combination of March and May stats would have been used.
  17. Antar

    Antar
    is a Battle Server Administrator, is a Programmer, is a Super Moderator, is a Community Contributor
    Official Data Miner

    Joined:
    Feb 17, 2010
    Messages:
    3,885
    G-Luke, I guess the ideal situation would be to generate banlists based on March/April/May like we used to, but I really doubt April will be done in time.

    I'm cool using March/May or Feb/March/May, but the point is, no one is proposing we update tiers 10 days before June based on March alone.
    Ernesto and G-Luke like this.
  18. Machineae

    Machineae

    Joined:
    Aug 26, 2011
    Messages:
    321
    With the new servers and such, is it expected that stats will be available earlier in the month? Obviously there's still all the same info to process, but maybe new server = faster server?
  19. Antar

    Antar
    is a Battle Server Administrator, is a Programmer, is a Super Moderator, is a Community Contributor
    Official Data Miner

    Joined:
    Feb 17, 2010
    Messages:
    3,885
    Disjunction likes this.
  20. david0895

    david0895

    Joined:
    Jun 3, 2015
    Messages:
    528
    Is it your PC that pulls the data from the server, or does the server process the data itself?
  21. Antar

    Antar
    is a Battle Server Administrator, is a Programmer, is a Super Moderator, is a Community Contributor
    Official Data Miner

    Joined:
    Feb 17, 2010
    Messages:
    3,885
    All data processing happens on the server. The amount of time it would take to copy the files to my own machine would completely cancel out any performance gains from running on a dedicated machine. Not to mention the costs (for both PS and myself) of the additional bandwidth.
    david0895 likes this.
  22. TheCtes

    TheCtes

    Joined:
    Mar 13, 2017
    Messages:
    26
    I believe it makes the most sense to just use the stats for May. Although one of them was earlier this month, two Pokemon have been banned since March, and that does lead to some changes in viability and such. Not to mention that just using the May stats means we have tiers that more accurately reflect the current metagame.

    Just giving my 2 cents.
  23. John Philmore

    John Philmore

    Joined:
    May 30, 2017
    Messages:
    1
    Ok, I hate to do this, especially since this question has probably been posted before and I hate to be annoying, BUT...

    I'm building a program to help people build competitive teams. Just trying to compact all of the information into something that a normal person can handle :). So I'm using the chaos JSON files, and they have been extremely helpful. BUT some of these numbers I just don't get. So let me try to sum up my questions as much as possible:

    1. Every Pokémon has an abilities object, and each ability has a number, presumably to tell how popular that ability is. However, I don't understand where these numbers come from. Sure, they're weighted, but by what? The Pokémon's raw count? Well, that doesn't make sense, because then the calculation results don't make sense. The usage, then? That doesn't make sense either, unless I'm not seeing it.

    2. Every Pokémon has a moves object, and each move in that object has, again, a number. Now I read something about that from a post earlier in this thread, but it still evades my comprehension. Again, these numbers are weighted, but how and probably more importantly, WHY?

    3. Usage is a funny one. A number between 0 and 1, probably used to represent the number of times that Pokémon A was used in actual battle. However, Antar mentioned that actual usage was based on gathering information from abilities. I did that, and the calculations I did were WAY off. Am I doing something wrong, then?

    4. Teammates. You've probably heard this one before. Now I've read that it's equal to P(X|Y)-P(Y), which explains the occasional negative data. However, for P(Y), would I then use usage or raw count?

    I'm sorry if these questions have been answered before, but these questions have been bugging me for a while now and I want to understand.
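
For anyone else poking at the same files, here is a minimal sketch of one way the weighted numbers in questions 1 and 2 could be turned into fractions. It is a guess, not the documented method: it assumes every counted set carries exactly one ability, so the ability weights sum to the Pokémon's total weight, and that the same total is a reasonable denominator for moves. The `entry` dict, its keys, and its numbers are invented for illustration, and `weight_shares` is a hypothetical helper.

```python
def weight_shares(weights, total=None):
    """Divide each weight by a total to get a fraction.

    If no total is given, the sum of the weights themselves is used.
    """
    if total is None:
        total = sum(weights.values())
    if total == 0:
        return {name: 0.0 for name in weights}
    return {name: w / total for name, w in weights.items()}


# Hypothetical entry shaped like one Pokémon's record; values are invented.
entry = {
    "abilities": {"intimidate": 900.0, "sheerforce": 100.0},
    "moves": {"earthquake": 950.0, "stealthrock": 600.0, "uturn": 800.0},
}

# Assumption: one ability per set, so this sum is the Pokémon's total weight.
total_weight = sum(entry["abilities"].values())

ability_shares = weight_shares(entry["abilities"])
# A set has several moves, so move weights are divided by the ability total
# rather than by their own sum; the resulting fractions need not add up to 1.
move_shares = weight_shares(entry["moves"], total=total_weight)

print(ability_shares)
print(move_shares)
```

If the fractions produced this way do not match the published moveset percentages, the one-ability-per-set assumption or the choice of denominator is the first thing to question.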
  24. david0895

    david0895

    Joined:
    Jun 3, 2015
    Messages:
    528
    Antar, how is the data extraction going? Do you know whether we'll have to wait a few days or more like a week?
  25. Martin

    Martin Like the hole in a doughnut
    is a Forum Moderator, is a Live Chat Contributor
    Moderator

    Joined:
    Jan 9, 2013
    Messages:
    6,638
    In the past we usually got stats 5 to 10 days into the month, so it'll probably be around the same on the new servers.
    Carfer97 and Reviloja753 like this.
