Data Official Smogon University Usage Statistics Discussion Thread, mk.3

Zarel

Not a Yuyuko fan
is a member of the Site Staff, is a Battle Server Administrator, is a Programmer, is a Pokemon Researcher, is an Administrator
Creator of PS
#53
I have no clue. I'm gathering May stats on a new server and everything is normal so far.

I'm not usually involved in any of the rest of the process. I suspect Antar is busy with lots of stuff so it's possible we'd end up with no tier shifts for multiple months?
 

Dussky

formerly UB-013
#55
When do you assume these tiers will be out? The reason we need new tiers monthly is so we can establish every tier. We should have NU at the moment, but we don't. :/
 
#60
Zarel Is there any way to get access to the raw sets?
I would also be interested in said stats, because there are a lot of really cool tools that could be created with a little bit of data science knowledge, but AFAIK the restricted level of data released is a Smogon policy decision. With complete information, it would be trivial to scout individual players' teams before or during a match, and that could theoretically have a negative impact on team-building creativity.
 
#61
I am only interested in individual Pokemon's stats, not full teams. Also, usernames wouldn't have to be included.
 

Zarel

Not a Yuyuko fan
is a member of the Site Staff, is a Battle Server Administrator, is a Programmer, is a Pokemon Researcher, is an Administrator
Creator of PS
#62
Zarel Is there any way to get access to the raw sets? The fact that the stats are provided as independent features leaves out a lot of interesting information. I'm trying to build a Bayesian Network that can, for instance, determine the probability of Landorus-Therian having Stealth Rock given that it's holding Rocky Helmet.
Antar would be the one to ask about that sort of thing.
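
For anyone wondering why the independent per-feature stats aren't enough for the Bayesian Network idea quoted above, here is a minimal sketch. It assumes access to raw sets, which the published stats do not provide; the `sets` list and both helper names are invented for illustration and are not real usage data.

```python
# Toy raw sets for one Pokemon: (held item, whether the set runs Stealth Rock).
# These rows are made up for illustration only.
sets = [
    ("Rocky Helmet", True),
    ("Rocky Helmet", True),
    ("Rocky Helmet", False),
    ("Choice Scarf", False),
    ("Choice Scarf", False),
    ("Leftovers", True),
]


def marginal_rock(sets):
    """P(Stealth Rock): the kind of independent figure the stats expose."""
    return sum(has_rock for _, has_rock in sets) / len(sets)


def conditional_rock(sets, item):
    """P(Stealth Rock | item), which needs the item and move on the same set."""
    matching = [has_rock for held, has_rock in sets if held == item]
    if not matching:
        raise ValueError(f"no sets hold {item!r}")
    return sum(matching) / len(matching)


p_rock = marginal_rock(sets)
p_rock_given_helmet = conditional_rock(sets, "Rocky Helmet")
p_rock_given_scarf = conditional_rock(sets, "Choice Scarf")
```

In this toy data the conditional differs from the marginal for both items, and that difference cannot be recovered from the item and move percentages alone.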
 

Antar

is a Battle Server Administrator, is a Programmer, is a Super Moderator, is a Community Contributor
Official Data Miner
#67
G-Luke, I guess the ideal situation would be to generate banlists based on March/April/May like we used to, but I really doubt April will be done in time.

I'm cool using March/May or Feb/March/May, but the point is, no one is proposing we update tiers 10 days before June based on March alone.
 
#68
With new servers and such, is it expected that stats will be available earlier in the month? Obviously there's still all the same info it has to process, but maybe new server = faster server?
 

Antar

is a Battle Server Administrator, is a Programmer, is a Super Moderator, is a Community Contributor
Official Data Miner
#71
Is it your PC that takes the data from the server, or does the server process the data itself?
All data processing happens on the server. The amount of time it would take to copy the files to my own machine would completely cancel out any performance gains from running on a dedicated machine. Not to mention the costs (for both PS and myself) of the additional bandwidth.
 
#72
I believe it makes the most sense to just use the stats for May. Although one of them was earlier this month, two Pokemon have been banned since March and that does lead to some changes in viability and such. Not to mention just using the May stats means we have tiers that more accurately reflect the current metagame.

Just giving my 2 cents.
 
#73
Ok, I hate to do this, especially since this question has probably been posted before and I hate to be annoying, BUT...

I'm building a program to help people build competitive teams. Just trying to compact all of the information into something that a normal person can handle :). So I'm using the JSON files in chaos files, and that has been extremely helpful. BUT some of these numbers I just don't get. So let me try to sum up my questions as much as possible:

1. Every Pokémon has an abilities object, and each ability has a number, presumably to tell how popular that ability is. However, I don't understand where these numbers come from. Sure, they're weighted, but by what? The Pokémon's raw count? Well, that doesn't make sense, because then the calculation results don't make sense. The usage then? That doesn't make sense either, unless I'm not seeing it.

2. Every Pokémon has a moves object, and each move in that object has, again, a number. Now I read something about that from a post earlier in this thread, but it still evades my comprehension. Again, these numbers are weighted, but how and probably more importantly, WHY?

3. Usage is a funny one. A number between 0 and 1, probably used to represent the amount of times that Pokémon A was used in actual battle. However, Antar mentioned that actual usage was based on gathering information from abilities. I did that, and the calculations I did were WAY off. Am I doing something wrong then?

4. Teammates. You've probably heard this one before. Now I've read that it's equal to P(X|Y)-P(Y), which explains the occasional negative data. However, for P(Y), would I then use usage or raw count?

I'm sorry if these questions have been answered before, but these questions have been bugging me for a while now and I want to understand.
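
One way to sanity-check questions 1, 2 and 4 is sketched below. It rests on assumptions that nothing in this thread confirms: that the numbers under each ability or move are weighted counts which can be turned into shares by dividing by their own sum, and that the teammate figure is a conditional probability minus a baseline probability, as the quoted formula suggests. The function names and the sample numbers are hypothetical.

```python
def shares(weighted_counts):
    """Turn weighted counts into fractions of their own total.

    Assumption: the values are weighted counts, so dividing by their
    sum gives each entry's share regardless of the weighting scheme.
    """
    total = sum(weighted_counts.values())
    if total == 0:
        return {name: 0.0 for name in weighted_counts}
    return {name: count / total for name, count in weighted_counts.items()}


def teammate_delta(conditional, baseline):
    """Conditional probability minus baseline probability.

    Both arguments are fractions between 0 and 1. A negative result means
    the pairing shows up less often than the baseline alone would suggest.
    """
    return conditional - baseline


# Made-up numbers, not taken from any real stats file.
ability_shares = shares({"Ability A": 900.0, "Ability B": 100.0})
delta = teammate_delta(conditional=0.10, baseline=0.25)
```

Whether the baseline should be built from usage or from raw count is exactly what question 4 asks, so the sketch takes it as an input rather than choosing one.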
 
