
Programming Pokemon Global Link Data Scrape

I recently scraped the Pokemon Global Link website's Battle Spot data for USUM and consolidated all of it into a usable spreadsheet format.

https://3ds.pokemon-gl.com/battle/usum/

The following fields were captured:
* FormeID (coming soon)
* Pokedex ID
* Species
* Type1/Type2
* Top Moves used*
* Top Abilities used*
* Top Natures used*
* Top Items used*
* Top Moves used when victorious*
* Top Moves used to defeat it*
* Top teammates**
* Top teammates when victorious**
* Top enemies when defeated**

* Stored as entries "rank, name (of move, ability, etc), frequency ratio" separated by semicolons.
** Stored as entries "rank, name (of Pokemon), FormeID (coming soon)" separated by semicolons
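
For anyone who wants to read the single-asterisk columns programmatically, here is a minimal sketch of a parser for the "rank, name, frequency ratio" entries. The names `UsageEntry` and `parse_usage_cell` are made up for illustration, and the sketch assumes the ratio is written as a plain decimal number; check the actual cell contents in the spreadsheet before relying on it.

```python
from typing import List, NamedTuple


class UsageEntry(NamedTuple):
    """One 'rank, name, frequency ratio' entry (hypothetical structure)."""
    rank: int
    name: str
    ratio: float


def parse_usage_cell(cell: str) -> List[UsageEntry]:
    """Split a semicolon-separated cell into UsageEntry records.

    Assumes each entry looks like '1, Earthquake, 0.85'.
    """
    entries = []
    for chunk in cell.split(";"):
        chunk = chunk.strip()
        if not chunk:
            continue  # skip empty pieces, e.g. after a trailing semicolon
        # Rank is the first field and ratio the last, so a name that
        # happens to contain a comma is still kept in one piece.
        rank_text, rest = chunk.split(",", 1)
        name, ratio_text = rest.rsplit(",", 1)
        entries.append(UsageEntry(int(rank_text), name.strip(), float(ratio_text)))
    return entries


if __name__ == "__main__":
    # Illustrative values only, not taken from the real data.
    for entry in parse_usage_cell("1, Earthquake, 0.85; 2, Stealth Rock, 0.62"):
        print(entry.rank, entry.name, entry.ratio)
```

The double-asterisk teammate columns would need a slightly different parser, since their third field is a FormeID rather than a number.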

After parsing the first run, I realized that forme differences are stored in the image URLs' unique IDs, so I'm rerunning the program to update those, and will then create a convenient lookup table for them.

https://docs.google.com/spreadsheets/d/1mbpAq6lDPkWEt1Mn1SofR__UHU4xgTboHkd9QIyihA0/edit#gid=0
 
Do you by chance know where I can get singles match data with outcomes? I'm working on a machine learning project where I would like to predict the outcomes of matches. I have 1v1 data, but it is very minimal (no items, moves, natures, or EVs), and I do not think my results will generalize well to 3v3 or 6v6 matches, which is what I am ultimately interested in. Preferably, I would like data that includes the Pokemon on each team as well as their moves, natures, items, abilities, and EVs, but I could get by on data with only the two players' teams and the winner. Thanks to you and your hard work, I can use a probabilistic approach to bridge that information gap.

I would appreciate any leads you can give me!

PS.
I don't really care which platform the battles took place on (Showdown, WiFi, etc.) as long as they are actual games played by humans.
 