I'm very excited to announce the completion of two projects I've been working on for the past couple of months:
1. A MySQL database containing a bunch of Pokemon data (e.g. abilities, moves, Pokemon) and relationships between them.
2. A GraphQL API for querying the data in this database.
First, let me say that I can't really afford to host these myself, and I haven't implemented security features in the API (i.e. limits on how much users can query it), so it's not yet ready to deploy. However, virtually all of the data is there, and you can run both on your own machine by following the instructions here and here. In a month or so I'll look into how I can host it affordably so that this setup isn't necessary.
What's a GraphQL API?
Many of you have probably used REST APIs in your apps. PokeAPI is one example. The downside of REST APIs is that you don't have much control over the data that gets sent back once you make a request. Let's say your program needs to know what type Gengar is. If you request https://pokeapi.co/api/v2/pokemon/gengar, you get a JSON response which contains that information, but it also contains all the other information on Gengar.
A GraphQL API is like a REST API in that it allows your app to fetch data, but it is based on the GraphQL language (Graph Query Language), which allows you to be very specific about which information you get. To get the typing of Gengar, we'd just run:
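To make the contrast concrete in text form, here is a rough Python sketch. The REST part uses a trimmed, hand-written stand-in for the PokeAPI response; the GraphQL field names (`pokemonByName`, `types`, `name`) are hypothetical placeholders and may not match the actual poke-gql schema, so check the repo's READMEs for the real ones.

```python
import json

# Trimmed, hand-written stand-in for the (much larger) PokeAPI response.
rest_response = {
    "id": 94,
    "name": "gengar",
    "types": [
        {"slot": 1, "type": {"name": "ghost"}},
        {"slot": 2, "type": {"name": "poison"}},
    ],
    # ...the real payload also carries moves, sprites, stats, and more.
}

# REST: the client receives everything and filters afterwards.
types_from_rest = [entry["type"]["name"] for entry in rest_response["types"]]

# GraphQL: the query itself names the only fields the client wants.
# NOTE: hypothetical field names, for illustration only.
gengar_types_query = """
{
  pokemonByName(name: "gengar") {
    types { name }
  }
}
"""

# A GraphQL request body is JSON with the query string under "query".
graphql_body = json.dumps({"query": gengar_types_query})

print(types_from_rest)
```

The point is where the filtering happens: with REST the client throws away what it doesn't need, while with GraphQL the server never sends it.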
Now we (or our app) get only the information we need. But we can go further with GraphQL. What if we want to know the defense matchups of Gengar's type (i.e. how does it handle other attacking types)? Our query is "Get the Types that Gengar has, and for each of those Types look up their Type matchups." We can add just a few more lines to get that (I've added some limit arguments to fit the result on screen):
You can nest your queries as deep as you like, and query as many fields as you like. Thus, for any relationship you could think of, you can query it: which abilities resist the water type, and which Pokemon have those abilities? Ask:
Below I've linked some sources where you can learn the basics of the language so that you can write your own queries. Before that though, here are some other examples:
Examples
Let's say you wanted to know (for some reason...)
- Abilities that start with "l"
- The names of these abilities
- Which Pokemon introduced after Generation 4 can have these abilities, and which ability slot the ability occupies on those Pokemon
- Whether any of these abilities resist elemental Types, and by what factor
But doesn't Lightning Rod resist Electric-type attacks? Notice that I asked for 'generation: 4' at the top. Let's change that argument to 'generation: 5' (I also changed a 'limit' argument to make it fit):
See? Now the resistance shows up (lower right)!
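Switching generations like this maps naturally onto GraphQL variables: the query text stays the same and only the variable value changes. The sketch below assumes a top-level `generation` argument as described above; every other name (`abilities`, `startsWith`, `resistsType`, `multiplier`) is a hypothetical placeholder, not necessarily the real schema.

```python
import json

# NOTE: apart from `generation`, all field and argument names here are
# hypothetical placeholders used only to illustrate variables.
ABILITIES_QUERY = """
query AbilitiesByGeneration($generation: Int!) {
  abilities(generation: $generation, startsWith: "l") {
    name
    resistsType {
      name
      multiplier
    }
  }
}
"""


def payload_for_generation(generation: int) -> dict:
    """Pair the fixed query text with a generation-specific variable."""
    return {"query": ABILITIES_QUERY, "variables": {"generation": generation}}


gen4 = payload_for_generation(4)
gen5 = payload_for_generation(5)

# Same query, different variables.
print(json.dumps(gen4["variables"]), json.dumps(gen5["variables"]))
```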
One more example: Let's say we're making a "fun" meme team, and we want to know how to raise our secondary effect chance. Then we can ask how to do that:
- Which abilities modify secondary effect chance?
- Which field states (terrains, weathers, entry hazards, etc.) modify secondary effect chance?
(You can also ask which items and moves modify secondary effect chance, of course.) So we know that we'll need Serene Grace, plus Water Pledge and Fire Pledge (at the moment the API doesn't represent the fact that you need to use those moves TOGETHER). Also, you can ask how many abilities modify secondary effect chance (it's 3, but I chose the fetch limit to be 2 to fit it all on my screen). On the beautiful day that Game Freak adds more secondary effect chance-boosting abilities, you can find them here.
How I made it
This project has two components: the MySQL database, and the GraphQL API which queries it. I posted the links to the GitHub repositories at the top. In both repos, I have several READMEs explaining what each piece of the code does.
The first repo, poke-db, is all about inserting the data into a MySQL database so that the API can query it. For the data, I scraped most of it from hundreds of Bulbapedia pages and tables using Python (I copied the two learnset.js files on the Pokemon Showdown repo for the Pokemon learnsets instead of scraping them). Then I did a couple steps of processing using Python and then JavaScript (those are the two languages I know...) before inserting it into the database. The src/data folder has a lot of .csv and .json files, which you're free to use as well. However, you need a MySQL database to really represent all the relationships between everything.
The second repo, poke-gql, is where I write the code for the GraphQL API that actually queries the database. This sets up the actual server (I use Apollo Server) where you can query the API by going to http://localhost:4000/ (this is where I took my screenshots).
How to use it
Like I said at the top, I don't think I can deploy this myself for a while. I'm currently looking for a job, so I don't have a stable income (but I'm doing OK, don't worry about me) and don't think I can take on the hosting costs right now. I also just need to research how to host projects like this. I'm relatively new to web development, so I haven't done my research on where to host my projects yet.
I wrote instructions on how to set it up on your own machine in each of the repositories. Unfortunately, you also need to have MySQL installed. However, you don't need to perform any MySQL queries yourself, except a 'CREATE DATABASE' to make an empty database to put the data into.
If you were to get it all running on your machine, then you could use something like Apollo Client to use the API in your own apps. You're free to fork this project for that purpose, but please let me know if you do so. I'd like to know that someone's using my API!
In addition to the GraphQL docs, there's also this site, which has a pretty good tutorial on how to use GraphQL. I've used the 'graphql-node' tutorial on there, and I'm planning on doing the 'React + Apollo' track in a couple weeks before I start working on my own app for this API (coming soon...). Just be aware that this tutorial was created by Prisma, a tech company whose product is used heavily in the tutorial. It's an ORM (basically a tool to help you work with databases), which seems pretty good from my brief experience with it in the tutorial, but I didn't use it in this project, and there are a lot of other ORMs out there you may want to research before using it. It's still a good tutorial, but just keep in mind that you don't need to use Prisma to use GraphQL, and that the main reason it's in the tutorial instead of another ORM is that it's the commercial product of the tutorial's authors.
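If you only want to poke at the locally running server without a full client library, a plain HTTP POST is enough. The following is a minimal sketch using Python's standard library, assuming the server is listening at http://localhost:4000/ as described above. The helper names `build_request` and `run_query` are made up for this example, and `{ __typename }` is used because it is valid against any GraphQL schema.

```python
import json
import urllib.request

GRAPHQL_URL = "http://localhost:4000/"  # local Apollo Server address


def build_request(query, variables=None, url=GRAPHQL_URL):
    """Build (but do not send) a GraphQL POST request."""
    body = json.dumps({"query": query, "variables": variables or {}})
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(query, variables=None, url=GRAPHQL_URL):
    """Send the query and return the decoded JSON response."""
    request = build_request(query, variables, url)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the request needs no running server; sending it does.
request = build_request("{ __typename }")
print(request.get_method(), request.full_url)

# With the server running locally, this would send the query:
# result = run_query("{ __typename }")
```

For a real app, a dedicated client such as Apollo Client adds caching and other conveniences on top of this same request shape.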
Closing
Thank you for reading. I'm going to take a 2-week break from this project for the holidays. I'll be pretty busy after that (gotta start getting ready for job interviews), so I won't be able to make any major changes to it. However, if you notice any incorrect data, like a move having the wrong power or something (which there almost certainly is), or any other bugs, please let me know; I'll fix them when I can.
In the future, I'd like to find a data source other than scraping Bulbapedia. A lot of the data is in tables, and the scraping code depends heavily on the shape of those tables. Any time the layout of a table changes, I'd probably have to change the scraping code before I could get new data again. On the other hand, whenever a new generation comes out, assuming the table formats stay generally the same, I could just run all the code again with very little modification and get all the new data (though I'd probably need to write a couple new scripts and modify some others if new mechanics are introduced).
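One common way to soften this kind of layout dependence is to look cells up by header name rather than by column position, so that reordered or added columns don't break the scraper. The sketch below is a generic illustration using only Python's standard library. It is not the code in poke-db, and the sample table is made up rather than copied from Bulbapedia.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the rows of an HTML table as lists of cell strings."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, if any
        self._cell = None  # text pieces of the cell being read, if any

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def rows_by_header(html):
    """Return each body row as a dict keyed by the header row's names."""
    parser = TableParser()
    parser.feed(html)
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


# Made-up sample table, for illustration only.
SAMPLE = """
<table>
  <tr><th>Name</th><th>Type</th><th>Power</th></tr>
  <tr><td>Surf</td><td>Water</td><td>90</td></tr>
  <tr><td>Flamethrower</td><td>Fire</td><td>90</td></tr>
</table>
"""

moves = rows_by_header(SAMPLE)
print(moves[0]["Name"], moves[0]["Power"])
```

This still breaks if a header is renamed or the table is restructured, so it reduces the maintenance burden rather than removing it.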