Programming Yet Another Pokemon Showdown Battle-Bot

Hello Smogon,

I'd like to share yet another attempt at making a worthy Pokemon Showdown battle bot. This bot takes an approach similar to most others - transpose through the game as far as time permits, evaluate the resulting states, and make a decision about what the best move is.

In an attempt to remain competitive against humans, I have made the decision logic not completely deterministic. If there are multiple good options, the bot may choose a different move if faced with the same situation again.

Github Repository:
https://github.com/pmariglia/showdown
Results
I've been using the account NiceNameNerd for battling/testing.
All of these games were played by the bot - I mostly test random battles because the spreads are predictable (Serious + 85/85/85/85/85/85), but the bot can play most single battle formats.
[Image: attachment 169249]

PS. Ignore Battle-Factory - those games were played a long time ago

Replays
Gen7RandomBattle replays:
https://replay.pokemonshowdown.com/gen7randombattle-888793347
https://replay.pokemonshowdown.com/gen7randombattle-888796020
https://replay.pokemonshowdown.com/gen7randombattle-874410489
Gen4OU replays vs. Technical Machine (created by David Stone - thanks for battling me David!)
Game 1
Game 2
Game 3
 

Austin

Schismatic
Programmer, Community Contributor, Forum Moderator Alumnus, Battle Simulator Moderator Alumnus
Very cool project, I love seeing these battle bots pop up every now and again.
In an attempt to remain competitive against humans, I have made the decision logic not completely deterministic. If there are multiple good options, the bot may choose a different move if faced with the same situation again.
I can 100% see where you are coming from, but sadly this also brings the downside of possibly choosing a bad move now that your bot doesn't want to choose the same move. I don't know the exact logic and I see the word "may", but I would pinpoint this as the reason it has a ~50% win rate.
 
You're absolutely correct, and for this reason I made the threshold for this extremely small. Most of the time only one move is the option because that move is better than the others.

Let me give an example. The image below is the evaluation result of a two-turn transposition. The state is a Keldeo against a Manaphy. The bot's moves are the rows and the opponent's moves are the columns.
[Image: attachment 169261]


Highest priority is always given to the safest move (i.e. the move whose worst-case score, assuming the opponent plays the best response, is the highest). In this situation, that is secretsword.

However, the opponent has revealed gliscor and moltres, two pokemon that, if switched in, would make hydropump the bot's best move. Even knowing this, hydropump is not an option because it is too risky - a psychic or energyball from Manaphy would hurt too much after it resists hydropump. This evaluation results in a 100% secretsword choice.
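To make the "safest move" rule concrete, here is a minimal sketch in Python. It is not the bot's code: the function name `safest_move`, the `scald` row, and every number in the example grid are made up for illustration. It only shows the rule of taking each row's worst case and picking the row where that worst case is highest.

```python
def safest_move(scores):
    """Return (move, worst_case) for the row whose minimum score is largest.

    `scores` maps each of the bot's moves to {opponent_move: evaluation},
    where a higher evaluation is better for the bot.
    """
    worst_cases = {move: min(row.values()) for move, row in scores.items()}
    best = max(worst_cases, key=worst_cases.get)
    return best, worst_cases[best]


# Hypothetical evaluation grid; these numbers are NOT the bot's real output.
example = {
    "secretsword": {"psychic": -20, "energyball": -15, "switch gliscor": 10},
    "hydropump": {"psychic": -60, "energyball": -55, "switch gliscor": 80},
    "scald": {"psychic": -50, "energyball": -45, "switch gliscor": 40},
}

move, worst = safest_move(example)
print(move, worst)  # secretsword -20
```

In this made-up grid hydropump has the single best cell (80 on the gliscor switch), but its worst case is far lower than secretsword's, so it loses out.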

Furthermore, if additional moves are considered, their probability of being chosen will be lower than the safest move. Through my experimental testing, I've found that this logic performed better than simply choosing the safest move.
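The post doesn't give the exact formula for how extra moves are weighted, so the following is only one possible reading, sketched in Python. The threshold value, the linear weighting, and all names (`choose_move`, `worst_cases`) are assumptions of this sketch. A move is considered only if its worst-case score is within `threshold` of the safest move's, and its weight shrinks the further it falls behind.

```python
import random


def choose_move(worst_cases, threshold=5.0, rng=None):
    """Pick a move, usually the safest one.

    `worst_cases` maps each move to its worst-case score (higher is better).
    Moves within `threshold` of the best score are candidates. A candidate's
    weight is `threshold` minus its distance from the best, so no candidate
    ever outweighs the safest move.
    """
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    if rng is None:
        rng = random.Random()
    best = max(worst_cases.values())
    candidates = {
        move: threshold - (best - score)
        for move, score in worst_cases.items()
        if best - score < threshold
    }
    moves = list(candidates)
    weights = [candidates[m] for m in moves]
    return rng.choices(moves, weights=weights, k=1)[0]


# Hypothetical worst-case scores: only secretsword is within the threshold,
# so it is chosen every time.
print(choose_move({"secretsword": -20, "hydropump": -60, "scald": -50}))
```

With a small threshold, most positions leave a single candidate, which matches the "100% secretsword" outcome described above.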

Regarding the 50% win rate: I've played several hundred games with the bot, and because of matchmaking one would expect the win rate to approach 50% eventually.
 
Update!

Alright Austin I concede defeat. After a bunch of testing I decided to drop the non-deterministic nature of the decision logic in favour of various caching mechanisms and being able to search an additional depth in a reasonable amount of time.

Still mainly focusing on random battles due to them being a bit simpler (I can boast better numbers in this format). I've been able to maintain a ~65% GXE in RandomBattles (~1450-1650 Elo).

Main ladder is... admittedly tougher. I suspect it is a combination of the additional unknowns (EVs + natures + movesets) plus the opponents on the main ladder being generally more skilled. I have trouble getting the bot much further than the ~50% GXE range (~1250-1350 Elo) in the gen7ou ladder.

There are still a few more things I'd like to do that I think would improve the performance. Firstly, I'd like to have the ability to score a Pokemon's worth differently based on what the opponent's Pokemon are. I feel something like this would help with the main ladder as preserving win-conditions is a natural strategy.

Secondly, I'd like to do some parameter tuning for the evaluation function. I've more or less stuck with the same parameter weights for the life of the bot. I am almost certain that some improvements can be realized by tuning the parameters.
 

Austin

Deadly, sounds good ! :)
 
A perfect battle bot must be non-deterministic though. https://en.wikipedia.org/wiki/Nash_equilibrium
Well, I guess the right word is random, but you can add deterministic "randomness" with a seeded PRNG.

This assumes an opponent that will try to learn your strategy and counteract it, which might not be the case for randbats.
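For readers unfamiliar with the idea, here is a small Python sketch of both points: a mixed (randomized) equilibrium strategy, and reproducible "randomness" from a seeded PRNG. It only covers the 2x2 zero-sum case with integer payoffs, using the standard closed-form solution. The function names are made up for this example, and a real battle grid is much larger and would need a general solver.

```python
import random
from fractions import Fraction


def mixed_strategy_2x2(a, b, c, d):
    """Row player's equilibrium strategy for the zero-sum game [[a, b], [c, d]].

    Entries are integer payoffs to the row player. Returns (p, value), where
    p is the probability of playing the first row.
    """
    row_maximin = max(min(a, b), min(c, d))
    col_minimax = min(max(a, c), max(b, d))
    if row_maximin == col_minimax:
        # Saddle point: a pure strategy is already optimal.
        p = Fraction(1) if min(a, b) >= min(c, d) else Fraction(0)
        return p, Fraction(row_maximin)
    # No saddle point: mix so that both opponent columns give the same payoff.
    denom = a - b - c + d
    return Fraction(d - c, denom), Fraction(a * d - b * c, denom)


def sample_row(p, rng):
    """Return 0 (first row) with probability p, otherwise 1."""
    return 0 if rng.random() < p else 1


p, value = mixed_strategy_2x2(1, -1, -1, 1)  # matching pennies
print(p, value)  # 1/2 0

rng = random.Random(1234)  # fixed seed: the same sequence of picks on every run
picks = [sample_row(p, rng) for _ in range(10)]
```

Re-creating `random.Random(1234)` and sampling again yields the same `picks`, which is the deterministic "randomness" mentioned above.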
Actually my first implementation of the decision logic was using the Nash Equilibrium. It was awful. My assumption at the time was that the bot needed to see deeper into the game to be more effective, so I ditched that path in favour of being able to prune paths in the tree.

The problem with using the Nash Equilibrium over the safest move is that the entire grid (like the one in my comment above) needs to be evaluated so that the equilibrium decision(s) can be determined. When only considering the safest decision, I can typically prune about 70% of those calculations and search an additional turn ahead as a result.
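A rough Python sketch of why the safest-move rule allows pruning: once a row's running worst case is no better than the best row found so far, the remaining cells in that row can't change the answer and don't need to be evaluated. The function name, the `evaluate` callback, and the grid values are assumptions of this sketch, not the bot's implementation.

```python
def safest_move_with_pruning(my_moves, opp_moves, evaluate):
    """Maximin search that skips cells which can't change the answer.

    `evaluate(my_move, opp_move)` returns a score (higher is better for the bot).
    Returns (best_move, best_worst_case, evaluations_performed).
    """
    best_move, best_worst = None, float("-inf")
    evaluations = 0
    for mine in my_moves:
        row_worst = float("inf")
        for theirs in opp_moves:
            evaluations += 1
            row_worst = min(row_worst, evaluate(mine, theirs))
            if row_worst <= best_worst:
                break  # this row is already no better than the best so far
        if row_worst > best_worst:
            best_move, best_worst = mine, row_worst
    return best_move, best_worst, evaluations


# Hypothetical 3x3 grid; these values are made up for the example.
grid = {
    "secretsword": {"psychic": -20, "energyball": -15, "switch": 10},
    "hydropump": {"psychic": -60, "energyball": -55, "switch": 80},
    "scald": {"psychic": -50, "energyball": -45, "switch": 40},
}
result = safest_move_with_pruning(
    list(grid), ["psychic", "energyball", "switch"], lambda m, o: grid[m][o]
)
print(result)  # ('secretsword', -20, 5)
```

Here only 5 of the 9 cells are evaluated. A Nash-equilibrium solver needs all 9, which is the trade-off described above.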

It's been a while since I used Nash Equilibrium as the decision logic. It's possible that the previous version was awful because the bot didn't understand the rules of Pokemon as well as it does now. I'm curious to see how it would perform now in a main ladder game. I'll test it out when I have some time.
 
Well, I dug up some old code and tried using the Nash-Equilibrium as the decision making method. It wasn't as bad as I thought.

This also includes a good amount of other improvements to the bot, mostly with understanding more niche abilities that I was too lazy to initially make logic for.

Here is an excerpt from the updated README of my repo explaining the results:

In general the bot will perform much better in random battles and when using the "safest" decision making method.

These are the bot's rankings after several hundred games in each of the OU, PU, and RandomBattle formats using the "safest" decision making method:
[Image: attachment 189044]


The non-deterministic Nash-Equilibrium decision making has similar results for standard formats, but poorer results for random battles:
[Image: attachment 189045]



My best guess is that the large difference in random-battle performance comes from team preview (or the lack thereof).
 
I love showdown bots and yours seems to perform pretty well. How deep can the bot search, which abilities did you include and does the bot understand moves like quash and soak?
 
I love showdown bots and yours seems to perform pretty well
Thanks :)

How deep can the bot search
The standard search depth is 2-turns ahead. This is configurable but going to 3 & above could (read: probably will) cause the bot to lose on time. I've given up on optimizing to search further into the game because I've found that I get much better performance if instead I focus on making the bot understand the mechanics of the game better.

which abilities did you include and does the bot understand moves like quash and soak?
I focus on moves/abilities/items that directly affect competitive play.

Quash is not understood at the moment and unless I turn focus to doubles I don't see myself putting effort into that.
Soak is an interesting one. There is no mechanic for understanding type-changes right now but that is something I would like to do soon. Mostly for the ability Protean - a Protean Greninja will always be Water/Dark in the eyes of the bot. Once that is completed Soak will be pretty straightforward as well.


Also as a general update, I've taken a shot at implementing the following strategy:
I'd like to have the ability to score a Pokemon's worth differently based on what the opponent's Pokemon are. I feel something like this would help with the main ladder as preserving win-conditions is a natural strategy.
The results look pretty good. I've seen an increase in performance in the standard battle formats (OU, PU). I've also formalized my testing process to be a bit more standardized: 75 battles on a fresh account.

These are the types of results I get with the current implementation of the bot:

[Image: attachment 192440]
 
Amazing job!

I have been trying this bot out, and the highest it got was 1500 in OU, although that seems to have been a very rare exception because most of the time it hovers around 1150-1200.

I watched it battle hundreds of times and have some observations that might help you improve it:

- It seems like it doesn't take into account ditto's choice scarf
- It got into a 1v1 situation having a clefable with life orb against a wailmer, and chose to use flamethrower instead of thunderbolt. Then used thunderbolt, and then moonblast.
- Sometimes sends out the wrong pokemon. Many times I have seen it send tyranitar out (after previous pokemon fainted) when the opponent has excadrill, resulting in an earthquake KO. It had most of the time also Corviknight healthy in the back.
- Related to the previous point, sometimes it decides to sac most of the team. It sends a pokemon out, then switches and dies to an attack, then sends again the same pokemon as the first time, switches again and dies, etc.
- I tried it with a Dracovish team, and it seems to not consider when opponent has storm drain.
- Bot got its choice scarf knocked off, and still was not changing moves when clearly there were better choices.
- On one occasion it had only scarf darmanitan left, and opponent had pelipper, charizard and ferrothorn, all of them below 30%. It decided to click U-turn, when clearly that move lost it the game and with others it could have possibly won.
- It lets opponents set up easily, many times it had excadrill vs Corviknight and was just using Iron Head (no dmg at all) while opponent was bulking up.
- I believe it is not updating the possible sets of each pokemon with relevant data gathered every turn.

I hope some of this gives you some information that could be used to improve this bot. It is very exciting and thanks a bunch for putting so much effort into it!
 
All excellent points.

It seems like it doesn't take into account ditto's choice scarf
Oof. Yeah the whole ditto transforming aspect is not understood. I've had some problems with that.

It got into a 1v1 situation having a clefable with life orb against a wailmer, and chose to use flamethrower instead of thunderbolt. Then used thunderbolt, and then moonblast.
Sometimes sends out the wrong pokemon. Many times I have seen it send tyranitar out (after previous pokemon fainted) when the opponent has excadrill, resulting in an earthquake KO. It had most of the time also Corviknight healthy in the back.
Related to the previous point, sometimes it decides to sac most of the team. It sends a pokemon out, then switches and dies to an attack, then sends again the same pokemon as the first time, switches again and dies, etc.
I tried it with a Dracovish team, and it seems to not consider when opponent has storm drain.
Hard to say exactly what it was thinking without logs of the battles. I've certainly observed some bonehead decisions as well. I can say for certain that any items/moves/abilities used with a high likelihood are accounted for (likelihoods are acquired from the previous month's usage data: https://www.smogon.com/stats/2019-11/moveset/gen8ou-0.txt) and that the effects of stormdrain are known.


Bot got its choice scarf knocked off, and still was not changing moves when clearly there were better choices.
This one's odd. I tested it out and it certainly has the ability to use other moves: https://replay.pokemonshowdown.com/gen8ou-1038681160. Here I force the bot to select FireBlast on the first turn and then let it take over. My best guess is that it predicted a switch in your situation, but it's impossible to know without the logs.

A pattern you can probably see is that because it only sees two turns ahead it does not really understand the longer-term strategy of the battle. It also seems to be exploitable and can be set-up on quite easily in certain situations.

Minimax is truly an awful way to try to play competitive Pokemon.


I believe it is not updating the possible sets of each pokemon with relevant data gathered every turn.
It isn't! Well, mostly not. I have some logic that will infer a choice-scarf if there is absolutely no other way the opponent could have gone before the bot. What I believe you're referring to is ruling out certain sets/spreads after doing/taking damage. Having the insight to do this is a HUGE aspect of competitive Pokemon, minimax or otherwise. It is what I am working on next.
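The choice-scarf inference described above could be sketched roughly like this in Python. The function names are hypothetical, and the check makes loud assumptions: both moves have the same priority, and there are no Speed boosts, abilities, paralysis, Tailwind or Trick Room in play. It uses the standard stat formula to find the fastest the opponent could be without an item and compares that with the bot's own Speed.

```python
def max_speed(base_speed, level=100):
    """Highest unboosted Speed stat: 31 IVs, 252 EVs, Speed-boosting nature."""
    neutral = (2 * base_speed + 31 + 252 // 4) * level // 100 + 5
    return neutral * 11 // 10  # +10% from nature, rounded down


def must_be_scarfed(opp_base_speed, bot_speed, opp_moved_first, level=100):
    """True if the opponent moved first although its fastest spread is slower.

    Assumes equal move priority and no other Speed modifiers (see lead-in).
    A possible speed tie is not treated as proof, hence the strict comparison.
    """
    return opp_moved_first and max_speed(opp_base_speed, level) < bot_speed


print(max_speed(100))  # 328
print(must_be_scarfed(opp_base_speed=80, bot_speed=300, opp_moved_first=True))  # True
```

The same comparison run in reverse (the opponent moved second despite a spread that should have outsped) is one way to start ruling out spreads, which is the follow-up work mentioned above.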
 
Hello,
the idea of making a bot sounds very interesting to me. However, I'm still an introductory CS student in college and know next to nothing about how to undertake/begin a project like this. Do you have any advice on how to get started?
 
Hi, I've tried to get the bot working but I ran into problems while installing the requirements. This is the error: https://pastebin.com/cJZSq0mx. Maybe I'm doing something wrong, though I followed your instructions precisely?
Looks like you're having some trouble installing pandas on Windows. I'd consult a guide on how to do that :)

Hello,
the idea of making a bot sounds very interesting to me. However, I'm still an introductory CS student in college and know next to nothing about how to undertake/begin a project like this. Do you have any advice on how to get started?
It takes time. If you're interested in writing your own from scratch I'd recommend familiarizing yourself with websockets, and then read the pokemon-showdown protocol. It's a project I really enjoyed and I learned a lot by doing it - I'm sure you will as well.
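As a starting point for the protocol side, here is a small Python sketch that parses the text a Showdown server sends over the websocket. In the protocol, a message may begin with a `>ROOMID` line, followed by lines of the form `|TYPE|ARG|ARG...`. The function name and the sample room id are made up, and the websocket connection itself is left out so the example runs on its own.

```python
def parse_showdown_message(raw):
    """Split one raw server message into (room_id, [[type, arg1, ...], ...])."""
    lines = raw.split("\n")
    room_id = ""
    if lines and lines[0].startswith(">"):
        room_id = lines[0][1:]
        lines = lines[1:]
    events = []
    for line in lines:
        if not line.startswith("|"):
            continue  # simplification: ignore blank lines and plain-text lines
        # Drop the empty string before the first "|". Splitting on every "|"
        # is also a simplification: chat text may itself contain "|".
        events.append(line.split("|")[1:])
    return room_id, events


raw = (
    ">battle-gen7randombattle-1\n"
    "|move|p1a: Keldeo|Secret Sword|p2a: Manaphy\n"
    "|-damage|p2a: Manaphy|62/100\n"
    "|turn|5"
)
room_id, events = parse_showdown_message(raw)
print(room_id)    # battle-gen7randombattle-1
print(events[0])  # ['move', 'p1a: Keldeo', 'Secret Sword', 'p2a: Manaphy']
```

A bot would feed each event into its own battle-state tracker, e.g. updating HP on `-damage` and resetting per-turn flags on `turn`.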




In other news, here's a Gen8OU update:

Still managing an average of ~1450 Elo, varying from ~1250 to ~1650 over many games played. Unfortunately random battles, which the bot performed much better in, are now very difficult because Dynamax is not properly understood.


[Image: nnn-2020.PNG]
 
Hello, I have been trying to get the bot to run for a few hours now. I have been using Heroku to run the files; however, my application deploys and then is unviewable. When I click the view application button it brings up an error web page from Heroku. Not sure what's wrong, as I incorporated all the files. I think it might be a .env file issue, as I haven't been able to make one using my Mac yet. Any help would be greatly appreciated, thanks!!

So I have been trying to use the run with docker option and everything is built, but the last step does not work for me. When I try to run the .env file, it can't find the directory or file. I have copied the file onto my desktop and into the showdown folder and it still doesn't work. Not sure why; any help would be great, thanks.

Additionally, I would like to look into creating a bot that learns a certain person's play style. Any ideas on how to do that?
 
You will probably need a lot of data from that person playing the game. I once saw a paper from Stanford students that trained their model using replays from the top players ("Artificial Intelligence for Pokemon Showdown" by Khosla, Lin and Qi; however, I can't find it anymore). Maybe this could be helpful for your problem as well.
 
Hello pmariglia, pretty cool project.

Some quick thoughts:
The Engine folder seems maintenance-intensive. I was wondering why you weren't using the official smogon damage calc to get most of the logic. To my knowledge no API is available, but you can still run JS code inside your py project with something along the following lines:

Python:
from Naked.toolshed.shell import muterun_js


def executeJSCalc(file):
    """
    Executes the JS file specified and parses its output.
    file: type is string; example: '~/Downloads/myFile.js'
    Returns (desc, damage), or None if the script exits with an error.
    """
    response = muterun_js(file)
    if response.exitcode != 0:
        return None
    lines = response.stdout.decode().split('\n')
    # First line is the calc description; later lines hold comma-separated damage rolls
    desc = lines[0]
    damage = [
        int(dam.strip())
        for line in lines[1:] if ',' in line
        for dam in line.split(',') if dam.strip() != ''
    ]
    return desc, damage
You obviously need to write the calc JS file which can be written in your py project

Python:
def writeJSCalc(file, gen, attacker, defender, field, move):
    """
    Writes the calc script into a .js file
    """
    # Check how the calc script is written: it might be in TypeScript,
    # in which case it needs to be compiled to produce the JS file.
    s = '...'
    with open(file, 'w') as myfile:
        myfile.write(s)
Accessing the headless showdown battle logs would be useful as accessing replays might be slow
 
Hey hidekov, thanks for the kind words.

You're right. The engine involves a lot of maintenance, and I definitely don't see myself keeping it updated in perpetuity.

I was wondering why you weren't using the official smogon damage calc to get most of the logic
So my engine actually does more than just calculate damage. It is a Pokemon battle engine that will spit out all of the possible transpositions (and their probabilities) that can happen given a state and a pair of moves. I've written a bit about the engine here.

I would've loved to use the official PokemonShowdown battle simulator here (it would've saved a lot of work!), but it works a bit differently. The PokemonShowdown battle simulator will transpose the state by randomly choosing one of the possible outcomes, which is all it really needs to function as a battle server. Outside of perhaps a Monte Carlo tree search algorithm, a battle-bot needs more: you want the bot to see all of the possible things that could happen. For example, when I use thunderbolt as a paralyzed pokemon: I could be fully-paralyzed and do nothing, I could hit, I could hit and paralyze, etc.
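To illustrate the difference between sampling one outcome and enumerating all of them, here is a minimal Python sketch of the paralyzed-thunderbolt example. It is not the engine's code. The function name is made up, and it deliberately ignores damage rolls, critical hits, immunities and anything else that would add further branches. The two probabilities used are the 25% chance of full paralysis and thunderbolt's 10% chance to paralyze.

```python
from fractions import Fraction

FULL_PARALYSIS_CHANCE = Fraction(1, 4)          # paralyzed pokemon fails to move
THUNDERBOLT_PARALYZE_CHANCE = Fraction(1, 10)   # thunderbolt's secondary effect


def thunderbolt_outcomes_while_paralyzed():
    """Enumerate every branch with its probability instead of sampling one."""
    can_move = 1 - FULL_PARALYSIS_CHANCE
    return [
        ("fully paralyzed, no move", FULL_PARALYSIS_CHANCE),
        ("hit, target not paralyzed", can_move * (1 - THUNDERBOLT_PARALYZE_CHANCE)),
        ("hit, target paralyzed", can_move * THUNDERBOLT_PARALYZE_CHANCE),
    ]


for description, probability in thunderbolt_outcomes_while_paralyzed():
    print(description, probability)
# fully paralyzed, no move 1/4
# hit, target not paralyzed 27/40
# hit, target paralyzed 3/40
```

A search can then score each resulting state and weight it by its probability, rather than relying on whichever single outcome a simulator happened to roll.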

Note: The engine I wrote is not perfect. There is a large set of battle mechanics that are not understood. I take the "good enough for competitive single-battles" approach when determining what should be understood. It also does not work for doubles.

On top of this I also was interested in seeing what it would take to write my own engine from scratch.
 
Do you have a list of the battle mechanics that are not understood? I think I could try to implement some of them.

Btw, with what teams did you get to 1450+? the bot rarely goes over 1200 with my teams :(
 
Do you have a list of the battle mechanics that are not understood? I think I could try to implement some of them.
I have a private Trello board that I use to keep track of my personal todo list. I've had a few people express interest like yourself so I may start putting some items on the GitHub issues.

Btw, with what teams did you get to 1450+? the bot rarely goes over 1200 with my teams :(
This team gave me the best results. I haven't done testing in a couple of months since my last post about it though.
 
