Programming Cube On : A battle bot (currently OU no mega only)

Artemis Fowl · May 21, 2015

huge graphical skills were needed for that

Summary

Hello, for about ten months I've been programming a battle bot, stopping and resuming the project at random. It has now reached the level of a beginner who never anticipates anything. That's clearly not enough to win matches, but I thought it was interesting to present anyway, especially to show how the AI thinks.

It is written in Python, and the code is at the end. It does not depend on the tier: it only needs a set database to work in any tier. It also adapts to whatever team it is given.

Let's jump straight to the AI :

How the AI proceeds

The bot uses two databases, chosen according to the tier it's playing in. (Right now, OU.)
One of them is the huge statistics DB available to everyone: http://www.smogon.com/stats/2015-04/
The other is one I make myself. It contains the most common sets for the most common pokemon.

Using these, the bot will, as soon as it sees the opponent's team, make a "guess" about their sets. If a pokemon is in the handmade set database, the bot assumes it runs its most common set. If it is not, the bot builds a (most likely stupid) set from the statistics: most common item, ability, EV spread, and the 4 most used moves. This leads to some stupid situations. For instance, if a pokemon uses Fire Blast half the time and Flamethrower the rest of the time, the bot could end up guessing it has both. But that is easily solved by extending the database.
Also, if it meets a pokemon that doesn't even appear in the usage stats, well, it crashes, but ... just why did the opponent use Surskit?...
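
To make the guessing step concrete, here is a minimal sketch of it, not the code from the pastebins. The function name `guess_set`, the dictionary layout and the species "SomeFireMon" with its usage numbers are all made up for illustration; the real stats files need parsing first.

```python
def guess_set(species, handmade_sets, usage_stats):
    """Guess an opponent's set (all data layouts here are assumptions).

    handmade_sets: species -> list of sets, most common first.
    usage_stats:   species -> {"items": {...}, "abilities": {...},
                               "spreads": {...}, "moves": {...}},
                   each inner dict mapping a name to a usage percentage.
    """
    if species in handmade_sets:
        return handmade_sets[species][0]

    stats = usage_stats.get(species)
    if stats is None:
        return None  # unknown pokemon: let the caller decide instead of crashing

    def most_used(table):
        return max(table, key=table.get)

    # The 4 most used moves, highest usage first.
    top_moves = sorted(stats["moves"], key=stats["moves"].get, reverse=True)[:4]
    return {
        "item": most_used(stats["items"]),
        "ability": most_used(stats["abilities"]),
        "spread": most_used(stats["spreads"]),
        "moves": top_moves,
    }


# Made-up numbers, only there to reproduce the Fire Blast / Flamethrower issue.
usage_stats = {
    "SomeFireMon": {
        "items": {"Leftovers": 60.0, "Air Balloon": 25.0},
        "abilities": {"Flash Fire": 99.0, "Flame Body": 1.0},
        "spreads": {"Modest:252 SpA/4 SpD/252 Spe": 30.0,
                    "Calm:252 HP/4 SpA/252 SpD": 20.0},
        "moves": {"Earth Power": 80.0, "Stealth Rock": 70.0, "Fire Blast": 50.0,
                  "Flamethrower": 45.0, "Toxic": 20.0},
    }
}

guess = guess_set("SomeFireMon", {}, usage_stats)
```

With these numbers the guessed moveset contains both Fire Blast and Flamethrower, which is exactly the kind of silly set described above.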

Once this is done, the bot believes it has an approximate knowledge of the situation. This leads us to the core of the matter: the bot works by situation rating.

The bot is able to rate any situation, and the entire decision process is built on that. When reviewing its options, the bot replicates the game mechanics in order to imagine what the situation would be if a given turn were to happen. It then chooses whatever raises the "situation rating" the most, which ultimately leads to wiping out the enemy team.
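
The "try every option, simulate it, keep the best" loop can be sketched like this. The names `choose_action`, `simulate` and `rate` are placeholders of mine, not functions from Cube's code, and the toy state only tracks two HP percentages.

```python
def choose_action(state, actions, simulate, rate):
    """Return the action whose simulated outcome has the highest rating."""
    return max(actions, key=lambda action: rate(simulate(state, action)))


# Toy stand-ins (hypothetical): a state is (ally %HP, enemy %HP).
def simulate(state, action):
    ally_hp, enemy_hp = state
    if action == "attack":   # deal 40%, take 30%
        return (ally_hp - 30, enemy_hp - 40)
    if action == "protect":  # nothing changes this turn
        return (ally_hp, enemy_hp)
    raise ValueError(action)


def rate(state):
    ally_hp, enemy_hp = state
    return ally_hp - enemy_hp


best = choose_action((100, 100), ["attack", "protect"], simulate, rate)
```

Everything interesting lives in `simulate` and `rate`; the decision itself is just picking the maximum.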

This is the first place where the AI differs from the most basic (and future-less) way of thinking: looking one turn ahead and checking whether you can inflict more damage than you take.

Let's take several real examples of what a situation rating approach should satisfy:

- A fast pokemon that has barely any hp is not worthless. It can still outspeed something and strike. The bot shouldn't throw it away for nothing.

- A slow pokemon that has barely any hp is worthless, priority moves aside. No matter the situation, it will get killed before being able to do anything. Thus, it is death fodder: if the situation gets bad, the bot should sack it to regain the upper hand.

- Let's say there is one sweeper on each team, each threatening the other heavily. The fast one, on the ally team, does 50% to the other, and the other, slower, kills it in one hit. If the opportunity comes, the bot should rate inflicting 50% to the opposing sweeper very highly, because it knows this will completely flip the result of this decisive fight.

- In short, if the opponent has something that threatens the team, it should be ready to make 'losing' trades to weaken it.

There are several advantages to proceeding this way. The main one is that as long as you define what each move does, and give particular ratings to weird moves such as hazards, the bot will have an objective view of things. It will use a healing move if its situation is better after a healing move than after doing something else. (Heals aren't implemented yet, though.)

And so, here's my situation rating function :

Imagine every possible 1v1 between an ally and an enemy, even if they're dead (that's 36 fights), and have them fight to the death. Then take the remaining %hp on each side (if both are dead, that's 0 and 0), add the ally's to the score and subtract the enemy's.
So the "situation rating" is a number between -3600 and 3600.
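
Here is a minimal sketch of that rating, not the actual Cube code. It assumes each mon keeps spamming one move that deals a fixed percentage per hit (looked up in a hypothetical `damage` table keyed by attacker and defender names), that the faster mon always hits first, and that speed ties go to the ally. The `Mon` class and function names are mine.

```python
from dataclasses import dataclass


@dataclass
class Mon:
    name: str
    hp: float   # remaining HP in percent of max HP (0 means fainted)
    speed: int


def duel(ally, enemy, damage):
    """Fight one 1v1 to the death; return (ally %HP left, enemy %HP left)."""
    a_hp, e_hp = ally.hp, enemy.hp
    a_dmg = damage.get((ally.name, enemy.name), 0.0)
    e_dmg = damage.get((enemy.name, ally.name), 0.0)
    if a_dmg <= 0 and e_dmg <= 0:
        return max(a_hp, 0.0), max(e_hp, 0.0)  # nobody can hurt anybody: stop
    ally_first = ally.speed >= enemy.speed     # tie-break is an arbitrary choice
    while a_hp > 0 and e_hp > 0:
        if ally_first:
            e_hp -= a_dmg
            if e_hp > 0:
                a_hp -= e_dmg
        else:
            a_hp -= e_dmg
            if a_hp > 0:
                e_hp -= a_dmg
    return max(a_hp, 0.0), max(e_hp, 0.0)


def situation_rating(allies, enemies, damage):
    """Sum (ally %HP left - enemy %HP left) over every ally/enemy pairing."""
    score = 0.0
    for ally in allies:
        for enemy in enemies:
            a_left, e_left = duel(ally, enemy, damage)
            score += a_left - e_left
    return score


# A nearly dead fast mon still takes 40% off a full-HP enemy before going down.
fast = Mon("Fast", 0.3, 120)
slow = Mon("Slow", 0.3, 50)
foe = Mon("Foe", 100.0, 80)
damage = {("Fast", "Foe"): 40.0, ("Slow", "Foe"): 40.0,
          ("Foe", "Fast"): 100.0, ("Foe", "Slow"): 100.0}

fast_rating = situation_rating([fast], [foe], damage)
slow_rating = situation_rating([slow], [foe], damage)
```

With 6 mons on each side the two loops cover the 36 fights, and each fight contributes between -100 and 100, hence the -3600 to 3600 range.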

This method fulfills the cases stated above and is not so heavy to calculate (it's pretty much a more complex Euclidean division: you keep subtracting things). On top of that, it gives an insight into long-term strategy, which is extremely hard to teach to a bot. Toxicing the tanks makes sense when thinking with this method, because it heavily modifies every fight in favor of the bot, since these are long fights. (It will make even more sense to Toxic those that can heal, once heals are implemented.)

There is however one downside to it: this function ignores what's currently happening on the field. Because of this, switches are messed up without a modification: the bot would switch to someone else only if the current fight is so bad that it would rather take the hit with another mon and not answer. ... That doesn't happen often. Switching isn't only about tanking the hit, you've also got to think about the situation that is about to come. You don't switch Heatran in against Mamoswine just because it is able to tank Ice Shard.

To correct that, I had to add a subjective rating modifier when rating switches: the bot looks at the score change once it has switched and each mon has then hit the other once! Though right now it's a bit crude and only considers that each mon hits the other once with its best damaging move, not its best move overall, which could be a status one, for example.

This isn't chosen at random: the idea is that if you've got a positive score with this, the opponent will switch out once you've switched in and taken the hit. The situation being perfectly symmetric, this is likely to lead to a global score change of 0, which is better than carrying on with the losing fight you were in two turns earlier.
Then again, all of this is weighted by the importance the situation rating gives to each pokemon. If your tank sucks against the opponent's team and is only able to tank the pokemon on the field, the bot will throw it in right there: it getting hit barely moves the score against you. If your pokemon on the field mustn't get hit because that would flip its 1v1s against other mons, the bot will attempt to save it.
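
A rough sketch of that switch evaluation, under simplifying assumptions: the incoming mon first takes the move aimed at the mon it replaces, then both mons hit each other once with their best damaging move, with speed ignored. `rate_switch`, `hp_difference` and all the damage numbers are invented for illustration, and `hp_difference` is only a stand-in for the real 36-duel rating.

```python
def rate_switch(ally_hp, enemy_hp, incoming, foe, upcoming_damage, best_damage, rate):
    """Rate switching `incoming` into `foe`.

    ally_hp / enemy_hp: name -> remaining %HP.
    upcoming_damage:    %HP the predicted move does to `incoming` on the switch.
    best_damage:        (attacker, defender) -> %HP of the best damaging move.
    rate:               function (ally_hp, enemy_hp) -> score.
    """
    ally_hp = dict(ally_hp)    # work on copies, the real state is untouched
    enemy_hp = dict(enemy_hp)

    # Step 1: the incoming mon takes the hit meant for the mon it replaces.
    ally_hp[incoming] = max(ally_hp[incoming] - upcoming_damage, 0.0)

    # Step 2: if it survived, each mon hits the other once (speed is ignored).
    if ally_hp[incoming] > 0:
        enemy_hp[foe] = max(enemy_hp[foe] - best_damage[(incoming, foe)], 0.0)
        ally_hp[incoming] = max(ally_hp[incoming] - best_damage[(foe, incoming)], 0.0)

    return rate(ally_hp, enemy_hp)


def hp_difference(ally_hp, enemy_hp):
    """Placeholder rating: total ally %HP minus total enemy %HP."""
    return sum(ally_hp.values()) - sum(enemy_hp.values())


ally_hp = {"Latios": 100.0, "Heatran": 100.0}
enemy_hp = {"Mamoswine": 100.0}
best_damage = {("Heatran", "Mamoswine"): 40.0,   # made-up number
               ("Mamoswine", "Heatran"): 100.0}  # Earthquake, assumed to OHKO

# Heatran shrugs off Ice Shard (10% here) but loses the exchange that follows.
only_tanking = hp_difference({"Latios": 100.0, "Heatran": 90.0}, enemy_hp)
with_exchange = rate_switch(ally_hp, enemy_hp, "Heatran", "Mamoswine",
                            10.0, best_damage, hp_difference)
```

Looking only at the Ice Shard makes the switch look fine, while adding the exchange shows Heatran going down, which is the Heatran/Mamoswine case from above.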

This ends my explanation of the AI.

I can see two downsides to this method. The first one is that the bot has to assume it knows the situation, and bases its entire reasoning on that. But it doesn't know it. For instance, it will think that every Latios, when first met, has the four moves it has in mind. But Latios has coverage far wider than four moves. So if the bot thinks, for example, that Latios doesn't have Earthquake, it will repeatedly throw Heatran in ... and die every time Latios happens to have Earthquake. That is one of my main concerns right now and is the real flaw this AI carries everywhere, which I don't know how to fix.

That's typically a problem I doubt the programmer of this probability-based bot has: http://www.smogon.com/forums/threads/an-ou-bot.3529338/ The way our approaches to the problem differ is what made me want to present mine.

The other one is that the bot isn't aiming to wipe out the opponent's team, it's aiming to keep the highest situation rating. This can cause weird stuff at the end of a game, where the bot has an obvious way to finish the battle but will not go for it because it would lead to a situation rating drop (and victory ._. ). I've never seen it happen (since it never wins >< ) but I can guess that if the opponent has setup moves, this can cause disasters.

What is currently missing

- Better database (like, it crashes when seeing Weezing ...). What I currently have in mind is having it search the database of the tier below if it doesn't find the pokemon. That is, along the main tier ladder: Ubers -> OU -> UU -> RU -> NU -> PU.

- Heal knowledge. While this is very easy to program (about 2 minutes ..), it implies I would have to remake the "1v1 confrontation to the death" part, which currently takes one move for each pokemon and imagines them spamming it until eventual death. (Which is a bit dumb, I mean, priority moves exist, so the best move changes over time.) Also, endless fights management inc...

- LEARNING APTITUDE !! I currently have the structure to exploit the learning part, but no data acquisition part ready! By that I mean the pokemon in the main DB have several sets stored, so once it's done, the bot will look only at the sets that do not contradict what has been seen already. (The idea is that something with Life Orb most likely has an offensive set, while something with Leftovers will have a defensive one, and this shows through the sets.)

- Boosts. Well, the bot will rate the situation where the opponent's pokemon is boosted ... but before adding that, I've got to make a cleverer "switch then 1 turn" evaluation. The entire point of boosting is that you want to bring the mon that can boost itself in on something it can boost on. Currently, the switching evaluation takes the upcoming move + having them hit each other.

- Mega-evolution. Well ... I don't know how to handle this part. The enemy pokemon literally changes into another one ._.

- Hazards.

- Heal Bell. Though this shouldn't be too hard ... "reset all status to normal". But it's far down my priority list.

- Proper understanding of multi hit moves. Right now, he thinks they all hit once. (except bullet seed, for testing reasons)

- Almost all abilities.
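
For the learning part, the "keep only the sets that do not contradict what has been seen" step could look roughly like this. It is only a sketch: `consistent_sets`, the dictionary layout and the two Clefable sets are made up for illustration and are not taken from the set database.

```python
def consistent_sets(candidate_sets, seen_moves=(), seen_item=None, seen_ability=None):
    """Return the candidate sets that agree with everything revealed so far."""
    seen_moves = set(seen_moves)
    kept = []
    for candidate in candidate_sets:
        if not seen_moves <= set(candidate["moves"]):
            continue  # a revealed move is missing from this set
        if seen_item is not None and candidate["item"] != seen_item:
            continue
        if seen_ability is not None and candidate["ability"] != seen_ability:
            continue
        kept.append(candidate)
    return kept


# Illustrative sets only (hypothetical, not copied from any analysis).
clefable_sets = [
    {"name": "offensive", "item": "Life Orb", "ability": "Magic Guard",
     "moves": ["Moonblast", "Fire Blast", "Soft-Boiled", "Thunder Wave"]},
    {"name": "defensive", "item": "Leftovers", "ability": "Unaware",
     "moves": ["Calm Mind", "Moonblast", "Soft-Boiled", "Flamethrower"]},
]

after_moonblast = consistent_sets(clefable_sets, seen_moves=["Moonblast"])
after_leftovers = consistent_sets(clefable_sets, seen_moves=["Moonblast"],
                                  seen_item="Leftovers")
```

If the filter ever returns an empty list, the bot has met a set the database doesn't know, and falling back to the usage statistics is probably the sensible move.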

Currently implemented :

The bot knows, aside from damaging moves :

- Paralysis, poison, toxic poison, burns

- Choice items

- Every damage-modifying item

- Priorities on moves, except ability-related ones (hello talonflame).

- Immunity abilities (Volt Absorb, Levitate, etc)

- Most end-turn hp modifications: Leftovers, Black Sludge, weather damage.

- Contact damage : Rough Skin / Iron Barbs / Rocky Helmet

Known bugs

- Alternative forms that are grouped under one entry and have no specific entry of their own in the databases are unknown. For example, Keldeo-Resolute needs to be renamed internally, as only Keldeo has associated data, in both Smogon's DB and my handmade set DB. Well, "handmade": it's copy-pasted from the sets in Smogon's analyses.

- Having one of its mons killed by a pivot move makes the bot crash. It tries to answer as soon as it witnesses the KO of its mon, except the opponent has to choose something first.

- Choice items do not work with Hidden power. "The opponent just used Hidden Power, so he won't be able to use that "hiddenpowerfire60" that is in his set... right ? RIGHT ?" - Cube.

- Multi hit moves only proc contact damage once in the head of the bot. Well, they only hit once in his head anyway ...

- The bot isn't aware of the toxic poison stacking on its active pokemon. Like, at every turn it considers that the stacking starts now, from 6.25%.
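
For the toxic bug, one possible fix is to keep a small counter per badly poisoned mon, as sketched below. This assumes the usual mechanics as I understand them: damage grows by 1/16 of max HP (6.25%) each turn, caps at 15/16, and the counter restarts when the mon switches out. `ToxicTracker` is a hypothetical helper, not something in the current code.

```python
class ToxicTracker:
    """Tracks how many turns a badly poisoned mon has been taking damage."""

    STEP = 6.25      # percent of max HP added per turn (1/16)
    MAX_TURNS = 15   # assumed cap on the counter

    def __init__(self):
        self.turns = 0

    def end_of_turn(self):
        """Advance the counter and return this turn's damage in %HP."""
        self.turns = min(self.turns + 1, self.MAX_TURNS)
        return self.STEP * self.turns

    def switch_out(self):
        """Assumption: the counter restarts once the mon leaves the field."""
        self.turns = 0


tracker = ToxicTracker()
first_three = [tracker.end_of_turn() for _ in range(3)]
tracker.switch_out()
after_switch = tracker.end_of_turn()
```

The simulated 1v1s would then start from the tracker's current counter instead of always starting from 6.25%.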

Code

I'm French, so don't be surprised if you can't understand half of the comments (# stuff).

Core code (Connection + AI) : http://pastebin.com/hZriU1xX
Surge (Extracts string data and makes it usable matter for python) : http://pastebin.com/84dtu4db
Datacalc (calculation data) : http://pastebin.com/6qMErixR
Datasets (the set pasted data) : http://pastebin.com/DKSPheYJ
I couldn't post the pokedexjs one because it's too heavy for pastebin, but it's nearly json.loads(the PS files available on Zarel's GitHub)
An example of each: http://pastebin.com/nuZKg403

Replays

http://replay.pokemonshowdown.com/ounomega-234254752 (First Cube match on the official ladder)
http://replay.pokemonshowdown.com/ounomega-234387090
http://replay.pokemonshowdown.com/ounomega-234412684
http://replay.pokemonshowdown.com/ou-234412678

Test matches where I was the other player:
http://replay.pokemonshowdown.com/pokestrat-ounomega-2980
http://replay.pokemonshowdown.com/pokestrat-ou-3169

---

That's it, do not hesitate to ask / suggest anything !

QxC4eva · May 24, 2015

Hi! That's some interesting work you have! I don't have time to comment on everything but this one stood out to me:

Artemis Fowl said:
Imagine every possible 1v1 between an ally and an enemy, even if they're dead (that's 36 fights), and have them fight to the death. Then take the remaining %hp on each side (if both are dead, that's 0 and 0), add the ally's to the score and subtract the enemy's.
So, the "Situation rating" is a number, between -3600, and 3600.

Competitive Pokemon doesn't work with % hp as much as nHKOs + speed. You don't often see the statement "move X does 700% damage to Y" compared to "move X OHKOs Y". So I recommend implementing a more specific damage calc for this if you can. It might get you some better AI but I'm not sure if you think it's too heavy to calculate. Say you have a 1 HP Alakazam and 1 HP Chansey- the former is worth more (it can revenge kill), while the latter is simply a death fodder. Zam can also clean up if Psyshock OHKOs everything on the opposing team, so if the bot needs to sack something it should not be Zam. The idea is:

Assign a score to each Pokemon based on the damage calcs it inflicts and receives from the opposing team (like in chess, a pawn is worth 1 and the queen is 9). Speed, priority and hazards can be left out for simplicity I guess. For example, if you have a team of 5 Scizor and a scarf Landorus-T against a mono fire team, the points would go something like 0.1667 for each Scizor (they get OHKO'ed by all 6 opposing mons) and 6 for the Lando (it sweeps all 6 fire mons). Now the AI should no longer switch out Scizor to Landorus to tank an Infernape's Flare Blitz -- as losing Scizor only costs 0.167 points, but having a lower health Landorus might degrade its points from 6 down to 5 (it's now in Mach Punch KO range after tanking the Flare Blitz).

That's my 2 cents, and good luck with your project. I don't think you should call yourself a beginner anymore. =]

Artemis Fowl · May 25, 2015

Actually, my method does exactly what you described, and to find it, I did precisely what you just did :p That is, listing situations where pokemon are worth little/much, and looking for a method that reacts properly in all of them.

Artemis Fowl said:
Let's take several real examples of what a situation rating approach should satisfy:

- A fast pokemon that has barely any hp is not worthless. It can still outspeed something and strike. The bot shouldn't throw it away for nothing.

- A slow pokemon that has barely any hp is worthless, priority moves aside. No matter the situation, it will get killed before being able to do anything. Thus, it is death fodder: if the situation gets bad, the bot should sack it to regain the upper hand.

- Let's say there is one sweeper on each team, each threatening the other heavily. The fast one, on the ally team, does 50% to the other, and the other, slower, kills it in one hit. If the opportunity comes, the bot should rate inflicting 50% to the opposing sweeper very highly, because it knows this will completely flip the result of this decisive fight.

- In short, if the opponent has something that threatens the team, it should be ready to make 'losing' trades to weaken it.

Let's say we have a 1 hp slow pokemon and a 1 hp fast pokemon, and look at their "part" in the situation rating (the 6 1v1s each of them is evaluated on).
The fast pokemon gets one hit in during each of its 6 1v1s before going down. Let's say all the opponent's mons are at full hp and that hit does 40%. Thus, the result of each fight goes from [0.3%, 100%] to around [0%, 60%].
Were it to die, the result of each fight would be [0%, 100%]. So this pokemon's death would "cost" 240 points, because its part in the situation rating would go from 6*(0-60) = -360 to 6*(0-100) = -600, a change of -240.

Now, for a slow 1 hp pokemon, the result of each fight is [0.3%, 100%] -> [0%, 100%]. So this pokemon is worth nearly 0 points.
Because of this, if the situation rating drops no matter what the bot decides, it will go for sacking the slow 1 hp pokemon, because at least it only loses "nearly 0" (that would be 0.3% * 6 in our case).

QxC4eva said:
Assign a score to each Pokemon based on the damage calcs it inflicts and receives from the opposing team (like in chess, a pawn is worth 1 and the queen is 9). Speed, priority and hazards can be left out for simplicity I guess. For example, if you have a team of 5 Scizor and a scarf Landorus-T against a mono fire team, the points would go something like 0.1667 for each Scizor (they get OHKO'ed by all 6 opposing mons) and 6 for the Lando (it sweeps all 6 fire mons). Now the AI should no longer switch out Scizor to Landorus to tank an Infernape's Flare Blitz -- as losing Scizor only costs 0.167 points, but having a lower health Landorus might degrade its points from 6 down to 5 (it's now in Mach Punch KO range after tanking the Flare Blitz).

Then, for each Scizor, its part in the situation rating is 6*(0-90) = -540 (because let's say Bullet Punch does 10%).
Were it to die without even hitting, its part in the situation rating would be 6*(0-100) = -600, so suiciding a Scizor drops the situation rating by 60. Switching Lando into a Flare Blitz that does, let's say, 50% would drop the situation rating by 300, because its part used to be 6*(100-0) = 600 and is now 6*(50-0) = 300. The bot will prefer suiciding Scizor, because Scizor is nearly worthless. As all pokemon are defined by their 1v1 capabilities, someone that gets outsped and OHKOed in every situation is worth 0. Its death does not impact the situation rating.
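
The arithmetic above can be checked in a few lines. The helper `part` is just my name for "sum of (ally %hp left - enemy %hp left) over a mon's six 1v1s", and the fight results are the ones assumed in this example.

```python
def part(results):
    """A mon's share of the situation rating over its six 1v1s."""
    return sum(ally_left - enemy_left for ally_left, enemy_left in results)


scizor_alive = part([(0, 90)] * 6)    # Bullet Punch does 10%, then Scizor dies
scizor_dead = part([(0, 100)] * 6)    # dies without hitting
lando_full = part([(100, 0)] * 6)     # sweeps all six at full hp
lando_hit = part([(50, 0)] * 6)       # same sweep after taking a 50% Flare Blitz

cost_sack_scizor = scizor_alive - scizor_dead   # 60
cost_switch_lando = lando_full - lando_hit      # 300
```

Since 60 is much smaller than 300, the rating prefers letting Scizor go down.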

To conclude, I'd say that doing 700% isn't different from doing 100%, since what matters here is the remaining %hp after the fight, not what one is able to inflict. Indeed, if OHKOs are in play, they will be particularly interesting since they leave the attacker's initial hp unchanged.

LightningLord2 · Oct 6, 2015

Can you put in a failsafe where if the AI throws an exception when having to select a move, the AI acts at random instead?

EDIT: Same with Pokémon he does not recognize - if he can't think of what the Pokémon could possibly run, make him guess a randomized legal learnset and EV spread instead.
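
A failsafe like the one suggested here could be as small as a wrapper around the decision function. This is only a sketch: `safe_choose` and the stand-in `broken_ai` are hypothetical, and it assumes the bot can list its legal actions for the turn.

```python
import random


def safe_choose(choose, legal_actions, rng=random):
    """Call the AI; on any exception or illegal answer, act at random."""
    try:
        action = choose(legal_actions)
    except Exception:
        return rng.choice(legal_actions)
    if action not in legal_actions:
        return rng.choice(legal_actions)
    return action


def broken_ai(legal_actions):
    # Stand-in for the real AI hitting an unknown pokemon.
    raise KeyError("weezing")


legal = ["move 1", "move 2", "switch 3"]
fallback = safe_choose(broken_ai, legal)
normal = safe_choose(lambda actions: actions[0], legal)
```

Catching every exception hides bugs, so logging the error before falling back would probably be worth the extra line.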
