Technical Machine: A Pokemon AI

obi

formerly david stone
is a Site Content Manager Alumnus, a Programmer Alumnus, a Senior Staff Member Alumnus, a Smogon Discord Contributor Alumnus, a Researcher Alumnus, a Top Contributor Alumnus, and a Battle Simulator Moderator Alumnus
I now have a working battle log parser. Right now I just have the obvious stuff (extract Pokemon names, movesets, leftovers, etc.), but I'm going to combine it with my reverse damage calculator soon so I can also get exact moveset / EV combinations. This way, I can get more detailed stats so I can get move set stats similar to the "team mate stats". I'm about to add a move set predictor, which means that all I have to do before it can do a battle on its own (other than the move set predictor) is to just set it up to connect to the server and battle.
 
Cool work obi! Just a couple questions...have you benchmarked it yet (i.e. can it regularly beat a human, or does it always beat a totally random player, etc). Also, do you have any intention of releasing the source code?
 

obi

formerly david stone
Update!

My program is now almost "complete", in the sense that it is very close to being able to battle. I just finished integrating my team predictor with my expectiminimax algorithm (the part that determines what moves to make). This means that all I have left is to integrate my battle log parser with it and then connect to a server and it will be able to battle.

With everything I've added in, though, it's painfully slow at a depth of even 3 (several minutes to determine a single move). I think the primary reason is that I have not yet added a transposition table. This means that when I encounter something like: Hippowdon used Earthquake, Swampert used Stealth Rock, Hippowdon used Earthquake, Swampert used Ice Beam, it evaluates that, and then does all of the evaluation work again when it gets to Hippowdon used Earthquake, Swampert used Ice Beam, Hippowdon used Earthquake, Swampert used Stealth Rock. The program should realize that the two situations are the same, save the result from the first calculation, and reuse it for the second. Much of the code I added causes the program to re-evaluate the same position several times, so a transposition table would dramatically speed things up.
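
As a rough illustration of the transposition table idea (a sketch, not Technical Machine's actual code), the toy search below caches the score of each position together with the remaining depth. Everything here is hypothetical: the `Position` struct, the two-move game, the damage of 20, and the weight of 30 for Stealth Rock are made up for the example, and it uses plain minimax rather than expectiminimax to keep it short.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <initializer_list>
#include <limits>
#include <unordered_map>

// Hypothetical, heavily simplified position. A real one would hold both
// teams, the weather, stat boosts, and so on.
struct Position {
    int ai_hp;
    int foe_hp;
    bool rocks_on_ai_side;
    bool rocks_on_foe_side;
};

enum class Move { attack, stealth_rock };

// Maps (position, remaining depth) to the score found for it.
using Table = std::unordered_map<std::uint64_t, int>;

// Pack the position and the remaining depth into one integer key.
std::uint64_t make_key(Position const & p, int depth) {
    assert(0 <= p.ai_hp && p.ai_hp < 1024 && 0 <= p.foe_hp && p.foe_hp < 1024);
    assert(0 <= depth && depth < 256);
    return (std::uint64_t(p.ai_hp) << 20) | (std::uint64_t(p.foe_hp) << 10) |
        (std::uint64_t(p.rocks_on_ai_side) << 9) |
        (std::uint64_t(p.rocks_on_foe_side) << 8) | std::uint64_t(depth);
}

// Made-up rules: an attack does 20 damage, Stealth Rock sets a flag.
Position apply_moves(Position p, Move ai, Move foe) {
    if (ai == Move::attack) p.foe_hp = std::max(0, p.foe_hp - 20);
    else p.rocks_on_foe_side = true;
    if (foe == Move::attack) p.ai_hp = std::max(0, p.ai_hp - 20);
    else p.rocks_on_ai_side = true;
    return p;
}

// Made-up static evaluation: positive is good for the AI.
int evaluate(Position const & p) {
    return (p.ai_hp - p.foe_hp) +
        30 * (int(p.rocks_on_foe_side) - int(p.rocks_on_ai_side));
}

// Pass nullptr as the table to search without caching.
// `nodes` counts positions that actually had to be worked out.
int search(Position const & p, int depth, Table * table, long & nodes) {
    if (table) {
        auto const it = table->find(make_key(p, depth));
        if (it != table->end()) return it->second;  // seen before: reuse
    }
    ++nodes;
    int best;
    if (depth == 0) {
        best = evaluate(p);
    } else {
        best = std::numeric_limits<int>::min();
        for (Move ai : {Move::attack, Move::stealth_rock}) {
            int worst = std::numeric_limits<int>::max();
            for (Move foe : {Move::attack, Move::stealth_rock}) {
                worst = std::min(worst,
                    search(apply_moves(p, ai, foe), depth - 1, table, nodes));
            }
            best = std::max(best, worst);
        }
    }
    if (table) table->emplace(make_key(p, depth), best);
    return best;
}
```

Calling `search` once with `nullptr` and once with a `Table` from the same starting position gives the same score, but the cached run works out fewer positions, because two move orders that reach the same position at the same remaining depth are only evaluated once. The depth has to be part of the key: a score found with one turn of lookahead is not interchangeable with one found with three.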

I think the other main cause of the slowdown is that the moves are sorted in decreasing order of how often they appear on the Pokemon because of how my team prediction function works, rather than "probably the best" first.

Plans for the future:

I have a pretty cool idea for a team building AI. It would essentially just be a team thief. If it loses against someone, it stores their team (based on what it has seen, with its predictions filling in the gaps). Then it uses all of the teams randomly (with some sort of weighting system I haven't determined yet). This should also give it the advantage of not being able to be counter-teamed. I could possibly write some algorithms that take an existing team and attempt to improve it. But the important point is that it would have several teams and randomly select one for each battle, and store foe teams if it loses (because the teams that beat it are very likely to be good teams).

I have a function that is used to score the position of the game, which seems like the best place to apply machine learning. I have a text file that determines the value of various things, like Stealth Rock, the number of non-fainted Pokemon, and % HP of each Pokemon. I'm thinking of applying techniques used by TD-Gammon, a neural net Backgammon program. It would play against itself for thousands of battles to optimize its play, then play against a human expert for a few battles to keep it from getting stuck at a local maximum. I was thinking I could try something like that, by using a genetic algorithm to determine the constants. Give it a few teams for different styles of play, and have it play against itself for a while, then go online and ladder, then play against itself some more. The self-play has the advantage of being able to finish a game very quickly (just use a very small depth).

A setting I'm considering is letting my AI know whether it's in tournament mode or learning mode, where tournament mode uses the settings that the AI has found to be the most effective (including using its best teams), while learning mode has it vary from that in hopes of finding more effective settings.
 

obi

formerly david stone
A year later: source code!

It's been a year to the day since I first posted this thread, and I finally have something to show for it. I present: Technical Machine! Complete with a summary of the important parts of this thread (being updated every day with more information I haven't posted here!), along with a Mercurial repository.
 
Awesome! But how does it work? I downloaded it and cannot seem to open any of the files or do anything.
 

obi

formerly david stone
All I have uploaded is the source code. It needs to be compiled to run. I was working on getting my team predictor working with a nice GUI and everything, but I couldn't get FLTK to build (the toolkit I used to make the windows so it's not all text output). I might release a Windows executable for just the text-version of it, though, as that seems easy to build and release. It also seems a lot less interesting to most people.
 

obi

formerly david stone
I've put up a page dedicated to the team predictor.

Here is some sample output of my program, so that you can see how advanced it is:

Enter the log for the turn, followed by a ~.
david stone sent out Daedalus (lvl 100 Hippowdon ♂)!
Technical Machine sent out Qf376Dcb (lvl 100 Jirachi)!
Daedalus's Sand Stream whipped up a sandstorm!
A sandstorm brewed!
~
======================
Analyzing...
======================
Predicting...
Hippowdon @ Leftovers
- Roar
- Stealth Rock
- Slack Off
- Earthquake
Blissey @ Leftovers
- Wish
- Toxic
- Softboiled
- Seismic Toss
Skarmory @ Leftovers
- Brave Bird
- Whirlwind
- Spikes
- Roost
Tentacruel @ Leftovers
- Sludge Bomb
- Toxic Spikes
- Surf
- Rapid Spin
Rotom-H @ Leftovers
- Trick
- Overheat
- Thunderbolt
- Shadow Ball
Swampert @ Leftovers
- Roar
- Ice Beam
- Stealth Rock
- Earthquake
======================
Evaluating to a depth of 1...
Evaluating Iron Head
- Evaluating the foe's Roar
- Estimated score is 157
- Evaluating the foe's Stealth Rock
- Estimated score is -1522
- Evaluating the foe's Slack Off
- Estimated score is 47
- Evaluating the foe's Earthquake
- Estimated score is -263
- Evaluating the foe's Struggle
- Evaluating the foe's switch to Hippowdon
- Evaluating the foe's switch to Blissey
- Estimated score is 402
- Evaluating the foe's switch to Skarmory
- Estimated score is 60
- Evaluating the foe's switch to Tentacruel
- Estimated score is 156
- Evaluating the foe's switch to Rotom-H
- Estimated score is 125
- Evaluating the foe's switch to Swampert
- Estimated score is 79
Estimated score is -1522
Evaluating Stealth Rock
- Evaluating the foe's Roar
- Estimated score is 1500
- Evaluating the foe's Stealth Rock
- Estimated score is -900
- Evaluating the foe's Slack Off
- Estimated score is 1500
- Evaluating the foe's Earthquake
- Estimated score is 898
- Evaluating the foe's Struggle
- Evaluating the foe's switch to Hippowdon
- Evaluating the foe's switch to Blissey
- Estimated score is 1500
- Evaluating the foe's switch to Skarmory
- Estimated score is 1500
- Evaluating the foe's switch to Tentacruel
- Estimated score is 1500
- Evaluating the foe's switch to Rotom-H
- Estimated score is 1500
- Evaluating the foe's switch to Swampert
- Estimated score is 1500
Estimated score is -900
Evaluating Body Slam
- Evaluating the foe's Roar
- Estimated score is 123
- Evaluating the foe's Stealth Rock
- Estimated score is -2276
Evaluating Wish
- Evaluating the foe's Roar
- Estimated score is 300
- Evaluating the foe's Stealth Rock
- Estimated score is -2100
Evaluating Struggle
Evaluating switch to Jirachi
Evaluating switch to Machamp
- Evaluating the foe's Roar
- Estimated score is 0
- Evaluating the foe's Stealth Rock
- Estimated score is -2400
Evaluating switch to Blissey
- Evaluating the foe's Roar
- Estimated score is 0
- Evaluating the foe's Stealth Rock
- Estimated score is -2400
Evaluating switch to Celebi
- Evaluating the foe's Roar
- Estimated score is 0
- Evaluating the foe's Stealth Rock
- Estimated score is -2400
Evaluating switch to Tyranitar
- Evaluating the foe's Roar
- Estimated score is 0
- Evaluating the foe's Stealth Rock
- Estimated score is -2400
Evaluating switch to Moltres
- Evaluating the foe's Roar
- Estimated score is 0
- Evaluating the foe's Stealth Rock
- Estimated score is -2400
Use Stealth Rock for a minimum expected score of -900
Enter the log for the turn, followed by a ~.
The reason it assumes a negative score for both players using the same move is that my team is weaker to Stealth Rock than the foe's. Therefore, I lose more points from SR being down than the foe does.
 
very nice work, obi; you've done what nobody else has dreamed of doing. there's one thing that i could see improving:

Code:
Evaluating Stealth Rock
- Evaluating the foe's switch to Blissey
- Estimated score is 1500
- Evaluating the foe's switch to Skarmory
- Estimated score is 1500
- Evaluating the foe's switch to Tentacruel
- Estimated score is 1500
- Evaluating the foe's switch to Rotom-H
- Estimated score is 1500
- Evaluating the foe's switch to Swampert
- Estimated score is 1500
the bot doesn't seem to calculate the advantages or disadvantages of matchups. i would think that having blissey in against (your) jirachi should result in a higher estimated score than having hippowdon or swampert in against it.
 

obi

formerly david stone
Yeah, right now my evaluation function is pretty simple and doesn't look at anything special about the active Pokemon (other than stuff like whether they have stat boosts and other things that are removed on a switch). I plan on improving my evaluation function to include more complex ways to evaluate positions. I'm thinking of writing some sort of "rough estimate of who wins a 1v1 match up" and including that in my evaluation.

Just a note on how to read that output: if an estimated score is not given, it means that either the move is not legal (in the case of switching to itself), or the evaluation was stopped before it could go all the way through, because it realized that the move it was evaluating will not be used by that player. For instance, if it can prove that Body Slam isn't as good as Iron Head, it doesn't waste time proving just how much worse it is; it just stops. That's why Body Slam only evaluates down to the foe using Stealth Rock and then stops without printing an overall score for Body Slam. Negative numbers are good for the foe; positive numbers are good for the AI.
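
The early stopping described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the program's real search: the type and function names are hypothetical, the scores are fed in as a table instead of being computed by a recursive search, and each of the AI's moves is scored by its worst foe reply. The numbers come from the sample output, except the entries marked as placeholders, which the log never prints because the search had already stopped.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <string>
#include <vector>

struct Reply {
    std::string foe_move;
    int score;  // positive is good for the AI
};

struct Candidate {
    std::string move;
    std::vector<Reply> replies;
};

struct Result {
    std::string best_move;
    int best_score;
    int replies_examined;
};

// Score each of the AI's moves by its worst foe reply, and abandon a move
// as soon as one reply shows it cannot beat the best move found so far.
Result choose_move(std::vector<Candidate> const & candidates) {
    assert(!candidates.empty());
    Result result{"", std::numeric_limits<int>::min(), 0};
    for (auto const & candidate : candidates) {
        if (candidate.replies.empty()) continue;  // e.g. an illegal move
        int worst = std::numeric_limits<int>::max();
        bool cut_off = false;
        for (auto const & reply : candidate.replies) {
            ++result.replies_examined;
            worst = std::min(worst, reply.score);
            if (worst <= result.best_score) {
                cut_off = true;  // already no better than the current best
                break;
            }
        }
        if (!cut_off) {
            result.best_move = candidate.move;
            result.best_score = worst;
        }
    }
    return result;
}

// Scores taken from the sample output (foe's moves only, switches left out).
std::vector<Candidate> sample_candidates() {
    return {
        {"Iron Head", {{"Roar", 157}, {"Stealth Rock", -1522},
                       {"Slack Off", 47}, {"Earthquake", -263}}},
        {"Stealth Rock", {{"Roar", 1500}, {"Stealth Rock", -900},
                          {"Slack Off", 1500}, {"Earthquake", 898}}},
        // The zeros below are placeholders: these replies are never examined.
        {"Body Slam", {{"Roar", 123}, {"Stealth Rock", -2276},
                       {"Slack Off", 0}, {"Earthquake", 0}}},
        {"Wish", {{"Roar", 300}, {"Stealth Rock", -2100},
                  {"Slack Off", 0}, {"Earthquake", 0}}},
    };
}
```

With this input, Iron Head and Stealth Rock are examined in full, while Body Slam and Wish are both dropped after the foe's Stealth Rock reply, which matches where the sample output stops printing scores for them.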
 
I just clicked on this thread from the main page so sorry if this was already discussed, but in case it hasn't: it seems like the bot should value information. For instance:

Evaluating switch to Tyranitar
- Evaluating the foe's Roar
- Estimated score is 0

Should be considered a loss, because the opponent now knows more about the computer's team than the computer knows about theirs, especially if, for instance, it's the team's primary sweeper that's revealed. If it were any lead but Hippowdon -- Swampert, for instance -- the influence of Sandstorm needs to be considered as well. And, the risk of any given Roar is higher for a team that's built unusually to hide its sweeper. I'm not sure if that's too complicated, but it's something to think about.
 

Diana

This isn't even my final form
is a Researcher Alumnus and a Top Contributor Alumnus
This is seriously impressive, it would be interesting to see even a text version of this available if possible.

I'm assuming from reading that it'll come up with things other than Trick Rotom-H with Leftovers after it "steals teams"? Otherwise if there's a way to restrict items used with certain moves for the AI teams/predictions that might be good.
 
Wow, I know pretty much nothing about programming but I know you must have put so much work into this.

Actually I remember an idea I had once about an AI system for Pokemon games that was more like an AI creator program. You would create the team yourself and then you would be able to select the behaviours for each Pokemon, so it would be tailored to the team and even could have a play style, so in-game for example, most trainers would have basic AI, whereas the Elite Four and Champion would have special AIs created for their team.
How feasible is this idea, and how would it compare to your AI?

Also, more on the topic of your AI, I know that now chess computers have advanced so much, there are players that have developed specific play styles to beat computers. Surely this AI, especially its team and moveset prediction, would be especially vulnerable to players using unusual sets and teams. Would it be possible for your AI to have methods of combating this, or would it become too complicated or perhaps introduce worse flaws?
 
I wouldn't be worried about unusual sets or teams so much. Sure, that will throw its team prediction out of whack, but that doesn't put it at any serious disadvantage compared to a normal player, unless it's programmed to rely on its predictions as definite. I suppose things like Cosmic Power Jirachi could cause major concerns though, as it would take an enormous amount of time to see the inevitable win if it's allowed to boost endlessly.

One thing; if the program will always do the "best" move in any circumstances, it does make it very easy to predict. If the opponent knows which move covers all options the best they can choose their move accordingly. Obviously I can't see a way around this though :)
 

obi

formerly david stone
Actually I remember an idea I had once about an AI system for Pokemon games that was more like an AI creator program. You would create the team yourself and then you would be able to select the behaviours for each Pokemon, so it would be tailored to the team and even could have a play style, so in-game for example, most trainers would have basic AI, whereas the Elite Four and Champion would have special AIs created for their team.
How feasible is this idea, and how would it compare to your AI?
I was talking to [I believe] Wichu about something like this. He was wondering whether a modified version of the AI could be used as part of Project Amethyst for the in-game AI. It can with very little modification.

The way I have it written, the teams are just team files in "any" simulator format (currently only Pokemon Online or Pokemon Lab... Pokemon Lab can open Shoddy Battle team files and you can just re-save those as Pokemon Lab files so that covers just about every Gen 4 team file format). I plan on having my AI load team files at random, possibly with some sort of weight depending on conditions, but it currently just has the name of the team file in a text file that's loaded at start up. With a line or two of code, you could add a way to give each in-game trainer (or AI, in your case) a particular team file to load as their own.

The other part of the AI that would be customizable, other than the teams, is the evaluation function. Wichu's plan was to have wild Pokemon just use moves randomly (meaning he wouldn't need my AI at all for that), but he wanted a way to make trainers be smarter. There are two ways to do that with my program, and you can actually do much more than just making one AI "better" than another.

The most obvious way to make differences in different instances of the AI would be to increase the depth of search. Foes like Youngsters could search ahead only to a depth of 1, for instance, while a Cooltrainer might be able to think to a depth of 2, gym leaders a depth of 3, Elite 4 a depth of 4, and your rival a depth of 5 (my program can currently evaluate a position about 3 turns ahead in about 15 seconds and a depth of 2 almost instantly, but I have a lot of room for cutting that down... my goal is 5 turns ahead in under 30 seconds). This is a change that would be used to just make one trainer better than another (a greater depth equates to stronger play, almost regardless of the position and the evaluation function used).

The other way (and probably the much more exciting way, to Project Amethyst people) actually changes the type of player the AI is, and that is changing the evaluation constants. My program uses a plain text file that determines the value of various things. Altering this table of values allows you to change the behavior of the program quite drastically without having to know anything about C++ or recompiling the AI. For instance, in my evaluation function, I had the following values:

Stealth Rock -300

Members 1024
HP 1024

What that means is that Stealth Rock has a value of -300 * effectiveness for each Pokemon on the team. So a Pokemon that's 4x weak to Stealth Rock loses 1200 points (a hefty penalty!) if Stealth Rock is down on its side of the field. Simply having a Pokemon alive is worth 1024 points per Pokemon (so a full team of Pokemon gives you 6144 points in addition to any other boosts). What HP means is that a Pokemon at 100% HP gets an additional bonus of 1024 points, and that goes down linearly with HP. In other words, a Pokemon at 50% HP only gets 50% of the HP bonus.
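
To make the arithmetic concrete, here is a minimal sketch of how those three constants could combine into a score. The struct and function names are hypothetical, and two things are assumptions rather than anything stated above: that fainted Pokemon contribute nothing, and that the final score is the AI's side minus the foe's side.

```cpp
#include <cassert>
#include <vector>

// The three constants quoted above.
struct Weights {
    int stealth_rock = -300;
    int members = 1024;
    int hp = 1024;
};

// Hypothetical minimal Pokemon: only what these three terms need.
struct Pokemon {
    int hp;
    int max_hp;
    double rock_effectiveness;  // 1.0 neutral, 2.0 weak, 4.0 doubly weak
};

// Score one side of the field.
double score_side(std::vector<Pokemon> const & team, bool stealth_rock_down,
                  Weights const & weights) {
    double score = 0.0;
    for (auto const & pokemon : team) {
        assert(pokemon.max_hp > 0);
        if (pokemon.hp <= 0) continue;  // assumption: fainted counts for nothing
        score += weights.members;       // bonus for simply being alive
        // HP bonus scales linearly with the fraction of HP remaining.
        score += weights.hp * static_cast<double>(pokemon.hp) / pokemon.max_hp;
        if (stealth_rock_down) {
            score += weights.stealth_rock * pokemon.rock_effectiveness;
        }
    }
    return score;
}

// Assumption: positive totals favor the AI, negative totals favor the foe.
double evaluate(std::vector<Pokemon> const & ai_team, bool rocks_on_ai_side,
                std::vector<Pokemon> const & foe_team, bool rocks_on_foe_side,
                Weights const & weights) {
    return score_side(ai_team, rocks_on_ai_side, weights) -
        score_side(foe_team, rocks_on_foe_side, weights);
}
```

For example, a healthy Moltres (4x weak to Rock) facing Stealth Rock is worth 1024 + 1024 - 1200 = 848 under these weights, compared with 2048 without the rocks, and a neutral Pokemon at 50% HP with no rocks is worth 1024 + 512 = 1536.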

With those values, and Technical Machine's team being Jirachi @ Leftovers with Iron Head and Stealth Rock, Celebi, and Subroost Moltres (plus some others), and the foe's predicted team being Hippowdon with EQ, Slack Off, Stealth Rock, Roar, and Tentacruel with Rapid Spin, this is how Technical Machine played while searching to a depth of 2:

First turn, Stealth Rock. After that continue to Iron Head Hippowdon until one of them (usually Jirachi) dies. The reason for this behavior is that Technical Machine's team doesn't have a Ghost, so it's afraid of Tentacruel's Rapid Spin. Because of Iron Head's flinch chance and the fact that EQ isn't an OHKO, Jirachi would much rather risk dying to a CH EQ, but probably flinch Hippowdon or take a normal EQ than let Tentacruel get in for free and Rapid Spin. In other words, valuing SR at -300 with Pokemon on the team weak to SR (Moltres) makes Technical Machine play with a suicide lead to a greater extreme than almost any player.

If I alter those values so that Stealth Rock is only worth -100, the play is radically different. Now Technical Machine's first move is to just switch straight to Moltres. Moltres is the chosen switch because Hippowdon has no moves that can harm it, and Moltres can fire off either a Flamethrower or an Air Slash against the foe's team, which it decided was a better course of action than risking Hippowdon using Earthquake. In other words, when it values HP much more than Stealth Rock, it plays very conservatively in this situation.

Changing the value once again to -200 gives yet another radical play style (it plays exactly like I would for the first 4-5 turns). It uses Stealth Rock first turn, and when the foe uses Stealth Rock, it switches to Celebi. Celebi gives it a solid match-up against Hippowdon, and if Tentacruel comes out, Celebi has Psychic. Sure, Celebi will take a beating from Sludge Bomb, but it has Recover too, so Tentacruel is in a tight position. It correctly recognizes that the foe will switch to Skarmory and start using Spikes, but the team it was using is incredibly weak to Skarmory, so the foe going to Skarmory puts Technical Machine in a bad position regardless of what it does.

With a little bit of programming knowledge (not much, a smart non-programmer could figure this part out by reading the C++ of my evaluation function code), they could even make a 'suicide bomber' AI, that is more concerned with hurting you than winning. They could make it so the AI doesn't care about their own Pokemon at all, they just want to do massive damage against the foe (so they could use Explosion with their last Pokemon, just so you have to go back to a Pokemon Center instead of continuing on your challenge).

To recap: Changing the depth of search, which is done by just passing a single number to the program at the beginning of the battle (I actually have plans to make it so you can change the depth of search on a per-turn basis, or even within a turn, so AI-makers could take advantage of this if they want to change the quality of play of a player over time to simulate distraction / renewed focus) changes the strength of play. Changing the evaluation function changes the type of play. Pokemon AI-makers like Project Amethyst can use this to their advantage by loading a new team, evaluation constants, and depth for each trainer / type of trainer.

Also, more on the topic of your AI, I know that now chess computers have advanced so much, there are players that have developed specific play styles to beat computers. Surely this AI, especially its team and moveset prediction, would be especially vulnerable to players using unusual sets and teams. Would it be possible for your AI to have methods of combating this, or would it become too complicated or perhaps introduce worse flaws?
Well, there is a reason that certain things are standard, and that is because they are generally superior to other strategies. A player can use a strange team to throw off my prediction function, but if that weakens the team, it might not increase the odds of winning (the general problem with a gimmick team). And if it doesn't weaken the team, then that should eventually be discovered by Technical Machine and it would cease to be a surprise. The only idea I've had to try to counteract this is to try to keep tabs on individual players. If Technical Machine has played a certain user before, it can assume certain things about them that could be more accurate than just the general stats. However, that's a much more advanced concept, and I'm not sure how much of a real increase in strength that would give.

I wouldn't be worried about unusual sets or teams so much. Sure, that will throw its team prediction out of whack, but that doesn't put it at any serious disadvantage compared to a normal player, unless it's programmed to rely on its predictions as definite. I suppose things like Cosmic Power Jirachi could cause major concerns though, as it would take an enormous amount of time to see the inevitable win if it's allowed to boost endlessly.
It currently does rely on its predictions as definite, but only within a single turn. The reason is that if it tried to predict 7 moves instead of 4, for instance, then it would also have to evaluate those 7 moves, which would slow it down for not as much gain. However, once it sees Jirachi using Cosmic Power, it immediately knows Jirachi has Cosmic Power and will take steps to stop it.

One thing; if the program will always do the "best" move in any circumstances, it does make it very easy to predict. If the opponent knows which move covers all options the best they can choose their move accordingly. Obviously I can't see a way around this though :)
My ultimate goal is to make Technical Machine select from all of its moves with a certain (possibly 0) probability. So it would select Stealth Rock 70% of the time, Iron Head 20% of the time, and Body Slam 10% of the time, and switch / use Wish 0% of the time, as an example. This would make it only predictable on average, but it's hard to fight against "On average, it will use the best move, but sometimes it will use the best counter to my best counter to that generally best move". And of course, to find out exactly what Technical Machine would do, the foe would have to be running Technical Machine themselves, but they don't have Technical Machine's exact team, so it won't play exactly the same.
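
Selecting a move with fixed probabilities like these is straightforward with the standard library. The sketch below only covers the sampling step, using the example percentages from the post; the names are hypothetical, and how the probabilities themselves would be derived is not addressed.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <string>
#include <vector>

struct WeightedMove {
    std::string name;
    double probability;  // relative weight; 0 means never selected
};

// Pick one move at random, in proportion to the given weights.
template<typename Rng>
std::string const & select_move(std::vector<WeightedMove> const & moves, Rng & rng) {
    assert(!moves.empty());
    std::vector<double> weights;
    for (auto const & move : moves) {
        weights.push_back(move.probability);
    }
    std::discrete_distribution<std::size_t> distribution(weights.begin(), weights.end());
    return moves[distribution(rng)].name;
}

// The example percentages from the post.
std::vector<WeightedMove> example_strategy() {
    return {
        {"Stealth Rock", 70.0},
        {"Iron Head", 20.0},
        {"Body Slam", 10.0},
        {"Wish", 0.0},
        {"Switch", 0.0},
    };
}
```

Seeding a `std::mt19937` from `std::random_device` and passing it to `select_move` would keep the choice unpredictable from battle to battle, while a fixed seed makes a game reproducible for debugging.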
 
sounds like the project is basically done, so how is it actually doing? can you show us some battle logs of playing against your ai, maybe?

by the way, your signature is missing a c.
 

obi

formerly david stone
I'm in the "final" debugging stage before it's able to battle. After I am able to get it to complete a battle without it crashing and without Valgrind reporting any memory leaks, I will be fighting Fred, an AI created by Cathy and bearzly, the creators of Shoddy Battle / Pokemon Lab. My goal is to do that this week, but that's been my goal for almost a month now. I'll be posting the log of that battle as its debut. After that, this is my plan of the next few things to do, in no particular order:

1) Make it faster. The faster it gets, the stronger its play.

2) Make it connect to the server itself. I currently have to copy / paste battle logs to it and input moves manually.

3) Add support for gen 5. This is the big one.
 

obi

formerly david stone
The current plan is for Technical Machine and Fred to battle this Saturday at 18:00 MDT. That is 20:00 (8 PM) EDT (Eastern time). We will be battling on the Pokemon Experte server of Pokemon Lab.
 

obi

formerly david stone
OK, we had a bit of technical difficulty getting the battle started. Somehow, several of the most recent changes to Fred disappeared, so we had a bit of a delay getting those fixed. After we got Fred working, I challenged with Technical Machine, both of us using a modified version of my paralysis ("Machampions") team.

After a very long stall war toward the beginning (Fred using Iron Head while paralyzed), Technical Machine suddenly switched to Machamp and hit Jirachi with a DynamicPunch. However, after this point, my Pokemon Lab battle froze (first time I've had that problem on Pokemon Lab), so we had to restart the battle.

I decided that maybe it would be more interesting if I used my stall team instead. However, we then discovered a bug in Technical Machine caused by my log analysis algorithm being incomplete. My program didn't understand full paralysis or confusion properly, which led to a segmentation fault.

I believe that the battles do show Technical Machine to be a quality player, and once I work out these simple bugs, it should be a top-notch player. :toast:
 
