Technical Machine: A Pokemon AI

obi · Mar 29, 2010

Major update! Technical Machine home page and Mercurial repository.

I am working on an AI to play Pokemon (I'll make a more in-depth, general thread on it later), and one of the things I need is an evaluation function. I am using an expectiminimax algorithm with alpha-beta pruning, which works best when moves are already ordered from best to worst. This means I need a relatively simple function that can "guess" at how good a particular position is. Therefore, I need to assign values to various things so that my program can weight them properly.

Unfortunately, Pokemon has elements of luck in it. This is why I have to use an expectiminimax tree instead of a minimax tree (I'm simulating the game as being a three-player game: AI, foe, and God. God moves whenever there are elements of luck). This means that I can't just get the order of things correct, I have to get their magnitude correct.

To summarize, this is what an expectiminimax algorithm does:

The game is represented as a tree. Every move in a tree is called a node. At any point, the game is given a score. A positive score means the AI is winning; a negative score means the opponent is winning. The magnitude of the score determines how far ahead either person is. On any node for the AI, I try and find the move that maximizes the score. On any node for the foe, I try and find the move that minimizes the score (tries to make it more negative or less positive, which is maximizing the score from the point of view of the foe). On any node for God, I look at all nodes and average the score based on how likely each node is to be visited. For instance if the score after a CH is 600, and the score after not a CH is 200, the expected value for that is 225 (600 * 6.25% + 200 * 93.75%).
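
A minimal sketch of that scoring rule in Python, using the critical-hit numbers from the example. The Node class and its field names are made up for illustration and are not taken from Technical Machine.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    kind: str                       # "max" (AI), "min" (foe), "chance" (God) or "leaf"
    score: float = 0.0              # only meaningful for leaves
    children: List["Node"] = field(default_factory=list)
    probs: List[float] = field(default_factory=list)  # only used by chance nodes


def expectiminimax(node: Node) -> float:
    if node.kind == "leaf":
        return node.score
    values = [expectiminimax(child) for child in node.children]
    if node.kind == "max":          # the AI picks the highest score
        return max(values)
    if node.kind == "min":          # the foe picks the lowest score
        return min(values)
    # "God" node: average the children, weighted by how likely each one is
    return sum(p * v for p, v in zip(node.probs, values))


# Critical hit (6.25%) leads to 600, no critical hit (93.75%) leads to 200.
ch = Node("chance",
          children=[Node("leaf", 600), Node("leaf", 200)],
          probs=[0.0625, 0.9375])
print(expectiminimax(ch))  # 225.0
```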

I am uncertain of what scores to assign to various conditions, however. This is what I want help with. To see an example, consider GNU Chess.

Here is my outline for what scores could possibly be, with vague reasoning:

Pokemon should be worth a large value of points (larger than anything else). Going from 6 Pokemon to 5 Pokemon should cause the player to lose fewer points than going from 3 Pokemon to 2 Pokemon.

Entry hazards should be worth less points. Stealth Rock is generally worth the most, then the order of Spikes vs. Toxic Spikes depends on opposing team. The first layer of Spikes is worth more than the second or third (twice as much, based on the damage it does). The value of Spikes and Stealth Rock should decrease with fewer opposing Pokemon, dropping to 0 when the foe is on their last Pokemon (or when all remaining Pokemon are immune). The value of Toxic Spikes should decrease with fewer opposing Pokemon that are not already statused (with permanent status like paralysis and poison lowering the value more than temporary status like sleep and freeze), but not dropping to 0 when all foes are statused due to things like Rest, Natural Cure, and Aromatherapy. The value should be 0 when the foe has 1 Pokemon remaining or all remaining Pokemon are immune.

PP should be given very little weight initially. However, the weight should increase very quickly when the PP gets very low (if two games are equal except in one the foe has a move with 0 PP instead of 4 and the AI has a move with 20 PP instead of 24, that should be a fairly large win for the AI). Perhaps this could be accomplished by giving a penalty for any move with "low" PP.
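
One possible shape for such a penalty, sketched in Python. The cutoff for "low" PP and the size of the penalty are placeholder assumptions, not tuned values.

```python
LOW_PP_THRESHOLD = 5   # assumed cutoff below which PP counts as "low"
MAX_PENALTY = 40.0     # assumed penalty for a move with 0 PP left


def pp_penalty(pp: int) -> float:
    """No penalty while PP is comfortable, then a penalty that grows
    quadratically as PP approaches 0."""
    if pp >= LOW_PP_THRESHOLD:
        return 0.0
    missing = LOW_PP_THRESHOLD - pp
    return MAX_PENALTY * (missing / LOW_PP_THRESHOLD) ** 2


# The example from the post: the foe drops from 4 PP to 0, the AI from 24 to 20.
foe_loss = pp_penalty(0) - pp_penalty(4)    # large
ai_loss = pp_penalty(20) - pp_penalty(24)   # zero
print(foe_loss > ai_loss)  # True
```

With these placeholder numbers the AI's four lost PP cost nothing, while the foe's four lost PP cost almost the full penalty.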

Pokemon should be given a penalty based on how much damage they've taken, with preference given to more damage on one Pokemon instead of little damage spread out among all Pokemon, in general. However, Pokemon should be given a boost if they have things like Blaze activated, or if they have Reversal / Flail / Endeavor.

Pokemon match up seems like it should be part of it, as well, but an evaluation function should be fast. I don't know of a fast way to determine which Pokemon has the advantage in a 1v1 match-up. I'm also completely unsure how to handle weather. Trick Room, Fog, Gravity, Uproar, Hail, Sun, Sand, and Rain all influence both sides of the battle. Perhaps I could do something like look at type-match ups and stats. If a water Pokemon is out during rain, give it a bonus, if a Fire Pokemon is out vs. a Grass Pokemon, give it a bonus sort of thing.

Pokemon should also be penalized for status, various conditions, etc.

What's more, the game does not start out with the score at 0 (an even match up). As an example, a team of 6 Suicune is likely in a good position against a team of 6 Entei. However, at the very start of the game, all the information the AI has is information about the lead Pokemon. The score of the game should depend on team match ups, but I'm not sure how to quickly decide which team has the advantage over the other.

In other words, I have a vague idea of how things should be scored, but I need actual numbers. They don't need to be exact: I can fiddle with them later. I'm just looking for a relatively close estimate of how valuable various things are.

cantab · Mar 29, 2010

About entry hazards - for an offensive team, how much they're worth depends on how many of the AI's team members gain nHKOes on how much of 'the metagame' with the hazard. This can be done by running damage calcs with the team members as offense, against the Smogon standard sets on defense, weighting for usage, and perhaps considering the 'need' of the nHKOes. (A team with 2 good Mence checks, only one of which needs SR, needs the hazard less than a team with only one decent Mence check that requires SR).

For stall teams, it's rather different. There, I might guesstimate how long the hazards will be up for (which depends on how likely the opponent is deemed to have Rapid Spin and whether the AI has a spin blocker), and therefore calculate the percentage damage they should do (based on any information about susceptibility to hazards of the opponent's Pokemon), then compare that to the expected damage from the AI's other move choices.

I may be being naive or missing something, but I think you'd get a good algorithm just by running probabilistic damage and nHKO calculations. Almost everything useful a Pokemon can do relates to damage. Attacks, poison and burn, entry hazards, and phazing with entry hazards cause it. Stat boosting increases the ability to cause it. Healing moves remove it from yourself. All status except poison (including Taunt) inhibits the opponent from causing it to you or healing it from them.
You could insert a large 'gap' between 1 and 0 HP, to encourage KOing a weak Pokemon rather than denting a strong one. You might also add a small weighting towards reducing uncertainty, so the AI would 'scout' if it's running a phazing move.

zarator · Mar 29, 2010

First of all, I'd like to express my awe at such a project. Pokémon requires so much work on heuristic factors that shaping a "smart" AI may turn out to be a pain in the ass. Since I'm not much of an expert in these things, I'll just comment on a couple of things.

obi said:
The game is represented as a tree. Every move in a tree is called a node. At any point, the game is given a score. A positive score means the AI is winning; a negative score means the opponent is winning. The magnitude of the score determines how far ahead either person is. On any node for the AI, I try and find the move that maximizes the score. On any node for the foe, I try and find the move that minimizes the score (tries to make it more negative or less positive, which is maximizing the score from the point of view of the foe). On any node for God, I look at all nodes and average the score based on how likely each node is to be visited. For instance if the score after a CH is 600, and the score after not a CH is 200, the expected value for that is 225 (600 * 6.25% + 200 * 93.75%).

I guess you mean something like the number chess engines assign to a position in order to determine who is winning (I can't really remember what it's called). It may be a good choice, but the fact that Pokémon is a game of imperfect information makes it tricky. For example, say I have a team which matches up slightly badly against my opponent's. However, the number of my revealed Pokémon may affect the "win/loss ratio" of the match. If the opponent doesn't know my full team, for example, he may be tricked into sacrificing a Pokémon that is a key threat to my team. I dunno if this "information gap" should be accounted for in the AI's moves, and how.

Pokemon should be worth a large value of points (larger than anything else). Going from 6 Pokemon to 5 Pokemon should cause the player to lose fewer points than going from 3 Pokemon to 2 Pokemon.

On a general note, I agree, but in order to make the AI able to sacrifice its Pokémon intelligently (which is crucial, especially if the AI is using an offensive team), you should implement a way for the AI to evaluate each of its Pokémon against the opponent's team, to find out every turn which Pokémon, if any, are expendable. For example, if the opponent's last mons were, say, Celebi, Salamence and Skarmory, my Breloom (let's assume for the sake of the example that my Breloom loses to all three of them and is slower) is kinda useless. It may be useful to let the AI give different values to its Pokémon depending on how they match up against the opposition (in the example above, if I had an Infernape with HP Ice and Fire Blast, the AI should be able to recognize that Ape is much more important than Breloom).

Entry hazards should be worth less points. Stealth Rock is generally worth the most, then the order of Spikes vs. Toxic Spikes depends on opposing team. The first layer of Spikes is worth more than the second or third (twice as much, based on the damage it does). The value of Spikes and Stealth Rock should decrease with fewer opposing Pokemon, dropping to 0 when the foe is on their last Pokemon (or when all remaining Pokemon are immune). The value of Toxic Spikes should decrease with fewer opposing Pokemon that are not already statused (with permanent status like paralysis and poison lowering the value more than temporary status like sleep and freeze), but not dropping to 0 when all foes are statused due to things like Rest, Natural Cure, and Aromatherapy. The value should be 0 when the foe has 1 Pokemon remaining or all remaining Pokemon are immune.

In regard to entry hazards - assuming your AI is able to make switches - you should make the AI calc whether or not the switching Pokémon will survive entry hazards, in order to assign to the "switch move" the correct value.

PP should be given very little weight initially. However, the weight should increase very quickly when the PP gets very low (if two games are equal except in one the foe has a move with 0 PP instead of 4 and the AI has a move with 20 PP instead of 24, that should be a fairly large win for the AI). Perhaps this could be accomplished by giving a penalty for any move with "low" PP.

Maybe you could also make the AI give higher weight to moves with naturally low PP (8, maybe 16 too) right from the start if the opponent has Pressure. One more controversial check could be raising the weight of the most powerful move against X if X has used Rest or (assuming you can't 2HKO/OHKO) Recover (with these moves, making the AI able to run "in-depth" calcs like in chess in order to see if it can win the stall or not is probably the best thing).

I agree with the remaining part of your post. Actually, making these calcs reasonably fast is arguably the hardest challenge. I reckon, for example, that some of the features I suggested earlier would take away a lot of time for the AI to figure them out.

cantab · Mar 29, 2010

When thinking about speed, bear in mind you have nearly 5 minutes. In singles you have at most 9 options for your move (not including resignation), far fewer than the number in a 'typical' chess position. I don't think you'll have much trouble with calculation speed provided you use a fast language (ideally something compiled).

obi · Mar 29, 2010

How alpha-beta pruning works is it assumes that the best move it's seen so far is the best move the player can use. It then tries to find out whether this is correct. If the first move is the best, all it has to do is verify that this is the case. If the first move is not the best, however, it has to find the actual value of the best move, which is significantly slower. Here is the reason why:

Let's say all moves are sorted in order of best move to worst move.

I use some move and look at the opponent's responses (a 2-ply search, ignoring "God"). I see that after each of their moves, the game state can be a value of any of the following: -5, 2, 3, 7, 10. I'm going to assume the opponent will use the best move, which means the score will end up at -5 if I use that move. Then let's look at my next move and the score of the foe's responses: -10, 7, 20, 50, 100. In there, I have to check all 5 moves using a naive search. However, because everything is ordered properly and I'm using alpha-beta pruning, I only have to search one node for that second move! The reason is that as soon as I see that it starts with -10, it doesn't matter what the other moves are. My opponent already has a move which gives him 5 more points (-10 compared to -5), which means that if my opponent plays perfectly, I'm better off using the first move than the second. Therefore, I don't have to search the other moves at all.

If I don't have perfect move order, I might have to search more than just the first move to find the opponent's "better" response. I say better because it doesn't have to find the best. All it has to do is show the foe has at least one response that is better than their best response to the best move I've already checked. This is why it's faster to verify something is better than to find the best. To find the best, it will have to search all responses and pick the one that is best. With alpha-beta pruning, it stops searching when it shows the opponent can make the game state better.
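
To make the counting concrete, here is a small Python sketch of the 2-ply case with the numbers from the example above. The function name and the list-of-lists input format are just for illustration.

```python
def alphabeta_2ply(replies_per_move):
    """replies_per_move[i] lists the scores after each foe reply to my move i.
    Returns (best move index, its score, number of positions evaluated)."""
    best_index, best_score = None, float("-inf")
    evaluated = 0
    for i, replies in enumerate(replies_per_move):
        worst = float("inf")            # the foe picks the lowest score
        for score in replies:
            evaluated += 1
            worst = min(worst, score)
            if worst <= best_score:
                # The foe already has a reply at least as good for them as
                # their best reply to my best move so far: stop searching.
                break
        if worst > best_score:
            best_index, best_score = i, worst
    return best_index, best_score, evaluated


well_ordered = [[-5, 2, 3, 7, 10], [-10, 7, 20, 50, 100]]
badly_ordered = [[-5, 2, 3, 7, 10], [100, 50, 20, 7, -10]]
print(alphabeta_2ply(well_ordered))   # (0, -5, 6)
print(alphabeta_2ply(badly_ordered))  # (0, -5, 10)
```

Both runs pick the first move with a score of -5, but the well-ordered list needs 6 evaluations while the badly ordered one needs all 10.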

The evaluation function doesn't need to be perfect, however. Anything in the evaluation function that involves looking ahead and calculating future outcomes is completely out of the question. That will be handled by my expectiminimax tree. I plan on doing something similar to quiescence search in chess. Quiescence search is when the program notices that the state of the game is likely to change a lot in the next couple of moves, it searches deeper than it normally would until the game becomes more stable. This is related to another thing I plan on doing, which is similar to an end-game tablebase in chess. When the 'material' gets low enough, I'll head more toward brute force and deep searches to try and find the win.

I need to call the evaluation function for every single outcome of a turn. This means that it can be called several thousand times per turn. Obviously, the more accurate my function is, the better, but if I get something that selects the best move 5% more often but takes twice as long to run, that will usually give a net loss in playing power. This is because I'm spending so much more time evaluating each position that I can't search interesting positions as deeply. Deeper searches tend to give stronger play more than anything else. This is why the evaluation function should not have stuff like "If this Pokemon has a greater than 50% chance to win in a 1v1 battle with each side using optimum moves, give a bonus of 60 points".

And just to explain a little deeper, the reason a good evaluation function is related to move ordering is that I plan on traversing the tree using a process called iterative deepening. What that means is that first I look one "ply" (a move, basically) ahead. I evaluate all the positions, and order them from best to worst. Then I repeat the search with one more ply, and re-order based on the values at the end of the tree (at the leaf node). This might seem wasteful to have to continually re-evaluate positions, but due to the large branching factor in Pokemon, almost all of the time will be spent at the deepest parts of the tree, meaning that re-ordering the tree early and often will give massive speed ups much greater than the cost of re-evaluating positions.

cantab said:
When thinking about speed, bear in mind you have nearly 5 minutes. In singles you have at most 9 options for your move (not including resignation), far fewer than the number in a 'typical' chess position. I don't think you'll have much trouble with calculation speed provided you use a fast language (ideally something compiled).

I'm writing this in C++.

There are really more than 9 options per turn. The best case is a move like Dragon Dance. It has one outcome. The worst case is a move like Fire Fang. It causes the tree to branch in an additional (16 values of r * 2 values of CH * 2 values of causing burn * 2 values for flinch =) 128 paths. Fortunately, with smart move ordering I can simply not evaluate many of those paths. For instance, if I can show that CH flinch burn Fire Fang is still not as good as the worst case of another move (let's say, Earthquake), then there is no need to evaluate any of those 127 other branches. In most cases it won't be on either extreme, but the branching factor is much more than 9 per move. The element of luck is actually why I've been thinking that computers would play better stall Pokemon than offense, as long as they can properly evaluate entry hazards. Non-attacking moves tend to have only one outcome.

Even if we assume 9 options per move, a turn consists of both players moving, which means a turn has 81 outcomes. The average branching factor in a game of chess is about 20-30, which is smaller because of no simultaneous move execution. It's also simpler to evaluate attacks, because taking a piece is "Can I move to this square?", compared to Pokemon where you need to run a fairly long calculation.

Just to give you an idea of how much luck can cause the game to branch, if both sides are down to their last Pokemon and have four attacking moves each, with all of them having one side effect (Thunderbolt, Flamethrower, Ice Beam, Psychic, for instance), then each turn has 65536 possible outcomes. Of course, if they can switch the game only gets an additional 2585 outcomes, bringing the total to 68121, which is not quite the worst case (moves like Fire Fang make it worse), but it's much worse than average.
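
The arithmetic behind those figures, written out as a short Python check. The 68121 total is reproduced if each side also has 5 switch options with one outcome each, which is an inference from the numbers rather than something stated in the post.

```python
DAMAGE_ROLLS = 16   # values of the random damage factor r
CRIT = 2            # critical hit or no critical hit


def outcomes(*binary_side_effects: str) -> int:
    """Outcomes of one attacking move: damage rolls * crit * 2 per side effect."""
    return DAMAGE_ROLLS * CRIT * 2 ** len(binary_side_effects)


fire_fang = outcomes("burn", "flinch")          # 128
thunderbolt = outcomes("paralysis")             # 64
per_side = 4 * thunderbolt                      # four such moves: 256
both_attack = per_side ** 2                     # 65536
with_switches = (per_side + 5) ** 2             # 68121 (5 switches per side assumed)
print(fire_fang, both_attack, with_switches - both_attack, with_switches)
# 128 65536 2585 68121
```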

Little Green Yoda · Mar 29, 2010

Really interesting stuff there, obi. For the scoring system, you could try adopting the win probability format, similar to what baseball sabermetricians use (0% for a sure loss, 100% for a sure win, 50% for an even match). Percentages may get messy depending how precise the numbers get though. It should function well as a starting point and can be scaled later for aesthetic purposes.

Looking at "endgame" scenarios like 1v1 or 2v1 is probably the easiest place to start IMO. At the very least, you avoid the mystery Pokemon factor that you pointed out in the first post. But even then, with the multitude of possible moves on some Pokemon, getting that initial evaluation will be tough. Perhaps some weighting of move usage percentage from Shoddy statistics is something to consider?

cantab · Mar 29, 2010

So if the "evaluation function" is really just a first guess, I'd say use the expected 'damage difference' based on weighted averages of the metagame, perhaps modified by information you have, with my previous suggestion for calculating the damage done by entry hazards based on a guess of how long they will stay up for.
(It has to be a difference, not a ratio, because if you did damage done/damage received, that's always zero if the AI switches.)

BTW, to what extent might you disregard uncommon moves on the opponent? I said there are 9 possibilities for a player - but what the opponent could do is more. Considering every move the opponent's Pokemon might have will vastly increase the complexity, but if you don't think of rare options the AI might get caught out. (For example, it would be unwise for the AI to assume a Salamence that has Dragon Danced does not have Draco Meteor, since the Mixed Dancer set is used, and isn't even massively rare).

Another possibility might be ignoring rare added effects. In your previous example, Flamethrower, Ice Beam, and Thunderbolt all have only a 10% chance for the added effect. If no-one's a Serene Grace user or possible user, would it really hurt to ignore the chance?

obi · Mar 29, 2010

Well, 10% is actually pretty high. You tend to use those moves more than 6 times in a battle, which means you are more likely than not to have it happen at least once.
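
A quick Python check of that claim, assuming each use is an independent 10% roll:

```python
def chance_at_least_once(p: float, uses: int) -> float:
    """Probability that an effect with chance p triggers at least once in `uses` tries."""
    return 1 - (1 - p) ** uses


for n in (6, 7):
    print(n, round(chance_at_least_once(0.10, n), 3))
# 6 0.469
# 7 0.522
```

So six uses fall just short of even odds, and the seventh use tips it over 50%.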

And yes, the evaluation function is basically a guess. You have to weigh accuracy vs. speed. Simply assigning a value to everything and adding it all up (or even having the values depend to one degree or two degrees on another value) means you'll have a very fast function that should be able to approximate the actual game state. Like in the GNU Chess example I gave, pawns are worth 100 points (which means the smallest change in a score is 1/100 of a pawn, but that's not too important, since you can multiply all the numbers by the same factor and it won't change anything). However, pawns that are not supported by other pawns get a penalty, pawns near the king get a bonus, etc. So you start off with some value and take a few factors into account to try and make it more accurate. Addition, subtraction, multiplication, and division are all really fast operations, so as long as you're not doing too much of them, it shouldn't be an issue.

cantab · Mar 29, 2010

OK, looks like the evaluation function is meant to be evaluating the 'state' rather than actions. So most of my previous stuff was off the mark.

We may still be able to rescue some of my ideas though.

Let's start with total percentage HP. Essentially, the general idea is we will then estimate how much certain things will reduce this HP by in future, and subtract the estimate now. Since in most cases we penalise the victim, scores range from zero (everything KOed) to 600 (everything full health).

For entry hazards, we take the percentage they deal (weighted to take into account general weaknesses and resistances over the whole metagame, so the weighting is constant - 1 layer of Spikes thus gets rather less than 12.5% because of all the airborne stuff), and multiply that by one more than the number of phazers the non-hazard team has (I assume a phazer induces as many extra switches as a team would do normally, which I know is VERY rough, but hey), then multiply that by the average number of switches per battle times the fraction of the battle remaining, (average turns per battle - elapsed turns) / average turns per battle, and subtract the result from the hazard team's total HP.
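
A rough Python sketch of that estimate, reading the last factor as (average turns - elapsed turns) / average turns. Every input value in the example call is invented for illustration; real values would have to come from metagame statistics.

```python
def expected_hazard_damage(weighted_damage_pct: float,
                           phazers: int,
                           avg_switches: float,
                           avg_turns: float,
                           elapsed_turns: float) -> float:
    """Estimated total % HP the hazards will still deal over the rest of the battle."""
    # Fraction of an average battle that is still to come (never negative).
    remaining_fraction = max(0.0, (avg_turns - elapsed_turns) / avg_turns)
    return weighted_damage_pct * (1 + phazers) * avg_switches * remaining_fraction


# Invented numbers: 10% weighted damage per switch-in, 1 phazer,
# 8 switches in an average 40-turn battle, 10 turns already played.
penalty = expected_hazard_damage(10.0, 1, 8, 40, 10)
print(penalty)  # 120.0
```

The result would be subtracted from the total HP score (0 to 600) of the team that has the hazards on its side of the field.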

If we know one team lacks a counter to a Pokemon the other team has, subtract 100% (assuming an uncountered Pokemon gets 1 kill - perhaps 100% is low). If we know one team lacks a counter to a Pokemon the other team MAY have, subtract the probability the opponent has that Pokemon based on usage. This basically requires an array of what counters what. (About a quarter of a million entries). For simplicity, go on species - either assume eg all Magnezone counter all Scizor, or weight it by set usage. Just saying "yes" or "no" only needs one bit making the memory requirements for the array much smaller.
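
The one-bit-per-entry table could look something like this in Python. The class name, the zero-based species indices, and the Magnezone/Scizor entry are illustrative assumptions; 493 is the species count as of the fourth generation, which gives the "quarter of a million" entries.

```python
SPECIES = 493  # species count as of the fourth generation


class CounterTable:
    """Yes/no 'A counters B' table stored as one bit per (A, B) pair."""

    def __init__(self, species: int = SPECIES):
        self.species = species
        self.bits = bytearray((species * species + 7) // 8)

    def _index(self, counter: int, target: int) -> int:
        return counter * self.species + target

    def set_counter(self, counter: int, target: int) -> None:
        i = self._index(counter, target)
        self.bits[i // 8] |= 1 << (i % 8)

    def counters(self, counter: int, target: int) -> bool:
        i = self._index(counter, target)
        return bool(self.bits[i // 8] & (1 << (i % 8)))


# Hypothetical indices: national dex number minus one.
MAGNEZONE, SCIZOR = 461, 211
table = CounterTable()
table.set_counter(MAGNEZONE, SCIZOR)
print(SPECIES * SPECIES, len(table.bits))  # 243049 30382
```

That is 243049 entries in about 30 KB, small enough to sit in cache on most machines.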

If one of the currently out Pokemon usually counters the other, we assume a switch will happen, and simply hit the switching team with a flat 'free hit' penalty (20% seems a good ballpark figure, since most switchins resist what's out, though of course they can be predicted). For more refinement, if the non-switcher has a stat-up move, it's a bigger hit for the switching team.

Most probabilities should be done as lookups, based on whole-metagame statistics. Whether the lookup table should be updated during battle I don't know - doing so is more effort, but not doing so creates an AI that may paradoxically weaken as battle goes on.

We still need to consider field effects. I would assume the benefit falls on who initiated them (and thus track this), but how much the benefit is is unclear.

We also need to consider stat-ups. I'm unsure how big an impact these have, since they are situational. If the stat-up Pokemon is known uncountered by the opponent, however, it's probably game-winning, so I might simply shove the opponent's score to zero right there.

Jonathan · Mar 29, 2010

Really interesting project you have here, I'm curious to see what comes out of it.

Regarding Pokemon match-ups, would it maybe be possible to store Doug's detailed statistics we get monthly for move usages? I know very little about programming so I have no idea how easy this would be to do, and whether the AI would be able to handle all that data at a reasonable speed. But I feel that information would be very useful to implement as it's a big part of what people use to determine Pokemon matchups.

Really, I think even implementing something as simple as [switch out if PLAYER's Pokemon has a type that is SE against the AI's Pokemon's typing] would be a big step up from the in-game AI. From there you can build things up slowly. In my experience with these kinds of things, actually experimenting with the AI system you create will give you better ideas about how to implement weather/gravity/etc. than trying to think everything out beforehand.

Acritter · Mar 29, 2010

Wouldn't an AI be extremely reliant on what its team is? For example, a stall team is loath to lose even one member, while a team oriented on getting one Pokemon to sweep wouldn't care about who it lost so long as the sweep was set up. A universal AI is probably impossible: I know I play much differently with different teams. Maybe it would be a good idea to start by making a team you want to program the AI for.

Jonathan · Mar 30, 2010

^Yeah this is a good idea, you could program in a bunch of teams from the RMT section potentially.

FlareBlitz · Mar 30, 2010

It seems like weather would be relatively easy to account for; the effects of weather are always fixed, so they don't need to be accounted for as far as branching or iteration goes. A fixed point value for weather would be acceptable, somewhat like the point value adjustments for isolated/backward pawns, with possible weights based on the relevance of the weather's effects.

For instance, Sandstorm increases the SDEF of Rock types by 1.5x, does residual damage equal to 6.25%, and activates the ability Sand Veil. Given this, if a sandstorm is currently raging, all Pokemon on both sides would have set point values added/deducted from their total point value.

Essentially, a Rock-type Pokemon in sand will gain points, a Pokemon with Sand Veil will gain points, a Pokemon immune to sand will not gain or lose points, and a Pokemon vulnerable to sand will lose points.

Rain/Sun can be accounted for similarly; Rain boosts water type attacks, increases the accuracy of Thunder, boosts the speed of Pokemon with swift swim, and so on. If a Pokemon has Swift Swim and it is currently raining, that Pokemon's point value should be sharply increased.

Of course, quantification would be an issue, but then, it's an issue for pretty much all these systems, in any context. Even in chess, we never know whether a Bishop is worth more than a Knight intrinsically, so we just assign them equal point values (some systems assign the Bishop more) after taking into account their value relative to other pieces. I assume a similar amount of "eyeballing" will be involved here.

I would like to ask one thing. Pokemon is obviously a game of imperfect information; however, the best players are able to deduce information about an opponent's set based on certain battle conditions. To take a simple example, assume that the AI has a Pokemon with both Fire and Ground attacks, with Spikes up, and a Bronzong switches in. The Fire and Ground attacks will do roughly similar damage without taking into account Bronzong's ability, so the AI would choose the Fire attack (because it's a coin flip as to what ability it has, and doing some damage is better than doing no damage). However, an actual player would, upon seeing Bronzong switch in, notice that it has taken damage from Spikes, and would therefore simply use the Ground attack, which is the optimal choice. I feel that in order for an AI to be successful, it certainly needs some rudimentary prediction and extrapolation skills, and I'm curious as to how you will program this in.
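
One simple way to express the Bronzong deduction is a Bayesian update over the possible abilities, sketched here in Python. The 50/50 prior is the "coin flip" from the example, the function name is made up, and effects such as Gravity that would let a Levitate user take Spikes damage are ignored.

```python
def update_abilities(prior: dict, likelihood: dict) -> dict:
    """Bayes' rule: reweight each ability by how likely it makes the observation."""
    unnormalized = {a: prior[a] * likelihood[a] for a in prior}
    total = sum(unnormalized.values())
    if total == 0:              # observation impossible under every ability
        return dict(prior)      # keep the old beliefs rather than divide by zero
    return {a: v / total for a, v in unnormalized.items()}


prior = {"Levitate": 0.5, "Heatproof": 0.5}
# Observation: Bronzong took Spikes damage on the switch-in.
took_spikes_damage = {"Levitate": 0.0, "Heatproof": 1.0}
posterior = update_abilities(prior, took_spikes_damage)
print(posterior)  # {'Levitate': 0.0, 'Heatproof': 1.0}
```

Once Levitate is ruled out, the search tree no longer needs a branch where the Ground attack misses, so the Ground attack comes out ahead on its own.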

Chinese Dood · Mar 30, 2010

This is pretty interesting. I am currently taking an AI course at school (I actually just made a 1.5 hour presentation on Chess AI last week lol). There are so many unknowns in pokemon that this sounds like it would take an insane amount of time to perfect. I mean, Deep Blue took over 10 years of development from primarily 3 programmers to be completed to beat the best chess player in the world. With our more advanced technology right now, it would hopefully take much less time, but still this will take a long time to perfect.

My only suggestion (in case you weren't doing this already) would be to build something small that works first (e.g. something like the Battle Tower AI... or probably a bit more intelligent than that) , with the framework for expansion, and then just iterate on it, instead of trying to build the best thing out there right from the start, because chances are the project will never be completed that way.

obi · Mar 30, 2010

I have several working parts already. For instance, I have a working reverse damage calculator that takes the damage done in an attack and uses all known data to find hidden values (such as EVs). This helps improve the limited information bit. What I really need help with is determining the relative value of things for my evaluation function.

Chinese Dood · Mar 30, 2010

Cool! Like others said, these relative values really depend heavily on what type of team it is (stall vs offense). It would obviously be easier to model an AI for an offense team, since battles with an offense team will have fewer turns than stall. For an offense team, secondary effects of a move (e.g. burn/paralysis/poison/flinch) would have very little weight (or no weight even) because the objective is to KO, not relying on status, whereas the opposite is true for a stall team (except ... when all opponents' pokemon are statused already, then those 2ndary status effects should then have no weight, obviously).

From the above, it would then also make sense for the AI to determine the value in having an opponent being statused. This would probably be an entire database of values for each status for each pokemon. E.g. (not that these are any indication of any accurate values)
Dragonite might have a burn value of +.2 if you don't know what set it is, +.3 if you know it has at least one physical move, +.5 if it has only physical moves, +.1 if it is totally special. From statistics, we could determine the probability that an unknown Dragonite is physical/mixed/special{or how they are paired with other team members if there is info on that}... which would be a factor in determining this burn value. Then there's paralysis, poison, sleep, confusion, and possibly even accuracy (not applicable unless the team being used has accuracy-lowering moves). A general determination by the AI as to whether or not the pokemon is defensive or offensive will probably also play into what those values are. Obviously, potential status-related abilities like Natural Cure / Quick Feet / Guts / Marvel Scale will also change those values. That Dragonite's remaining HP will also obviously be a factor.
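
As a sketch of how such a lookup might work, here is the Dragonite burn example in Python using the same placeholder numbers. The function name and the way revealed moves are passed in are assumptions, and abilities and remaining HP are left out.

```python
def burn_value(revealed_categories, set_fully_known: bool) -> float:
    """Value of burning a foe, given the categories of the moves it has revealed."""
    attacks = [c for c in revealed_categories if c in ("physical", "special")]
    has_physical = "physical" in attacks
    has_special = "special" in attacks
    if set_fully_known and has_physical and not has_special:
        return 0.5    # only physical attacks
    if set_fully_known and attacks and not has_physical:
        return 0.1    # totally special
    if has_physical:
        return 0.3    # at least one physical move seen
    return 0.2        # set unknown


print(burn_value([], False))                              # 0.2
print(burn_value(["physical"], False))                    # 0.3
print(burn_value(["physical", "physical"], True))         # 0.5
print(burn_value(["special", "special", "status"], True)) # 0.1
```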

If +1 means AI is winning by 1 Pokemon, then inducing a successful burn on a purely physical Dragonite (just cuz I used that as the example earlier) might result in an additional +0.6 or something like that (depending on the nite's remaining HP), but might only result in +0.2 for a Special nite, or +0.15 if it's a special one with roost.

Hm, I should probably just shut up now since I'm pretty much just thinking out loud and typing it out lol. Basically, it's probably eyeballing the #s and then tweaking until it feels right (or if you can, write an AI that tweaks these values by itself... i.e. make it able to learn. That's obviously... difficult to create). I can't think of any other way of doing this atm really.

Bamce · Mar 30, 2010

Are you going to assign different roles different values?

Clearly the opposing team's revenge killer is very important. Being able to identify certain pokes (Scizor, for example) in such a role would warrant giving them special attention. With that poke out of the way, certain sweepers have a much easier time.

I would also consider hard coding it to not boost more than once, just to keep it from randomly over-boosting. It's not often that you need more than one anyway. Factoring in Intimidate or something similar for physical sweepers, it would look like:

If X(attack) = +2 then no boost
If X(attack) = 0 then Swords Dance
If X(attack) = +1 or -1 then Swords Dance

Replace attack with speed, for rock polish or spatk for nasty plot.
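
The rule above could be written roughly like this. The function and table names are made up for the sketch, and treating every stage below +2 as "boost" is an assumption, since the rule only lists -1, 0, +1 and +2.

```python
# Boosting move per stat, as listed in the post (each raises its stat by two stages).
BOOST_MOVE = {
    "attack": "Swords Dance",
    "speed": "Rock Polish",
    "spatk": "Nasty Plot",
}


def should_boost(stage, cap=2):
    """Return True if the AI should still boost at this stat stage.

    Follows the rule in the post: boost at -1, 0 or +1, stop at +2.
    Boosting at stages below -1 is an assumption the post does not cover.
    """
    if not -6 <= stage <= 6:
        raise ValueError("stat stages range from -6 to +6")
    return stage < cap


def pick_boost(stat, stage):
    """Return the boosting move to use, or None if no boost is wanted."""
    return BOOST_MOVE[stat] if should_boost(stage) else None


print(pick_boost("attack", -1))  # after an Intimidate: boost again
print(pick_boost("attack", 2))   # already at +2: attack instead
```

A hard cap like this is a shortcut. The alternative would be to let the search decide, but the cap keeps the AI from wasting turns while the evaluation is still rough.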

As for assigning values, perhaps a team should be made first? This way we could all comment on how certain things weigh in relation to that team. Give that team to the AI first. Or make two opposing teams and, after discussion, have them battle it out. Or, to avoid unintentionally skewing the results, have yourself and a different person make the teams, so you don't subconsciously make one team more dominant than the other.

edit:
When talking about additional effects you could trim down the actual effects.
Burn= attack drop = intimidate
para= speed drop = icy wind

I don't see where the major status additional effects warrant much in-depth calculating. The damage from poison/burn will be taken into account in the health portion of the program. The battle effects could be attributed to X number of stat drops from things like Intimidate.

I don't recall ever truly thinking that I should use Psychic and hope for a Sp. Def drop. It's nice when it happens, but it's never something that is truly planned for. Except in a last-poke stalling battle, but then you may as well just hope for a crit.

ck49 · Mar 30, 2010

*Brain explodes*
Interesting but complex. I have a couple thoughts. How will the AI make its team? Will we make it for it, then let it play? I would find it interesting to know the "best" team possible, according to the AI.

Also, I don't think this has been mentioned, but the AI should make some assumptions, and I was also thinking of Jonathan's idea of feeding in Doug's stats. For example, from the teammate stats, we know that a person who runs Jolteon is very likely to run Gyarados. Thus, when facing a Jolteon, it shouldn't use Earthquake but rather Stone Edge, since a Gyarados switching in is immune to Ground and weak to Rock. The toughest part, I would think, is accounting for predicting what the opponent is going to do, which there isn't as much of in chess because it's a game of perfect information. I don't know programming, so I can't help with that, but I'd love to see this go through, and I'd love to help in any way.

Odinwolf · Mar 30, 2010

Obi, would it be a bad idea to use a descriptive model for ordering the value of various moves? As in, use the usage stats on Shoddy to order them, rather than the intuition of people on the forums? If your playing program is collecting stats on its own play, you might be able to re-order them yourself over time, but in the short term using what the population has already ordered seems to be the best option.

Not sure how you handle the differences in magnitude between them in that instance, perhaps use a system like our tiers... The top 5 moves are all equally spaced from each other, with a gap for the next X that are all equally spaced, but with wider gaps than the first set, etc. Again, you could adjust that over time if you are collecting stats of your own to evaluate your model.
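
A minimal sketch of that tier-style spacing, assuming fixed-size tiers (the "next X" is just the same tier size here) and made-up step and gap sizes. The input is a list already ordered by usage, most used first; none of the parameter names come from any existing program.

```python
def tiered_scores(ranked, tier_size=5, base_step=1.0, tier_gap=3.0, step_growth=2.0):
    """Map a usage-ordered list (most used first) to descending scores.

    Entries inside a tier are equally spaced; crossing into the next tier
    adds a larger gap and widens the spacing from then on.
    All default sizes are arbitrary placeholders.
    """
    scores = {}
    score = 0.0
    step = base_step
    for i, name in enumerate(ranked):
        if i > 0:
            if i % tier_size == 0:
                score -= tier_gap      # gap between tiers
                step *= step_growth    # wider spacing inside the new tier
            else:
                score -= step
        scores[name] = score
    return scores


# Hypothetical usage-ordered list of seven entries.
for name, value in tiered_scores(list("ABCDEFG")).items():
    print(name, value)
```

Only the ordering comes from the usage stats; the spacing is a guess that would need adjusting once the program collects stats of its own.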

petrie911 · Mar 30, 2010

A full evaluation of the game's state seems like it would be very, very difficult, as there are many, many variables, and subtle nuances like team matchups would be hard to evaluate. This is why I suggest perhaps a more perturbative approach to figuring out an evaluation function. We start with a simpler evaluation function that takes into account only a few variables. The AI will likely make mistakes with such a function, but those mistakes can then tell us what variables need to be taken into account.

For example, suppose we start with the most basic evaluation function possible, the sum of the AI's team's remaining HP minus the sum of the opponent's team's remaining HP. We then notice the AI never takes time to set up SR. We then know that SR itself needs some extra weight, as the AI is not perceiving its long-term value.
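
As a sketch, the basic function plus one "perturbation" term might look like the following. Using HP fractions rather than raw HP is my assumption, and `rocks_weight` is a hypothetical knob that starts at zero and only gets a value once the AI is seen ignoring SR.

```python
def evaluate(ai_team_hp, foe_team_hp, foe_side_has_rocks=False, rocks_weight=0.0):
    """Score a position; positive favours the AI.

    ai_team_hp / foe_team_hp: remaining HP as fractions (0.0 to 1.0),
    one entry per Pokemon. Using fractions instead of raw HP is an assumption.
    rocks_weight: extra credit for Stealth Rock on the foe's side;
    0.0 reproduces the most basic evaluation function.
    """
    score = sum(ai_team_hp) - sum(foe_team_hp)
    if foe_side_has_rocks:
        score += rocks_weight
    return score


# Same HP totals, with and without a (made-up) weight for SR.
print(evaluate([1.0, 0.5], [1.0, 0.25]))
print(evaluate([1.0, 0.5], [1.0, 0.25], foe_side_has_rocks=True, rocks_weight=0.1))
```

Each new term added this way corresponds to one observed mistake, which keeps the function small and makes it clear why every weight is there.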

To be honest, I'm actually kind of curious how well an AI would play using just that basic evaluation function mentioned.

Afti · Mar 30, 2010

Back to something mentioned a couple posts ago, on how it would identify a stall team as opposed to an offensive team and act accordingly:

Perhaps assign each Pokemon a point value from 100 to -100, where 100 would be something like Deoxys-A and -100 would be, say, Shuckle? The epitomes of fragile offense and inoffensive wallishness. From there, add the totals of these values as the information becomes available, and assume the opposing team is an offensive team if it has a positive score, and stall if negative. Individual styles could be accounted for, too; a bulky offense team would be lower than a team of extremely fragile things like Infernape/Azelf/Gengar.
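
Something like this, as a sketch. Only the Deoxys-A and Shuckle values come from the idea above; the other entries are hypothetical, and counting unlisted Pokemon as 0 and a zero total as "unknown" are my assumptions.

```python
# Style values from 100 (fragile offense) to -100 (inoffensive wall).
# Deoxys-A and Shuckle are the endpoints named in the post;
# the remaining entries are hypothetical placeholders.
STYLE = {
    "Deoxys-A": 100,
    "Shuckle": -100,
    "Infernape": 70,
    "Blissey": -80,
}


def team_style(revealed, style_values=STYLE):
    """Classify the opposing team from the Pokemon revealed so far.

    Returns (label, total). Unlisted Pokemon count as 0 (an assumption).
    """
    total = sum(style_values.get(name, 0) for name in revealed)
    if total > 0:
        return "offense", total
    if total < 0:
        return "stall", total
    return "unknown", total


print(team_style(["Infernape", "Deoxys-A"]))
print(team_style(["Blissey", "Shuckle", "Infernape"]))
```

The total can be updated each time a new Pokemon is revealed, and its size gives a rough idea of how far toward either extreme the team leans.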

Keatsta · Mar 30, 2010

Reading this thread really makes me want to program a competing AI that could challenge yours, but I don't have the time, programming skill or knowledge of Pokemon.

What it also got me thinking of, though, when people were comparing it to chess, was Garry Kasparov's idea of computer-assisted chess, where the human player would enter each move into a program that would then analyze all possible moves and present the player with its top choices. The player would then use his insight to pick the best move out of a field of good choices, a problem difficult for computers.

A Pokemon equivalent of this could simply be a program that used Doug's server statistics to predict what the opponent could be running based on what you have already seen. It would be much easier to code than an AI since the actual decision making would be done by the human, so the program would need to do nothing more than some statistical calculations and present what it guesses for the rest of the opponent's team. I think this would be cool since it would shift success at Pokemon away from being able to memorize movepools and towards just having good insight and clever teams (in fact, unique sets and teams would be even more potent, as an obscure strategy that would somewhat fool the program would be necessary).
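
To give an idea of how little the program would need to do, here is a rough sketch. The table format and all the rates are invented for illustration (this is not the format of Doug's statistics), and averaging the rates is a crude ranking heuristic rather than a proper probability.

```python
def predict_teammates(revealed, teammate_stats, top_n=3):
    """Rank unrevealed Pokemon by average co-occurrence rate with revealed ones.

    teammate_stats: {pokemon: {teammate: rate}}, a hypothetical table format.
    Averaging the rates is a simple heuristic, not a true probability.
    """
    scores = {}
    for seen in revealed:
        for mate, rate in teammate_stats.get(seen, {}).items():
            if mate not in revealed:
                scores[mate] = scores.get(mate, 0.0) + rate
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [(mate, total / len(revealed)) for mate, total in ranked[:top_n]]


# Made-up teammate rates for illustration only.
stats = {
    "Jolteon": {"Gyarados": 0.4, "Scizor": 0.2},
    "Scizor": {"Gyarados": 0.3, "Heatran": 0.25},
}
for mate, score in predict_teammates(["Jolteon", "Scizor"], stats):
    print(mate, round(score, 3))
```

The human still makes every decision; the program only narrows down what the unseen part of the team is likely to be.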

Anyways, just a thought. If I had more free time, I'd try coding this myself, but I don't know how far I'd get.

Wikey · Mar 30, 2010

The one thing I wanted to bring up was somewhat hit on by Keatsta's post. Basically, one of the big flaws of any AI program is its predictability. Once an opponent grasps the gist of what the evaluation is going to determine, they know what the AI is going to do. Basically, if last time I had my Pokémon X out against his Y with similar battle conditions and he did Move Z, he will do Move Z again. With human players you can never be 100% sure what they're going to do, but with an AI you can be. Beating it would be less Pokemon and more like beating a boss in Megaman: memorizing what it does in certain situations and acting accordingly. I'm not sure if you had any plans for countering this or how much of a problem it would be; it's just something I was thinking of.

Pim · Mar 30, 2010

Wikey said:
The one thing I wanted to bring up was somewhat hit on by Keatsta's post. Basically, one of the big flaws of any AI program is its predictability. Once an opponent grasps the gist of what the evaluation is going to determine, they know what the AI is going to do. Basically, if last time I had my Pokémon X out against his Y with similar battle conditions and he did Move Z, he will do Move Z again. With human players you can never be 100% sure what they're going to do, but with an AI you can be. Beating it would be less Pokemon and more like beating a boss in Megaman: memorizing what it does in certain situations and acting accordingly. I'm not sure if you had any plans for countering this or how much of a problem it would be; it's just something I was thinking of.

Are you aware that the reason computers often win chess matches consisting of multiple games is exactly what you described, except that it is the computer that "learns"? I would guess it's really hard to pull off, but if there is anyone who could, it's Obi.

As for the evaluation function, I think using the existing statistics to value the Pokemon in each team, where a higher ranking means a higher value, could be an option, especially because it seems that it would need relatively little computation.

Another option could be base stats, possibly weighing stats with EVs more heavily (because Breloom's Sp. Atk doesn't matter).

Good luck with this awesome (and huge) project

PallyHeroRush52 · Mar 30, 2010

So I can't procure or receive critical hits in Smogonbattle if I'm an atheist since there's only two players? Sweet.
