Applying Game Theory To Pokemon redux

Matthew · Jul 13, 2010

The original thread will be posted here but you can find it here as well (all credit goes to McGrue for the OP!)

Introduction

Game Theory is the study of decision making by individual agents (players) in response to others in any given scenario (game). It is applicable to all walks of life, so why not Pokemon? The aim of this thread is to give you a new perspective on battling, with the long term goal of educating the masses.

Assumptions

1. Players are rational.
This must be true if you expect to predict accurately.

2. Players aim to maximize their chances of winning.
Otherwise, there is nothing to it.

The Importance of Information

As anyone who has scouted a tournament opponent (or been the victim of scouting) will testify, knowing your opponent's team gives you an undeniable edge. However, scouting a battle can tell you so much more, notably how risk-loving or risk-averse your opponent is (how often does he switch out of a neutral matchup? does he use Fire Blast over Flamethrower? etc.). All these tidbits accumulate to provide you with a profile to work with later.

Before a battle, you need to know just how good your opponent is, so you can associate him with a profile; for example weaker players will switch out of immediate dangers, stronger players will account for secondary dangers (your likeliest predictions) and the strongest players will be familiar with all sorts of mind games (I know that you know that I know that you know that I know that you know... etc).

During a battle, you need to figure out as soon as possible whether your profiling is accurate. This applies even to familiar opponents, because nobody battles the same way on different days, for any number of exogenous reasons. Further, given that players tend to be more cautious early on (perhaps because they are doing the same thing!), this is more difficult than touted.

Catalogue all information that occurs during the battle, and in particular note the turns where you outplay your opponent and vice versa. After a while, the strongest players will develop a sense of what is going on, and press home their advantage. Essentially, a battle amongst the best players is a race to pigeonhole how well their opponent is playing on that particular day (all else held equal), and predict accordingly.

The strongest players are able to adjust their style if they realize that they are being outplayed.

The Choice Band Dilemma

This is a play on The Prisoner's Dilemma, but it is slightly more complicated. Consider the following (not unlikely) scenario:

~Bug catcher Aeolus leads with choice band Heracross with megahorn/close combat/stone edge/pursuit
~Schoolboy Batpig leads with Swampert with earthquake/stone edge/ice beam/stealth rock
~Bug catcher Aeolus is a good battler, schoolboy Batpig is top of his class.
~Neither player knows anything about the other's team.

Clearly the matchup favours Heracross and, with a lack of information, we can safely assume Aeolus will not switch. The question is what attack he will use. If Swampert stays, it will certainly use stealth rock. Given this, Aeolus does best with close combat or megahorn. If Swampert switches, neither of these rates to be optimal, as the new pokemon will likely resist one or both. The risk of predicting no switch and Batpig changing to a ghost means megahorn is a better option than close combat.

Given a switch, pursuit will not damage Swampert significantly and more or less forces Aeolus to switch on turn two. Therefore, if Batpig switches, Aeolus does best with stone edge. We have thus concluded that given this scenario, Heracross will only ever use megahorn or stone edge (through similar analysis, you can be reasonably sure that a choice band Tyranitar will use crunch as its first attack etc).

Since Batpig is top of his class, he understands this mechanism and assigns a probability for which attack Aeolus will choose. He knows that if he leaves Swampert in to megahorn, the price could be defeat. On the other hand, if Heracross uses stone edge and Swampert stays in, not only has he laid stealth rock first turn, but Heracross must switch second turn.

Since Aeolus is merely a “good” battler, Batpig decides that Heracross will stone edge 70% of the time (arbitrary example), expecting a switch. If neither player began with a team edge over the other, Batpig needs this play to win 30% more battles than he would otherwise, to break even in the long run and justify staying in.
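To make the break-even idea concrete, here is a minimal sketch in Python. Only the 70% Stone Edge estimate comes from the example above; the win chances (0.60, 0.30, 0.50) and the function names are made-up assumptions for illustration, not figures from the thread.

```python
def expected_win_chance(p_stone_edge, win_if_stone_edge, win_if_megahorn):
    """Batpig's average win chance if Swampert stays in.

    p_stone_edge is Batpig's estimate that Heracross uses stone edge;
    the remaining probability is assigned to megahorn.
    """
    p_megahorn = 1.0 - p_stone_edge
    return p_stone_edge * win_if_stone_edge + p_megahorn * win_if_megahorn


def break_even_win_chance(p_stone_edge, win_if_megahorn, win_if_switch):
    """Win chance the 'stay in vs stone edge' outcome must reach
    for staying in to match the win chance of simply switching."""
    p_megahorn = 1.0 - p_stone_edge
    return (win_if_switch - p_megahorn * win_if_megahorn) / p_stone_edge


# Hypothetical numbers: staying in wins 60% of games after a stone edge,
# 30% after eating a megahorn, and switching out wins 50%.
stay = expected_win_chance(0.70, 0.60, 0.30)
needed = break_even_win_chance(0.70, 0.30, 0.50)
print(f"stay in: {stay:.3f}, break-even payoff needed: {needed:.3f}")
```

Under these assumed numbers staying in is only barely better than switching, which is why the accuracy of the 70% estimate matters so much.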

Decision Tree

In the above scenario, I used words to describe something that is easier to communicate visually. Observe the generalized decision tree of one turn in Pokemon battling (N is a third agent “nature” also known as “luck” i.e. hit/miss, CH? and RNG):

It is important to note that A and B act simultaneously with the same information set (the tree suggests B acts after A). The payoffs are expectations based upon the probability of that outcome. More precisely, there should be 4 options rather than “attack” and 5 options over “switch”, but that would over complicate things.

Independence

Subsequent decisions are more interesting. The outcome of a repeated matchup will not only depend on the situation of the battle, but some causation can be attributed to what has happened before. The good battler tries to be unpredictable by not adhering to patterns, but the best battlers always consider each situation on its merits.

Undoubtedly, you will have gathered more information about your opponent's team, thus changing the probabilities and expected payoffs of every possibility. Mind games really kick in now, and your profiling better be accurate.

Commitment

If you want someone to do something, commit to an action that obliges them to do just that. For example in real life, if you want your girlfriend to go to a football match with you when she would rather watch a teen flick, book the tickets in advance. This incurs an additional cost to not going to the football, so she is more likely to err in your favour.

Similarly, commitment occurs in pokemon with choice items. Once you lock yourself into an attack, you can only continue to use that attack, or switch. Since your attacking option is known, your opponent can only react optimally to that attack. For example, by committing to ice beam for a kill, your opponent is unable to switch to a Salamence who could otherwise dragon dance and sweep.

Comment

These methods may seem mechanical and even long winded, but after accumulating experience, you will find that they become almost second nature. Other battlers may well recognize in themselves everything that I have said, but probably have never thought about battling so explicitly. This is kind of short (honest) because I got fed up with it halfway through. I might add stuff later depending on interest.

---
Often when playing a game we're faced with difficult decisions that will either mean we win a game or lose a game. We might think player A will switch when really he's given up on keeping the pokemon alive, and in return we might lose the advantage which we once had. This is where a player's style comes in. When battling someone it's key to watch when they switch, how they attack, and when they think they're safe to set up. Too many times I have seen a Suicune come in on a Heatran and think it's a solid move to start stat-upping right away when it's still early in the game, only to come face to face with a Roserade, Celebi, or even Explosion from Heatran. Was player A wrong to Calm Mind early game? Well, it depends on what the Heatran was. Did he see LO recoil? Did he see Leftovers? If he saw nothing, is it safe to assume it's Choice Scarf or Specs? All these things fuel how we play.

There are some people who don't pay attention to the logs, however. They'll gloss over a Heatran with Leftovers and still fear Explosion off of it. I was playing a match today where I switched a Tyranitar into a Rotom's Thunderbolt; the player didn't see Leftovers and must have thought that I was some kind of Choice set (little did he know I was Expert Belt). He made a good call and switched Rotom out while a Scizor came in to take a Crunch. Now, he didn't see LO damage, which fits his idea that Tyranitar is some sort of Choice set. The problem is it's still early game, much of our teams has yet to be revealed, and he's willing to bet a pokemon on whether or not Tyranitar is Choiced. Every call he made was and seemed correct, so why didn't it pan out for him? This point goes back to the beginning of McGrue's post, "is your profiling accurate?" While it's easy to say "I'd like to improve my prediction/profile/whatever you kids call it," a lot of it comes with experience from the game and what you take out of it.

obi · Jul 13, 2010

I don't think you can reduce it to "Megahorn or Stone Edge". Pursuit is almost certainly a bad choice, but Close Combat has some advantages. Most importantly, it has 100% accuracy. If you are assuming Swampert will either Stealth Rock or switch, then Close Combat is most likely the better choice if Swampert stays in. Both options will 2HKO Swampert, but Close Combat does it 100% of the time, whereas Megahorn will only 2HKO 72.25% of the time. More likely, you are guaranteed to do a lot of damage to Swampert that first turn and then both parties switch.
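The 72.25% figure is just Megahorn's 85% accuracy applied twice in a row. A minimal sketch of that arithmetic, assuming each hit is independent and ignoring critical hits and damage rolls (the function name is made up for illustration):

```python
def chance_all_hits(accuracy, hits_needed):
    """Probability that a move connects on every one of hits_needed uses,
    assuming each accuracy check is independent."""
    return accuracy ** hits_needed


# Megahorn (85% accurate) needs to land twice for the 2HKO;
# Close Combat (100% accurate) always lands both hits.
megahorn_2hko = chance_all_hits(0.85, 2)
close_combat_2hko = chance_all_hits(1.00, 2)
print(f"Megahorn: {megahorn_2hko:.4f}, Close Combat: {close_combat_2hko:.4f}")
```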

If Swampert switches, sure, a Ghost means Close Combat does nothing, and a Pokemon like Rotom-A is the safest switch, but it's not guaranteed that they have it or that they will switch to it over something like Gliscor or Salamence.

You really should use expectiminimax. Find the average pay out of all moves (I think they'll stay in 60% of the time and switch 40% of the time, and the likely break down of those switches is..., and if they stay in my average pay out with this move is...), and then assume your opponent picks the move that gives you the minimum average pay out. Of course, once you get more information about your opponent's play style, you can kind of eliminate that "minimax" part and do more average pay out (I know my opponent is risk averse, so he's likely to switch out Swampert more...).
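As a rough sketch of the two modes described here (pick the best average pay out against a believed stay/switch split, or assume the opponent picks whatever is worst for you), something like the following Python would do. Every number in the table is a made-up win chance for illustration, with hit/miss luck assumed to be already averaged in; only the 60/40 split comes from the post.

```python
# Hypothetical average pay outs (win chances) for Heracross's moves,
# indexed by what the Swampert player does. Not real calculations.
PAYOFFS = {
    "Megahorn":     {"stay": 0.70, "switch": 0.40},
    "Close Combat": {"stay": 0.75, "switch": 0.35},
    "Stone Edge":   {"stay": 0.45, "switch": 0.60},
    "Pursuit":      {"stay": 0.40, "switch": 0.45},
}


def best_against_belief(payoffs, belief):
    """Move with the highest average pay out, given a belief such as
    {"stay": 0.6, "switch": 0.4} about the opponent's action."""
    def average(move):
        return sum(prob * payoffs[move][action] for action, prob in belief.items())
    return max(payoffs, key=average)


def best_worst_case(payoffs):
    """Move whose minimum pay out is largest: the 'minimax' part, which
    assumes the opponent always picks the reply that hurts you most."""
    return max(payoffs, key=lambda move: min(payoffs[move].values()))


print(best_against_belief(PAYOFFS, {"stay": 0.6, "switch": 0.4}))
print(best_worst_case(PAYOFFS))
```

With these assumed numbers the two approaches disagree: the belief-based pick favours hitting Swampert, while the worst-case pick favours covering the switch. That gap is exactly what better information about the opponent lets you close.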

And if you want to get more advanced, you play the truly optimal strategy, which is almost certainly a mixed strategy. What that means is you would play Megahorn a% of the time, Close Combat b%, switch to Tyranitar c% of the time, etc.
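For the smallest possible case, two moves against two replies, the optimal mix has a closed form. The sketch below is only an illustration under stated assumptions: the game is treated as zero-sum (your win chance is the opponent's loss chance), the four pay outs are made-up numbers, and the function name is hypothetical.

```python
def solve_2x2(a, b, c, d):
    """Optimal mix for the row player in a 2x2 zero-sum game.

    Row player's pay outs:
                   opponent stays   opponent switches
        move 1          a                  b
        move 2          c                  d
    Returns (probability of playing move 1, value of the game).
    """
    maximin = max(min(a, b), min(c, d))
    minimax = min(max(a, c), max(b, d))
    if maximin == minimax:
        # A pure strategy is already optimal, so the mixing formula is not needed.
        raise ValueError("game has a saddle point; play the pure strategy")
    denom = a - b - c + d
    p_move1 = (d - c) / denom
    value = (a * d - b * c) / denom
    return p_move1, value


# Hypothetical win chances: Megahorn vs stay/switch, Stone Edge vs stay/switch.
p_megahorn, value = solve_2x2(0.70, 0.40, 0.45, 0.60)
print(f"Megahorn {p_megahorn:.1%} of the time, expected win chance {value:.3f}")
```

At the optimal mix the opponent gets the same result whether he stays or switches, which is what makes the strategy unexploitable. With more than two options per side (adding Close Combat, a switch to Tyranitar, and so on) there is no such simple formula, and the mix is usually found with linear programming.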

magickzzl · Jul 13, 2010

Most of this is way over my head, but absolutely amazing. I'm just curious: if you're predicting battles this way, what percent do you give to the unknown/human factor, that bit of irrationality or typo or just user error? Is this covered and I missed it?

Thank you for this analysis though! amazing!

BurningMan · Jul 13, 2010

How should the Swampert player know that it's a CB Heracross and not a Swords Dance/Flame Orb set or any other weird set (which isn't that unlikely, because seriously, who leads with Heracross o.O)?

Outside of that it's well written. The biggest problem with applying the strategy is that a lot of the time assumption 1 will be wrong when you're playing on the ladder.

Matthew · Jul 13, 2010

BurningMan said:
How should the Swampert player know that it's a CB Heracross and not a Swords Dance/Flame Orb set or any other weird set (which isn't that unlikely, because seriously, who leads with Heracross o.O)?

That's an issue I brought up when I was explaining my Expert Belt Tyranitar. The player I was facing made every right decision, yet in the end he still lost his Scizor. It has something to do with how many different pieces of Pokemon there are and how each piece individually affects the game. The situation stated in the OP is just an example; a better situation is:

Youngster Aldaron sent out his Metagross while I sent out my Choice Specs Heatran.
We can assume that Aldaron's goal is to get Stealth Rock down at the beginning of the game, but he also wants Metagross to stay alive. Overheat off of Specs Tran will OHKO Metagross through Occa, preventing him from setting down Stealth Rock, but a Water-type like Suicune could easily switch in, take the attack, then Rest off the damage. Since it's the beginning of the match, this goes to what obi says: I should Overheat a% of the time, use Hidden Power Grass b% of the time, and switch to Rotom-W c% of the time.

noob3 · Jul 13, 2010

This is what I do not like about pokemon: You can make the right move, predict perfectly and get 6-0'd because your opponent was playing poorly. A lot of the time I find myself making the wrong move and being rewarded for it, it's a good way to throw off better players.

alamaster · Jul 13, 2010

noob3 said:
This is what I do not like about pokemon: You can make the right move, predict perfectly and get 6-0'd because your opponent was playing poorly. A lot of the time I find myself making the wrong move and being rewarded for it, it's a good way to throw off better players.

That's often called over-prediction. I do this a lot, but usually after a couple of turns I can somewhat gauge how the opponent thinks and respond accordingly. Also, if you say you predicted perfectly but still got 6-0'd, then that isn't predicting perfectly. What you're saying is basically the theory itself: if I predict perfectly according to the most logical move the opponent could make, then I should be rewarded. You could be making the right move logically, but the opponent predicts one step further. They could have been playing poorly, or they could have just outplayed you.

MegaKick · Jul 13, 2010

Very nice, I remember the original thread, immensely helpful. However, there is another aspect to this for the Heracross user: can they handle Swampert? I know that this may not go with what this is talking about, but if Swampert gives trouble to your team, we can assume they will Megahorn rather than Stone Edge. If the rest of the team handles it well, the Heracross will most likely Stone Edge.
Also agreeing that over-prediction sucks, especially when you predict the opponent will stay in, you use the wrong attack, and everyone calls you a noob >___< /rant
Excellent thread genny, an excellent reread for older players and an excellent read for newer players.

Sandstreamer · Jul 13, 2010

noob3 said:
This is what I do not like about pokemon: You can make the right move, predict perfectly and get 6-0'd because your opponent was playing poorly. A lot of the time I find myself making the wrong move and being rewarded for it, it's a good way to throw off better players.

The summary of every remotely interesting game ever.

Thorns · Jul 13, 2010

don't predict on ladder

an important thing to note is how they react to certain moves, and how they react to those same moves later on. often opponents, and perhaps you yourself, apply minimaxing or the optimal probability assignment system obi talked about. it's useful to see how much your opponent knows about your team. for example, say you've got your expert belt tyranitar, without stone edge, out against a zapdos. as you crunch to nail whatever he switches in for the most damage possible, your opponent brings in his metagross. at this point, you can safely assume he expected stone edge. however, it's possible to fool him into thinking you were just throwing out a crunch to scout. when you bring your ttar in on his zapdos again, just do a double switch. not only does this make you more unpredictable, but it conceals information about your tyranitar. and hopefully, it'll give you a perfect opportunity to fire blast that metagross!

ginganinja · Jul 13, 2010

I actually try to play like this on occasion when I am facing an opponent who seems to have some skill. It makes for very enjoyable matches. One that stands out in my mind was a match I had vs IPL where we were double switching and predicting everything. What made it more exciting was that I lacked SR on the team at the time, so I was trying to stop his DD Mence and Gyarados from sweeping and he was trying to predict the Latias switch-in, and it was just awesome. Anyway, all of this game theory was being applied except for the scouting-your-opponent's-team bit, since it was a surprise ladder battle. I really should warstory it lol, but it occurred just before the Latias banning and so is very likely useless to post now.
