Hey Smogon,
We’ve been working on something wild over the last year: a fully open-source AI benchmark built around competitive Pokémon battles.
It’s called the PokéAgent Challenge, and it’s going to be hosted at NeurIPS 2025 in San Diego this December.
Our goal is simple: use Pokémon as a real testbed for AI reasoning and learning.
There are two main agents powering the challenge, both putting up crazy performance on Gens 1, 2, 3, 4, and 9 OU (with VGC in the works!):
PokéChamp:
Large Language Model-based agents.
We use LLMs like ChatGPT, Claude, and Gemini to plan ahead, model opponents, and pick actions. The agent can even explain its choices like a human player would.
→ GitHub: github.com/sethkarten/pokechamp
Metamon:
Reinforcement Learning agents that learn from experience: no scripts, no hand-coded rules.
→ GitHub: github.com/UT-Austin-RPL/metamon
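If you want to tinker with either repo, both build on top of the open-source poke-env interface to Pokémon Showdown (as far as I'm aware), so spinning up your own scripted baseline only takes a few lines. Here's a rough sketch of a toy poke-env player; the battle format, opponent, and battle count are placeholders, and the real agents swap the hand-written choose_move for LLM planning (PokéChamp) or a learned policy (Metamon):

```python
# Minimal sketch, assuming poke-env (which, to my knowledge, both repos build on).
# Requires a local Pokémon Showdown server running on poke-env's default port.
import asyncio

from poke_env.player import Player, RandomPlayer


class MaxDamagePlayer(Player):
    """Toy baseline: always click the available move with the highest base power."""

    def choose_move(self, battle):
        if battle.available_moves:
            # Ignores typing, boosts, etc. -- purely illustrative
            best_move = max(battle.available_moves, key=lambda m: m.base_power)
            return self.create_order(best_move)
        # Forced to switch (or otherwise no moves): pick a random legal option
        return self.choose_random_move(battle)


async def main():
    # Battle format and opponent are placeholders
    opponent = RandomPlayer(battle_format="gen9randombattle")
    player = MaxDamagePlayer(battle_format="gen9randombattle")

    await player.battle_against(opponent, n_battles=10)
    print(f"MaxDamagePlayer won {player.n_won_battles}/10 battles")


if __name__ == "__main__":
    asyncio.run(main())
```

That's the whole loop: PokéChamp and Metamon just make choose_move a lot smarter.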
Data
Did I mention we have the largest Pokémon battle dataset? Through a combination of human replays and bot ladder battles, we have almost 10M replays (and growing).
If you’re into bot development, battle analysis, or just want to see how close AI can get to real human play, check out:
https://pokeagent.github.io
