As the title states, I am currently trying to train a Pokémon singles battle bot with RL (PPO + MCTS). I have recently realized that training from scratch through self-play is very time-consuming and fairly impractical. The reward is rising steadily, so the model is learning, but far too slowly. Based on prior reading and my own estimates, it would take at least 100 million steps to reach a reasonably good bot, and right now it takes about 8 hours to run 1 million steps. This is unsustainable.
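To put those numbers in perspective, here is a rough back-of-the-envelope sketch. The function name and the `parallel_workers` parameter are my own illustration, and the linear scaling with workers is an optimistic assumption, not something I have measured.

```python
def training_days(total_steps, hours_per_million_steps, parallel_workers=1):
    """Estimate wall-clock days of training.

    Assumes (optimistically) that throughput scales linearly with the
    number of parallel simulator workers.
    """
    hours = total_steps / 1_000_000 * hours_per_million_steps / parallel_workers
    return hours / 24


# Current setup: 100 million steps at 8 hours per million steps.
print(round(training_days(100_000_000, 8), 1))  # 33.3

# Hypothetical: the same workload spread across 16 workers.
print(round(training_days(100_000_000, 8, parallel_workers=16), 1))  # 2.1
```

So at the current rate a single run would take about 800 hours, or roughly a month, before I even know whether the setup works.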
I believe the bottleneck is that Pokémon Showdown runs on the CPU, so there isn't really a way to speed it up without using more resources. My idea was therefore to first train a simple base model on existing battle data, but I have been unable to find any such data.
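To make the pretraining idea concrete, this is a minimal sketch of what I have in mind (behavior cloning: fit a policy to the moves human players chose). Everything here is a toy assumption: the two-number state features, the two actions, and the names `train_bc` and `act` are made up for illustration, and a real version would need features parsed from actual battle logs and a proper network instead of a linear softmax model.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def scores(weights, features):
    """Linear score per action: one weight row per action."""
    return [sum(w * x for w, x in zip(row, features)) for row in weights]


def train_bc(dataset, n_features, n_actions, epochs=200, lr=0.5):
    """Fit a linear softmax policy to (state_features, expert_action) pairs
    with plain SGD on the cross-entropy loss."""
    weights = [[0.0] * n_features for _ in range(n_actions)]
    for _ in range(epochs):
        for features, action in dataset:
            probs = softmax(scores(weights, features))
            for a in range(n_actions):
                # Gradient of cross-entropy w.r.t. logit a: p_a - 1[a == expert].
                grad = probs[a] - (1.0 if a == action else 0.0)
                for j in range(n_features):
                    weights[a][j] -= lr * grad * features[j]
    return weights


def act(weights, features):
    """Pick the highest-scoring action for a state."""
    s = scores(weights, features)
    return max(range(len(s)), key=s.__getitem__)


# Hypothetical toy data: (state features, action the expert chose).
toy_data = [([1.0, 0.0], 0), ([0.0, 1.0], 1), ([1.0, 0.2], 0), ([0.1, 1.0], 1)]
policy = train_bc(toy_data, n_features=2, n_actions=2)
print(act(policy, [1.0, 0.0]), act(policy, [0.0, 1.0]))
```

The pretrained weights would then be the starting point for PPO self-play instead of a random initialization.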
Do you guys know where I could find datasets that contain entire battle histories (all the battle states, team info, etc.)?
If you have any other ideas, I would really appreciate the help.
Thanks!