Categories: HTML, Programming, Other
Old Sep 11th, 2011, 4:18:03 PM   #1
Antar*
That's Dr. Antar to you
is a Battle Server Administrator, is a Super Moderator, is a Programmer, is a Community Contributor
Super Moderator
Join Date: Feb 2010
Posts: 2,050
DC Metro Area
Default Turning Battle Logs into Usage Stats

As I'm sure many of you have seen, I recently took a crack at generating usage stats.

Unlike my predecessors, however, the only raw data I was able to access were the battle logs stored on the server. These logs, which are pretty much identical to the ones that get generated client-side, leave a lot to be desired--they only show the pokemon that appeared in the battle itself, they don't contain natures/items/EV spreads/movesets, and they don't tell the players' current ranking.

Also, they're in HTML. Great for turning into warstories, pretty annoying for trying to cull data from.

But, nonetheless, I managed to write a few python scripts which turn these battle logs into usage stats (what we're going to end up DOING with these stats is a question for another thread), and I'm posting them here. Feel free to make suggestions as to how to modify them or improve them--I'll need all the help I can get.
__________________
Codes and Hacks I Use
PBR FC: 4898-8739-8815 (See here)
Black FC: 4040 5386 0128 / White 2 FC: 4771 3664 7215
My Narrated PBR & Gen V Battles
My Trade Thread
Convert any sim team to pkms
Pokemetrics: A Blog

Last edited by Antar; Sep 12th, 2011 at 12:56:08 PM.
Old Sep 11th, 2011, 4:18:35 PM   #2
Antar*
Default Course of Action

To turn battle logs into usage stats, here's what needs to be done:
  1. Identify the tier and whether the battle was rated.
  2. Make sure the battle meets whatever criteria we decide upon ("longer than 5 turns," "player has rating above 1000," "loser said gg after the battle"...)
  3. Find all lines beginning with <div class="SendOut">
  4. Identify the name of the trainer and the species of the pokemon sent out (THANK GOD we play with Species clause). This is a bit tricky because the string is different depending on whether the pokemon was nicknamed or not.
  5. Remove redundant entries (to account for switching)
  6. Write the species of all pokemon used in the battle to a file (write the species name twice if both trainers used it, obviously).
  7. Make another script. This one will take that giant file and simply tally each pokemon's usage (doing this step separately, rather than keeping a running tally, prevents race conditions if you're parallelizing the workload).
  8. Sort the usage stats.
  9. PROFIT!!!
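Steps 3 through 5 can be sketched roughly as follows. This is only an illustration, not the script posted in this thread: the two line shapes in the comment are inferred from how LogReader.py slices each line, and the names parse_send_out and teams_from_log are made up for the example.

```python
import re

# Assumed line shapes (inferred from the slicing in LogReader.py, not from a spec):
#   <div class="SendOut">Trainer sent out Species!</div>
#   <div class="SendOut">Trainer sent out Nickname! (Species)</div>
SEND_OUT = re.compile(
    r'^<div class="SendOut">(?P<trainer>.*?) sent out (?P<rest>.*)</div>\s*$'
)


def parse_send_out(line):
    """Return (trainer, species) for a SendOut line, or None for any other line."""
    m = SEND_OUT.match(line)
    if m is None:
        return None
    rest = m.group("rest")
    if rest.endswith(")"):
        # Nicknamed: the species is whatever sits inside the last parentheses.
        species = rest[rest.rfind("(") + 1:-1]
    else:
        # Not nicknamed: drop the trailing exclamation mark.
        species = rest[:-1] if rest.endswith("!") else rest
    return m.group("trainer"), species


def teams_from_log(lines):
    """Map each trainer to the species they sent out, in order, without repeats."""
    teams = {}
    for line in lines:
        parsed = parse_send_out(line)
        if parsed is None:
            continue
        trainer, species = parsed
        seen = teams.setdefault(trainer, [])
        if species not in seen:  # switching back in produces repeat entries
            seen.append(species)
    return teams
```

Keying the result by trainer means step 6 can write each trainer's list separately, so a species used by both sides is naturally written twice.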
Old Sep 11th, 2011, 4:19:41 PM   #3
Antar*
Default LogReader.py

This script will take a battle log (server version 1.0.23) and write the names of all pokemon used in the battle to a file corresponding to the battle's tier.

Usage:
Code:
python LogReader.py "name-of-log-file.html"
Source:
Code:
import string
import sys
filename = str(sys.argv[1])
file = open(filename)
log = file.readlines()

if (len(log) < 15):
	sys.exit()
#determine tier
if log[2][0:25] != '<div class="TierSection">':
	sys.exit()
tier = log[2][string.find(log[2],"</b>")+4:len(log[2])-7]
if log[3][0:19] == '<div class="Rated">':
	rated = log[3][string.find(log[3],"</b>")+4:len(log[3])-7]
else:
	if log[5][0:19] == '<div class="Rated">':
		rated = log[5][string.find(log[5],"</b>")+4:len(log[5])-7]
	else:
		print "Can't find the rating"
		for line in range(0,15):
			print log[line] #dump the header lines themselves, not just their indices
		sys.exit()

#make sure the battle lasted at least six turns (to discard early forfeits)
longEnough = False
for line in log:
	if line == '<div class="BeginTurn"><b><span style=\'color:#0000ff\'>Start of turn 6</span></b></div>\n':
		longEnough = True
		break
if longEnough == False:
	sys.exit()

#trainer = []
#species = []
ts = [] #handle in one array to allow for sorting
#find all "sent out" messages
for line in range(6,len(log)):
	if log[line][0:21] == '<div class="SendOut">':
		ttemp = log[line][21:string.find(log[line],' sent out ')]

		#determine whether the pokemon is nicknamed or not
		if log[line][len(log[line])-8] == ')':
			stemp = log[line][string.rfind(log[line],'(')+1:len(log[line])-8]
		else:
			stemp = log[line][string.rfind(log[line],'sent out ')+9:len(log[line])-8]

		#determine whether this entry is already in the list
		match = 0
		for i in range(0,len(ts)):
			if (ts[i][0] == ttemp) and (ts[i][1] == stemp):
				match = 1
				break
		if match == 0:
			ts.append([ttemp,stemp])

ts=sorted(ts, key=lambda ts:ts[0])

outname = "Raw/"+tier+" "+rated+".txt"
outfile=open(outname,'a')

outfile.write(str(ts[0][0]))
outfile.write("\n")
i=0
while (i < len(ts)) and (ts[i][0] == ts[0][0]): #bounds check in case only one trainer appears
	outfile.write(str(ts[i][1]))
	outfile.write("\n")
	i = i + 1
outfile.write("***\n")
outfile.write(str(ts[len(ts)-1][0]))
outfile.write("\n")
for j in range(i,len(ts)):
	outfile.write(str(ts[j][1]))
	outfile.write("\n")

outfile.write("---\n")
outfile.close()
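For anyone adapting the header and turn-count checks, here is one way they might be pulled out into helpers. Treat it as a sketch under assumptions: the helper names are invented, it is written for Python 3 rather than the Python 2 used above, and it relies only on what the script itself relies on (a div of a known class with the value after the closing bold tag, and a "Start of turn N" message).

```python
import re


def header_value(line, css_class):
    """Return the text after </b> in a header div of the given class, else None."""
    prefix = '<div class="%s">' % css_class
    if not line.startswith(prefix):
        return None
    body = line.rstrip("\n")
    if body.endswith("</div>"):
        body = body[:-len("</div>")]
    marker = body.find("</b>")
    if marker == -1:
        return None
    return body[marker + len("</b>"):]


TURN = re.compile(r"Start of turn (\d+)")


def last_turn(lines):
    """Highest turn number announced anywhere in the log, or 0 if there is none."""
    turns = [int(m.group(1)) for m in map(TURN.search, lines) if m]
    return max(turns, default=0)
```

With these, the "at least six turns" rule becomes last_turn(log) >= 6, and the threshold is easy to change without editing a long literal string.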
Here's a new version that does quite a bit more: it not only identifies usage but also culls data for other "pokemetrics." It does this by keeping track of all matchups in a battle and the outcome of each matchup.

LogReaderOnCrack.py


Change Log

Last edited by Antar; Oct 26th, 2011 at 1:16:22 PM.
Old Sep 11th, 2011, 4:22:36 PM   #4
Antar*
Default StatCounter.py

Once the LogReader has been run over the set of battle logs, you're left with a bunch of pokemon names and not much else. StatCounter.py tallies these lists and turns them into usage stats.

I'm planning to modify this script soon so that the end result appears in a forum-friendly format, rather than the Excel-friendly CSV it originally produced.

Usage:
Code:
python StatCounter.py "Raw/[Tier].txt"
where [Tier] is the tier you want to generate the stats for, e.g. "Raw/Standard OU Rated.txt"

Source:
Code:
import string
import sys

file = open("pokemons.txt")
pokelist = file.readlines()
file.close()

lsnum = []
lsname = []
for line in range(0,len(pokelist)):
	lsnum.append(pokelist[line][0:str.find(pokelist[line],':')])
	lsname.append(pokelist[line][str.find(pokelist[line],' ')+1:len(pokelist[line])])
filename = str(sys.argv[1])
file = open(filename)
species = file.readlines()
battleCount = 0
teamCount = 0
counter = [0 for i in range(len(lsnum))]
trainerNextLine=True
for entry in range(0,len(species)):
	found = False
	if trainerNextLine:
		trainer = species[entry]
		trainerNextLine = False
		ctemp = []
	else:
		if species[entry] == "***\n" or species[entry] == "---\n":
			trainerNextLine = True
			#decide whether to count the team or not
			#if you were going to compare the trainer name against a database,
			#you'd do it here.
			if len(ctemp) == 6: #only count teams with all six pokemon
				for i in ctemp:
					counter[i] = counter[i]+1.0 #rather than weighting equally, we
					#could use the trainer ratings db to weight these... 
				teamCount = teamCount+1
			
			if species[entry] == "---\n":
				battleCount=battleCount+1
		else:
			for i in range(0,len(lsnum)):
				if species[entry] == lsname[i]:
					ctemp.append(i)
					found = True
					break
			if not found:
				print species[entry]+" not found!"
				sys.exit()
total = sum(counter)

#for appearance-only form variations, we gotta manually correct (blegh)
counter[172] = counter[172] + counter[173] #spiky pichu
for i in range(507,534):
	counter[202] = counter[202]+counter[i] #unown
counter[352] = counter[352] + counter[553] + counter[554] + counter[555] #castform--if this is an issue, I will be EXTREMELY surprised
counter[413] = counter[413] + counter[551] + counter[552] #burmy
counter[422] = counter[422] + counter[556]  #cherrim
counter[423] = counter[423] + counter[557] #shellos
counter[424] = counter[424] + counter[558] #gastrodon
counter[615] = counter[615] + counter[616] #basculin
counter[621] = counter[621] + counter[622] #darmanitan
counter[652] = counter[652] + counter[653] + counter[654] + counter[655] #deerling
counter[656] = counter[656] + counter[657] + counter[658] + counter[659] #sawsbuck
counter[721] = counter[721] + counter[722] #meloetta
for i in range(507,534):
	counter[i] = 0
counter[173] = counter[553] = counter[554] = counter[555] = counter[551] = counter[552] = counter[556] = counter[557] = counter[558] = counter[616] = counter[622] = counter[653] = counter[654] = counter[655] = counter[657] = counter[658] = counter[659] = counter[722] = 0

#sort by usage
pokes = []
for i in range(0,len(lsname)):
	pokes.append([lsname[i][0:len(lsname[i])-1],counter[i]])
pokes=sorted(pokes, key=lambda pokes:-pokes[1])

print " Total battles: "+str(battleCount)
print " Total teams: "+str(teamCount)
print " Total pokemon: "+str(total)
print " + ---- + --------------- + ------ + ------- + "
print " | Rank | Pokemon         | Usage  | Percent | "
print " + ---- + --------------- + ------ + ------- + "
for i in range(0,len(pokes)):
	if pokes[i][1] == 0:
		break
	print ' | %-4d | %-15s | %-6d | %6.3f%% |' % (i+1,pokes[i][0],pokes[i][1],100.0*pokes[i][1]/teamCount)


#csv output
#for i in range(len(lsnum)):
#	if (counter[i] > 0):
#		print lsnum[i]+","+lsname[i][0:len(lsname[i])-1]+","+str(counter[i])+","+str(round(100.0*counter[i]/battleCount/2,5))+"%"
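The core of the tally can also be sketched without the pokemons.txt lookup, using collections.Counter from the standard library. This is a simplified Python 3 illustration, not a replacement for the script above: it assumes the Raw file layout that LogReader.py writes (trainer name, that trainer's species, a "***" line, the other trainer and species, then "---"), it skips the form-merging step, and the function name tally is made up.

```python
from collections import Counter


def tally(raw_lines, team_size=6):
    """Count species over complete teams; returns (usage, teams, battles)."""
    usage = Counter()
    teams = battles = 0
    team = None  # None means the next line is a trainer name
    for raw in raw_lines:
        line = raw.rstrip("\n")
        if team is None:
            team = []  # this line is the trainer's name; start a fresh team
        elif line in ("***", "---"):
            if len(team) == team_size:  # only count teams that showed every member
                usage.update(team)
                teams += 1
            if line == "---":  # "---" also marks the end of the battle
                battles += 1
            team = None
        else:
            team.append(line)
    return usage, teams, battles
```

usage.most_common() then gives the species sorted by count, and 100.0 * count / teams reproduces the percentage column printed by the script above.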
Change log


Old version that writes as csv


StatCounter1337.py


StatCounterOnCrack.py

Last edited by Antar; Oct 26th, 2011 at 1:30:09 PM.
Old Sep 11th, 2011, 4:23:32 PM   #5
Antar*
Default RunMe.sh

Putting it all together, I wrote a bash script to compile stats for the entire month on my Linux computer.

The computer has multiple processor cores, so I did some parallelizing to make use of them.

File Structure:
  • RunMe.sh sits in a folder with my two python scripts.
  • The month's battle logs are all in a folder called "2011-08".
  • In that folder are sub-folders for each day's logs (example: "2011-08-05").
  • Back in the main folder where the scripts sit, there are two empty folders, called "Raw" and "Stats". "Raw" will contain the lists of pokemon, while "Stats" will contain the stats.
Usage:
Code:
$ ./RunMe.sh
Source:
Code:
rm Raw/*
rm Stats/*

maxjobs=6 #set to number of multiprocessors

for  i in 2011-08/* 
do
	for j in "$i"/*
	do
		jobcnt=(`jobs -p`)
		while [ ${#jobcnt[@]} -ge $maxjobs ]
		do
			sleep 0.1 #don't spin a core while waiting for a free slot
			jobcnt=(`jobs -p`)
		done
		echo Processing $j
		python LogReader.py "$j" &
	done

#serial version:
#	for j in "$i"/*
#	do
#		echo Processing $j
#		python LogReader.py "$j"
#	done

done
wait




#stupid tier name changes--gotta consolidate...
cat "Raw/BW LC Rated.txt" >> "Raw/Standard LC Rated.txt"
cat "Raw/BW LC Unrated.txt" >> "Raw/Standard LC Unrated.txt"
cat "Raw/BW OU Rated.txt" >> "Raw/Standard OU Rated.txt"
cat "Raw/BW OU Unrated.txt" >> "Raw/Standard OU Unrated.txt"
cat "Raw/BW UU Rated.txt" >> "Raw/Standard UU Rated.txt"
cat "Raw/BW UU Unrated.txt" >> "Raw/Standard UU Unrated.txt"
cat "Raw/BW RU Rated.txt" >> "Raw/Standard RU Rated.txt"
cat "Raw/BW RU Unrated.txt" >> "Raw/Standard RU Unrated.txt"
cat "Raw/BW Uber Rated.txt" >> "Raw/Standard Ubers Rated.txt"
cat "Raw/BW Uber Unrated.txt" >> "Raw/Standard Ubers Unrated.txt"
rm Raw/BW*.txt #the glob has to sit outside the quotes, or it won't expand

echo Compiling Stats...
for i in Raw/*; do python StatCounter.py "$i" > "Stats/${i/Raw}" ; done
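The job-limiting loop could also be done from Python with concurrent.futures, which caps the number of simultaneous workers for you. This is a hedged sketch and not what I actually ran: process_logs and read_one_log are invented names, and the default worker simply shells out to LogReader.py the same way the bash loop does.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def read_one_log(path):
    """Run LogReader.py on a single log file and return its exit code."""
    return subprocess.run(["python", "LogReader.py", str(path)]).returncode


def process_logs(paths, worker=read_one_log, max_jobs=6):
    """Apply worker to every path with at most max_jobs running at once."""
    with ThreadPoolExecutor(max_workers=max_jobs) as pool:
        # map() preserves input order; the list() forces all work to finish.
        return list(pool.map(worker, paths))
```

Something like process_logs(sorted(pathlib.Path("2011-08").glob("*/*"))) would cover the whole month. Threads are enough here because each one just waits on a child process.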

Last edited by Antar; Sep 11th, 2011 at 6:31:53 PM.
Old Sep 11th, 2011, 4:27:33 PM   #6
Antar*
Default Miscellaneous Scripts

PPB.py


TableReader.py


RemoveRedundancy.py


ThreeMonth.py


PullOU.py


Tiers.py

Last edited by Antar; Oct 5th, 2011 at 9:52:26 AM.
Old Sep 11th, 2011, 10:08:13 PM   #7
whitefag
 
Join Date: Jul 2010
Posts: 46
Tomsk, Siberia
Default

Smogon isn't really friendly to developers, is it?
Anyway, do you have any idea what to do with these stats? I have a similar problem: I made a script that converts PO binary usage stats into a MySQL db, which allows generating any kind of statistics, yet I can't think of anything useful...

Last edited by whitefag; Sep 12th, 2011 at 8:41:05 AM.
Old Sep 12th, 2011, 10:16:17 AM   #8
Antar*
Default

Quote:
Originally Posted by Fat whitefag View Post
do you have any idea on what to do with this stats?
That's what we've been discussing here.

Quote:
i made a script that converts PO binary usage stats...
Do you have any specific knowledge of whether the Smogon server is still generating this data? Because if it is, with your help, I'll be able to parse it, and all the problems described in the thread above will vanish.
Old Sep 12th, 2011, 11:40:01 AM   #9
whitefag
Default

Quote:
Originally Posted by Fat Antar View Post
do you have any specific knowledge of whether the Smogon server is still generating this data? Because if it is, with your help, I'll be able to parse it, and all the problems described in the thread above will vanish.
I'm pretty sure it is; it's done by a server plugin, and since Smogon provides limited usage stats each month, I assume they collect it. You need to ask the new server administrator for that, though.
I used Beta's stats since they are always available.

As for the script, here's the package (never mind the Russian, just press the big black button).
It's my second python script (after Hello, world!), so it might be coded pretty poorly.
The idea is pretty simple: it converts PO's binary files directly into MyISAM files (this is the fastest way) and adds the necessary files (db structure and index file) from templates so it can be used by MySQL.
The data there is exactly as it is presented in PO's files; here's the code that generates it.

Last edited by whitefag; Sep 12th, 2011 at 11:46:42 AM. Reason: whoops, pyc
Old Sep 12th, 2011, 12:55:46 PM   #10
Antar*
Default

Quote:
Originally Posted by Fat whitefag View Post
I'm pretty sure it is, it's done by server plugin and since Smogon provides limited usage stats each month
That's actually the big issue--the plugin was causing the server to crash, so they had to disable it. That's why I've had to write all these scripts.

Or so is my understanding...
Old Sep 13th, 2011, 3:15:28 AM   #11
whitefag
Default

I remember having an issue like this, but wasn't it fixed?
Quote:
Your client version (1.0.30) doesn't match with the server's (1.0.23).
Oh, never mind...
I guess we have to wait until Smogon gets better with server handling.
Old Sep 15th, 2011, 3:55:41 AM   #12
Tomahawk9
is a Team Rater Alumnus
 
Tomahawk9's Avatar
 
Join Date: Oct 2010
Posts: 1,439
Default

While this is obviously pretty cool, I have one question: does this only take into account battles where 'Save Log' is on? Cause then the stats would be kinda off...
__________________
C&C Work | Smog Work
Old Sep 16th, 2011, 5:58:10 AM   #13
Antar*
Default

Quote:
Originally Posted by Fat Tomahawk9 View Post
While this is obviously pretty cool, I have one question: does this only take into account battles where 'Save Log' is on? Cause then the stats would be kinda off...
I do not believe so. It's already been shown that the server and the client software produce slightly different battle logs (client version 1.0.30 gives the full teams, while no version of the server does so currently), so I really doubt the server is querying whether the users have opted to save their battle logs.

But the only way to be 100% sure would be to dig around in the PO source code, and--I'll be honest--I'm not going to be doing that.
Old Oct 5th, 2011, 9:53:01 AM   #14
Antar*
Default

Just updated the sixth post (Miscellaneous Scripts) with some more scripts. Enjoy!
Old Dec 8th, 2011, 3:48:53 PM   #15
Antar*
Default

I've uploaded my current scripts to a shiny new github repo. If you have the desire to contribute / modify any of my code, feel free to contribute through there.
All times are GMT -4. The time now is 11:44:51 PM.