P Squared
a great unrecorded history
Hello friends. Here's some fun stuff I can do for forum threads. Unfortunately, learning how to make these do-it-yourself applications is not a priority for me, but feel free to request stuff itt and I can probably do it relatively quickly for you.
Scraping is done with Python (the Scrapy library in particular) and everything else in R, because I like R. I guess you can check out my GitHub, but it is probably embarrassing and bad. In general I am just a kid learning how to do interesting things, so if you have any suggestions for improvements let me know.
Premier league signup sheets
Turns signup threads into nice spreadsheets for the PS auction bot. It is fairly dependent on people actually following the signup format, which about 40% of people don't, but even if they don't I will still get their username, post number, link, and timestamp. I wrote this one back in December when I found out that the old method required the user to go through every page of a signup thread (that was shocking) and did a simple search for tier names. I think it's good practice to design search parameters so that, for example, a signup that includes a word with "ou" in it (like "hey I just found this thread!") doesn't get counted as a signup for OU and, more commonly, a signup that says "everything except DPP" doesn't get counted as a signup for DPP. I got to play around with negative lookaround regular expressions for this, so that was fun. Also, no shade at the previous signup scraping people, they're still great for working to automate a very tedious process.
Some recent examples:
http://spo.ink/oupl4signups
http://spo.ink/dpl4signups
This can also be done for non-premier league threads, of course. If you just want to see a list of all the posters in a thread with their post numbers, links, timestamps, etc., that works too.
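For anyone curious what the tier matching described above might look like, here is a minimal Python sketch. It is not my actual code: the tier list, the function name `tiers_signed_up`, and the choice to skip tiers right after "except " or "not " are all assumptions made for illustration.

```python
import re

# Hypothetical tier list; swap in whatever tiers the league actually uses.
TIERS = ["OU", "UU", "DPP", "ADV"]


def tiers_signed_up(text, tiers=TIERS):
    """Return the tiers a signup post appears to ask for."""
    found = []
    for tier in tiers:
        # \b stops "ou" inside "found" from matching OU. The negative
        # lookbehinds skip a tier that directly follows "except " or "not ".
        pattern = rf"(?<!except )(?<!not )\b{re.escape(tier)}\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.append(tier)
    return found


print(tiers_signed_up("hey I just found this thread!"))
print(tiers_signed_up("everything except DPP"))
print(tiers_signed_up("Tiers: ou, dpp"))
```

This only guards the tier immediately after "except", so "everything except DPP and ADV" would still count ADV, and it does not expand "everything" into the full tier list. A real version needs more rules than this.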
Graph posts over time
Can be a nice quick way to see how active a thread is, and compare it to other threads. I coded this back in January when I noticed how active the OST 14 signups thread was and wanted to see if it was actually different from previous years.
If you're using this for signups, a caveat is that it's just counting posts, because it's basically impossible to distinguish signup posts from non-signup posts in the same thread. I could probably do something about double posters though.
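Here is a rough sketch of the counting step, including one possible way to handle double posters (only counting each user's first post). The real thing is done in R; this is Python for consistency with the scraping side, and the input format (username plus ISO timestamp pairs) and the sample data are made up.

```python
from collections import Counter
from datetime import datetime


def posts_per_day(posts, dedupe_posters=False):
    """Count posts per calendar day.

    posts: iterable of (username, iso_timestamp) pairs in thread order.
    dedupe_posters: if True, only each user's first post is counted.
    """
    seen = set()
    counts = Counter()
    for username, timestamp in posts:
        if dedupe_posters:
            if username in seen:
                continue
            seen.add(username)
        day = datetime.fromisoformat(timestamp).date().isoformat()
        counts[day] += 1
    # Sort by date so the result is ready to plot left to right.
    return dict(sorted(counts.items()))


# Made-up sample data, not taken from a real thread.
sample = [
    ("userA", "2017-01-05T10:00:00"),
    ("userB", "2017-01-05T12:30:00"),
    ("userA", "2017-01-06T09:15:00"),
]
print(posts_per_day(sample))
print(posts_per_day(sample, dedupe_posters=True))
```

The resulting day-to-count mapping can be handed to any plotting tool to draw the graph.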
Most liked posts in a thread
More spreadsheets yay. This was one of the earlier things I did after I learned how to scrape Smogon, so the sheets are outdated and not pretty. I did this for the SPL 7 Commencement thread and later the old QDB when it was moved to a public forum (spiders can't access private subforums like Firebot and Inside Scoop without some modifications that admins would probably frown upon). The links in the latter are all broken now because firemods deleted a bunch of posts in the thread, which messes up the links.
I haven't rewritten the likes-scraping part of my code since we moved to Xenforo 2, but I imagine it won't take too long. Also note that I used to be able to scrape each poster's like count and join date, but since those are no longer immediately visible on Xenforo 2 (you have to hover to see more info on a user in a thread), I can't do that anymore without learning how to use Selenium or something. Maybe one day...
I won't do the following things for you by request, but feel free to adapt them for your own use.
GP check formatter
Clicking colors is definitely the worst and most time-wasting part of GP checking. I started using bold / italic / underline while GPing instead of colors since those three have keyboard shortcuts (later when we moved to Xenforo 2 I replaced italics with strikethroughs) and then find-and-replaced [b] with [b][color=blue] and so on. Then I learned how to write shell scripts in R, so I automated the process and now I just have to type (for example) RScript formatCheck.R hajime 2 into Git Bash to format a GP check and add my Hajime stamp and a GP 2/2 to the top. There are also optional arguments for colors (RScript formatCheck.R hajime 2 dodgerblue tomato mediumorchid); defaults are deepskyblue, red, and limegreen. This is also how I get to circumvent Xenforo 2's disappointing default color picker colors. :)
Anyway, this speeds up GPing immensely and I definitely recommend it to GPers if they are confident enough to not need to see colors while they check. The R code is up on my GitHub. Just substitute my stamps and colors for yours (and filepaths).
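If you'd rather not touch R, the find-and-replace step can be sketched in Python along these lines. This is an illustration and not a port of formatCheck.R: the function name `colorize` is made up, the stamp and GP 2/2 header are left out, and which color goes with which tag is my assumption, since only the three default colors are listed above.

```python
import re

# Assumed tag -> color mapping. Only the three default colors come from the
# post; which tag each one belongs to is a guess.
DEFAULT_COLORS = {"b": "deepskyblue", "s": "red", "u": "limegreen"}


def colorize(text, colors=None):
    """Wrap the contents of [b], [s], and [u] BBCode tags in [color] tags."""
    colors = {**DEFAULT_COLORS, **(colors or {})}
    for tag, color in colors.items():
        # Opening tag: [b] -> [b][color=...]
        text = re.sub(rf"\[{tag}\]", f"[{tag}][color={color}]", text,
                      flags=re.IGNORECASE)
        # Closing tag: [/b] -> [/color][/b]
        text = re.sub(rf"\[/{tag}\]", f"[/color][/{tag}]", text,
                      flags=re.IGNORECASE)
    return text


print(colorize("[b]add this[/b] [s]cut this[/s] [u]comment[/u]"))
print(colorize("[b]add this[/b]", {"b": "dodgerblue"}))
```

Passing a partial dictionary overrides only the colors you name, which mirrors the optional color arguments on the command line.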
Markov chain text generation (post simulator)
Initially I had the ambitious goal to write my own Markov chain text generator, but after weeks of laziness I just scraped posts and fed them into a website that did the work for me. Still kinda fun though. Here's my old Smogon simulator thread about it if you want to see examples or learn more about Markov chains.
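For reference, a bare-bones word-level Markov chain generator is not much code. The sketch below is not what the website does internally (I don't know how it works) and the function names are made up. It just records which words follow each word in the scraped posts, then walks those links at random.

```python
import random
from collections import defaultdict


def build_chain(posts, order=1):
    """Map each run of `order` words to the list of words seen right after it."""
    chain = defaultdict(list)
    for post in posts:
        words = post.split()
        for i in range(len(words) - order):
            state = tuple(words[i:i + order])
            chain[state].append(words[i + order])
    return chain


def generate(chain, length=20, seed=None):
    """Random-walk the chain to produce up to `length` words."""
    if not chain:
        return ""
    rng = random.Random(seed)
    state = rng.choice(list(chain))
    output = list(state)
    while len(output) < length:
        followers = chain.get(state)
        if not followers:
            break  # dead end: this state was never followed by anything
        output.append(rng.choice(followers))
        state = tuple(output[-len(state):])
    return " ".join(output)


# Made-up sample posts.
chain = build_chain(["this team is great", "this team is bad", "great team"])
print(generate(chain, length=8))
```

Because repeated followers are stored as duplicates in the list, picking uniformly from the list already weights each next word by how often it appeared.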