The Garchomp Experiment

Aeolus

Ok, after many long discussions in #insidescoop, we defined how we want to go about the test. These are the steps:

1) Implement a Garchomp free ladder on the Smogon Shoddy Battle server for one month.

2) Invite eligible users to cast a vote... finally putting the issue to bed.

The trouble came in defining eligible. After extensive debate, it was determined that using a numeric value based on ladder data (rating & deviation) as a threshold to earn a voice in the poll was the most objective way of ensuring a few things:
  • That those invited to vote actually participated in the test.
  • That those invited to vote are well versed enough in the game to be able to cast an informed vote.
The objectivity of this system gives absolutely everyone equal opportunity to earn a vote. If someone is willing to put the time in to reach the necessary level, their vote will be counted.

We also recognized that there are individuals (and I hope he doesn't mind me mentioning him), like Jumpman16, who don't have the time or inclination to claw their way up a ladder, but are clearly well versed enough on the issue to have a vote in the poll. An application to vote, and be exempted from the ladder threshold requirement, will be made available to accommodate these people. The applications will be judged extremely harshly and I don't expect many will be granted votes... but people agreed that the mechanism should be in place.

The very last thing to iron out is the actual numeric value(s) that will serve as the break point in voting. This is where I'm looking for the input of people like X-Act, Doug, and anyone else who really, REALLY understands the mechanics of the rating system to indicate the numbers that will guarantee the truth of the two bullet points listed above.

I'd like that to be the primary discussion in this thread rather than other possible amendments to the process.

#insidescoop discussion below:


Code:
[20:20:46] <@Aeolus> Jump and I were talking... and we think going ahead with a vote on Garchomp based on a numeric ladder threshold is an acceptable course of action
[20:20:49] <@Jumpman16> pokemon
[20:21:23] <@Aeolus> yes, pokemon.
[20:21:26] <+Hipmonlee> so would it be solely your numeric ladder rating
[20:21:28] <@Jumpman16> DJD said "yes" yesterday
[20:21:31] <@Jumpman16> idk what the next steps are
[20:21:32] <@Aeolus> no
[20:21:34] <+husk> so when'll the new ladder be up?
[20:21:40] <@Jumpman16> he asked me if we needed a leaderboard and i said "sure"
[20:21:42] <+JabbaTheGriffin> next steps are fun times of awesome no garchomp play!
[20:21:47] <@Aeolus> there would also be an application available to people who don't want to ladder their way in
[20:21:49] <@Jumpman16> yeah
[20:21:51] <@Jumpman16> im tired of it
[20:21:53] <+aldaron> man I'm excited
[20:21:55] <@Jumpman16> it's so dumb and gay
[20:21:57] <@Jumpman16> me too
[20:22:17] <+Brawley> I cant wait for no garchomp play
[20:22:19] <+husk> conviniently my current ou team doesn't have garchomp =)
[20:22:20] <+JabbaTheGriffin> playing without garchomp might possibly be better than sex!
[20:22:21] <@Aeolus> now, we need to discuss the logistics of the threshold
[20:22:25] <+Hipmonlee> some other ideas
[20:22:31] <+Hipmonlee> I think I suggested this before
[20:22:33] <+JabbaTheGriffin> then again my ex was pretty bad so it's not really a fair comparison
[20:22:39] <+Brawley> lol
[20:22:45] <+husk> someone is sore
[20:22:50] <+Hipmonlee> but you could have a vote of top ladder rated people, a vote of bsdge holders and a vote of everyone els
[20:22:51] <+Hipmonlee> e
[20:22:55] <+Hipmonlee> and weight them evenly
[20:23:08] <+aldaron> please not CRE it is too undependable a number, some threshold of mean rating based off of deviation would tell us more about the experience on that ladder
[20:23:14] <+Brawley> I really think we should look at unbanning ubers before garchomp,but thats just me
[20:23:14] <@Aeolus> I don't get the logic of weighting those votes evenly
[20:23:17] <+Hipmonlee> that way you get everyones opinion with a bias toward higher level players
[20:23:25] <+Brawley> and stealth rock before chomp
[20:23:33] <+Brawley> but just my opinion
[20:23:40] <+Hipmonlee> everyone feels like their vote gets heard
[20:23:44] <+JabbaTheGriffin> well your opinion was noted
[20:23:48] <+JabbaTheGriffin> when everyone voted on it
[20:23:50] <+husk> do the "higher level players" get to vote with "everyone else" also?
[20:23:51] <@Aeolus> everyone has the opportunity to earn a voice
[20:23:54] <+Hipmonlee> but we still end up with the top level being more important
[20:23:56] <+husk> and badge holders as well...
[20:23:58] <+JabbaTheGriffin> you're the overwhelming minority :/
[20:23:59] <+husk> if they happent o have them
[20:24:06] <+Brawley> :<
[20:24:07] <+husk> to*
[20:24:27] <+Brawley> well we could always unban garchomp so loss
[20:24:31] <+Hipmonlee> well I dunno if we should ignore the voice of the randoms entirely, it'd be nice if they all universally agreed garchomp was fine, that that was taken into consideration
[20:24:45] <@Aeolus> but they have just as much chance as everyone else to earn a vote
[20:24:49] <@Aeolus> they aren't ignored at all
[20:24:49] <+Brawley> I just wanna get to Lati@s
[20:25:33] <@Aeolus> Brawley, don't distract from the conversation
[20:25:39] <+Hipmonlee> im worried that if I were a random according to the deoxys vote, I would not have earned a say
[20:25:41] <+Brawley> sorry
[20:25:44] <+aldaron> lol brawley with the digression and I do not believe we want the random individual's input in establishing a metagame meant for experienced players
[20:26:05] <+Brawley> >_>
[20:26:08] <+aldaron> the number threshold gives us some sort of impression based on battle success so at least that tells us something
[20:26:10] <+IggyBot> lol
[20:26:13] <+Hipmonlee> well it is a pretty minor imput
[20:26:25] <@Aeolus> it also tells us who actually participated in the test
[20:26:27] <@Articuno64> we don't even need to read the random posts
[20:26:30] <@Articuno64> just make a thread for them to post in
[20:26:31] <@Aeolus> which is the most important part
[20:26:38] <+Hipmonlee> what would the number threshold be?
[20:26:39] <@Aeolus> jason, I like that
[20:26:46] <@Aeolus> that's what I want to talk about
[20:26:50] <@Aeolus> instead of alternate voting methods
[20:26:59] <+husk> top 10% of battlers
[20:27:11] <@Jumpman16> i thought it was "over 1400"
[20:27:13] <@Aeolus> no
[20:27:18] <@Aeolus> there was no threshold decided
[20:27:24] <@Jumpman16> ok
[20:27:30] <@Jumpman16> that's more fair
[20:27:30] <+aldaron> first we need to decide what number to use
[20:27:32] <+husk> aldaron is right in his thinking
[20:27:36] <@Aeolus> I agree
[20:27:46] <@Aeolus> other problem
[20:27:50] <@Aeolus> this will be a 1 month test
[20:28:00] <@Aeolus> do do you have to be in the top 10% at the end of the month?
[20:28:10] <+Hipmonlee> top 10% of which ladder
[20:28:11] <@Aeolus> or just achieve it at some point after week 1?
[20:28:15] <@Aeolus> the test ladder
[20:28:17] <@Jumpman16> non-chomp
[20:28:17] <@Aeolus> w/o garchomp
[20:28:18] <+husk> I think the end of month is more fair
[20:28:28] <+JabbaTheGriffin> then it becomes some sort of competition for votes
[20:28:29] <+husk> so each person there shows that they've worked for the whole month
[20:28:31] <+Hipmonlee> I think people in top 10% of non test ladder should have their say as well
[20:28:40] <@Jumpman16> however, we do need to take into account that people may not have played at all on the other ladder
[20:28:41] <+husk> or that they're extremely good I guess...
[20:28:42] <@Articuno64> the problem is that the ladder points are scaled back over time aren't they?
[20:28:43] <+Hipmonlee> or whatever
[20:28:45] <+JabbaTheGriffin> shouldn't someone who was competent enough to reach the top 10% have a say
[20:28:48] <@Aeolus> yes jason
[20:29:08] <@Aeolus> I think that just making the threshold at some point after week one should be good enough to get a vote
[20:29:13] <+JabbaTheGriffin> even if they were eventually forced out by more competent people
[20:29:18] <+Hipmonlee> like people in the top 10% of the test ladder are probably going to have a bias in favour of the test conditions
[20:29:19] <+JabbaTheGriffin> i agree
[20:29:19] <+husk> you can touch the top 10% in a day
[20:29:24] <@Aeolus> I dont want there to be a competition for votes
[20:29:25] <@Articuno64> maybe just take a leaderboard snapshot each week
[20:29:27] <+Hipmonlee> yeah 10% is easy
[20:29:28] <@Articuno64> and anyone who is in at any point
[20:29:29] <@Aeolus> I dont want to limit votes
[20:29:30] <+husk> that doesn't mean you know anything about the garchomp-less metagame
[20:29:38] <+husk> you've only played for a day
[20:29:40] <@Jumpman16> yeah
[20:29:48] <@Jumpman16> that's a good point
[20:29:50] <+JabbaTheGriffin> do they even have to know that much? that's still what i'm not getting
[20:29:51] <+aldaron> all this would be solved if we use deviation and rating
[20:30:04] <+husk> yeah
[20:30:09] <@Aeolus> I'm fine with that too
[20:30:18] <@Aeolus> the time issue is the main obstacle
[20:30:25] <+husk> the point in choosing people with high ratings is that they know what they're doing
[20:30:34] <@Aeolus> well
[20:30:38] <+husk> they "have to know that much" I think
[20:30:38] <@Aeolus> not entirely
[20:30:42] <@Aeolus> partially, yes
[20:30:52] <+aldaron> deviation goes down in correlation with number of battles so someone with a low deviation and high rating has to have experience and ability
[20:30:52] <@Aeolus> but also people who have high ratings have demonstrated participation
[20:30:59] <+aldaron> whereas the CRE would only tell us ability
[20:31:39] <@Aeolus> I dont know enough about the rating system I dont think
[20:31:46] <+Hipmonlee> he's right
[20:32:01] <+husk> (this is what I meant when I said aldaron is right in his thinking)
[20:32:02] <+Hipmonlee> cre doesnt really tell ability though
[20:32:25] <@Aeolus> ok, so let's define it based on rating and deviation
[20:33:03] <+Hipmonlee> but the other point, is someone who only battles garchompless, might not really have a good idea about the garchomp free metagame
[20:33:18] <@Aeolus> what do you mean?
[20:33:26] <+Hipmonlee> for instance if mre decided to play on chompless ladder to prove a point
[20:33:35] <+Hipmonlee> he hasnt played dp in about a year
[20:33:56] <@Jumpman16> yeah
[20:34:03] <@Jumpman16> i mentioned tha ta few minutes ago
[20:34:08] <@Jumpman16> you could throw me in there as well
[20:34:08] <+Hipmonlee> or you might get someone who plays on chompless ladder, and decides he hates it
[20:34:10] <+husk> the testing period is only a month...few people can be amazing at the chompless ladder and not know the current game
[20:34:14] <@Aeolus> if he makes the threshold though, that would denote at least a minimum level of experience
[20:34:30] <+husk> that's another thing though
[20:34:33] <@Jumpman16> maybe i should whore for two days to prove a point =(
[20:34:36] <+husk> people who like the chompless ladder will play it more
[20:34:40] <+Hipmonlee> yeah
[20:34:41] <+husk> people who don't like it won't
[20:34:43] <+Hipmonlee> that was my point
[20:34:49] <+husk> o
[20:34:52] <@Aeolus> and garchomp enthusiasts will play it earn the right to vote in the poll
[20:34:53] <@Jumpman16> "well then they dont get to vote"
[20:34:55] <+husk> sorry I'm a bit slow
[20:34:57] <@Jumpman16> and they will know that
[20:34:58] <+Hipmonlee> you should let people from both ladder play
[20:35:00] <+Hipmonlee> vote
[20:35:22] <@Aeolus> that's an interesting idea
[20:35:35] <+aldaron> from both inclusive right not one or the other?
[20:35:35] <+husk> if you haven't played the garchompless metagame it's hard for you to vote
[20:35:38] <@Aeolus> but voting rights would be more difficult to earn on the regular ladder
[20:35:46] <+husk> I'm certain people won't try to maintain a good ranking on both ladders
[20:35:52] <@Aeolus> yes, but I trust the top 10 people on the regular ladder
[20:35:58] <@Aeolus> that they know what they are talking about
[20:36:00] <+husk> so the quality of the main ladder drops as the top players play the garchompless one
[20:36:15] <@Aeolus> aldaron
[20:36:32] <@Aeolus> what do you think is a good rating/deviation level to define the voting threshold
[20:36:34] <+Hipmonlee> to be honest, I dont think testing is really necessary, so maybe I am just arguing for the sake of it
[20:37:00] <@Aeolus> or anyone else
[20:37:03] <@Aeolus> husk whoever
[20:37:16] <@Aeolus> define the rating/deviation threshold for the chompless ladder
[20:37:20] <+Hipmonlee> I'd say ask x-act
[20:37:28] <+husk> I'm not exactly sure what the mathematical relationship is
[20:37:29] <+Hipmonlee> I mean I understand the concept
[20:37:31] <+aldaron> I have absolutely no idea regarding the deviation since I'm not sure how fast it drops but rating can just be an arbitrary percentage like top 30% of all people with lower than X deviation or something
[20:37:32] <+Hipmonlee> but not the specifics
[20:37:38] <@Aeolus> ok
[20:37:42] <@Aeolus> then i'm going to ask x-act
[20:37:46] <@Aeolus> and we're going to use that
[20:37:55] <+husk> you'll have to ask doug I think
[20:38:05] <+husk> he'll know the actual numbers shoddy has
[20:38:14] <@Aeolus> we'll also have to have a tracking method
[20:38:15] <+Hipmonlee> xact knows them pretty well
[20:38:24] <@Aeolus> like a snapshot taken at the end of every day
[20:38:29] <+husk> oh I didn't know he was interested in that
[20:38:32] <+husk> nvm then!
[20:38:34] <@Aeolus> of all the people who meet the threshold
[20:38:42] <@Aeolus> and anyone who meets it 10+ times get a vote
[20:39:06] <+aldaron> o well if we use deviation and rating it should just be the rating at the end of the period of testing
[20:39:27] <@Aeolus> depends on how fast the rating deteriorates imo
[20:39:34] <+husk> the longer you give it the closer the rating reaches what you actually are
[20:40:03] <@Aeolus> ok, that's another thing i'm going to ask x-act about then
[20:40:28] <+husk> given that you're playing regularly
[20:40:28] <@Aeolus> so, 30 days after the ladder is in place
[20:40:34] <@Aeolus> invitations to vote will be distributed 
[20:40:37] <@Aeolus> and we'll have an answer
[20:40:44] * darkie|away is now known as darkie
[20:40:49] <+aldaron> ok, so the testing period for each suspect will be one month?
[20:41:03] <+husk> 30 days seems long enough
[20:41:12] <+husk> it'll be hard to hold interest for too much longer than that
[20:41:42] <@Aeolus> Is 30 days too long?
[20:41:48] <@Aeolus> maybe...
[20:41:54] <+JabbaTheGriffin> i think a bit
[20:42:02] <@Aeolus> 2 weeks?
[20:42:04] <+JabbaTheGriffin> 20 days seems better
[20:42:04] <@Aeolus> too short?
[20:42:19] <+JabbaTheGriffin> 21 maybe for the full 3 week rotation
[20:42:21] <+aldaron> o lol I thought it was too short
[20:42:24] <+JabbaTheGriffin> 2 weeks seems too short
[20:42:35] <@Aeolus> i agree
[20:42:39] <@Aeolus> let's just stick to a month
[20:42:52] <+husk> there's stuff like this doublescreen deoxys idea that we're only seeing now, 3 months after it's unbanned
[20:42:52] <+JabbaTheGriffin> (1 day is too long)
[20:43:08] <@Aeolus> yes, but that is different husk
[20:43:11] <+JabbaTheGriffin> but in relation to banning garchomp
[20:43:14] <@Aeolus> we unbanned deoxys
[20:43:21] <@Aeolus> we are banning garchomp
[20:43:29] <+husk> I mean as far as what will be viable
[20:43:41] <@Aeolus> yes I understand
[20:43:55] <+JabbaTheGriffin> i don't think that matters in relation to garchomp's tier status though
[20:44:00] <@Aeolus> honestly, much longer than a month and people will lose interest
[20:44:04] <+JabbaTheGriffin> what DOES matter in relation to his tier status
[20:44:07] <+JabbaTheGriffin> during this test
[20:44:11] <@Aeolus> a week is an eternity on the internet
[20:44:16] <@Aeolus> a month is like..... forever
[20:44:17] <+husk> if other stuff becomes ridiculously good when garchomp is gone...
[20:44:18] <+JabbaTheGriffin> are people supposed to be looking for something?
[20:44:24] <+JabbaTheGriffin> then we ban them too
[20:44:29] <+husk> yeah I think 3 weeks to a month should be good
[20:44:30] <+JabbaTheGriffin> if they're uber
[20:44:36] <+aldaron> other stuff like Lucario >=)
[20:44:36] <@Aeolus> ok, a month it is
[20:44:39] <+JabbaTheGriffin> you don't leave ubers in OU to keep other ubers in check
[20:44:45] <@Aeolus> x-act will help define the threshold
[20:44:48] <@Aeolus> and we'll vote
[20:45:01] <+husk> it's like going from 1 pokemon that's tough to handle to 5
[20:45:08] <+husk> are you going to get rid of those 5 then?
[20:45:12] <+JabbaTheGriffin> yes
[20:45:16] <+husk> oh ok
[20:45:26] <+husk> so you weren't better off with that 1 unstoppable poke
[20:45:30] <@Aeolus> I'm going to post about this on the forums
[20:45:32] <+JabbaTheGriffin> nope
[20:45:39] <+JabbaTheGriffin> i've always been a proponent of this view
[20:45:41] <@Aeolus> If there are any serious objections... raise them now
[20:45:49] <+JabbaTheGriffin> you don't keep broken pokemon to check broken pokemon
[20:46:03] <+husk> well
[20:46:10] <+husk> hopefully we don't lose a lot of the ou tier then
[20:46:10] <+JabbaTheGriffin> it doesn't matter how many broken pokemon a single one keeps in check
[20:46:14] <@Aeolus> If there are any serious objections the method laid out in the chat, say so now
[20:46:23] <+husk> nope
[20:46:24] <+JabbaTheGriffin> the most we would even possibly come close to losing is i'd say
[20:46:25] <+JabbaTheGriffin> salamence
[20:46:32] <+husk> lucario too
[20:46:38] <+husk> tyranitar as well
[20:46:39] <+aldaron> Well husk that's the other purpose of this test, to determine exactly what the Suspect-free metagame is
[20:46:42] <+JabbaTheGriffin> eeeeeh i don't see it getting too much better
[20:46:53] <+husk> sd lucario...
[20:46:58] <+aldaron> yea I'd say Salamence, Dragonite, Lucario and Tyranitar will be the man Pokemon I am looking at 
[20:47:03] <+JabbaTheGriffin> loses one pokemon that can outspeed and ohko
[20:47:05] <+JabbaTheGriffin> there are plenty more
[20:47:08] <+husk> you can't sd on your first chance because full health garchomp is nearly always there to stop you
[20:47:10] <@Articuno64> jumpman
[20:47:18] <+darkie> jabba do you want garchomp banned?
[20:47:19] <+Brawley> cress will get better imo
[20:47:21] <+JabbaTheGriffin> yes
[20:47:26] <+aldaron> Jabba SD Lucario gets a huge boost with no Garchomp on everyteam to stop it
[20:47:32] <+Hipmonlee> my only objection is the method of deciding whose applications to vote will be accepted
[20:47:32] <+darkie> ok cool
[20:47:33] <+husk> so you cc on the switch usually and your opponent can take advantage of that, etc.
[20:47:35] <+JabbaTheGriffin> well sacrificing 76% to stop it
[20:47:36] <+Hipmonlee> which I might have missed
[20:47:39] <+Hipmonlee> but I didnt see explained
[20:47:43] <+JabbaTheGriffin> is neutering your best sweeper
[20:47:55] * Joins: super_king (~BUBBRUBB@pool-96-229-32-223.lsanca.dsl-w.verizon.net)
[20:47:55] * ChanServ sets mode: +v super_king
[20:47:56] <+JabbaTheGriffin> and there are plenty of other pokemon you can let get neutered if you want to go that route
[20:48:04] <+husk> uh
[20:48:11] <+husk> that's after lucario has it's counters gone, you know
[20:48:17] <@Aeolus> does anyone know how I can copy all the text in this chat window?
[20:48:17] <+husk> there aren't too many other pokes who can do that
[20:48:31] <+JabbaTheGriffin> wait i don't even get this hypothetical situation
[20:48:38] <@Articuno64> aeolus: if you keep logs, just open your log file
[20:48:46] <+Hipmonlee> I couldnt figure out how to do it
[20:48:48] <@Aeolus> oh, that's a good idea
[20:49:07] <+darkie> it's probably a huge file though
[20:49:09] <+darkie> :\
[20:49:17] <+JabbaTheGriffin> salamence becomes more viable with garchomp out of the way
[20:49:18] <@Articuno64> lol probably not
[20:49:20] <@Aeolus> well, i'll have 3 years of logs
[20:49:24] <+JabbaTheGriffin> salamence is a better luke check than garchomp is
[20:49:25] <+darkie> yea..
[20:49:27] <@Aeolus> so, lots of small files
[20:49:30] <@Articuno64> it's just text
[20:49:38] <+husk> that's not really the point 
[20:49:42] <+husk> but I guess we'll see
[20:49:44] <+darkie> it takes a little while to load
[20:49:59] <+JabbaTheGriffin> i guess i'm not seeing the point >_>
[20:49:59] <+JabbaTheGriffin> oh well
[20:50:15] <+husk> it's theorymon either way
[20:50:22] <+husk> when the ladder is in place we'll see what happens 
[20:50:31] <+aldaron> Huh, Salamence isn't a better Luke check
[20:50:31] <+husk> obviously neither of us is 100% sure about what we're discussing anyway
[20:50:52] <+aldaron> HP Ice and Stone Edge both easily take care of Mence 
[20:51:23] <+JabbaTheGriffin> unless it's faster
[20:51:23] <+Brawley> scarf tar is fun to take on luke with
[20:51:34] * Quits: +zerowing (~BUBBRUBB@bad.meets.evil) (Ping timeout)
[20:51:51] <+JabbaTheGriffin> which can still take extremespeed well with just max hp
[20:52:09] <+JabbaTheGriffin> but yeah whatever theorymon sucks when we have no clue at all what's going to happen
[20:52:24] <+aldaron> I'm excited for Dragonite to be honest
[20:52:29] <+Brawley> me too
[20:52:33] <+Brawley> ^_^
[20:52:38] <+Brawley> (shut up Iggy)
[20:53:12] <+JabbaTheGriffin> yeah dragonite is my 2nd favorite pokemon maybe i'll finally use it now!
[20:53:14] <+Brawley> dd outrage with adamant does a little less than sd jolly outrage
[20:53:23] <+JabbaTheGriffin> (maybe entei will become viable now too???)
[20:53:25] <+Brawley> Although I like dd roost nite sometimes
[20:53:26] <+JabbaTheGriffin> heh
[20:53:40] <+Brawley> just because I dont have to invest a lot in special defense
[20:53:56] <+Brawley> I also see gyarados usuage going down a little
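
The tracking idea raised in the log (a snapshot at the end of every day of everyone who meets the threshold, with a vote for anyone who appears 10 or more times) can be sketched roughly as below. This is only an illustration: the function name, the player names, and the data layout are all made up, and the 10-day figure is just the number floated in chat, not a settled rule.

```python
from collections import Counter

def voters_from_snapshots(daily_snapshots, required_days=10):
    """Return players who met the threshold on at least `required_days` days.

    daily_snapshots: one collection of player names per day, listing everyone
    who met the (still undecided) rating/deviation threshold at day's end.
    """
    days_met = Counter()
    for snapshot in daily_snapshots:
        # set() guards against a name being listed twice in one snapshot
        days_met.update(set(snapshot))
    return sorted(name for name, days in days_met.items() if days >= required_days)

# Hypothetical 15-day run: alice qualifies every day, bob on 10 days, carol on 5.
snapshots = [{"alice", "bob"}] * 10 + [{"alice", "carol"}] * 5
eligible = voters_from_snapshots(snapshots)
print(eligible)
```

Counting days rather than checking only the final day means someone who held the threshold for a while and was later pushed out still earns a vote, which avoids the "competition for votes" concern raised in the chat.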
 
Yes, the vote happens at the end of the 1 month test.

I don't like the idea of Top 100 for several of the reasons described in the chat log... basically it is only based on CRE, which people didn't think was the best idea and it puts a hard cap on the number of voters.
 
From reading the IRC log, it sounds like people want some combination of rating and deviation. However, the Conservative Ratings Estimate IS a combination of rating and deviation (and volatility too). Perhaps it isn't the right combination, for what we are trying to achieve.

You could set a CRE minimum and a deviation maximum. The deviation is the indicator of whether the user is battling regularly or not. This number increases over time while you are not battling (which is bad for your CRE).

I don't think the CRE minimum should be too high. It needs to represent competent battling, but I don't think it should require you to be spectacularly skilled. I think 1400 CRE is an appropriate threshold for a good-but-not-great battler. If you want to raise the bar a bit, then maybe go as high as 1450. Any higher than that is asking a lot from the general metagame community. If you can carry a 1400 CRE with a decent deviation, then you probably have a good understanding of the metagame.

Deviation is a little harder to peg. If you want to be lenient, then require the deviation to be less than 100. If you want to require very active play, then maybe 50. Something in between 50 and 100 probably makes the most sense. But, it's really dependent on what sort of activity level you want to require.
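
To make the two-part cutoff concrete, here is a rough sketch of the check, run against a handful of rows copied from the standard-ladder query results in this post. The names `LadderEntry` and `is_eligible` are made up for illustration and are not part of Shoddy Battle; the 1400 CRE and 50/100 deviation figures are just the candidate values suggested above, not final numbers.

```python
from typing import NamedTuple

class LadderEntry(NamedTuple):
    name: str
    cre: float        # the "estimate" column in the query output
    deviation: float  # the "deviation" column in the query output

def is_eligible(entry, min_cre=1400.0, max_deviation=100.0):
    """True if the player clears the CRE floor and stays under the deviation cap."""
    return entry.cre >= min_cre and entry.deviation < max_deviation

# A few rows taken from the standard-ladder results (values copied as posted).
SAMPLE = [
    LadderEntry("JUSTINAWE", 1608.68895167983, 18.8432964249474),
    LadderEntry("AEOLUS", 1553.67429351572, 44.4547090195041),
    LadderEntry("ALDARON", 1537.7497485907, 84.9877565667788),
    LadderEntry("TOUCHING SKY", 1414.45083653313, 116.087010081411),
    LadderEntry("NEMESIS_NKI", 1514.22466097746, 350.0),
]

lenient = [e.name for e in SAMPLE if is_eligible(e, max_deviation=100.0)]
strict = [e.name for e in SAMPLE if is_eligible(e, max_deviation=50.0)]
print("deviation < 100:", lenient)
print("deviation < 50: ", strict)
```

With the lenient cap, three of the five sample players qualify; the strict cap also drops Aldaron, whose CRE is fine but whose deviation is about 85.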

Here are some query results to help get an idea of the magnitude of the deviation numbers. I ran a query of all players in the standard ladder with a CRE higher than 1400. The list had 400 players in it. I sorted them by deviation. I selected the top of the deviation list and the bottom of the deviation list. I also included a few names from the middle of the list. You can see the deviations for each player listed.

At the top of the list is Justinawe, who battles religiously every single day. At the bottom are some brand new players and some good players that haven't played in a while. You might notice Aldaron at the top of the bottom grouping. Aldaron has a good CRE (he's in the top 100 actually) but his deviation is high in comparison to the rest of the 1400+ crew. He's in the top 100 because his rating is very high (when he does play, he wins a lot).

Code:
+--------------------+------------------+------------------+------------------+--------------------+
| name               | estimate         | rating           | deviation        | volatility         |
+--------------------+------------------+------------------+------------------+--------------------+
| JUSTINAWE          | 1608.68895167983 |  1701.8136917796 | 18.8432964249474 |  0.045819831122582 |
| PELFORTH00         | 1490.28145498914 | 1578.53840915782 | 20.2461131880763 | 0.0535692770150728 |
| ISAACMM            | 1529.25210865137 |  1615.5724773514 |  21.580092175008 | 0.0403026220790722 |
| POKEFAN362         | 1448.65824684931 | 1505.65299004362 | 21.7088431627389 | 0.0504456968404671 |
| MONIKA             | 1589.69519839728 | 1676.94986943382 | 23.0652176379511 |   0.05559900535359 |
| KOISHII X          | 1569.76648271046 | 1637.15225872416 | 23.1209633647075 | 0.0514395239782885 |
| BRIAN MCCANN       | 1641.58154932579 | 1749.51363018258 |  23.826447722307 | 0.0430761245557509 |

| MYSTICA            | 1605.58133418505 | 1762.55687926587 | 40.6764615621571 | 0.0578770415584254 |
| MS                 | 1496.71114859707 | 1661.66807810237 | 41.2392323763248 | 0.0561460512741924 |
| GREAT SAGE         | 1617.87128381138 | 1700.83902524391 | 41.3428081753287 |   0.05872567643966 |
| HIPMONLEE          | 1653.73731668894 | 1825.32513002834 |  41.755000074737 | 0.0565364748433684 |
| IGGYBOT            | 1614.54435784085 | 1745.11940215716 | 42.7531736264108 | 0.0571063174711074 |
| AEOLUS             | 1553.67429351572 |  1729.1982950629 | 44.4547090195041 | 0.0573898658982861 |

| MOP                | 1451.41803751264 | 1726.64222086984 | 68.8060458393005 |  0.059937707312514 |
| BASS               |  1498.7726048245 | 1718.68885742091 | 68.9414421694514 | 0.0596224216015946 |
| EARTHWORM          |  1532.4036392476 | 1821.22227778399 | 72.2046596340969 | 0.0599359214611483 |
| SON OF THUNDER     | 1462.72506900209 | 1755.39399252858 | 73.1672308816214 | 0.0599360366412476 |

| ALDARON            |  1537.7497485907 | 1877.70077485782 | 84.9877565667788 | 0.0599788515373006 |
| TAY-HIME           | 1483.28459315858 | 1785.62439245466 |  85.551759258057 | 0.0599873577387333 |
| DIXIE PLUG         | 1481.59827640327 | 1824.21554006573 |  85.654315915616 | 0.0599078794589876 |
| PHIDDLESTICKS      | 1466.01127046381 | 1725.28177632287 | 85.9437516441258 | 0.0596634703160348 |
| NUMBERTHREE        | 1453.84975598751 | 1802.75346709441 | 87.2259277767252 | 0.0599649684081411 |
| HAVAK              | 1440.63894908234 | 1792.76387895629 | 88.0312324684883 |  0.059687560719295 |
| TESTWORM           | 1492.15045166262 |  1844.5121602841 | 88.0904271553689 | 0.0598364878671414 |
| STVN               | 1526.34441819495 |  1881.7313623277 | 88.8467360331881 |  0.059941998284007 |
| TRAINERX           | 1483.52786078307 | 1801.17080755673 | 89.0879745618183 | 0.0597960092717397 |
| GARZEL             | 1428.79999356628 | 1651.40975885193 | 90.3551934831062 | 0.0599964219556018 |
| SNOW               | 1444.48055764421 | 1811.23242166134 | 91.6879660042836 | 0.0598256423489489 |
| DARKNESSMALICE     | 1428.29106117053 | 1782.88213422593 |   95.59575209992 | 0.0599716395738991 |
| TOUCHING SKY       | 1414.45083653313 | 1878.79887685877 | 116.087010081411 | 0.0599776618567775 |
| BLUE HAIRED LAWYER | 1406.49987472725 | 1897.12046881391 | 131.297032266676 | 0.0599676036933241 |
| POKEBEACH          | 1487.88659526752 | 1803.48968498681 | 137.778778245708 | 0.0599856308548972 |
| JEB                | 1494.13334009885 | 1741.39518502785 | 148.251087278874 | 0.0599911750096703 |
| AMERICA FUCK YEAH  | 1626.95174196449 |             1500 |              350 |               0.06 |
| NEMESIS_NKI        | 1514.22466097746 |             1500 |              350 |               0.06 |
| HIRO NAKAMURA      | 1401.70778222769 |             1500 |  354.62545085763 |               0.06 |
+--------------------+------------------+------------------+------------------+--------------------+

I ran the same query for the Uber ladder. That returned a total of 37 players (in standard it was 400 players). Here's the list sorted by deviation, same as above.

Code:
+----------------+------------------+------------------+------------------+--------------------+
| name           | estimate1        | rating1          | deviation1       | volatility1        |
+----------------+------------------+------------------+------------------+--------------------+
| KING OF HAX    | 1581.42527831272 | 1700.39430154726 | 28.6433809769266 | 0.0418790947860655 |
| I AM COOL      | 1725.45950435118 | 1849.96287569304 | 31.1258428354645 | 0.0361168998447122 |
| FRESH PRINCE   | 1436.55915265615 | 1570.08600867945 | 33.3817140058256 | 0.0543298471146678 |
| MIROR B.       | 1576.73230344111 | 1722.72114934567 | 36.4972114761409 | 0.0593455063255379 |
| IKE123         | 1725.04911618454 |  1887.9998864809 |   41.33541063236 | 0.0570127007675329 |
| DRUGGEDFOX     | 1563.40832964697 | 1732.04192043235 | 43.0385753356756 | 0.0593837476082098 |
| JADEN YUKI     | 1428.70724564721 | 1604.44907735782 | 43.9354579276522 | 0.0593712693878452 |
| TRAIN MAN      | 1553.56565690036 | 1731.82304190661 | 44.5643462515631 | 0.0547317807066533 |
| FIND TAB       | 1647.60638926436 | 1826.95891808618 | 44.8381322054561 | 0.0572955560764978 |
| UBER KING      |  1510.9830370918 | 1702.45276007897 | 47.8674307467934 | 0.0590248196902181 |
| ULTIMIFIER     | 1587.95815123339 | 1788.26247406516 | 50.0760807079413 | 0.0576519693549194 |
| MISSINGNO      | 1942.84107831298 | 2180.54553360899 | 59.4261138240027 | 0.0557108512092074 |
| ABSOLDEATH     | 1432.02068958083 |  1610.3496521267 | 60.2038160093236 | 0.0598538103589566 |
| PHSYCICAL      | 1460.33671477452 | 1707.08801574186 | 61.6878252418352 | 0.0599910477037719 |
| ENTEI          |  1672.7749689658 | 1925.34379443351 | 63.1422063669264 | 0.0560978333664268 |
| OVERLORD310    | 1459.81653760446 | 1732.37289223876 | 68.1390886585744 |  0.059463478287111 |
| PSYROCLASM     |   1504.836020436 | 1778.62425682098 | 68.4470590962458 | 0.0593354105986154 |
| WDRO           | 1482.70343533394 | 1756.70156646516 | 68.4995327828042 | 0.0593915301840289 |
| COALEX         | 1441.94475368139 | 1724.97820194917 | 70.7583620669442 | 0.0590089422268966 |
| MIND           | 1568.64271345153 | 1862.10275446734 | 74.1982020064234 | 0.0595314747353223 |
| MUDKIP         | 1439.41393133438 | 1740.67467540397 | 75.3151860173986 |  0.059391549781022 |
| PHISSISSACK    |  1500.9739473088 | 1739.31988537546 | 76.4392911878543 | 0.0599373016541581 |
| BARIS          | 1449.66246878459 | 1759.51596216082 |  77.463373344058 | 0.0593112995305068 |
| STVN           | 1526.81931688318 | 1838.32811239952 | 77.8771988790853 | 0.0599487963828171 |
| MANIACLYRASIST | 1537.20161951181 |  1853.6536190139 | 79.1129998755227 | 0.0596651614520173 |
| GARCHOMP       |   1480.770443785 |  1802.6750644604 | 80.4761551688491 | 0.0595827387086626 |
| SULPHUR        | 1427.67861884184 |  1765.7277247483 | 84.5122764766156 | 0.0594777842408145 |
| PIKACHUXD      | 1460.54079055534 | 1799.08732434896 | 84.6366334484046 | 0.0597859850995211 |
| PIPO           | 1400.60837621053 | 1733.38522564795 | 85.1407692568779 | 0.0598679069947194 |
| SCYTHER        |  1583.1582367638 | 1946.87387737788 | 90.9289101535205 | 0.0597035411157762 |
| JAY            | 1590.69395745814 | 1935.21809272305 | 95.7907127273445 | 0.0599677154636131 |
| RAIKOU         | 1429.12559530626 | 1839.57729130369 | 102.612923999357 | 0.0599816320461386 |
| I FAIL         | 1421.70902371775 | 1845.37057290683 | 105.915387297269 | 0.0599821299071192 |
| THE BEST       | 1456.87740079121 | 1922.07796713985 | 116.300141587159 | 0.0599725156342016 |
| PALKIA         |  1514.1481369929 | 1992.85510956807 | 119.676743143793 | 0.0599410314145658 |
| JEB            | 1488.94775336221 | 1989.00842492119 | 153.307807066243 | 0.0599477783823905 |
| J00PHAIL       | 1446.38481177525 |             1500 |              350 |               0.06 |
+----------------+------------------+------------------+------------------+--------------------+

Perhaps these numbers will help you understand what deviation is reasonable. Like I said above, I think somewhere between 50 and 100 is probably what we should be considering.
 
Is there any idea when this ladder would actually be started? I'm not sure if it's going to be in mid-August or the beginning of September. I would like to have an idea, because both times have their advantages and disadvantages for me.

Also, I read the IRC log, but I was wondering, could the bold votes in the separate thread for the randoms/people who just fell short still be counted? I believe that it would be a good idea to take some of those votes into account, but to be a lot harsher with those votes, meaning that the arguments must be very well done in order for them to be accepted.

I'm not sure how harshly the eligible voters' votes would be critiqued, but if we were to accept votes from the randoms in the other thread, would the judge(s) be just as harsh with the experienced players when accepting votes?
 
I don't like CRE much to be honest.

In my implementation of the new rating system for Competitor (which is ready, by the way, if anyone wants me to post it in a new thread in IS), I suggest that a deviation of more than 100 would be a provisional rating. Naturally, I can give you the same limit here and suggest that a deviation of less than 100 would be considered 'reliable enough' for voting... but it seems like ShoddyBattle's rating system allowed the deviation to go as low as 18 (which is very bad in my opinion; I'm sure Justinawe's rating can barely change even if he wins 10 games in a row). Because of that, I'd go for an even more reliable deviation here, so I'm suggesting:

1) People eligible to vote must have a deviation of not more than 70.

Secondly, the rating must be high enough to warrant voting. I'm not talking about the CRE here; I'm talking about the actual rating. I'll suggest:

2) People eligible to vote must have a rating (not CRE) of not less than 1700.

We can stipulate a date on which everyone's rating will be looked at for this purpose (so that someone like Aldaron or Earthworm can make the cut by playing a few games and thus lowering his deviation to the acceptable 70). The date can be the same date on which the test ends.
 
A minimum CRE with a minimum Deviation is fine for me; I just figured since the "Rating" number is the actual rating, and since with a low deviation the uncertainty in that number is effectively reduced, that we might as well use the "real" rating.

But that's really just arguing semantics I guess, since the main point is enforcing that Deviation line.

I'll agree that 50 might be too stringent based on the results you've posted, but I'm not looking in the middle of the range of 50-100 either.

I'd probably support 60 for a decent line, as I know Bass doesn't battle too often but nearly enough to be able to intelligently form an opinion of the metagame, and his deviation is 68.

So yea, I'll basically support any number between 50-100 that is closer to the 50 side than the 100 side.

As for CRE or Rating, it doesn't make a difference with the minimum Deviation, so either is fine by me.

O btw, X-Act, I love reading about Rating systems and methods, so I'd love it if you posted your plans for Competitor's rating system in a new thread ^_^
 
Okay, I didn't realise that 80 is at the very low end of the players' spectrum, so yeah, I'll make it lower. 70 is a good all-round number, I think.
 
Would a person wishing to vote have to have these stats at the end of the testing period, or would it count if they had achieved them at some time during the month? I don't ladder frequently, if at all; it's just that, from what I gather, the leaderboard changes quite fast, so I want to know if and when I should make the effort to achieve the goals.
 
X-Act, could we perhaps lower the Deviation number, to emphasize playing on the ladder, and maybe lower the Rating as well, to be more "lenient"?

I'm looking at the list and there are some members that I know who play and are intelligent, and with the rating at 1700, they would be ineligible to vote.

I figure we would have to have a subsequent drop in deviation to balance it out (kinda).
 
Okay, after talking a bit in #is, we arrived at the following:

To be eligible to vote, a player's rating (not CRE!) should be 1650 or more and his deviation should be 60 or less, in BOTH the Garchomp-less ladder and the Standard ladder. Both readings would be noted at the end of the Garchomp-less ladder test.

(The player's rating is the middle value of the rating range shown by ShoddyBattle, if anyone wants to know his or hers. The player's deviation is half the difference between the two extremes in the rating range shown by ShoddyBattle. Example: if your Rating range says 1306 - 1490, you have a rating of (1490 + 1306) / 2 = 1398 and a deviation of (1490 - 1306) / 2 = 92.)
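For anyone who would rather not do that arithmetic by hand, here is a minimal Python sketch of the conversion described above. The function name `range_to_rating` is made up for illustration; it is not part of ShoddyBattle.

```python
def range_to_rating(low, high):
    """Convert a displayed rating range into (rating, deviation).

    The rating is the midpoint of the range; the deviation is half
    the distance between the two extremes.
    """
    if low > high:
        raise ValueError("low must not exceed high")
    rating = (high + low) / 2
    deviation = (high - low) / 2
    return rating, deviation


# The example from this post: a displayed range of 1306 - 1490.
rating, deviation = range_to_rating(1306, 1490)
print(rating, deviation)  # 1398.0 92.0
```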
 
I like that a lot X-Act. We'll use that as the threshold.

Next question that someone already asked:

Does it make sense to require players to be above the threshold at the end of the test, or would it be ok to let them vote if they hit it at any time during the month?

Also, Doug: Is there any obstacle in actually collecting a list of people who meet the requirements we define?
 
If we want to collect it at a single defined time (like the end of the month), it's a breeze. If we want to collect it every day, it is harder, but still not very hard. Ratings are calculated every day at 11:30pm. If we want to add a little extract query to write out players above the threshold, it should be very simple to add. The ratings thread (a processing thread, not a forum thread) is already there.
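As a rough illustration of what such an extract could look like, here is a Python sketch that filters players against the 1650 / 60 thresholds on both ladders. Everything about the data layout (a mapping from player name to a rating and deviation pair) and the sample names is an assumption made for this example; it is not the server's actual query or schema.

```python
MIN_RATING = 1650      # rating threshold agreed in this thread
MAX_DEVIATION = 60     # deviation threshold agreed in this thread


def eligible_players(standard, suspect):
    """Return sorted names meeting the thresholds on BOTH ladders.

    Each argument maps a player name to a (rating, deviation) tuple.
    This layout is hypothetical, not ShoddyBattle's real schema.
    """
    def ok(stats):
        rating, deviation = stats
        return rating >= MIN_RATING and deviation <= MAX_DEVIATION

    return sorted(
        name
        for name, stats in standard.items()
        if ok(stats) and name in suspect and ok(suspect[name])
    )


# Made-up sample data, for illustration only.
standard = {
    "PLAYER_A": (1702.5, 47.9),
    "PLAYER_B": (1788.3, 50.1),
    "PLAYER_C": (1610.3, 60.2),
}
suspect = {
    "PLAYER_A": (1660.0, 58.0),
    "PLAYER_B": (1700.0, 75.0),
}
print(eligible_players(standard, suspect))  # ['PLAYER_A']
```

Running the same filter once at the end of the month or once after each daily recalculation would only change when it is called, not the filter itself.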
 
To be eligible to vote, a player's rating (not CRE!) should be 1650 or more and his deviation should be 60 or less, in BOTH the Garchomp-less ladder and the Standard ladder. Both readings would be noted at the end of the Garchomp-less ladder test.

I assume you're talking about minimum rating here, not average rating.

Anyways, I'd be for "If you're 1650+ rating and 60- deviation on X day, you're good" as far as timing goes; if you absolutely have to have that rating on the last day, I (and probably a lot of other people) would just wait until the end of the month and try to race to the required rating, which would be kind of a mess.

And is there a way for a normal user to check deviation? When I click on "View Record" I only see CRE, Rating, and Volatility.
 
No, he means average rating. Taking the two ends of the range, add them and divide by two.

You find deviation by taking half of the difference of those two.
 
If your displayed Rating range is between Min and Max, then

Rating = (Max + Min) / 2
Deviation = (Max - Min) / 2

Also, your displayed CRE = (5*Min - 3*Max) / 2, and this is the rating that's displayed on the leaderboard. We won't be using this, though.

Example: Suppose your rating range in the Standard Ladder is between 1664 and 1710. Then your CRE, the 'rating' displayed on the leaderboard, would be equal to (1664*5 - 1710*3) / 2 = (8320 - 5130) / 2 = 3190 / 2 = 1595. This is NOT the rating that we're going to use, though. We're using the following two numbers:

Rating = (1710 + 1664) / 2 = 1687.
Deviation = (1710 - 1664) / 2 = 23.

If Rating is 1650 or more, and Deviation is 60 or less for both ladders, you would be eligible to vote. In this case, for example, the player passes the criteria for the Standard ladder; he would need to pass the same criteria for the Garchomp-less ladder as well in order to be able to vote.
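The same check can be sketched in Python for a single ladder. The function names are invented for this example. The comment noting that CRE equals Rating minus four times Deviation is just the three formulas above rearranged, not something taken from ShoddyBattle's code.

```python
MIN_RATING = 1650      # rating threshold agreed in this thread
MAX_DEVIATION = 60     # deviation threshold agreed in this thread


def ladder_stats(low, high):
    """Return (rating, deviation, cre) for a displayed rating range."""
    rating = (high + low) / 2
    deviation = (high - low) / 2
    cre = (5 * low - 3 * high) / 2   # algebraically rating - 4 * deviation
    return rating, deviation, cre


def passes_ladder(low, high):
    """True if this one ladder's range meets both thresholds."""
    rating, deviation, _ = ladder_stats(low, high)
    return rating >= MIN_RATING and deviation <= MAX_DEVIATION


rating, deviation, cre = ladder_stats(1664, 1710)
print(rating, deviation, cre)      # 1687.0 23.0 1595.0
print(passes_ladder(1664, 1710))   # True
print(passes_ladder(1306, 1490))   # False
```

A player would need `passes_ladder` to come out true for the range on each of the two ladders.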

Hope that is crystal-clear now.
 
Alright, one question, do we have to reach the 1650 rating for both ladders on the same alias?

For example:

bologo - standard ladder = 1650
bologo - suspect test ladder = 1650

Is this the way it has to be?

Or can it be done this way:

bologo - standard ladder = 1650
bologo-testing - suspect test ladder = 1650

Can it be done on two separate aliases like this?

Yeah, I'm just wondering, because I remember that the shoddy mods/admins can check the aliases that belong to each IP address. It might help if people making too many aliases is a problem, since people will make a fresh start on these ladders with a new alias if they have to. If people can just use two of their current aliases to do it, then they won't need to make a new alias. I'm just saying this because I saw that thread in IS that talked about putting a limit on aliases.

Just a suggestion, the first way is fine too if you guys want to keep it, or if too many aliases isn't that much of a problem.
 
To be honest, I would be fine with getting above the threshold with different accounts. Although, of course, we would have to know which alias account corresponds to which normal account.
 