Old Apr 29th, 2011, 7:48:20 PM   #26
Rising_Dusk
Join Date: Dec 2009
Posts: 4,742

Quote:
Originally Posted by khz
Perhaps someone more informed about the programming can answer this: is it possible to implement Shoddy's old rating system?
It can be implemented, but it requires new clients and a new release of PO and stuff. It also shouldn't be terribly difficult, but getting any of the PO coders to give it any priority would probably be tough. That said, I am on the programming push list for PO, but what with stats and other massive undertakings I am working on, I can't really invest much time into it at the moment. Maybe I can work something out with coyote for the next PO release. It would certainly not happen for a while, in any case.
Old May 1st, 2011, 2:47:59 PM   #27
Firestorm
I did my best, I have no regrets!
Join Date: Apr 2007
Posts: 7,261
Vancouver, BC

Quote:
Originally Posted by franky
the problem with the old paragraph system is that they will essentially echo each other's statement and it's too time consuming. with a rating of 1500+ we already assume that the player is well-versed enough to vote without justifying him/herself. with all the rating decay though, i think we can solve this by
I would say paragraphs should be used to justify the reason for your vote rather than to show that you are knowledgeable enough. Being a good player doesn't necessarily mean you're voting to ban things that keep the metagame from being competitively playable. Rather, you may be voting to ban things because you dislike them, they annoy you, or banning them increases your chances to keep on winning.
Old May 2nd, 2011, 7:29:23 PM   #28
Kevin Garrett
Join Date: Jan 2008
Posts: 3,237
Passaic County, NJ

I have been in favor of having paragraphs for suspect tests since the beginning of BW. I talked to Philip about it at length and I understand he was trying to keep subjectivity out of the council, but it's quite clear that votes are subjective to begin with. It needs some kind of check. From what I understand, the question about having paragraphs is, "How long should they be?" If they are too short, it is easy to sugarcoat your opinion with facts. If they are too long, it is a stress on the council to read through them all. Nonetheless, I think there is a middle ground we can come up with to make everyone happy.
Old May 3rd, 2011, 4:22:06 AM   #29
khz
 
Join Date: May 2010
Posts: 264

Quote:
Originally Posted by Rising_Dusk
It can be implemented, but it requires new clients and a new release of PO and stuff. It also shouldn't be terribly difficult, but getting any of the PO coders to give it any priority would probably be tough. That said, I am on the programming push list for PO, but what with stats and other massive undertakings I am working on, I can't really invest much time into it at the moment. Maybe I can work something out with coyote for the next PO release. It would certainly not happen for a while, in any case.
Just thought I'd post this:
Quote:
Originally Posted by Wiz
It is possible. The rating system is implemented in the PO server application, which is open source. It can even be changed without breaking compatibility with the current client application.

I've thought about implementing it myself around 8 months ago, but I talked to coyotte508 and he didn't like the idea.
And just so I stay on topic: Like I said before, I'm not totally convinced that high ranking = qualified to vote based on reasonable measures. But even if you have paragraphs justifying your vote, how hard is it to vote "Blaziken because I cbb putting a counter on my team" (just an example) and then fill the paragraph with reasons that you found in one of the many discussion topics? To be perfectly honest I think most people vote for rational reasons, but given that paragraphs do very little I don't think they're worth the effort, unless someone can show me that some people will put reasoning in these paragraphs that is bad enough to warrant stripping them of voting privileges, which I'm all for.
Old May 3rd, 2011, 4:47:33 AM   #30
animenagai
Join Date: Nov 2007
Posts: 1,202
I make it rain silver points!

Quote:
Originally Posted by Haunter
This. Bring back the old requirement of 1400. It's already hard enough to accomplish when factoring in hax and rating decay. We definitely need to enlarge our voting pool.
I agree, and I've been saying this for a while now. Tell me, what exactly is the 15/15 system supposed to achieve? What I've heard from the higher-ups is that it's supposed to reward effort and consistency throughout the tiering process (please correct me if I am wrong). However, in practice, it does no such thing. Put it this way, no one needs to ladder for the majority of the testing period. Since a snapshot is taken at the end of the tiering process and that alone (basically) tells you who qualified and who didn't, why wouldn't you just ladder at the end of the period? This system punishes good players who don't have a lot of time on their hands. Some of us are busy people in uni, at work or both. Not all of us are teenagers who can ladder for hours on end every day.

I've seen people argue that you can just reach a high ladder ranking and then win 1 game every day to stop your decay. The problem with this argument is that being in the 15/15 range early in the test and being qualified later in the test would require completely different scores. As the testing period goes on, the average score on the leaderboard will go up. What gets you in the 15/15 range in the first week probably won't be good enough by the time the test ends. Seriously, if you guys are just worried about people qualifying early (and hence not knowing what is broken etc.), just set a time frame. You could do something like "to qualify for voting, you need to achieve a rating of 1400 or higher after the 2nd week of the suspect test". That would still give us more time and flexibility than the current system. Heck, raise it to 1450 if you want to. Just establish a reasonable number that is concrete throughout the suspect test.
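The fixed-threshold rule proposed above can be sketched in a few lines. This is only an illustration of the proposal, not anything PO or Smogon actually runs: the function name `qualifies`, the `(day, rating)` history format, and the default values are all hypothetical.

```python
def qualifies(history, threshold=1400, earliest_day=14):
    """Return True if the player reached `threshold` after `earliest_day`.

    `history` is an assumed list of (day_of_test, rating) observations.
    A peak reached before the cutoff does not count, which addresses the
    worry about people qualifying too early in the test.
    """
    return any(day > earliest_day and rating >= threshold
               for day, rating in history)


# A player who peaks early, dips, and gets back over 1400 in week 3.
print(qualifies([(3, 1420), (16, 1405), (28, 1380)]))  # True
# A player whose only 1400+ rating came before the cutoff.
print(qualifies([(3, 1420), (28, 1380)]))              # False
```

Unlike an end-of-test snapshot, the target here does not move as the leaderboard average climbs.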
Old May 4th, 2011, 5:05:14 PM   #31
coyotte508
Join Date: Jul 2007
Posts: 157

Quote:
Originally Posted by PK Gaming
It allowed good players to ladder occasionally instead of religiously every single day.
I'm not going to argue about anything, just saying that you don't have to play every day (if the decay is set to be once every 24 hours). Just stop playing for a few days and then play the same number of battles as the number of days you were offline, and the decay is erased.
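A toy model of the behaviour described above, for anyone trying to picture it. Everything here is an assumption made for illustration: the thread does not give the real decay amount or how PO stores it, so `DECAY_PER_DAY`, `LadderEntry`, and `pass_day` are hypothetical, and rating changes from the battles themselves are ignored.

```python
from dataclasses import dataclass

DECAY_PER_DAY = 5  # hypothetical value; the real amount is not stated here


@dataclass
class LadderEntry:
    rating: int
    idle_days: int = 0  # days of decay not yet offset by battles

    def pass_day(self, battles_played: int) -> None:
        """Advance one 24-hour period in which `battles_played` were fought."""
        if battles_played == 0:
            self.idle_days += 1
        else:
            # Each battle offsets one accumulated day of decay.
            self.idle_days = max(0, self.idle_days - battles_played)

    @property
    def displayed_rating(self) -> int:
        return self.rating - DECAY_PER_DAY * self.idle_days


entry = LadderEntry(rating=1450)
for _ in range(3):
    entry.pass_day(battles_played=0)   # three days offline
print(entry.displayed_rating)          # 1435
entry.pass_day(battles_played=3)       # three battles in one sitting
print(entry.displayed_rating)          # 1450
```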
Old May 4th, 2011, 9:14:29 PM   #32
jrrrrrrr
wubwubwub
Join Date: May 2006
Posts: 3,118
wubwubwub

Just posting to say that I agree with the OP. The PO rating system is terrible and it's an embarrassment to the suspect testing process that we rely so heavily upon it. One bad luck streak over a small handful of battles and suddenly your 1400 "near qualification level" rating is now down to an 1100 "pathetic" rating. I haven't had a chance to run through the proposed system but SOMETHING really needs to change here if we want the tests to have any legitimacy. I have no problem with making the voting pool exclusive but using PO's rating system boils it down to "who can avoid hax the most" instead of the original goal of the suspect test, which was to have voters based on their knowledge of the metagame.
Old May 18th, 2011, 7:28:45 PM   #33
Smith
☆*:.。. o(≧▽≦)o .。.:*☆
Join Date: Nov 2009
Posts: 1,342
Denton, Texas, USA

I wasn't going to post in here because I didn't see what was wrong with the system, but I had never made an attempt at voting before. I've been laddering for quite a bit now to try and get voting reqs and the system is absolutely ridiculous. I'm really angry so can I just talk about the actual process of laddering? I hate it, it's terrible. I've gotten so many +7/-24 battles or the like and I think I'm going to scream. I've dropped about 60 points today because I got critted and stuff, and once I got angry at that, it just went downhill in a vicious cycle. I'm running a stall team so I actually have a much higher chance of losing to noobs than to people who actually know what they're doing (I actually got swept by an Electivire that had the exact four moves it needed to sweep, two of which weren't viable in the least) because of the random shit they pull. Once I get paired up against a batch of noobs and I lose, my rating drops, I get even angrier, I face more noobs because my rating is lower, and it's awful. Now I know this tirade is kind of off topic but the rating system is simply killing me, and I wish laddering were a bit easier. I've easily had over 100 battles this round and I clearly know enough about the metagame to vote (at least in my opinion), but there is just no way I am getting into that 15 + 15 range. I am not a bad Pokemon player, only an inconsistent one.

I have a couple of ideas. Firstly, I'd like to bring the bar down to 1400 like everybody is talking about. I don't get what's even wrong with people parking their accounts anyway; they clearly are qualified to vote if they can get that high. Not to mention that 15 + 15 sets the bar really really high: right now, it puts it at 1499, and it's only going to get higher. That would make this even WORSE than Round 1, when EVERYBODY complained that voting reqs were too high. 1400 WORKS. I don't care how lucky you are, you simply cannot hax your way up to 1400 without some knowledge of the metagame, and even if you did, you must've learned something in the massive amount of battles that would require.

The other thing I would like to see is MORE SUBJECTIVITY in special applications. Yeah, I know, that's exactly what we've been trying to avoid, but why? Getting high on the ladder is the most objective proof available, we really don't need any more "objectivity". I think that a high-ish ladder rank would be a great thing to include in your special application, but I don't think Iconic or Eo or somebody should have to explain why they got haxed out of voting reqs (despite the fact that that would never happen). You should just have to send a PM to reach with your experience, any evidence you can supply (like people you've beaten, ladder peaks, tournament wins or placings, etc.) and what you think of the metagame. Just impress upon reach that you know what you're talking about, and he shouldn't have to ask about your ladder peaks (although they obviously help in showing your competence, if they're high). I have faith in the fact that reach isn't a moron and that Phil wasn't a moron for putting him up top; I think I can leave it to him to decide who's worthy of voting.

In summary, we all hate hax, hax is everywhere, ratings aren't always so telling of skill, we expect higher ratings than we ought to, and I trust the people up top.
Old Jul 10th, 2011, 12:26:09 PM   #34
BKC
bringer of torture
Join Date: May 2010
Posts: 2,228
Prague

Echoing Smith. This rating system is fucking terrible. For instance, I was at 1446 or so last week, and I got a +7 -24 match...I was about to win, go up to 1453 and get my voting reqs, but my Landorus got crit flinched and then flinched again by Excadrill's Rock Slide. It's not just the hax, it's the decay...
conversation with symphonyx64

It really comes down to who can battle the most to avoid decay, or who can avoid hax the most. I think if Shoddy's old rating system were implemented, it would be a lot more beneficial because it means one or two hax losses vs. scrubs won't kill you. If we're not going to do that, the bar should definitely be set at 1400...if you can get 1400 you definitely have played enough of the metagame to vote. You can't simply hax your way to that high a rating without being somewhat knowledgeable of the metagame.
Old Jul 10th, 2011, 2:27:48 PM   #35
Moo
Professor
Join Date: Jun 2010
Posts: 1,447

Rating system is annoying, and decay is a pain, no doubt about it, but I think it's up to the people that run PO to change the ladder system, not us.
One of my friends is a PO admin and helped make it, so I could ask about the shitty rating system.
Old Jul 11th, 2011, 3:53:32 PM   #36
Bologo
Have fun with birds and bees.
Join Date: May 2006
Posts: 2,828
Brampton, Ontario

I absolutely agree with the sentiment that this rating system is absolute garbage. I mean, I don't like to badmouth these things because I know that a lot of work gets put into them, but this is a little extreme.

If we're not going to change the rating system, for god's sakes, at least make the required rating 1400. 1450 is way too high, because at that point, literally any battle you have has the potential to wreck all of your work for one day. I know I'm really just echoing the previous statements, but something really needs to be done, because it really is all about who can withstand the most hax and decay.

Also, I know that when the server's down the ratings aren't supposed to decay, but I'm sure that I saw quite a bit of decay in my rating after the server crash. I know I took like 2 or 3 days off from pokemon, but I don't think that would account for a 43 point loss that I had (1424 to 1381). Are you guys sure that it doesn't decay while the server's down? Because if it does turn out to decay, then I feel like the requirements should really be set lower, at least for the current round.

Sorry if this sounded like an off-topic rant, but it's just frustrating that there are either very clearly experienced players that are disguised as low ranked alts, or noobs that get extreme amounts of hax, and that one battle with them can result in an hour of laddering being a waste of time. :/

Is there at least a way to make it so that you don't have to battle people with a 100 rating difference? I know you used to be able to set it to lower than 100, but it feels like making the minimum difference 100 only caused problems, and didn't actually benefit anyone. At least if you could battle more people within your range you could have +15, -15 battles instead of +7, -24 all the time (or worse, I even had a +7, -25 earlier for some reason, though admittedly I thought the minimum was 200 :/).
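For reference, the lopsided swings mentioned above are what a plain Elo update produces when the opponent is rated well below you. The sketch below uses the textbook Elo formulas; the K-factor of 32 and the 1450 starting rating are assumptions for illustration, not values taken from PO's code.

```python
def expected_score(rating: float, opponent: float) -> float:
    """Textbook Elo expected score for `rating` against `opponent`."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))


def elo_change(rating: float, opponent: float, won: bool, k: float = 32.0) -> float:
    """Points gained (positive) or lost (negative) from a single battle."""
    actual = 1.0 if won else 0.0
    return k * (actual - expected_score(rating, opponent))


# Opponents rated 0, 100 and 200 points below a 1450 player.
for gap in (0, 100, 200):
    win = elo_change(1450, 1450 - gap, won=True)
    loss = elo_change(1450, 1450 - gap, won=False)
    print(f"gap {gap:>3}: win {win:+.1f}, loss {loss:+.1f}")
```

Under these assumptions an evenly matched battle is worth +16.0 / -16.0, a 100-point gap gives about +11.5 / -20.5, and a 200-point gap gives about +7.7 / -24.3, which is in the same neighbourhood as the +7, -24 results people are reporting. Narrowing the allowed rating difference would pull the two numbers closer together.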

Last edited by Bologo; Jul 11th, 2011 at 4:59:31 PM.
Old Jul 20th, 2011, 9:02:37 PM   #37
capefeather
hey, even pirates need attorneys
Moderator
Join Date: Apr 2009
Posts: 2,603
especially internet pirates

OK so, I probably should have been more aware of this sooner, but PO's rating system basically (tries to) implement the Elo rating system, which is used in FIDE ratings and similar places. While Elo itself is, I suppose, "legitimate", there are still a few problems with this that result in what I had demonstrated in the OP:

Shoddy had Glicko2. Elo is still worse than Glicko2 (the convergence complaint in the OP still applies). Thus, it's still a step backward. I'll happily admit that I (and most others) probably wouldn't be complaining at all if Shoddy never existed, but Shoddy has still set this standard as well as others. We should never accept steps backward.

There's also the fact that chess player ratings, at least at the top, come from many years of playing thousands of matches. Compare that to Pokémon, specifically our suspect tests. Our tests last a month; in my best attempt at laddering to the voting requirements, I played a bit over 200 battles within a month, and Jibaku apparently played around 300. IMO, more would have been unreasonable to anyone with a life. Elo, with its lack of a deviation, was simply never meant to be used for such "low" numbers of matches. We time our suspect tests to reflect the time that it supposedly takes to understand the metagame, but we pay no heed to the time that it may take to get a proper reading of true player ratings. Coyotte likes to argue that players will approach their true ratings "eventually", but in light of the chess comparisons, it's a pretty lazy argument.

There's a significant random element in Pokémon. Chess players may experience performance variation for other reasons, but the luck factor is still not nearly as prevalent (zomg I'm Black I'm slightly disadvantaged!!!). The results of randomness reflect in "wrong" rating changes and continue to impact the rating.

The matchmaker uses the ratings to find opponents. This may not seem like a big deal, but considering everything said before, it really works to make laddering more of a chore than it should be (or at least more than it was on Shoddy). This works to widen the gap between tournament performance and ladder performance. Also, I'm not sure how relevant this is atm but I'll mention it here anyway. I really do believe that the matchmaker really magnifies the issue (though it's hardly its fault).
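To make the points above a bit more concrete, here is a small simulation sketch. It is a toy model under loud assumptions, none of which come from PO itself: a K-factor of 32, a hypothetical "true" skill of 1500, a starting rating of 1200, a matchmaker that always finds an accurately rated opponent at the player's current ladder rating, and battle outcomes drawn at random from the Elo win probability.

```python
import random
import statistics


def expected_score(rating: float, opponent: float) -> float:
    """Textbook Elo expected score for `rating` against `opponent`."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))


def simulate_ladder(true_skill, start, battles, k=32.0, rng=None):
    """Ladder rating after `battles` games for a player of fixed true skill."""
    rng = rng or random.Random()
    rating = start
    for _ in range(battles):
        opponent = rating  # assumed matchmaker: pairs by current ladder rating
        # The outcome depends on true skill, not on the displayed rating.
        won = rng.random() < expected_score(true_skill, opponent)
        actual = 1.0 if won else 0.0
        rating += k * (actual - expected_score(rating, opponent))
    return rating


rng = random.Random(0)  # fixed seed so the run is repeatable
finals = [simulate_ladder(1500, 1200, 200, rng=rng) for _ in range(1000)]
print("mean final rating:", round(statistics.mean(finals)))
print("spread (std dev): ", round(statistics.stdev(finals)))
```

The printed spread shows how far a 200-battle rating can sit from the underlying skill purely through luck in this model. A system that tracks a deviation alongside the rating, as Glicko2 does, at least reports that uncertainty instead of hiding it.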

I know that this post may be a bit irrelevant right now considering the server problems (lol PO strikes again!), but it came up in chat, so it's here.
Last edited by capefeather; Jul 27th, 2011 at 7:34:27 PM.
All times are GMT -4.