Ladder and rating system policy

Zarel

Not a Yuyuko fan
Site Content Manager, Battle Simulator Administrator, Programmer, Pokemon Researcher, Administrator
Creator of PS
atomicllamas, Stratos, I'm confused, what do your posts have to do with W/L? Or are you saying that AM lied to me?

I'm sure Antar has strong opinions re: GXE and COIL cutoffs in suspect tests, and I'm not opposed to GXE floors in suspect tests, but this thread is about W/L and ladder resets...
 

atomicllamas

but then what's left of me?
Site Content Manager Alumnus, Senior Staff Member Alumnus, Community Contributor Alumnus, Top Tiering Contributor Alumnus, Contributor Alumnus
Zarel said:
atomicllamas, Stratos, I'm confused, what do your posts have to do with W/L? Or are you saying that AM lied to me?
No, PDC misspoke when he said that OU uses W/L in suspect tests, they don't, they use a max number of games, which is the same as a GXE floor. The point of my post was to correct your misconception about OU suspect tests, as well as your misconception of why people start new alts (which is for many reasons aside from a bad W/L).
 

Stratos

Banned deucer.
I don't defend W/L for suspect tests but atomicllamas said antar stopped them from using a GXE floor on their tests, which is ridiculous, so I was speaking up about that. Sorry if that was a bit of a tangent. W/L is only good for pride, but there's really no reason to hide it because people who are going to ditch an alt are going to do it either on the first day, where you can easily remember your W/L, or after a really bad losing streak. You don't accrue a hundred battles on an alt and then do /rank and go "oh shit i only have a 3:1 W/L time to make a new alt"


and yeah OU hasn't used W/L for suspect tests for a while, petition to revoke PDC's right to say -OUTL for that one
 

Zarel

Not a Yuyuko fan
Site Content Manager, Battle Simulator Administrator, Programmer, Pokemon Researcher, Administrator
Creator of PS
atomicllamas said:
as well as your misconception of why people start new alts (which is for many reasons aside from a bad W/L).
I already know about those reasons, and most of those are good reasons to start new alts and I'm fine with them. I'm just saying a bad W/L is a bad reason to start a new alt.

Stratos said:
W/L is only good for pride, but there's really no reason to hide it because people who are going to ditch an alt are going to do it either on the first day, where you can easily remember your W/L, or after a really bad losing streak. You don't accrue a hundred battles on an alt and then do /rank and go "oh shit i only have a 3:1 W/L time to make a new alt"
You might remember your W/L, but if no one can see it, is it still necessary to reset?
 

Stratos

Banned deucer.
The only time most people are ever gonna reset because of other people seeing their W/L is when going for undefeated streaks. You have effectively killed that, but i kind of liked it :(
 

PDC

street spirit fade out
Team Rater Alumnus, Top Tiering Contributor Alumnus, Smogon Media Contributor Alumnus, Four-Time Past WCoP Champion
we have used it in the past; in the aegislash test for example. (in april)

just posting to say that i didn't make it up!
 

Arcticblast

Trans rights are human rights
Forum Moderator, Tiering Contributor, Social Media Contributor Alumnus, Senior Staff Member Alumnus, Community Contributor Alumnus, Battle Simulator Moderator Alumnus, Past SPL Champion
The people who change alts after seeing their w/l are still going to change their alts with a lower-than-desired GXE or a sudden streak of losses. Hiding w/l isn't actually going to change anything. It makes it harder to judge how well you're doing over a long period of time for those who don't frequently ladder, and once you're already high on the ladder it makes it hard to judge a team's performance (before you could simply compare before and after w/l).

I honestly don't see any reason to have removed w/l in the first place.
 

Josh

=P
Team Rater Alumnus, Tiering Contributor Alumnus, Contributor Alumnus
- Should W/L be displayed in the ladder?

There's no reason I can think of not to, so yes. Similar to the UU thing, a lot of AG players hate how terrible the ladder is and we try and see how far we can go without losing; iirc the record is 51 games. Removing w/l will remove a lot of desire to ladder, for that metagame at least. If this is targeted more for newer people, let me just say that as a relatively new member to this site, I clearly remember when I was one of the ladder randoms who didn't even know what Smogon was, only laddered randbats, etc, and I didn't care at all for my w/l. The only thing I looked at was my ELO. That may not ring true for every new player, but it certainly did for everyone I knew that played PS at the time. As for tournament/higher level players, this thread already shows how they feel about it. I just don't see how this benefits anyone, if your problem is alt creation lagging the server or something then just limit each IP to x alts per day, lol.

- Should we have a ladder reset option?

I don't like the idea of people being able to reset their own ranks, but a complete ladder reset every 3 months sounds great to me.
 
To people finding raw W/L useful for anything, please enlighten us to how, because an entire cohort of academics would love to get knowledge of your groundbreaking research on the subject.
 

Josh

=P
Team Rater Alumnus, Tiering Contributor Alumnus, Contributor Alumnus
To people finding raw W/L useful for anything, please enlighten us to how, because an entire cohort of academics would love to get knowledge of your groundbreaking research on the subject.
  • Lets you see how far you've laddered without losing.
  • Useful way of seeing your progress for people who don't really understand how GXE works past "higher GXE = better record".
  • Easier way of seeing progress during suspect tests, again, for those of us who don't understand the technical side of GXE
 

Arcticblast

Trans rights are human rights
Forum Moderator, Tiering Contributor, Social Media Contributor Alumnus, Senior Staff Member Alumnus, Community Contributor Alumnus, Battle Simulator Moderator Alumnus, Past SPL Champion
To people finding raw W/L useful for anything, please enlighten us to how, because an entire cohort of academics would love to get knowledge of your groundbreaking research on the subject.
Arcticblast said:
[The lack of a visible w/l] makes it harder to judge how well you're doing over a long period of time for those who don't frequently ladder, and once you're already high on the ladder it makes it hard to judge a team's performance (before you could simply compare before and after w/l).
But to elaborate:

Let's say I'm 7th on the Doubles OU ladder, and last night I built a couple teams and want to test them. I play 20 games on each, and play about as well with each team.

With a visible w/l, I have a metric to compare the general success of each team, by recording my w/l before and after using each team. Without it, I have to record each game individually - something I'll easily forget to do when I just want to play Pokemon.
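The before/after comparison can be sketched roughly as follows. This is only an illustration: the `Record` type, the function names, and the example numbers are made up for the sketch and are not anything PS exposes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """A snapshot of an account's displayed W/L (hypothetical type)."""
    wins: int
    losses: int


def session_result(before: Record, after: Record) -> Record:
    """Wins and losses accumulated between two snapshots of the same account."""
    if after.wins < before.wins or after.losses < before.losses:
        raise ValueError("'after' must be taken later than 'before'")
    return Record(after.wins - before.wins, after.losses - before.losses)


def win_rate(record: Record) -> float:
    """Fraction of games won; 0.0 when no games were played."""
    games = record.wins + record.losses
    return record.wins / games if games else 0.0


# Made-up snapshots: note the W/L before team A, after team A, after team B.
start = Record(120, 40)
after_team_a = Record(134, 46)   # 20 games with team A
after_team_b = Record(145, 55)   # 20 more games with team B

team_a = session_result(start, after_team_a)
team_b = session_result(after_team_a, after_team_b)
print(team_a, win_rate(team_a))
print(team_b, win_rate(team_b))
```

Two snapshots per team are enough, which is the whole appeal over logging every game by hand.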

Or let's say I've decided I want to pick OU back up. The last time I played OU, I was (hypothetically) 27-10. If I go play again and can't see my progress, how am I supposed to know how much I need to work on my game to get back into OU?
 

DragonWhale

It's not a misplay, it's RNG manipulation
Top Social Media Contributor Alumnus, Community Leader Alumnus, Community Contributor Alumnus, Dedicated Tournament Host Alumnus, Battle Simulator Moderator Alumnus
To people finding raw W/L useful for anything, please enlighten us to how, because an entire cohort of academics would love to get knowledge of your groundbreaking research on the subject.
Because the raw data is simple and easier to understand. Even if it isn't useful from an academic standpoint (which doesn't seem to be a priority for most people), if it isn't a hindrance why remove it? As many others said if people make new alts when seeing bad w/l chances are they'll do the same with bad GXEs.
 
need to defend where i belong...

To people finding raw W/L useful for anything, please enlighten us to how, because an entire cohort of academics would love to get knowledge of your groundbreaking research on the subject.
then why was this thread posted? i thought the only thing that academia can agree on is that they disagree. maybe you should have talked to your enlightened trust of academics on why this thread is here before it's posted instead of asking us ne'er–do–well-like clowns.

regardless... the use of any ladder rating/metric has/should have been tied to the attrition of the average ladder player. other admins already argued that the main rationale for ditching winloss is that it doesn't give an accurate representation of skill/ability. it doesn't, but neither does GXE, ELO, ACRE, or anything that's been used in the past. every rating system ever used on PS is hindered by the players that use it. no player (barring outliers) that actively uses the ladder is among the best on PS or this forum.


if your goal is to use ladder metrics that accurately detail who is the best player, then you've already failed. it's apparent through other posts and on PS lobbies that, whether you think so or not, the normal ladder player intangibly gains from seeing their w/l ratio, and doesn't want someone to hold their hand and tell them what's best. i don't see what's the problem here, and i don't understand why you're trying to go against the only demographic that cares about ladder statistics (besides yourself, i guess!).

by the way, if you want the ladder to reflect relative skill as well as it can, it would probably be best to change the medium that ranks players in the top500 instead of removing a metric that had no influence on anything besides personal achievement.
 
I'm rather indifferent towards the subject but I've just got a request...

Can everyone stop being huge dicks? Just saying, the attitude here, on IRC, or whatever when something happens with PS that you disagree with is ridiculous. It's one thing to get mad at a dummy like Oglemi for ANOTHER stupid post or policy decision, but Zarel and the PS administrators are actually putting in a metric fuck ton of time and effort into maintaining a functional simulator. Of course you are entitled to your opinions and can be dissatisfied, which is why we have these threads and why we discuss the subjective issues we disagree with, but is it entirely necessary to voice those opinions with overly entitled snark or insults pointed towards the people that made this simulator?

Please be civilized. These people are putting in so much work they did not have to just so we do not have to play on Pokemon Online *shudders*. Let's try and remember that next time we decide to be bitchy for likes.

Thanks.
 

Stratos

Banned deucer.
This dude PMed me

Tsaeb XIII said:
As a serial Smogon lurker, I've no chance of ever actually having permission to post in the Policy Review threads. However, I was just reading through the one about displaying W/L, and I was wondering whether it might be more helpful to display current streak rather than W/L. GXE remains as a way of actually ranking how good someone's record is, but current streak allows for bragging rights. Just a suggestion, since you seemed to be in the camp favouring some way of tracking things like record win streaks. If you start a new alt, your current streak tracks consecutive wins, without reducing all ladder records down to W/L.

The other added bonus is that it possibly ends up with people making fewer fresh alts to aim for streak records since if it's streaks that are tracked rather than W/L then people might not feel the need to go [X] and 0 and instead just aim for a given streak length (regardless of whether the same alt had losses prior to the streak beginning). This is good for the comparative ranking schemes like GXE that much prefer players to maintain rankings that accurately reflect their skill rather than constantly starting fresh accounts. There's probably something to be said for showing current streak regardless of whether W/L is shown or not.
thoughts on adding a "current streak" category to ladder stats? Is this technologically feasible? from a competitor's perspective I really really like this, as you can now compare winstreaks and have proof that they're real without needing to make new alts for it and—more importantly—means that there's no reason to keep W/L hidden anymore as the only major reason people reset alts because of W/L was to get streaks. Besides, a streak of 50 starting from the top of the ladder is more impressive than a streak of 50 starting from 1000.
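On the feasibility question, a rough sketch of what a "current streak" stat would involve, assuming the server keeps (or can replay) an ordered list of results per account. The function names and the 'W'/'L' encoding are illustrative assumptions, not how PS actually stores ladder data.

```python
from typing import Iterable


def current_streak(results: Iterable[str]) -> int:
    """Number of consecutive wins at the end of the history ('W' or 'L' per game)."""
    streak = 0
    for result in reversed(list(results)):
        if result != "W":
            break
        streak += 1
    return streak


def best_streak(results: Iterable[str]) -> int:
    """Longest run of consecutive wins anywhere in the history."""
    best = run = 0
    for result in results:
        run = run + 1 if result == "W" else 0
        best = max(best, run)
    return best


history = "LWWLWWW"  # made-up history, oldest game first
print(current_streak(history), best_streak(history))
```

In practice the server would not need the full history: incrementing a counter on each win and zeroing it on each loss gives the same current streak, so earlier losses on the same alt never have to be wiped for a streak to count.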

I'm sure I'm overreacting here but having streaks be recorded is actually kinda cool motivation to ladder since it's a ladder achievement that I would actually respect (at least until some plebeian with teams that auto-lose to kyurem-b gets a 50 win streak), and there are none of those atm.

Even with streaks, W/L is still useful for the reason Arcticblast outlined so I'd like to see them both implemented if possible.



@ blarajan: the only uncivility in this thread toward a ps dev has been towards slayer95 and he really brought that one on himself, there's no need for tone policing :|
 
It's true that W/L is not an accurate representation of skill, but it is a very simple, very basic measurement that, in my opinion, is harmless to the casual PS user. It's a feel-good tool. By all means keep W/L away from suspect tests but really, I would guess that the average PS user is a casual player with no need for this change. If the tournament players on Smogon want W/L despite knowing that GXE is better at measuring skill level, I would imagine that casual players who can't vote or just don't care about suspect tests would want their W/L even more.
 
I agree with keeping W/L even though it isn't actually useful beyond personal satisfaction and familiarity, because I don't think hiding it will stop people from changing alts on suspect ladders. When people switch because they are "losing too much", the W/L display doesn't matter whether or not it's visible: players can keep track of their W/L themselves, or just feel bad after 4 consecutive losses and create a new alt out of rage or for whatever other reason. Also, changing alts isn't necessarily related to win/loss, even if that's surely one of the main reasons.
Number of games (one of the current reqs on some ladders, such as OU) can push people to change alts in the same way W/L does: if your GXE is too low after several games, you're unlikely to make reqs within the required number of games, so you just have to make a new alt. I don't think we should remove the number of games played for this reason, though (TLs can't see who actually made reqs that way :x).
In short, I don't think keeping or removing the W/L score will fix the issue, since there are so many ways to keep track of your ladder results. Removing the score will just make people mad because they then have to keep track themselves, while the feature was convenient and not harmful at all. Also, why is making new alts an issue? I just didn't get what the problem is (just curious). If the issue is something like "the server has too many usernames registered", you could auto-erase usernames that haven't been used in some amount of time (a year?) or put a more restrictive limit on registering new usernames (like 1 every week).
 
the problem with glicko is that it's built on a theory which doesn't allow for improvement (basically it assumes that you were equally good for every game you played and it's just learning more about how good you are). if you're not going to let us reset our ratings, at least promise periodic ladder resets.
The fuck? No it doesn't. It assumes you played at the same skill level across a PERIOD, which should be a day, right Zarel? No sane rating system doesn't allow for improvements.
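For anyone unfamiliar with the rating-period idea, here is a minimal sketch of the step in Glicko-1 that lets a rating move again over time: before each period, the rating deviation (RD) is widened so that old results constrain the rating less. The constant `c` and the example numbers are illustrative assumptions; whatever PS actually uses for its period length and constants is not shown here.

```python
import math

MAX_RD = 350.0  # RD of a brand-new player in Glicko-1


def inflate_rd(rd: float, c: float, periods: int = 1) -> float:
    """Glicko-1 pre-period step: RD' = min(sqrt(RD^2 + c^2 * t), 350).

    `periods` (t) is the number of rating periods since the player last played.
    """
    return min(math.sqrt(rd * rd + c * c * periods), MAX_RD)


# Illustrative choice of c: an RD of 50 decays back to 350 after 100 idle periods.
c = math.sqrt((MAX_RD ** 2 - 50.0 ** 2) / 100)

print(inflate_rd(50.0, c, periods=1))
print(inflate_rd(50.0, c, periods=100))
```

Within one period the system treats skill as fixed; across periods the growing RD is what allows for improvement (or rust).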

Yell at Antar then cause he approved on it when we just had our most recent suspect test or do what everyone else does, make a policy review thread.
I remember something about being asked about a suspect test. I totally don't remember anything in it having to do with using W-L. But I've been incredibly sleep-deprived these past few months, so it's entirely possible I missed it.

Stratos, regarding those very pretty MS Paint curves--you can just do that by capping nBattles.


A summary of my previously-stated opinions on everything
  • We should be heavily discouraging the use of W-L as a metric. If idiots want to display it, whatever. But it should not be used for suspect tests.
  • We SHOULD make it easy for players to reset their ratings rather than encouraging the creation of alts. Personally, I believe we should have displayed ratings and "hidden" ratings. Displayed ratings can be used for suspect tests and bragging rights and shit, and those should be resettable on a whim, but we should NEVER reset a player's hidden rating, and that's the rating we should be using for matchmaking* and stats stuff.
*obvs this has problems because Elo point loss/gain is so transparent. So really, probably it's best not to have hidden ratings at all (I can always calc ratings manually for stats, but I don't think it makes too much of a difference in the aggregate).
 

Shaka Brah

Banned deucer.
As we know with things like RNG, broken mons, and a community of varying skill levels that information with truth can be scarce as the unwarranted democracy of the Smogon elite has left many peasants unable to see the light before their eyes. Today, another act of propaganda was eschewed by these elites in the form of misinformation: Have you won or have you lost? In a world of 6.25% to always win or lose, it can be hard to tell. Did you win? Or did you lose? Regardless, these numbers are marked and tallied but are no longer provided. To become a great player... must you win and lose, or merely participate? Is Pokémon itself merely a test?

Participation warrants numbers just as winning or losing does, however, losing provides experience and winning provides trophies and recognition. You must ask yourself: If I participate without winning or losing, will I ever gain trophies or recognition? Can you truly influence the metagame and join the elite to make the 6.25% world a more meaningful place where skill can be displayed? In truth, the answer is no. The same elite in private rooms deciding the fate of your metagame by making gang votes and mafia style hits on influential personalities are the ones who don't want you to win or lose. They want you to participate. To them, you are merely a test, and the 6.25% world is one they will always champion as long as your triumphs go unrecorded.

The final solution is clear as long as this question is answered truthfully, not by the inner breeder or fan inside of you, but by the inner compassionate but hard-driving trainer and competitor you are: Would you rather win, lose, or merely participate? Personally, I'd prefer losing and knowing it and learning than just participating. For if any of us were to just participate, then Pokémon would truly be just merely a test.
 

p2

Banned deucer.
As a careful and respected theologian of pokemon and someone who has grinded serious hours to hone their skills, I want to speak respectably of noobs and represent them here. As you can see by my join date (yup, check it), I've endured the throes of this chaotic society only to shine light on those who feel blinded by the doublespeak of the Smogon world. This is to say, simplicity is bliss.

Simply put, W:L is an obvious statement of how someone is playing. On an alt it can describe a team's experience through testing with two simple numbers. For new players, whom I respectably represent here despite my join date since I believe in their God given rights to sink or swim, I think W:L can be important. Think of yourself at first. Remember when you went 0-30 and almost quit because your DD Nite didn't have multiscale yet? Gen 4 was tough, but the record was clear. Then the next day when you found out Salamence was actually that much better and you didn't have to DD on a Milotic to be successful?

I'm getting caught up in the reverie of my own lore..

W:L is an easy statement to new players and those who frequently use alts to gauge their play on a daily basis. In fact, new and old players alike who frequent new alts can use W:L to know right away, without learning what GXE or %'s are, to see their progress. Today 7-7, tomorrow 8-6. The next day 5-9, then 9-6. Who knows what it might be, but it is an easy-to-read indicator that is important more to new players than old, veteran and truly venerable posters such as myself.

also this, http://vocaroo.com/i/s0xWEGtiHI4p
 
I think it's been established that W/L is, mathematically, not a good representation of skill. The matchmaking system makes sure of that. For this reason, I fully agree that W/L should not be used as a metric in suspect tests at all due to the mathematical utility of GXE. The idea is silly.

However, it is probable that most PS users are not mathematically clued in enough to fully understand the benefits of GXE > W/L. I can testify to that. They might also just not care. I see no issue with displaying it. Perhaps bolding your GXE in the /rank and having some message that appears on hover about its greater mathematical value would convey the same message that a lecture that you have to click through every time you want to see your W/L does.

I have always felt that there should be periodic ladder resets and support the idea, especially if they were to happen every three months when there's a big tier shift.
 

Zarel

Not a Yuyuko fan
Site Content Manager, Battle Simulator Administrator, Programmer, Pokemon Researcher, Administrator
Creator of PS
Stratos, regarding those very pretty MS Paint curves--you can just do that by capping nBattles.
COIL doesn't seem to support that, but that seems like a pretty good idea.

My approach with formulas is always to make the numbers mean something.

So how I think COIL could be configured is by two numbers:
1. F = GXE floor
2. NF = number of battles a player at GXE floor needs to play to qualify

So, if F=75 and NF=100, a GXE 75 player would need 100 battles, a GXE 80 player might need 80ish, a GXE 90 player might need 40ish.

We modify the COIL formula to cap N at NF, which will give us pretty much exactly the second graph Stratos posted.

And then we can plug in GXE=F, N=NF, and solve for B.

Then, tier leaders can announce F and NF and minimum COIL, and then those numbers are a lot more intuitive to the players, too.
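A minimal sketch of that configuration, assuming the commonly cited COIL form COIL = 40 * GXE * 2^(-B/N). The function names, the `min_coil` parameter, and the example target of 2300 are assumptions made for illustration, not values from this thread, so the battle counts it prints will not exactly match the "80ish" and "40ish" guesses above.

```python
import math
from typing import Optional


def solve_b(gxe_floor: float, n_floor: int, min_coil: float) -> float:
    """B such that a player at the GXE floor reaches min_coil after exactly n_floor battles."""
    ratio = 40.0 * gxe_floor / min_coil
    if ratio <= 1.0:
        raise ValueError("min_coil must be below 40 * gxe_floor")
    return n_floor * math.log2(ratio)


def capped_coil(gxe: float, n: int, b: float, n_cap: int) -> float:
    """COIL with the battle count capped, so extra games past n_cap stop helping."""
    if n <= 0:
        return 0.0
    return 40.0 * gxe * 2.0 ** (-b / min(n, n_cap))


def battles_needed(gxe: float, b: float, n_cap: int, min_coil: float) -> Optional[int]:
    """Fewest battles that reach min_coil at this GXE, or None if unreachable under the cap."""
    ratio = 40.0 * gxe / min_coil
    if ratio <= 1.0:
        return None
    # Small tolerance so a player exactly at the floor is not rounded up past the cap.
    n = math.ceil(b / math.log2(ratio) - 1e-9)
    return n if n <= n_cap else None


F, NF, MIN_COIL = 75.0, 100, 2300.0  # MIN_COIL is an illustrative target
B = solve_b(F, NF, MIN_COIL)
for gxe in (74.0, 75.0, 80.0, 90.0):
    print(gxe, battles_needed(gxe, B, NF, MIN_COIL))
```

With the cap in place, F really does act as a floor: a player below it can never qualify no matter how many games they grind, while anyone above it needs fewer than NF battles.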

Stratos said:
@ blarajan: the only uncivility in this thread toward a ps dev has been towards slayer95 and he really brought that one on himself, there's no need for tone policing :|
I've definitely been in way ruder threads (I'm still kind of bitter about the teambuilder tiering one), but in general there's at least a fair amount of entitlement in these sorts of threads.

Pretty much everyone's attitude here is "This is what I want, give this to me" rather than "This is what I want, how can we reach a compromise where everyone's happy?"
 

Zarel

Not a Yuyuko fan
Site Content Manager, Battle Simulator Administrator, Programmer, Pokemon Researcher, Administrator
Creator of PS
Anyway, hi, guys, W/L is no longer on a clickthrough, and there is now a "Reset W/L" button.

Let it be known that I pretty much always give people what they ask for in the end, so if you could just be patient and not be so demanding, that'd be great. <3
 