Proposal: Dissociating Usage and Tiering in DOU

Stratos

Banned deucer.
This proposal is DOU focused because that's my area of expertise but if other tier leaders feel similarly then coolio

I don't know how it is for other tiers, but the ladder stats for DOU are shit. There are like ten Pokemon which shouldn't be DOU and are, like ten Pokemon which should be DOU and aren't, and even the Pokemon which are in the right tier are often massively wrong on how high or low they should be in the tier (Sylveon top 5 while Kyurem-B scrapes in at third to last).

Yes, Gen IV and Gen V had their fair share of bad Pokemon as well but nothing like this. The only reason anyone good ladders is because they feel a responsibility to try and affect the usage stats. It's not enjoyable in the slightest; even though plenty of good people are on the ladder, we're just flooded by bad players anyways so we never see each other. The ladder has been inherently broken by a problem of volume and proportions. It's not fair to good players to use ladder stats for tiering.

When I was new and bad in DPP, the main way I learned the metagame was by going on the smogdex, sorting by OU, and reading all the analyses. I thought the things in OU were better than the things that weren't. Yes, I thought Ninjask was better in OU than Crobat. I'm sure not much has changed since then. Newer players are still in the business of getting misled by usage stats, but now, instead of two or three Pokemon being wrong, more than one fifth of the tier shouldn't be in it and more than one fifth of what should be in it isn't. You can verify for yourself that this is the case because when something drops from a tier its usage instantly goes down from 3.40% to like .5%. It's not fair to newer players who want to get better to use ladder stats for tiering.

We can't just help the ladder stats by weighting them higher. Half of the top spots on the ladder are ladder heroes who only use one team (Lando-T, Heatran, Kangaskhan, Sylveon, Thundurus, filler), so the ladder stats just become ridiculously centralized around those few Pokemon, which are already massively overrated as is (except Lando). Sure, it sends half of my complaint list to DUU, but it also sends Latios, Hydreigon, and Weavile there, and it does nothing for any of the DUU Pokemon which I said should move up.

We can't just help the ladder stats with education. It works in theory but the problem is almost nobody on the ladder wants to be educated.

I think the only solution is not using ladder stats, but I'm not sure what I do want to do. If anyone was around for the old tiering system, I'd love your insights in particular.
 

Bughouse

Like ships in the night, you're passing me by
As Doubles UU is not currently a functional "official" tier, this is largely a moot point right now. Moreover, all the mons are usable in Doubles OU anyway, so it's not as if the bad usage stats are somehow ruining Doubles OU. Nor will moving anything to a different tierlist change the mons people use on the ladder, at least not until Doubles UU becomes a tag on the Pokedex (I mean... Doubles OU isn't even a tag yet) and mons get filtered by Doubles tiers rather than singles tiers in the PS teambuilder, two things that are not coming any time soon.

But I do definitely agree that Doubles OU has a particularly bad problem of viability not equating to usage, and should Doubles UU ever become official, it would be ideal to have ironed out some better solution beforehand because ideally tiers should reflect viability. We use weighted usage to do so in general because democratic systems are good and in theory people will want to use what is most viable. This has clearly failed in Doubles and it would be very silly to present situations where Darkrai is worse than Dragonite and Garchomp (used only once in SPL iirc) is a top 20 Pokemon.

The alternative of course I guess is to just accept it for what it is. Every other tier does that already. Shitmonchan among many other examples exist and it's fair to ask why you think Doubles should be different. I get that we're more isolated and can play around more without affecting other metas, but that's only a reason against not doing it rather than an actual reason FOR doing it.
 

Ununhexium

I closed my eyes and I slipped away...
Bughouse I think the point he's trying to make is that if a new player sees a shit Pokemon on the tier list they'll use it regardless of how good it actually is. It happens all the time. Claydol stayed in RU for ages until it dropped, and now if you look at the current usage

| 61 | Claydol | 1.85211% | 2139 | 1.852% | 1826 | 1.952% |

I think he means it's less a problem of the fact that it's DOU than it is that the shit Pokemon are being used just because they're OU and good Pokemon aren't being used because they're in UU
 

Bughouse

Like ships in the night, you're passing me by
Bughouse I think the point he's trying to make is that if a new player sees a shit Pokemon on the tier list they'll use it regardless of how good it actually is. It happens all the time. Claydol stayed in RU for ages until it dropped, and now if you look at the current usage

| 61 | Claydol | 1.85211% | 2139 | 1.852% | 1826 | 1.952% |

I think he means it's less a problem of the fact that it's DOU than it is that the shit Pokemon are being used just because they're OU and good Pokemon aren't being used because they're in UU
This would be a relevant objection if the teambuilder listed mons by Doubles tiers when you go to make a team. It doesn't.

Nor does the dex have tags that indicate its usage.

A user would have to go look for the Doubles usage thread to find usage if they wanted to inform their teambuilding that way. But considering that, at present, the Doubles usage stats thread has only ~2,000 views while the Doubles Viability Rankings thread has ~10,000 (both posted in their current incarnations within a week of each other), this does not seem a legitimate concern either.

Users are disproportionately using bad mons because they want to. End of story.
 

Stratos

Banned deucer.
Bughouse's point is that since there's no way to see what's DOU vs DUU on the sim (which is actually a big issue of its own but w/e) or in the dex, it's not like having an accurate DOU would change nearly as much as having accurate singles tiers. He's right about this, although I'd prefer to have the framework in place for when we finally do get these things

I have to disagree with what you said about DUU, though. A lot of the reason the tier sees absolutely no play is that the terrible ladder stats left it horribly broken, with some really centralizing things being allowed and other things that would fit well in the meta being left out, so it was little fun to play. The recent round of rises/drops helped a lot when it got rid of Camerupt (i haven't played since so i can't comment on the current meta), but still, between just crossing our fingers and wishing DUU a playerbase while its list of Pokemon is a broken mess and doing something, I'd rather do something.

You say we should just accept it like UU does trevenant or RU does hitmonchan but why should those tiers accept it? No more crossing your fingers every time usage stats come out for the whims of hordes of people using completely unviable shit determining your tier list.

As for why to solve the problem now, I'd rather solve a problem than kick the can down the road for when we do get a DOU tag and people see Dragonite, Whimsicott, and Meowstic but not Darkrai, Jirachi, or Volcarona.




EITHER WAY the point of this thread was mainly to discuss alternate ways of tiering, not to discuss whether or not we should do it.
 

ryan

Jojo Siwa enthusiast
This is a fundamental question of tiering via ladder or through some other means because at the end of the day, enough people at the top of the RU ladder are using Hitmonchan and of the UU ladder are using Trevenant to keep them from dropping. Either they are more viable than people make them out to be or not many good RU or UU players like to ladder. Either way, you can't blame the system for these Pokemon remaining in their respective tiers when usage stats are weighted by rank.

I'd be open to another way of determining tiers if there was a better one, but I don't believe there is. Anybody who wants to have some kind of effect on tiering has that ability. It's just a matter of putting forth the effort.
 

atomicllamas

but then what's left of me?
Pretty much agree with Hollywood: without a proposed alternative to using usage stats there is really not too much I can say, barring that using usage stats is pretty much a necessary evil. Without ladder stats the alternatives all consist of some variant of hand picking which Pokemon are in what tier, which is a pretty terrible idea. On top of the logistical nightmare that is choosing who is picking the Pokemon (not to mention the logistical nightmare that would be picking the Pokemon), personal bias would have an extremely large effect on the tiering of each Pokemon. It would be like the viability ranking threads except it would actually be important (aka a nightmare).

Obviously the ideal solution would be to improve the ladder, which as you already stated is far easier said than done. One way to improve the stats that is actually quite easy is holding a suspect test, as it increases the number of good players on the ladder. During an RU suspect test, the 1630 stats (the ones used for tiering) are about as good as or better than the normal 1760 stats (the top line). As such, I've been trying to time it so RU has suspect tests in the third month, so more people are interested in the tier and thus give a better idea of RU for the month that counts as 83.3% of the weight; unfortunately they've tended to end up in the second month, so rip. Obviously this only works if your tier has something in need of testing though, as you can't test drops in the last month without getting usage stats for a different metagame, rip.
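To make the "83.3% of the weight" point concrete, here is a rough sketch of how a three-month usage figure could be combined. Only the newest month's share (20/24, roughly 83.3%) comes from the post above; the 1 and 3 used for the two older months, and the function name `combined_usage`, are assumptions made purely for illustration.

```python
# Hypothetical month weights, oldest to newest. Only the newest month's
# share (20/24, about 83.3%) matches the figure quoted in the thread;
# the 1 and 3 are an assumed split of the remainder.
MONTH_WEIGHTS = (1, 3, 20)


def combined_usage(monthly_usage, weights=MONTH_WEIGHTS):
    """Weighted average of per-month usage percentages, oldest first."""
    if len(monthly_usage) != len(weights):
        raise ValueError("need one usage figure per weighted month")
    total = sum(weights)
    return sum(u * w for u, w in zip(monthly_usage, weights)) / total


# Made-up figures: a Pokemon at 1% usage for two months, then 4% in the
# final month (say, during a suspect test).
print(combined_usage([1.0, 1.0, 4.0]))  # 3.5
```

With a split like this the final month dominates the result, which is why the timing of a suspect test matters so much.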

Another solution would be to increase the baseline weight for tiering, but apparently this makes the game sample size too small, or something (idk, but the stats are always 300x better in the 1760 stats).

But yeah I actually don't think there is a good alternative to tiering via usage stats.
 

Oglemi

Borf
Another solution would be to increase the baseline weight for tiering, but apparently this makes the game sample size too small, or something (idk, but the stats are always 300x better in the 1760 stats).
TBF, any sample size from any baseline nowadays is going to be significantly, and I mean really significantly, larger than what our entire tiering was based on back in pre-2010. I find any arguments about the sample size being too small to base tiering on kinda laughable when looking at it in that kind of retrospect. I think if further increasing the baseline helps better reflect what the metagame should or does look like at a higher level, we should definitely strongly consider doing so sooner rather than later.
 

MikeDawg

Banned deucer.
Why not keep the usage system, but allow for leader-approved switchups (hitmonchan may be ru, but it's been decided that it is an objectively poor fit for the tier, so it can be manually dropped)?

There really aren't /that/ many pokemon that are trash viability-wise but still used enough to keep them in the tier. A few exceptions in the lower tiers wouldn't kill anybody, especially if everyone involved is on board.

As far as usage stats misrepresenting viability in the tier itself (kyurem and sylveon), isn't that what the viability rankings are for? People are going to use what they want to use, but if they are looking to a resource, I don't see why they can't just use viability rankings over /stats (is this an issue? More attention in the roomintro is helpful with that. There was definitely similar discussion held a long while ago about whether viability rankings should somehow be incorporated into showdown).
 

Bummer

Jamming to the beat
If you want to inform the players of what constitutes a good and a bad pick, then on-site analyses and viability rankings are the way to go. Write DOU analyses even for the bad ones so that you can write about how badly they perform.

If the main concern is how bad mons show up in the DOU selection on PS, then the only other solution would be to create another section below DOU and above BL where those struggling mons would be listed (a reverse BL, if you will). "Introducing Doubles OverRated! We don't use 'em, and neither should you!"
 

Bughouse

Like ships in the night, you're passing me by
If you want to inform the players of what constitutes a good and a bad pick, then on-site analyses and viability rankings are the way to go. Write DOU analyses even for the bad ones so that you can write about how badly they perform.

If the main concern is how bad mons show up in the DOU selection on PS, then the only other solution would be to create another section below DOU and above BL where those struggling mons would be listed (a reverse BL, if you will). "Introducing Doubles OverRated! We don't use 'em, and neither should you!"
I can say this one more time lol. There IS no Doubles option in team selection. Mons are listed by singles tiers. This almost certainly hurts Darkrai and Skymin and helps Dragonite, etc.

As to your other point, I don't think analyses have even a minimal impact on ladder usage. We do try to make sure the tone is appropriate to not over/undersell anything. But the real proof is that our good EV spreads get used only by a tiny minority. It doesn't seem that would be a fruitful change to make. In any case we DO have analyses for most of these "bad" mons. And in numerous cases we are already working on tweaking them to make sure the lower viability is perceived (just as good policy... Again I don't think this will change a thing on the ladder)

And we also DO have good viability rankings. They don't get viewed much at all relative to the number of battles the ladder sees.

The long and short of it is that there's not much if any obvious improvement to education/resources that we could do that would have any impact. This is probably true of every tier. We all work hard to put out good materials.

But it didn't work. Now there really just are two options: suck it up and go by usage, or try something else.
 
Wow. Haven't had one of these threads in a while.
  1. Stratos, tiers are not viability rankings. They are threatlists. If you can go 20 battles in the metagame and, more likely than not, NEVER SEE that Pokemon, it doesn't matter if it's S-rank: IT'S NOT AN IMPORTANT THREAT. More on this.
  2. atomicllamas, we should absolutely not raise the baseline for DOU any higher than 1695. I wouldn't be okay doing that for SINGLES OU. The net effect of raising it higher is two-fold: the top players have an inordinate amount of influence on the usage stats, leaving us vulnerable to Molk-ing; on the other side, it actually INCREASES the influence of bad players, as they are now on equal footing, stats-wise, as great-but-not-excellent players. More on this.
This is the first I've heard in years about our tiers "being shit." In fact, I've been hearing a lot of excitement about bad stuff finally falling. If tiers are legitimately shit, then tier leaders need to be coming to me privately, because it could be another bug in our rating system or even in the way I'm collecting stats.
 

atomicllamas

but then what's left of me?
Wow. Haven't had one of these threads in a while.
  1. Stratos, tiers are not viability rankings. They are threatlists. If you can go 20 battles in the metagame and, more likely than not, NEVER SEE that Pokemon, it doesn't matter if it's S-rank: IT'S NOT AN IMPORTANT THREAT. More on this.
  2. atomicllamas, we should absolutely not raise the baseline for DOU any higher than 1695. I wouldn't be okay doing that for SINGLES OU. The net effect of raising it higher is two-fold: the top players have an inordinate amount of influence on the usage stats, leaving us vulnerable to Molk-ing; on the other side, it actually INCREASES the influence of bad players, as they are now on equal footing, stats-wise, as great-but-not-excellent players. More on this.
This is the first I've heard in years about our tiers "being shit." In fact, I've been hearing a lot of excitement about bad stuff finally falling. If tiers are legitimately shit, then tier leaders need to be coming to me privately, because it could be another bug in our rating system or even in the way I'm collecting stats.
I don't know about DOU's ladder stats, given that I take almost 0 interest in non-RU (or UU) ladder stats, but using the 1630 stats is the equivalent of having ~14000 games per month (took the number of games * the average game weight, correct me if I'm wrong). When the "molking" occurred with Metang in RU, there were slightly over 6000 ladder games per month (the stats were also completely unweighted). In the same month as the 14000 games @ 1630 weighting, RU had almost 3000 games using the 1760 weighting, which is slightly under half of what occurred when Metang rose to RU that fateful July. While it is true that half the games means that you only have to play half as many games (which to get Metang to RU was still ~850 games in a month, a ridiculous amount), you also have to be good enough to do so at a level greater than 1760 glicko2. On top of this, if a Pokemon is manipulated into being in a tier through spamming at the top level, then by the same logic as you dismissed Pwnemon's Kyurem-B argument (it's not being used, it's not a threat!) I can dismiss your argument that "usage manipulation" is a problem (Metang is being used during high level play, so it is a threat, even if it is an E rank Pokemon in tier X). That's in quotes because actually using a Pokemon is different than going to team preview and forfeiting like was done w/ Emboar, Amoonguss, Absol, and Cinccino. Not even saying that we need to raise the baseline from 1630 to 1760, but is there any reason to not raise the baseline at all? We've already determined that we don't want non-competitive players determining our tier and there are quite a few non-competitive players hovering around a glicko2 of 1630 in RU.
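The "number of games * the average game weight" figure above is just the sum of the per-game weights. A small sketch of that arithmetic, with made-up weights (the real per-game weights are not given in this thread, and both function names are hypothetical):

```python
def effective_games(game_weights):
    """Weighted game count: the sum of every game's weight."""
    return sum(game_weights)


def effective_games_from_average(num_games, average_weight):
    """Same quantity, computed as number of games * average game weight."""
    return num_games * average_weight


# Made-up weights for four ladder games, each between 0 and 1.
weights = [1.0, 0.5, 0.25, 0.25]
average = sum(weights) / len(weights)

print(effective_games(weights))                             # 2.0
print(effective_games_from_average(len(weights), average))  # 2.0
```

Both routes give the same number, so ~14000 at the 1630 weighting and ~3000 at the 1760 weighting can be read as weighted game counts rather than raw ones.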

Also people being excited about stuff finally dropping isn't a sign that the ladders aren't shit. The number one unweighted RU mon by usage is Ambipom, an E rank threat, and every month people are like "Hitmonchan finally gonna drop next month!" only for it to see enough usage in the third month to remain RU. It also took the entirety of XY RU for Claydol to drop, not very efficient considering Claydol was the second worst RU pokemon during all of XY (only better than Chan, lol).

I guess the summary of my points is: 1) by your own logic spamming isn't really a problem, seeing as a Pokemon being spammed is a "threat" as you've defined it and deserves to be in that tier; 2) we already have a (completely arbitrary) cut off for what we determine is a competitive Pokemon player, and we should be able to raise that cut off if we find that it is still too low (to any number, doesn't have to be all the way to 1760); 3) while ladder stats have improved in gen 6 due to the fact we've switched to weighted, they still aren't at all reflective of the tiers in the competitive tournament scene. I personally think this is a problem, others may not.
 
I kind of don't see the issue in Doubles though? Usage won't affect what Pokemon are allowed in Doubles so regardless of the usage stats, the tier remains the same? You want the usage stats to mirror the viability rankings and that's not only unattainable but also kind of naive. This only matters if Doubles UU is a thing, which it's not.
 
is there any reason to not raise the baseline at all? We've already determined that we don't want non-competitive players determining our tier and there are quite a few non-competitive players hovering around a glicko2 of 1630 in RU.
I'm going to ignore the rest of your post--you are right about Metang being a "threat" by my logic, which is why it went into RU for that three-month period--but for this one, I'm going to reference the SECOND part of my reply on the subject, and I urge you to actually follow the link and view the graph.

on the other side, it actually INCREASES the influence of bad players, as they are now on equal footing, stats-wise, as great-but-not-excellent players. More on this.
 

atomicllamas

but then what's left of me?
I'm going to ignore the rest of your post--you are right about Metang being a "threat" by my logic, which is why it went into RU for that three-month period--but for this one, I'm going to reference the SECOND part of my reply on the subject, and I urge you to actually follow the link and view the graph.
I mean yes I understand that a new player has about the same weight as someone who has a glicko rating of 1695 with a deviation of 25, but the 1760 stats are much more representative of the competitive scene regardless, due to the fact both have even less weight in this system. Also a deviation of 25 is incredibly low; I have an alt with ~50 games and the deviation is 58 and has fallen maybe one point in the last 3 games. On top of that, I never said raising it to 1760 was my intention; what if we raised it to 1695? We recently implemented a new coil system in RU and I doubt someone with a glicko score of 1630 could even qualify for reqs (given it roughly correlates with GXE), so why are we having people we don't trust with voting reqs have "full" weight in the usage stats? These players aren't "good but not great"; they are actually really mediocre. So while the graphs say one thing, the usage stats say another. I guess I also don't understand how you chose to measure the candles with these specific weights in the first place? Like why was the choice for lower tiers either 1630 or 1760? Was 1695 considered or not? Maybe somewhere in between could have been the optimum in terms of balancing games played and getting accurate stats.
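For anyone trying to follow the rating/deviation argument, here is a minimal sketch of one way a player's weight could depend on rating, deviation, and baseline. It assumes the weight is the probability that a normally distributed "true" rating sits above the baseline; that formula and the name `rating_weight` are assumptions for illustration, not something confirmed in this thread.

```python
import math


def rating_weight(rating, deviation, baseline):
    """Assumed weighting: P(true rating > baseline) for a normal
    distribution centred on `rating` with spread `deviation`."""
    z = (rating - baseline) / (deviation * math.sqrt(2))
    return 0.5 * (1.0 + math.erf(z))


# A player sitting exactly on the baseline gets half weight under this model.
print(rating_weight(1760, 25, 1760))  # 0.5

# Same 1695 rating measured against a 1760 baseline, with the two
# deviations mentioned above (25 and 58). The less certain rating
# ends up with the larger weight.
print(rating_weight(1695, 25, 1760))
print(rating_weight(1695, 58, 1760))
```

Under this assumed model, a high deviation props up the weight of anyone rated below the baseline, which is the effect being argued about when the baseline is raised.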
 

Stratos

Banned deucer.
i should probably respond at some point

This only matters if Doubles UU is a thing, which it's not.
doubles UU is a thing though; it's not an official meta but it has a ladder and a small playerbase. Besides as I said earlier, I want to have a good tiering system in place before it becomes a dire necessity.

i want to make it clear that the problem isn't that the doubles ou 1695 stats aren't representing the viability rankings but that they simply aren't representing what's used in higher level play. level 51 is kind enough to collect stats from the seasonal tournaments we run, and when you compare them with the ladder some things really don't match up. In the ladder, Kyurem-B is 46th; in the seasonal it's 8th. Hydreigon also goes from 33rd to 14th. Rotom-H, Jirachi, and Victini all rise at least fifty places from the ladder to seasonal stats; Tyranitar, Zapdos, Excadrill, Hitmontop, and Scizor fall more than 20 apiece while Garchomp and Whimsicott drop by 50+. Pokemon like Greninja, Dragonite, Meowstic, and Dusclops, all of which have >3% usage on the ladder, are completely unused.
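The ladder-versus-seasonal comparison above boils down to a rank difference per Pokemon. A quick sketch using only the two exact pairs of ranks quoted in this post (the function name `rank_shifts` is made up, and the other Pokemon are left out because only approximate shifts are given for them):

```python
# Ranks quoted in the post: ladder position vs seasonal tournament position.
ladder_rank = {"Kyurem-B": 46, "Hydreigon": 33}
seasonal_rank = {"Kyurem-B": 8, "Hydreigon": 14}


def rank_shifts(ladder, tournament):
    """Places gained going from ladder to tournament stats.
    Positive means the Pokemon ranks higher in tournaments."""
    return {
        name: ladder[name] - tournament[name]
        for name in ladder
        if name in tournament
    }


shifts = rank_shifts(ladder_rank, seasonal_rank)
for name, shift in sorted(shifts.items(), key=lambda item: -item[1]):
    print(f"{name}: {shift:+d} places")
# Kyurem-B: +38 places
# Hydreigon: +19 places
```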

Unfortunately, though the tournament stats are pretty good, they suffer from their own limitations that make them unfit to be used for tiering.

srk put it best, we have two options: suck it up and accept that our ladder stats and thus tiering are poorly representative of the metagame, or do something else. Personally I like MikeDawg's suggestion a lot. We default to the ladder stats, but if a supermajority (say 80%) of the tiering council thinks that a Pokemon should be dropped to the next lower tier, then it is. This gives us an unambiguous default position which can be used to settle any disputes, while still allowing for a Pokemon to drop a tier when it clearly doesn't deserve to be in the tier. It also still works for any singles tiers, where the relationship between upper and lower tiers is less flexible.
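The rule proposed here is simple enough to write down. This is only a sketch of the idea as stated: the function name `final_tier`, the tier labels, and treating "80%" as "at least 80%" are all assumptions.

```python
from fractions import Fraction


def final_tier(ladder_tier, lower_tier, votes_to_drop, council_size,
               threshold=Fraction(4, 5)):
    """Default to the ladder-based tier; drop one tier only if a
    supermajority of the council votes for it (assumed: >= 80%)."""
    if council_size <= 0:
        raise ValueError("council must have at least one member")
    if Fraction(votes_to_drop, council_size) >= threshold:
        return lower_tier
    return ladder_tier


print(final_tier("DOU", "DUU", votes_to_drop=4, council_size=5))  # DUU
print(final_tier("DOU", "DUU", votes_to_drop=3, council_size=5))  # DOU
```

Using exact fractions avoids a 4-of-5 vote falling short of 80% through floating point rounding. Note the override only ever moves a Pokemon down, so the ladder stats remain the default in every other case.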
 

Haruno

Skadi :)
doubles UU is a thing though; it's not an official meta but it has a ladder and a small playerbase. Besides as I said earlier, I want to have a good tiering system in place before it becomes a dire necessity.
Relatively fair, but if doubles UU has an almost nonexistent playerbase and with doubles ou players showing minimal/no interest, why is this a concern? Doubles UU is something that most certainly won't happen for a while (if it even happens at all) so while it might be useful for "long term" usefulness, at the moment there is zero practical use in this.

i want to make it clear that the problem isn't that the doubles ou 1695 stats aren't representing the viability rankings but that they simply aren't representing what's used in higher level play. level 51 is kind enough to collect stats from the seasonal tournaments we run, and when you compare them with the ladder some things really don't match up. In the ladder, Kyurem-B is 46th; in the seasonal it's 8th. Hydreigon also goes from 33rd to 14th. Rotom-H, Jirachi, and Victini all rise at least fifty places from the ladder to seasonal stats; Tyranitar, Zapdos, Excadrill, Hitmontop, and Scizor fall more than 20 apiece while Garchomp and Whimsicott drop by 50+. Pokemon like Greninja, Dragonite, Meowstic, and Dusclops, all of which have >3% usage on the ladder, are completely unused.
What exactly is the problem with ladder not demonstrating higher level play? Ladder across various tiers in singles and doubles are known for being god awful and are something that is mostly used for a quick way to test teams (questionable in itself unless you're a god like whitequeen using ladder as your sole testing ground for tours) or for the general public to enjoy a quick game or two. Unless you can provide a specific reason on why ladder should demonstrate what is used in higher level play as opposed to something used for tiering purposes (the term OU itself and what mons are differentiated in it are explained in and out of itself, the most overused pokemon will be OU whether it be fan favorites, actual good mons, etc).

This is disregarding how you're using incredibly skewed results based off of tournament play (as much as I respect level 51's herculean work): you do not account for a supermajority of the usage stats in actual tour games, since you completely ignored the first seven rounds, so the actual credibility of his work is questionable in itself. Notice how tours like suspect tours, spl, grand slam, etc use usage stats for all games played regardless of how poor some matches might be (early rounds in suspect tours, games played after guaranteed in/out of playoffs for spl, etc). Disregarding so many games simply makes level 51's statistics useless as far as judging tournament play, and this isn't even considering how you make a very fallacious assumption that tournament play and high level ladder play must correlate when multiple tiers (including DOU) show that this isn't the case.

In fact I would go as far as to say that tournament play and ladder play are two completely different things, I'm unsure of how things are in dubs since I absolutely hate ladder, but a trend in OU would be that gothitelle stall is significantly better on ladder since it deals well with the ladder trends but does relatively poorly in tours due to offense being king there which that playstyle struggles against. So tournament/high level play and ladder playing should not be intertwined and be correlative of each other. For all you know, those mons that work well in tours are absolutely garbage on ladder.

srk put it best, we have two options: suck it up and accept that our ladder stats and thus tiering are poorly representative of the metagame, or do something else. Personally I like MikeDawg's suggestion a lot. We default to the ladder stats, but if a supermajority (say 80%) of the tiering council thinks that a Pokemon should be dropped to the next lower tier, then it is. This gives us an unambiguous default position which can be used to settle any disputes, while still allowing for a Pokemon to drop a tier when it clearly doesn't deserve to be in the tier. It also still works for any singles tiers, where the relationship between upper and lower tiers is less flexible.
Unfortunately, you are absolutely forced to suck it up since time has shown that when council is given control over potential bans (a rise is equivalent to a ban for anything bar OU and a drop is equivalent to making that mon less used in the metagame; see gallade in RU and it becoming NU for details) then shit ensues since it gives far too much power to the council and this is a sure fire way to fuck PR which I'm unsure if doubles OU really cares about as a whole. However even if doubles OU disregards this completely, you cannot justify it for other tiers such as UU/RU/NU since again there are a shit ton of mons that a respective council might deem as unhealthy and allowing the council to ban mons without a suspect test/explanation is madness. If anything, your proposal causes more potential problems than what it would fix.
 

Stratos

Banned deucer.
What exactly is the problem with ladder not demonstrating higher level play? Ladder across various tiers in singles and doubles are known for being god awful and are something that is mostly used for a quick way to test teams (questionable in itself unless you're a god like whitequeen using ladder as your sole testing ground for tours) or for the general public to enjoy a quick game or two. Unless you can provide a specific reason on why ladder should demonstrate what is used in higher level play as opposed to something used for tiering purposes (the term OU itself and what mons are differentiated in it are explained in and out of itself, the most overused pokemon will be OU whether it be fan favorites, actual good mons, etc).

This is disregarding how you're using incredibly skewed results based off of tournament play (as much as I respect level 51's herculean work): you do not account for a supermajority of the usage stats in actual tour games, since you completely ignored the first seven rounds, so the actual credibility of his work is questionable in itself. Notice how tours like suspect tours, spl, grand slam, etc use usage stats for all games played regardless of how poor some matches might be (early rounds in suspect tours, games played after guaranteed in/out of playoffs for spl, etc). Disregarding so many games simply makes level 51's statistics useless as far as judging tournament play, and this isn't even considering how you make a very fallacious assumption that tournament play and high level ladder play must correlate when multiple tiers (including DOU) show that this isn't the case.

In fact I would go as far as to say that tournament play and ladder play are two completely different things, I'm unsure of how things are in dubs since I absolutely hate ladder, but a trend in OU would be that gothitelle stall is significantly better on ladder since it deals well with the ladder trends but does relatively poorly in tours due to offense being king there which that playstyle struggles against. So tournament/high level play and ladder playing should not be intertwined and be correlative of each other. For all you know, those mons that work well in tours are absolutely garbage on ladder.
"why should higher level play be used for tiering?" this is actually what you just asked. you actually just based your entire argument on the premise that we should intentionally use lower-quality matches as the basis for tiering. you acknowledged that the ladder is noticeably worse than tournaments at playing the game and concluded that of the two mediums, the one that does not accurately demonstrate what the metagame looks like at the high level is an accurate method of tiering.

I've always been of the impression that it's smogon's aim to cater to the high level of play. That's the line that's always used during suspect tests. Why should our tiering be any different? If a pokemon is never seen at the high level, i don't think it belongs in OU.

p.s. the reasoning for using only top 32 is that the influx of new players who had no DOU teams of their own gave unreasonably high usage to the two sample teams in the DOU forum in the earlier rounds.

Unfortunately, you are absolutely forced to suck it up since time has shown that when council is given control over potential bans (a rise is equivalent to a ban for anything bar OU and a drop is equivalent to making that mon less used in the metagame; see gallade in RU and it becoming NU for details) then shit ensues since it gives far too much power to the council and this is a sure fire way to fuck PR which I'm unsure if doubles OU really cares about as a whole. However even if doubles OU disregards this completely, you cannot justify it for other tiers such as UU/RU/NU since again there are a shit ton of mons that a respective council might deem as unhealthy and allowing the council to ban mons without a suspect test/explanation is madness. If anything, your proposal causes more potential problems than what it would fix.
if u actually read my post you'd see i didn't propose giving power over rises (not to mention, even if i did propose that (which i intentionally didn't), uu has been doing a council system for a couple years and it's fine), but im pretty sure u just wanted to make a snippy comment about how i suck at PR
 
I've always been of the impression that it's smogon's aim to cater to the high level of play.
No. Smogon's tiers are designed to reflect the experience of the "average competitive player." We DO NOT cater to the "high level of play" in tournaments, otherwise we would use tournament stats for tiering, and it's my understanding that tournament usage differs significantly from our usage stats for EVERY metagame, not just DOU.

This is the correct course of action. Tournament players are not interested in influencing the metagame--their goal is to destroy it, to find a weakness and exploit it, and thus win the tournaments. Otherwise, tournament players wouldn't be so secret about their sets and wouldn't do their playtesting on non-main servers.

So no, we don't base our tiers on what tournament players do. We base our tiers on what the "average competitive player" experiences on the ladder and what they need to prepare for. I played maybe a dozen matches on the DOU ladder a week or two ago, and based on that--albeit small--sample, the tiers seemed to match up pretty well with what I encountered (this was using that team you made me, Stratos, and it's still doing fine, FYI). Tiers are threatlists, designed to help the average competitive player with teambuilding. Tournament players have completely different requirements--going 19 and 1 for a ladder player would be amazing; for a tournament player, that means failure--and catering to them in our tiering would be moronic.

ALL THAT BEING SAID, Arcticblast asked me to look into some stuff regarding the ladder, and it's entirely possible there are issues with the stats stemming from:
  1. The ladder being more "luck-sensitive" than Singles ladders. This could throw off Glicko ratings.
  2. Early forfeits throwing things off. For singles battles, I discard all battles lasting fewer than six turns. For Doubles, I obviously can't do the same. So it could be that early forfeits are throwing off the stats.
But, again, in my limited experience, the stats seem to line up with what I experienced. So I really think this is Stratos trying to get the tiers to do something they weren't--and shouldn't be--designed to do.
 
Update: the rating system is fine.

Using my own (offline) implementation of Glicko, actual win rate matches expected win rate as well as can be expected:

[Figure: actual win rate vs. expected win rate (upload_2015-6-23_13-25-37.png)]


This was over the entire DOU suspect test.

I've also validated this for PS' Glicko implementation.
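For readers who want to see what a check like this involves, here is a minimal sketch, not the script actually used for the plot above. It assumes the standard Glicko-1 expected-outcome formula (with both players' rating deviations combined) and a hypothetical input format of `(r1, rd1, r2, rd2, score)` tuples. The function names and the bin count are made up for illustration.

```python
import math
from collections import defaultdict

Q = math.log(10) / 400  # Glicko-1 scaling constant


def g(rd):
    """Glicko-1 attenuation factor: shrinks the rating gap as uncertainty grows."""
    return 1.0 / math.sqrt(1.0 + 3.0 * Q ** 2 * rd ** 2 / math.pi ** 2)


def expected_win_rate(r1, rd1, r2, rd2):
    """Predicted chance that player 1 beats player 2 (Glicko-1, combined RDs)."""
    combined_rd = math.sqrt(rd1 ** 2 + rd2 ** 2)
    return 1.0 / (1.0 + 10 ** (-g(combined_rd) * (r1 - r2) / 400.0))


def calibration_table(battles, n_bins=10):
    """Group battles by expected win rate and compare it with the actual result.

    battles: iterable of (r1, rd1, r2, rd2, score), where score is 1.0 if
    player 1 won, 0.0 if they lost and 0.5 for a tie (hypothetical format).
    Returns {bin_index: (mean_expected, mean_actual, n_battles)}.
    """
    bins = defaultdict(lambda: [0.0, 0.0, 0])
    for r1, rd1, r2, rd2, score in battles:
        e = expected_win_rate(r1, rd1, r2, rd2)
        idx = min(int(e * n_bins), n_bins - 1)  # keep e == 1.0 in the last bin
        bins[idx][0] += e
        bins[idx][1] += score
        bins[idx][2] += 1
    return {
        idx: (sum_e / n, sum_s / n, n)
        for idx, (sum_e, sum_s, n) in sorted(bins.items())
    }
```

If the ratings are well calibrated, the mean expected and mean actual values in each bin should sit close together, which is what "actual win rate matches expected win rate" amounts to.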
 

Stratos

Banned deucer.
in that case, the "average" dou ladder player is retarded and the ladder stats are retarded so i guess theyre doing their job. i just misunderstood what usage tiers were supposed to do.

that being said the line "Tournament players are not interested in influencing the metagame--their goal is to destroy it, to find a weakness and exploit it, and thus win the tournaments. Otherwise, tournament players wouldn't be so secret about their sets and wouldn't do their playtesting on non-main servers." seems like a drastic misunderstanding of the effect tournament players have on the metagame. idk why you dont respect tournament players when we're usually the guys who push the metagame forward. nothing makes a tournament player more proud than seeing a set of his become standard or a pokemon he likes become relevant—like arcticblast with subcube, tgmd with azu, myself with wp wg aegis; there's a reason we all love to argue over who invented what. even in the case where we steal a set that we found on the ladder, it doesn't become accepted as a part of the metagame until it sees success in tournaments, such as my use of cm cress in spl or kamikaze17's use of seismic toss kang in seasonals. to say that tournaments isnt the place to learn about the meta is horribly wrong.
 

Darkmalice

Level 3
is a Tiering Contributor, is a Site Content Manager Alumnus, is a Forum Moderator Alumnus, is a Top Contributor Alumnus
ALL THAT BEING SAID, Arcticblast asked me to look into some stuff regarding the ladder, and it's entirely possible there are issues with the stats stemming from:
  1. The ladder being more "luck-sensitive" than Singles ladders. This could throw off Glicko ratings.
  2. Early forfeits throwing things off. For singles battles, I discard all battles lasting fewer than six turns. For Doubles, I obviously can't do the same. So it could be that early forfeits are throwing off the stats.
But, again, in my limited experience, the stats seem to line up with what I experienced. So I really think this is Stratos trying to get the tiers to do something they weren't--and shouldn't be--designed to do.
Would it be possible to do the same for DOU, but with a lower turn count? Say any battles lasting fewer than 3 turns? I don't expect it to make a huge difference, but it might make a small difference, and often a small difference is all that is needed to push a Pokemon down to UU or up to OU.

3 is just a number I thought would be reasonable - 6 divided by the number of Pokemon a player sends out on the field.
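A rough sketch of what a turn cutoff does to usage numbers, assuming a hypothetical battle record with a `turns` count and a `teams` list. This is plain unweighted usage (share of teams carrying each Pokemon), not the rating-weighted stats actually used for tiering, and `usage_stats` is a made-up name.

```python
from collections import Counter


def usage_stats(battles, min_turns=0):
    """Unweighted usage: fraction of teams that carry each Pokemon.

    battles: iterable of dicts like
        {"turns": 8, "teams": [["Heatran", "Sylveon"], ["Kyurem-Black"]]}
    (hypothetical format). Battles shorter than min_turns are skipped.
    """
    counts = Counter()
    total_teams = 0
    for battle in battles:
        if battle["turns"] < min_turns:
            continue  # treat as an early forfeit and ignore it
        for team in battle["teams"]:
            total_teams += 1
            counts.update(set(team))  # count each Pokemon once per team
    if total_teams == 0:
        return {}
    return {mon: n / total_teams for mon, n in counts.items()}
```

Running it once with `min_turns=0` and once with `min_turns=3` on the same battles shows which Pokemon shift, and that difference is the thing that matters for anything sitting near a tier cutoff.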
 
Stratos, I wasn't dissing tournament players. Tournament players are unquestionably the best. But they don't influence the metagame, in terms of usage, playstyle and trends, since they don't share their sets until after the completion of the tournaments.

Darkmalice, yeah, I could use three turns (instead of no turns, which is what I have now), and that would be consistent with what I do for singles 6v6 tiers, and really I should, for consistency, but as I reported in the post down from the one you quoted, there's no need--the rating system (and, ergo, the stats) are fine.
 

Arcticblast

Trans rights are human rights
is a Forum Moderator, is a Tiering Contributor, is a Social Media Contributor Alumnus, is a Senior Staff Member Alumnus, is a Community Contributor Alumnus, is a Battle Simulator Moderator Alumnus, is a Past SPL Champion
Stratos, I wasn't dissing tournament players. Tournament players are unquestionably the best. But they don't influence the metagame, in terms of usage, playstyle and trends, since they don't share their sets until after the completion of the tournaments.
Some of the sets considered metagame-defining in the tournament scene such as Substitute Kyurem-B and Calm Mind Cresselia (the latter increasingly often seen on the ladder) began to influence the metagame almost as soon as they were introduced - but only among those who were aware of them. I think a part of the plight of Doubles stems from many ladder players not being "in tune" with what the top players are using. I've had a few ideas on how to address this, but when many of said ladder players don't use Smogon it's difficult. Adding Doubles tiers to the teambuilder is something Zarel would like to do but has been unable to execute, starting a regular sort of blog is currently being held up by The Smog's lack of an as-generated release format (skylight the fate of a metagame could be resting in your hands right now), and putting more aids on the forum hasn't yielded the results we'd like because of their low visibility; even the regulars in the PS room rarely use the forum.

I've mentioned before that I disagree with Stratos' proposal, although I also disagree with Antar - I do think there is a problem with the ladder statistics (not a problem with the method used to gather them per se) but I don't think manually dropping things is the way to go.

Darkmalice, yeah, I could use three turns (instead of no turns, which is what I have now), and that would be consistent with what I do for singles 6v6 tiers, and really I should, for consistency, but as I reported in the post down from the one you quoted, there's no need--the rating system (and, ergo, the stats) are fine.
when you run the stats for June, could you run an extra set cutting off games under 3 turns and see if there are any significant changes? Sorry to ask so much of you lately D:
 
