Voting Threshold for Past Gen Votes

Hogg · Jun 23, 2021

We have been seeing a significant uptick in past gen votes lately. As most of you who have been following this forum know, we recently unlocked past gen lower tiers. However, when we did so, we never established a voting threshold for such votes. As a result, the required threshold for votes has been all over the place. I’d like to correct that.

For current gens, the threshold for a vote is 60% for OU and other static tiers, or 50%+1 for usage-based lower tiers. The reasoning behind lower tiers requiring only a simple majority to enact a tiering change was that because they experience frequent significant changes due to tier shifts, a lower threshold for tiering actions was necessary. (Ubers, meanwhile, requires a two-thirds majority or 66% for any tiering changes, under the argument that bans should only occur in Ubers under exceptional circumstances and with the backing of a clear majority of the community.)
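To make the current gen thresholds above concrete, here is a minimal sketch in Python. The category labels (`static`, `lower`, `ubers`) and the function name `ban_passes` are hypothetical names chosen for illustration. The sketch also assumes that "50%+1" means a strict majority of votes cast and that the 60% and two-thirds thresholds are met when the ban share is at least that value; neither reading is spelled out in the policy text above.

```python
from fractions import Fraction

# Thresholds as described in the post. The category names are hypothetical labels.
SUPERMAJORITY = {
    "static": Fraction(3, 5),  # OU and other static tiers: 60%
    "ubers": Fraction(2, 3),   # Ubers: two-thirds majority
}


def ban_passes(ban_votes: int, total_votes: int, category: str) -> bool:
    """Return True if a ban vote meets the threshold for the given category."""
    if total_votes <= 0 or not 0 <= ban_votes <= total_votes:
        raise ValueError("vote counts must satisfy 0 <= ban_votes <= total_votes, total_votes > 0")
    if category == "lower":
        # Assumption: "50%+1" is read as a simple (strict) majority.
        return 2 * ban_votes > total_votes
    # Assumption: the threshold is inclusive (exactly 60% passes).
    return Fraction(ban_votes, total_votes) >= SUPERMAJORITY[category]
```

Exact fractions are used instead of floats so that a vote sitting right on a boundary (for example 2 of 3 in Ubers) is not affected by rounding.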

Since the unlocking of past gen lower tiers, it seems that most votes have been occurring using these same thresholds. However, because tier shifts are no longer occurring and these are well-established metagames at this point, I believe it is appropriate to raise the voting threshold for past gen lower tiers to 60%.

I’m open to discussion on this, so I’ll leave this thread open - others I have spoken to have proposed raising this even higher, to the 66% required for Ubers, since past gen tiering changes should be exceptional events. However, there are a number of ongoing discussions and votes (and I anticipate more in the near future), and I’d like to make sure that they are handled consistently right away. Therefore, after discussing this with the past gen tiering leaders, this 60% threshold for past gen lower tiers will become the policy effective immediately.

Some things I’d like to see discussed:

Should past gen OU tiers (and other static tiers such as LC and Monotype) continue using this 60% figure (the same threshold as current gen bans), or should we consider raising this to ⅔ majority?
Is this 60% figure for past gen lower tiers appropriate, or should they either revert back to 50%+1 or rise to a ⅔ majority requirement?
If a past gen tests unbanning something that was initially banned by council vote and never received a community vote, should the voting threshold change?

Adaam · Jun 23, 2021

Gonna keep this short since I don't have much to say outside of supporting a bump in the minimum threshold. The distinction between 60% vs 66.67%, to me, is arbitrary (as it could also be 70%, or 65% etc). Given old gen OU votes have been using the 60% threshold without complaints, I'm inclined to support that for lower tiers as well.

Bughouse · Jun 23, 2021

There've been a number of past gen tiering votes in the last few years (all for OU/LC/DOU/other tiers that are not UU-PU) and relatively few votes have come in that narrow window between 60% and 67%. Most of these old gen tiering decisions are happening relatively unanimously, since we're generally only trying to meddle where it's necessary, and that's a good thing. I don't see any harm in it being higher, since the large majority of votes would have surpassed it anyway, but also I don't think there's inherently anything wrong with a consistent 60% whether OU or any other tier.

As to the last point, while I think that it can make sense to have different thresholds for retesting council bans in current gens, I don't think we need a hard and fast rule for this for past gens. It should be quite rare that we test a past gen unban... and even rarer that it was a council ban. I'd be fine to leave that one up to case by case basis. How council bans have worked over Smogon's long history has almost assuredly varied widely over time, especially since tiering vote standards used to be a lot more fragmented in the distant past.

pokemonisfun · Jun 23, 2021

Hogg said:
Is this 60% figure for past gen lower tiers appropriate, or should they either revert back to 50%+1 or rise to a ⅔ majority requirement?

Yes, it is appropriate. If anything, it should be higher because of how infrequently these tiers are played.

If a past gen tests unbanning something that was initially banned by council vote and never received a community vote, should the voting threshold change?

Unsure - I'm not sure people actually distinguish over time whether something was banned by public vote or by council. In my view, both tiering decisions are equally legitimate as long as the rules for the suspect are known to the public and decided before the suspect is conducted, which is nearly always the case because Smogon is run rather professionally. But this is where the discussion should be in my view: are council and public tests seen as equally legitimate or are public tests more legitimate because they let more of the player base participate directly?

--------------------------------------------------------------------------------------------------------------------------------------------------------------
While on the subject of old gen tiering, Hogg, I think it's your power/responsibility to answer, or at least ask, this question for us (if this is tldr, you can get the gist by reading the colored large text. But Hogg, I would be grateful if you read it all):

Should we factor in secondary effects to the metagame when we do an old gen tiering? In other words, if Gliscor is broken in a hypothetical old gen metagame, do we ban it knowing that it likely makes Lucario broken in that metagame?

I asked this multiple times in a recent test on stall in gen7uu. The answer the thread came to, to the extent it did come to an answer, was: yes, we do take into account metagame effects. In short: we shouldn't ban Blissey (to nerf stall) even though it was clearly the root of stall, because it would affect other playstyles. Therefore we should ban Quagsire (to nerf stall) even though it's not the root of stall, because this doesn't affect other playstyles nor does it affect other tiers (Pyukumuku, another Pokemon in question, would affect other tiers).

We are waiting on the results of this test but even if the Quagsire ban goes through, I don't think it should set this important precedent because this is a question of Smogon's tiering policy, which UU should not have the power to control. Also, it's a unique case where a playstyle is being suspected, not a Pokemon.

I think this should be made at the policy head level, because as far as I'm aware, it's against policy right now for tiers to consider secondary effects. If a Pokemon is broken, we ban it, even if it makes something else broken, because we then ban that second broken Pokemon. And so on and so forth.

See the three reasons why we ban things in our tiering policy:

Hogg said:
II.) Uncompetitive - elements that reduce the effect of player choice / interaction on the end result to an extreme degree, such that "more skillful play" is almost always rendered irrelevant.

This can be matchup related; think the determination that Baton Pass took the battling skill aspect out of the player's hands and made it overwhelmingly a team matchup issue, where even the best moves made each time by a standard team often were not enough.

This can be external factors; think Endless Battle Clause, where the determining factor became internet connection over playing skill.

This can be probability management issues; think OHKOs, evasion, or Moody, all of which turned the battle from emphasizing battling skill to emphasizing the result of the RNG more often than not.

III.) Broken - elements that are too good relative to the rest of the metagame such that "more skillful play" is almost always rendered irrelevant.

These aren't necessarily completely uncompetitive because they don't take the determining factor out of the player's hands; both can use these elements and both probably have a fair chance to win. They are broken because they almost dictate / require usage, and a standard team without one of them facing a standard team with one of them would be at a drastic disadvantage.

These also include elements whose only counters or checks are extraordinarily niche Pokemon that would put the team at a large disadvantage elsewhere.

Uncompetitive and Broken defined like this tend to be mutually exclusive in practice, but they aren't necessarily entirely so.

Baton Pass was deemed uncompetitive because of how drastically it removed battling skill's effects and brought the battle down to matchup, but it could also be deemed broken because of the unique ways in which you had to deal with it.

While this isn't always the case, an uncompetitive thing probably isn't broken, but a broken thing is more likely to be uncompetitive simply due to the unique counter / check component. For example, Mega Kangaskhan was deemed broken because it was simply too good relative to the rest of the metagame and caused the tier to centralize around it, but it could also be labeled as uncompetitive because of the severe team matchup restriction it caused by punishing players if they did not pack one of the few obscure counters or checks for it.

IV.) Unhealthy - elements that are neither uncompetitive nor broken yet are deemed undesirable for the metagame such that they inhibit "skillful play" to a large extent.

These are elements that may not limit either team building or battling skill enough individually but combine to cause an effect that is undesirable for the metagame.

This can also be a state of the metagame. If the metagame has too much diversity wherein team building ability is greatly hampered and battling skill is drastically reduced, we may seek to reduce the number of good-to-great threats. This can also work in reverse; if the metagame is too centralized around a particular set of Pokemon, none of which are broken on their own, we may seek to add Pokemon to increase diversity.

This is the most controversial and subjective one and will therefore be used the most sparingly. The Tiering Councils will only use this amidst drastic community outcry and a conviction that the move will noticeably result in the better player winning over the lesser player.

When trying to argue a particular element's suspect status, please avoid this category unless absolutely necessary. This is a last-ditch, subjective catch-all, and tiering arguments should focus on uncompetitive or broken first. We are coming to a point in the generations where the number of threats is close to overwhelming, so we may touch upon this more often, but please try to focus on uncompetitive and broken first.

Nowhere in our policy does it state we take into account broken threats being created by banning other broken threats. Most people familiar with tiering have heard the mantra "Smogon doesn't do broken checks broken." However, there is good reason to discuss this for old gens - old gens are supposed to be harder to change, harder to meddle with.

I believe we have now solved that issue with this thread, as we will raise the change status quo threshold to 60% or 2/3.

In other words, I am saying we should treat tiering old gens essentially the same as tiering current gens. I know it's supposed to be harder to make changes to old gens. We've now done that by raising the threshold to change the status quo to either 60% or 2/3. I would like for us, if possible, to come to a consensus at the policy level, to make all Smogon official old gens tiered the same way as current gens, with the exception of vote threshold.

That means we should no longer consider secondary effects to a metagame when making arguments for a tiering decision, even if it's an old gen. To go back to the Gliscor example, we would hold a vote on Gliscor knowing that Lucario may be broken afterwards because we can simply ban Lucario afterwards, as if it were a current gen, knowing that both of these votes would need to reach a higher threshold.

TLDR: If we have a higher threshold to change the status quo in old gens, we should be allowed to tier more freely, as if it were a current gen.

Iguana · Jun 24, 2021

Thanks very much, Hogg, for starting this thread. This is an important topic for sure given the recent unlocking of old Gen lower tiers' tiering.

I think a consistent voting threshold for old Gens' tiering is appropriate; in other words, irrespective of whether the tier is OU (or another "static" tier, as Hogg mentioned) or a usage-based lower tier, the voting threshold for tiering should be consistent if it's an old Gen. The difference between voting thresholds in CG makes sense, but my rationale for keeping this consistent with old Gens is that the "frequent significant changes due to tier shifts" Hogg outlined that necessitate a lower threshold no longer apply once the tier is an old Gen. In case that's vague, here's a hypothetical example:

Let's say a large group of active metagame contributors and players decides Miltank is uncompetitive and broken in DPP UU. There's a suspect test that happens. If this happens while DPP is the current Gen, that suspect test would require a 50%+1 majority because the tier is rapidly shifting due to usage rates. If this happens today, when DPP is no longer the current Gen, the metagame is no longer rapidly shifting due to usage rates, thus no longer necessitating that lower voting threshold for a suspect test.
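The Miltank hypothetical can be put into numbers with a small Python sketch. The function name `min_ban_votes`, the tier labels, and the voter count of 40 are all made up for illustration. It assumes "50%+1" means a strict majority and that the 60% threshold is inclusive, which the thread does not spell out.

```python
def min_ban_votes(total_voters: int, tier_kind: str, is_current_gen: bool) -> int:
    """Smallest number of ban votes that passes a suspect with `total_voters` voters.

    tier_kind is a hypothetical label: "static" (OU, LC, ...) or "lower" (usage-based).
    """
    if total_voters <= 0:
        raise ValueError("total_voters must be positive")
    if tier_kind not in ("static", "lower"):
        raise ValueError("tier_kind must be 'static' or 'lower'")
    if tier_kind == "lower" and is_current_gen:
        # Assumption: simple majority, i.e. strictly more than half.
        return total_voters // 2 + 1
    # Static tiers, and past gen lower tiers under the 60% policy in this thread.
    # Integer ceiling of 0.6 * total_voters, avoiding floating point.
    return (3 * total_voters + 4) // 5


# Hypothetical DPP UU Miltank suspect with 40 qualified voters.
during_dpp = min_ban_votes(40, "lower", is_current_gen=True)
today = min_ban_votes(40, "lower", is_current_gen=False)
print(during_dpp, today)
```

With 40 voters, the same suspect would need 21 ban votes while DPP was the current Gen and 24 today.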

One interesting consideration is to raise this 60% voting threshold to 2/3 under the premise that old Gens tiering meets the criteria that Hogg outlined for Ubers: "that bans should only occur in Ubers under exceptional circumstances and with the backing of a clear majority of the community." Clearly, unless old Gens tiering becomes frequent, these are exceptional occurrences and probably with the backing of a lot of the tier's community. This would be a fairly steep threshold to meet, and it's a jump from the 50%+1 threshold that many of these lower tiers had when they were CG. I don't have statistics on Ubers' rate of banning vs. not banning in suspect tests handy, but I'd imagine that bans don't happen too often there. In any case, I'm not sure if I necessarily agree with raising the threshold from 60% to 2/3, but this is one thought I wanted to point out for us all to consider.

Hogg's third question ("If a past gen tests unbanning something that was initially banned by council vote and never received a community vote, should the voting threshold change?") is really fascinating to consider. I don't have a strong opinion on this right now, but that's in part because I'd like more information. Is there any history or precedent to build off of here? If there really isn't any history to work with, I would lean towards keeping the voting threshold consistent with all other old Gen suspects. I understand the rationale for making the threshold higher: overturning a council vote to unban something after the tier is no longer CG should require a lot to pass. But, if no community vote has taken place, I see little reason to restrict the community's ability to unban any more than would be the case normally.

pif makes some excellent points and raises important questions in their post. The idea of creating a policy that would have CG and old Gens tiering identical with the exception of voting thresholds is one worth examining. I had mulled that over a little as well as I was reading this thread and the SM UU stall one. These are just some thoughts I had generally on this subject, in no particular order:

While this I believe deviates from pif's point, perhaps this is where old Gens of OU and lower tiers would diverge in their tiering systems. Old Gens of OU all have permanent ladders on PS, making a ladder-based suspect reqs system similar or identical to CG tiers' possible. Old Gens lower tiers do not have the same ladders available to users (with the exception of occasional RoA Spotlight ladders, but those don't account for every old Gen lower tier), so another system may be necessary for them.
It might be possible to temporarily bring back an old Gen lower tier's ladder on PS for the purpose of laddering reqs, but that raises the question of whether the conditions under which players are laddering for the suspect are genuine, since the ladder would have been dormant for (potentially) years until the suspect test brought it back. I'm getting into a lot of specifics about this, and I'll save more of these until/if discussion goes in this direction.
What about more live tours for players to obtain reqs to vote in suspects? This seemed to work quite well for the Dugtrio/Arena Trap suspect in DPP OU fairly recently (and perhaps in other suspects?), and I'd definitely be interested in expanding that sort of a reqs-obtaining system, particularly since old Gen ladders are often either dormant or inactive. Or, at least offering players the choice between laddering for a certain GXE or excelling in a live tour would be interesting.

I totally agree with pif that the "secondary effects question" in old Gens tiering should be sorted out, especially since some players are likely to consider these effects given how old Gens tiering does not happen often (unlike in CG where another suspect can happen easily if one ban causes issues in the metagame). I'm going to propose an idea that might be a bit controversial: We allow players, in their suspect vote, to consider the potential effect(s)/impact(s) that banning something would have on the metagame, such as imbalancing a metagame to the point of it being unhealthy, or making another Pokémon broken/unhealthy in so doing. I'll try and outline some of the pros and cons of this here.

Pros
- It would provide flexibility to players involved in a complex tiering process. Old Gens tiering is inherently going to be more complex than CG tiering because the metagames are generally more static, and the playerbases and tiering less active. This is no criticism at all of old Gens; it's just generally the reality, so granting players more leeway in their deliberations would be one pro of this proposal.
- Frankly, some players already do consider these impacts in their considerations, so this would justify and ground their thinking in policy.
- Considering potential impacts of a ban on a metagame could help old Gen tiering councils structure/frame future suspects. In other words, if a lot of players cite how banning Pokémon A could make Pokémon B unhealthy in the metagame, and Pokémon A does end up being banned, then the appropriate tiering council can keep an eye on Pokémon B to see if another suspect is warranted.
Cons
- Trying to anticipate hypothetical situations is always risky. Banning one Pokémon may not have the impact(s) that a player thinks it may have on the metagame. There's really no way to know for sure until the Pokémon is banned and the metagame transforms.
- This could potentially set a precedent for this line of thinking to be adopted into CG tiering policy, where it has long been seen as illegitimate.

Hopefully the latter part of this post isn't getting ahead of myself, but policy discussions are always complex and multi-faceted. Looking forward to hearing from you all on these matters!

I usually don't do these, but a TL;DR seems appropriate here since this is a long post and I believe on a very important subject.

TL;DR

-I propose we keep the voting threshold between old Gens of OU and old Gens lower tiers consistent.
-60% majority is my preferred choice at this time, but I think 2/3 should be considered since old Gens tiers do seem to meet the criteria set forth in CG Ubers' tiering policy.
-More information on history/precedents regarding unbanning and overturning a council vote? I'd be interested in getting more information on the subject before deciding on whether the voting threshold should increase or not for old Gens tiering.
-The idea of trying to create a tiering system nearly identical between CG tiers and old Gen tiers (with the exception of voting thresholds) is one worth examining.
-I also propose that we consider including secondary effects of a ban on an old Gen metagame as a rationale for banning/not banning.

Hogg · Jun 25, 2021

pokemonisfun said:
(snip)

You ask some fairly complex questions, and they deserve complex answers. I’ll do my best to give my own thoughts… but keep in mind that there are always limitations to any policy, and these questions touch on areas where policy has reached those limits and some amount of judgment is required.

So, to your core question: should we factor in secondary effects to the metagame when we do old gen tiering?

It’s a bit hard to just give a straightforward yes or no to this question. On the one hand, our tiering policy is generally written around looking at broken elements in a vacuum rather than attempting to account for every possible implication down the road if something gets banned. (Granted, we’ve broken or at least bent this rule in basically every single gen since we’ve started the modern tiering process, so I think it’s better to think of it as a guiding principle rather than a hard rule.)

On the other hand, part of the reason we can afford this type of viewpoint is that if it turns out banning one broken element breaks something else, we have the ability to continue making changes until the metagame reaches a stable state. Because of this, we can avoid speculative arguments about whether or not banning something will just cause other problems, because we have the opportunity to see firsthand whether that’s true, and if it is, we can do something about it. But as you pointed out, these kinds of continuous changes go against the core idea of making minimal changes to past gen tiers/metagames. This becomes even more of an issue with past gen lower tiers, which generally do not have active ladders and have a smaller tournament presence, making it harder to properly gauge the impact of tiering changes.

So, I know you want a hard policy answer to this question, but I’m not entirely sure that one is possible. My own opinion would be that we relax the “ignore secondary effects” rule in general, but still keep it in mind as an ideal principle. But I think there’s not really a way to objectively say yes or no here.

Regarding whether the higher voting threshold for tiering changes means that past gen lower tiers can tier more frequently… I’m less sure that I see the correlation there. The reason we only use a simple majority with current gen lower tiers is that the nature of lower tiers means that they change frequently, and can often lose or gain core elements with very little warning due to the nature of tier shifts. Therefore, we made a policy decision that making tiering changes should be somewhat easier, so that lower tiers can more rapidly respond to these fluctuations. (On mobile or I’d link the original post discussing it, if no one beats me to it then I’ll edit it in later.) But that’s no longer true for past gens, so I believe it makes sense to use the OU threshold of 60% instead. I’m not sure I see how it follows from there to the idea that tiering changes should be allowed to be made more freely. I still tend to be of the opinion that tiering changes in past gens (whether lower tiers or OU) should generally arise out of necessity.

Adaam · Jun 25, 2021

I want to support the above two proposals. Old gen tiering is a new, murky area, and not everything is clear yet. One unclear factor is determining what we can and cannot tier. In the SM UU stall thread, there was pushback on Blissey because of the secondary effects it would have, which is undesirable in an old gen tiering decision. Here are some comments from the thread:

TDK said:
I'm not convinced of that, but I absolutely dread a world where we just ban stall and accept what's to come with nothing to come after. Even with Hogg lifting the lock on old gen tiering, that doesn't mean we are to be actively tiering these metagames as if they're current. We should keep the changes to a minimum, and I do not believe this falls under the same scope as ORAS Conkeldurr, for example. I do not think any single change would make the metagame be in a better off state and I do not want us to be actively making multiple changes to a tier that is just fine.

shiloh said:
we should be attempting to minimize impact on the old generation lower tiers as much as possible, because we will not have as much of a playerbase playing the tier year round / being able to develop the metagame in the ways a far impacting ban could have. if sm was still the current generation, there would be plenty of time for the meta to adapt, more tests and action to be taken in timely fashion, and less overall harmful impact on the tier compared to how the ban would be post generation. so yes, while our tiering goal for current generations should not be minimizing impact on the tier with a ban, i think that it should absolutely be the case when dealing with old generation lower tiers. especially in cases like this where there are more calls for dealing with the overall playstyle rather than the individual mon.

These are valid sentiments. The playerbase for old gens is objectively smaller and reduced to subforum and circuit tournaments. There is no old gen ladder for lower tiers like there is for an old gen OU tier, so we run the risk of not being able to accurately weigh the impact of a ban. However, I think we can do better.

When I made the SM UU stall thread, I did not gloss over Scizor. We had a test for it, and while the reasoning for not banning it may not have been fair, it still happened. I did not think it was my place to try and undo it, nor did it fall under the implied framework for tiering old gen lower tiers. However, after reading the most recent posts, there seems to be a lot of support for potentially making significant changes (see how many users want to look at Scizor first).

It is clear people are unhappy with the tier, and SM UU is probably not the only one. There are other old gen lower tiers with broken elements that are not as clear-cut as ORAS UU Conkeldurr was, and I do not want us to just live with it. As the users above suggest, we should be able to test Pokemon whose removal would have significant secondary effects.

To be sure that these decisions are made correctly and fairly, I propose the following:

Bump the ban threshold to 60%. As pokemonisfun mentioned, if this threshold is crossed, then clearly a significant portion of the now-active playerbase thinks this ban is needed.
To start the process, a user creates a thread proposing what to test. The thread should stay up for a set minimum amount of time (say, 4 weeks) to ensure full discussion is had.
When the 4 weeks are up, an unbiased group of people review the thread and deem whether or not there is proper foundation to proceed. If there is, perhaps we can add the ladder back so users can test a metagame without the proposed ban.
Ladder remains up for ~2 weeks. After then, a list of voters is acquired, and the voters are notified of the test.
After another week, the voting thread is put up.
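As a rough illustration of how long the process above would take, here is a minimal Python sketch that turns the proposed durations into dates. The function name `suspect_timeline`, the dictionary keys, and the start date are hypothetical. The sketch assumes the review happens on the day the discussion period ends and that voters are notified the day the ladder closes, since the proposal gives no durations for those steps.

```python
from datetime import date, timedelta


def suspect_timeline(thread_opened: date,
                     discussion_weeks: int = 4,
                     ladder_weeks: int = 2,
                     notice_weeks: int = 1) -> dict:
    """Map each milestone of the proposed process to a calendar date."""
    # Assumption: the review takes place as soon as the discussion period ends.
    review = thread_opened + timedelta(weeks=discussion_weeks)
    # The ladder goes up after a favorable review and stays up for ~2 weeks.
    ladder_closes = review + timedelta(weeks=ladder_weeks)
    # Voters are notified when the ladder closes; voting opens a week later.
    voting_opens = ladder_closes + timedelta(weeks=notice_weeks)
    return {
        "review": review,
        "ladder_closes": ladder_closes,
        "voting_opens": voting_opens,
    }


# Hypothetical start date, for illustration only.
timeline = suspect_timeline(date(2021, 7, 1))
for milestone, day in timeline.items():
    print(milestone, day.isoformat())
```

With the default durations, the voting thread goes up seven weeks after the proposal thread is created.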

Another thing to consider is adding suspect tournaments so that non-tournament players or players who have been inactive in recent years have a chance to vote. For example, Pearl and pif were both excluded from the SM UU stall voting list, which is silly considering they are the most influential stall users of the tier. There should be more ways to allow people to vote for changes this big, but this discussion is more fit for this thread.

The ladder will allow users to figure out if any future action needs to be taken, assuming the proposed Pokemon is banned. The old gen low tier also has an active playerbase with their respective circuit and subforum tournaments, so there should be a consistent active playerbase adapting to these changes. Old Gen OUs have made huge changes on the basis of the active tournament playerbase, and while SPL is much more prestigious than say, UUPL, people still play both. Let's not limit ourselves when we finally have the tools to improve our favorite tiers.

pac · Aug 5, 2021

Bughouse said:
Most of these old gen tiering decisions are happening relatively unanimously, since we're generally only trying to meddle where it's necessary, and that's a good thing.

I don't have much to say as I don't feel particularly qualified on the topic, but I would like to drop a very recent example of a tiering decision that was passed by a very close vote (52%). Could be something worth referring to in the discussion.

https://www.smogon.com/forums/threads/rby-uu-dragonite-voting.3685785/
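For reference, here is a small Python sketch of how a 52% result fares against each threshold discussed in this thread. It uses only the 52% share reported above, since the actual vote counts are not given here. The function name `outcomes` and the threshold labels are hypothetical, and the sketch assumes a simple majority must be strictly exceeded while 60% and two-thirds are inclusive.

```python
from fractions import Fraction

# (label, threshold, strict) where strict means the share must exceed the threshold.
THRESHOLDS = [
    ("simple majority", Fraction(1, 2), True),
    ("60%", Fraction(3, 5), False),
    ("two-thirds", Fraction(2, 3), False),
]


def outcomes(ban_share: Fraction) -> dict:
    """Return, for each threshold, whether a ban with this share of votes would pass."""
    result = {}
    for label, threshold, strict in THRESHOLDS:
        result[label] = ban_share > threshold if strict else ban_share >= threshold
    return result


# 52% as reported in the post; the underlying vote counts are not known here.
reported = outcomes(Fraction(52, 100))
print(reported)
```

Under these assumptions, a 52% result clears a simple majority but falls short of both the 60% and two-thirds thresholds.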
