Current Gen OU Council's framework for "competitive" (and "uncompetitive")

Aldaron · Oct 2, 2015

First off, apologies to you all for this being about a week late. I decided to push this through exactly when I'm traveling all over the USA lol...

I wrote the first version of this by compiling various on topic / on point posts in the thread. I then circulated it with the OU Council and some others, and they tweaked and added / removed things. WECAMEASROMANS then compiled the current version with all of the discussed tweaks.

Note the mostly informal nature of the voice in this; it is a casual writing style with examples given at seemingly arbitrary points. One of the major ways we will use this is by making sure it is fluid. This means that you, as OU policy debaters, will use the current version of this as a guideline for OU tiering debates. The examples should be updated as a generation goes along to remain relevant to the reader's mind.

Any disagreement you have with tiering direction / philosophies / paradigms will be taken up in topics related to this topic, NOT in actual tiering topics. This separation of how and what to argue and the actual argument will help ensure our tiering topics aren't as arbitrary and all over the place as they currently are. We'll also add to / change the framework as we go as necessary.

Anyway, I figured there was little point in simply defining uncompetitive; if we're hoping to set up a framework for how to argue within the OU tier, it really needs three things:

1.) The assumptions we are making when debating tiering policy
2.) The definitions that are vital to debating tiering policy
3.) The overall goal and direction of tiering policy

-----------------------------------------------------------------------------------------------------------------------------

Assumptions in Tiering Policy:

I.) We play, to the best of our simulator's capabilities, with the mechanics given to us on the cartridge.

A.) The ONLY exception to this is Sleep Clause.
B.) Suggestions to "remove critical hits" or "make Baton Pass fail in battle" are not valid tiering solution proposals.

II.) We cater to both ladder players (the higher end of the ladder) and tournament players.

A.) The majority of our accepted "elitism skill" is concentrated in tournaments, but the overwhelming majority of our battles occur on ladder.
B.) For actions to be taken in tiering policy, it is important to show how that action affects BOTH the ladder scene and the tournament scene.
C.) Stats for both will be highly emphasized but not a sole determining factor.

III.) Providing justification is the onus of the side changing the status quo.

A.) It is important to note that the status quo can be changed in the case of releases. This is the situation with Hoopa-Unbound, where it started directly in OU unlike other 680 BST legendaries which start as Ubers and then potentially get suspected to drop to OU.
B.) If a proposal is made to ban a Pokemon, Ability, Item, or Move, the side suggesting this ban must demonstrate all of why this is necessary, how it affects the ladder and the tournament scene, and provide evidence for both.
C.) If a proposal is made to unban a Pokemon, Ability, Item, or Move, the side suggesting this unban must demonstrate all of why this is necessary, how it affects the ladder and the tournament scene, and provide evidence for both.
D.) Complex ban proposals must provide additional explanation of why the simpler bans are not sufficient.

IV.) Probability management is a part of the game.

A.) This means we have to accept that moves have secondary effects, that moves can miss, that moves can critical hit, and that managing all these potential probability points is a part of skill.
B.) This does NOT mean that we will accept every probability factor introduced to the game. Evasion, OHKO, and Moody all affected the outcome "too much" and we removed them.
C.) "Too much" means a particular factor puts the more skilled player at a disadvantage a considerable amount of the time against a less skilled player, regardless of what he does. Relatedly, "too much" also refers to factors that nearly completely take a game out of the player's hands and turn the PRIMARY point of the game into waiting for the RNG.

1.) OHKO moves are an example of the "too much" portion. With a 30% success rate, the other player will be put at an immediate disadvantage by the OHKO move user a considerable amount of the time no matter what he does.
2.) Moody and SwagPlay are examples of "taking the game out of a player's hands". Both turn the PRIMARY point of the game into waiting to see what the RNG spits out.
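
To put a rough number on the OHKO example, here is a minimal sketch. It assumes every OHKO attempt lands independently at the 30% rate quoted above and ignores immunities, Sturdy, and level differences; the function name is made up for illustration.

```python
def chance_of_at_least_one_hit(accuracy: float, attempts: int) -> float:
    """Probability that at least one of `attempts` independent tries lands."""
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0 and 1")
    if attempts < 0:
        raise ValueError("attempts must not be negative")
    # Complement of "every single attempt misses".
    return 1.0 - (1.0 - accuracy) ** attempts


# Assumed 30% accuracy, taken from the OHKO example in the text.
for attempts in (1, 2, 3, 4):
    print(attempts, round(chance_of_at_least_one_hit(0.30, attempts), 3))
```

Under those assumptions, two attempts already land at least one KO about half the time (0.51), which is the "considerable amount of the time" the point above is getting at.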

V.) Team match up management is a part of the game.

A.) This means we have to accept that we will be at an advantage or disadvantage from the very beginning.
B.) This does NOT mean we will accept a component that the majority of the time will turn the battle against the more skilled player. This component must both be an issue a majority of the time AND influence the battle dramatically.
C.) With optimal team building skills, the pool of options (Pokemon, Moves, Items) present in the tier should allow you to build teams addressing the different team-archetypes at least decently, and offer a solution in-battle to a large majority of the principal threats of the metagame.
D.) There is also an important point to note in that team match up is only an issue if there is an extraordinarily low chance to win from the get go.

1.) This means that, even if the better skilled player made the right plays, he lost.
2.) Team match up is only a concern if no matter what the better player did, he had zero or an extremely slim chance of winning.
3.) Basically, for tiering debate purposes, even if the better player had a team disadvantage and made the better moves the majority of the game, did he screw up a turn or two? If he did, then yes, part of the reason he lost was the team match up, but a major factor was also the poor decision.

VI.) Even though assumptions I., IV., and V., limit us, we will, within those limitations, work to maximize the concept of "player skill" determining the result of a match the majority of the time.

A.) Skill is defined in more depth in the next section.
B.) The majority of our potential suspect discussion will center around the defined versions of uncompetitive, broken, and unhealthy and how a particular suspect element lowers some component of player skill within those 3 constructs.
C.) Any of the sub-sections in skill can be emphasized for a potential suspect.

1.) If Shadow Tag reduces the battling skill component too much via removing smart switching and reducing the ability to assess risk, these should be mentioned when stating Shadow Tag is uncompetitive, broken, or unhealthy.
2.) If Mega-Sableye is uncompetitive, broken, or unhealthy, point out how it reduces player skill from being the major determining factor in a match and which component of skill it drastically takes away from.

---------------------------------------------------------------------------------------------------------------------

For what it is worth, we lay out skill in various sub-sections before defining uncompetitive, broken, and unhealthy because all three "buzzwords" have to be used within the context of the suspect element reducing some component of player skill. Our tiering goal, stated later in more detail, is to create a game where the better player wins the majority of the time.

This means that, in suspect debates, we need to show why and how a suspect element is uncompetitive, broken, and/or unhealthy within the context of reducing the effect skill has on the outcome of a battle. Specifically point out which component or components of skill are being affected and how and to what extent.

---------------------------------------------------------------------------------------------------------------------

Definitions for Tiering Policy:

I.) Skill - the subjective metric we use to judge player worth in competitive Pokemon

A.) Team Building Skill - the part of skill that is involved in the preparation for a battle

1.) Assessing threats - ability to recognize major threats in the metagame and identify how they both individually and in tandem deal with your team

a.) Involves having metagame knowledge through playing and observing
b.) Involves the ability to think beyond individual Pokemon threats and into the realm of threatening strategies and concepts

2.) Dealing with threats - ability to maximize the 6 Pokemon slots, 24 move slots, and 6 item slots to handle metagame threats

a.) Ability to recognize which slots are not serving maximum utility
b.) Ability to replace low efficiency slots with higher efficiency options

3.) Building Towards a Strategy (or strategies) - ability to build a team that is "greater than the sum of the individual parts"

a.) Having the 6 Pokemon work together to cover weaknesses and emphasize strengths instead of just having 6 Pokemon with no cohesive strategy

* The most basic and common examples for covering weaknesses include combinations like CeleTran (Celebi and Heatran) or GyaraZone (Gyarados and Magnezone) in DPP
* One of the most basic and common examples for emphasizing strengths is a combination like DoubleDragon (using two Dragon Dancers to punch holes for each other).

b.) Obviously isn't limited to combinations or trios; can refer to overall team strategies (think BP chains before outlawed or simple stall cores that work to cover each other's flaws)

4.) Creativity - ability to come up with unique strategies or sets to swing momentum in your favor

a.) This means being able to surprise the opponent with a unique set or strategy without losing on general utility (too much)
b.) Doesn't just mean creating new sets, but also being able to use existing sets in a creative manner

5.) Catering to Metagame / Opponents - ability to predict opponent trends, patterns, and tendencies

a.) Involves knowing the percentages of what you'll encounter on ladder and being able to build accordingly.
b.) Involves knowing your opponents in tournaments, taking note of their common trends in building, and preparing accordingly.

B.) Battling Skill - the part of skill involved in actually battling

1.) Picking the Right Lead - ability to look at your team and your opponent's Pokemon and make an intelligent determination of what your win condition is and which Pokemon will best promote that in the beginning
2.) Recognizing the Win Condition - ability to look at your opponent's team in addition to the information gathered during a battle to recognize viable win conditions
3.) Picking the Right Move - ability to pick the best move in a discrete moment in time

a.) Encompasses ability to judge the opponent's potential moves
b.) Encompasses ability to choose between short and long term benefits and choose accordingly

4.) Smart Switching - ability to switch intelligently to swing momentum in your favor

a.) Encompasses the ability to predict an opponent's moves and switch for the best scenario
b.) Encompasses the ability to continuously switch (double or triple switching) if necessary

5.) Gathering Information and Making Assumptions

a.) The ability to predict or assume opponent sets in order to better plan a win condition
b.) The ability to set probabilities for what the opponent has based on his actions in order to maximize predictions

6.) Long Term vs. Short Term Goals

a.) The ability to weigh when to bring in a potential win condition
b.) The ability to judge whether an immediate benefit, such as a revenge kill, is worth showing your hand or bringing out the win condition too early.

7.) Assessing Risk

a.) Knowing when to sacrifice for a greater position later
b.) Knowing when and how to make a high risk, high reward move

8.) Probability Management

a.) The ability to take into account the numerous probability factors that are in the game, including accuracy, secondary effects, and critical hits, and consider the best strategy
b.) Knowing how to minimize the risk presented by probability factors

9.) Prediction

a.) The ability to take into account all of the opponent's potential actions, apply weights to them, and move accordingly
b.) The ability to double or triple switch based on opponent tendencies to move momentum back in your favor
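
As a rough illustration of "apply weights to them, and move accordingly", here is a minimal sketch of picking a move by expected value. Everything in it is hypothetical: the action names, the weights, and the payoff numbers are invented for the example and sit on an arbitrary scale.

```python
import math


def best_move(opponent_weights, payoffs):
    """Return (move, expected values) for the move with the highest expected payoff.

    opponent_weights: opponent action -> estimated probability (should sum to 1).
    payoffs: my move -> {opponent action -> how good the outcome is for me}.
    """
    if not math.isclose(sum(opponent_weights.values()), 1.0):
        raise ValueError("opponent weights should sum to 1")
    expected = {
        my_move: sum(opponent_weights[opp] * value for opp, value in row.items())
        for my_move, row in payoffs.items()
    }
    return max(expected, key=expected.get), expected


# Hypothetical payoffs: attacking is good if they stay in, double switching
# is good if they switch out.
PAYOFFS = {
    "attack": {"stays_in": 1.0, "switches": -0.5},
    "double_switch": {"stays_in": -1.0, "switches": 2.0},
}

choice, values = best_move({"stays_in": 0.6, "switches": 0.4}, PAYOFFS)
print(choice, values)
```

With these made-up numbers, attacking comes out ahead when you think the opponent stays in 60% of the time, and the double switch takes over if you flip the weights to 40/60. The skill being described is in estimating those weights well, not in the arithmetic.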

II.) Uncompetitive - elements that reduce the effect of player choice / interaction on the end result to an extreme degree, such that "more skillful play" is almost always rendered irrelevant

A.) This can be match up related; think the determination that BP took the battling skill aspect out of the player's hands and made it overwhelmingly a team match up issue, where even the best moves made each turn by a standard team often were not enough.
B.) This can be external factors; think endless battle clause, where the determining factor becomes internet connection over playing skill.
C.) This can be probability management issues; think OHKOs, SwagPlay, Evasion, or Moody, all of which turn the battle from emphasizing battling skill to emphasizing the result of the RNG more often than not.
D.) Note uncompetitive elements are almost always present in the battling skill aspect; they will, however, be present in the team building aspect should we allow them in the sense of having to rely on excessively specific counters (such as loading teams with Sturdy or Keen Eye Pokemon and the like).

III.) Broken - elements that are too good relative to the rest of the metagame such that "more skillful play" is almost always rendered irrelevant

A.) Important to note that it is a relative statement; a 200/200/200/200/200/200 BST Pokemon with standard movepool would be broken in a metagame where the average is say, 100/100/100/100/100/100, not where the average is 200/200/200/200/200/200
B.) Examples are mostly Pokemon and include strong Ubers like Kyogre, Groudon, and Arceus. These aren't necessarily completely uncompetitive because they don't take the determining factor out of the player's hands; both can use these Pokemon and both probably have a fair chance to win. They are broken because they almost dictate / require usage, and a standard team facing a standard team with one of them would be at a drastic disadvantage. These examples limit team building skill.
C.) Examples also include ones whose only counters or checks are extraordinarily gimmicky Pokemon that would put the team at a large disadvantage elsewhere. These examples also limit team building skill.
D.) Uncompetitive and Broken defined like this tend to be mutually exclusive in practice, but aren't necessarily entirely so.

1.) BP was deemed uncompetitive because of how drastically it removed battling skill's effects and brought the battle down to match up, but it could also be deemed broken because of the unique ways in which you had to deal with it.
2.) While this isn't always the case, an uncompetitive thing probably isn't broken, but a broken thing is more likely to be uncompetitive simply due to the unique counter / check component. For example, Mega Kangaskhan was deemed broken because it was simply too good relative to the rest of the metagame and caused the tier to centralize around it, but it could also be labeled as uncompetitive because of the severe team match up restriction it caused by punishing players if they did not pack one of the few gimmicky and obscure counters or checks for it.

IV.) Unhealthy - elements that are neither uncompetitive nor broken, yet deemed undesirable for the metagame such that they inhibit "skillful play" to a large extent

A.) These are elements that may not limit either team building or battling skill enough individually, but combine to cause an effect that is undesirable for the metagame.

1.) We haven't really had an example of an unhealthy ban yet, but a potential example is Stealth Rock; it certainly is on the mind of every team builder and games are often steeped in Stealth Rock strategy. Whether or not this adds up to limiting team building skill or battling skill is part of the conversation to be had.
2.) One important thing to note with this is that distribution both matters (in the case of large distributions) and doesn't matter (in the case of low distributions).

a.) If Stealth Rock or Scald weren't so common, they probably would not be as controversial as they are.
b.) However, just because something isn't highly distributed, like Shadow Tag, doesn't mean it isn't unhealthy. Some tried to state that Shadow Tag wouldn't be broken on a 10/10/10/10/10/10 BST mon, but this is the wrong way to look at it.
c.) Things aren't broken (or unhealthy or uncompetitive) only in vacuums; they can contribute to the whole being greater than the sum of its parts. Instead, consider how potentially broken elements would be with average distribution on average BST Pokemon. If Shadow Tag were on, let's say, 4-5 potential OU Pokemon as opposed to 1-2 and the average BSTs were something like 80/80/80/80/80/80, would it be broken? The takeaway from this is to not ignore distribution, but, if an element has low distribution, to consider how it would take away from team building or battling skill if it were distributed to average Pokemon in an average quantity. (Yes, we will provide average statistics.)

B.) This can also be a state of the metagame. If the metagame has too much diversity wherein team building ability is greatly hampered and battling skill is drastically reduced, we may seek to reduce the number of good to great threats. This can also work in reverse; if the metagame is too centralized around a particular set of Pokemon, none of which are broken on their own, we may seek to add Pokemon to increase diversity.

1.) The Mega-Metagross suspect could be said to fall under this umbrella; Mega-Metagross wasn't really broken, but it was the best Pokemon in a game with far too many good to great threats. It was felt that, for the sake of metagame health, we should seek to reduce the number of these threats (however, you'll note the community voted to keep it in the tier).

C.) This is the most controversial and subjective one, and will therefore be used the most sparingly. The OU Council will only use this amidst drastic community outcry and a conviction that the move will noticeably result in the better player winning over the lesser player.
D.) When trying to argue a particular element's suspect status, please avoid this category unless absolutely necessary. This is a last ditch, subjective catch-all, and tiering arguments should focus on uncompetitive or broken first. We are coming to a point in the generations where the number of threats is close to overwhelming, so we may touch upon this more often, but please try to focus on uncompetitive and broken first.

-----------------------------------------------------------------------------------------------------------------------

Again, you'll note that for all three of uncompetitive, broken, and unhealthy, reducing skillful play was emphasized. We're going to center future OU tiering debates around people showing exactly how a potential suspect element is any of uncompetitive, broken, or unhealthy as defined here and which aspects of skill are affected.

-----------------------------------------------------------------------------------------------------------------------

Overall Goal and Purpose of Tiering Policy:

I.) To create a metagame that is conducive to the more "skilled" player winning over the less "skilled" player a majority of the time.

A.) "Skilled" is, as stated previously, a bit of a nebulous term, but it encompasses both team building ability and battling ability. More on this in the previous definitions section.
B.) What this means is that, with all of the probability management inherent in the mechanics of Pokemon and with all the team matchup factors inherent in the sheer number of threats in Pokemon, we strive to create a metagame in which the better player beats the less skilled player significantly more often than the reverse.
C.) This does NOT provide justification for using win:loss ratios in tiering decisions...win:loss ratios don't tell us anything because they don't take skill into account.
D.) It is difficult to break down whether or not the metagame is achieving this, but certain metrics can help us. For example, looking at records on the Showdown Ladder and looking at Tour records for a tier. If we have people winning consistently, we are moving towards the goal of having better players win the majority of the time (for what it is worth, look at the Tour statistics from Adv - BW and you will note that every generation has had players who win consistently).
E.) What all 4 of the previous points seek to maximize is keeping the biggest determining factor in the match PLAYER CHOICE such that the better player wins the majority of the time.
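
For the "people winning consistently" metric, one rough sanity check is to ask how often a record would show up if every game were a coin flip. The sketch below does only that. It assumes independent games with a fixed win chance, which real ladder and tour games are not, and the 8-2 record is an invented example rather than anyone's actual results.

```python
from math import comb


def chance_of_record_by_luck(wins: int, games: int, p: float = 0.5) -> float:
    """Probability of at least `wins` wins in `games` independent games,
    each won with probability `p` (binomial upper tail)."""
    if not 0 <= wins <= games:
        raise ValueError("wins must be between 0 and games")
    return sum(
        comb(games, k) * p**k * (1 - p) ** (games - k)
        for k in range(wins, games + 1)
    )


# Hypothetical record: going 8-2 or better over 10 games on pure coin flips.
print(round(chance_of_record_by_luck(8, 10), 4))
```

Under those assumptions a single 8-2 run happens by luck roughly 5% of the time, so one such record proves little; the same players posting records like that across many tours is what suggests skill is deciding matches. This is context for a discussion, not a justification on its own.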

II.) To ensure that both our ladder and tournament crowds are catered to regarding I.)

A.) Because ladder tends to be a scene where you play many battles in a short amount of time, "skill" for the ladder emphasizes beating the overall set of threats in a general sense.
B.) Because tournament battles tend to be a scene where a well played surprise wins the match, "skill" for tournament battles emphasizes the ability to both possess creativity and deal with creativity.
C.) The previous two are not mutually exclusive, just pointed out for emphasis. We strive to create a metagame in which someone can both deal with the general set of threats and be creative while dealing with creativity (read: balancing act between diversity and centralization).
D.) For tiering change suggestions, justification can be provided for either or both tournaments or ladder. Both is preferred as it makes an argument more complete.

1.) There will very rarely be a case where a true suspect element is not a problem in both environments, so be sure to be complete in your suspect justification.
2.) If something is overwhelmingly a problem in one environment and not the other, be sure to show how it is a problem in one and try to explain why it isn't a problem in the other.
3.) We expect some differences in both environments, but if a suspect element is non-existent in one environment, it is worth delving into why this is the case and whether there is something else to look at.

III.) To ensure that actions are taken with appropriate and complete justification.

A.) Statistics help frame the context of a discussion.

1.) Be careful with adding spin to statistics instead of just reporting them; there are countless examples of using statistics incorrectly to draw deterministic conclusions that inevitably ruin a thread. Don't do this.
2.) Usage statistics and their implications correlate most strongly with how we have defined broken.

a.) While they can certainly provide context for uncompetitive and unhealthy, the way we have defined both means something does NOT need to be highly used to be either.
b.) This doesn't mean they don't need to be used at all. If say, Shadow Tag / Gothitelle is brought up as an uncompetitive suspect element but it has only 1 usage in 100 competitive tournament battles, people would rightly be justified in pointing this out as a counterpoint.
c.) What specifically constitutes "enough usage" is deliberately left open-ended and it will be judged on a case by case basis for each suspect element.

B.) Do not haphazardly and brazenly declare anything is uncompetitive or unhealthy and shoot down objective counterarguments.

1.) It will be on you to demonstrate how a particular component of skill is drastically reduced to a significantly damaging extent in spite of any potential low usage.
2.) This is NOT an easy task and suspects for uncompetitive or unhealthy will NOT be pushed through "willy nilly".

C.) We will expect and demand in-depth analysis into what particular factor(s) of skill is reduced, how the proposed suspect element is actually the cause, and why and how removing (or adding) this element will improve the metagame.
D.) If logs are provided, don't simply provide logs where the suspect element won the battle.

1.) Show that the battle was won or lost in spite of the player's mostly correct moves.
2.) If another person points out that the battler did not make optimal moves, be prepared to debate if the element's suspect nature was the cause of the loss or the player not making the best moves (or both).

E.) Arguments that show how a specific suspect element affects skill in relative terms to other elements in the metagame will be very, very, highly emphasized.

1.) Relativity was already emphasized in defining broken, but here it refers to the "how x is it" part of the argument, where x is any of uncompetitive, broken, unhealthy.
2.) Simply stating Gothitelle is uncompetitive because it reduces player skill by limiting smart switches is not enough; show how it does so more than other elements in the metagame and how this is detrimental.

-----------------------------------------------------------------------------------------------------------------------------

That is the first version. Feel free to suggest additions to any of the sections or to debate particular points.

McMeghan M Dragon PDC TDK Tesung AM boudouche will answer and discuss any proposed additions / changes with the rest of you.

Special thanks to WECAMEASROMANS, Reymedy, Vinc2612, Minority Suspect, and M Dragon for particularly on point and helpful posts (used these a lot in writing this up).

Once we settle this, we can open up other conversations for OU tiering.

Aldaron · Oct 2, 2015

tagging TDs Ciele Eo Ut Mortus Jirachee Zebraiken Oglemi as well because directly related to workspace

tagging OU mods ginganinja Subject 18 Reverb Aragorn the King NixHex

Figured I'd specifically tag you all

Aldaron · Oct 5, 2015

I changed D in the last section from mentioning statistics and justification to just justification and explained how statistics should be used in more depth.

D.) For tiering change suggestions, justification must be provided for both the ladder and tournament battles.

1.) Statistics help frame the context of a discussion. Be careful with adding spin to statistics instead of just reporting them however; there are countless examples of using statistics incorrectly to draw deterministic conclusions that inevitably ruin a thread. Don't do this.
2.) Usage statistics and their implications correlate most strongly with how we have defined Broken. While they can certainly provide context for uncompetitive and unhealthy, the way we have defined both means something does NOT need to be highly used to be either.
3.) Do not use the previous two points to haphazardly and brazenly declare anything is uncompetitive or unhealthy and shoot down usage references; it will be on you to demonstrate how a particular component of skill is drastically reduced to a significantly damaging extent in spite of the low usage. This is NOT an easy task and suspects for uncompetitive or unhealthy will NOT be pushed through "willy nilly".
4.) We will expect and demand in-depth analysis into what particular element(s) of skill is reduced, how the proposed suspect element is actually the cause, and why and how removing (or adding) this element will improve the metagame.
5.) If logs are provided, don't simply provide logs where the suspect element won the battle. Show that the battle was won or lost in spite of the player's mostly correct moves. If another person points out that the battler did not make optimal moves, be prepared to debate if the element's suspect nature was the cause of the loss or the player not making the best moves (or both).

If people want to talk about any of the sections / edit / add / remove stuff, I'll leave this open for 2 more days, and then I'll post this to the OU forum.

Afterwards, I'll open up PR again to continue discussion on whatever is pending.

Mazinger · Oct 5, 2015

I might have missed something but so far these are the parts I'm unsure about:

Aldaron said:
Assumptions in Tiering Policy:

IV.) Probability management is a part of the game.

A.) This means we have to accept that moves have secondary effects, that moves can miss, that moves can critical hit, and that managing all these potential probability points is a part of skill.
B.) This does NOT mean that we will accept every probability factor introduced to the game. Evasion, OHKO, and Moody all affected the outcome "too much" and we removed them.
C.) "Too much" is if a particular factor has the more skilled player at a disadvantage a considerable amount of the time against a less skilled player, regardless of what he does. In relation to the latter part, "too much" also refers to factors that nearly completely take a game out of the player's hands and turn the PRIMARY point of the game to wait for the RNG.

1.) OHKO moves are an example of the "too much" portion. With a 30% success rate, the other player will be put in an immediate disadvantage by the OHKO move user a considerable amount of the time no matter what he does.
2.) Moody and SwagPlay are examples of the "taking the game out of a player's hands". Both turn the PRIMARY point of the game waiting to see what the RNG spits out.

Definitions for Tiering Policy:

II.) Uncompetitive - elements that reduce the effect of player choice / interaction on the end result to an extreme degree, such that "more skillful play" is almost always rendered irrelevant

C.) This can be probability management issues; think OHKOs, SwagPlay, Evasion, or Moody, all of which turn the battle from emphasizing battling skill to emphasizing the result of the RNG more often than not.

I agree with the general idea here, but what about elements that don't fit this description but still only have the purpose of taking control away from the players (Brightpowder, Confuse Ray, maybe Scald depending on how you look at it, etc)? One could even argue that Evasion Clause should not cover Sand Veil/Snow Cloak according to this, since a 20% miss chance vs a specific pokemon in a specific weather can hardly be called the single most important factor in the outcome of a game. Usually this kind of objection is dismissed because things like brightpowder don't have enough of an impact, which is also why I don't feel too strongly about it, but it will inevitably come up in the future so I think it's worth talking about while we're already defining the basis of the tiering process.

Basically, are we sure that we want the threshold for "too much" to be this high?

Overall Goal and Purpose of Tiering Policy:

II.) To ensure that both our ladder and tournament crowds are catered to regarding I.)

D.) For tiering change suggestions, justification must be provided for both the ladder and tournament battles.

I don't mind both the ladder and tournaments being factors, but I can see this being problematic. If something is considered broken/uncompetitive/unhealthy in one environment but not the other, is the ideal response really "haha guess you lost then"? It seems more reasonable to either:

Look at competitive pokemon as the same regardless of where it is played and use arguments based on either the ladder or tournaments, since they're equally valued ways of playing it,
Decide which one should be given priority to avoid limiting both in an attempt to maintain equality, or
Separate the tier lists,

because as it is II/D really seems to do nothing but restrict changes that are only needed in one field should they arise, without accomplishing anything if there are no such cases. There might be a better solution than the ones I proposed, but either way this seems too restrictive.

Aldaron · Oct 6, 2015

ikarus said:
I might have missed something but so far these are the parts I'm unsure about:

I agree with the general idea here, but what about elements that don't fit this description but still only have the purpose of taking control away from the players (Brightpowder, Confuse Ray, maybe Scald depending on how you look at it, etc)? One could even argue that Evasion Clause should not cover Sand Veil/Snow Cloak according to this, since a 20% miss chance vs a specific pokemon in a specific weather can hardly be called the single most important factor in the outcome of a game. Usually this kind of objection is dismissed because things like brightpowder don't have enough of an impact, which is also why I don't feel too strongly about it, but it will inevitably come up in the future so I think it's worth talking about while we're already defining the basis of the tiering process.

Basically, are we sure that we want the threshold for "too much" to be this high?

Yea, in discussions I had with people, this was a sticking point. Because people tend to hyperbolize probability issues, setting the threshold at some subjective concept of "too high" was necessary. The bar should be high, but a vague phrase like "too high" still allows people to argue that Brightpowder affects competitiveness in a "too high" manner (basically, it is up to the person bringing it up to show that it is "too high").

I don't mind both the ladder and tournaments being factors, but I can see this being problematic. If something is considered broken/uncompetitive/unhealthy in one environment but not the other, is the ideal response really "haha guess you lost then"?

This has been a pain point forever. I pushed for separate tournament and ladder tiering in a senior staff thread, but that isn't a feasible road currently (for various reasons that I can't get into now). However, when I brought this specific concern up, someone correctly pointed out that the concern is purely academic. Note this isn't a bad thing and we should certainly consider it, but the point was that if something is actually a big enough problem in one environment and completely nonexistent in the other, there is another, larger issue at hand.

The point is, if people are playing to win at our high ladder range and our official tournaments, there shouldn't be extreme differences in problem elements. Yes, tournament battles occur at a FAR lower rate than ladder battles and are therefore a victim of sometimes undependable usage and lots of noise.

Still, the theoretical, academic edge-case here is worth considering just in case. I'll edit it to state that justification in a single environment is enough to merit discussion / consideration, but justification in both is stronger.

This is also why I de-emphasized usage statistics from a top level requirement to something that is more support.
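For what it's worth, here is a rough sketch of the sample-size point about tournament usage being noisy. It uses a plain normal-approximation interval around an observed usage rate; the function name `usage_interval` and the counts below are made up for illustration and are not real usage data.

```python
import math


def usage_interval(times_used, total_teams, z=1.96):
    """Return (rate, low, high) for an observed usage rate.

    Uses the simple normal approximation to a binomial proportion,
    which is only a rough guide, especially for small samples.
    """
    if total_teams <= 0:
        raise ValueError("total_teams must be positive")
    rate = times_used / total_teams
    half_width = z * math.sqrt(rate * (1 - rate) / total_teams)
    # Clamp so the interval stays a valid proportion.
    return rate, max(0.0, rate - half_width), min(1.0, rate + half_width)


# Hypothetical counts: the same 10% usage seen in a small tournament
# sample and in a much larger ladder sample.
for label, used, total in [("tournament", 10, 100), ("ladder", 10_000, 100_000)]:
    rate, low, high = usage_interval(used, total)
    print(f"{label}: {rate:.1%} usage, plausible range {low:.1%} to {high:.1%}")
```

With the same observed rate, the smaller sample gives a much wider range, which is one way to see why tournament usage works better as supporting context than as a hard requirement.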

Aldaron · Oct 6, 2015

I changed II in goal of tiering policy to be more broad as ikarus requested. I also added a III that takes a lot of what was in subsection II D. and makes it its own point.

II.) To ensure that both our ladder and tournament crowds are catered to regarding I.)

A.) Because ladder tends to be a scene where you play many battles in a short amount of time, "skill" for the ladder emphasizes beating the overall set of threats in a general sense.
B.) Because tournament battles tend to be a scene where a well played surprise wins the match, "skill" for tournament battles emphasizes the ability to both possess creativity and deal with creativity.
C.) The previous two are not mutually exclusive, just pointed out for emphasis. We strive to create a metagame in which someone can both deal with the general set of threats and be creative while dealing with creativity (read: balancing act between diversity and centralization).
D.) For tiering change suggestions, justification can be provided for tournaments, ladder, or both. Both is preferred as it makes an argument more complete.

1.) There will very rarely be a case where a true suspect element is not a problem in both environments, so be sure to be complete in your suspect justification.
2.) If something is overwhelmingly a problem in one environment and not the other, be sure to show how it is a problem in one and try to explain why it isn't a problem in the other.
3.) We expect some differences in both environments, but if a suspect element is non-existent in one environment, it is worth delving into why this is the case and whether there is something else to look at.

III.) To ensure that actions are taken with appropriate justification.

A.) Statistics help frame the context of a discussion.

1.) Be careful with adding spin to statistics instead of just reporting them; there are countless examples of using statistics incorrectly to draw deterministic conclusions that inevitably ruin a thread. Don't do this.
2.) Usage statistics and their implications correlate most strongly with how we have defined Broken. While they can certainly provide context for uncompetitive and unhealthy, the way we have defined both means something does NOT need to be highly used to be either.

B.) Do not haphazardly and brazenly declare anything is uncompetitive or unhealthy and shoot down objective counterarguments.

1.) It will be on you to demonstrate how a particular component of skill is drastically reduced to a significantly damaging extent in spite of any potential low usage.
2.) This is NOT an easy task and suspects for uncompetitive or unhealthy will NOT be pushed through "willy nilly".

C.) We will expect and demand in-depth analysis into what particular factor(s) of skill is reduced, how the proposed suspect element is actually the cause, and why and how removing (or adding) this element will improve the metagame.
D.) If logs are provided, don't simply provide logs where the suspect element won the battle.

1.) Show that the battle was won or lost in spite of the player's mostly correct moves.
2.) If another person points out that the battler did not make optimal moves, be prepared to debate if the element's suspect nature was the cause of the loss or the player not making the best moves (or both).

Aldaron · Oct 6, 2015

Also added the following to the end of the justification section in goals:

E.) Arguments that show how skill is affected by a particular suspect element in relative terms to other elements in the metagame will be very, very highly emphasized.

1.) Relative arguments were already emphasized in defining broken, but this is referring to the "how x is it" part of the argument, where x is any of uncompetitive, broken, unhealthy.
2.) Simply stating Gothitelle is uncompetitive because it reduces player skill by limiting smart switches is not enough; show how it does so more than other elements in the metagame and how this is detrimental.

because that is very important and I'm not sure how I didn't have it in there the first time >.>

McMeghan · Oct 9, 2015

The framework has been out for a week now, and thus everyone has had enough time to read it and get familiar with it.

As a result, PR threads have been unlocked. From now on, you are expected to use parts of the framework when you argue about the presumed uncompetitiveness, brokenness or unhealthiness of a game element.

This thread will stay sticky and open should we decide to edit the framework later down the road.

Aldaron · Oct 9, 2015

official thread here:

http://www.smogon.com/forums/threads/ous-tiering-policy-framework-read-and-understand-this.3552154/
