UU cutoff step 2: Cutoff level

I think that we really need to get moving with this. I'm going with the old formula here for the following reasons:

1. Collective cutoff is the kind of thing that should perhaps be left until after stats are released, anyway.

2. Discussion seems to indicate that people care a lot more about seeing a Pokémon than having it included in a threat list.

3. Individual cutoff got an overwhelming majority in the poll.

The previous thread will remain open because it had some discussion on weighted usage stats that may not be appropriate here, and I did say that the poll wasn't binding. However, I really think that the individual cutoff should be discussed soon.

There have been concerns that the cutoff level shouldn't be decided by a poll, so I won't do that. Instead, I'll ask what I think is the most important question about each Pokémon, the one that gets to the heart of the matter:

Does a "frequent battler" actually see [Pokémon] over [time interval]?

Now, the obvious problem to solve first is how to define a "frequent battler". I'm fairly certain that most of PR battles "frequently", and I could ask people here how often they battle in, say, a day, but "PR" is hardly an objective standard for a "frequent battler". On the other hand, including all battlers is absurd: from a quick look at PO's usage stats and ranking board (I know, lol PO, but still), I doubt that the "average" account gets even one battle a day.

The reason I ask this is that I often see people saying that the cutoff should be higher, probably because they don't see certain Pokémon in OU. I certainly very rarely saw Electivire in Gen IV OU, though maybe people with worse ratings did (I guess that's a different matter).

I think that a day is the most sensible time interval just because we base our cyclic routines on days. A month is another plausible option since we seem to be planning to have usage stats by month.

Quoting myself here:
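Let u be the usage of a Pokémon, i.e. the fraction of teams that carry it. Then the probability that it won't appear in a random selection of T teams is (1 - u)^T. We want this to equal 1 - x when u = C, where x is the chance of seeing the Pokémon at least once and C is the cutoff usage, so C = 1 - (1 - x)^(1/T).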
x = 0.5 is good for objectivity, but a case could be made for x = 0.95, since 95% is very often used as a benchmark of being certain enough.
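For anyone who wants the algebra spelled out, here is a short sketch of the step from the "never seen" probability to the cutoff. It assumes that each of the T teams is an independent draw and that x is the desired chance of seeing the Pokémon at least once; nothing else is added to the formula quoted above.

```latex
\begin{align*}
  P(\text{not seen in } T \text{ teams}) &= (1 - u)^T \\
  (1 - C)^T &= 1 - x \\
  1 - C &= (1 - x)^{1/T} \\
  C &= 1 - (1 - x)^{1/T}
\end{align*}
```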

I suppose I'll use myself as an example. During Round 3, I battled 212 times (T = 212). Plugging x = 0.5 into this gives 0.33%, while x = 0.95 gives 1.40%. Basically, if a Pokémon clocked in at 1.4% usage, I probably saw it. I averaged about 6 battles a day (T = 6) (rounded up to account for missed days). x = 0.5 gives 10.91%, while x = 0.95 gives 39.30%. Interestingly, this means that I'm never really certain of seeing any given Pokémon every day, assuming no Pokémon get that absurd an amount of usage. There are really a lot of ways to go about this; T might not end up being 6 or 212 or whatever.
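If anyone wants to try other values of T and x, the formula is easy to script. This is only a minimal sketch: the function name `usage_cutoff` is made up for illustration, and it assumes one independently drawn opposing team per battle, which is the same assumption the formula already makes.

```python
def usage_cutoff(x: float, T: int) -> float:
    """Usage C at which a Pokémon is seen at least once in T battles
    with probability x, i.e. C = 1 - (1 - x)^(1/T).

    Assumes each battle shows one opposing team drawn independently.
    """
    if not 0.0 < x < 1.0:
        raise ValueError("x must be strictly between 0 and 1")
    if T < 1:
        raise ValueError("T must be at least 1")
    return 1.0 - (1.0 - x) ** (1.0 / T)


if __name__ == "__main__":
    # T = 212 is the whole round, T = 6 is the per-day average from the post.
    for T in (212, 6):
        for x in (0.5, 0.95):
            print(f"T = {T:>3}, x = {x:.2f}: C = {usage_cutoff(x, T):.2%}")
```

Running it gives 0.33% and 1.40% for T = 212, and 10.91% and 39.30% for T = 6, matching the figures in the post.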

The problem that I have with T = 20 at the moment is that I doubt that even most "frequent battlers" manage 20 battles in a day. I think that this is ultimately the problem that others have with T = 20 as well.

Well, that's my rambling on this subject. It's less organized than I would have liked, but it says what I think is important for determining this "step". (BTW, if you ever heard me talking about gathering stats on unique IPs, this is why.) I really hope that people focus on determining how to get objective answers to the questions that I've asked, and not just say, "I think T should be 10," or something similar.
 

Seven Deadly Sins

~hallelujah~
is a Site Content Manager Alumnus, a Forum Moderator Alumnus, a Tiering Contributor Alumnus, a Top Contributor Alumnus, a Top Smogon Media Contributor Alumnus, a Battle Simulator Moderator Alumnus
I don't think that's necessary simply because if we have stats at the time, there's a significant chance of bias in selecting a cutoff level that excludes "pokemon people think should be uu" and increases the chance of subjectivity.

The formula for "What is OU" should be as objective as possible and shouldn't have anything to do with stats.

However, I will say this. In Generation 4, even with all the complicated formulas and shit, the UU cutoff always hovered right around #50. As a result, I'd like to discuss something unconventional but something that also makes sense in terms of simplification.

Proposed Definition said:
OU is the 50 most used Pokemon in the first non-Ubers tier.
It may not seem as scientific or as "mathy" as other options, but I'm not entirely convinced that "mathy" is what's needed here. Even with all the formulae and shit, it took a bunch of calculating just to get a cutoff that was essentially "#50" in the end. I think we can do away with the fancy formulae for trying to figure out "What is overused" and just stick with a hard cutoff number.
 
Although I wouldn't mind a hard "Top N" cutoff that much, the main problem that I have with it is, "Why 50?" I've seen comments about having that number lowered, and I kind of agree with them. But why? I think that some of us would like to answer that question instead of simply coming up with a measurement to use. (I also think that it's borderline dangerous just to accept a measurement, but that's another matter.) A possible compromise is to run a formula once and then fix whatever number of OU Pokémon comes up.

I mean, I agree with the general sentiment that we shouldn't need a program, or anything beyond the usage numbers, to determine the cutoff for us, but X-Act's formula isn't *that* complicated and only uses the usage numbers.

Also, a contribution from Zarator:
Code:
041451 <Zarator> hey cape, I read your "UU cutoff step 2" thread, and while I am not a member of the PR forum, I'd like to contribute. What I don't understand, though, is the kind of answer you seek. What are you asking besides which number T should be? Which kind of arguments and/or experiences are you looking for?
041952 <capefeather> I was hoping to find a way to determine how often a "frequent battler" battles... eventually
042109 <Zarator> With the help of usage stats, you could perhaps determine how many battles a Smogon player does on average every day/month, or maybe weight that number with rating
042131 <Zarator> for example, seeing how many battles on average plays a person with a rating above a certain cutoff
042144 <Zarator> can't think of other objective measures right now, though
042402 <capefeather> a "Smogon player" can be just as infrequent a battler
042502 <Zarator> I meant a Smogon PO player, not necessarily a member of the Smogon forum
042505 <Zarator> sorry for the mixup
042528 <capefeather> yeah maybe
042545 <capefeather> for all we know, "rated battler" could be enough...
042600 <Zarator> umm, I guess
042621 <Zarator> either way, though, you'll probably need the help of R_D to sort out that number in an objective way
042644 <Zarator> I mean, if you're going with "average number of matches played by a rated battler in a unit of time"
I'll admit that, in a way, I'm fishing to have stats like users' battling frequency extracted from the raw stats to help answer this question. We've collected battling frequency stats before. (Where is that thread, anyway? I should go find it.)
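To make "average number of matches played by a rated battler in a unit of time" concrete, here is a rough sketch of how it might be pulled from a battle log. Everything in it is an assumption for illustration: the log format (one pair of player names per rated battle), the function names, and the sample data. It is not how the actual stats are stored or processed.

```python
from collections import Counter
from statistics import mean, median


def battles_per_day(battle_log, days):
    """Map each player to their average number of battles per day.

    battle_log: iterable of (player_a, player_b) pairs, one per rated battle
                (hypothetical format).
    days:       length of the period the log covers, in days.
    """
    counts = Counter()
    for player_a, player_b in battle_log:
        counts[player_a] += 1
        counts[player_b] += 1
    return {player: n / days for player, n in counts.items()}


def summarize(per_day):
    """Return (mean, median) battles per day across all players."""
    rates = list(per_day.values())
    return mean(rates), median(rates)


# Hypothetical two-day log: four battles, all involving "alice".
log = [("alice", "bob"), ("alice", "carol"), ("alice", "bob"), ("alice", "dave")]
rates = battles_per_day(log, days=2)
average, middle = summarize(rates)
print(rates, average, middle)
```

Reporting the median next to the mean is a deliberate choice here, since a few very active accounts can pull the mean well above what a typical rated battler does.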
 

Seven Deadly Sins

I'm not wedded to the number 50, I just use it because that's the number that came out of the Generation 4 formula. If we're going to use a formula, I don't see why we should change it, but I think that just using 50 (or 40 if people would rather have that, or some number in between) is a lot simpler and generates a basically identical number.

I think in the end, we get very little benefit from debating what counts as a "frequent battler" or whether we want to use stuff that's "frequently seen". A hard cutoff at a set number (doesn't have to be 50, that's just how Gen 4 worked out) is much more objective and a lot simpler than a formula that's hard to decipher and vague on top of that.
 
I don't think that's necessary simply because if we have stats at the time, there's a significant chance of bias in selecting a cutoff level that excludes "pokemon people think should be uu" and increases the chance of subjectivity.

The formula for "What is OU" should be as objective as possible and shouldn't have anything to do with stats.
What if the wrong number is chosen for lack of knowledge of the reality? If that happens, and the number doesn't represent what's OU in the stats, will you keep insisting that we should stick to the number decided beforehand in order to be 'objective'? Even if it would cause a badly fitting cutoff for all months to come?

Moreover, a cutoff adapted to the most recent month is likely to fit the following months better than a purely theoretical cutoff for what represents OU.

I mean, I'm fine with the theory and all, but you should at least be able to validate it on a month of stats before setting it in stone as the rule.
 

Seven Deadly Sins

"why"

There is no "right" and "wrong" here, and implying that somehow we'll "get it wrong" is a false dilemma based on the assumption that there's an incorrect UU. There is no incorrect UU or OU, and implying that there is turns the argument into a subjective one ("what should OU look like") rather than an objective one ("how big should OU be").

I'm fine with waiting for stats and taking a look at them to try and determine a cutoff, but I just feel that if, say, Electivire and Weavile and Dusknoir end up being in the 40-50 range, people might be inclined to choose one over the other simply because they "feel those Pokemon should be UU".

I'm looking for two things here: objectivity and simplicity. The only reason that I chose 50 was because that's how last gen worked out (even when we used a formula, it still hovered in the 49-50 range), and it doesn't seem too out of the ordinary that 50 would be equally representative of OU in this new generation. Introducing stats first and deciding based on that adds a layer of subjectivity that I'm not comfortable with. Simplicity-wise, I think it looks a lot better for a ruleset to say "Any non-Uber Pokemon below #50 usage in OU is allowed in UU" than "Any non-Uber Pokemon whose usage falls below the number produced by this formula taking into account..." at the end of the day, considering that both incarnations of UU are not only equally valid but also likely to be incredibly similar.
 
What if the wrong number is chosen for lack of knowledge of the reality? If that happens, and the number doesn't represent what's OU in the stats, will you keep insisting that we should stick to the number decided beforehand in order to be 'objective'? Even if it would cause a badly fitting cutoff for all months to come?
That's not going to happen if a cutoff is made that gets to the heart of the matter. The probability that I'll see a Pokémon at x% usage is not going to depend on the usage stats. I mean, I get what you're saying about possibly committing to a cutoff that may give a result that's different from what we all expected, but barring an anomaly in player battling habits, I don't think that it would be that bad.
 
I like the simplification of SDS' method, but I'm not sure that simply picking a set number would be as efficient as having a simple % cutoff like last generation. Like we said, we can choose precisely what that number is once we have some stats to work with, but I think percents are a lot better for determining "how often something is used". A percent usage measures the 'mon itself, whereas a simple "Top 50" cutoff would, indirectly, actually be comparing the 'mon to all the others. The reason I like percents more is that we don't necessarily need to knock down "Heracross" or "Umbreon" just because everybody starts using "Venusaur" or "Milotic" (I just used plausible examples from 4th gen). It's a measure of how used that 'mon is, not how used every other one is. It seems to make more sense for defining a tier of "Underused" Pokemon.
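Here is a toy sketch of that difference. The usage numbers, the 3.4% threshold, the N = 2 list size, and the function names `top_n` and `above_cutoff` are all made up for illustration; none of them come from real stats.

```python
def top_n(usage, n):
    """Names of the n most used Pokémon (a rank-based cutoff)."""
    ranked = sorted(usage, key=usage.get, reverse=True)
    return set(ranked[:n])


def above_cutoff(usage, cutoff):
    """Names of every Pokémon whose own usage meets the cutoff (a % cutoff)."""
    return {name for name, u in usage.items() if u >= cutoff}


# Hypothetical usage fractions. Heracross and Umbreon do not change at all;
# only Venusaur and Milotic become more popular.
before = {"Heracross": 0.05, "Umbreon": 0.04, "Venusaur": 0.02, "Milotic": 0.01}
after = {"Heracross": 0.05, "Umbreon": 0.04, "Venusaur": 0.07, "Milotic": 0.06}

print(top_n(before, 2), top_n(after, 2))
print(above_cutoff(before, 0.034), above_cutoff(after, 0.034))
```

With the rank-based rule, Heracross and Umbreon drop out once the other two overtake them, even though their own usage is unchanged. With the percentage rule they stay in, and the list simply grows to four.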
 
"why"

There is no "right" and "wrong" here, and implying that somehow we'll "get it wrong" is a false dilemma based on the assumption that there's an incorrect UU. There is no incorrect UU or OU, and implying that there is turns the argument into a subjective one ("what should OU look like") rather than an objective one ("how big should OU be").
Yes, there is. When you ladder a lot, when you think of X pokémon, you can easily say UU or OU from what comes to your mind for most of them, even though there aren't any usage stats produced yet. Something resembling the threat list for the OU tier would make a good OU.

Anyway, about the number 50: besides this gen being a lot different from gen 4, Rising Dusk added some kind of weighting to the stats in order to remove newbie teams, so it's bound to change a few things. You can't base the number on gen 4; you should base it on gen 5 (which is why I want at least a snapshot of usage stats). There's also the fact, from what I gathered in the previous UU topics, that people felt that last gen's OU was too big.

Also, if a metagame becomes centralised (with fewer gimmicks thanks to Rising Dusk's weighting), the Pokémon near the 50 limit may be seen very little, making them unfit for OU. I feel that a % cutoff, which at least guarantees that you see the OU Pokémon at a certain rate, is better than a top X Pokémon cutoff.
 

Seven Deadly Sins

Yes, there is. When you ladder a lot, when you think of X pokémon, you can easily say UU or OU from what comes to your mind for most of them, even though there aren't any usage stats produced yet. Something resembling the threat list for the OU tier would make a good OU.

Anyway, about the number 50: besides this gen being a lot different from gen 4, Rising Dusk added some kind of weighting to the stats in order to remove newbie teams, so it's bound to change a few things. You can't base the number on gen 4; you should base it on gen 5 (which is why I want at least a snapshot of usage stats). There's also the fact, from what I gathered in the previous UU topics, that people felt that last gen's OU was too big.

Also, if a metagame becomes centralised (with fewer gimmicks thanks to Rising Dusk's weighting), the Pokémon near the 50 limit may be seen very little, making them unfit for OU. I feel that a % cutoff, which at least guarantees that you see the OU Pokémon at a certain rate, is better than a top X Pokémon cutoff.
That's where you're still wrong. There is no "wrong" metagame, and if a player says "well thats bs because ive never seen one of those", it counts for very little, because the stats don't lie. They may not have seen it, but some people obviously have, because the Pokemon had to be used a non-trivial amount to make the top 50.

That said, I'd be fine with lowering the cutoff to 40. I get the idea that people felt OU was too big, and at times, I felt the same thing. But to say that a metagame with a big OU is somehow "wrong" is incorrect.
 

Oglemi

Borf
is a Top Contributor, a Tournament Director Alumnus, a Site Content Manager Alumnus, a Community Contributor Alumnus, a Researcher Alumnus, a Tiering Contributor Alumnus, a Top Smogon Media Contributor Alumnus, an Administrator Alumnus, a Top Dedicated Tournament Host Alumnus
I agree whole-heartedly with everything SDS said.

If people don't want something to be OU, they can, you know, stop using it, or at least tell everyone else how terrible it is and convince them that they should replace it with something else on their team.

I think the cutoff should still be 50. While the metagame felt "big" last time, we did add a bunch of new shit, and if anything I feel 50 will be kind of cutting it a bit short this time.

But honestly I don't care either way. We just need to be, as SDS said, as objective as possible with our decision.
 

JabbaTheGriffin

Stormblessed
is a Top Tutor Alumnus, a Senior Staff Member Alumnus, a Top Tiering Contributor Alumnus, a Top Contributor Alumnus, a Smogon Media Contributor Alumnus, a Battle Simulator Moderator Alumnus
I don't think we should be considering a cutoff point at this time.

The way I see it, there are two viable options: T=20 and T=10. If we had to choose right now, I'd say to stick with T=20 like we did in gen 4. The metagame is pretty broad right now, so I'd say having an expansive OU would be a good representation of what is actually used (which I think T=20 failed at in gen 4, actually). However, if weather bans undergo changes in the next few months, that's going to drastically shift the centralization in the metagame, which would more than likely make T=10 a better representation.

So my recommendation for now is to wait until at least one more test is concluded, and if there are very few changes (say Latios and D-S both getting their 2nd simple majority, with nothing else banned), I'd recommend going with T=20. If there are bigger shifts, I'd wait even longer. However, if we absolutely want to go through with this now, I'd say stick with T=20.

Also please, please, please for the love of god no arbitrary static numbers.
 

Eo Ut Mortus

Elodin Smells
is a Programmer, a Tournament Director Alumnus, a Site Content Manager Alumnus, a Senior Staff Member Alumnus, a Community Contributor Alumnus, a Top Tiering Contributor Alumnus, a Top Contributor Alumnus, a Smogon Media Contributor Alumnus, a Battle Simulator Moderator Alumnus, a Past SCL Champion, a Past WCoP Champion
I don't think it's possible to create the most objective definition of OU possible while at the same time justifying our initial OU banlist. That being said, choosing a cut-off after viewing the stats would probably be the most practical option, and I support this, but if we're striving for objectivity regardless, choosing a percentage-based cut-off point would be a far more accurate measure of "Overused" than would an, as Jabba put it, "arbitrary static number."
 
