# Ubers suspect testing aftermath

#### Disaster Area

##### formerly Piexplode
guys can we end this now? it's really boring, and you guys aren't even providing tl;dr's, when we all basically know that suspect tests aren't a good system in ubers AND that every decision from the tier leaders and site staff from now on will already take that into account. o_o

#### HackerKing

I'll be honest: I don't play Ubers. I play OU.

But I've still been following the Gengarite and Shadow Tag suspect tests, and I think a lot of other OU players have been too. The reason is that this was a historic moment: the first time the Ubers tier had a suspect test to decide whether a pokemon (or group of pokemon) should be banned from Ubers. The waves made by that decision splashed way outside of the Smogon Ubers forum. For better or for worse, lots of people who don't even play Ubers have been reading the voting thread and watching the whole test.

Anyway, that's just my 2 cents.

#### WebBowser

Piexplode The people who are still talking are mostly people trying to decide whether or not suspect tests are still a good idea. If you find this boring then please feel free to click on the "Unwatch Thread" button at the top right corner of the page. Nobody is forcing you to return here.

xJownage I hope this doesn't come across as condescending, but I really think you should take a very careful look at the COIL formula: C = 40 * GXE * 2^(-B/N)

Most importantly, this term: 2^(-B/N)

This is a negative exponent. This means that for positive B and N, this value will always be less than one. The higher B (the arbitrary, preset constant) is, the closer this value gets to 0, lowering COIL. The higher N (number of battles) is, the closer this term gets to 1. No matter how high N is, COIL will never exceed 40 * GXE. In fact, one could even use B = 0 to have COIL equal exactly 40 * GXE, completely negating the number of battles. COIL is capable of representing anything GXE can represent, as well as using number of battles as a term. If it takes too long to get to a high enough COIL, lower B. If win streaks are resulting in reqs too often, raise B. If unskilled players are hitting reqs through raw determination, raise the COIL requirement. If nobody is hitting reqs, lower the COIL requirement. Any problem you have with COIL can be solved by tweaking some constants. If it can't be solved by tweaking constants, then GXE can't give you what you want either, nor will any other ladder rating system (probably).
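For anyone who would rather see numbers than read algebra, here is a minimal Python sketch of the formula as quoted above, C = 40 * GXE * 2^(-B/N). The function name `coil` and the sample values for GXE, B and N are illustrative assumptions, not the settings of any actual suspect test.

```python
def coil(gxe: float, battles: int, b: float) -> float:
    """COIL as quoted in this thread: C = 40 * GXE * 2^(-B/N).

    gxe     -- GXE as a percentage (e.g. 80.0)
    battles -- N, the number of battles played (must be positive)
    b       -- B, the preset constant chosen for the test
    """
    if battles <= 0:
        raise ValueError("battles must be positive")
    return 40.0 * gxe * 2.0 ** (-b / battles)


if __name__ == "__main__":
    gxe, b = 80.0, 20.0  # hypothetical values, chosen only for illustration
    for n in (10, 50, 200, 1000):
        # The result climbs toward 40 * GXE as N grows but never passes it.
        print(n, round(coil(gxe, n, b), 1))
    # With B = 0 the battle count drops out and COIL is exactly 40 * GXE.
    print(coil(gxe, 10, 0.0))
```

Raising `b` in the loop makes every value smaller for the same number of battles, which is the knob described above for making lucky short streaks count for less.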

Moderator
lol piex owned, can we pls lower our tempers here and be a bit more calm.
What piex meant was that it has been decided that suspects won't happen in the foreseeable future, so it's pointless to discuss how they should be done. Also, you post these long, huge, gigantorous walls of text about the same thing, yet don't provide tl;dr's, which is pretty irritating as well.

#### WebBowser

> lol piex owned, can we pls lower our tempers here and be a bit more calm.
> What piex meant was that it has been decided that suspects won't happen in the foreseeable future, so it's pointless to discuss how they should be done. Also, you post these long, huge, gigantorous walls of text about the same thing, yet don't provide tl;dr's, which is pretty irritating as well.

1. It's not piex's job nor place to decide what discussion is and is not pointless. Even if it is the case that there will not be any suspects in the foreseeable future, that does not make this discussion "pointless". IMHO, informed debate is inherently useful, even if it is about hypothetical stuff. Besides, if this topic was pointless, a mod would've locked this thread ages ago, or at least posted something to redirect discussion to something not pointless.

2. The posts are long because the problem is complicated. Complicated problems have complicated answers. Complicated answers take longer to justify, therefore, my posts are long (not to mention that I am by nature rather long-winded, that's not helping matters).

3. I believe that my response to Piex was perfectly reasonable for someone who essentially walked into an active debate and contributed absolutely nothing other than "this topic is pointless and the posts are too long". I am fine with people not wanting to take part in this discussion, I'm even fine with people saying my posts are too long, but I am emphatically not fine with anyone who tries to quash informed debate in this manner.

#### ApplepieFTW

yeah ^

the only purpose coil had, is making sure you cared enough to sit down and make reqs. they weren't meant to filter based on skill (just on a very basic level) in the first place.

#### WebBowser

Piexplode I love how I have made two posts explaining the math behind COIL and why you are wrong, link to Antar's post included, and yet I'm the one being called an idiot.

Frankly, I have little to say to someone who calls people idiots without bothering to read their arguments while at the same time using flawed logic to support their own. I have attempted to remain civil with you and I would like you to show me similar respect.

ApplepieFTW Actually, COIL's main purpose was so that one could not simply get a lucky win streak to obtain a high enough GXE in 10 or 20 battles to obtain reqs. COIL cannot exceed 40 * GXE, no matter how many battles you play! This is a mathematical fact that anyone with an understanding of high school algebra and 20 seconds of actually looking at the equation can figure out.

#### Disaster Area

##### formerly Piexplode
> Piexplode I love how I have made two posts explaining the math behind COIL and why you are wrong, link to Antar's post included, and yet I'm the one being called an idiot.

Why am I wrong? I'm not making any claims related to the mathematics of it.

> Frankly, I have little to say to someone who calls people idiots without bothering to read their arguments while at the same time using flawed logic to support their own. I have attempted to remain civil with you and I would like you to show me similar respect.
>
> ApplepieFTW Actually, COIL's main purpose was so that one could not simply get a lucky win streak to obtain a high enough GXE in 10 or 20 battles to obtain reqs. COIL cannot exceed 40 * GXE, no matter how many battles you play!

who cares?

> This is a mathematical fact that anyone with an understanding of high school algebra and 20 seconds of actually looking at the equation can figure out.

No, ApplepieFTW is correct. When he is referring to the purpose of COIL he is talking within the context of this suspect test, and not within the frames of comparison between different ranking systems. THIS HAS NOTHING TO DO WITH THE PURPOSE OF THE MATHEMATICS WHATSOFUCKINGEVER. THIS IS SIMPLY TO DO WITH THE PURPOSE OF HAVING A SYSTEM FOR GETTING REQUIREMENTS. I DO NOT GIVE A FLYING DUCKSHIT EXACTLY WHICH RATING SYSTEM WE USE. THE WHOLE MOTHERFUCKING POINT IS THAT DOING LADDERING FOR REQS IS ONLY MEANT TO SHOW THAT YOU CARE ENOUGH ABOUT THE METAGAME TO INVEST SOME TIME INTO IT. IT HAS PROVEN ITSELF IN THE PAST THAT WHATEVER SYSTEM YOU'RE USING, SUSPECT REQUIREMENTS ARE AN IMPERFECT WAY TO SHOW WHO HAS THE UNDERSTANDING TO MAKE AN INFORMED CHOICE. NOT EVERYONE HAS THE SAME AMOUNT OF TIME TO BURN LADDERING. With respect to how these work, all that really matters in terms of the suspect system is that there is a sensibly high minimum amount of games needed to secure 'voting requirements' to show basic dedication to the tier. That is all.

#### Karxrida

I love how this is supposed to be a healing thread, yet people are being assholes to each other over math.

#### Disaster Area

##### formerly Piexplode
It's more philosophy than mathematics I think. If it were maths then there'd be 1 right answer that could logically be shown to be agreed upon.

#### Zebstrika

It seems like we're just scrapping suspect tests altogether here because one went badly... Some of these suggestions were things other people have said, but here are a few improvements I think would really help if we did go back to suspect testing.

1. Transparency in votes. I know they're all revealed in the voting thread now (although it's not shown which were originally deleted, but at this point it doesn't matter). The explanation given in the voting thread is that it avoids public shaming, but honestly I have never heard of that happening to anyone because of a suspect vote. If you want to ban something and potentially change a tier, you should be confident enough about your opinion that you don't believe anyone will be able to ridicule it. If people are getting mistreated due to their votes, we have rules so the mods can deal with them also. Another option, if someone doesn't want their paragraph shown, allow them to write a little note in their vote saying to edit it out when the votes go public. Whatever, there are a ton of options to deal with this one.

2. The mods reviewing the paragraphs should have opposite biases, if they have any. Honestly this should be common sense. Even if the mods have the best intentions, the biases can definitely affect those posts that are "on the line" without them realizing it. Or possibly disqualifying paragraphs because they disagree with the argument, even when it's not flawed.

3. Supermajority (60 or 67%) required to ban. This is the tier that bans as little as possible, we should be confident as a community that we want something banned if we are to ban it.

4. Guidelines for paragraphs posted before the vote, since clearly they do exist. There was a post at the end of page 1 that probably put this better than I can. I just don't think the mods should have a right to reject someone's paragraph when they refuse to tell them what they even want until it's too late, and we can't go back and correct votes either. Also, it's kind of unclear, even now, about what the mods do with paragraphs that have some good arguments and some flawed arguments.

5. Our real goal here is to make sure everyone enjoys the game as much as possible, even bad players, they're people too, we were all one at some point. There are a lot of people in this thread that will disagree with me here, but imo if a bad player wants to play 400 games to make reqs, go ahead and let them. It's clear they really care about the tier to spend that kind of time and dedication. They're active, and they're the ones on the ladder the most to enjoy their decision on the tier, if they have any effect on it.

#### xJownage

##### Even pendulums swing both ways
> It seems like we're just scrapping suspect tests altogether here because one went badly... Some of these suggestions were things other people have said, but here are a few improvements I think would really help if we did go back to suspect testing.

> 1. Transparency in votes. I know they're all revealed in the voting thread now (although it's not shown which were originally deleted, but at this point it doesn't matter). The explanation given in the voting thread is that it avoids public shaming, but honestly I have never heard of that happening to anyone because of a suspect vote. If you want to ban something and potentially change a tier, you should be confident enough about your opinion that you don't believe anyone will be able to ridicule it. If people are getting mistreated due to their votes, we have rules so the mods can deal with them also. Another option, if someone doesn't want their paragraph shown, allow them to write a little note in their vote saying to edit it out when the votes go public. Whatever, there are a ton of options to deal with this one.

Another idea is to release all the votes with names censored, which performs the same task without having any chance of "public shaming".

> 2. The mods reviewing the paragraphs should have opposite biases, if they have any. Honestly this should be common sense. Even if the mods have the best intentions, the biases can definitely affect those posts that are "on the line" without them realizing it. Or possibly disqualifying paragraphs because they disagree with the argument, even when it's not flawed.

Pro-Ban mods should only look at Pro-Ban votes, and vice versa. Like you said, this is just common sense.

> 3. Supermajority (60 or 67%) required to ban. This is the tier that bans as little as possible, we should be confident as a community that we want something banned if we are to ban it.

I don't know whose idea it was to not require a supermajority in the first place, but whoever it was wasn't being very smart. I pointed this out on the suspect thread and got my post deleted because I wasn't pro-ban (there was obvious bias: good anti-ban posts got deleted while pro-ban shitposts were kept).

> 4. Guidelines for paragraphs posted before the vote, since clearly they do exist. There was a post at the end of page 1 that probably put this better than I can. I just don't think the mods should have a right to reject someone's paragraph when they refuse to tell them what they even want until it's too late, and we can't go back and correct votes either. Also, it's kind of unclear, even now, about what the mods do with paragraphs that have some good arguments and some flawed arguments.

This was my whole issue with people having the right to reject a vote. Your criteria for a counted vote have to be constant; if it is human-controlled it will be subjective basically every time.

> 5. Our real goal here is to make sure everyone enjoys the game as much as possible, even bad players, they're people too, we were all one at some point. There are a lot of people in this thread that will disagree with me here, but imo if a bad player wants to play 400 games to make reqs, go ahead and let them. It's clear they really care about the tier to spend that kind of time and dedication. They're active, and they're the ones on the ladder the most to enjoy their decision on the tier, if they have any effect on it.

Who says that only "intelligent players" should decide on a tier? Why shouldn't it be the most "dedicated" players instead? This just isn't something that is well treated in the entire smogon community, and I am unwilling to try to touch on this in a community of "intelligent players" rather than "dedicated players".

#### WebBowser

> It's more philosophy than mathematics I think. If it were maths then there'd be 1 right answer that could logically be shown to be agreed upon.

This is largely a philosophical argument, agreed. However, a lot of the reasoning you have given in your posts is based on a flawed understanding of the COIL system and possibly GXE. Most notably this paragraph in one of your previous posts:

> if you don't understand the maths behind it go f**king find Antar's thread on it it's not hard to find, and the simple fact is increasing voting requirements for this increase player annoyance at having to get reqs, and drains more of their time (and some players simply won't have the time past a certain point, especially with tournaments frequently around too), but it won't increase the quality of the votes by a notable margin. That is the truth of it, now stop the f*cking arguements and the tl;dr's, ...

The bolded statement is outright false, and I have already explained why twice. To briefly reiterate to save yourself from having to actually read my posts, the part that the number of battles actually influences is a negative exponent that maxes out at 1 given an infinite number of battles. This term cannot increase COIL past a certain value for a given GXE, no matter how many battles you play. To reduce the effect this term has, you can adjust a constant in the formula, which lowers the influence that the number of battles has on COIL while at the same time increasing the influence of GXE (which is essentially a glorified W/L ratio). If you want a better explanation, go look up Antar's thread on the subject, or just read my explanation.

The fact that your arguments are based on a flawed understanding of a mathematical formula means that math does indeed have something to do with this debate, and no amount of immature swearing and name calling on your part is going to change that.

Zebstrika, xJownage: I agree with a lot of that, however I still believe that we need some way to ensure that players voting actually have an idea of what the suspect does and why it's being suspected in the first place. The paragraphs idea tried to address this, but failed largely due to biases and poor explanations of the requirements. For this reason I suggested a page or two ago to try replacing the paragraph requirement with a replay using the suspect. The requirements can even be made totally objective too: just submit your vote alongside a replay showing the suspect on your team at a certain Elo, in which the suspect uses at least one move.

#### Zebstrika

> I agree with a lot of that, however I still believe that we need some way to ensure that players voting actually have an idea of what the suspect does and why it's being suspected in the first place. The paragraphs idea tried to address this, but failed largely due to biases and poor explanations of the requirements.

We could also make the paragraphs just do exactly what you said: explain what the suspect does and why it's being suspected. A paragraph that demonstrates understanding, rather than justifying their vote. Since the paragraphs aren't necessarily arguing for one side or the other, the voters can first submit their paragraphs as a part of making reqs, and there shouldn't be bias since it's not actually tied to a vote yet.

The other flaw you pointed out was poor explanations of the requirements, which has an obvious solution: actually explain them.

#### Sweep

Ya I agree with webbowser, enough pointless swearing (even if the cusses are not really personal attacks), you're coming across as overly aggressive. I really don't want to delete more posts, but I also want to make Muk proud and effectively moderate the forums so calm down. Also, we're never doing another suspect test in Ubers. Fireburn has been clear about this, so I don't see the point in discussing them.

#### WebBowser

> Also, we're never doing another suspect test in Ubers. Fireburn has been clear about this, so I don't see the point in discussing them.

Is that so? I personally find the decision to be a tad hasty and premature, but if you want us to stop discussing suspects here, then so be it.

#### Disaster Area

##### formerly Piexplode
> The bolded statement is outright false, and I have already explained why twice. To briefly reiterate to save yourself from having to actually read my posts, the part that the number of battles actually influences is a negative exponent that maxes out at 1 given an infinite number of battles. This term cannot increase COIL past a certain value for a given GXE, no matter how many battles you play. To reduce the effect this term has, you can adjust a constant in the formula, which lowers the influence that the number of battles has on COIL while at the same time increasing the influence of GXE (which is essentially a glorified W/L ratio). If you want a better explanation, go look up Antar's thread on the subject, or just read my explanation.
>
> The fact that your arguments are based on a flawed understanding of a mathematical formula means that math does indeed have something to do with this debate, and no amount of immature swearing and name calling on your part is going to change that.

I perfectly well understand that COIL is calculated off of W/L ratio, and that you cannot get past a certain COIL value with a given GXE (60 GXE was the point in this suspect test where you'd have to play 'infinite' games to get 2400 COIL). It is fucking irrelevant.
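The 60 GXE figure follows from the formula quoted earlier in the thread, C = 40 * GXE * 2^(-B/N): the ceiling is 40 * GXE, and 40 * 60 = 2400, so at exactly 60 GXE the 2400 target is only approached, never reached. As a rough sketch, solving for N gives N = B / log2(40 * GXE / C). The helper name `battles_needed` and the value B = 20 below are assumptions for illustration only; the thread does not state which B was used.

```python
import math


def battles_needed(gxe: float, target: float, b: float) -> float:
    """Smallest N with 40 * GXE * 2^(-B/N) >= target, or math.inf if unreachable.

    Derived by solving C = 40 * GXE * 2^(-B/N) for N.
    """
    if target <= 0:
        raise ValueError("target must be positive")
    ceiling = 40.0 * gxe  # COIL can approach this value but never exceed it
    if target >= ceiling:
        return math.inf  # no finite number of battles is enough
    if b <= 0:
        return 1.0  # with B = 0 a single battle already gives 40 * GXE
    n = b / math.log2(ceiling / target)
    return float(max(1, math.ceil(n)))


if __name__ == "__main__":
    b = 20.0  # hypothetical constant, not taken from the actual test
    for gxe in (60.0, 65.0, 80.0, 90.0):
        print(gxe, battles_needed(gxe, 2400.0, b))
```

The required battle count blows up as GXE drops toward the ceiling point, which is the time cost being argued about here.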

There are players who don't understand the ubers philosophy who just stole an RMT or archive team with around 80 GXE (we have some true legendary players like Conflict (go look him up) who had their vote consistently rejected because they misunderstood the purpose of the test and the philosophy of the tier) and who may well have the time to get some dumb 3000 COIL or whatever. Understanding the philosophy and understanding the tier are two importantly different concepts. Laddering a certain amount shows dedication to the tier enough to say that the player cares about the suspect test. Making laddering requirements may weed out some poorer players, but if you kept ramping up the COIL requirements for the sake of argument, you'd find it's perfectly likely that competent players who have an acceptable understanding of ubers philosophy cannot make requirements (usually due to time constraints), whilst someone else might be a highly experienced competitive battler who can steal a good team and make it destroy the ladder and have sufficient time to reach requirements even if the requirements are exceptionally high, and yet they may well not understand the philosophy of the tier and the suspect test.

You see, the point of paragraphs isn't to weed out 'bad' players, it's meant to weed out people who don't understand the philosophy sufficiently to make a well-informed and sensible judgement, that is not ignorant of the tier philosophy or how the suspected aspects of the battle affect the metagame. This is why increasing suspect requirements would be ineffective at improving the way in which the suspect test is held, because it weeds out people in a fashion that is unsuitable for the task at hand. I hope this has made it a little more clear, and perhaps my language a little more eloquent.

#### Majickary

And how are you going to tell who "understands the philosophy of the tier and the suspect test" without making a subjective judgement? That's just opening up the potential for subconscious bias. By excluding Conflict - and idk who he is - whoever is judging the paragraphs is essentially saying that he isn't qualified to vote because he "doesn't understand the ubers philosophy" even if he outright beats so-called qualified voters in a head-to-head match. That is a position I find extremely arrogant.

Simple solution: set the COIL / GXE / whatever requirement at some clear value and let whoever makes it vote, paragraphs not necessary. If this allows bad players to vote, and if you don't want bad players to vote, then set the bar higher. If the higher bar precludes certain qualified players from voting because they don't have the time to make the reqs, too bad. Clear, objective, no bias, and nothing like what happened in the STAg vote will happen again.

#### Celticpride

> And how are you going to tell who "understands the philosophy of the tier and the suspect test" without making a subjective judgement? That's just opening up the potential for subconscious bias. By excluding Conflict - and idk who he is - whoever is judging the paragraphs is essentially saying that he isn't qualified to vote because he "doesn't understand the ubers philosophy" even if he outright beats so-called qualified voters in a head-to-head match. That is a position I find extremely arrogant.

Did you just read any of the past 118 posts? The ladder doesn't mean jack when it comes to understanding tiering philosophy. Anyone can take one of Dice's teams and wreck the ladder with it. That doesn't mean they understand why something is being suspected. You could play 1000 suspect ladder matches and still not come into a Stall vs Mega Gar/Goth scenario because the ladder was a HO filled mess. I don't understand how winning a ladder match means you know the difference between uncompetitive and broken.

#### Disaster Area

##### formerly Piexplode
yea kinda anyone who doesn't get what I already said probably won't get it however I rephrase it, but what this person above me said is correct, and please if anyone has anything interesting to say that doesn't relate to the recent tedium it'd be appreciated.

#### Majickary

> Did you just read any of the past 118 posts? The ladder doesn't mean jack when it comes to understanding tiering philosophy. Anyone can take one of Dice's teams and wreck the ladder with it. That doesn't mean they understand why something is being suspected. You could play 1000 suspect ladder matches and still not come into a Stall vs Mega Gar/Goth scenario because the ladder was a HO filled mess. I don't understand how winning a ladder match means you know the difference between uncompetitive and broken.

Not relevant. Such an objection should be raised before the reqs are publicly posted. After they are posted, to deny someone who made those voting reqs a vote is unfair. Come up with some other unambiguous, objective reqs if you think so lowly of the ladder. Make it so that only the people whose teams have been selected for archiving are allowed to vote (but then the selection process for archiving must also be made unambiguous and objective), or make it so that only the top 16 of a Ubers tournament are allowed to vote, etc - but once the reqs are made, everyone who makes the reqs should be allowed to vote. In the latter example above, it doesn't matter if someone "makes reqs" because he used Scarf Skymin and got 313480 flinches on the way to winning the tournament - he made reqs, he gets to vote, and his vote should not be disqualified.

#### PISTOLERO

##### I come to bury Caesar, not to praise him.
Why are you arguing with someone who thinks

1. Extremespeed Deoxys-S is viable

2. that LO Deoxys-A should run 252 Atk EVs because who cares about Psycho Boost

3. that Mega Scizor without Roost is good, and that it also beats Lugia without Roost but loses with it, and also that Lugia runs Calm Mind or Defog

4. that Scarf Kyogre sweeps a team that has Toxic Spikes, Extremekiller, and a faster Scarf user....

he is definitely somebody who is experienced enough in Ubers lol

> You make it sound like it's so simple ... what if Deoxys S does have Espeed? That's a dead Deoxys AND one layer of hazards anyway. Xerneas "walls" Yveltal but say you got Deoxys in against support Groudon and he's got a Yveltal and Gengar in the back, are you going to Psycho Boost anyway? Even if no Gengar, are you going to Psycho Boost anyway? Finally so what if Scizor hard walls Lugia, what are you going to do, BP it to death? Lugia can simply do things like spam Toxic until it's sure Scizor is not switching out, then Roost up to full health (if it's not running Defog / CM etc) and then switch since Scizor poses it zero threat. With Knock Off you can take off his item for permanent damage and then Uturn out. Scizor doesn't need to be at full health to check Xerneas as well since as long as whoever is in attacks Xerneas hard on the turn it sets up Scizor can revenge with BP. As for Uturn, it punishes switch ins of course, but using Uturn on the predicted switch means not using Roost. Since Space Jams is an offensive team I find I generally prefer to Uturn to keep the momentum.

wander over to http://www.smogon.com/forums/threads/sample-teams-for-starting-ubers.3511683/#post-5867923 if you don't believe me~


#### ApplepieFTW

> Why are you arguing with someone who thinks
>
> 1. Extremespeed Deoxys-S is viable
>
> 2. that LO Deoxys-A should run 252 Atk EVs because who cares about Psycho Boost
>
> 3. that Mega Scizor without Roost is good, and that it also beats Lugia without Roost but loses with it, and also that Lugia runs Calm Mind or Defog
>
> 4. that Scarf Kyogre sweeps a team that has Toxic Spikes, Extremekiller, and a faster Scarf user....
>
> he is definitely somebody who is experienced enough in Ubers lol
>
> wander over to http://www.smogon.com/forums/threads/sample-teams-for-starting-ubers.3511683/#post-5867923 if you don't believe me~

just wanted to add that espeed deo-s has a tiny niche of anti-leading deo-a, who would otherwise anti-lead you. it's not at all worth not having knock off/other utility, but it's slightly better than e.g. psycho boost deo-s.

#### xJownage

##### Even pendulums swing both ways
There will have to be a non-objective way to determine who is worthy of reqs, and who has the knowledge of the tier.

If humans judge paragraphs, it has to work like this: pro-ban reads pro-ban paragraphs, anti-ban reads anti-ban paragraphs. Even then, we will see obvious bias, unfortunately, due to human nature and the way people favor or dislike certain arguments (do you ever read something that you strongly disagree with and automatically classify it as unintelligent? That is what I am referencing).

One of the few ways I feel like this could be solved is in forcing the person to use the suspect, or constantly play against the suspect. Also, could we find some sort of way to fairly prevent "duplicate teams" on the ladder, so people will register their teams and nobody else can copy and use? This would prevent people from stealing teams and using them. Doing this could be difficult, but it may yield some results.

I brought up a "point system" last page, regarding total of ladder, tournament battles, etc. relating proportionally to create a "battler value." Tournament battles would have to be weighted by the skill of the players in the tourney (ladder ranks would be decent to weigh the power of the tourney) as well as the number of players. There would be a way to do this but it would be somewhat complicated. In all honesty, there isn't a really good way to do this, but it's an idea.

The only way to truly determine player intelligence is to talk to them, or make them talk. Paragraphs were a start, but to be honest, maybe we should make people talk about something other than the suspect to eliminate biased views and gauge their knowledge base. If we got people to start talking about teambuilding, why certain sets have their niches, why certain mons aren't quite as good as others, how certain aspects of ubers affect the metagame as a whole, etc., we can really start seeing some decent results, provided there is no bias.

Also, is having less intelligent players vote really as bad as we are making it out to be? They play the tier just as much, maybe more than us, and we shouldn't cater to the top players only, we should cater to all the players who play the tier, keeping the popularity up.