OU Suspect Testing Proposals

TheFourthChaser · Aug 2, 2013

MCBarrett said:
I think that if we were to adopt a "Paragraph Reqs" system it would clear up a ton of problems with the current voting populace, since as of right now, we cannot tell if the voters truly know (or care) about the state of the OU metagame. I think this system would be extremely beneficial for many reasons: since a voter will be forced to think about their decision and showcase logical reasoning for said decision, they will be that much more likely to make the correct decision. Also, it will promote more quality posting activity in the Suspect Thread and it may get some lurkers to come out of their shell and post in meaningful discussion to showcase their knowledge of the metagame. While it may make Smogon seem even more elitist, it will prevent people who simply do not care about the metagame, and only vote for their own personal agenda, from voting. That, of course, is much more important than seeming welcoming to the guy who got lucky and made ladder reqs, and subsequently made a careless vote that shifted the outcome of the poll.

While I agree that paragraphs will probably become an increasingly important part of the Suspect process, I do not agree with all of these points. You can try using this as a way to remove a problem user's opinion but it is not hard at all to bs these paragraphs. Lurkers may come out of their shell but paragraphs were used in the past and the Suspect threads were not noticeably better, they will always be shitty.

MMII · Aug 2, 2013

I disagree with the paragraph requirements. Ignoring the issue of turning away voters due to the extra work required (as if grinding weren't enough, although we do have a big enough group to probably get away with it), they simply aren't effective at the intended goal of weeding out incompetent voters/bandwagoning. It doesn't take any thought or intelligence at all to rehash arguments the guy above you posted (and how are you supposed to weed those guys out from the ones who simply have nothing new to say?). On top of that, there are those who understand the metagame a lot better than they can express and vice versa, so we aren't even necessarily addressing the targeted issue. Lastly, this introduces even more subjectivity into the process and leaves a lot of power with the council (which I assume they wanted to avoid in adopting this current system).

tehy · Aug 2, 2013

So why not just have anyone who wants to suspect vote PM whatever members of the OU council would judge this normally?

Sure, it'd be annoying to get alerts and have it just be these, but in reality it wouldn't really make much of a difference, work-wise. The worst thing that could happen is one guy gets overloaded, although if that's a concern there could be a thread where people post saying who they PM'd, and a tally kept in the OP so people can PM evenly. (Or PM them all, possibly?)

This way it's pretty much impossible for people to steal arguments, while still working about the same.

KM · Aug 2, 2013

On the whole 50/66 argument...

Would it be possible to have a two-tiered voting system? In this, you'd have a lower set of requirements for your average, decent player, who would vote in a simple, one-off poll. The time to obtain these requirements would be very short - only a window of a couple days - meaning that it mainly falls on people who've played the game already. (I'm thinking something along the lines of 1850 ± 55). If this vote reaches a 60/40 consensus that a pokemon/ability/whatever should be banned, it then goes to a higher set of requirements - limited to only the most skilled (in theory) of players, who then vote - and if it obtains a 51/49 majority, that pokemon is then banned.

In this manner, pokemon are banned on their use in both lower level and higher level play - and it helps more people get involved with the banning process. The requirements would be raised so that the people in the higher board of voting would truly be exclusive and therefore have enlightened things to say about the ban, whereas the lower set of requirements would be mainly on the face-value brokenness of a pokemon.

This is sort of complicated over our current system, but I feel as though it could work well.

There's another nice thing about this system. If something is truly broken - (see, chansey in UU), and needs to be quickbanned - as I'm sure will happen quite a lot in gen 6, the process can be streamlined to finish in only a couple of days if the average players rule it out in an absurdly high 80/20 vote.
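The flow described above can be sketched in a few lines of Python. This is only a rough illustration under stated assumptions: the function and constant names are hypothetical, the thresholds are read literally from the post (60/40 to escalate, 51/49 to ban, 80/20 to quickban), and the ladder requirements themselves are left out.

```python
from typing import Optional

# Thresholds taken literally from the proposal; the names are hypothetical.
LOWER_TIER_PASS = 0.60  # lower-tier ban share needed to escalate ("60/40")
UPPER_TIER_PASS = 0.51  # upper-tier ban share needed to ban ("51/49")
QUICKBAN = 0.80         # lower-tier ban share that skips the second vote ("80/20")


def two_tier_outcome(lower_ban_share: float,
                     upper_ban_share: Optional[float] = None) -> str:
    """Return 'quickban', 'ban', 'no ban', or 'needs upper vote'."""
    if lower_ban_share >= QUICKBAN:
        return "quickban"          # average players ruled it out overwhelmingly
    if lower_ban_share < LOWER_TIER_PASS:
        return "no ban"            # never reaches the higher-requirement voters
    if upper_ban_share is None:
        return "needs upper vote"  # escalated, second vote not held yet
    return "ban" if upper_ban_share >= UPPER_TIER_PASS else "no ban"
```

For example, `two_tier_outcome(0.65, 0.52)` returns `"ban"`, while `two_tier_outcome(0.55)` returns `"no ban"` without the second vote ever being held.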

This is all hypothetical, of course, but I'm interested to hear your guys' thoughts. For certain pokes, lots of discussion and talk among people with lots of experience is certainly necessary - but especially in OU, bans can save the metagame from being very overcentralized.

Iconic · Aug 3, 2013

re: Clear Guidelines

If the world were perfect and if Pokemon weren't nearly as intricate as it is, these guidelines would already have been established. Unfortunately, it is virtually impossible to set a perfectly explicit definition for the word "broken" that is void of subjectivity because of competitive Pokemon's incredibly complex nature. Doug's Characteristics of a Desirable Metagame and Jumpman's Portrait of an Uber are two valuable documents that I believe any informed player should peruse before voting, and they introduce and address some interesting conundrums in attempting to define Pokemon jargon. Just think about this for a minute: these topics use definitions to reduce the subjectivity of the words "desirable" and "broken", but the definitions in and of themselves are open to interpretation. This isn't some grand discovery though -- most people are well aware of the paradox I just outlined, and Doug actually addresses the numerous issues of subjectivity that his definitions introduce. Anyways, the purpose of their topics isn't exactly to lay down 100% irrefutable definitions, but rather to narrow the scope of these frequently used terms. No matter how hard we try, I honestly don't believe we'll ever arrive at a universally accepted definition of "broken" for competitive Pokemon because of the complex ways the game's variables interact with one another.

Want my personal opinion? I think "broken" can be summed up with the phrase "bad for the metagame". Now clearly this is an EXTREMELY broad interpretation, but if you combine that sentiment with the definitions Doug and Jump outlined, hopefully you can get a clearer picture of what I mean. Anything that has a net negative (again, broad, but this is unavoidable) impact on the metagame when you consider the majority of its effects (this includes Pokemon it counters, Pokemon that beat it, how it interacts with weather, how it interacts with various playstyles, luck factors, etc) is "broken" in my view. Each and every voter has their own unique view of the term "broken", but I have faith in the voters that they have thought critically and rationally enough to arrive at something similar to what I just explained. After all, I wouldn't have pushed so hard for the return of Suspect Testing if I didn't believe that the voters as a whole were capable of this.

re: Supermajority/Percentages

Admittedly this is an issue we didn't address as thoroughly as we should have when Suspect Testing returned, as it was low on the priority list of things to do in order to get Suspect Testing off the ground. Without going into too much detail I can just say that we will probably return to the old system of requiring two consecutive votes of 50%+ or one supermajority vote of 66%+1 in order for a Suspect to be banned. This in effect means Landorus will be tested again since it received a majority ban but didn't reach a supermajority. As long as it receives a second majority ban, it will be kicked out of OU (EDIT: it is possible that we do not apply this to Landorus because its vote is already done, but I'll have to sort this out with the Council). Aldaron explained the reasoning for this system better than I ever could so I will just leave his words here (spoiler: it boils down to math... so fun!):

Aldaron said:
As anyone with any statistical knowledge can tell you, analysis over any dataset has some error variable. In pokemon suspect voting, this error could encompass rating system deficiencies, voter stupidity, or even acts of god (site / sim being down for large amounts of time). One of the purposes of a supermajority is to make sure we are above 50% + error, to ensure a ban. Note, the reason this is done this way and not for the 50% -error for keeping the suspect is that banning a suspect changes the status quo, which logically requires we be sure we want to change it.

Granted, nobody has ever bothered to measure this error, but I figured I should give you objective reasoning for why some percentage over 50% is preferred. The exact, true, 50 + error % might never be known, but it is for sure above 50%, hence the desire for some percentage over 50%. Since measuring stuff like the errors I mentioned is probably impossible, and since we know the 50 + error percentage is above 50%, the best option we have is to use a judgment to determine what the "best" percentage is...basically hope that the 5 council members are somewhat competent in math and logic :p

You can apply some principles like a larger voter pool will probably have a lower error range (standard error is standard deviation divided by the square root of sample size) or a larger time range to qualify will probably have a lower error range (more time for ladders to stabilize, more time for metagame in general to stabilize), but in the end it will end up being a judgment call by the council.

tl;dr - laws of stats require a meaningful change of the status quo to be above 50+error%; it's very difficult to quantify that error in pokemon voting, so it will ultimately end up as a council judgment call.
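To make the "50% + error" idea concrete, here is a minimal sketch that covers only the sampling part of that error. It assumes each vote is an independent draw, which the quote does not claim, and it cannot capture the other sources mentioned (rating system deficiencies, site outages). The function names and the `z = 1.96` cutoff (a conventional choice, roughly a 95% level) are assumptions for illustration, not anything the Council has adopted.

```python
import math


def standard_error(ban_share: float, voters: int) -> float:
    """Standard error of a vote share: standard deviation / sqrt(sample size)."""
    std_dev = math.sqrt(ban_share * (1.0 - ban_share))  # spread of a single yes/no vote
    return std_dev / math.sqrt(voters)


def clears_majority_plus_error(ban_share: float, voters: int, z: float = 1.96) -> bool:
    """True if ban_share exceeds 50% by more than z standard errors.

    The error is evaluated at an even 50/50 split, i.e. the status quo
    that the ban side has to overturn.
    """
    return ban_share > 0.5 + z * standard_error(0.5, voters)
```

Under these assumptions the margin shrinks as the voter pool grows: with 100 voters the standard error at an even split is 0.05, so a 55% ban vote does not clear the bar, while with 400 voters the standard error is 0.025 and the same 55% does.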

re: Speed

Speed has certainly been a tricky issue in the past (iirc Jumpman mentioned somewhere that the slowness of the DPP Suspect Testing process was one of its biggest pitfalls but don't quote me on this). However, I think the problem is being exaggerated as a break in between tests is certainly not crippling to the process. There is actually a lot of merit to a small break in between tests, especially after a ban has taken place, in order to (re)evaluate the course of action. I'm certainly opposed to creating a rigid schedule of Suspect Tests because philosophically and pragmatically it makes very little sense.

re: Retesting OU Pokemon

I have never been opposed to retesting Ubers in OU, but I am certainly opposed to testing Pokemon for the sake of it. It's always wise to consider the prospect of retesting Ubers based on metagame changes, but there has to be a ton of supporting material in order to even contemplate changing the status quo. I mean, the OU Council often has discussions regarding the possibility of retesting some Ubers, but we generally tend to agree that we don't believe it's even worth a test. I think Aldaron has mentioned retesting Thundurus and someone may have dropped Excadrill's name at some point, but given the current state of the metagame I don't think I could get behind any retest. Speaking in massive hypotheticals, if something such as a full weather ban takes place, then we will obviously look into Manaphy and Excadrill retests, but given the current metagame I don't think anything similar will be happening. I think it's dangerous to view a stable metagame as boring and thus grounds for a new Suspect Test, because this idea is at odds with the fundamental principles of Suspect Testing.

re: Paragraphs

Again, this has been a tricky issue for the entirety of Suspect Testing's lifespan. The merits of paragraphs are so obvious that I don't need to repeat them, but the concerns they introduce are extremely problematic. The way I see it, paragraphs introduce another unnecessary layer of subjectivity when it comes to voting on suspects. Not only do voters have to state their subjective (see my response to point #1 as to why I think this subjectivity is inherent) views on the Suspect, but now the Council has to subjectively evaluate whether there is merit to the voter's assertions. I think the Council as a whole is smart enough to discern between intelligent and blatantly stupid lines of reasoning, but we'll probably run into problems when we disagree with controversial principles from which the voter's opinion was formed. Statements like "Landorus is broken because the Rock Polish variant destroys most well-built offensive teams" require us to determine whether this justification is acceptable and thorough enough to allow the player to vote. But how do you evaluate this acceptability? Well, the Council would have to do this subjectively, but at this point the opinions of the paragraph writer lose a lot of their meaning because this process puts so much power in the subjective views of the Council that we may as well scrap the Suspect Testing system altogether.

I am definitely not explaining this as well as I could (sorry it's late!), but basically what I'm trying to convey is that I just don't think the benefits of paragraphs outweigh the repercussions. Sure, it would let us weed out extremely stupid voters (which I maintain are anomalous, though that's beside the point), but remember the whole reason why Suspect Testing works is because we believe that voters AS A WHOLE are smart enough to make good decisions for the metagame. I just cannot stress this enough... without this trust the whole thing falls apart. If we did not trust that voters AS A WHOLE were smart enough to do this, Suspect Testing would never have been conceived in the first place.

------------------

If you can take anything from this post it's that I acknowledge that Suspect Testing is an imperfect process, but all things considered, I think it's pretty darn good. Given the number of variables competitive Pokemon presents us with, there are literally an infinite number of ways to produce a balanced metagame and therefore an infinite number of ways to run tiering. But we operate under a series of principles for a reason, and unfortunately there are far too many of them for me to list and explain individually. Speaking broadly though, all I can say is that I personally believe in a system that maximizes efficiency (time, resources, etc) to the best of its ability without compromising the integrity of the final product, all the while ensuring public involvement remains at the forefront of priorities for the individuals who organize the tiering process.

I'm sure there are things that could be improved with Suspect Testing that fall in line with this philosophy, but before you criticize the current system, keep in mind that we are very much following a path that was carved out by the DPP Tiering Leaders (a very good path if you ask me) and that it is far more difficult to alter this paradigm than it is to continue using its basic principles. On the topic of being critical, I would also like to stress that while I love reading posts that are at odds with my views because I find them insightful, people should be careful not to get so hung up on their opinions that they become aggressive or inflammatory. While I don't post too often here, I read most threads that are relevant to tiering, and far too often I see people who get so caught up in winning an argument that their points become clouded by their emotions. Be happy and enjoy, because after all the point is to have fun.

I just wanted to conclude by saying I didn't consult with the rest of the Council in making this post, and so other than Aldaron's post I quoted, these are simply my views and nothing else.

Schpoonman · Aug 3, 2013

Melee Mewtwo said:
I disagree with the paragraph requirements. Ignoring the issue of turning away voters due to the extra work required (as if grinding weren't enough, although we do have a big enough group to probably get away with it), they simply aren't effective at the intended goal of weeding out incompetent voters/bandwagoning. It doesn't take any thought or intelligence at all to rehash arguments the guy above you posted (and how are you supposed to weed those guys out from the ones who simply have nothing new to say?). On top of that, there are those who understand the metagame a lot better than they can express and vice versa, so we aren't even necessarily addressing the targeted issue. Lastly, this introduces even more subjectivity into the process and leaves a lot of power with the council (which I assume they wanted to avoid in adopting this current system).

I agree with tehy about his points, but I'd like to add something about your bit about discouraging voters. Good. If we have voters who don't care enough about the metagame to articulate their thoughts, however inelegantly they might do so, then it's highly likely that their votes are not going to be for the good of the metagame (and by that I mean they are voting arbitrarily or for "personal" reason, like that guy who left a thread about the Landorus suspect test on the GameFAQs Pokemon X board). I'll admit that I'm biased for a system in which I can argue my stance because in the current suspect system I can't keep a winning streak going long enough to reach reqs, I start getting unlucky misses and crits and hax and the like, but I digress.

What paragraphs would do is, while injecting some subjectivity that is already present, force voters to present why they are voting the way they are. We won't be seeing any more "I don't like Landorus because he's ugly, ban him to Ubers" votes (this is not something I actually saw, just an example of a horribly shitty vote). And if we adopt a PM system, like a bunch of Create-a-Team threads already use, then it will likely avoid the same problems of plagiarism.

Nysyr · Aug 3, 2013

Lord of Bays said:
I agree with tehy about his points, but I'd like to add something about your bit about discouraging voters. Good. If we have voters who don't care enough about the metagame to articulate their thoughts, however inelegantly they might do so, then it's highly likely that their votes are not going to be for the good of the metagame (and by that I mean they are voting arbitrarily or for "personal" reason, like that guy who left a thread about the Landorus suspect test on the GameFAQs Pokemon X board). I'll admit that I'm biased for a system in which I can argue my stance because in the current suspect system I can't keep a winning streak going long enough to reach reqs, I start getting unlucky misses and crits and hax and the like, but I digress.

What paragraphs would do is, while injecting some subjectivity that is already present, force voters to present why they are voting the way they are. We won't be seeing any more "I don't like Landorus because he's ugly, ban him to Ubers" votes (this is not something I actually saw, just an example of a horribly shitty vote). And if we adopt a PM system, like a bunch of Create-a-Team threads already use, then it will likely avoid the same problems of plagiarism.

+++

Also highly agree with increasing the majority. 51%+ can be swayed by what people ate for breakfast; it's just that statistically inaccurate.

X5Dragon · Aug 3, 2013

Well, first of all I'd like to thank Iconic and everyone else who posted. I'm pretty sure everyone who read this topic will feel relaxed and more confident about the upcoming suspect tests. If I may make a request, I hope that any time a suspect/voting thread is posted you put those links to Doug's and Jump's documents in the OP as required reads.

---

I just want to bring up an issue that may have been overlooked or misunderstood. Yes, I am one of those people who thought Garchomp was brought down and retested because of a complex ban (which wasn't true), but the point that I and others are making is about the message sent out by these actions, which is that there is a way to ban a pokemon + ability instead of the pokemon in its entirety, and there being a difference between a pokemon and his different abilities. Heck, in game you can catch the same pokemon with different abilities. A Sand Rush Excadrill isn't the same as a Sand Force or Mold Breaker Exca.

I know in the very beginning there were complaints about how difficult it was to implement this in programming (I believe it was Antar who mentioned this ages ago), and that Aldaron's proposal was an exception, but I think we've come a long way since then, to the point where we're hosting Other Metagame ladders on PS. So the only obstacle here is a philosophical one. Another objection might be "what's in it for us?", meaning why would we care if the weaker abilities of banned pokemon were allowed when most of them wouldn't end up in OU, to which I would say it's the principle and freedom of choice that matter most.

Spinda · Aug 5, 2013

X5Dragon said:
which is that there is a way to ban a pokemon + ability instead of the pokemon in its entirety, and there being a difference between a pokemon and his different abilities. Heck, in game you can catch the same pokemon with different abilities. A Sand Rush Excadrill isn't the same as a Sand Force or Mold Breaker Exca.

which is that there is a way to ban a pokemon + Nature instead of the pokemon in its entirety, and there being a difference between a pokemon and his different Natures. Heck, in game you can catch the same pokemon with different Natures. A Modest Excadrill isn't the same as an Adamant or Jolly Exca.

What makes natures so inherently different from abilities and where does this stop?

You're able to catch the same pokemon with various different qualities that might affect them competitively, but I think this is just making it unnecessarily more complicated for newer players.

Also, I think it's somewhat ridiculous that Mold Breaker Exca can beat every spinblocker with just Earthquake anyway, so I'm not sure if I want to see that unbanned (especially for a pokemon that 4x resists rocks).

x42bn6 · Aug 5, 2013

Picking a supermajority point shouldn't be something done lightly. How was 2/3 picked? There are other supermajority systems out there in the world, such as 3/5, and 2/3 where "yes" has to be double "no" plus "abstain".

Don't quote me on this, but I thought that supermajority voting was historically used to protect something against present-day decisions not taking into account past decisions with sufficient care (i.e. government proposals, where existing laws could have been in place for decades or even centuries, being changed over a knee-jerk incident).

Picking such a point probably wasn't mathematically analysed that much in the past, but I can't stress enough that picking 2/3 without any real analysis is probably not a good idea.

Maybe some other statistical analysis can come into the picture (i.e. statistical significance - 51% might be good enough for hundreds of votes, but is probably not enough for only, say, 20; or use "abstain" somewhat) instead of supermajority.
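As a rough illustration of the statistical-significance idea, the sketch below computes how often a ban count at least this large would turn up if every voter simply flipped a fair coin. The function name is hypothetical and the coin-flip model is an assumption made for illustration; it is not how votes are currently evaluated.

```python
from math import comb


def tail_probability(ban_votes: int, total_votes: int) -> float:
    """P(at least ban_votes 'ban' votes out of total_votes) under fair coin flips."""
    # Count the outcomes with ban_votes or more bans, then divide by all 2**n outcomes.
    favourable = sum(comb(total_votes, k) for k in range(ban_votes, total_votes + 1))
    return favourable / 2 ** total_votes
```

With only 20 voters, an 11-9 result gives `tail_probability(11, 20)` of roughly 0.41, so a narrow majority in a small pool is entirely consistent with chance. Running the same function with larger pools and margins shows how much lead a given pool size needs before chance stops being a plausible explanation.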

Spinda · Aug 5, 2013

x42bn6 said:
Picking a supermajority point shouldn't be something done lightly. How was 2/3 picked? There are other supermajority systems out there in the world, such as 3/5, and 2/3 where "yes" has to be double "no" plus "abstain".

Don't quote me on this, but I thought that supermajority voting was historically used to protect something against present-day decisions not taking into account past decisions with sufficient care (i.e. government proposals, where existing laws could have been in place for decades or even centuries, being changed over a knee-jerk incident).

Picking such a point probably wasn't mathematically analysed that much in the past, but I can't stress enough that picking 2/3 without any real analysis is probably not a good idea.

Maybe some other statistical analysis can come into the picture (i.e. statistical significance - 51% might be good enough for hundreds of votes, but is probably not enough for only, say, 20; or use "abstain" somewhat) instead of supermajority.

Read

X5Dragon said:
For what is certainly a great debate to make this metagame the best possible, I just want to point out two simple things. I don't want any of the proposals mentioned in the OP or subsequent posts to be taken literally, number for number, but rather focus on the point being made. Decisive majority instead of simple, with the number suggested (66%) subject to change. Also, the point about the council being specific could be met in numerous ways: a list of criteria, general guidelines or even a simple "do not even attempt to argue for/against based on this way of thinking or else your visiting rights to this forum will be restricted to Circus Maximus forever".

Of course it shouldn't be done lightly, it was really just an example to make the point more clear

X5Dragon · Aug 5, 2013

What makes natures so inherently different from abilities and where does this stop?

This is actually one of the questions that was raised: what defines a pokemon, and how far should we go in differentiating its different qualities and characteristics? Instead of attempting to argue how abilities differ from natures, let me point out again that, even though it was done indirectly, the OU council has recognized a difference between Sand Veil Chomp and Rough Skin Chomp; the players have too, and have voted the other ability OU worthy.

It really comes down to why the pokemon was voted Uber in the first place. In either case no one will have any problems if enough arguments were raised to suggest we should try the "unbroken" abilities of a pokemon, just like Rough Skin Chomp was. Maybe not this gen, but we can agree for the future.

but I think this is just making it unnecessarily more complicated for newer players.

I'm not seeing the complication here, I mean right now Sand Veil Chomp is ubers while Rough Skin is OU, no one has complained.

haunter · Aug 5, 2013

Sand Veil (and Snow Cloak) have been outright banned, having been deemed uncompetitive and unhealthy for the game. There was no complex ban on Garchomp+SV in the same fashion as there was no complex ban on Smeargle+Moody. Moody was just banned, for being broken and uncompetitive.

X5Dragon · Aug 5, 2013

I do not object to what you're saying, Haunter; that is what happened. But Smeargle's other abilities were allowed (dunno if it was with a test or not, but probably not) and Garchomp's Rough Skin was given a test in OU. Can't we expand this to other pokemon? That's all I am asking.

ShootingStarmie · Aug 5, 2013

X5Dragon said:
I do not object to what you're saying, Haunter; that is what happened. But Smeargle's other abilities were allowed (dunno if it was with a test or not, but probably not) and Garchomp's Rough Skin was given a test in OU. Can't we expand this to other pokemon? That's all I am asking.

No, because it isn't just Garchomp's ability that was banned. It was Sand Veil as a whole. The same goes for Smeargle: Moody was banned as a whole, and its other abilities (Technician and Own Tempo) aren't broken.

haunter · Aug 5, 2013

Garchomp was given a test in OU because the most relevant factor that broke it was banned. If we ever decided to opt for a blanket ban on Speed Boost and Sand Rush, then I'm sure we'd also see Blaziken and Excadrill unbanned. Note the difference here: Sand Veil was banned as a whole; a complex ban on Sand Rush would ban it just on Excadrill. That's the difference between the ban of a given ability and a complex ban of Pokemon+ability.
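The distinction can be pictured as two different kinds of entries in a ban list. The sketch below is purely illustrative: the `BanList` class and its fields are hypothetical and are not how the simulator actually stores its rules, and the Excadrill + Sand Rush pair is the hypothetical complex ban discussed in this thread, not an existing one.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class BanList:
    """Hypothetical ban list with three kinds of entries."""
    pokemon: FrozenSet[str] = frozenset()              # whole Pokemon banned
    abilities: FrozenSet[str] = frozenset()            # ability banned on everything
    pairs: FrozenSet[Tuple[str, str]] = frozenset()    # complex bans: (Pokemon, ability)

    def allows(self, pokemon: str, ability: str) -> bool:
        if pokemon in self.pokemon:
            return False
        if ability in self.abilities:
            return False
        return (pokemon, ability) not in self.pairs


# Ability bans: Sand Veil and Moody are gone for every Pokemon.
ability_ban = BanList(abilities=frozenset({"Sand Veil", "Moody"}))

# Hypothetical complex ban: Sand Rush is banned on Excadrill only.
complex_ban = BanList(pairs=frozenset({("Excadrill", "Sand Rush")}))
```

Under `ability_ban`, Garchomp with Rough Skin is allowed and nothing at all may use Sand Veil. Under `complex_ban`, Excadrill with Mold Breaker is allowed while another Sand Rush user such as Stoutland is unaffected, which is exactly the extra case-by-case rule that makes the ban "complex".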

ZandgaiaX · Aug 5, 2013

Dangit man! I have been busy and you went ahead and wrote it yourself?! *Shakes fist*

ShootinStarmie said:
I really don't like the idea of making it 66%. 51% or over is enough. If 51% vote that is the majority, and should be enough to decide if something is broken. I don't see why the pro ban side needs more votes. Doesn't seem really fair to me.

I do however like the idea of speeding up the process.

Controversy always needs security when making a decision; a ban will always (to an extent) be controversial unless the suspect is very obviously broken... It has been brought up by many people already that people with no reason other than being a selfish bastard can easily vote on one side or the other and pretty much make the decision. A 51% pro/con vote is far from secure, as it implies a single person made it happen. Not quoting you for a reason other than you are the first to mention it tho.

I am incredibly in favor of making suspect testing more, how to put it..., standardised? Like making suspect-tests be done after a certain period of relative 'balance' and actually give said testing-period a not too long run: if a pokémon is a potential suspect, then why waste more than two weeks on testing it further?: To give it more attention? To give people the time to ladder? To make sure people know what they are doing, and whether they are pro or con? To test a meta without said suspect?

A potential suspect has, normally speaking, already spent a fair amount of time in the metagame; if it's 'broken', why waste even more valuable time that the metagame could use to stabilise on testing it? I am aware that people need time to ladder to vote (a system which is highly flawed) and, should it happen, to take their time to properly give their stance, thoughts and arguments about the suspect and any possible effects. Also: making the voters actually bother writing why they voted yes/naw/neither is a very good way to at least make sure that we know the voters a) make sense, and b) know what they are doing without playing the system. Not to mention that it's good to see the 'experts'/better players/voters discuss it out: sure, it may not change their opinion, but at least it makes it seem as if something is being done in terms of suspect-testing/voting. Also, if suspect-testing is done to make sure the meta without said pokémon isn't turned 180 degrees, then that's an extremely flawed reason, because the metagame should be able to stabilise, especially without the (potential) suspect; after all, a suspect is a suspect because it breaks the metagame or said metagame revolves so much around it that it can be called centralisation.

Along with that, I feel that right now suspect testing is extremely limited... I tend to not even play on PS for weeks, just because I don't want to, though I do scan the forums. And whilst I knew Lando-I was being suspect-tested, there were no real threads about the suspect, or the suspect test, or what the meta will be without said suspect. Involvement by people who aren't high-ladders/voters/god/whatever is so minimal that it creates a barrier to actually bothering to get involved with the whole process, and I'm fairly sure I'm not the only one who feels that they have barely any influence/involvement in the suspecting, other than having to ladder for hours (I'm not very lucky with the ladder >,> never got higher than 1700). The system of laddering is flawed, and I wonder why voters are even based around the top-ladders if all they have to do is X until they've struck oil by getting high enough points for a won battle. Anyway, with the recent start of this thread, I at least can say that little steps are being made towards allowing more discussion amongst the normal-ish players, however flawed that discussion may be.

TheFourthChaser said:
While I agree that paragraphs will probably become an increasingly important part of the Suspect process, I do not agree with all of these points. You can try using this as a way to remove a problem user's opinion but it is not hard at all to bs these paragraphs. Lurkers may come out of their shell but paragraphs were used in the past and the Suspect threads were not noticeably better, they will always be shitty.

As shitty as they might be, I mean heck: look at the "Rain is it broken?" discussion that happened a while ago, it at least makes up for what might else become a bigger mess that may not be directly seen, but does change the metagame. Also reasonings like: "I hate spore Amoonguss, without Keldeo around Amoonguss will be gone so I'll vote Uber-Keldeo", screw reasonings like that: people like that have no right to vote in my opinion, even if they are masters of the ladder.

tehy said:
So why not just have anyone who wants to suspect vote PM whatever members of the OU council would judge this normally?

Sure, it'd be annoying to get alerts and have it just be these, but in reality it wouldn't really make much of a difference, work-wise. The worst thing that could happen is one guy gets overloaded, although if that's a concern there could be a thread where people post saying who they PM'd, and a tally kept in the OP so people can PM evenly. (Or PM them all, possibly?)

This way it's pretty much impossible for people to steal arguments, while still working about the same.

As improbable as it might be in practice, the idea sounds good. By keeping it 'secret', copying is impossible and people don't feel forced to be completely exact. Of course the whole PM-ing thing is indeed a lot of work, even if it is divided amongst several people. Whilst I trust there are enough trustworthy neutral people around to make it all happen, how it would work out is my biggest question. Extra points if all arguments are eventually gathered and posted in a 'discussion thread' where for a week or so normal folks can be let loose on it.

That is all, seeing as for the other points I either agree, can't comment, don't care, or a combination of all three, and as such I ignore them, as my ego must remain larger than 50 square kilometers.

X5Dragon · Aug 5, 2013

Eh sorry man, I was searching for you on IRC and waiting for you to send a convo but all's good. In any case, the OU Council's stance on complex banning has become clear, as have the conditions under which pokemon like Excadrill and Blaziken can be tested again.

Maybe in the future, when people no longer believe that complex banning leads to a slippery slope (which can only happen when we decide what defines a pokemon) and see that it's not complex at all, we can bring this up again.

MikeDawg · Aug 5, 2013

Complex Banning or not, there is a clear difference between abilities and natures, for example. Excadrill with sand rush is an incredibly threatening sweeper. Excadrill without sand rush is a great spinner/wall. From a competitive standpoint, these two are practically different pokemon, so I really don't think the slippery slope argument is at all relevant here (as opposed to allowing something like deo/d without stealth rocks: it's still the same pokemon, it plays the same role, it was just artificially nerfed)

Spinda · Aug 5, 2013

MikeDawg said:
Complex Banning or not, there is a clear difference between abilities and natures, for example. Excadrill with sand rush is an incredibly threatening sweeper. Excadrill without sand rush is a great spinner/wall. From a competitive standpoint, these two are practically different pokemon, so I really don't think the slippery slope argument is at all relevant here (as opposed to allowing something like deo/d without stealth rocks: it's still the same pokemon, it plays the same role, it was just artificially nerfed)

Do you play a Timid Lucario the same way as you would an Adamant Lucario?
I'd dare say they're almost different pokemon

MikeDawg · Aug 5, 2013

Spinda said:
Do you play a Timid Lucario the same way as you would an Adamant Lucario?
I'd dare say they're almost different pokemon

Essentially, yes. One would just be slightly weaker.

Just like how I would play a modest exca just like an adamant, it would just be silly to do so.

Or a dd rayquaza essentially the same as a sd rayquaza

Spinda · Aug 5, 2013

MikeDawg said:
Essentially, yes. One would just be slightly weaker.

Just like how I would play a modest exca just like an adamant, it would just be silly to do so.

Or a dd rayquaza essentially the same as a sd rayquaza

You could play sand force exca like a sand rush exca, it just wouldn't do very well and it'd be silly to do so.
It's just a bit slower ^.^

MikeDawg · Aug 5, 2013

Spinda said:
You could play sand force exca like a sand rush exca, it just wouldn't do very well and it'd be silly to do so.
It's just a bit slower ^.^

You wouldn't be sweeping with it though. You would be wall breaking (it is stronger, but like other wallbreakers it is limited to a possible kill before it is killed/forced out). That's like comparing hydreigon to hydreigon that gets an instant tailwind boost when it comes out. They certainly do not share the same role nor are they equally effective.

To get a general idea of what the 50% vote should be changed to, an anonymous poll could be put up where voters from a recent test could share whether they were making an educated/productive/good vote.

Then bump the number up a little to account for liars and people who don't know they're dumb and voilà!

x42bn6 · Aug 5, 2013

MikeDawg said:
To get a general idea of what the 50% vote should be changed to, an anonymous poll could be put up where voters from a recent test could share whether they were making an educated/productive/good vote.

Then bump the number up a little to account for liars and people who don't know they're dumb and voilà!

The one I've been looking at is the chi-squared statistic. An example with a fictional Obama/Romney survey is here: [PDF] http://www.upa.pdx.edu/IOA/newsom/da1/ho_z-test.pdf at the bottom.

As an example, the recent Landorus-I test had 145 votes (ignoring abstainers), meaning at 95% confidence, you'd need 85 "ban" votes, or around 58.6% of the votes. It might not work on all tiers, though - RU Nidoqueen, with only 9 participants, would have needed 8 "ban" votes, or 88.9% (NB I didn't use Yates' Correction).

You might have noticed I said "ignoring abstainers" twice - because this test doesn't work with abstains or, more generally, NOTAs (None Of The Above).

Of course, such a test might not apply to this because Pearson's test is about goodness-of-fit, usually control vs. experiment - but in this case, "no ban" isn't really a control group. But I've seen it used in this fashion before.
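
A minimal Python sketch of the threshold calculation above, under stated assumptions: a Pearson chi-squared goodness-of-fit test against an even 50/50 ban/no-ban split, one degree of freedom, no Yates' correction, and abstentions left out of the total. The function name `min_ban_votes` is hypothetical and only for illustration. For one degree of freedom the chi-squared critical value equals the square of the two-sided normal critical value, so the standard library is enough.

```python
from statistics import NormalDist


def min_ban_votes(total_votes, confidence=0.95):
    """Smallest number of "ban" votes whose split differs from 50/50
    at the given confidence level (Pearson chi-squared, 1 degree of freedom).

    Returns None if even a unanimous vote is not significant.
    """
    # Chi-squared critical value for 1 d.o.f. = (two-sided z critical value)^2
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    critical = z * z

    expected = total_votes / 2  # expected count per option under a 50/50 split
    # Only majorities are of interest, so start the search at half the votes
    for ban in range((total_votes + 1) // 2, total_votes + 1):
        no_ban = total_votes - ban
        stat = ((ban - expected) ** 2 + (no_ban - expected) ** 2) / expected
        if stat > critical:
            return ban
    return None


for name, votes in [("Landorus-I (OU)", 145), ("Nidoqueen (RU)", 9)]:
    needed = min_ban_votes(votes)
    print(f"{name}: {needed} of {votes} votes ({needed / votes:.1%})")
```

With these assumptions the sketch gives 85 of 145 (58.6%) for Landorus-I and 8 of 9 (88.9%) for Nidoqueen, matching the figures quoted above. It says nothing about whether a goodness-of-fit test is the right tool for a ban vote, which is the caveat raised in the post.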

Pocket · Aug 5, 2013

I personally wouldn't mind banning Pokemon-specific abilities to release Blaziken, Thundurus, Excadrill, and Landorus back into lower tiers, because I tend to promote metagame diversity over simplicity of banlist. However, it's very hard to justify that Abilities are more defining characteristics of a Pokemon than, say, moves. Volcarona for instance is defined more by its move Quiver Dance than its ability to burn physical attackers. Cloyster has transformed from a mere defensive Spiker to a fearful end-game sweeper thanks to Shell Smash more so than Skill Link, etc.

Although it sounds very unlikely, if we were to make Ubers our new OU, I would probably prefer an initial banlist of overcentralizing Pokemon in Ubers over the no-banlist approach that TFC pushed. I would prefer Kyogre, Groudon, Arceus, Rayquaza, Mewtwo, Soul Dew, and possibly Palkia to be banned. The remaining Ubers are either stuck in their 90-95 Speed tier or simply lack immediate destructive power (Lugia) to be overcentralizing.

ZandgaiaX said:
A potential suspect has, normally speaking, already spent a fair amount of time in the metagame. If it's 'broken', why waste even more valuable time that the metagame can use to stabilise to test it?

This is a stupid question, but I'll answer it anyway. We have a test period because suspect =/= broken, and we require people to solidify their thoughts on the suspect's position in OU so they can make an educated decision.

ZandgaiaX said:
Also, if suspect-testing is done to make sure the meta without said pokémon isn't turned 180 degrees, then that's an extremely flawed reason.

I disagree with this notion. Banning a Pokemon would of course offer some decentralization, but if a ban results in a more tumultuous metagame than before, where previously checked threats are now uncheckable, then it is a bad ban from my point of view.

ZandgaiaX said:
Along with that, I feel that right now suspect testing is extremely limited... I tend to not even play on PS for weeks, just because I don't want to, though I do scan the forums. And whilst I knew Lando-I was being suspect-tested, there were no real threads about the suspect, or the suspect test, or what the meta will be without said suspect.

You must be new here

PS: x42bn6, interesting stuff - thanks for your contribution ;d
