Potential Flaws with the Current Suspect System


Yeah, backing up what Dubulous posted, we've tried a few different approaches to see what works best for a smaller playerbase. As for number of voters, I would definitely prefer more, but only if the playerbase expanded as well. I think that OU has a lot more room to discriminate in its voter pool, and it ought to exercise that power. I agree with quality>quantity. If reachzero, Jabba, and the others involved with the OU suspect process find a way to make paragraphs a bit more workable, then I say let them give it a shot. If they get a good model up and running, then I'll probably give paragraphs another shot in LC at some point, too.


The reason we stopped liking paragraphs was that everyone bitched if their vote got rejected. There were a small handful of people reading these essays and all they got in return was crap. So nobody liked writing the paragraphs and nobody liked judging them. I really don't see that changing if we go back to it, and that's ignoring the obvious potential for bias in the system.
My name's been thrown around in this thread a few times now, so I thought I'd weigh in. Nobody liked writing paragraphs because most people felt it an inconvenient burden to be asked to articulate their thoughts on a suspect. Most of those who did write them and did not have their submissions accepted bitched about it on one forum or another, and because it was a constant, that was one of the few things I didn't like about judging paragraphs. There have been several improvements suggested or put in place to the original bold voting system that would make or have made it easier for both parties:

1) Expanding the pool of judges

As much of a chore as writing paragraphs was for would-be tiering contributors, imagine how much effort and attention it took for two judges to read dozens of no-word-limit submissions, two and sometimes three times over. Manaphy had over 70 submissions PMed to Aeolus and myself, and if I hadn't reread most of them myself, Blue Tornado and Minato would have gotten away with plagiarism. The evaluations themselves took about a week: besides the fact that Aeolus and I were generally busy, we were only two people, and for the Latios Stage Aeolus was completely out of pocket and I had to read them all myself.

As level-headed and even-keeled as I think I am, I would have appreciated additional eyes if only to somewhat quiet the cadence of impatient foot-tapping that was often directed at Aeolus and myself. With five people now, the bandwidth of judges is increased dramatically. Lags in evaluation time and, more importantly, the risk of inefficient judging owing to a bandwidth deficiency are (or should be) a thing of the past.

2) Sentences instead of paragraphs

One of the last tests that Aeolus and I administered put in a 400-word limit. I don't think anyone had a real problem with this, and going even shorter or much shorter at least puts everyone on a more equal writing field.

3) Publishing all submissions after a given stage or vote

I was open to this whenever it came up after some votes and even acquiesced on at least one occasion. This should be mandatory if whispers of bias or faulty judgment are still seeping through the grapevine.

4) SEXP for "potential for bias in the system"

As you are one of the few proponents of SEXP (thanks), you likely understand that it was introduced entirely to reduce bias and to make the ascertainment of an "ideal voter" as automatic and objective as possible. This was not met well by the majority of our community, since they felt that not knowing the metrics and components of the SEXP formula precluded its fairness, even though X-Act himself stated that the formula would not work if people knew how to manipulate it. Good luck getting anyone to come back around on this, though.

In sum, there are some strides that can be made to improve the process as it is, and some steps that were taken in stages past should not be forgotten altogether.
Sorry to bump the thread with a fairly simple question, but something came up in a DST thread that made me think. Another flaw in the suspect system is its duration. We've had 5 consecutive months of bans now and it doesn't look like that trend is going to stop any time soon. When is enough enough? I think we should consider extending the duration of the suspect periods to at least 3 months in order to gain some form of a stable metagame.

For the sake of easy posting I'll just paste what I said earlier in that thread:
It's getting clear that the longer we do suspect testing, the less "worthy" the bans that come out of them are. For example, if at the beginning of Gen4 you had told someone that Salamence would become uber, they would have laughed in your face. But after we went through years of banning literally everything that was deemed "suspect", including some things multiple times, that didn't sound too ridiculous. The result of these constant ban periods was of course that we never got to enjoy a stable DPP metagame until the next generation of games came out, which in my opinion was the worst possible outcome of the suspect tests.

The entire premise of the suspect test is that there ARE broken things in the metagame, and because of that there will always be a ban every month. This makes the metagame highly unstable. You can see that this happened already with Gen4 OU, which I feel we ruined competitively by constantly putting things under the microscope instead of letting things play out. That doesn't even address the fact that the voting pools aren't static. You can have a ban one month that voters of the next month wouldn't even consider (and it's WAY harder to unban something than it is to ban something, which I think is backwards). One bandwagon vote and the entire metagame is ruined for everyone.

Every ban that comes about as a result of the suspect test only raises new questions. For example, Manaphy is currently banned and Thundurus is likely to be banned this time next week. Now a ton of Pokemon just got their #1 check removed and it's entirely possible we could see a spike in usage of something else. What happens if, a month down the road, we decide to ban Drizzle? Now we have a metagame where Manaphy and Thundurus could easily be compatible but aren't given the chance. [addition: Which will only lead to continuing the same flawed testing indefinitely as new perceived imbalances arise]

I really don't want to see another generation ruined by constant suspect testing. We need to decide when enough is enough.
To be fair, we did reach a No Suspects period in Gen 4 UU. However, the sheer number of nominators now makes it seem as if something should be done differently. I would take another look at the unrestricted nominations, if anything.


A little late to the party here, but upon seeing Cape's post, I couldn't help but be moved to respond.

"To be fair, we did reach a No Suspects period in Gen 4 UU."
This is the apex of everything we're working to achieve. A competitive Pokemon metagame most of us enjoy, where the userbase no longer feels the need to ban. We know it's possible because we've done it before. Even through all of the slops, flops, and messy politics of Gen 4's tiering, the end product was a crowning achievement for both Smogon and competitive Pokemon as a whole, and it's only a pity that we took so long to do it that we scarcely had time to enjoy it.

I would not say I wholeheartedly trust the people to create the metagame I want. But judging from what we have so far, we're moving in the right direction. Seemingly overpowered threats that are handled by the shift of the metagame have not been banned, and we've more or less proven we can change our minds. Case in point: Deoxys-S and Latios. They're not leaving OU anytime soon. If they ever do, it will be because the metagame shifted toward the most popular set that made them broken, much like Latias and the rise of her Choice Specs set.

What I don't understand about the ban scare is why it's being expressed as such an overarching issue when there's no real precedent for it. Once Salamence left OU, no more serious arguments were being presented about any future suspects. Dragonite never filled his shoes. In fact, if I recall correctly, clause testing was next on the list. There's also the tiering of Deoxys-D, which many people expressed interest in but which never came to fruition. None of those things had anything to do with banning something we "don't like" or "can't handle".

If we're going to achieve balance, though, we can't be afraid to lift the gavel. Garchomp going to Ubers is not the end of the world. He is replaceable, the same way Latios, Thundurus, and any other potential suspect is. In a game filled to the brim with overlapping roles and Pokemon barely separated by their names, I find it a bit ridiculous to say that banning a few of them to keep the power scale of the standard metagame as homeostatic as possible is going to ruin all of competitive Pokemon. If you disagree with the rise of any potential suspects, at least address them case-by-case rather than pointing fingers at a community that wants a balanced game just as much as you do.