How to handle the Shaymin-S Vote

Status
Not open for further replies.
why should it remain in ou when that default was chosen arbitrarily? if the vote ends up uber, ignoring the results, even if it's merely for the time being, hurts the credibility of the suspect test and voting process.
 
maybe the suspect test/voting process needs hurting, or at least questioning. this vote is not something i can see us basing a decision on, especially with so many questionable votes on both sides, as caelum mentioned. it just seems all round like we don't have the voting pool we want.
idk if bold vote+reasoning is an ideal solution, but i like it better than the current thread in the voting site forum.

maybe a heavy discussion thread should be opened before each vote, with relevant arguments posted in a well structured op... but that means we need a firm definition of what a relevant argument is, and eventually a relevant definition of uber. seems like that would be a good starting point, and will help the "nu u" initiative in ou, since these situations will probably keep popping up

what i hate the most is the "camp" mentality that seems to be repeating itself, with 2 sides clinging to random arguments and not actually discussing stuff. that's why i suggested a prior discussion thread to get as much stuff out in the open and explicit as possible.

as for what to do right now, i'm in support of delaying the decision based on an indefinite vote result. no one can really argue with us taking a step back faced with this situation. kinda reminds me of evo 1 tbh x)
 
I think that with such a close vote, it would be, as Aeolus said, irresponsible to call the vote either way. I have a suggestion though. Why don't we make a thread in Policy Review, and have the staff members and the other PR members discuss Shaymin-S, and try to work something out?

do you not realize that this is exactly what was suggested as a solution to tiering suspects this summer? we decided to let the community have a say in this stuff to be fair to the actual community, even though we knew that it is largely comprised of people that do not have a firm handle on the nuances of competitive pokemon. bold voting threads originated in PR and were moved to stark mountain after having been "started on the right track". if we were to do this, regardless of how "elitist" it would be, it would render the entire last month with respect to skymin worthless. it would render x-act's mathematical analysis of 1655 rating/65 deviations worthless. it would render the efforts of the community worthless, both from the "competing on the ladder for a month" perspective that the more diligent and process-respectful battlers demonstrated, and, to a lesser extent, the "i want to give the reasoning i've thought up even though i dont have to because this is how much i care" perspective. to waver on whatever the decision of the final vote is would be to slap basically everyone in the face.

it has not been a secret that "policy review members" are smarter than the general population. this is from a recent tangerine post in PR:

(and considering I was never for letting people just vote on the issue but rather have people argue it out in an "innocent until proven guilty" fashion). The idea within the voting scheme is that people are aware of the intentions behind the changes in the rules (to create the most competitive ruleset, not "what they like better").

This is why I believe that we should be judging the reasons - although I doubt anyone would be willing to go through that arduous task at this point.
he has maintained this forever, and to be totally honest with you, i don't blame him one bit and actually agree with him. i dont give a shit how it sounds here because this is IS, so i will say it anyway—the community is largely comprised of people who don't really know shit about the game and/or smogon's philosophy. we both would rather PR members argue this out rather than people just voting uber/ou, even if it would only have saved weeks and weeks of time if not also being a better idea given the community's proclivity towards voting for "what they like better". there is also the idea of the "elitism" behind the select handful of PR members deciding on the tiering of pokemon for the thousands who play and will play in our community, but i am less worried about that than the fact that what is being suggested is both elitist and an essential disregarding of the last month+ of efforts we have put forth for skymin, and months and months more than that if we think that this can't easily happen again with later suspects whose votes are close.

do not forget: tangerine's original tally for dx-s was 41-40. this is the only reason there even had to be a second tally. if he had gotten something like 47-34 or whatever, i don't think anyone would have called for a "recount". the thing is that it very well could have gone 47-31 or 45-28 if he had not willingly accepted votes with "weak arguments" (his words). however, i'm not bringing that up to throw tangerine under the bus. i'm bringing that up because he had to even consider doing that because the community largely does not know how to voice a convincing argument on pokemon. this is why the process failed, more than any inherent bias on either his or my part.

so now, we have moved on to the rating/deviation process that has battle skill as a supposed screen for "competitive intelligence". however, the efficacy of the vote is still predicated on the trust of our community to not vote for "what they like better". while this trust may have been a little misplaced, it's better placed in the part of the community that was willing and able (more willing than able in some cases) to reach the 1655/65 marks, and i don't think anyone will argue that these voters aren't a better crop than the "just anyones" of the bold vote process. everyone seemed to agree that this process was better...until it actually becomes evident that the vote could be close? ok, so are we, in PR, now allowed to be subjective with the results just because we're smarter than everyone else? again, i have zero problem with this, no matter how it may come off in this thread, but realize that this would be exactly what we would be doing if we were to ignore even a slight majority from the process "we all agreed" was best now.

(re)read the first two posts of this thread, especially mekkah's post. what would we accomplish with a PR thread about this? even if the vote finishes at 55% (i dont even know if uber is leading right now and i'm not even going to check because that's how little it matters), we are still going to be disappointing "45%" of the community, if the suspect test voters are indeed the population that is supposed to be indicative of the community (they are). a lot of people are going to be disappointed on wherever we place skymin for the next 9-11 months. the entire reason it was a suspect in the first place is because it was not, and, now, is not obvious one way or another whether it was/is uber, and this evidently cannot be stressed enough. but if the rest of you still agree that we should take action just because the vote was close, how about latias? or latios even? do you think we wouldn't have to do this again because there will "probably be" a larger majority in those cases? are we seriously just going to "hope the vote isn't close" with suspects going forward? i honestly hope not, because that would seriously be some of the worst theorymon imaginable, given what the implications would be for tiering suspects from now on. i am saying all this because, if we do not decide, right now, to scrap any voting process and just go with the intuition of PR members to tier suspects going forward, we are going to doom ourselves to the same month+ of bullshit second-guessing that some of you are considering now. and it would not only be a big, big waste of time, but it would also entirely undermine the value we have in the voice of the supposedly-competitively intelligent community. do we really want to do this?
 
Throw it in Uber and hold a recount a few months from now. This shows that we care about the majority opinion, but still recognize that for a solid decision to be made we need more conclusive evidence.

Also, I really wanted to restrict the voters by time in the community - someone who is registered Nov '08 should not be voting in our community, regardless of their score on the ladder.
 
what good would a recount do with literally no new information on how skymin performs in standard
 
isnt that also implying that we would have to actually retest it again, taking time away from the other suspects that we need to consider

and, most importantly, that we would still be 100% "hoping" that the vote wouldnt be close again, since we will have done nothing to actively prevent that from happening again (not that that is a good idea)
 
Jumpman paints a very convincing picture for why we need to follow the results of the voting majority -- even if it is by the slimmest of margins. If we do not implement the results of the vote on the standard ladder, then we undermine the foundation of the whole process. We open ourselves to criticism that we pulled a bait-and-switch on the community -- and such criticisms would be completely justified. We said this would be an open, community-driven decision. We need to follow through with that.

I still believe that a convincing majority (60-67%) should be required to banish a pokemon from Standard -- but we never made such a rule beforehand. If we implement such a rule retroactively, after seeing the results of this vote, then we send a message to the community:

"You can vote on this stuff, just as long as you vote the way the Smogon elite staff WANT you to vote. If you don't, then we'll change the rules to get our way anyway."

I know for a fact, that the results have NOTHING to do with my opinion on this. Like Jumpman, I haven't even looked at the current results. I do not have a strong preference for where Skymin should be tiered. So, I have no motivation to steer the results of this suspect test. But, if we give the APPEARANCE that we are steering this test, then we will undermine the process and our own credibility. For that reason, I think we need to abide by the majority vote. No matter how small a majority that may be.
 
Yeah, but with chaos's suggestion, we are following the vote (as it stands currently). I think that was the point.

We are just admitting that we have made mistakes in this process (which is to be expected). And we arent overruling the decision because we dont agree with it. We are just revisiting it because of how close it was, and because we made the error of not having a suspect ladder.

The only downside to this is it takes another month, and it is possible the vote will be close again.

With the second vote we can require a decent majority, and if it is close, if it agrees with the first vote, then we can accept that. And if it disagrees we can say we are undecided and either repeat the process (which is slow, but thorough) or we can just say it falls into the default position of OU (default because OU is always the default, not because that is what it started as).
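
The rule in the paragraph above can be sketched roughly as follows. This is only an illustration: the 60% cutoff stands in for the "decent majority" (no number has been agreed on in this thread), and the function name and the "undecided" label are made up for the example.

```python
def second_vote_outcome(first_result, uber_votes, ou_votes, required_share=0.60):
    """Sketch of the proposed second-vote rule.

    first_result: "uber" or "ou", the outcome of the first vote.
    required_share: hypothetical "decent majority" cutoff (an assumption,
    not a figure settled anywhere in the discussion).
    """
    total = uber_votes + ou_votes
    if total == 0:
        raise ValueError("no votes cast")

    # A decent majority settles it on its own.
    if uber_votes / total >= required_share:
        return "uber"
    if ou_votes / total >= required_share:
        return "ou"

    # Close vote: accept it only if it agrees with the first vote.
    if uber_votes > ou_votes:
        second_result = "uber"
    elif ou_votes > uber_votes:
        second_result = "ou"
    else:
        second_result = "tie"
    if second_result == first_result:
        return first_result

    # Close and in disagreement with the first vote: undecided, which means
    # either repeating the test or falling back to the default of OU.
    return "undecided"
```

Under these assumptions, a 52-48 uber result after a first vote of uber stands as uber, while a 48-52 result after a first vote of uber comes out undecided.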

Also for future votes we should make it clearer to people what they are voting on. I think we should say that the choice isnt OU or Uber but is broken or not broken (and then you vote by selecting the appropriate tier based on its level of brokenness).

Have a nice day.
 
i honestly dont see why this is necessary unless you for some reason feel that, because and only because we didn't have a suspect ladder, the vote was close. i don't know how you would go about qualifying that anyway even if you were to check the pre- and post-test rankings of everyone who voted, to see who didn't play that much over the actual test period or whatever. but the fact that the vote was close should have absolutely no bearing on a decision to redo the test, i dont know why people are even suggesting it should.

and how would we even go about arriving at a "decent" majority anyway that isn't arbitrary as hell?

changing the verbiage from ou/uber to broken/not broken will not accomplish anything unless we explicitly define what broken means for the community. most video game dictionaries define broken as a single tactic or option which is powerful enough to be considered overpowered. i happen to feel that way about stealth rock, but many others do not. we'd be better served by attempting to get at a definition of uber rather than trying to come up with a definition of broken that "everyone can agree on".

and ou is always the default for...what exactly? surely you don't mean all pokemon so what is it? pokemon whose BST is 600 or lower? doesn't that include darkrai? or maybe you mean pokemon whose BST is 600 or lower that you can obtain without a nintendo event...but where does that place latios and latias? would you therefore be ok with recognizing them as ou and requiring that their uber votes make up 60% or more of the total or they'll be labeled ou? or would it be 66%? what about manaphy, which can be obtained through pokemon ranger as well as a nintendo event? are we only counting pokemon as default ou if you can obtain them on the original canon pokemon cartridges themselves, making manaphy default uber, or does that not apply to manaphy? and what about jirachi, a 600-or-lower BST pokemon which is obtained in exactly the same way as manaphy, through a non-canon pokemon game or a nintendo event? if manaphy is default uber shouldnt jirachi be? doesn't that mean we need to test jirachi? what about event-only 600-BST pokemon mew and celebi? why should they have different defaults unless you are willing to acknowledge that stats have more to do with determining what is default ou? and what about Ho-oh? is it so obviously uber in a metagame with SR rampant that the ou votes would have to total some arbitrary majority in order for it to be ou?

to be clear, despite whatever you may think of the tone of that last paragraph absolutely none of those questions are rhetorical.
 
OU would be the default position for pokemon that we cannot come to any clear consensus on. A vote decided by 1 person doesnt really seem like a strong enough justification for the banning of a pokemon.

I think the reason it would be necessary is because I dont really like the idea of things being banned without some kind of clear consensus.

Changing the verbiage does change things, even without an explicit definition of brokenness, because arguing that a pokemon is broken because it is banned in Nintendo tournaments is much harder than arguing that a pokemon should be banned because it is banned in Nintendo tournaments.

One other thing, which I am unclear on, which could make all this arguing redundant is how is the reconsideration stage going to work? We test all the broken pokemon together and then what do we do? Like, if by that stage we feel the Shaymin decision should be overturned then by what process will we decide that?

Also the relevance of the lack of the suspect test ladder is that people were able to qualify to vote with very little battling in the month of november. I myself just checking now, missed out by less than 21 points of deviation, despite not battling once on standard ladder since October 19th (the deoxys vote cut off, my deviation at that point was 54.87).

I decided I would do an experiment to see how long it would take me to make the requirements. It took me 16 battles, to go from 1627-1798 to 1738-1867. That really isnt enough.

Have a nice day.
 
This is exactly what I pointed out in Policy Review and earlier in this thread. I think that the mistake we made (or maybe I made) is that we didn't playtest Skymin on a separate ladder. I'd suggest that ANY suspect testing be done on a Suspect ladder, even if it has the exact same bans as the Standard ladder. The main reason for this is that the 1655/65 criteria were tested on a clean slate ladder, but another reason would be to separate 'suspect testing' from the real ladder. This way, if people _really_ want to have a say on Skymin's uberness or lack thereof, they should test on the proper ladder... which entails a little extra commitment, which can only be good.
 
What about the trend of decreasing voters with each new Suspect Test (on a separate ladder), however? Is a vote made by say, 15 qualified voters really a "better" vote than a vote by 51% of 118 voters?
 
If the numbers are being made up by people who only battled 16 times in the month then I would say a 15 man qualified vote is better yeah..

But I guess we should definitely set a figure as a minimum required number of qualified voters..

Have a nice day.
 
Part of a PM I received recently. I pretty much agree with the sentiments expressed within.

Also, although I cannot post there, I, along with many other users, am starting to fear that Smogon will not honor the votes once the poll is completed. This is because of the "faulty reasoning" behind the votes, or that an arbitrary percentage of votes other than a majority will be required for a ban, such as a 60-40 or better, which is not possible given the number of votes remaining. Even if a 51% majority vote is not responsible to the community per se, neither is keeping Skymin de facto OU. While the better outcome is not immediately apparent between the two, I think many people will be disappointed if the site staff throws out the vote.
 
What about the trend of decreasing voters with each new Suspect Test (on a separate ladder), however? Is a vote made by say, 15 qualified voters really a "better" vote than a vote by 51% of 118 voters?
To increase voters (and for other reasons), we changed the criteria from 1650/60 to 1655/65.
 
hip how can you say you'd be ok with a 15-man vote when it is very likely that at least one of 5KR and mr. skymin-has-a-0.12%-chance-of-flinching-jirachi-to-death-therefore-it's-uber would comprise over 13% of that vote (since 5KR's skill is not in question at all, which uncoincidentally is part of the problem)? you argued in my PR thread that people like 5KR made up a very small minority...do you honestly think 13% is small? or 27% or 33%?
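
for anyone who wants to check those percentages, here is a quick sketch. it assumes that 13%, 27% and 33% correspond to 2, 4 and 5 questionable voters out of 15 (my reading of the numbers, not something stated outright), and the function name is made up for the example.

```python
def vote_share(questionable_voters, pool_size):
    """Percentage of the voting pool made up by the given number of voters."""
    return 100.0 * questionable_voters / pool_size

# Share of a hypothetical 15-man vote held by 2, 4 or 5 questionable voters.
for n in (2, 4, 5):
    print(f"{n} of 15: {vote_share(n, 15):.1f}%")  # 13.3%, 26.7%, 33.3%

# The same 2 voters in the actual pool of 118 accepted voters.
print(f"2 of 118: {vote_share(2, 118):.1f}%")  # 1.7%
```

the same two voters who are under 2% of a 118-man pool are over 13% of a 15-man one.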

and you say:
OU would be the default position for pokemon that we cannot come to any clear consensus on. A vote decided by 1 person doesnt really seem like a strong enough justification for the banning of a pokemon.
why can't i substitute "uber" for "OU" and "not banning" for "the banning" here, which is going to apply to every pokemon suspect from now on
 
Because the purpose of the uber tier is to ban Pokemon from standard (OU) play. I do believe that all Pokemon are, by default, not banned. Just one reason for this being the ideal default position is that Pokemon that ought to be uber but are in fact OU will prove themselves to be worthy of being put in ubers (if such a Pokemon is unable to do so, then I'd question whether it ought to be uber). Pokemon that ought to be OU but are tiered as uber will prove no such thing. In other words, erring toward OU is a temporary problem, while erring toward uber can be permanent. All things being equal, it is better to make OU the default tier.

Are there any such reasons for making a Pokemon uber by default?
 
The only reason coming to my mind for making a Pokemon uber by default is the idea that the Pokemon is a new element to the metagame and we ask ourselves whether that element should be added to the metagame. After all, we'll be asking the question of whether to add things for the majority of what we have left to test.

Of course, with the Uber tier functioning as a ban list, I'm not sure that idea is necessarily a good mentality to have. Obi's reasoning is a hell of a lot more convincing.
 
I wouldnt be ok with a 15 man vote. I just think it is better than a 100 man vote made up of people who havent really tested the suspect.

If we set our minimum number of voters I think around 50 should be the absolute minimum..

Have a nice day.
 
Okay, well I have a question regarding Skymin, that could have an impact on the vote. From my understanding Shaymin-S has to be holding that flower in order to become "Skymin." If this is the case, then wouldn't everyone that's using Skymin, be considered cheating since they're "technically" using 2 items at once? Correct me if I'm wrong, but that could really help out since most of the Shaymin Sets rely on an item to be used to their fullest potential.
 
It's a key item that simply changes its forme without attachment. Much like using an evolution stone. It's impossible for Skymin to even be holding it.
 
Are there any such reasons for making a Pokemon uber by default?

yes, in the interests of the two or three months of time we will save by using our heads in future generations by recognizing the next mewtwos and palkias as uber, and, therefore, not labeling them as suspects we will have to spend two or three or four months on when it is obvious to everyone who knows anything about competitive pokemon that the pokemon is probably uber. even in the case of a "ho-oh uber", we will be able to use our pokemon intuition to determine whether or not it should be labelled a suspect after we see how the new standard metagame unfolds and how the "ho-oh uber" would feature in the standard metagame, and therefore label it as a suspect and spend one month tops instead of two or three

the fact remains that these suspect tests disrupt the standard metagame, whether or not it is the right thing to do to conduct them. obi, if you can tell me that you would rather spend three months on suspect tests for the next "palkia, dialga and giratina" when they are probably going to be obviously uber and given the knowledge that we can conduct suspect tests for them after the standard metagame settles, then ok, we will see who agrees with you. though i kind of think you are selling your own intuition short by wanting to test "probably uber" pokemon in future generation, regardless of the fact that we can use our intuition after the standard metagame settles to "reconsider"

ANYWAY, does anyone think that we should have doug determine when and how many times the 118 accepted voters actually played over the suspect test period? i think that this would solve our problem without actually slighting anyone who actually played over the period, because if you only played like 10 battles in that month but had a great record beforehand, guess what? you're good at pokemon, or "were", but literally didn't participate in the test to the extent that you would have made the requirements then.

^^^please answer this question, everyone who posts in this thread^^^. im working again now cause my broken kneecap healed so i dont have the time to be here from 8-7 or whatever and we need as many smart people to weigh in on this as possible
 
I would agree with perhaps checking up on the players who voted, but then their alts would have to be checked too as not everybody battles on the one account. Just because one account qualified using X battles, it doesn't mean that the user only had X battles during the qualifying period. How far do we want to take this? Checking the alts of that many people could take a fair amount of time ...
 
Not that it will really make any difference, but I wouldn't mind seeing how often the eligible voters' accounts battled over the test period.

You're not thinking of requiring a minimum on the number of battles, are you, Jump? I suppose I'd be up for that to ensure that voters battle enough on the Standard ladder. I can't see that being relevant on the Suspect ladder, though, unless you don't require voters to create a new account at or near the beginning of the test period or don't reset a voter's account at the beginning of the test period.

I do like the idea of requiring potential voters to "register" an account in a thread for testing purposes, though. It would at least cut down the time used to confirm the qualifying accounts.

Does anyone see major problems with all of that?
 