#1 |
|
Join Date: Aug 2008
Posts: 49
|
I feel, and I'm sure others concur, that the current system for determining tiers is substandard. Many people with voting privileges don't necessarily have a firm grasp of our current metagame or Smogon's philosophy. That said, I don't believe Bold Voting will do much better, since it can only filter the votes from the current testing process through the arbitrary judgment of whoever chooses which votes count. For these reasons I'm proposing a new way of testing suspects. Voting will still be based on merit, and I feel this process will take a lot of bias out of it. The testing system will require work to implement, but that work is necessary in order to filter out the bias.
Setup:
1. Everyone who wishes to partake in the process must register their accounts and decide whether the suspect is OU or Uber before the testing commences. They will then be separated into two groups, OU and Uber, based on their decision.
2. People in the Uber group must use the suspect.
3. People in the OU group may not use the suspect.
4. Testing will commence for a period of time on a different, suspect ladder (a different server or an honor system clause may be necessary).
5. If a player wishes to switch their decision, they must use a new account with a blank rating. Obviously, you may only have one account per person.
6. We will use the statistics on the ladder at the end of the testing to determine the suspect's tiering. It will be deemed Uber if there are significantly more top players in the Uber group than in the OU group (more than a 2:1 ratio).
(Optional) Deviation requirements for voting, to promote battling.

This testing method cements the implied definition of Uber that was unverifiable before: A suspect is considered Uber if its usage generates a decided advantage over an opponent who is not using the suspect, even when said opponent has full knowledge of the presence of the suspect on the opposing team and tries their best to inhibit the suspect from fulfilling its intended role. This definition should fit all Ubers and all previous suspects voted into Uber. The question "How much of an advantage makes it Uber?" will be determined by the arbitrary margin set in Step 6.

Why it works:
-By doing well on the ladder, you are essentially "proving" your vote. If the suspect is indeed broken, then those in the Uber group will have a decided advantage and will therefore reach the rank necessary to prove it Uber. The opposite case holds as well: doing well on the ladder against the suspect without employing it also reinforces your vote.
-The system forcibly creates a faux centralization and gauges the suspect's performance in that environment.
-Unlike the current system, there are a limited number of spots that matter, which gives battlers a motive to do their best instead of simply meeting the requirements to vote.
-The test guarantees that everyone judging the suspect's tiering plays in an environment where they are forced to play with or against the suspect.
-It gets rid of most of the self-interest, because in this process people can only prove or disprove that a suspect is Uber. For example, someone whose team is weak only to Skymin will vote it into Uber no matter what the circumstances, because it helps his team out. In this test, however, he can only benefit the testing process, because he is proving or disproving his opinion by winning or losing.

Foreseeable problems for this method of testing:
-Insufficient number of testers: If there are many more players in one group than in the other, take volunteers for the other side or pick testers at random to switch positions.
-Mass Conspiracy: If somehow the best players are lumped into one category while the worst are in the other, then the results would obviously be skewed.
-Human Error: The testers voting Uber may not use the suspect to its fullest extent, similar to how Deoxys-S's full potential was realized quite late into its usage period.
-Too difficult to implement: Unavoidable, that one is really a bummer =/

Thoughts, responses and criticisms are, of course, welcome. Anything thoughtful will only help the testing method. Lastly, thanks to outofdashwz for helping me bounce around ideas and proofread the post, and to Obi/ipl for inspiring this idea.

Last edited by guoguo; Dec 8th, 2008 at 11:57:12 PM.
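To make the Step 6 rule concrete, here is a rough Python sketch of how the final ladder could be turned into a verdict. The `Tester` record, the `top_n` cutoff for who counts as a "top player", and treating every result other than a clear 2:1 majority as OU are all assumptions made for illustration; the proposal itself only fixes the "more than a 2:1 ratio" margin.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tester:
    name: str
    group: str     # "Uber" (must use the suspect) or "OU" (may not use it)
    rating: float  # final rating on the suspect ladder


def classify_suspect(testers, top_n=100, ratio=2.0):
    """Return "Uber" if Uber-group players outnumber OU-group players
    among the top_n rated testers by more than `ratio` to 1, else "OU".

    top_n and the fallback to "OU" are hypothetical choices, not part
    of the original proposal.
    """
    for t in testers:
        if t.group not in ("Uber", "OU"):
            raise ValueError(f"unknown group for {t.name}: {t.group!r}")
    # Take the highest-rated testers regardless of group.
    top = sorted(testers, key=lambda t: t.rating, reverse=True)[:top_n]
    uber = sum(1 for t in top if t.group == "Uber")
    ou = len(top) - uber
    # "More than a 2:1 ratio" of Uber to OU players among the top spots.
    return "Uber" if uber > ratio * ou else "OU"
```

With a cutoff of four top spots, three Uber-group players against one OU-group player would clear the margin; three against two would not.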
|
|
|
|
#2 |
|
Join Date: Jun 2005
Posts: 4,905
Irvine, CA
|
Uhh, what? Isn't this exactly what we're trying to avoid, which is people deciding their votes before the test process is begun?
__________________
Black/White Friend Code: 1721 2578 4968 My Pokemon | Free Pokemon | YouTube | Wonder Cards (now with Movie Celebis for Platinum and HG/SS!) |
|
|
|
|
#3 | |
|
Join Date: Aug 2008
Posts: 49
|
Quote:
Edit: P.S. Where do you live in Irvine? I live in the same city |
|
|
|
|
|
#4 |
|
I love weather; Sun for days
Join Date: May 2008
Posts: 3,573
Where the Sun finally sets
|
I somewhat like this idea. We "assume" that the suspect is either OU or Uber. The people who say Uber are allowed to use it in their teams while the people who say OU cannot. If the people who use it rank much higher than the people who don't, then it is Uber. If the people who don't employ it rank just as high as the people who do employ it, then it is OU.
|
|
|
|
|
#5 | |
|
Join Date: Jun 2005
Posts: 1,710
|
I didn't read much of this yet but
Quote:
__________________
Analyzing battle data for luck factors... |
|
|
|
|
|
#6 | |
|
Join Date: Aug 2008
Posts: 49
|
Quote:
|
|
|
|
|
|
#7 |
|
Join Date: Jul 2007
Posts: 829
Seattle
|
Yeah it would be better to just require that people must make teams both with and without the suspect and I'm not even sure that would be such a great idea.
__________________
Warmachine 1.0 my warstory formatting program. Friend code :Richard 2792 7578 5799 |
|
|
|
|
#8 |
|
Join Date: Jan 2008
Posts: 91
|
I do not see why the usage of the suspect Pokemon should be restricted or required for eligible voters. Regardless of whether a player uses that Pokemon, its properties should become evident through repeated interaction as an opponent. You must adapt to its presence if you wish to defeat it, and you need to be aware of its downfalls when playing it. Ultimately, playing with and against the suspect Pokemon provides enough experience to understand it and to determine where it properly belongs.

The voting system is brilliant. You have a predetermined filter to find the players with the best understanding of the game. Once you have the best competitive players, you allow them to mold competitive game play. Not only does this minimize the effort put forth by the staff, it also allows Smogon to reinforce its reputation as a competitive Pokemon site while demonstrating proper respect for the community members and their wishes. Since all isolated competitive communities should encourage competitive and community growth, this is the best and easiest way to do all of the above. It gets even better, as that predetermined filter is easy to adjust to ensure that the group selected for voting is selectively superior. Is the voting requirement too low? Just raise it. The voting system is definitely the best way to make changes for this particular site.

That said, I think it could also be refined. At the moment, there is no clear way to test the suspect Pokemon. There are also no defined values for the percentage that will constitute a solid decision, and there are no definitions to guide the eligible voters towards the best decision. There should be no interference with the voting pool or the decisions made therein.

For example, with the recent Shaymin-S vote, they had the right idea, but it was not executed properly. There is no Shaymin-S exclusive suspect test ladder; rather, it was lumped into the standard ladder. There was nothing that stated that a 60% majority would be required before a decision is made. We have no definition of Uber beyond what the community generally feels does not belong in OU. Some members in the voting thread have made it clear that they intend to discard votes with undeveloped logic behind them, even though those voters have already fulfilled the eligibility requirement. The Shaymin-S vote also lacks a short-term solution should the vote come down to a deadlock or a tie.

I feel that all of these errors are small and can be fixed easily for future votes so long as we learn from our mistakes this time around. This is how I would propose that we organize future suspect votes.
|
|
|
|
#9 | |
|
Join Date: Sep 2007
Posts: 1,992
minnesooooota
|
Quote:
__________________
Aldaron: what umbarsc you are not allowed to be scandinavian Aldaron: i love scandinavians Aldaron: you can be Mexican |
|
|
|
|
|
#10 |
|
Join Date: Dec 2008
Posts: 23
|
Since I am from PB, I am rather unfamiliar with how the system currently works, but I don't think limiting each tester to one side would help. Instead, we should have a control group test both sides for a period of time and then vote. This might move those who would vote purely out of dislike for the suspect toward a reasoned opinion based on testing.
|
|
|
|
|
#11 | |
|
Join Date: Jan 2008
Posts: 91
|
Quote:
Since we are structuring the game to fit what we deem as "competitive", we have already taken upon ourselves a personalized understanding of what a healthy metagame and proper tiering should be. That said, we have already included bias to create the format that is currently used, and bias will be included regardless of any alternatives that are used. However, if you can think of a way to implement the general interest of the competitive community in a more effective manner or with less bias, post it here so we can learn from it and use it =)

Edit: While I'm making requests, I would really appreciate the input from any members directly related to the decision making here at Smogon to add further insights that we can also consider.

Last edited by Mia; Dec 7th, 2008 at 11:04:55 PM.
|
|
|
|
|
#12 | |||||||
|
Join Date: Aug 2008
Posts: 49
|
Quote:
Quote:
Quote:
Quote:
Quote:
Quote:
Quote:
|
|||||||
|
|
|
|
#13 |
|
Guest
Posts: n/a
|
The reason people need to acquire ratings on both standard and suspect is to get a feel for both metagames. They may say something is broken in standard, but in suspect it is not really an issue. They decide which metagame is better. How can they see which is better if they can't play the other metagame?
|
|
|
#14 |
|
Join Date: Jan 2008
Posts: 91
|
I responded to umbarsc, not to your initial post. The purpose of my post was to convey my opinions to umbarsc, not to act as a repetition of your own. If you feel that I have wronged you in some way, I apologize.
|
|
|
|
|
#15 | |
|
Join Date: Jan 2008
Posts: 801
New Jersey
|
Quote:
tldr; this is simply a science experiment gone awry. |
|
|
|
|
|
#16 |
|
Join Date: Sep 2007
Posts: 575
|
I would personally just let people vote whatever they want, however many people there are on either side, and then just take the average rating of those that you do have. Chances are, if you can't find enough people on one side, then it's probably really obvious which way the vote is going anyway. I can't say I'm a fan of the idea though.
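For what it's worth, the "average rating per side" comparison could be computed along these lines. This is only a sketch: the `(side, rating)` input shape and the function name are made up for illustration, and nothing here says how large a gap between the two averages would count as decisive.

```python
from collections import defaultdict
from statistics import mean


def average_rating_by_side(entries):
    """Group (side, rating) pairs by side and return the mean rating of each.

    `entries` is a hypothetical input format: one pair per registered
    tester, where side is e.g. "Uber" or "OU".
    """
    by_side = defaultdict(list)
    for side, rating in entries:
        by_side[side].append(rating)
    # A side with no testers simply does not appear in the result.
    return {side: mean(ratings) for side, ratings in by_side.items()}
```

Because uneven group sizes are allowed, a side with only a handful of testers gets an average that a single strong or weak player can swing.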
__________________
not a counter |
|
|
|
|
#17 |
|
Join Date: Aug 2007
Posts: 1,971
My Soul, Your Beats!
|
I have to disagree with any process where votes are decided before the process begins, especially those where the votes are hard/costly to change because of the rules. Like Syberia posted, it's the kind of thing that I think we need to avoid.
That said, I do like the idea of registering accounts for the test. It could save us quite a bit of work, especially when confirming accounts. Here is what I would add to it: Require a minimum number of battles to be conducted on each account. We want our voters to have sufficient experience with the suspect in order to make an educated decision about its status whether it be "ban/not ban" when it comes to attacks (Evasion/OHKOs) or "OU/Uber" when it comes to Pokemon (Lati@s/Manaphy). If we require the voter to participate in X number of battles, then we can feel more confident that the voter has sufficient experience.
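A minimum-battles check on registered accounts could be as simple as the sketch below. The mapping of account name to battle count and the `min_battles` parameter (the "X" above) are placeholders; no actual threshold is being suggested.

```python
def eligible_voters(battle_counts, min_battles):
    """Return the registered accounts that have played at least
    `min_battles` battles on the test ladder, sorted by name.

    `battle_counts` is a hypothetical dict of account name -> battles played.
    """
    return sorted(
        name for name, played in battle_counts.items() if played >= min_battles
    )
```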
__________________
Creator of the DPP Pokemon Resource, the one thing no Trainer should be without! (eventually... maybe) My YouTube Channel My Blog (VERY, VERY small, ATM) |
|
|
|
|
#18 |
|
dreams of ladybugs crushed forever
Join Date: Jun 2007
Posts: 5,366
five years here and i can't change my custom title :(
|
I think we should do minimum number of battles but ignore rating entirely. If something is truly broken, then battles will come down to speed ties or hax much more often and thus it will be harder to get a good rating than normal.
Also Skymin is conclusive proof that good rating and good vote have absolutely no correlation.
__________________
i was nobody we're all a little bit strange, don't worry about it |
|
|
|
|
#19 | ||
|
Join Date: Aug 2008
Posts: 49
|
Quote:
Quote:
|
||
|
|
|
|
#20 |
|
Join Date: Jul 2007
Posts: 302
|
I like your idea a fair deal and I hope that everyone at least reads and understands your proposal. However, some of the implementation is a little sticky and has a few problems, though I think a slight change could fix that. The basic principle behind your idea is to see whether teams with the suspect perform better than teams without it, even when those teams are fully prepared with the suspect in mind, right? So couldn't we test this just by looking at the win ratio of teams containing the suspect, without needing all the other rules that go with your proposal? This seems far easier to implement and doesn't require any additional work on the part of the actual battlers (thus attracting a larger pool of testers). Is this fair or am I misunderstanding what you are saying?
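As a rough sketch of that win-ratio check: given a battle log that records whether each side used the suspect, count only the battles where exactly one side did. The log format is hypothetical, and skipping mirror matches and suspect-free matches is my own assumption about what should count.

```python
def suspect_win_rate(battles):
    """Fraction of one-sided battles won by the team using the suspect.

    `battles` is a hypothetical log: an iterable of
    (winner_used_suspect, loser_used_suspect) boolean pairs.
    Returns None if no battle had exactly one side using the suspect.
    """
    wins = losses = 0
    for winner_used, loser_used in battles:
        if winner_used == loser_used:
            # Both or neither side used the suspect: the result says
            # nothing about the advantage the suspect gives.
            continue
        if winner_used:
            wins += 1
        else:
            losses += 1
    total = wins + losses
    return wins / total if total else None
```

A rate near 0.5 would suggest no decided advantage. How far above 0.5 it has to be is the same arbitrary-margin question as in the original proposal, and the number still doesn't separate player skill from the suspect's strength.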
|
|
|
|
|
#21 | |
|
Join Date: Sep 2007
Posts: 1,992
minnesooooota
|
Quote:
|
|
|
|
|
#22 | |
|
Join Date: Jan 2008
Posts: 801
New Jersey
|
Quote:
You also mention deviation requirements could be optional "to promote battling". Are these requirements necessary before testing (to see if you're good enough to get into the program) or during the test (like the system being proposed in the PR forum)? Now, I don't mean to sound like I'm totally against this proposal of yours. I do like how your plan makes sure the active community members of Smogon are the ones conducting this test, not people who come back from vacation and happen to realize their account is eligible to vote. This method also eliminates the likelihood of multiple accounts/votes. I also like how a definition of Uber could be proved by this process, but until we can resolve the skill issue the results can't be accurate. |
|
|
|
|
|
#23 | |||
|
Join Date: Aug 2007
Posts: 1,971
My Soul, Your Beats!
|
Quote:
Quote:
Okay, I need to ask this: are you encouraging voters to vote before testing begins and not change their minds? The way the actual vote is conducted pretty much requires that voters don't switch their vote in order to have an impact on the vote. If voters are making up their minds beforehand and aren't given total freedom to change their minds, how is that much different from what's happening now? At least the current system allows you to easily change your mind during the process. This system seems to encourage everything to stay the same. Quote:
Long story short, your system looks to encourage entering the test with an opinion in mind and not changing it throughout the test, the opposite of what I believe the testing process should do. I want voters to enter the test with no opinion on the matter and form an educated opinion through their experience during testing. 'But that's impossible, Cards. Voters are already going into these tests with their own opinions on a suspect. Testing isn't going to make a difference to several of these opinions.' If this is such a problem, then why does it look like you are encouraging more of it?
|
|
|
|
#24 | ||
|
Join Date: Sep 2007
Posts: 605
|
Quote:
__________________
Quote:
|
||
|
|
|
|
#25 |
|
Join Date: Sep 2007
Posts: 1,992
minnesooooota
|
CardsoftheHeart, I don't think the idea is to take the voters' votes into account. Unless I'm mistaken, the idea isn't to vote exactly, but to give your opinion before the testing starts and then sort of "prove" it by battling with or without the Pokemon (depending on whether you think it's Uber or not). Then you see, on average, how well the "Uber" people are doing and how well the "OU" people are doing. I don't believe the votes themselves are actually taken into account when determining the tier.