Starting NU Testing

twash · Mar 21, 2010

I am posting this in IS so it gets more attention - I'll move it to PR in a couple of days.

After asking Jump once more, he agreed that it is fair to post here in an attempt to get something going for NU, mostly because generation five is looming. The biggest question is whether NU will be popular enough to not only support testing, but support testing alongside suspect tests in other tiers. I personally think it can (with the correct publicity), but I obviously can't guarantee it. The best way to find out is to try publicizing it, and we need to do so soon because of the new generation. A ladder on CAP won't get as much attention as one on SU to be frank, so I think we should implement one on SU asap.

The suspect test that I was planning on using is essentially the same as UU's.

1. Six week period for playtesting.
2. Nominations for moving NU Pokemon to BL2 (placeholder name). People must have enough SEXP to nominate, and must write a paragraph explaining why they feel the Pokemon/item is broken.
3. If a Pokemon/item receives enough approved nominations (10% of total is likely desired), then it is deemed a suspect.
4. Eligible voters are notified that voting will be taking place. To be eligible, the player must meet rating/deviation requirements to vote at all, and must have enough SEXP to vote on each individual Pokemon.
5. Process is repeated. If all Pokemon are voted NU, then there are immediate threads placed for BL2->NU nominations, followed by a six week period for playtesting and then voting.

As for Pokemon who are moved down to NU because of usage, I feel that they should be placed in BL2 until the current six week period is over. They are then immediately placed in NU at the start of the next six week period.

Thoughts?

Kevin Garrett · Mar 21, 2010

This looks like a good way of doing it to me. It's been working well for UU. I think it would be popular with the right help and would be interested in taking part.

Philip7086 · Mar 21, 2010

I'm pretty sure it has unofficially been decided that 6 weeks is a bit too long per test. At least, that's what I felt was the consensus after talks with reach and jabba. I'll wait for them to confirm their feelings on this, but perhaps you should consider just 4 week periods per test.

Itsuki · Mar 21, 2010

I've been waiting for an NU test for quite a while. =]

Similar to Kevin Garrett, I think this would be a good process for testing. The NU metagame does seem unpopular, but didn't new UU start off the same (correct me if I'm wrong)? The NU tournament did seem to spark a lot of interest, seeing that the tournament was originally capped at 64 but was bumped up to 256. So I predict some popularity in the near future even if this doesn't seem to work at first.

twash · Mar 21, 2010

Philip7086 said:
I'm pretty sure it has unofficially been decided that 6 weeks is a bit too long per test. At least, that's what I felt was the consensus after talks with reach and jabba. I'll wait for them to confirm their feelings on this, but perhaps you should consider just 4 week periods per test.

6 weeks is best to begin with, if not 8. The metagame is likely to be quite hectic and unexplored at the start and will need settling somewhat, so we can review the situation after a couple of periods. NU is likely to be less popular than UU, so a longer test is probably worthwhile.

eric the espeon · Mar 21, 2010

Really hope NU gets going soon. As for long tests versus short tests, if you're going to force tests of over a month then imo it's best to leave open the option to vote on and ban "obviously broken" Pokemon before the end of each testing period. Such bans would require a large majority to go through, but they would prevent a horribly broken Pokemon from wrecking the metagame until the end of the period and rendering most of the testing period near useless.

Heysup · Mar 21, 2010

Philip7086 said:
I'm pretty sure it has unofficially been decided that 6 weeks is a bit too long per test. At least, that's what I felt was the consensus after talks with reach and jabba. I'll wait for them to confirm their feelings on this, but perhaps you should consider just 4 week periods per test.

I agree with this.

We really don't need any longer than four weeks to figure out that something is broken (or at least deserves suspect status). Most of the rounds in UU have felt tedious because they ran so long. This is especially annoying when you consider that the metagame is likely not balanced or an "ideal playing environment" for the initial rounds.

We should also consider that the NU metagame is not actually "unexplored" like the UU metagame was. We actually had no idea what the UU metagame was going to be like simply because we started the test as soon as the BL Pokemon were dropped down. People have had a chance to explore NU a bit because we have had updated NU tiers for a while, and we don't need to change the metagame further. We also have access to the CAP ladder to explore.

Fuzznip · Mar 21, 2010

I am excited for NU to lift off. I also think that four weeks should suffice for our suspect tests. We don't want to make this a very long test, considering how easy it is to notice something that is close to broken (a lot of people already have a lot in mind).

I wouldn't really say that NU is "unexplored", because the CAP server has had the NU ladder for a long time, and it does actually get some traffic. On top of that, there are the NU forums that discuss the metagame and other things like that. Finally, there's an NU tournament going on. All in all, people have had the chance to explore the NU environment thoroughly.

Other than that, I can't wait to participate.

jumpluff · Mar 21, 2010

The format proposed works splendidly for UU and it would be good for NU to be in sync with UU, though I think it should have just month-long tests. I am really excited that we can finally officialise NU as a valid metagame ;)

As to its popularity, LC gained a fair bit of attention when it was officialised. NU has already gained popularity from tournies, the Smog articles, and just general promotion. I often see people on SU asking for NU matches and I know there's an established community that plays on CAP already, so I'm confident that we can invest resources into what already exists as a tier.

Diesel · Mar 21, 2010

I agree with everything that has been said so far, including the push for a 4-week testing period (based on Heysup and Fuzznip's points). The metagame has been very centralized from what I have seen, so I'm interested to see what happens with a larger player base. I don't think it'll take a whole lot of effort to promote, but it definitely should be promoted to an extent, since people will get bored with it if it doesn't get enough traffic.

Thanks for jumpstarting this, Ashley, I'm very excited.

Aeolus · Mar 22, 2010

I would like to say that I think the current form of suspect testing is exhausting. It is exhausting for those who administer it, for those who participate in it, and for those who gather data for it. It is especially exhausting when you ask people to keep it up for over a month in multiple different modes of play (UU/OU and now NU). I hate the idea of adding yet another layer of this burden to an already tired and shrinking pool of people willing to invest the time into the process. Rather than simply caking on another level of the status quo suspect process, I'd rather talk about ideas to handle NU more expeditiously.

Given that NU is not nearly as important or widely played as OU or UU, it does not necessarily need the same level of rigor in testing. I think it is worthwhile trying to get the metagame right... but I think we can do that without months and months of testing and weeks of deliberation.

I'd like to invite people to post suggestions about how to handle NU more efficiently than what is proposed in the OP.

Erazor · Mar 22, 2010

well, we were debating more efficient methods of testing in uu, so maybe we can experiment with nu?
one thing that was brought up was the redundancy of nominations and voter paragraphs. we could allow the nominations paragraph to be accepted as the voter paragraph.
another thing is sexp. in uu, we have only a light sexp requirement, and i think that nu should be light as well. this prevents artificial centralization while ensuring voters know what they're talking about.
as for the paragraphs, we could become a bit more lenient. a more common sense based approach, if you will.

this should make the evaluator's job easier in terms of reading.

i would type more but i'm using my ps3, so...

jumpluff · Mar 22, 2010

Something that I feel would make things more efficient is doing what the sped-up UU process does and just having voting by the paragraphs. I don't think we really need to spend the extra days waiting for the thread. I like the tiering changes being implemented as soon as possible, and people do get very antsy waiting when we could prevent that.

Darkmalice · Mar 22, 2010

This point is similar to Erazor's and jumpluff's but expanded upon.

Players who meet the rating/deviation requirements make nomination paragraphs on Pokemon. After the nominees are accepted, the players with the rating/deviation requirements can vote on the nominees if they have sufficient SEXP for that particular Pokemon.

This is basically a shorter UU test, and players bypass the paragraph requirement (besides making satisfactory nomination paragraphs), saving the suspect test holders the effort of reading many paragraphs, but the players still need to have sufficient voting requirements to vote and there needs to be sufficient reasoning when making nomination paragraphs.

twash · Mar 22, 2010

I can't help but feel the reason a lot of people don't vote is because a lot of them don't want to write final paragraphs - this is even more effort on top of gaining basic voter rights, and it also means that the testing process is drawn out longer. Perhaps we should have strict nomination rules (reasoning and SEXP required), but remove the paragraphs which would otherwise be necessary to have a final vote on the individual Pokemon, and just use SEXP and rating/deviation instead. It means that more people are going to be willing to play and vote because it is less "effort" on their part. It also removes a lot of time and work issues of checking and deciding paragraphs. The biggest time consumer then is gathering data (SEXP mostly - we can easily create threads where users post saying they have the correct rating/deviation before checking), and setting it up in terms of permissions etc - although perhaps we could just have a voting thread in a public forum where people who reached the requirements can post their votes - we can just delete any random posts to be honest. Or perhaps get them PMed to whoever is running the process and they post the results. Either works, and it's more lenient and less taxing on everybody.

Obviously this process still requires consistent attention until the end of generation five, but it is definitely less work on both those running it and those voting.

/edit: Moved to Policy Review 24 hrs after this post.

Erazor · Mar 24, 2010

Okay, so let me outline what I feel the process should look like:

*Implementation of NU ladder on the SU server and creation of NU subforum in Stark. Maybe incorporated with the UU subforum?

*6 weeks playtesting. This is because it's the first time it's being played "officially", and the number of players is likely to increase if there's an NU server on SU.

*Nominations thread for suspects. Pokemon who garner 10% or more of nominations will be considered suspects.

*Identification of alts based on rating/deviation/SEXP requirements - I think 1600/55 is fine for now, with no upper requirement. And of course, a light SEXP requirement.

*Eligible voters are notified, and they send in paragraphs (which do NOT have to be as detailed as the current ones are), which focus on the Uber characteristics in a less rigid way, and more on "common sense".

*You must clearly state your vote (NU/BL2) on each of the suspects in your paragraphs. This enables us to skip voting threads, which saves us time.

*Rinse and repeat the process, but 4 weeks instead of 6 now for playtesting.

The main thing is that the paragraphs are easier to read and evaluate. I know that, having written a few myself, the evaluators (Jumpman16/Aeolus/Jabba/reachzero) have to read many, many essays, most of which are quite lengthy. This is aimed at making your work easier (or that of whoever is going to be in charge of the NU process).

The primary reason people don't vote is because of paragraphs. If we can make them less stringent, while maintaining a certain standard, then it should benefit everyone.

Please feel free to tell me if there are more efficient ways of carrying out the test.

Heysup · Mar 24, 2010

I don't know how I feel about removing the paragraphs. I think the paragraphs make sure we not only weed out the bad players (via rating requirements) but also the people who want to vote something BL/NU for reasons other than balance. For example, in the second most recent UU Vote (the one where Froslass was nominated) many people who had the rating requirements tried to vote Froslass BL simply out of dislike and didn't apply logical reasoning to their paragraphs. If we completely remove paragraphs, then we also remove the validity of (at least some of) the votes. For example I'd have been able to vote Honchkrow and Yanmega both UU because they were important aspects of my team. As much as we'd like to think otherwise, this will almost assuredly happen.

I think the new UU process that has been suggested / outlined in the UU thread sounds like a good process to follow. This means the process would have:

4 weeks per round. 6 weeks is really unnecessary and kills the activity of the metagame simply because of the nature of the metagames being "unstable" in the beginning. For example, people simply stopped playing at the end of the round where Froslass/Raikou/Gallade were dropped down.

Nomination thread: consensus based (not completely % based).

Voting requirements: 1600/55 with paragraphs explaining votes. If the upper requirement (let's say 1750-1800/45 for example) is obtained, then the user is simply required to write a sentence or two explaining the vote. We could always lower the requirements as well. These ratings should actually be lowered for NU.

We can compensate for the paragraphs by simply shortening the test. This would make it less exhausting than the current suspect testing process while also retaining the validity of every vote. The paragraphs can be less detailed as well.

We will see how this pans out in UU soon as well.

EDIT: Twash if you're trying to replicate the UU test then this is it, changes have been made since we've started. Just in case you were not aware (I got the hint from the "sticking to %" which we no longer do in UU).

twash · Mar 24, 2010

OK.

6 weeks for the first test, and then either 4 or 6 weeks on the next run depending on activity.

Nominations should stay as a %, probably sticking to the standard of 10% from UU's test. "Consensus based" is kind of awkward because "where do we draw the line?" is something that will inevitably be asked - and "majority" is rather high.

Paragraphs stay, and can be less detailed. Potential voters must state their votes and why they are voting that way (explaining how they are using the Uber characteristics) - but I am just reiterating the fact that it can be less detailed.

ToF · Mar 24, 2010

I still like the idea of the upper rating requirement for all suspect related voting. It gives players an incentive to reach that higher rating and deviation threshold while also bypassing the subsequent paragraph writing. I would think that if you reach that high of a rating and meet deviation needs, your play should speak for itself about how much experience you have in the tier (of course SEXP should count in this, you should obviously use whatever suspects are currently being voted on). Not only would this lower the amount of possible paragraphs (as if they aren't already low relative to earlier in DPP), but it would encourage more people to vote. Perhaps food for thought for the next generation if it is impossible to implement now.

Kevin Garrett · Mar 25, 2010

I like the concept of having an upper requirement that can bypass writing paragraphs. Participation is generally larger and it can alleviate some of the burden for the people reading the paragraphs without jeopardizing the quality of votes. If it doesn't need the same level of attention as OU and UU, then it shouldn't be a problem.

twash · Mar 26, 2010

Final plan:

1. Playtesting period. This period will begin as 6 weeks (both to let it settle a bit more as it is likely to develop quickly to begin with, and to also give us a solid first pool of voters - we don't know how active it will be yet), and will likely be reduced to 4 weeks.

2. Nominations for moving NU Pokemon to BL2. People must have enough SEXP to nominate the chosen Pokémon/item, and must write a paragraph explaining why they feel the Pokemon/item is broken.

3. If a Pokemon/item receives enough approved nominations (10% of total, but we will use common sense here), then it is deemed a suspect.

4. Eligible voters notified that voting will be taking place. To be eligible for voting, the player must meet a lower 1600/65 rating/deviation requirement, and also have enough SEXP to be able to vote on the suspects individually. The player must send in paragraphs explaining what and how they are voting, and also explaining how they came to this conclusion (using the Uber characteristics). There will also be an upper bound requirement of 1750/55 rating/deviation, where players only have to write a sentence or two explaining what they are voting and why.

5. Process is repeated. If all Pokemon are voted NU, then there are immediate threads placed for BL2->NU nominations, followed by a six week period for playtesting and then voting.
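
For anyone who finds it easier to read as code, the two voting tiers in step 4 could be sketched roughly like this. It is only an illustration, not how the process will actually be run: it assumes "1600/65" means a rating of at least 1600 with a deviation of at most 65 (and likewise for 1750/55), and the names `Player`, `sexp_ok_for` and `voter_requirement` are made up for the example.

```python
from dataclasses import dataclass

# Thresholds quoted in step 4 of the plan above.
LOWER_RATING, LOWER_DEVIATION = 1600, 65
UPPER_RATING, UPPER_DEVIATION = 1750, 55


@dataclass
class Player:
    name: str
    rating: float
    deviation: float
    # Hypothetical field: suspects for which this player has enough SEXP.
    sexp_ok_for: set


def voter_requirement(player, suspect):
    """Return what the player must submit to vote on one suspect.

    "sentence"  -> upper tier, a sentence or two is enough
    "paragraph" -> lower tier, a full paragraph is required
    None        -> not eligible to vote on this suspect
    """
    # SEXP is checked per suspect, as in step 4.
    if suspect not in player.sexp_ok_for:
        return None
    # Assumption: higher rating is better, lower deviation is better.
    if player.rating >= UPPER_RATING and player.deviation <= UPPER_DEVIATION:
        return "sentence"
    if player.rating >= LOWER_RATING and player.deviation <= LOWER_DEVIATION:
        return "paragraph"
    return None


if __name__ == "__main__":
    # "SuspectA" is a placeholder, not a real nomination.
    example = Player("example_alt", 1650, 60, {"SuspectA"})
    print(voter_requirement(example, "SuspectA"))
```

The SEXP check comes first so that a high rating alone never grants a vote on a suspect the player has no experience with.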

Matthew · Mar 27, 2010

Erazor said:
Okay, so let me outline what I feel the process should look like:

*Implementation of NU ladder on the SU server and creation of NU subforum in Stark. Maybe incorporated with the UU subforum?

Please feel free to tell me if there are more efficient ways of carrying out the test.

From what I understand, implementing a new ladder onto SU requires a ton of work, and I mean a ton. There's a perfectly fine ladder on CaP which can be played on and I'm sure that there can be an NU ladder implemented into SB2. That being said, I think that our programmers should spend more time on that rather than implementing a new ladder.

Other than that I have no qualms about the process.

Seven Deadly Sins · Mar 27, 2010

Yeah, as far as I can tell that's the only reason that we don't already have an LC ladder on the SU server. Everything seems fine, but assuming that you're going to have a ladder on SU before SB2 comes out is extremely shaky.

The mindset is that programmers should spend more time on SB2 than on SB1, and I generally have to agree.

jumpluff · Mar 27, 2010

It's possible the ladders can be reused from CAP, though I don't know anything much about ladder implementation. I have a decent idea from setting up a server myself, but yeah.

Seven Deadly Sins · Mar 27, 2010

I thought so as well, but if it was that easy I'm sure Doug would have done it by now.
