Metagame Analyses: Gen VI changes

#77
Generally I like how the movesets factor in, although I would say that Choice Scarf should subtract less than Band or Specs. This is because, although it is still an offensive item, it is actually usable on stall teams for revenge killing threats that can't be handled defensively, and as a win condition. You will almost certainly never see a Pokemon with Huge/Pure Power, a Life Orb, Specs/Band, or a setup move on a team and still be able to call it stall. You may, however, see a Scarfed Pokemon and be able to call the team stall, or at least semi-stall. For that reason I think it should subtract less, perhaps 0.25.
 

jc104

Humblest person ever
is a Contributor Alumnus
#78
Looks very cool antar. There are a couple of errors in there such as the brackets in innocent criminal's formula (shouldn't be there, I assume).

I think you need to consider which abilities are actually useful when you decide whether they should add offensiveness or defensiveness. Abilities that are not very useful, such as Anger Point, Blaze, Torrent, Overgrow, Sniper, and Battle Armor, are usually chosen out of necessity rather than in an effort to make a pokemon more offensive. They should make less or no difference to the score.

"The moves Reflect and Dual Screen add 1 to the metric. Screens are rarely seen in stall and are immensely important for hyper offense. "

Do you mean subtract? Also Dual Screen is not a move lol. I'd say they should probably not affect the score though. They are sometimes used defensively, albeit not together. If you want anything to subtract I'd say Light Clay.

Will take a look in more detail later.
 

Antar

is a Battle Server Administrator, a Programmer, a Super Moderator, and a Community Contributor
Official Data Miner
#79
okay right off the bat ima say that reflect should not be considered highly offensive in stalliness - not on its own (so if the metric was referring specifically to both screens at the same time, disregard me). light screen imo is more so than reflect.

also, i think you wrote originally that reflect and light screen would be adding. i thought this would indicate more stalliness? typo?
"The moves Reflect and Dual Screen add 1 to the metric. Screens are rarely seen in stall and are immensely important for hyper offense. "

Do you mean subtract? Also Dual Screen is not a move lol. I'd say they should probably not affect the score though. They are sometimes used defensively, albeit not together. If you want anything to subtract I'd say Light Clay.
Yep. Typos.

And I'll go ahead and change this to Light Clay [edit: and "subtract"]

There are a couple of errors in there such as the brackets in innocent criminal's formula (shouldn't be there, I assume).
Lol, yes. Another typo.

I think you need to consider which abilities are actually useful when you decide whether they should add offensiveness or defensiveness. Abilities that are not very useful, such as Anger Point, Blaze, Torrent, Overgrow, Sniper, and Battle Armor, are usually chosen out of necessity rather than in an effort to make a pokemon more offensive. They should make less or no difference to the score.
IIRC no one is forced to use Anger Point, but I concede the point. I'll be editing that ASAP.

To clarify, will you be taking the Pokemon's most used sets, or all of them, in your metric?
PS logs record the actual movesets of each pokemon, so I'll be using the moveset used by that Pokemon in the battle.

Generally I like how the movesets factor in, although I would say that Choice Scarf should subtract less than Band or Specs. This is because, although it is still an offensive item, it is actually usable on stall teams for revenge killing threats that can't be handled defensively, and as a win condition. You will almost certainly never see a Pokemon with Huge/Pure Power, a Life Orb, Specs/Band, or a setup move on a team and still be able to call it stall. You may, however, see a Scarfed Pokemon and be able to call the team stall, or at least semi-stall. For that reason I think it should subtract less, perhaps 0.25.
As I say at the end of the post, by doing the averaging the way that I do, I ensure that an outlier doesn't dominate the team's score (Blissey on an otherwise offensive team was my best example). The problem with distinguishing between Band/Specs and Scarf is that, more often than not, the sets play roughly the same way. Anyway, I knew that decision would be controversial. I'll be interested to hear what other people say.
 
#82
Well actually... idk. Dumb question there, sorry.
How are accuracy-boosting moves represented? (I assume -1 like the other offensive moves.) What about defensive boosts (Cosmic Power)?
 
#83
Reading through the list of modifiers, something that stood out to me was Flash Fire subtracting half a point. While Flash Fire does increase the power of Fire moves, it also gives pokemon blessed with it a useful immunity. While some pokemon such as Houndoom and Arcanine appreciate its power boost, others care more about the ability to absorb Fire-type moves. Imagine a SpD Heatran without Flash Fire; all of a sudden, it doesn't make a very good Sun/Volcarona counter. Basically, the ability is equally offensive and defensive so I don't think FF should have a modifier at all.

In addition, Sap Sipper and Storm Drain, abilities similar to Flash Fire, don't have modifiers. No matter how you rule on FF, those three should be grouped together.
 
#84
Reading through the list of modifiers, something that stood out to me was Flash Fire subtracting half a point. While Flash Fire does increase the power of Fire moves, it also gives pokemon blessed with it a useful immunity. While some pokemon such as Houndoom and Arcanine appreciate its power boost, others care more about the ability to absorb Fire-type moves. Imagine a SpD Heatran without Flash Fire; all of a sudden, it doesn't make a very good Sun/Volcarona counter. Basically, the ability is equally offensive and defensive so I don't think FF should have a modifier at all.
If this is implemented, it should be implemented for Sap Sipper, Lightningrod, Storm Drain, Water/Volt Absorb, etc.
Also, Rattled should be added to the list of offensive abilities.
 

alkinesthetase

<@dtc> every day with alk is a bad day
is a Live Chat Contributor Alumnus
#85
The problem with distinguishing between Band/Specs and Scarf is that, more often than not, the sets play roughly the same way. Anyway, I knew that decision would be controversial. I'll be interested to hear what other people say.
hmm. we all know that specs/band are generally distinguished by making a mon extremely difficult to switch into by virtue of sheer power, but only rarely execute sweeps (eg scizor's bullet punch or other banded priority. only a few mons can do this well). scarf on the other hand is used for revenging, and for late-game cleanup. yes this is a generalization but it's a pretty accurate one, i'd say. i think many types of teams can benefit from the latter, but only offensive teams will benefit from the former, which is why i would argue that scarf be given an offensive lean, but less so than the other two choice items.

many stall teams can benefit from the use of a strong scarfer. often in full stall that's the only hard hitting mon on the entire team, which ensures that boosting sweepers don't get out of hand (an issue on true full stall teams where there's a lot of setup bait) and also gives the team a way to quickly clean up opposition without having to excessively prolong the match. having a mon that can instantly hit hard is also very useful against taunting stallbreakers and mono-attacking bulky sweepers that often excel against stall teams because they don't hit hard enough to break them down quickly.

in comparison i think cb/specs sets are almost exclusively useful for their ability to put switch-in pressure on an opponent. this is intended to break super-defensive walling mons that would impede the sweep of another mon (usually packing less sheer power, but more speed or the ability to switch moves or something along those lines). i have yet to see a full stall team having a use for such functionality because walls are more efficiently eroded by toxic and hazard abuse. since cb/specs has less usefulness on heavily defensive teams than scarf does, i would say scarf, while still an aggressive item, is less so than the other two.

will come back with more thoughts later

EDIT: alright, i definitely agree with 2sly4u on flash fire. immunity abilities should be neutral; they effectively change a pokemon's typing and that can affect a mon in a crapton of ways.
i also feel like if a mon cannot choose its ability, something should be done about the modifier for it (ie sure the ability has an effect on your modifier, but since you couldn't choose whether or not to use that ability, should the effect be reduced?), but i can't think of a good way to handle that. the example i was thinking of was rough skin sharpedo but obviously sharpedo should be running speed boost. if i get a better grip of how to express this idea i will bring it up again

EDIT @ below: weather is waaaaay too versatile to be categorized as stally or offensive. the fact that sand and hail deal passive damage is definitely not a defining factor on its own
 

Antar

#87
What about Weather inducing abilities? Are they all neutral or do they have different ranks (sun at -0.5, sand at +0.5)?
They're all neutral.

@alkinesthetase The more I think about Scarf, the more I agree that it doesn't deserve -0.5, in which case it'll just be neutral.
 
#88
If this is implemented, it should be implemented for Sap Sipper, Lightningrod, Storm Drain, Water/Volt Absorb, etc.
Also, Rattled should be added to the list of offensive abilities.
The difference between Sap Sipper, Flash Fire, and Storm Drain (And I forgot Lightningrod, so that too) is that those either raise the power of a move type or raise stats. Other immunity abilities don't.
 

Antar

#89
The difference between Sap Sipper, Flash Fire, and Storm Drain (And I forgot Lightningrod, so that too) is that those either raise the power of a move type or raise stats. Other immunity abilities don't.
Yeah, I'm gonna remove Flash Fire. Probably Flame Body, too, if I'm not doing anything with Static or Effect Spore.
 
#90
Well actually... idk. Dumb question there, sorry.
How are accuracy-boosting moves represented? (I assume -1 like the other offensive moves.) What about defensive boosts (Cosmic Power)?
I'm DYING TO KNOW!!! Ok not really, but I am interested in the second question's answer especially.

Thanks
 

alkinesthetase

#91
on their own i think acc boosts are meaningless but hone claws should behave like all the other attack boosting moves and it's like the only +acc booster that matters (EDIT: lol coil. would have remembered it if not for its crappy ass distribution >_>)

as for defense boosters that's obviously a +1 on most mons (eg cosmic power sigilyph) but one must beware of baton pass, in which it's not really being used the same way. i wouldn't call baton pass a stall team but you're probably gonna find some amnesia/acid armor/iron defense in there. and as for mixed boosts like cm and bulk up, i'm leaning towards offensive, but not as much so as other boosting moves so tough to say
 
#92
on their own i think acc boosts are meaningless but hone claws should behave like all the other attack boosting moves and it's like the only +acc booster that matters

as for defense boosters that's obviously a +1 on most mons (eg cosmic power sigilyph) but one must beware of baton pass, in which it's not really being used the same way. i wouldn't call baton pass a stall team but you're probably gonna find some amnesia/acid armor/iron defense in there. and as for mixed boosts like cm and bulk up, i'm leaning towards offensive, but not as much so as other boosting moves so tough to say
Coil, lol
I've got to agree with you about everything else, though.
 
#93
Substitute subtracts 0.5 from the metric. Substitute is mostly used to allow its user to set up for a sweep, and the 25% health cost means that it doesn’t really work great with stall (which relies on a lot of switching anyways). There are Prankster-Sub-Recover strategies, but in that case, the net effect is in favor of stall (+0.5).
I find this part to be confusing. Does it check for abilities like Snow Cloak and Poison Heal, then determine the score for the Sub?

Other than that, solid work.

Edit: Do you think the move Trick is a good candidate to subtract, seeing how it is used to shut down walls and stuff like that? Also, how about Psycho Shift?
 

Electrolyte

and at once I knew I was not magnificent
is a Contributor Alumnus, a Smogon Media Contributor Alumnus, and a Battle Server Moderator Alumnus
#95
that was actually very interesting to read; I have no idea why. Still, I must commend you on this, it looks really thought out and is explained well.

I'm also a bit on-the-fence about Light Clay; though they might be part of offensive teams, the user of Dual Screens is rarely offensive. (only exceptions are: Latias, Azelf) In that situation, it really depends on whether you're rating each pokemon as itself, or doing something like averaging each team's score.

I think Baton Pass should be mentioned, preferably subtracting 1. Stall/Defense rarely uses Baton Pass; in fact, I don't recall ever battling even a semi-stall team with Baton Pass. It is definitely an offensive-only strategy. The reason why I think it should subtract one whole point is because it's extreme, pure offense. There's really no Stall to it.

This is just my opinion, but I think Will-O-Wisp should only add 0.5. Plenty of offensive teams use Will-O-Wisp as a way to cripple the opponent so they have an easier time boosting, or just to status the opponent so they're easier to break through. Of course, I agree with you on the fact that Stall definitely uses WoW more than offense, but I really don't think it's worth twice the opposite of Choice Items.

Those are my immediate reactions; nice job!
 

alkinesthetase

#96
I'm also a bit on-the-fence about Light Clay; though they might be part of offensive teams, the user of Dual Screens is rarely offensive. (only exceptions are: Latias, Azelf) In that situation, it really depends on whether you're rating each pokemon as itself, or doing something like averaging each team's score.
this is interesting, making a distinction between an individual mon's score and the score of the team as a whole. let's suppose the stalliness metric is used to gather stats about individual mons in addition to a whole team (if this metric is only used to measure entire teams then there's no point to the distinction between an individual mon and its team). then what do we think a dual screen user should look like? do we call it an offensive mon because it really only fits on an offensive team? or is it a defensive mon because ultimately screens are defensive moves, even if they're being used on an offensive team?

i personally lean towards the former in which case the current implementation makes sense, but if you preferred the latter, you could make screens neutral on the mon, and instead implement a team bonus sort of thing where certain moves on a mon affect the team score, but not the mon itself. i haven't thought too much about how to establish such a bonus and fit it into the averaging system. the easiest thing would be to, when determining the score of a mon, ignore screen bonus, but when determining the average score for its team, add the screen bonus to that mon prior to averaging, if i make myself clear...
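
For illustration, the "team bonus" idea above could be sketched roughly as follows. This is not Antar's actual code: the `Mon` class, the function names, the plain arithmetic mean, and the `SCREEN_TEAM_BONUS` value of -1.0 are all assumptions made up for the example (the thread does not spell out the real averaging scheme).

```python
from dataclasses import dataclass

# Hypothetical value: the thread only says screens/Light Clay should lean offensive.
SCREEN_TEAM_BONUS = -1.0


@dataclass
class Mon:
    name: str
    base_score: float           # stalliness from everything except screens
    sets_screens: bool = False  # True for a dual screen user


def mon_score(mon: Mon) -> float:
    """Score reported for the individual mon: the screen bonus is ignored."""
    return mon.base_score


def team_score(team: list[Mon]) -> float:
    """Team score: add the screen bonus to the screener before averaging.

    A plain mean is used here purely as a stand-in for the real averaging.
    """
    adjusted = [
        m.base_score + (SCREEN_TEAM_BONUS if m.sets_screens else 0.0)
        for m in team
    ]
    return sum(adjusted) / len(adjusted)


team = [Mon("Latias", 0.0, sets_screens=True), Mon("Blissey", 2.0)]
print(mon_score(team[0]))  # the screener itself stays neutral
print(team_score(team))    # the team average is pulled toward offense
```

The point of splitting the two functions is that per-mon stats and per-team stats can disagree about the same screener without either number being "wrong".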
 

Antar

#97
I'm DYING TO KNOW!!! Ok not really, but I am interested in the second question's answer especially.
on their own i think acc boosts are meaningless but hone claws should behave like all the other attack boosting moves and it's like the only +acc booster that matters (EDIT: lol coil. would have remembered it if not for its crappy ass distribution >_>)

as for defense boosters that's obviously a +1 on most mons (eg cosmic power sigilyph) but one must beware of baton pass, in which it's not really being used the same way. i wouldn't call baton pass a stall team but you're probably gonna find some amnesia/acid armor/iron defense in there. and as for mixed boosts like cm and bulk up, i'm leaning towards offensive, but not as much so as other boosting moves so tough to say
Coil, lol
I've got to agree with you about everything else, though.
Moves that boost accuracy alone are neutral. Moves that boost defenses alone are neutral (they *don't* actually help stall, because Stall usually involves a lot of switching)

I find this part to be confusing. Does it check for abilities like Snow Cloak and Poison Heal, then determine the score for the Sub?
The idea is that 25% HP loss from sub is too big a price to pay for it to really be +stall. On the other hand, sub is often important for setting up. And note that sub-stall strategies almost always make use of a recovery move, in which case the net effect will be +0.5
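
A minimal sketch of how the Sub and recovery modifiers might combine, for illustration only. Substitute's -0.5 and the +0.5 net are the numbers given in the thread; the +1.0 for a recovery move is merely inferred from those two, and the function name, the list of recovery moves, and the choice to count recovery once per set are assumptions.

```python
SUBSTITUTE_MOD = -0.5  # stated in the thread
RECOVERY_MOD = 1.0     # inferred: -0.5 + 1.0 gives the stated net of +0.5

# Illustrative subset of recovery moves, not an official list.
RECOVERY_MOVES = {"Recover", "Roost", "Soft-Boiled", "Slack Off"}


def sub_and_recovery_modifier(moves):
    """Return only the Substitute/recovery part of a moveset's modifier.

    Other moves (setup moves, status moves, etc.) are deliberately ignored
    here. Recovery is counted once per set, which is an assumption.
    """
    total = 0.0
    if "Substitute" in moves:
        total += SUBSTITUTE_MOD
    if any(move in RECOVERY_MOVES for move in moves):
        total += RECOVERY_MOD
    return total


print(sub_and_recovery_modifier(["Substitute", "Recover", "Toxic", "Protect"]))
print(sub_and_recovery_modifier(["Substitute", "Focus Punch"]))
```

With this layout no ability check is needed: a sub-stall set ends up positive because of its recovery move, not because of Snow Cloak or Poison Heal.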

Edit: Do you think the move Trick is a good candidate to subtract, seeing how it is used to shut down walls and stuff like that? Also, how about Psycho Shift?
Since Trick is most commonly utilized with Choice items, I'd imagine that they'd have the same rating.
Yeah, Trick probably deserves -0.5 and Psycho Shift a +0.5. I don't want to assign scores for Flame / Toxic Orb since they can presumably be used for both stall and offense, depending on the ability (and I don't want to get into ability interactions)

I think Baton Pass should be mentioned, preferably subtracting 1. Stall/Defense rarely uses Baton Pass; in fact, I don't recall ever battling even a semi-stall team with Baton Pass. It is definitely an offensive-only strategy. The reason why I think it should subtract one whole point is because it's extreme, pure offense. There's really no Stall to it.
Baton Pass itself doesn't get a mod. Any mods come from the setup moves.

This is just my opinion, but I think Will-O-Wisp should only add 0.5. Plenty of offensive teams use Will-O-Wisp as a way to cripple the opponent so they have an easier time boosting, or just to status the opponent so they're easier to break through.
This is based in math: halving your opponent's attack means you can survive twice as many hits.
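
As a quick worked example of that arithmetic (a simplification with made-up numbers: fixed damage per hit, no damage rolls, no recovery, no residual damage, physical attacks only), halving the damage of each hit doubles the number of hits needed for a KO:

```python
import math


def hits_to_ko(hp, damage_per_hit):
    """Number of hits needed to bring hp to zero at a fixed damage per hit."""
    return math.ceil(hp / damage_per_hit)


hp = 300        # hypothetical HP of the wall
damage = 100    # hypothetical damage per physical hit before the burn

print(hits_to_ko(hp, damage))      # unburned attacker
print(hits_to_ko(hp, damage / 2))  # burned attacker: damage halved
```

Because of the rounding up, the factor is only exactly two when the unburned damage divides the HP evenly; otherwise it comes out close to two.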

this is interesting, making a distinction between an individual mon's score and the score of the team as a whole. let's suppose the stalliness metric is used to gather stats about individual mons in addition to a whole team (if this metric is only used to measure entire teams then there's no point to the distinction between an individual mon and its team). then what do we think a dual screen user should look like? do we call it an offensive mon because it really only fits on an offensive team? or is it a defensive mon because ultimately screens are defensive moves, even if they're being used on an offensive team?

i personally lean towards the former in which case the current implementation makes sense, but if you preferred the latter, you could make screens neutral on the mon, and instead implement a team bonus sort of thing where certain moves on a mon affect the team score, but not the mon itself. i haven't thought too much about how to establish such a bonus and fit it into the averaging system. the easiest thing would be to, when determining the score of a mon, ignore screen bonus, but when determining the average score for its team, add the screen bonus to that mon prior to averaging, if i make myself clear...
This is an interesting idea, and I may consider it. For now, though, I'm saying that a Dual Screener is inherently offensive, in-and-of-itself.
 

Antar

So how did my stall metric end up doing?

http://pokemetrics.wordpress.com/2012/09/01/testing-the-metric/

Pretty well... until you realize that you get basically the same results using bias, with a LOT less work.

The usage stats that are going up tomorrow (it's still the 31st in EST) will contain metagame analyses, including graphs of stalliness.

This example is for Little Cup, using data collected for Aug. 1-30
Code:
 Stalliness (mean: -0.625)
     |#
 -2.0|#####
     |#########
 -1.5|##################
     |#########################
 -1.0|#############################
     |#############################
 -0.5|##############################
     |#########################
  0.0|########################
     |###############
 +0.5|#########
     |#####
 +1.0|##
     |#
 +1.5|#
 more negative = more offensive, more positive = more stall
 one # = 81 teams ( 0.44%)
Based on these graphs, we'll soon be able to discuss defining things like "heavy offense" and "semi-stall." I look forward to that discussion.
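
For anyone who wants to reproduce a text graph like the one above from their own numbers, here is a rough sketch. It is not the script that generated the usage stats: the function name, the bin layout (0.25-wide rows starting at -2.25, every other row labelled), and the half-up rounding of team counts to `#` characters are guesses based on how the output looks. Going by the legend, one `#` being 81 teams at 0.44% would put the sample on the order of 18,000 teams.

```python
def render_histogram(bin_counts, start=-2.25, step=0.25, teams_per_hash=81):
    """Render team counts per stalliness bin as rows of '#' characters.

    bin_counts[i] is the number of teams in the bin centred at start + i*step.
    Every other row gets a label, mimicking the layout of the graph above.
    """
    lines = []
    for i, count in enumerate(bin_counts):
        center = start + i * step
        label = f"{center:+.1f}" if i % 2 == 1 else ""
        hashes = "#" * int(count / teams_per_hash + 0.5)  # round half up
        lines.append(f"{label:>5}|{hashes}")
    return lines


# Made-up counts, just to show the output format.
for line in render_histogram([81, 405, 729]):
    print(line)
```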