Old Sep 18th, 2012, 2:45:15 PM   #126
SpecsX
 
Join Date: Aug 2012
Posts: 372
Starfing with Harvest

Quote:
Originally Posted by eric the espeon
Read through all the blog posts and this thread. Interesting project, and cool scatter plots. Mostly looks very good, though there were a few things which seem like they may be questionable. If you're not keen on revisiting things that's fair enough, but my thoughts:

Why exactly do you feel LO should have exactly the same effect as Choice items? From my experience Choice users tend to be more wallbreakers than full sweepers, and unlike choice items Life Orb has a significant direct harm to the holder's defensive ability. My gut feeling is to give LO a higher rating, though I'm not entirely sure about that.

Second and much more major point, you seem to be discarding the lower offensive and defensive stats entirely.

This will mean your formula cannot take into account the advantages of being a mixed sweeper, or, more importantly, the fact that some Pokemon may have one decent defensive stat but be extremely frail to the other kind of attacks (Cloyster, Aggron, Blissey, and Mantine are excellent examples, but even more mildly unbalanced defenses will cause a Pokémon's stallishness to be overestimated to a lesser extent). I can see why you'd want to make that simplification, since dealing with both stats can get kind of messy, but this seems likely to be the biggest issue with your formula correctly assigning stallishness from stats. For attacks, perhaps raising both to a power, adding them, then taking that power's root of the result would be effective? A larger power would mean a smaller boost for mixed attackers, and vice versa. Ideally this would only be applied if the set used both physical and special moves. A similar method (perhaps with a different power) could be used for defenses.

Doing this may complicate the effects of certain items. In particular, Eviolite and the Choice items could no longer reasonably be said to grant exactly the same boosts. Applying the item boosts in the initial calculation would solve this. And doing the same with Life Orb changes the previous point: applying the boost to both stats, then having a smaller modifier simply from HP loss which is near equal or equal, seems sane.

Generally a good idea, but I'd suggest some change to how healing berries and berry juice are handled. In LC holding an item like that gives a massive boost to endurance, even though Eviolite seems much more popular in 5th gen and Berry Juice is banned from both. Making these items have the same effect as Salac or a Gem seems backwards. I'd suggest making one time use items which heal health either have +0.5 or at least be neutral (also helps with VGC/doubles/triples, where Sitrus is somewhat viable, and clearly more defensive than other one-time use items). Status healing berries are more debatable. They're used with Rest for one time healing, but of course that's still just a one time thing, not full stall's style, but also not hyper offense style.

Halving the change to the metric because of a fairly small difference in health gained, when it can be activated on the switch rather than needing a turn to just heal.. hm, maybe it's not quite as stally as others, but 0.5 does seem slightly low.

Also missing items which seem possibly worth considering:
Expert Belt
20% type boosting items and plates
Wise Glasses and Muscle Band
Most species specific boosting items (Soul Dew, DeepSeaTooth, DeepSeaScale, Light Ball, Thick Club, Adamant Orb, Lustrous Orb, Griseous Orb, and maybe Ditto's two, Lucky Punch, and Stick?) when held by the correct species
Shell Bell
And to generalise this to all generations, maybe Berserk Gene?
First of all, VGC and LC aren't calculated in his blog posts. They are so radically different from other metas that they would need their own formulas.
Eh... I think Life Orb is less offensive than Choice Band/Specs, and more offensive than Scarf. This is going on the power output alone, but it's certainly debatable.
Expert Belt and Species-Specific Boosting Items (SSBI): these really should be in there. EBelt is a very viable item that shouldn't be ignored (especially with Genesect just being released), and the SSBI make HUGE differences to a Pokemon's playstyle. The best thing to do here is to weight them at the same level as the moves they imitate (i.e. Soul Dew = Calm Mind, Thick Club = Swords Dance). Certainly the easiest thing to do.
Old Sep 25th, 2012, 11:32:49 AM   #127
Antar*
That's Dr. Antar to you
Battle Server Administrator, Super Moderator, Programmer, Community Contributor
 
Super Moderator
Join Date: Feb 2010
Posts: 2,051
DC Metro Area

@SpecsX--your comments will be addressed below as well.


Okiedoke. First the missing items.

Quote:
Originally Posted by Fat eric the espeon View Post
Most species specific boosting items (Soul Dew, DeepSeaTooth, DeepSeaScale, Light Ball, Thick Club, Adamant Orb, Lustrous Orb, Griseous Orb, and maybe Ditto's two, Lucky Punch, and Stick?) when held by the correct species
Most of these should definitely be added, with species enforcement.
  • Soul Dew should subtract 0.5 from the metric--as much as specs
  • Light Ball, Thick Club and DeepSeaTooth should subtract 1.0 from the metric
  • DeepSeaScale should add 1.0 to the metric
  • Metal and Quick Powder do not work after Ditto has transformed, so they should be ignored.
  • The crit items (Stick, Lucky Punch, but also Scope Lens) are too inconsistent to be factored in.
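
As a rough illustration of the species-enforced rules in the list above, here is a minimal Python sketch. The names `SPECIES_ITEM_MODS`, `IGNORED_ITEMS` and `species_item_mod` are hypothetical, and the table layout is an assumption made for illustration; it is not taken from the actual scripts.

```python
# Hypothetical sketch of species-enforced item modifiers for the stalliness metric.
# The modifier values come from the list above; the structure and names are
# assumptions made for illustration only.
SPECIES_ITEM_MODS = {
    "Soul Dew": ({"Latias", "Latios"}, -0.5),
    "Light Ball": ({"Pikachu"}, -1.0),
    "Thick Club": ({"Cubone", "Marowak"}, -1.0),
    "DeepSeaTooth": ({"Clamperl"}, -1.0),
    "DeepSeaScale": ({"Clamperl"}, +1.0),
}

# Items that are deliberately given no weight (Ditto's powders and the crit items).
IGNORED_ITEMS = {"Metal Powder", "Quick Powder", "Stick", "Lucky Punch", "Scope Lens"}


def species_item_mod(species, item):
    """Return the metric modifier for an item, or 0.0 if it does not apply."""
    if item in IGNORED_ITEMS:
        return 0.0
    entry = SPECIES_ITEM_MODS.get(item)
    if entry is None:
        return 0.0
    holders, mod = entry
    # Species enforcement: the item only counts when held by the right Pokemon.
    return mod if species in holders else 0.0


print(species_item_mod("Marowak", "Thick Club"))  # applies
print(species_item_mod("Raichu", "Light Ball"))   # wrong species, no effect
```

Keeping the holders next to the modifier makes the "correct species" check a single set lookup, so adding another species-specific item would only mean adding one row.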

Quote:
Also missing items which seem possibly worth considering:
Expert Belt
20% type boosting items and plates
Wise Glasses and Muscle Band
Turns out that log_2(1.2) = 0.263, or roughly 0.25. I'll consider adding them in.
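
To spell out the arithmetic, here is a small Python sketch of how a damage multiplier could map onto a modifier, assuming (as the numbers in this thread suggest) that one doubling of power is worth one unit and that offensive boosts are subtracted. The function names and the rounding to quarter steps are assumptions for illustration.

```python
import math


def offensive_boost_mod(multiplier):
    """Modifier for an offensive boost: one unit subtracted per doubling of power."""
    return -math.log2(multiplier)


def round_to_quarter(value):
    """Round a modifier to the nearest 0.25, to get a tidy number."""
    return round(value * 4) / 4


for label, mult in [("20% type-boosting item", 1.2), ("Swords Dance", 2.0)]:
    raw = offensive_boost_mod(mult)
    print(f"{label}: x{mult} -> {raw:.3f} (tidy: {round_to_quarter(raw)})")
```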

Quote:
Shell Bell
Shell Bell really has only one viable use in Singles: Sturdy-FEAR. But I already account for FEAR, and I don't want to give too much weighting for what is really a gimmicky strategy.

Quote:
And to generalise this to all generations, maybe Berserk Gene?
I had to look this one up. Raises attack one stage, then confuses. If/when we implement Gen II on PS, this will fall into the category of "one-time use" items, which puts it on the same footing as Liechi Berry, which I think makes sense.

Quote:
Also, why exactly do you feel LO should have exactly the same effect as Choice items? From my experience Choice users tend to be more wallbreakers than full sweepers, and unlike choice items Life Orb has a significant direct harm to the holder's defensive ability. My gut feeling is to give LO a higher rating, though I'm not entirely sure about that.
You'll note that I named my metric "stalliness" rather than "offensiveness." That's because it really is more about stall vs. "anti-stall" than stall vs. offense. There are two ways to combat stall. The first is to "wallbreak," the second is to sweep. You can't (or rather, it's hard to) sweep with a Band or Specs, but as you say, Choice items are potent for destroying stall. Weighting one over the other is a tricky business.

Quote:
Leftovers
Several times as I played with the metric I was confronted with the problem of "intent," which ideally would not be an issue at all. I had to ignore modifications from many abilities simply because they tend to have no practical use on most sets, and if I accounted for them, the metric would get thrown off. Not accounting for Leftovers at all was thus more of a "fitting" decision than anything else. However, when I add in Expert Belt et al., I'll consider throwing in a +0.25 for Lefties as well.

Quote:
discarding the lower offensive and defensive stats entirely.
I stand by this decision. Basically, my reasoning assumes that if a matchup is unfavorable, one can always make a switch. You aren't going to leave Blissey in against Machamp. You aren't going to try to take out Steelix with Druddigon. Are mixed sweepers more potent than single-side sweepers? I've wrecked enough teams with mix Deo-A (Espeed / Superpower / Ice Beam / Psycho Boost) to know that walling such sets is much more difficult. But it's been my experience that truly effective mixed sets are few and far between and don't really need the extra weight to be classified as the deadly beasts they are.

As for "double walls," a well-constructed stall team is dependent not on one Pokemon being able to slow all attacks but on several Pokemon having the synergy to completely block each other's weaknesses. In other words, it's been my experience that a wall really is as good as its strongest defensive side. On my super-stally RU team, Audino NEVER takes a Close Combat, and Steelix NEVER takes a Flamethrower.

Quote:
In LC holding an item like that gives a massive boost to endurance, even though Eviolite seems much more popular in 5th gen and Berry Juice is banned from both. Making these items have the same effect as Salac or a Gem seems backwards.
SpecsX correctly pointed out that I'm not touching doubles/triples with a ten-foot pole (Item Clause alone completely ruins many of my assumptions). But this metric *should* still be somewhat applicable to LC.

The bottom line is that one-use items are the antithesis of stall. In non-LC, the only reason to use one (namely Sitrus Berry) is to give the user a little more time to set up/execute a sweep (Belly Drum Linoone, I'm looking at you). Oran Berry has very much fallen out of favor in the current LC metagame, due to Eviolite netting more than 10 "effective hit points" on any Pokemon whose HP is greater than 20 (and most Pokemon with less than 20 hit points are too frail to benefit from Oran anyway). The only time you really see Oran is with Sturdy Pokemon, who often use it the same way that BD Linoone uses Sitrus.

If Berry Juice is ever unbanned or Gen IV Little Cup ever goes live on PS, I will reconsider this decision (will probably play it safe and make them neutral).

Quote:
This is assuming Protect is used purely as a stall tactic, rather than to activate a status orb, delay for more Speed Boosts, or for scouting dangerous moves as a frail sweeper.
This is true, but Guts and Speed Boost subtract 0.5 from the metric to counteract this effect somewhat. What I *need* to do is add a rule that says "if Guts / Flare Boost / Toxic Boost / Quick Feet AND Toxic/Flame Orb, subtract 1," which would cancel out Protect.
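
A minimal Python sketch of that proposed rule follows. It only models the Protect: +1 and the proposed "status ability AND status orb: subtract 1" pairing; the existing -0.5 for Guts and Speed Boost is left out, since how the two would stack is not settled here. The function and set names are hypothetical.

```python
# Hypothetical sketch of the proposed rule; names are illustrative only.
STATUS_ABILITIES = {"Guts", "Flare Boost", "Toxic Boost", "Quick Feet"}
STATUS_ORBS = {"Toxic Orb", "Flame Orb"}


def protect_orb_adjustment(ability, item, moves):
    """Return the combined modifier from Protect and the status-orb rule."""
    mod = 0.0
    if "Protect" in moves:
        mod += 1.0  # Protect counted as a stall tactic
    if ability in STATUS_ABILITIES and item in STATUS_ORBS:
        mod -= 1.0  # proposed rule: the orb combo cancels out Protect
    return mod


# Protect used to activate a Flame Orb nets out to zero under this rule.
print(protect_orb_adjustment("Guts", "Flame Orb", ["Protect", "Drain Punch"]))
# Protect on a set without the combo keeps its full +1.
print(protect_orb_adjustment("Natural Cure", "Leftovers", ["Protect", "Wish"]))
```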

Speed Boost should also probably have a more significant weight (Moody as well).

Beyond that, I've found my Protect: +1 (which is based on solid mathematical footing, after all) to work pretty well.

Quote:
Re: Regenerator... Halving the change to the metric because of a fairly small difference in health gained, when it can be activated on the switch rather than needing a turn to just heal.. hm, maybe it's not quite as stally as others, but 0.5 does seem slightly low.
It's simply due to the fact that less health is gained, and, in the presence of entry hazards, this will be even more true. Again, it seems to work out okay.



Okay, so making the modifications I talk about above (with and without Leftovers: +0.25), and running the new revised metric against the RMT archive, here's what I get:


As I expected, Blue and Red are pretty much identical (most of these new modifications weren't applicable to the teams in the archive--if they had been, I would have probably implemented them earlier). Meanwhile, adding the Leftovers modification seems to have a negligible effect on HO, but as the metagames get stallier, the difference becomes more and more pronounced.

It's hard to decide which version (red or magenta) is "better." Certainly, I could fix the SS-FS cutoff simply by bumping it from 9 turns/KO to 10. (Getting that stalliest magenta team in line with the others would necessitate bumping it to the very not-round number of 10.4. I might prefer to move it up to 11, itself an ugly number, and lose the least stally full stall team.)

I'll have to think on this. I would definitely welcome input.
__________________
Codes and Hacks I Use
PBR FC: 4898-8739-8815 (See here)
Black FC: 4040 5386 0128 / White 2 FC: 4771 3664 7215
My Narrated PBR & Gen V Battles
My Trade Thread
Convert any sim team to pkms
Pokemetrics: A Blog
Old Sep 25th, 2012, 1:41:31 PM   #128
Amarillo
yellow!
Tiering Contributor, Contributor to Smogon
 
Join Date: Nov 2010
Posts: 713

First, one thing I thought of was making a cut-off for the ratings of players who actually count in this calculation. As much as I don't like this hate on 'noobs', in the spirit of pure statistics they ruin the curve. They let their 'stally' teams die off quickly. They don't get enough kills with their offensive teams. But unlike the '1337' stats, you want to lower the cutoff significantly: you want two people who know what they're doing battling each other. If the cutoff is too high, then your sample size will be way too small.

Secondly, how about making broader categories than HO, Offense, BO, Balance, Semi-Stall, Full Stall? I'm sure you've heard people knock terms such as 'Semi-Stall.' Offense, Balance, Stall might be a relatively more 'objective' comparison. Going along the lines of the archived teams, note there is an 'All-Out Offense' team, which makes you wonder if all these can be mutually exclusive or can even be differentiated to begin with. Yes, you can see that teams classified as 'Heavy Offense' or 'Semi Stall' are really rare. You also see the stats for 'Offense' and 'Bulky Offense' are nearly identical. Something to think about; it really does not mean any decrease in the quality of this analysis.

If this idea doesn't interest you, though, and you're concerned with making the minute distinctions, go ahead. It's just that no one will 100% agree with whatever definitions you stick with.

Next up is the concept of typing / resistances as it affects the 'stalliness.' However many stally moves you give your Ice-type, chances are it will not survive very long, etc. Not exactly sure how to proceed with this, though, probably because some attack types are more common in one tier than in another. Dragon resistances are not quite a thing in lower tiers, for an obvious example.

Finally, some small nitpicks: I'd say Taunt counts as a good anti-stall measure and probably signifies that the team is rather offensive, especially considering its short duration. Taunt + WOW + Recovery is indeed a thing, but that leads to an overall increased stalliness. Probably a -0.5? But that is up to you, I guess! Also, I'm not sure if Sub should be negative, given the very nature of the move, it delays the kill.

I'm interested in where this is headed! (I'm a big math / GRAPHS person irl)
__________________
~~
Old Sep 25th, 2012, 2:09:55 PM   #129
Antar*

Thanks for the feedback!

Quote:
Originally Posted by Amarillo
cut-off for the ratings
First off, this metric is completely divorced from rating. If you're concerned about how fuzzy those scatterplots look, I completely agree, and I do strongly suspect the correlation will be stronger when I limit myself to only players with a certain rating.

Quote:
broader categories
I make these distinctions purely for testing purposes, to see how well predictions based on "stalliness" agree with RMT Archive classification. If you look at last month's metagame analyses, you'll see that instead of reporting the percentage of teams that were semi stall vs. full stall, I instead plot a histogram of stalliness values.

Next month, I plan to do BOTH.

Quote:
Next up is the concept of typing / resistances as it affects the 'stalliness.' However stally moves you give your ice-type, chances are it will not survive very long, etc. Not exactly sure how to proceed with this, though. Probably as some attack types are more common in one tier as opposed to another. Dragon resistances are not quite a thing in lower tiers, for an obvious example.
I have considered this, yes. I may add it to future revisions, but for now I've ignored it.

Quote:
I'd say Taunt counts as a good anti-stall measure and probably signifies that the team is rather offensive, especially considering its short duration.
Taunt also works really well at preventing sweepers from setting up and entry hazards from being put on the field--two roles that are quite important for stall.

Quote:
Also, I'm not sure if Sub should be negative, given the very nature of the move, it delays the kill.
Sub-stall strategies (Sub+Roost, Sub+Leech Seed) work out to be net positives, but in general, subs are set-up moves. Stall relies on frequent switching. If you try to do that while setting up subs, you're going to wear down your health AWFULLY fast.
Old Sep 25th, 2012, 9:00:27 PM   #130
Amarillo

what I meant by mentioning those two moves


/end rant

Now that's out of the way, I am curious about something. You mentioned Stall relying on frequent switching and all: how does your stalliness index correlate to how often a team switches in on the opponent? You mention how you can figure out turn / KO ratio. Can you do something similar like turn / Switch ratio?

Idk if you have time for all this, but you could go further. You could analyse the ratio of turns for the type of move you make. 'Offensive Moves' (Damage Dealing moves / Boosting setup moves for example), 'Defensive / Utility Moves' (recovery, hazard stack, toxic, burn), 'Switching' (when you switch something in and it doesn't die), 'Sacrifice' (switch something in and gets KOed, or you leave in something to die), and 'Others' (Double Switches, etc, I'm sure I covered most of my bases)

By definition, if you are sponging around by switching, you are making a defensive play. If you are sacking and attacking, you are making an offensive play. Instead of looking at the team you look at how they play. I hope this would bring up a very to-the-point correlation.

Unless, of course, if this is too hard to implement.
Old Sep 25th, 2012, 9:52:56 PM   #131
Scarfwynaut
 
Join Date: Mar 2011
Posts: 2,096
PA

Yeah, Taunt can go either way. On many stall teams I have had, I used Taunt on Gliscor so it could beat Bulk Up Conkeldurr. If you want to count it as offense, go ahead. I mean, sleep moves are considered offensive, and plenty of stall teams use Spore Amoonguss, so I doubt it would hurt the stalliness.
__________________
I am not Scarf Wynaut on Pokemon Showdown. I am PrincesoBubblegum
Old Sep 26th, 2012, 6:11:01 AM   #132
Antar*

Quote:
Originally Posted by Amarillo
Now that's out of the way, I am curious about something. You mentioned Stall relying on frequent switching and all: how does your stalliness index correlate to how often a team switches in on the opponent? You mention how you can figure out turn / KO ratio. Can you do something similar like turn / Switch ratio?
Yeah, I can. Here's how my usage data scripts work.

First, I have a "Log Reader" that reads in the raw PS battle logs (which are in a format called JSON and are MUCH MORE machine-readable than, say, PO HTML logs). This "Log Reader" takes all the events in the PS battle log and distills them down to a summary of the events of the battle, such as the following:

Code:
[REDACTED] (bias:-280, stalliness:0.39415090447, tags:hail,balance)
Staryu (3,3)
Houndour (2,7)
Snover (1,4)
Misdreavus (0,3)
Mienfoo (1,2)
Bronzor (0,0)
***
[REDACTED] (bias:616, stalliness:-1.57102191703, tags:weatherless,hyperoffense)
Staryu (0,3)
Cyndaquil (1,3)
Mienfoo (0,4)
Stunky (1,5)
Gastly (0,0)
Cacnea (1,4)
@@@
Mienfoo vs. Snover: Snover was switched out
Mienfoo vs. Misdreavus: Mienfoo was switched out
Misdreavus vs. Stunky: Misdreavus was KOed
Houndour vs. Stunky: Stunky was switched out
Houndour vs. Staryu: Houndour was switched out
Snover vs. Staryu: Staryu was switched out
Gastly vs. Snover: Gastly was KOed
Mienfoo vs. Snover: Snover was switched out
Mienfoo vs. Staryu: Mienfoo was KOed
Cacnea vs. Staryu: Staryu was switched out
Cacnea vs. Snover: Snover was KOed
Cacnea vs. Houndour: Cacnea was switched out
Houndour vs. Staryu: Staryu was KOed
Houndour vs. Stunky: Stunky was KOed
Cyndaquil vs. Houndour: Houndour was KOed
Cyndaquil vs. Mienfoo: Cyndaquil was switched out
Cacnea vs. Mienfoo: Cacnea was u-turn KOed
Cyndaquil vs. Staryu: Cyndaquil was KOed
Note that I don't record the moves, only the results of the individual matchups. The idea was that, at some point when I could figure out how to present the data, I'd publish a "matchup matrix" which told you what happened statistically when Pokemon A went up against Pokemon B (from that, you can get statistics for who the best counters are for each Pokemon, that sort of thing).

But since each switch is recorded, and the number of turns in the battle is easily distilled from the header (the two numbers in parentheses by each Pokemon's name are (KOs,turns in battle)), this would be trivial to look at. Some time after September 1, I'll give it a look.
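
As a rough sketch of how that turns/switch ratio could be pulled out of a summary in the format shown above, here is some Python. It assumes the battle length equals the sum of the "turns in battle" numbers for one team (one active Pokemon per side per turn in singles), and it counts switches for the battle as a whole, since both teams can carry the same species and the matchup lines alone do not say whose Pokemon switched. The function name `turns_per_switch` is hypothetical and this is not the actual log reader.

```python
import re

# Matches lines such as "Houndour (2,7)", i.e. name (KOs,turns in battle).
POKEMON_LINE = re.compile(r"^(?P<name>.+?) \((?P<kos>\d+),(?P<turns>\d+)\)$")


def turns_per_switch(summary):
    """Return (turns, switches, turns per switch) for one battle summary."""
    header, _, matchups = summary.partition("@@@")
    first_team = header.split("***")[0]

    # Assumption: summing one team's per-Pokemon turn counts gives the battle length.
    turns = 0
    for line in first_team.splitlines():
        match = POKEMON_LINE.match(line.strip())
        if match:
            turns += int(match.group("turns"))

    # "was u-turn KOed" and "was KOed" lines are not counted as switches.
    switches = 0
    for line in matchups.splitlines():
        if line.strip().endswith("was switched out"):
            switches += 1

    ratio = turns / switches if switches else float("inf")
    return turns, switches, ratio
```

For the battle summarized above, this counting gives 19 turns and 9 switches, or roughly 2.1 turns per switch.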

Quote:
Idk if you have time for all this, but you could go further. You could analyse the ratio of turns for the type of move you make. 'Offensive Moves' (Damage Dealing moves / Boosting setup moves for example), 'Defensive / Utility Moves' (recovery, hazard stack, toxic, burn), 'Switching' (when you switch something in and it doesn't die), 'Sacrifice' (switch something in and gets KOed, or you leave in something to die), and 'Others' (Double Switches, etc, I'm sure I covered most of my bases)
Much of this would involve modifications to my log reader, but that's certainly doable.
Old Sep 26th, 2012, 10:32:54 AM   #133
eric the espeon
maybe I just misunderstood
Pokémon Researcher, Contributor to Smogon, Forum Moderator Alumnus, Tiering Contributor Alumnus
 
Join Date: Aug 2007
Posts: 3,695

If I don't reply to something, either you've made the change I suggested, or I accept your reasoning for not making the change and don't have a counter argument. I've reordered a bit.

Quote:
I stand by this decision. Basically, my reasoning assumes that if a matchup is unfavorable, one can always make a switch. You aren't going to leave Blissey in against Machamp. You aren't going to try to take out Steelix with Druddigon. Are mixed sweepers more potent than single-side sweepers? I've wrecked enough teams with mix Deo-A (Espeed / Superpower / Ice Beam / Psycho Boost) to know that walling such sets is much more difficult. But it's been my experience that truly effective mixed sets are few and far between and don't really need the extra weight to be classified as the deadly beasts they are.

As for "double walls," a well-constructed stall team is dependent not on one Pokemon being able to slow all attacks but on several Pokemon having the synergy to completely block each other's weaknesses. In other words, it's been my experience that a wall really is as good as its strongest defensive side. On my super-stally RU team, Audino NEVER takes a Close Combat, and Steelix NEVER takes a Flamethrower.
While you're unlikely to leave a one side only wall in against their weakness for long, a one side only wall seems massively less defensive than a wall which could take hits as well as that one side wall from both physical and special attackers. You can switch in an unfavorable matchup, but doing so shows that this set has been unable to stall out the foe, and had to bring in a team mate. Using a fairly small power for the multiplication you could easily enough give an appropriate boost. There being few or many deadly mixed sweepers seems irrelevant to the extra potency given by the ability to strike with physical and special attacks.

If Blissey had 130 base defense it would be rated as exactly the same stalliness as it currently is. And it would be staying in against pretty much any physical attacker. If Druddigon had 120 base special attack it would again have exactly the same rating as it currently does, and it really would be blasting right through Steelix with Flamethrower. These are of course thought experiments of extreme cases, but they help to clarify, and you're going to be having these same effects on a smaller scale for practically all sets (especially those which split EVs: for equal-attack Pokémon like Deo-A, if you're putting some EVs in both Atk and SpA you're going to be classed as more stally than if you just dump them into one attack). With the method I suggested previously, (Atk^x + SpA^x)^(1/x), and the same for defenses with perhaps a different x value, you could weigh the lesser defensive and offensive stat as strongly or weakly as appropriate with relative ease. Pokémon which are very capable of taking one kind of attack would still be given a fairly high stalliness rating so long as x is not very small, which corresponds to them taking only the kind of attacks you want.

Your reasoning holds for not weighing both greater and lower stats equally, but it does not show that the lower stat should have zero weight.
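
To make the suggestion concrete, here is a small Python sketch of the "raise both to a power, add, take the root" combination. The function name `combined_stat` is hypothetical, and base stats are used only as stand-ins for whatever stat values the metric actually works with.

```python
def combined_stat(a, b, x):
    """Combine two stats as (a**x + b**x) ** (1/x).

    A large x approaches max(a, b), so the lower stat barely matters;
    a small x (toward 1) approaches a plain sum, so it matters a lot.
    """
    if x <= 0:
        raise ValueError("x must be positive")
    return (a ** x + b ** x) ** (1.0 / x)


# Base Defense / Special Defense, used here only as stand-ins for real stats.
blissey = (10, 135)
bulky_blissey = (130, 135)  # the thought experiment: Blissey with 130 base Defense

for x in (2, 4, 8):
    print(x, round(combined_stat(*blissey, x), 1), round(combined_stat(*bulky_blissey, x), 1))
```

Because the result is never below the higher stat, a one-sided wall keeps its current rating, while a balanced wall gets a bonus of at most a factor of 2^(1/x).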

Quote:
You'll note that I named my metric "stalliness" rather than "offensiveness." That's because it really is more about stall vs. "anti-stall" than stall vs. offense.
Hm, I'm curious about the distinction between anti-stall and offense.

Quote:
Several times as I played with the metric I was confronted with the problem of "intent," which ideally would not be an issue at all. I had to ignore modifications from many abilities simply because they tend to have no practical use on most sets, and if I accounted for them, the metric would get thrown off. Not accounting for Leftovers at all was thus more of a "fitting" decision than anything else. However, when I add in Expert Belt et al., I'll consider throwing in a +0.25 for Lefties as well.
Perhaps ignoring abilities was not necessarily the best way to handle it? As you say, intent would ideally not be an issue. I'd like to think it was possible to make a metric which can work simply off the stalliness of the set, taking as much into account as it can. Maybe the reason the metric was getting so thrown off was that abilities were given too high weights?

Quote:
If Berry Juice is ever unbanned or Gen IV Little Cup ever goes live on PS, I will reconsider this decision (will probably play it safe and make them neutral).
hm, if you're likely to change it when gen 4 LC comes to PS.. is it not worth preparing the formula for tiers which are not yet live on PS, since it's being used not just by you but by UPC's team and set analyzer? Obviously lower priority, but still. Also, is Oran Berry not used on Sub sets in LC, like Drifloon? hm, actually, yea, even 5th gen LC has quite a few Pokémon which sometimes prefer HP to defenses for more Subs, Wynaut's countercoat, or various abilities.

Quote:
Graphs
hm, I like how adding leftovers differentiates between the different styles more (other than the most stally bulky offense, which seems to be an anomaly anyway, perhaps classified incorrectly? or perhaps an example of something that's being missed by the metric? which team is that?). The balance/semi-stall division seems to benefit most from it, though with only three semi-stall teams it could be a fluke.


Also, something which you seem not to have replied to from my previous post was the idea of applying the stat modifying effects before doing the damage to self calc. This may be mathematically equivalent to the current implementation in many cases, but (especially if you take both stats into account) it seems a neater way to handle things: it prevents the need for rounding on 20% boost items to get a tidy number; with the split, it makes Life Orb, Light Ball, etc. much simpler; it makes Wise Glasses and Muscle Band give appropriate boosts (currently their 1.1x for both is a ~20% overall boost, but Light Ball's 2x boost for both is a 100% boost, which is inconsistent); and it makes the metric easier to understand for those not familiar with logarithms. The direct changes to the score from other effects are useful, but applying boosts directly when possible seems sane.

And:
Quote:
Originally Posted by blog post
Light Ball, Thick Club and DeepSeaTooth subtract 1.0 from the metric when held by the correct Pokemon
DeepSeaTooth adds 1.0 to the metric when held by Clamperl
__________________
For people who like storing things: The Box
Reading and LC? LCF, LC Guide, LC Analyses
Good channels: #littlecup, #C&C, #1v1, others
And for SCMS editors: SCMS group
ete on IRC. Goodbye Smogon. Good luck, was fun while it lasted.
Old Sep 26th, 2012, 12:08:27 PM   #134
Antar*

Quote:
Originally Posted by eric the espeon
Re: Mixed sweepers / double-walls
Your reasoning holds for not weighing both greater and lower stats equally, but it does not show that the lower stat should have zero weight.
In physics terms, you've just acknowledged that adding the lower stats is a "higher order" correction. It's not something I'm interested in dealing with right now, but if you want to propose a specific change or test one yourself, my source code is readily available, and I'll provide you with a sample dataset if you'd like.

Quote:
Hm, I'm curious about the distinction between anti-stall and offense.
Simply put (see this post): a higher degree of stall should correlate with longer battles, and thus team strategies that lead to shorter battles would be "anti-stall." Set-up sweeping--a hallmark of hyper-offense--is one way to get a low turns/KO ratio, but it's not the only way to skin this particular cat.

Quote:
Perhaps ignoring abilities was not necessarily the best way to handle it? As you say, intent would ideally not be an issue. I'd like to think it was possible to make a metric which can work simply off the stalliness of the set, taking as much into account as it can. Maybe the reason the metric was getting so thrown off was that abilities were given too high weights?
Again, "higher order" terms. What definitely needs to be factored in is not so much intent but utility. Rivalry is hard to pull off unless you're on a default-gender simulator (old PO). Stat-dropping moves and abilities are too rare for Defiant to come into play very often. Same with stuff like Super Luck, Sniper and Anger Point.

Quote:
Also, is Oran Berry not used on Sub sets in LC, like Drifloon?
You actually just made my case for me. Here, Oran Berry is very much assisting in a sweep, doubling speed with Unburden and working alongside Sub, which, as I argued a few posts ago, is an inherently offensive move.

If this sounds confusing, it's because you're thinking of it in terms of "how easy is this Pokemon to kill" rather than "how easy is this Pokemon to kill vs. how easy is it for this Pokemon to kill?"

Quote:
Wynaut's countercoat
Wobbs and Wynaut are VERY offensive Pokemon. Shadow Tag guarantees either a dead Pokemon at the end of the matchup, or, at the very least, a free switch into a teammate who needs it to set up.

Quote:
hm, I like how adding leftovers differentiates between the different styles more
It's true that Leftovers adding greater differences between the various teams is an argument in favor of applying the moveset modification, but I don't think I'll be doing it this month (before Sept. 1), as I need to come up with a counterbalance to help widen the gap between stall, semi-stall and balance.

Quote:
other than the most stally bulky offense, which seems to be an anomaly anyway, perhaps classified incorrectly? or perhaps an example of something that's being missed by the metric? which team is that?
Quote:
Originally Posted by Antar
The following teams had discrepancies between archive classification and "stalliness" classification:

Bulky Offense
  • Negative 3 @ 1.26. Wish/Protect Jirachi + no-attack Skarm + Slowbro throws everything off.
Quote:
with only three semi-stall teams it could be a fluke.
I would LOVE to add some more semi-stall teams to my sample set.

Quote:
Also, something which you seem not to have replied to from my previous post was the idea of applying the stat modifying effects before doing the damage to self calc.
It's essentially equivalent (SD = x2: -1, Expert Belt = x1.2: ~-.25), and there are issues where some of these abilities/items/moves don't deserve the full weight because they won't be applied consistently, but there may be some merit to this idea. For instance, Life Orb vs. Choice Band: would the mod for LO be greater than log_2(3) if I factored in recoil?
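
To make the "essentially equivalent" point concrete: because stalliness is a negative base-2 log of a damage fraction, multiplying the attacking stat by some factor shifts the result by roughly -log2(factor). Below is a minimal Python sketch of that mapping. The function name `boost_to_mod` is made up for illustration and is not taken from TA.py, and the shift is only approximate because the damage formula also carries a small additive constant.

```python
import math

def boost_to_mod(multiplier):
    """Approximate additive stalliness shift for a multiplicative offensive boost.

    Hypothetical helper for illustration only; not part of TA.py.
    """
    return -math.log(multiplier, 2)

# Multipliers quoted in the post: Swords Dance doubles Attack, Expert Belt is x1.2.
for name, mult in [("Swords Dance", 2.0), ("Expert Belt", 1.2)]:
    print(name, round(boost_to_mod(mult), 2))
```

Swords Dance comes out at -1 and Expert Belt at about -0.26, in line with the ~-.25 quoted above.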


Again, ete, I urge you not just to argue with me but to try out some of your suggestions yourself--mod my code and come up with your own version of the metric. If you come up with some demonstrable improvements, I'd be delighted to use them.
__________________
Codes and Hacks I Use
PBR FC: 4898-8739-8815 (See here)
Black FC: 4040 5386 0128 / White 2 FC: 4771 3664 7215
My Narrated PBR & Gen V Battles
My Trade Thread
Convert any sim team to pkms
Pokemetrics: A Blog
Old Sep 26th, 2012, 7:08:35 PM   #135
eric the espeon
maybe I just misunderstood
is a Pokémon Researcher, is a Contributor to Smogon, is a Forum Moderator Alumnus, is a Tiering Contributor Alumnus
 
Join Date: Aug 2007
Posts: 3,695

Quote:
Originally Posted by Antar
In physics terms, you've just acknowledged that adding the lower stats is a "higher order" correction. It's not something I'm interested in dealing with right now..
Quote:
Again, "higher order" terms. What definitely needs to be factored in is not so much intent but utility. Rivalry is hard to pull off unless you're on a default-gender simulator (old PO). Stat-dropping moves and abilities are too rare for Defiant to come into play very often. Same with stuff like Super Luck, Sniper and Anger Point.
Alright. You are right, these are each going to be minor changes on their own, but I think the combination of all the little extra weights should add up to a better metric if they're all fairly weighted.

Quote:
Suggestions that I change the code.
While I've read a lot of code, I am not a programmer. I might be able to make some math tweaks: I think stalliness=-math.log(((2.0*poke['level']+10)/250*(stats[1]**2+stats[3]**2)**0.5/(stats[2]**2+stats[4]**2)**0.5*120+2)*0.925/stats[0],2) with the **2 and **0.5 exponents adjustable *could* work as an implementation of splitting the stats as I meant, though I don't actually know the Python math syntax. Beyond that, I have no experience of creating or running new programs. It's something I've been meaning to learn for many years, but have never gotten around to. I've got Python, have saved both TA.py and baseStats.json, and it does not give errors when I try to run it. But it also does nothing, probably because I don't know how to input a team. If you could give me the sample data and some idea of what to do, then I'll try to make a version with the changes I'd suggest.
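
For anyone who wants to try this, the one-liner above can be unpacked into a function with the exponent exposed as a parameter. This is only a sketch: the function name, the `p` parameter, and the assumed stat order [HP, Atk, Def, SpA, SpD, Spe] are illustrative assumptions, not something confirmed from TA.py.

```python
import math

def split_stat_stalliness(level, stats, p=2.0):
    """Stalliness with both offensive and both defensive stats combined.

    Assumes stats is ordered [HP, Atk, Def, SpA, SpD, Spe] (an assumption).
    p is the power each stat is raised to; the sum is then taken to the 1/p root.
    """
    hp, atk, dfn, spa, spd = stats[:5]
    offense = (atk ** p + spa ** p) ** (1.0 / p)
    defense = (dfn ** p + spd ** p) ** (1.0 / p)
    # Same damage expression as the one-liner in the post.
    damage = ((2.0 * level + 10) / 250 * offense / defense * 120 + 2) * 0.925
    return -math.log(damage / hp, 2)

# Made-up stat spread, purely to show the call.
print(split_stat_stalliness(100, [300, 200, 150, 100, 120, 90]))
```

With p=2 this matches the posted expression. A larger p lets the higher stat of each pair dominate, so the lower stat counts for less. In the limit the combination approaches simply taking the higher stat.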

Quote:
Simply put, this post: higher degree of stall should correlate with longer battles, and thus team strategies that lead to shorter battles would be "anti-stall." Set-up sweeping--a hallmark of hyper-offense--is one way to get a low turns/KO ratio, but it's not the only way to skin this particular cat.
Right, defining stall as how much a team extends the battle. But, how is having a low inclination to increase the number of turns in a battle different from being offensive (assuming you're aiming to win and not using six level 1 Shuckle or something, which would give a very short match and be more defensive than offensive)?

Quote:
You actually just made my case for me. Here, Oran Berry is very much assisting in a sweep, doubling speed with Unburden and working alongside Sub, which, as I argued a few posts ago, is an inherently offensive move.

If this sounds confusing, it's because you're thinking of it in terms of "how easy is this Pokemon to kill" rather than "how easy is this Pokemon to kill vs. how easy is it for this Pokemon to kill?"

Wobbs and Wynaut are VERY offensive Pokemon. Shadow Tag guarantees either a dead Pokemon at the end of the matchup, or, at the very least, a free switch into a teammate who needs it to set up.
Hm, my point is not that Oran Berry is massively stallish, but that it is more stallish than the alternative berries (PetayaFloon or Custap Wynaut are going to die faster, and often kill stuff faster, than their Oran counterparts). It could be, as you put it for my other points, a second-order correction. One-use items are inherently more short-term and so less stally than similar unconsumable items, but weighing the more offensive and more defensive one-use items equally when measuring stalliness seems questionable.

Quote:
It's true that Leftovers adding greater differences between the various teams is an argument in favor of applying the moveset modification, but I don't think I'll be doing it this month (before Sept. 1), as I need to come up with a counterbalance to help widen the gap between stall, semi-stall and balance.
Ok, makes sense.

Quote:
I would LOVE to add some more semi-stall teams to my sample set.
Perhaps corner a team rater/RMT staff member or two and task them with expanding your collection of exportables?

Quote:
It's essentially equivalent (SD = x2: -1, Expert Belt = x1.2: ~-.25), and there are issues where some of these abilities/items/moves don't deserve the full weight because they won't be applied consistently, but there may be some merit to this idea. For instance, Life Orb vs. Choice Band: would the mod for LO be greater than log_2(3) if I factored in recoil?
Yes, for many it's the same. And it's true that with type-boosting items and the like you're not always getting the boost, so scaling it down slightly may be appropriate: assuming that a 'mon holding Mystic Water will be using a Water move a large majority of the time is reasonable, but not all of the time, so perhaps drop it from a 20% to a 15-18% boost. For LO, my instinct (assuming a stat split, which feels necessary to implement physical+special boosts accurately) is to apply the boost directly to both stats before calculating base stalliness, then have a separate, smaller mod which deals with the recoil. That mod should be approximately equal to the Leftovers mod, since LO will not activate every turn but Leftovers will unless at max HP. With recoil as an extra and sane values for the lower-stat weighting, the mod should in basically all situations be greater than log_2(3).
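
The scaling-down idea can be put as simple arithmetic: if a nominal 20% boost only applies on some fraction of turns, the average multiplier is 1 + 0.2 × that fraction. Here is a small Python sketch. The function name `effective_multiplier` and the uptime fractions are illustrative assumptions rather than measured values, and averaging the multiplier (instead of averaging the log) is just one way to do it.

```python
import math

def effective_multiplier(nominal_boost, uptime):
    """Average damage multiplier when a boost applies on only a fraction of turns.

    Hypothetical helper; uptime is the assumed fraction of attacks that get the boost.
    """
    return 1.0 + nominal_boost * uptime

# A 20% type-boosting item at a few assumed uptimes, with the matching log2 shift.
for uptime in (1.0, 0.9, 0.75):
    mult = effective_multiplier(0.20, uptime)
    print(uptime, round(mult, 2), round(-math.log(mult, 2), 3))
```

Under this assumption, a 15-18% effective boost corresponds to the boosted move being used on roughly 75-90% of attacking turns.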

Last edited by eric the espeon; Sep 27th, 2012 at 4:54:45 PM. Reason: fixed python exponentiation
All times are GMT -4.