"The Win Formula" -- Lights, Camera, Action!

Status
Not open for further replies.
I think the intro (being just statistics) should be placed in Stark Mountain rather than PR, as all stats are posted.
 
If you haven't been active in Policy Review or the Smogon battling community, then don't think you can suddenly spring up and be a key actor here. Be realistic.
I feel that this should be stressed even more so. I kind of feel bad for some of the Smogonites who have gotten out of battling over the years, but people are going to have a much greater chance of seeing through it if too many of the 'wise old veterans' take an interest in posting in PR all of a sudden. A few coming in when things start to heat up would be great, but overall we should be cautious.

Along the same lines, people will definitely expect me to be posting, and I will support such a formula to the best of my ability! I could very likely be the 'fake out' type poster, though, because I think my stances on competitive pokemon tend to go there as it is. I know that some non-badged users with PR access, like Blame Game, will be vehemently against this, and I typically am on the other side of the spectrum.

Also, when it's originally posted, make sure not to rush the thread too hard - I could easily see us accidentally overloading the thread too early, and even the most exciting PR threads take a while for everyone to weigh in on them. As mentioned in Doug's OP, let's make sure to give it some time and not get too adamant about things initially.
 

DougJustDoug

Knows the great enthusiasms
Badges: Site Content Manager, Top Artist, Programmer, Forum Moderator, Top CAP Contributor, Battle Simulator Admin Alumnus, Smogon Discord Contributor Alumnus, Top Tiering Contributor Alumnus, Administrator Alumnus
I think the intro (being just statistics) should be placed in Stark Mountain rather than PR, as all stats are posted.
The intro won't actually contain any statistics. It is a post where X-Act is asking for help from experienced members of the community to help make a statistical definition of a "hax win". As the formula develops, we may generate some fake reports to "prove" the accuracy of the formula. But, that comes later. The thread needs to live in PR.
 
"Battling Hero" & "Battling Villain" - People that battle actively and support/oppose the formula based on their assumptions of how the formula will impact their actual teams and battles. They refer to the latest-and-greatest battle strategies to support their arguments.
...is all I can think to do >_< (except I'm not terribly good and just took a week off! And I have like 2? PR posts so... extra maybe?)

I'm thinking something like "This isn't a fair way to control how a game ends because we've already accepted for the most part that hax is out of our hands and you shouldn't punish someone for being 'lucky', especially when it is incredibly likely to even out in the long run anyways".
 

Articuno64

1 to 63 were taken
Badges: Tournament Director Alumnus, Site Content Manager Alumnus, Battle Simulator Admin Alumnus, Programmer Alumnus, Smogon Discord Contributor Alumnus, Administrator Alumnus
I feel that this should be stressed even more so. I kind of feel bad for some of the Smogonites who have gotten out of battling over the years, but people are going to have a much greater chance of seeing through it if too many of the 'wise old veterans' take an interest in posting in PR all of a sudden. A few coming in when things start to heat up would be great, but overall we should be cautious.

Along the same lines, people will definitely expect me to be posting, and I will support such a formula to the best of my ability! I could very likely be the 'fake out' type poster, though, because I think my stances on competitive pokemon tend to go there as it is. I know that some non-badged users with PR access, like Blame Game, will be vehemently against this, and I typically am on the other side of the spectrum.

Also, when it's originally posted, make sure not to rush the thread too hard - I could easily see us accidentally overloading the thread too early, and even the most exciting PR threads take a while for everyone to weigh in on them. As mentioned in Doug's OP, let's make sure to give it some time and not get too adamant about things initially.
Makes sense. I'll probably only make one or two posts in the last week or so.
 
I've been thinking about the fake formula, and, to be honest with you, I'm finding it hard to write down a believable formula (simply because it's not believable!) I'm gonna start surely from the win formula used by Glickman and "try to adapt it for Pokemon" (yes, there is a real win formula based on the two players' ratings/deviations alone).
I don't think it's particularly hard. Basically, you want to call wins and losses according to the expectation of the score rather than the real (noisy) score. Some progress can be made in that direction simply by counting the expectation of damage, rather than the real damage, towards the win statistic. For every pokemon, you have two HP meters: one that works as normal and one that is depleted at a rate corresponding to the expectation of damage. For example, if you have an attack that can do 30, 60, 90 or 120 damage with uniform probability, then regardless of what is truly dealt, you count the mean (75 damage) on the "haxless" HP meter. If a pokemon is confused, you always deal half of the normal damage to the foe and half of whatever a pokemon hitting itself does. An asleep pokemon would deal damage proportional to the probability that it would wake up that turn. The game still goes on normally, but for every pokemon you tally an additional HP meter that is always depleted by the expectation of damage (maybe augmented by its variance). Even if a pokemon faints, you still count the expectation of the damage it would have dealt, multiplied by the probability that it would have survived. At the end of the game, you simply discard all the real HP meters and you use the "expected" meters to compute who "should" have won.
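
As a rough illustration of the two-meter bookkeeping described above, here is a minimal Python sketch. The names `Meters`, `expected_damage`, and `apply_hit` are made up for this example and are not part of any real simulator, and clamping both meters at zero is an assumption. It only covers the damage-roll case, not confusion, sleep, or the fainted-attacker adjustment.

```python
from dataclasses import dataclass


@dataclass
class Meters:
    """Two HP meters for one pokemon (hypothetical structure)."""
    real_hp: float      # depleted by the damage actually rolled
    expected_hp: float  # depleted by the expectation of damage


def expected_damage(outcomes):
    """Mean damage for a list of (damage, probability) pairs."""
    total_p = sum(p for _, p in outcomes)
    if abs(total_p - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(d * p for d, p in outcomes)


def apply_hit(meters, outcomes, actual_damage):
    """Deplete the real meter by the roll and the other by the mean."""
    # Assumption: both meters are clamped at zero.
    meters.real_hp = max(0.0, meters.real_hp - actual_damage)
    meters.expected_hp = max(0.0, meters.expected_hp - expected_damage(outcomes))
    return meters


# The attack from the example: 30, 60, 90 or 120 with uniform probability.
rolls = [(30, 0.25), (60, 0.25), (90, 0.25), (120, 0.25)]
target = apply_hit(Meters(real_hp=200.0, expected_hp=200.0), rolls, actual_damage=120)
```

With that attack, the expected meter always loses 75 HP, whatever the real roll was, so a lucky max roll only shows up on the real meter.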

This system already handles a lot of sources of "hax" naturally: critical hits, misses, paralysis, confusion, etc. It doesn't handle random burn and freeze, stat boost "hax" etc. but in theory you could also have a probability distribution over status and stat boosts, which would cover them. Anyway, if the system was actually implemented, it might actually not work too badly, so with proper rhetoric it shouldn't be difficult to make it believable and to discard critics as irrelevant. It will fly over the head of most people but the few who might get it still won't see right through it. For extra brownie points you can pull a few logs and show that the outcome would be better under the new system by walking step-by-step through the relevant turns and doing the math (and I mean doing it for real). There will certainly be a lot of logs that will support the position. Picking one or two logs where the system fails miserably (there will be heaps) can help towards making it look like a scientific and unbiased venture.

The important point that needs to be made is that the system gives us greater "confidence" in who carried out a particular battle with the most skill and "repairs" statistical errancy due to stochastic aspects of the game. The number of examples that agree with the system must be much greater than the number of examples that don't and for each example that doesn't agree with the system it must be understood why and a solid argument must be made that it's the example that's wrong (contrived, statistically insignificant, an outlier), not the system. I think that is easy to do - if you know the system you can identify what kind of examples make it look good. Extreme, clear cut examples are recommended because it's harder to doubt them. Of course, they are rarely representative of the average (that's why they are extreme) but this is a situation where we precisely do not give a shit. I think there is some value in really "showing" that a formula works on contrived, cherry-picked examples. Everyone who supports the formula can "find" a log and use it to praise the formula. They can (and should) botch and misapply the formula if it is not obviously noticeable. Opponents can give counter-examples that supporters call contrived or where they point out glaring misapplications of the formula. This is exactly how academia works and they've been getting away with it for ages! And also religion but I'm only joking about one of them.

You should also add an extra step where you propose to use the formula as a factor to improve the rating. This should in fact be the primary motivation, and one nobody would object to as long as you reassure them that the true outcome would be the primary factor or that it would only be used to reduce the penalty incurred by a "hax loss". Since losses due to "hax" are much more resented than wins due to "hax", a measure aiming to do damage control and reduce the impact of frustrating losses would probably be viewed positively from the get-go. Basically (though you shouldn't put it that way) everybody would get a higher rating because their losses would count less whenever they feel that they should. I think people will love that. I mean, this is the kind of thing you could actually get away with doing. The problematic part is touching wins, but as soon as people agree to revise penalties they have little logical ground to stand on to reject revising rewards. So the mindfuck you can attempt is to make people agree strongly that it's a good idea to use the "hax" formula to update the loser's rating conservatively, then smoothly transition into saying "it is inconsistent to only apply the formula on one side" and into "we should use the formula on both sides" without debating that maybe you should not use it at all ("we didn't go all this way for nothing"). Then the part of the rating that depends on the real score can slowly be negated. Furthermore, this helps you to divert the debate from the formula itself to its application: you can make it look like (and it might actually be the case) some villains *want* the formula to be applied but *conservatively* and only to help victims of hax.

Don't forget to organize a (rigged) vote.

Not that I think this isn't an exercise in futility, but it's fun to think about.

PS: too many people have access to this forum, the ship will leak. It suffices that X tells Y and then Y tells Z and then Z doesn't give a shit and tells A, B, C... the fact this is long-winded makes it all the more likely, since people will be talking about it. It's unlikely that anyone would come out of the blue to spill the beans, but if discussion is prompted it's another game. I mean, for your sake, I hope I'm wrong, but... yeah.

PPS: why did I write all that? What's wrong with me? God.
 

DougJustDoug

Knows the great enthusiasms
Brain, that is brilliant! I agree on all points.

I too am worried about our ability to keep a secret. I am going to continue under the assumption that we can. If I'm wrong and the word gets out all over the place, then we'll close the thread and cancel the prank. And next year, we'll know to do less-ambitious pranks, or stop including all badgeholders in the joke, or both.

At this point, all we can do is forge ahead, and hope nobody ruins it.
 

DM

Ce soir, on va danser.
Badges: Site Content Manager Alumnus, Senior Staff Member Alumnus, Smogon Discord Contributor Alumnus
I think the probability of this getting leaked has grown exponentially with how extravagant you've made it. That's not a criticism by any means, that's just the truth. It's gone from a one-day joke into a month-long ordeal, there's much more room for errors.
 
if anyone does leak it I don't think they'd deserve any kind of respect, assuming of course they'd keep their badge. i dont think anyone here would ruin the joke with that on the line!
 

DougJustDoug

Knows the great enthusiasms
I think the probability of this getting leaked has grown exponentially with how extravagant you've made it. That's not a criticism by any means, that's just the truth. It's gone from a one-day joke into a month-long ordeal, there's much more room for errors.
I fully acknowledge that we may be asking for trouble, by setting up such an elaborate scheme. If we can't pull it off, I won't go on a rampage looking for the snitch. If the cover gets blown, then next year we may just gather a dozen trusted badgeholders and pull a joke on everyone, badgeholders included. No sweat.

So, the question to other badgeholders is this, "Do you like being on the 'inside' of jokes like this?" If you like being an insider, then keep it "inside". If someone can't keep it inside, then we can solve that problem by pulling the joke ON them, instead of WITH them.

Or we just do simpler jokes next time.

But honestly, if someone doesn't like putting a lot of work into a joke, then they shouldn't participate. No one is forcing anyone to get in on this. I get a kick out of trying to pull off things like this, and it looks like there are several other badgeholders that like it too. I hope other people don't ruin the effort put in by others, by blowing the secret.

Maybe we just can't pull off something this elaborate. I hope we can. I think pulling off a "big con" is a lot more satisfying than doing a "small con".
 

obi

formerly david stone
Badges: Site Content Manager Alumnus, Programmer Alumnus, Senior Staff Member Alumnus, Smogon Discord Contributor Alumnus, Researcher Alumnus, Top Contributor Alumnus, Battle Simulator Moderator Alumnus
I'm helping a friend convince his parents that said parents are about to be grandparents, so obviously I prefer the elaborate pranks.

I really hope no one ruins this.
 
I think it'd be fun to play the grumpy veteran role. Not many people (not to say no one) know me either, so I can even make a quick introduction and rant about the old days to add to the drama, go on about how this is a sign of how low the pokémon community has sunk, go on about how it's a big enough deal to care to post about pokémon again, and other shit. Could be fun, though I'm aware I'd not be very relevant at all, it won't hurt.
 

Caelum

qibz official stalker
Badges: Site Content Manager Alumnus, Community Leader Alumnus, Smogon Discord Contributor Alumnus, Tiering Contributor Alumnus, Top Contributor Alumnus, Smogon Media Contributor Alumnus, Battle Simulator Moderator Alumnus
Excellent X-Act. I'll be posting later tonight to help "guide" the thread along. I actually just want to see where it goes first with people that don't know what's up to get the initial reaction so I best know how to move the thread in the direction we want and move it along properly. Don't want to guide it too early or it might look a bit fishy.

Edit: I'll introduce some of the other variables, but we need a few people to volunteer to do that as well so keep that in mind.
 
people are already saying no to this formula.... I may have to be a "hero" if we don't have more people supporting this. =/
 

DougJustDoug

Knows the great enthusiasms
OK, X-act has started the ball rolling. I posted a cast list in the first post of this thread. I'll also be sending casting instructions to all cast members.

The first post has already started the community down the road of thinking this will actually be used to determine wins. I was hoping the thread would start off a little less controversial, being interpreted as a stat gathering exercise. But, that's how it goes. Like any performance -- you have no idea how things will play out, until you get in front of the audience.

The audience is currently reacting negatively. So we have no shortage of villains, since people like LonelyNess and Blame Game have come out with their guns blazing.

We'll need a few heroes to step up to the mike and give some encouragement. I'll send out requests to specific cast members and ask them to post. The hero posts should not be outright support, but more of the tone --
"I'm hesitant about the actual formula proposed so far, but I like the general idea of this, and I look forward to seeing what X-Act, Caelum, and Doug come up with."
I don't want to flood the topic too early, so we need to let this settle for a day or so before adding too much more to the thread. At that time, we'll get a few more mad scientists into the mix to clutter it up with more obscure ideas for improving the formula. We'll also bring in our loudmouth theorymon experts to get philosophical about "To hax or not to hax, that is the question."

I'll post more later, be sure to check back here for the latest direction.
 

Caelum

qibz official stalker
I'll post sometime tomorrow to let the thread sit for a bit and simmer before I get too active but I liked the idea of including all the variables for april fool's so I was going to post this later. Tell me if anything is off before I post it lol.

Firstly, we must absolutely introduce a variable to account for the overall number of turns where attacks were used. Most moves have a secondary effect and almost every player considers the secondary effects and critical hits to be "hax", so as the number of turns increases the probability of some hax occurring increases. Thus, the formula needs a new variable O:

O = Overall number of turns where attacks were used.

So, amending the formula to account for this

k = (L * p * O) / a

However, let's not forget your opponent could get hax against you. So, we should be looking at the ratio of your potential hax O1 to your opponent's O2.

k = (L * p * O1) / (a * O2)

If you move an equal number of turns, O1=O2, you receive (on average) equal hax. If you move more often, O1>O2, you have a higher chance of hax so "hax" goes up. Vice versa for O2>O1 obviously, your "hax" would go down and there is a greater likelihood your win was on merit alone.

O1 and O2 can also be influenced by individual modifiers based on each move's secondary effects and/or high critical hit ratio. I'll have to get with Doug to figure out the logistics and run some simulations to determine the appropriate modifiers for the individual moves in question. For example, a simulation could discover that the appropriate modifier for Iron Head, to account for its flinch chance, is 1.2, so instead of adding 1 move for an Iron Head, you would add 1.2. This is just an example of the modifier; the finer points would have to be worked out based on empirical results and simulation to determine ideal values.
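
The bookkeeping in this draft could be sketched like this in Python. `L`, `p`, and `a` are treated as opaque inputs from the existing formula, since their definitions are not restated here. `weighted_turns` and `hax_factor` are hypothetical names, and the 1.2 modifier for Iron Head is only the placeholder value from the example above, not a simulated result.

```python
def weighted_turns(moves_used, modifiers, default=1.0):
    """Count attacking turns, weighting each move by its modifier.

    moves_used: list of move names, one entry per attacking turn.
    modifiers:  dict of move name -> weight; unlisted moves count as 1.
    """
    return sum(modifiers.get(move, default) for move in moves_used)


def hax_factor(L, p, a, o1, o2):
    """k = (L * p * O1) / (a * O2), with L, p and a passed in as given."""
    if a == 0 or o2 == 0:
        raise ValueError("a and O2 must be non-zero")
    return (L * p * o1) / (a * o2)


# Placeholder modifier taken from the example in the post.
modifiers = {"Iron Head": 1.2}

o1 = weighted_turns(["Iron Head", "Earthquake", "Iron Head"], modifiers)
o2 = weighted_turns(["Surf", "Surf", "Ice Beam"], modifiers)

# Dummy values for L, p and a, chosen only to show the effect of O1/O2.
k = hax_factor(L=1.0, p=1.0, a=1.0, o1=o1, o2=o2)
```

With equal weighted turn counts the ratio drops out and k is unchanged; here O1 is larger than O2, so k comes out above the O1 = O2 value.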

Now, note this doesn't account for Serene Grace / Super Luck or similar abilities, which was a concern of Lemmiwinks. I'm currently working out a way to incorporate those abilities in a new term "s" (Serene Grace, Super Luck, get it??).

I changed s from stat total to Serene Grace / Super Luck since I can make a better case for arguing hax with that than with stat total.

Edit @ Mekkah: Part of the reason I posted it so I would know if it was too early ^__^. I'll wait a few days.
 
I'm thinking it may be a little too early to introduce variable letters early like that, but it could be just me. I like the rest of the post though.
 

obi

formerly david stone
I'm going to wait to comment until someone links me to the thread, and then I'll say "Hmm... I'll have to look into this more. Seems interesting.".
 

DougJustDoug

Knows the great enthusiasms
I think we should "ignore" the non-insiders in the PR thread, and just discuss "amongst ourselves". I think it will really piss off people like LonelyNess and Blame Game that this big involved discussion is taking place around them, but no one will acknowledge them! Don't let them divert the show by getting into arguments with them. Just address the badgeholders and we'll create our own little "private reality".

I just sent Jumpman and Tangerine a cue. I think they should get a post or two in very early, just to establish a few more heroes.

After that, over the next two days, we need to get our Mad Scientists in the mix. Caelum obviously needs to post something tomorrow, but I agree with Mekkah that we need to keep other specific variable letters in the bag right now. Just post ideas that might later be formalized into variables. Obi, it would be great if you dipped into the thread tomorrow and expressed interest in this little "science project" we have started. Ditto for Brain. Brain you could post your entire "alternate damage stream" idea, and propose it as another factor to weigh in the overall formula. Mention it along the lines of
"This is great stuff guys. In fact, I've got this idea that I've been working on that should fit perfectly alongside the other parts of the formula...."
It doesn't matter if it "fits" or not. It's more geek material to cloud the picture for the know-it-alls that are offended this idea is even being discussed.

I'll jump in sometime later, when we are ready to really piss everyone off by talking about actually programming it on the server. For now, I may post in support of the "project I've been working with X-Act and Caelum", but not much more. I'm leaving the heavy lifting to others at this stage of the show.
 
I got a PM:
Naxte said:
Hey there. I was reading the "Hax In Pokemon Battles" thread in the Policy Review section, and after reading it, I really wanted to reply to it. I don't have PR posting privileges, however, so I was hoping that you could post it for me. I can understand if you don't really think that would be proper, though.

Here's what I want to say:
"I'm very opposed to such a formula being used to determine the actual winner of a match. In its current form, the formula is essentially just calculating the likeliness of a certain player having won a match without hax being involved. However, that's all it is; the likeliness. Even if the formula says it's extremely unlikely for a player to have won without it having been hax, it's still possible. Thus, the formula could screw some people out of some wins that were in fact legit and that, in my opinion, is far worse than having to accept the fact that I'll occasionally lose a few matches due to hax. I'd much rather have the RNG cost me a few matches than have Shoddy be telling me my win wasn't a win, just because it was unlikely for me to pull it off.

Next, there's the fact that no matter how much you strive to make this formula objective, it won't be (at least using the criteria it currently is), and will be costing some people some matches, based on an arbitrarily set parameter. What is this parameter? The role of the Prob_Win value in the formula developed for determining whether or not the player should be given the win.

In order for the amount of hax in a match to be used to determine if a player should be given a win or not, you have to pick a value for Prob_Win below which the player who won the match will not actually be given the win. This cut-off point will end up being arbitrary, and as a result, it's really no better than the hax it's supposed to be counteracting. No matter what value is chosen, there will be matches that, if the value had just been a few points higher or lower, could have been awarded to the other player. Thus, who wins the future matches is dependent upon the value that is chosen now; if you're lucky, it could end up winning you those close matches, and if not, you'll lose them.

Thus, assuming I'm understanding the formula correctly and what I'm saying is true, I cannot support such factors being used to determine the winner of a match. However, if there really is a strong movement for such a thing to be implemented, I'd be willing to accept a bit of a compromise, and have it affect the points gained/lost from such a match instead; basically, if the formula turns out a result that it was extremely unlikely for Player A to beat Player B without a very large amount of hax being a factor, then Player A won't gain as many points and Player B won't lose as many as if the value generated had been lower. Since it's the actual net point gain that matters when attempting to ladder, and not the number of matches won/lost, I feel that would be a reasonable compromise. Still not sure if I really even like that idea, but it's still definitely better than it determining the actual winner of a match, in my opinion."

Thanks either way.
what to do!!
 

reachzero

the pastor of disaster
Badges: Senior Staff Member Alumnus, Top CAP Contributor Alumnus, Tiering Contributor Alumnus, Battle Simulator Moderator Alumnus
Firestorm, thank you so very much. That just made my entire night. I just wish I could see the look on Colin's face. ^_^
 