haxcho, I'm asking for something a bit simpler:

*n* measurements, *m* possible outcomes: given frequency counts *f_1* to *f_m*, where sum{*i*} *f_i* = *n*, what is the margin of error on each *f_i*?

The formula is

margin[p_, n_, a_] = {p - z[1 - a/2] * Sqrt[p*(1 - p) / n], p + z[1 - a/2] * Sqrt[p*(1 - p) / n]}

with the variables *p* := *f_i* / *n*; *n* := *n* (arbitrary, but just to be clear here); and *a* := level of significance.

Also note that z[s_] := the quantile of the Gaussian (standard normal) distribution for the probability s, which you look up in a table.

So, first you should choose a level of significance: 0.1 means that your margin has a 0.1 chance of being wrong, and so on. I would recommend a level of 0.05, since this is the one usually used; in this case z[1 - a/2] = 1.96. If you really want another level of significance and can't find the table, I can give you the appropriate number; there is no closed-form formula for it, sorry.
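If your language has a statistics library, it can stand in for the table. Here is a minimal sketch in Python (my choice, since I don't know what you use); the name `z_quantile` is just something I made up for illustration, while `NormalDist().inv_cdf` is part of Python's standard `statistics` module (3.8 and later) and computes the value numerically.

```python
from statistics import NormalDist


def z_quantile(a: float) -> float:
    """Return z[1 - a/2] for a two-sided margin at significance level a.

    Hypothetical helper name; it only wraps the standard normal
    inverse CDF from Python's standard library.
    """
    if not 0.0 < a < 1.0:
        raise ValueError("level of significance must be strictly between 0 and 1")
    return NormalDist().inv_cdf(1.0 - a / 2.0)


# A few common levels of significance.
for a in (0.10, 0.05, 0.01):
    print(f"a = {a:.2f} -> z = {z_quantile(a):.3f}")
```

For a = 0.05 this gives the 1.96 used here, rounded to two decimals.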

So, to recap, I would recommend using this:

margin[p_, n_, 0.05] = {p - 1.96 * Sqrt[p*(1 - p) / n], p + 1.96 * Sqrt[p*(1 - p) / n]}

So, if you put in margin[f_i/n, n, 0.05], you get a pair: the left element is the lower bound, the right element is the upper bound.
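In case it helps with porting, here is the same formula as a rough Python sketch. The function name `margin` mirrors the notation I used, and the counts in the usage lines are made up purely for illustration; they are not from the Slowbro vs Terrakion data.

```python
import math

Z_95 = 1.96  # z[1 - a/2] for a level of significance a = 0.05


def margin(p: float, n: int, z: float = Z_95) -> tuple:
    """Return (lower, upper) for an observed proportion p from n measurements.

    This is the plain normal approximation, so the lower bound can
    come out negative when p is very small.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    half_width = z * math.sqrt(p * (1.0 - p) / n)
    return (p - half_width, p + half_width)


# Made-up counts: an outcome seen 50 times in 400 measurements.
f_i, n = 50, 400
lower, upper = margin(f_i / n, n)
print(f"lower = {lower:.4f}, upper = {upper:.4f}")
```

Passing `z` as a parameter instead of `a` keeps the sketch free of any table lookup; for a = 0.05 the default already does the job.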

For example, I calculated it for Slowbro vs Terrakion:

Chance of Slowbro KO: {0.1020343965970672, 0.1693223873225308}

Chance of Terrakion KO: {0.18033431334903746, 0.26187674192734445}

Chance of Double Down: {-0.0019218224675036245, 0.011972073723785032}

Chance of Slowbro Switch: {0.14770708321154785, 0.2241522132708642}

Chance of Terrakion Switch: {0.358768768734978, 0.45530158302381596}

Chance of a double switch: {0.017076607959311558, 0.053275150834658294}

Chance of Slowbro getting forced out: {-0.0024058696980351186, 0.007430995326175822}

Chance of Terrakion U-Turn KO: {-0.0009598010930764525, 0.016035177977498562}

As you can see, some of the lower bounds are negative. This is because the so-called "npq-rule" I mentioned in the previous post is violated: the approximation is only reliable when n*p*(1-p) > 9, and for these outcomes n*p*(1-p) < 9.
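If you want your program to flag the outcomes where the rule fails, a check along these lines should do. It's only a sketch: the name `npq_rule_ok` is hypothetical, and the counts in the usage lines are invented for illustration.

```python
def npq_rule_ok(f_i: int, n: int, threshold: float = 9.0) -> bool:
    """Return True if n*p*(1-p) exceeds the threshold, with p = f_i / n.

    When this returns False, the normal-approximation margin for this
    outcome should not be trusted.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= f_i <= n:
        raise ValueError("f_i must be between 0 and n")
    p = f_i / n
    return n * p * (1.0 - p) > threshold


# Invented counts: a rare outcome and a common one, both out of 400.
for f_i in (2, 150):
    print(f"f_i = {f_i}: rule satisfied = {npq_rule_ok(f_i, 400)}")
```

With these invented numbers the rare outcome fails the check (n*p*(1-p) is about 2) and the common one passes.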

The reason for this is that statistics unfortunately isn't like analysis, where every formula you commonly use is perfectly smooth, so we have to resort to approximations, and these always have certain limits.

In this case, you SHOULD either make "n" higher or simply not use the statistics at all. But you can't make n higher and don't need perfect accuracy, so if you just replace every negative number with 0 it's more or less fine (even though exactly 0 is impossible, since the outcome already happened at least once; the true value can just be very, very close to 0). There are more precise formulas, but they're also far more complicated. (The many digits after the decimal point in my calculation aren't there because it's really that precise, but because I just copied and pasted out of Mathematica.)
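To make both options concrete, here is a hedged Python sketch. `clamped_margin` does the substitution with 0 described above (and, as my own addition, also caps the upper bound at 1). `wilson_margin` is one example of a more precise formula, the Wilson score interval; I'm picking it only as an illustration, not claiming it is the best choice for your data. Both function names and the counts are made up.

```python
import math


def clamped_margin(p: float, n: int, z: float = 1.96) -> tuple:
    """Normal-approximation margin with the bounds cut off at 0 and 1."""
    half_width = z * math.sqrt(p * (1.0 - p) / n)
    return (max(0.0, p - half_width), min(1.0, p + half_width))


def wilson_margin(p: float, n: int, z: float = 1.96) -> tuple:
    """Wilson score interval for an observed proportion p from n measurements.

    Unlike the plain approximation, its bounds stay between 0 and 1
    (up to floating-point rounding) without any clamping.
    """
    z2 = z * z
    denom = 1.0 + z2 / n
    center = (p + z2 / (2.0 * n)) / denom
    half_width = z * math.sqrt(p * (1.0 - p) / n + z2 / (4.0 * n * n)) / denom
    return (center - half_width, center + half_width)


# Invented counts: an outcome seen 2 times in 400 measurements.
p, n = 2 / 400, 400
print("clamped:", clamped_margin(p, n))
print("wilson: ", wilson_margin(p, n))
```

For a rare outcome like this one, the clamped lower bound is exactly 0, while the Wilson lower bound is small but positive, which fits the point that exactly 0 is impossible once the outcome has happened.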

I don't know which programming language you use, but I tried to write the formula in a way that means you don't have to change too much. I also tried to explain it as thoroughly as I could with the limited means of a forum post, without going into too much detail, because I don't like it when people use a formula without knowing at least a bit of the background (like the stuff about the npq-rule or the level of significance).

If you don't understand something, or want to know more, feel free to ask! I'm a statistics/mathematics college student, so I have to recap it anyway (or rather want to, since I already took the exam for it), and I can at least help someone while doing that.