To answer how the 500 came to be, I need to explain how the whole formula came to be. Feel free to skip this post if you're not interested in how the formula was created, as it is rather long.

For me, it seemed obvious enough that a centralised metagame should have a high F and a low O, while a metagame that is not very centralised should have a low F and a high O. (Remember: O is the number of OU Pokemon while F is the number of frequently used Pokemon.) For example, in one case, a particular Standard metagame had F=6 and O=50, while a particular uber metagame had F=14 and O=28. As you can see, the Standard metagame has a low F and a high O, while the uber metagame has a high F and a low O. However, intuition isn't always right, so I kept my intuition to myself until I could prove that it is correct.

To test my intuition, I decided to find the number of Pokemon that are used in 1 out of every T = 1, 2, 3, 4, etc. teams, up to 20. (Remember: a Pokemon is frequently used if it is in one out of every 4 teams, while a Pokemon is OU if it is in one out of every 20 teams.) Then I plotted these numbers against T and compared different graphs obtained from different metagames.
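For anyone who wants to reproduce the counting step, here is a minimal sketch in Python. The usage numbers and Pokemon names ("A" to "E") are made up purely for illustration, and the function names are my own; only the "1 out of every T teams" rule comes from the description above.

```python
# Hypothetical usage data: how many sampled teams each Pokemon appeared on.
# These names and numbers are invented for illustration only.
TOTAL_TEAMS = 100
APPEARANCES = {"A": 50, "B": 25, "C": 10, "D": 5, "E": 4}


def count_used_in_one_of_every(t, appearances, total_teams):
    """Count Pokemon that appear on at least 1 out of every t teams."""
    # appearances / total_teams >= 1 / t, rearranged to stay in integers.
    return sum(1 for n in appearances.values() if n * t >= total_teams)


def usage_curve(appearances, total_teams, max_t=20):
    """Return [(T, count)] for T = 1..max_t, i.e. the points to plot."""
    return [(t, count_used_in_one_of_every(t, appearances, total_teams))
            for t in range(1, max_t + 1)]


curve = usage_curve(APPEARANCES, TOTAL_TEAMS)
F = dict(curve)[4]   # frequently used: 1 out of every 4 teams
O = dict(curve)[20]  # OU: 1 out of every 20 teams
```

The comparison is done with whole numbers (n * t >= total_teams) rather than fractions, so a Pokemon sitting exactly on the 1-in-T boundary is not lost to rounding.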

I noticed that the graphs for the standard and UU metagames were very similar except that the UU graph was slightly steeper, while the uber metagame graphs were much less steep. Since 'steepness' is determined by gradient, I thought "so centralisation is related to the gradient of this graph; the higher the gradient, the less centralised the metagame is".

So at first I plotted a trendline through these points and found its gradient. However, this whole process seemed way too complicated for the average Smogoner to understand, and the teacher in me decided that the process, although sound, needed to be simplified. :)

Then I remembered that I had already mentioned that frequently-used Pokemon are those used in 1 out of every 4 teams (i.e. the point where T=4). So I connected a line between this point and the 'OU' point (i.e. the point where T=20) and found that the gradient of this line is very near that of the trendline. Thus I found a much simpler way of finding a very good approximation of the trendline gradient.
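To make the comparison concrete, here is a rough sketch of both gradients. I am assuming the trendline is an ordinary least-squares straight line (the kind a spreadsheet draws); the function names are hypothetical. The only real figures used are the F and O values quoted earlier.

```python
def trendline_gradient(points):
    """Slope of the ordinary least-squares line through (x, y) points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    return sxy / sxx


def two_point_gradient(f, o):
    """Slope of the line joining (4, F) and (20, O)."""
    return (o - f) / (20 - 4)


# Figures quoted earlier in the post.
standard = two_point_gradient(6, 50)   # 44 / 16
uber = two_point_gradient(14, 28)      # 14 / 16
```

With those figures the Standard line comes out steeper than the uber one, which matches what the graphs showed. How close the two-point gradient gets to the trendline gradient depends on the actual usage data, which is not reproduced here.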

There was, however, a problem: the higher the gradient, the LESS centralised the metagame, and vice versa... which means that the gradient is inversely related to the centralisation. This problem was solved easily by taking the reciprocal of the gradient as the measure I wanted. Alternatively, you could say that I divided the difference in x by the difference in y instead of the other way round.

Okay, so I had a point at (4,F) and another one at (20,O). The difference in x is 16, while the difference in y is (O-F). Thus, my initial measure of centralisation was 16 / (O-F). I soon realised that this number is too small for practical purposes, however, so I just multiplied it by a constant. After testing a few numbers, I settled on a constant of around 30 to 32, and since 30x16 = 480 and 32x16 = 512, I decided to use 500 as the numerator. Finally, I just removed the numbers after the decimal point so that the measure is a neater number, and the formula floor(500 / (O-F)) resulted.
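As a quick sketch, the final formula fits in a few lines of Python. The function names are my own, and the check for O <= F is an assumption on my part: the formula as described does not say what to do in that case.

```python
def centralisation(f, o):
    """floor(500 / (O - F)) for integer F and O."""
    if o <= f:
        # Assumption: the post does not cover O <= F, so refuse to guess.
        raise ValueError("O must be greater than F")
    return 500 // (o - f)  # integer floor division


def raw_measure(f, o):
    """The unscaled reciprocal gradient, 16 / (O - F)."""
    return 16 / (o - f)


print(centralisation(6, 50))   # Standard example: 500 // 44
print(centralisation(14, 28))  # uber example: 500 // 14
```

For the two examples quoted earlier, this gives 11 for the Standard metagame and 35 for the uber metagame, so the more centralised metagame gets the higher number.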

The fact that (O-F) is in the final formula confirms that a centralised metagame has a low O and a high F, while a non-centralised metagame has a high O and a low F. If O is low and F is high, then O-F is a small number, and hence 500 / (O-F) is high. If O is high and F is low, then O-F is a large number, and hence 500 / (O-F) is small.