We could weight these numbers as well, especially if we feel that our cutoff point is a bit low. Or we could do something like:
-Players above 1500 are using
-Players above 1600 are using
-Players above 1700 are using
The best part about doing this is that even if there is a large disparity in battle count between users, we can still gather useful stats. If solid players above 1600 or 1700 are using pokemon that we might not have expected them to use, that's telling. It's also notable if they hardly ever use OU pokemon, or if a couple of pokemon seem to be on almost every team over 1700 but see 'normal' numbers at 1500-1600 or so.
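To make the cutoff idea concrete, here is a minimal sketch of how usage per rating cutoff could be tallied. The input shape (a list of rating/team pairs), the function name, and the sample teams are all assumptions for illustration, not anything the ladder actually exports.

```python
from collections import Counter

def usage_by_cutoff(teams, cutoffs=(1500, 1600, 1700)):
    """Return {cutoff: {pokemon: fraction of teams rated above cutoff}}.

    teams is a list of (rating, [pokemon names]) pairs. This layout is a
    hypothetical one chosen for the sketch.
    """
    result = {}
    for cutoff in cutoffs:
        counts = Counter()
        total = 0
        for rating, team in teams:
            if rating > cutoff:
                total += 1
                # set() so a pokemon counts once per team
                counts.update(set(team))
        # Leave the bracket empty if nobody is rated above the cutoff
        result[cutoff] = {name: n / total for name, n in counts.items()} if total else {}
    return result

# Made-up sample data, purely for illustration
sample_teams = [
    (1520, ["Blissey", "Gengar"]),
    (1650, ["Blissey", "Garchomp"]),
    (1750, ["Garchomp", "Skarmory"]),
]
usage = usage_by_cutoff(sample_teams)
```

Because each cutoff reports a fraction of teams rather than a raw count, a player with hundreds of battles and a player with a handful both contribute on the same scale within their bracket.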
I agree that usage split into ranking tiers would yield some interesting stats. I'd love to see how the usage rankings change as you move up in rating. However, I still think that combining that information into a single formula for determining "What good players are using" is fraught with insurmountable problems, many of which I listed in my tl;dr post.
Saw Doug's latest post just as I was about to hit reply: I don't see any reason why we can't find some sort of 'magic number' which makes the Glicko actually tell us what we want it to. Though if Glicko is that arbitrary, maybe we need a new system? I assume the entire ladder rating system would have to be rewritten, which sounds like a nightmare unless there are actually 'better' systems in the public domain.
I don't think the Glicko system is arbitrary. Using the actual Glicko rating number for weighting pokemon usage -- that is what I think is arbitrary. From what I know, Glicko is a great system for rating players. I have no reason to think we should stop using it for ranking players on our ladder. But the numbers yielded by the Glicko system were not devised for weighting pokemon usage.
Many people see the rating number and see the usage numbers, and automatically jump to the conclusion that multiplying the two will somehow yield a meaningful result. I think that is silly.
I agree with your statement that perhaps we could use some other method to make Glicko ratings give us an appropriate weighting multiplier. Maybe so. I think X-Act has an interesting approach...
I decided to implement the above "Player1 beats Player2" formula for 200 randomly-generated players. I applied it for every player against every other player, summed up the probabilities and divided by the number of probabilities. I then sorted them in descending order. Here is the result:
(Percentages)
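For anyone who wants to reproduce this kind of experiment, here is a rough sketch. The formula referred to above is not reproduced in this part of the thread, so the sketch assumes it is the standard Glicko expected-outcome formula, which uses both players' R and RD. The way the random players are generated (ratings around 1500, RD between 30 and 350) and all function names are guesses made for illustration.

```python
import math
import random

Q = math.log(10) / 400  # constant from the Glicko system

def g(rd):
    """Glicko attenuation factor: shrinks rating differences when RD is large."""
    return 1 / math.sqrt(1 + 3 * Q**2 * rd**2 / math.pi**2)

def win_probability(r1, rd1, r2, rd2):
    """Expected score of player 1 against player 2 (assumed Glicko form)."""
    combined_rd = math.sqrt(rd1**2 + rd2**2)
    return 1 / (1 + 10 ** (-g(combined_rd) * (r1 - r2) / 400))

def make_random_players(n, seed=0):
    """Hypothetical generator: ratings ~ N(1500, 150), RD uniform in [30, 350]."""
    rng = random.Random(seed)
    return [(f"player{i}", rng.gauss(1500, 150), rng.uniform(30, 350)) for i in range(n)]

def average_win_probabilities(players):
    """Average each player's win probability against every other player,
    then sort in descending order. Needs at least two players."""
    results = []
    for i, (name, r, rd) in enumerate(players):
        probs = [
            win_probability(r, rd, r2, rd2)
            for j, (_, r2, rd2) in enumerate(players)
            if j != i
        ]
        results.append((name, sum(probs) / len(probs)))
    return sorted(results, key=lambda item: item[1], reverse=True)

ranked = average_win_probabilities(make_random_players(200))
```

One property of this form worth noting: each pair's two probabilities add up to 1, so the averages across the whole population always centre on 50%.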
It seems like this percentage is a non-arbitrary way of making weighted stats.
The problem I can see is what Doug said: the R and RD are only 100% accurate at 11:30pm every day, not at the time when the battle was played. At least, however, this percentage would be a non-arbitrary multiplier.
I agree that using the percentage seems to be much more relevant to making weighted stats than using the ratings numbers themselves.
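As a sketch of what using the percentage as a multiplier might look like: each battle contributes its player's average win probability as a weight, and a pokemon's weighted usage is the weight of the battles it appeared in divided by the total weight. The record layout, the function name, and the sample numbers are assumptions for illustration; nobody in the thread has specified an actual formula.

```python
def weighted_usage(battles):
    """Return {pokemon: weighted usage fraction}.

    battles is a list of (weight, team) pairs, where weight is the player's
    average win probability (between 0 and 1) and team is a list of pokemon
    names. This layout is hypothetical.
    """
    totals = {}
    weight_sum = 0.0
    for weight, team in battles:
        weight_sum += weight
        # set() so a pokemon counts once per team
        for name in set(team):
            totals[name] = totals.get(name, 0.0) + weight
    if weight_sum == 0:
        return {}
    return {name: total / weight_sum for name, total in totals.items()}

# Made-up sample: one battle from a strong player, one from a weak player
sample_battles = [
    (0.8, ["Garchomp", "Blissey"]),
    (0.2, ["Blissey", "Gengar"]),
]
weighted = weighted_usage(sample_battles)
```

Since the weights are probabilities on a fixed 0 to 1 scale, the result does not depend on where the ladder's rating numbers happen to sit, which is the sense in which the multiplier is less arbitrary than the raw rating.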
However, as I have said many times in the past -- I really think our current overall rating system is deeply flawed. Any system that encourages players to "reset" their ratings on a fairly frequent basis cannot possibly be accurate for determining "the good players" at any point in time.
On top of that, how can you curb manipulation by players -- since the players themselves are in total control of how much weight is given to their usage of a pokemon? If a player wants a pokemon's usage to be weighted heavily, they use the pokemon on a "good alt". If they want it to be weighted lower, they use it on a "bad alt".
How can the weighted usage numbers be considered an objective observation when the participants have so much control over what is observed and how much importance will be placed on the observation? This gives way too much ability for players to game the weighted usage rankings.
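A toy illustration of the alt problem, with entirely made-up numbers: the same ten battles with the same pokemon are logged either on a highly weighted alt or on a lowly weighted one, against an otherwise identical ladder. The weights, the battle counts, and the function name are all hypothetical.

```python
def weighted_share(battles, pokemon):
    """Weighted fraction of battles in which the given pokemon was used.

    battles is a list of (weight, team) pairs; this layout is hypothetical.
    """
    total = sum(weight for weight, _ in battles)
    used = sum(weight for weight, team in battles if pokemon in team)
    return used / total

# Rest of the ladder: 90 battles at a middling weight, none using Porygon2
ladder = [(0.5, {"Blissey"})] * 90

# The same 10 Porygon2 battles, played on a "good alt" or on a "bad alt"
on_good_alt = ladder + [(0.9, {"Porygon2"})] * 10
on_bad_alt = ladder + [(0.1, {"Porygon2"})] * 10

share_good = weighted_share(on_good_alt, "Porygon2")
share_bad = weighted_share(on_bad_alt, "Porygon2")
```

In this toy setup the unweighted usage is 10% either way, but the weighted figure comes out at roughly 17% on the good alt and roughly 2% on the bad alt, purely because of which account the player chose to log in with.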
The rating system is predicated on the assumption that all players always do their best to win every match, and that a player has a single "identity" that can be used to accumulate a rating over time -- forever. That is patently NOT the case. Players are not always playing to win, and players cannot be uniquely identified and consistently rated over time. These two inherent problems cannot be avoided. At least I can't see a way to get around them.
No single issue I have posted is THE reason I am against using weighted numbers for determining pokemon tiers. It's all of them. There are just too many problems.