The below post is regarding the rankings in the above hide tag. These rankings have been superseded as of January 2022 by rankings further down in this thread, and those will likely become outdated as further updates occur.
To explain how I came to the above results, I first publicly linked a survey form in this post. There were a total of 14 respondents to the survey form: Mr.378, CKW, me, FriendOfMrGolem120, Eeveeto, Beelzemon 2003, Eo Ut Mortus, Waveshaper, Isa, zf, M Dragon, Watchog, Jorgen, and Blightbringer. The form contained a question asking for the number of GSC Ubers games played as a means of determining experience in the tier and deciding how to weight the responses when determining the final results. Initially I used the upper end of # of games played as a raw weighting, i.e. my rankings counted as though 200 people had submitted them, while Isa's counted as 100, etc. I then received feedback saying that it would be better to use the square root of this number instead to reduce the degree of bias towards people that had played a huge number of games in the tier. I was also originally adding a flat 1 extra weighting to all submissions, meaning even though Eo Ut Mortus had never played GSC Ubers before, his ratings would influence the outcome. Upon receiving further feedback, this was removed. I then compared the rankings with and without outliers with vapicuno. He showed me how outliers were affecting the outcome due to the relatively small number of respondents and wildly different ratings from Beelzemon 2003. With outliers removed, the results look a lot cleaner and I am confident we have come to the optimal result possible from this data set.
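For anyone who wants to reproduce the weighting, here is a minimal sketch of the square-root scheme described above. The function names and the example ranks are hypothetical. Only the 200 and 100 game counts and the zero weight for someone with no games come from the post.

```python
import math

def response_weight(games_played):
    """Square root of games played; zero games gives zero weight (no flat +1)."""
    return math.sqrt(games_played)

def weighted_mean_rank(ranks, games):
    """Average one Pokemon's ranks, weighting each respondent by sqrt(games)."""
    weights = [response_weight(g) for g in games]
    total = sum(weights)
    if total == 0:
        raise ValueError("no respondent has any games played")
    return sum(w * r for w, r in zip(weights, ranks)) / total

# Hypothetical ranks from three respondents for a single Pokemon.
ranks = [1, 3, 10]
games = [200, 100, 0]  # upper end of games played; the last respondent has none
print(weighted_mean_rank(ranks, games))
```

With raw weights the first respondent would count twice as much as the second. With the square root the ratio drops to about 1.4, which is the reduction in bias the feedback asked for.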
This time, there were material differences between the rankings with and without outliers. You can see what changed by comparing the Weighted_Result with the Weighted_OR_Result sheet in the spreadsheet.
This form was a big improvement on the previous google form used in the OU VR, but one thing I will clarify next time is when to use the "N/A" option--some people used it for Pokemon they thought didn't deserve to be ranked, while others used it when they had no experience using or facing the Pokemon in question.
With vapicuno's help yet again, here are some nice pictures to show everyone's rankings and how I split up the tiers above. Everything below is using the weighted results, with outliers removed. The charts have error bars based on weighted standard deviation.
Everyone's rankings compared with the result and the Old VR (which didn't rank Pokemon within sub-ranks so is a bit misleading)
Chart of each Pokemon's ranking against their weighted average ranks (i.e. if they were very close in ranking, they will be close together on the y axis)
Same chart as above but with coloured boxes showing where I drew the divisions between ranks in the OP
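The error bars in these charts are described as weighted standard deviations. One common way to compute that is sketched below, assuming the same per-respondent weights are reused. The exact formula behind the charts is not stated in the post, so treat this as an illustration rather than the method actually used.

```python
import math

def weighted_mean(values, weights):
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

def weighted_std(values, weights):
    """Square root of the weighted average squared deviation from the weighted mean."""
    mean = weighted_mean(values, weights)
    variance = sum(w * (v - mean) ** 2 for w, v in zip(weights, values)) / sum(weights)
    return math.sqrt(variance)

# Hypothetical ranks and weights for one Pokemon.
print(weighted_std([4, 6, 9], [14.1, 10.0, 3.0]))
```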
Please let me know if you have any questions and I will do my best to answer.
shuckle should be ranked somewhere, prolly above muk at the very least. proposing this replay as a proof of concept; i was down 4-5 on a permanently spikesless field but shuckle walled the entirety of my opponent's team and spread toxic to allow the comeback https://replay.pokemonshowdown.com/smogtours-gen2ubers-561515
cloyster above ho-oh also for sure and possibly above ttar. offensive spiker that doesnt have to fear fire moves from every mon in the tier = good
golem top of b if not low a. p much same reason as in ou
edit: prolly rank quagsire too, even if the dmg at +6 is super sad it's a good electric stop and surprisingly also good vs mewtwo (psychicless sets fail to 3hko)
I've also used Shuckle before and I agree that it should be ranked. In fact, several people did rank it including myself in the survey, but I don't think it reached the cutoff that ended up getting used to finalise the VR. I'm less sure about Quagsire, so I would like to see more of it before I can comfortably agree.
forretress and especially steelix down. fire moves are everywhere in this meta, everyone preps for primarily forretress but it hits steelix just as well. there are a lot of alternatives to steelix (any rock type) and it's not easy to justify steelix over smth like golem.
this isnt a new take per se but zapdos >> lugia. the recent influx in eq lugia is interesting to see how it develops, admittedly, but even with that in mind lugia is a two-trick pony with pp issues that hates status. zapdos utilizes its broad movepool much better and you must always respect thunder/hp water. celebi isn't feeling too great with stall overall feeling quite clunky and overloadable by the myriad of offensive tools available, only stalk raikou kind of shrugs off those two moves alone while fitting on better structures. just because snorlax _can_ take a thunder doesnt mean it would like to take a thunder for the team.
related, celebi down within its rank. celebi's role hasn't changed but i find that the structures that celebi fits on (stally ones) are weak at the moment. its really easy to get donked by well-played HOs.
merge S1 and S2; snorlax being on all teams is a reflection of how solid it is, and it probably is the most important mon, but i do not believe that translates into power level. mewtwo's flexibility and immediate threat level/mew being completely undeniable as a 1-or-more-for-1 threat coupled with teams being better prepared for lk monolax makes the gap smaller.
still not sure that there exists a ho-oh team where you can't remove the ho-oh and replace it with something else to make that team more consistent.
vaporeon up a rank, premier abuser of barrier m2 aka everyone's best friend at the time of writing.
edit: roughly my a-rank atm:
zapdos
lugia
--
raikou
tyranitar
cloyster
celebi
forretress
golem
--
steelix
ho-oh
umbreon? honestly these might all be more deserving of B than A.
edit2: gengar up a subrank, monolax teams do not care for pursuit support consistently enough yet and this alone makes gengar strong, but a relatively fast boom (outspeeding mew is important) is also a strong point in its favor.
IMPORTANT: Please use the Smogon classic theme instead of Smogon Dark to view this post (link to profile settings here for convenience). This provides a white background necessary to view the graphs in this post because of the png transparency.
Hi everyone,
After working on this for a few weeks (and then stalling for some more) months, I have collected the data necessary to publish a GSC Ubers VR for the year 2021 (it's now 2022 but who is keeping score?). To analyze the data provided, I have used the tools developed by vapicuno - by now you should be familiar with the general structure. Unfortunately, I am not as well-versed in statistics, so the analysis of the stats will probably be shallow. Some parts are directly copy-pasted from vapicuno too.
To gather the list of eligible voters, I decided to cast a wide net and include everyone who had made roughly the top 20% in the GSC Ubers tournaments. The intention was that this'd allow for a large pool of voters that had proven their competence in the tier, and would hopefully remove VRs that were not representative of how the metagame looks. You can see the results of that later in the post.
There is essentially no difference in these lists for the purposes of this VR, as we are analyzing just the top few tiers. The aggregate VR tiers obtained are
S: A1: A2: B1: B2: C1: C2: C3:
Let's go through the process.
Quoting vapicuno:
"First the data is cleaned by compensating outliers 1 standard deviation away from the edge of the percentiles expected to contain +/- 1 standard deviation of a normal distribution. This is a modification of the conventional interquartile range (IQR), which I have not chosen to use because 50% of the sample doesn't capture the full variation from what I've seen. The compensation is done by bringing these points to the edge of this extended range. This results in mostly zero, but sometimes one or two outlier corrections. We then plot the outlier-removed data as a function of the integer rank to obtain this graph."
VR Tiering Visualised
The tiers are overall fairly well-defined and there is a clear gap between the end of the B-ranked Pokemon and those in C-rank. This gap is not as noticeable between A and B-ranked Pokémon, though, and some individual Pokémon are not trivial to place in any one tier. (I shifted Quagsire and Rhydon down one subtier manually as they seemed to be a better fit there.) The S tier remains as steady as previously, though, and Zapdos has a case for earning its own tier.
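For readers curious about the outlier compensation quoted above, here is one possible reading of it as code. It is a sketch under assumptions, not vapicuno's actual script: I take "1 standard deviation" to mean half the width of the range between the roughly 16th and 84th percentiles, and the `percentile` helper and the sample votes are hypothetical.

```python
from statistics import NormalDist

# Quantile levels that bracket +/- 1 SD of a normal distribution.
LOW_Q = NormalDist().cdf(-1.0)   # about 0.1587
HIGH_Q = NormalDist().cdf(1.0)   # about 0.8413

def percentile(ordered, q):
    """Linear-interpolation percentile of an already sorted list, q in [0, 1]."""
    pos = q * (len(ordered) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (pos - lower) * (ordered[upper] - ordered[lower])

def compensate_outliers(ranks):
    """Pull votes lying more than one estimated SD outside the central range back to its edge."""
    ordered = sorted(ranks)
    low = percentile(ordered, LOW_Q)
    high = percentile(ordered, HIGH_Q)
    spread = (high - low) / 2.0          # assumption: one SD = half the central range
    floor, ceiling = low - spread, high + spread
    return [min(max(r, floor), ceiling) for r in ranks]

# Hypothetical votes for one Pokemon; the 30 is far from the rest.
votes = [5, 6, 6, 7, 7, 7, 8, 8, 9, 30]
print(compensate_outliers(votes))
```

In this example only the vote of 30 is moved, to just under 10, and every other vote is left untouched. That matches the "mostly zero, but sometimes one or two" corrections mentioned in the quote.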
Following this, there is more statistical wizardry that I am not competent enough to speak on, so I will again quote Vapicuno.
" We form a dissimilarity matrix where the distances between Pokemon X and Y are given by the following: Take the rate at which voters ranked Pokemon X over Pokemon Y, take the logit transform as is done in logistic regression of a Bernoulli-distributed variable, and take the absolute value. Performing what we call a Ward linkage, this yields a dendrogram of the following sort, where the clusters (what we are going to call tiers) formed by setting a reasonable threshold are represented by different colors, and the dissimilarity between each cluster can be thought of as the vertical height of the nearest branch that connects the two clusters. "
We want to verify the validity of the clusters obtained from the dendrogram, so we next plot the dissimilarity matrix and draw out the tiers specified.
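A rough sketch of how such a dissimilarity matrix could be built from the ballots is below. The function names, the tie handling and the clamping of unanimous votes are my own assumptions, since the quoted description does not cover them.

```python
import math

def preference_rate(ranks_x, ranks_y):
    """Share of voters who ranked X above Y (lower number = better); ties count half."""
    wins = 0.0
    for rx, ry in zip(ranks_x, ranks_y):
        if rx < ry:
            wins += 1.0
        elif rx == ry:
            wins += 0.5
    return wins / len(ranks_x)

def dissimilarity(ranks_x, ranks_y, eps=0.01):
    """Absolute logit of the preference rate; 0 means the vote was an even split."""
    p = preference_rate(ranks_x, ranks_y)
    p = min(max(p, eps), 1.0 - eps)  # assumption: clamp so unanimous votes stay finite
    return abs(math.log(p / (1.0 - p)))

def dissimilarity_matrix(ballots):
    """ballots maps a Pokemon name to its list of ranks, one entry per voter."""
    names = list(ballots)
    matrix = [[0.0 if a == b else dissimilarity(ballots[a], ballots[b])
               for b in names] for a in names]
    return names, matrix

# Hypothetical ballots from 11 voters who split 6 to 5 between two Pokemon.
ballots = {
    "Snorlax": [1] * 6 + [2] * 5,
    "Mewtwo": [2] * 6 + [1] * 5,
}
names, matrix = dissimilarity_matrix(ballots)
print(names, matrix)
```

A matrix like this could then be handed to a hierarchical clustering routine, for example SciPy's `scipy.cluster.hierarchy.linkage` with `method="ward"`, to get a dendrogram. I have not checked that this matches the original scripts.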
To read the dissimilarity matrix, note that zero (the darkest value) corresponds to an equal number of people voting for and against ranking the Pokemon on the Y axis above the Pokemon on the X axis, and the higher the value, the more one-sided the voting becomes. In other words, the darker the cell, the more indistinguishable the Pokemon on the X and Y axes become, and a well-defined tier would be a fully dark square (read my methodology thread for explanations).
This yields the following subdivision:
S: A1: A2: B1: B2: C1: C2: C3:
Numerical ranks represent partial tiers, whereas letter ranks represent a more complete separation. I choose to adopt numerical subranks because there is no reason a priori to believe that Pokemon are grouped in viability by a tripartite scheme of +/-.
Metagame Shifts
This chart shows the difference between this and the previous VR, together with the uncertainties in the means (not the standard deviation, but the standard deviations divided by sqrt(N-1)).
A better way to judge how significant these changes are, and to avoid reading meaning into changes that are due to pure chance, is to plot the z-score,
where the Y axis is the number of standard deviations away from zero. To recap, 0.5, 1 and 2 standard deviations are about 69%, 84%, and 98% significant (one-sided), meaning roughly that for a z-score of 1, we expect a change this large to have occurred due to chance 100%-84% = 16% of the time. Therefore, trust the data on the left more than the data on the right.
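The percentages quoted above follow from the standard normal distribution, and a short sketch can reproduce them. The `standard_error` helper uses the SD divided by sqrt(N-1) as stated in the post. Combining the two uncertainties in quadrature in `z_score` is my assumption, as the post does not say how the old and new errors are merged.

```python
import math

def standard_error(values):
    """Uncertainty in the mean: population SD divided by sqrt(N - 1)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return sd / math.sqrt(n - 1)

def z_score(change, err_new, err_old):
    """Change in mean rank over the combined uncertainty (assumed: added in quadrature)."""
    return change / math.hypot(err_new, err_old)

def one_sided_significance(z):
    """Standard normal CDF evaluated at z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

for z in (0.5, 1.0, 2.0):
    print(z, round(100 * one_sided_significance(z)), "%")
```

The loop gives roughly 69%, 84% and 98%, in line with the figures in the recap.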
Individual Analyses
For those who are interested to see whose S to C rankings are closest to theirs, you can refer to the chart below. The numbers inside the box go from -100% (full anticorrelation) to 100% (full correlation). They are sorted by the S to C dendrogram order.
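As an illustration of what a number in that chart means, here is a minimal sketch that correlates two voters' rank lists. I am assuming a plain Pearson correlation on the rank numbers, because the post does not say which correlation measure the scripts use. The sample ballots are hypothetical.

```python
import math

def correlation(ranks_a, ranks_b):
    """Pearson correlation between two voters' ranks for the same list of Pokemon."""
    n = len(ranks_a)
    mean_a = sum(ranks_a) / n
    mean_b = sum(ranks_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(ranks_a, ranks_b))
    var_a = sum((a - mean_a) ** 2 for a in ranks_a)
    var_b = sum((b - mean_b) ** 2 for b in ranks_b)
    return cov / math.sqrt(var_a * var_b)

# Hypothetical ballots over the same four Pokemon.
voter_1 = [1, 2, 3, 4]
voter_2 = [2, 1, 4, 3]
print(f"{100 * correlation(voter_1, voter_2):.0f}%")
```

Identical ballots give 100% and exactly reversed ballots give -100%, which are the two ends of the scale described above.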
And finally, these are the relative ranks of everyone. Blue = disfavor, Red = favor. Cyan lines demarcate tier cutoffs.
You can also find the ranks of individuals in this spreadsheet.
Closing Remarks
The large number of graphs may seem daunting, and to people who aren't quantitatively trained, this may be really confusing. I recommend just glancing over the spoilers on the first read, only thoroughly analyzing them after you've gone through the more important graphs that have been left unhidden. I'm interested to know what you can infer from these trends, and I hope this can generate some discussion.
--
Sorry for the massive delay in producing this. Rankings were done three to four months ago but personal demons got in the way.
I will detail my thoughts in a later post on the changes to this VR compared to the last.
Cloyster: Boom offense has blossomed during this time period, and GSC Ubers is now a very offensively oriented metagame - indeed, this will guide a lot of the changes to this VR compared to the last. The uptick in spikes usage and shift of spikes user from Forretress to Cloyster is a result of this, as Cloyster has a more powerful, safer, faster and more "directable" boom. During the early part of the ranking period, teams also heavily teched for Forretress, carrying stray Fire moves to a high degree - this is no longer seen as much, but care must still be taken around otherwise safe spiking opportunities such as Celebi and Snorlax. MiracleBerry has also been used on Cloyster to a surprisingly high degree, intending to provide a one-time stop to Snorlax Lovely Kiss for the fastest teams.
Lugia: Lugia is still a defensive behemoth, but in a metagame where teambuilding has become more structured and threats more covered for, Lugia's complete lack of offensive presence is increasingly a nuisance. Earthquake sets have seen a surge in popularity, but have not replaced the shuffling capabilities of the Whirlwind set.
Tyranitar: Settling into a more defensive role than in OU, Tyranitar's most popular and successful set as of writing is a mono-attacking set with Rock Slide/Curse/Roar/Rest, intending to sit on mono-attacking Snorlaxes and safely wait them out. Rest provides very important longevity and with the lack of Snorlax Earthquakes in the metagame today, waking up is relatively easy. Rest Tyranitar also carries an important role as a sturdy answer to defensive Mewtwo sets that can drain its Toxic PP, and with Curse and Roar under its belt, it forces a reaction.
Steelix: Just like in OU, Steelix finds itself competing with Golem for defensive duties, but the weakness to fire moves is an increasing bother. Curse sets are invalidated by Barrier Mewtwo, a staple on defensive teams. Steelix may fall even more in the ranks if the current trends hold up.
Golem: The opposite to the above, Golem provides great utility for an offensively inclined team, with splendid Rapid Spin opportunities, a welcome resistance to Fire and decent Explosion opportunities. However, HP Water from opposing electric types must be heavily respected, and should be assumed to be on every opposing Zapdos.
Jolteon: Perhaps more so a correction in rank than a result of new innovations or metagame trends, Jolteon fulfills the same roles as before. With fewer Lugias in play, it is not as easy as it once was to set up Growths to either Baton Pass away or use to fire off massive Thunders, but going even with Mewtwo remains a very important characteristic, and offensive teams can get completely dismantled by either Jolteon itself or a receiving Mewtwo.
Ho-Oh: A Pokémon that may see itself overcorrected is Ho-Oh. While very difficult to fit on anything but highly specific structures, the natural bulk it carries means that you have a fair multi-purpose wall that has some of the best Toxic spreading capabilities in the game. Unfortunately, Ho-Oh really only fits on stally teams, and those have seen a steady decline in usage over multiple years.
Skarmory, Umbreon and Blissey: See above. All three of these Pokémon are only seen on heavily defensive teams, with the possible exception of some Umbreon variants. As these structures fall in viability, so do the Pokémon associated with them.
Vaporeon: A niche option that can provide both defensive and offensive capabilities, the Water type is surprisingly tough to find resists to in the Ubers metagame outside of the highly passive Celebi. Working both as a standalone setup sweeper and as a Baton Passer (possibly passing Acid Armor as well as Growth), with a key resistance to Fire, Vaporeon is still somewhat underexplored and could be an important cornerstone in the future, should Barrier Mewtwo take too large of a portion of the metagame.
Quagsire: The rising star of the VR period and a personal favorite of mine, Quagsire can run two different sets in Ubers - Belly Drum and RestTalk with Surf. Common to both sets is utilizing its excellent typing to provide breathing room versus otherwise uncomfortable encounters such as Zapdos, Raikou, Tyranitar and Mewtwo. This mon wasn't so much benefiting from meta changes as it was flat out discovered and utilized for the first time during this period, whereas previously it was deemed completely unviable.
I am not very expert in pokemon (I lose a lot / sad) and English is not my domain / sad x2
but why are houndoom and heracross not more common options? let me explain, of course
houndoom: this pokemon blocks mew and mewtwo. the second above all commonly carries barrier, explosion and fire attacks; if mewtwo commits suicide, it would be much better for it to be wasted on houndoom, don't you think? Of course this is a quick meta and mew can demolish it with earthquake and submission, but if you could paralyze it and force it to switch, houndoom would have an advantage because mew is not commonly seen with soft-boiled, and of course it is a natural counter to celebi (uncommon). houndoom sucks against lugia however lol (it's incredible that being so heavy lugia is still faster xD)
heracross: the situation of heracross is better than that of houndoom for me. it is true that the fire dog is faster, but that was irrelevant because mew and mewtwo were still faster than houndoom. heracross gets better scenarios against mew; again, even if mew commits suicide in front of the bug, it would be a good thing, right? mew especially never uses psychic, and shadow ball x2 damages him for less than half of his life. heracross hits lugia hard, and taking advantage of the fact that aeroblast only has 8 pp, lugia should not be a problem. he also hits hard on zapdos (you must have a lot of experience here), raikou, tyranitar, jolteon, vaporeon, celebi, snorlax.
I also find the use of quagsire surprising, but I think that where there is a niche for him there is also a niche for piloswine (this occurs in ou). the hp water of zapdos, raikou and jolteon fails to 2hko him, and piloswine does achieve the 2hko with ice beam (fails with zapdos lol) and earthquake (jolteon has more than a 70% chance of being ohkoed). piloswine can also fish with ice beam against lugia, celebi, golem, gengar (it causes little damage if it does not carry fire punch), blissey, etc, also looking to freeze. it is also faster than golem and rhydon
Let's remember that 3 years ago (I read them but didn't post them) golem and jynx weren't ou, and now they have found a niche that can't be ignored, and ho-oh and celebi aren't even used in ubers (I even wondered if they could drop down to ou). also, who would have thought before that mew and mewtwo wouldn't use stab moves? or even that barrier / toxic mewtwo would be SOMETHING COMPLETELY COMMON IN UBERS
jolteon should be higher. growthpass is crazy dumb- it allows already threatening mons like mewtwo to get maybe a bit too much out of control. even stuff like cloyster and mixed ttar enjoy receiving spa boosts. maybe above golem? bottom A2 is fine too ig
ttar is the 5th best mon in the tier, above cloy and lugia. it does everything, and it does everything perfectly. curseroar is the best lax answer we have, aoa (crunch/flamethrower/surf-pursuit-rockslide/roar) is annoying to face and it's good at breaking holes, and ancient power altho a bit meme is an actual threat that can sweep surprisingly easily as many teams are unprepared for it. a lucky boost most of the time seals the game right on the spot because ttar answers like vaporeon and machamp are not common enough for ap to have a hard, solid check (forry and steelix die to flamethrower, mewtwo dies to crunch, lax dislikes eating a dynamic punch and, if a mon is already asleep, ttar just sits on it and eventually beats it with crunch/dynamic).
mewtwo in my eyes is the best mon in the tier. mandatory in every team and useful in every single match with absurd utility, offensive presence and unpredictability. the 'recently' discovered stalltwo is obnoxious as shit to fight and never dies- it gives offensive teams a sturdy defensive backbone that helps relieve some pressure of mons like lax, ttar, zap, etc in checking opposing lax, ttar, marowak, sd mew (it forces it to boom). it also is a staple on stall teams. offensive sets like boltbeam, aoa, barrier flamethrower icebeam, etc love jolteon being meta and giving it a +1 or sometimes even a +2 spa boost. this makes mewtwo almost unwallable. being tied fastest mon in the meta also helps it being hard to revengekill without a boom.
mewtwo also is one of the best leads in the meta. boom/fire/fight(submission/dynamicpunch)/thunder coverage threatens everything at least with a neutral hit and pressures common opposing leads like lax, opposing mewtwo, cloyster...
willing to entertain that mewtwo might be the best pokemon in the tier. with softboiled mew becoming less frequent, offensive mewtwo often just needs snorlax removed before it can demolish you. psychic is also brutal amounts of damage to most of everything.
barrier mewtwo meanwhile is a very restrictive force in the teambuilder. it's such a force that it invalidated several of my old teams on its own and you need a consistent plan to deal with it or you get slowly 6-0d.
also in agreement that jolteon is deserving of a rise. only on the rarest of occasions does jolteon not go at least 1 for 1 through either its own power or by passing to a waiting mewtwo. passing to other mons is significantly less powerful and more of a backup strat, but breaking down +1 SpA / +2 Def mewtwo is simply put incredibly tough.
not sure if tyranitar is 5th best though. the moment snorlax hits you with a +1 earthquake your ttar is kind of doomed in the long run no matter the set. it's great vs monolax but not so much elsewhere.
--
other takes:
raikou down
celebi up
all the c-rank waters up (except cune)
exeggutor up
misdreavus down
common trend being that having smth to reliably hit earthquakers with is really good. misdreavus is too weak and frail to do anything that gengar couldnt do better and nobody uses raikou when they could have jolteon instead. special shoutout to my boy haze quagsire, the rare joltpass check
with both uwtt and upl in the books i'd like to share an updated view on how i'd rank the viability of the mons in ubers
S1: S2: A1: A2: B1: B2:
after this point it gets a bit murky so ill leave it like this but you all know that quagsire cant be slept on
celebi Good. gives stall an answer to lkiss lax and hp grass is a great tool to 3hko ttar, which is great since a lot of stalls dont fit water STAB onto their teams. "recover is the best move in gsc" -conflict since forever, and he just might be right
raikou Bad. use jolteon instead.
ho-oh Decent. yeah it's difficult to fit on some teams but being 3hkod at worst by all non-stab thunder special attacks is amazing - hooh is a cloyster switch in. excellent mewtwo answer, and helps vs joltpass for that reason (+1 Mewtwo Psychic vs. Ho-Oh: 146-172 (35.1 - 41.4%) -- 80.4% chance to 3HKO after Leftovers recovery). thunder still stings though. better than lugia, because flamethrower actually hurts a bit, and teams are well prepared for tyranitar in this day and age. also serviceable at taking +0 snorlax attacks.
lugia Bad. yeah we all know the defensive stats are gargantuan, but the complete lack of any offensive presence is devastating. the typing is really bad as well - one meaningful resistance/immunity (ground) and several bad weaknesses (ghost, rock, electric, ice). lugia isn't even all that great of a mew check because every mew set has a move to hit it for super effective damage and if you try to counteract it with curse, you can die to untimely rock slide flinches. the support set (surf toxic ww recover) ditches any hope of sweeping to try to provide more meaningful help to the team than just being a flying wall with recover, but the damage output from surf is abysmal.
gengar Good. turns out decently strong boltbeam coverage is quite alright even if it doesnt have stab, all the ubers are either weak to it or loathe paralysis from thunder. tons of tricks in the book too, thief is a great move that gengar likes to utilize, perish trap must be respected, hypnosis exists etc. also nobody runs pursuit and eqlax is not that common (yet) so youve got a potentially permanent wall for one of the premier threats in the generation.
exeggutor Good. great boom, powders are great too, typing helps vs zapdos and jolteon.
IMPORTANT: Please use the Smogon classic theme instead of Smogon Dark to view this post (link to profile settings here for convenience). This provides a white background necessary to view the graphs in this post because of the png transparency.
Hi everyone,
Earlier this month I started asking if it was suitable to conduct a yearly update to the GSC Ubers VR. The tournament scene has seen a lot of life this past year and there have been interesting metagame shifts and trends that made me feel an update was worthwhile.
To analyze the data provided, I have used the tools developed by vapicuno - by now you should be familiar with the general structure. Unfortunately, I am not as well-versed in statistics, so the analysis of the stats will probably be shallow. Some parts are directly copy-pasted from vapicuno too (as well as last year's VR post).
To gather the list of eligible voters, I went over all individual and team tournaments featuring GSC Ubers (excluding Retro Cup), and invited everyone who either placed top 4/6 in an individual tournament, or won a game in a team tournament, to cast a vote. The intention was that this'd allow for a pool of voters that had proven their competence in the tier, and would hopefully remove VRs that were not representative of how the metagame looks. You can see the results of that later in the post.
There is one difference in these two lists beyond the cutoff point, which is the removal of Machamp at position 32, as it only received two votes. Scizor and Muk also could not maintain their position from the VR of last year.
The aggregate VR tiers obtained are
S:
A1:
A2:
B1:
B2:
C1:
C2:
Let's go through the process.
Quoting vapicuno:
"First the data is cleaned by compensating outliers 1 standard deviation away from the edge of the percentiles expected to contain +/- 1 standard deviation of a normal distribution. This is a modification of the conventional interquartile range (IQR), which I have not chosen to use because 50% of the sample doesn't capture the full variation from what I've seen. The compensation is done by bringing these points to the edge of this extended range. This results in mostly zero, but sometimes one or two outlier corrections. We then plot the outlier-removed data as a function of the integer rank to obtain this graph."
VR Tiering Visualised
The tiers are overall fairly well-defined and there is a clear gap between the end of the A2-ranked Pokemon and those in B1-rank as well as B2 and C1. This gap is not as noticeable between B and B1-ranked Pokémon however, and it could be argued that there should be smaller subtiers for Golem + Gengar, Celebi + Forretress, or a solo Mew tier. On the Mew note, for the first time, it has dropped out of the S rank, and Mewtwo was just one vote short of overtaking Snorlax for the #1 position, eventually coming down to a 6-5 preference for Snorlax over Mewtwo, with those two Pokémon occupying the top 2 slots for every voter.
The Pokémon with such a wide range in the C2 tier is Machamp, who only got two votes, being 21st and 37th.
Following this, there is more statistical wizardry that I am not competent enough to speak on, so I will again quote Vapicuno.
" We form a dissimilarity matrix where the distances between Pokemon X and Y are given by the following: Take the rate at which voters ranked Pokemon X over Pokemon Y, take the logit transform as is done in logistic regression of a Bernoulli-distributed variable, and take the absolute value. Performing what we call a Ward linkage, this yields a dendrogram of the following sort, where the clusters (what we are going to call tiers) formed by setting a reasonable threshold are represented by different colors, and the dissimilarity between each cluster can be thought of as the vertical height of the nearest branch that connects the two clusters. "
We want to verify the validity of the clusters obtained from the dendrogram, so we next plot the dissimilarity matrix and draw out the tiers specified.
To read the dissimilarity matrix, note that zero (the darkest value) corresponds to an equal number of people voting for and against ranking the Pokemon on the Y axis above the Pokemon on the X axis, and the higher the value, the more one-sided the voting becomes. In other words, the darker the cell, the more indistinguishable the Pokemon on the X and Y axes become, and a well-defined tier would be a fully dark square (read my methodology thread for explanations).
This yields the following subdivision:
S: A1: A2: B1: B2: C1: C2:
Numerical ranks represent partial tiers, whereas letter ranks represent a more complete separation. I choose to adopt numerical subranks because there is no reason a priori to believe that Pokemon are grouped in viability by a tripartite scheme of +/-.
Metagame Shifts
This chart shows the difference between this and the previous VR, together with the uncertainties in the means (not the standard deviation, but the standard deviations divided by sqrt(N-1)).
A better way to judge how significant these changes are, and to avoid reading meaning into changes that are due to pure chance, is to plot the z-score,
where the Y axis is the number of standard deviations away from zero. To recap, 0.5, 1 and 2 standard deviations are about 69%, 84%, and 98% significant (one-sided), meaning roughly that for a z-score of 1, we expect a change this large to have occurred due to chance 100%-84% = 16% of the time. Therefore, trust the data on the left more than the data on the right.
Individual Analyses
For those who are interested to see whose S to B1 rankings are closest to theirs, you can refer to the chart below. The numbers inside the box go from -100% (full anticorrelation) to 100% (full correlation). They are sorted by the S to B1 dendrogram order. (Side note: I do not know why this is the cutoff point. The scripts did not cooperate well with me.)
And finally, these are the relative ranks of everyone. Blue = disfavor, Red = favor. Cyan lines demarcate tier cutoffs.
You can also find the ranks of individuals in this spreadsheet.
Closing Remarks
The large number of graphs may seem daunting, and to people who aren't quantitatively trained, this may be really confusing. I recommend just glancing over the spoilers on the first read, only thoroughly analyzing them after you've gone through the more important graphs that have been left unhidden. I'm interested to know what you can infer from these trends, and I hope this can generate some discussion.
Mewtwo: Did not actually overtake the #1 slot from Snorlax, but damn if it was not close. Having the widest set of viable moves any Pokémon can run in the tier, competing only with Mew, there is no team that is truly safe from Mewtwo from the get-go. We are also finally living in an era where players are no longer running Curse Mewtwo, which is worthy of celebration in itself.
Tyranitar: It has been a bit of a meme that "Mono TTar should not sweep games in the year 20XX" for a while now, yet it continues to rack up wins. After a period of time where anything but Rock Slide/Curse/Roar/Rest was seen as an oddity, we are now experiencing an era where coverage moves are picking up steam on more offensively inclined teams.
Celebi: Stall-defining mon. Heal Bell continues to be amazing, and HP Grass has started to be settled as a staple move, as it grants Celebi the ability to 3HKO Tyranitar, which is highly useful as stall teams can often otherwise lack tools to deal with it. It is also good for hitting Cloyster, Golem, Rhydon and Quagsire.
Jolteon: No new discoveries in terms of moves or teammates, but the two attacks + Barrier Mewtwo set that accompanies Jolteon has continued to take names and provides excellent breaking capabilities versus stall teams. A similar albeit smaller positive trend can be seen in Vaporeon who can also pass Growths with great success.
Lugia: After long being held up high on the VR for its impact in the builder, Lugia has finally dropped from its former high positions. It continues to be the case that Lugia suffers from having a primary STAB with only 8 PP and a typing that leaves several weaknesses, counteracting its amazing natural bulk. Personally, I am not convinced that Lugia won't drop further in coming updates.
Ho-Oh: Best Mewtwo answer in the game? Maybe? The bird is fat yo.
Raikou and Misdreavus: The two biggest losers this VR are both Pokémon that see most of their niches occupied by other, more offensively minded choices in the form of Jolteon and Gengar, respectively. Raikou can still provide some kind of utility as a fairly bulky phazer without many weaknesses and can set Reflect for a team that wants it, and the high BST provides conditional revenge killing opportunities versus Mew, Zapdos, Gengar etc., but being overpowered by Psychic Mewtwo means it is not reliable enough at its job as a special sponge to make most teams willing to pick it. Misdreavus lacks offensive capabilities and is mostly suitable as an "annoyer", but is easy to overwhelm by strong attacks and the lackluster BST does it no favors.
Rhydon: The third biggest loser. Rhydon has lived in the shadow of Golem for many years in OU, and the same gap is starting to show in Ubers as well. Teams prepare more readily for Tyranitar and Quagsire, and these trends hurt Rhydon as collateral. Some innovation may be possible though, as Rhydon has a decent set of surprises in its movepool such as Counter and Zap Cannon, and even RestTalk has been attempted.
Quagsire: Still on the rise and now located in the B2 tier, serving as the gatekeeper for what is commonly seen as "viable" or not. The revitalized Haze set is providing stall teams with an important tool to handle Growth pass, which contributes to its increased viability.
hello! I made a set compendium with the help of Isa. every pokemon that's ranked in the vr is showcased in this compendium, with (almost) every viable set they can run in the format.
this compendium will hopefully give some more insight to gsc ubers, where many newbies and veterans alike can see it and be reminded or learn about the plethora of interesting sets there are! thanks for reading!
IMPORTANT: Please use the Smogon classic theme instead of Smogon Dark to view this post (link to profile settings here for convenience). This provides a white background necessary to view the graphs in this post because of the png transparency.
Hi everyone,
I am happy to bring to you the 2023/2024 VR update. To be eligible to contribute, you needed to place highly in an individual tournament or win a game in a team tournament. In the end we got 9 submissions, from (listed alphabetically) BigFatMantis, Conflict, corvere, Iguana, Isa, Mr.378, ToasterBoi420, Torchic, and vani - thank you for your contributions.
--
Okay, TLDR stuff first:
The average outlier-compensated ranks from everyone are
Machamp, Tentacruel and Mr. Mime received only one vote each from the VR submissions, which is not enough to appear on the final rankings.
The aggregate VR tiers obtained are
S:
A1:
A2:
A3:
B1:
B2:
C1:
C2:
Let's go through the process.
Quoting vapicuno:
"First the data is cleaned by compensating outliers 1 standard deviation away from the edge of the percentiles expected to contain +/- 1 standard deviation of a normal distribution. This is a modification of the conventional interquartile range (IQR), which I have not chosen to use because 50% of the sample doesn't capture the full variation from what I've seen. The compensation is done by bringing these points to the edge of this extended range. This results in mostly zero, but sometimes one or two outlier corrections. We then plot the outlier-removed data as a function of the integer rank to obtain this graph."
VR Tiering Visualised
Unfortunately, due to a skill issue on my end, I cannot get the Vapicuno script to produce correct linear plots. I suspect this is because of how the script handles the Umbreon and Heracross placements, where the extreme outliers for Heracross cause some issues I cannot immediately determine.
Note: Due to the aforementioned issues, I believe Umbreon and Heracross have swapped placements in this plot.
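Since outliers come up both in the quoted cleaning step and in the Heracross issue, here is a rough sketch of what that compensation could look like for one Pokemon's votes. It is not the Vapicuno script. It assumes that "1 standard deviation" is estimated as half the width of the percentile band expected to hold +/- 1 standard deviation of a normal distribution (roughly the 16th to 84th percentile); the function names and the example votes are made up.

```python
from statistics import NormalDist

def percentile(sorted_vals, q):
    """Linearly interpolated percentile of a sorted list, with q in [0, 1]."""
    if len(sorted_vals) == 1:
        return sorted_vals[0]
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac

def compensate_outliers(ranks):
    """Pull votes outside the extended range back to its edge."""
    vals = sorted(ranks)
    upper_q = NormalDist().cdf(1.0)   # about 0.84
    lower_q = 1 - upper_q             # about 0.16
    p_lo = percentile(vals, lower_q)
    p_hi = percentile(vals, upper_q)
    sigma = (p_hi - p_lo) / 2         # assumed estimate of one standard deviation
    lo_edge, hi_edge = p_lo - sigma, p_hi + sigma
    return [min(max(r, lo_edge), hi_edge) for r in ranks]

# Made-up votes for one Pokemon: eight voters agree, one ranks it far lower.
votes = [5, 6, 6, 7, 7, 8, 8, 9, 30]
cleaned = compensate_outliers(votes)
print(cleaned)
```

In this toy case only the vote of 30 is moved, which fits the quoted remark that the method usually corrects zero, one, or two votes.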
We want to verify the validity of the clusters obtained from the dendrogram, so we next plot the dissimilarity matrix and draw out the tiers specified.
To read the dissimilarity matrix, note that zero (the darkest value) corresponds to an equal number of people voting for and against ranking the Pokemon on the Y axis above the Pokemon on the X axis, and the higher the value, the more one-sided the voting becomes. In other words, the darker the cell, the more indistinguishable the Pokemon on the X and Y axes become, and a well-defined tier would be a fully dark square (read my methodology thread for explanations).
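As a rough illustration of a single cell of such a matrix, the sketch below counts how many voters rank one Pokemon above another and how many rank it below. Dividing the difference by the number of voters is an assumption made here so the value runs from 0 to 1; the methodology thread may normalise differently. The function name and the votes are made up.

```python
def dissimilarity(votes, a, b):
    """0 when voters are evenly split on a vs b, 1 when they are unanimous.

    votes is a list of dicts mapping Pokemon name -> rank (lower is better).
    """
    above = sum(1 for v in votes if v[a] < v[b])
    below = sum(1 for v in votes if v[a] > v[b])
    return abs(above - below) / len(votes)

# Made-up submissions from two voters.
votes = [
    {"Snorlax": 1, "Mewtwo": 2, "Mew": 3},
    {"Snorlax": 2, "Mewtwo": 1, "Mew": 3},
]
print(dissimilarity(votes, "Snorlax", "Mewtwo"))  # evenly split pair
print(dissimilarity(votes, "Snorlax", "Mew"))     # unanimous pair
```

Under this reading, a tier is a group of Pokemon whose pairwise values all sit near zero, which is what shows up as a dark square.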
This yields the following subdivision:
S: A1: A2: A3: B1: B2: C1: C2:
Numerical ranks represent partial tiers, whereas letter ranks represent a more complete separation. I choose to adopt numerical subranks because there is no reason a priori to believe that Pokemon are grouped in viability by a tripartite scheme of +/-.
Metagame Shifts
This chart shows the difference between this and the previous VR, together with the uncertainties in the means (not the standard deviation itself, but the standard deviation divided by sqrt(N-1)).
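For a single Pokemon, that uncertainty can be sketched as follows. It assumes "standard deviation" means the population standard deviation of the N votes, so that dividing by sqrt(N-1) gives the usual standard error of the mean. The function name and the votes are made up.

```python
from math import sqrt
from statistics import fmean, pstdev

def mean_and_uncertainty(ranks):
    """Mean rank and its uncertainty, std / sqrt(N - 1)."""
    n = len(ranks)
    return fmean(ranks), pstdev(ranks) / sqrt(n - 1)

# Made-up ranks from nine voters for one Pokemon.
ranks = [4, 5, 6, 5, 4, 6, 5, 5, 5]
mean, err = mean_and_uncertainty(ranks)
print(f"mean rank {mean:.2f} +/- {err:.2f}")
```

With only nine voters the uncertainty shrinks slowly, so small shifts in mean rank are easy to over-read.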
A better way to understand how significant these changes are, so as not to mistake changes that occurred by pure chance for real shifts, is to plot the z-score,
where the Y axis shows the number of standard deviations away from zero. To recap, 0.5, 1 and 2 standard deviations are about 69%, 84%, and 98% significant (one-sided), meaning roughly that for a z-score of 1, we expect this change to have occurred due to chance 100%-84% = 16% of the time. Therefore, trust the data on the left more than the data on the right.
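To make the z-score concrete, one way to compute it for a single Pokemon is to divide the change in mean rank by the combined uncertainty of the two means. Adding the two uncertainties in quadrature is an assumption here (it treats the two VRs as independent); the original script may combine them differently. The function name and the numbers are made up.

```python
from math import sqrt

def z_score(old_mean, old_err, new_mean, new_err):
    """Change in mean rank, in units of its combined uncertainty."""
    combined_err = sqrt(old_err ** 2 + new_err ** 2)  # assumed: independent errors
    return (new_mean - old_mean) / combined_err

# Made-up example: mean rank moved from 5.0 +/- 0.3 to 4.5 +/- 0.4.
z = z_score(5.0, 0.3, 4.5, 0.4)
print(f"z = {z:.2f}")  # a negative z means the rank number went down
```

Since rank 1 is the best rank, a negative z in this sketch corresponds to a Pokemon rising in the rankings.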
You can also find the ranks of individuals in this spreadsheet.
raikou stonks up. i find him more and more valuable the more i use him. both offensive and defensive teams have a tough time dealing with offensive mewtwo, and raikou really helps here. pairing him with a strong spinner and/or heal bell support makes roarkou a more potent option, giving stall another joltpass check. crunch is superb vs. stall, hidden power water or ice does well vs offense. conversely, zapdos down a peg.
forretress up within its rank, the greedy sets provide additional teambuilding opportunities that otherwise couldnt be gotten, and defense curl + rest gives you incredible longevity (at the expense of only dealing damage via toxic).
misdreavus up within its rank, being a ghost type is a great perk by itself and the struggle to innovate against lead mew has led to this guy getting some more attention. still not a great mon though
tentacruel to be ranked, and shuckle up - conflict showed that the combination has legs, i later built my own take on it, and it absolutely does work. you get a spikes-free environment (until you face gengar) and a shitton of PP which makes it that much more difficult for the opponent to break you down, even with all the free turns you give away. they definitely both have a proper niche and i would put them in the low end of the b2 group. begrudgingly, omastar can also join them there, im still not too hot on him but the guy does his very specific job well.
golem stonks down - probably not overall, but i definitely overrated golem in my VR submission and wanted to note it here.
umbreon down, i feel like this guy is rarely seen and his impact on games is minuscule. monodark offers very little in terms of resistances but does further extend some heracross weaknesses. trappass sets are free entry points for growthpass too which _sucks_.
I think Snorlax and Mewtwo are basically equal in terms of viability. I put Snorlax at the top spot because it is harder to drop on teams.
Mew I haven't actually used much, but lead Mew is incredibly effective. I also like Surf + Fire Blast + Thunder + Explosion.
I think Jolteon is the strongest Electric. Growth pass to Mewtwo is just incredibly lethal. Zapdos, like Mew, I didn't use as much, but it's a very respectable Pokemon; Raikou is a similar story. Celebi suffers because of its passivity and how exploitable it can feel, but its utility is very useful. Ho-Oh is incredibly obnoxious; STAB Flamethrower + Toxic + Whirlwind can feel so hard to dance around.
I don't like Curse Lugia that much. Support Lugia with Surf and Toxic is quite good though.
I have a couple of notes here as a general newcomer to this tier from other oldgen ubers; they're ideally meant to inspire some discussion rather than to be taken as any definitive takes.
---
One thing that stands out about the VR is that Mew is such an oddball to me. I think it's equally deserving of S rank and the top of A+ and what would sway the average GSC Ubers player is their personal perception of what S rank means. I can understand a traditionalist's perspective in wanting to keep Lax + M2 clearly distinguished from the rest of the tier, as they are easily the heart and soul of how the entire metagame is played - but then I think to myself, if S rank is supposed to delineate how powerful and threat-worthy we consider a pokemon, then should Mew not be considered for that position or at least an S-?
I don't think it's an exaggeration to say that Mew is the most variable lead in the game, with the capability of being able to absolutely rip through anything if left unchecked. SD/Twave/coverage/boom is nightmarish for any team to try and guess what the coverage is, and oftentimes GSC teams will be constructed with an implicit understanding of which mon they will feed to lead Mew to force a trade. Oh, and if you guess its coverage move wrong, it can still bring the boom like the Costco guys on another pokemon. Considering that the fourth slot can be anything from EQ to Rock Slide to Thunder (one I have REALLY liked recently when testing on ladder) to Fire Blast, the lead Mew metagame feels almost as deep as the actual meta itself. I don't know if anyone has experimented with SD 3 atk lead recently either but you could even argue THAT works as a lure simply due to the level of depth this pokemon has.
That's also not getting into the support value this pokemon can bring with moves like Thief, Surf, Toxic, Thunder Wave, Soft-Boiled for longevity and so on. Dawnbuster also mentioned the 3 atk + boom set which is another great point toward its unpredictable nature as well. True to its lore in the anime and the games, Mew can quite literally do anything. I think in writing this I have convinced myself that I'd support an S tier Mew, but I'd be curious on the thoughts of others.
---
Zapdos seems to be falling a bit out of favor compared to Raikou. I certainly agree that the latter is 10x more splashable in a meta that appreciates a thunder resist + something you can toss in vs mix2 and at least do some mild damage to Celebi with, but the former's spike immunity allows it to be much more flexible into things like Lugia or forcing out Steelix/Golem. I don't feel nearly as qualified to say anything definitive on this other than that I would be very interested to see if Zapdos sees itself knocked down a peg going forward and what other players are thinking right now. Given how I see things I would support Raikou being positioned above Zapdos but I also realize there is a history of the meta well before I started playing that is worthy of its own respect, which is why I think that it's better to let more discussion play out before I say anything.
In terms of providing pure offensive value I think it's obvious that Jolteon is the best electric; JoltPass is a monstrous archetype in this meta for sure. Speedtying with Mewtwo is no joke and being able to Baton Pass out growth boosts to Mewtwo (or other pokemon - I've enjoyed seeing some of the creative recipients that folks like BFM have come up with, like Raikou or TTar w crunch) is insanely valuable. My biggest struggle has been in figuring out when I want HP Water or HP Grass, because although being able to smack Steelix for a clean 60-70% at +1 can prove nice, Quagsire being able to stop Jolteon cold is something I've rarely enjoyed accepting in my own personal builds.
Also in this light with JoltPass stocks being up it should come as no surprise that I find it much more difficult to use Vaporeon. I still don't disrespect that pokemon as a threat by any means, certainly not impossible to win with it, but with Celebi and these three electrics around I think it can be hard for the mon to gain traction. Umbreon is still neat in what it provides but a free invitation to its brother Jolteon to come in and Growth up is not heartwarming either.
---
Lugia bothers me because I think the best word I have to describe it is awkward. I can't put my finger on if it's just a pure skill issue for me, but the pokemon feels a little unreliable in the sense that its Curse set is certainly capable of blitzing through teams in specific contexts but it's also respected enough of a threat that most teams provide countermeasures to this. It feels like it has to work very hard to accomplish its role, and 8 PP means you have to be rather deliberate about how you do it. I've tended to prefer support Lugia but that might be my own difficulty in making the mon work, and even then I tend to prefer other sorts of builds.
Its bird counterpart however, Ho-Oh, I cannot say enough good things about. It feels like a literal crime to have the Gold mascot lower than the Silver mascot purely because, while Ho-Oh has one set, it does its job so insanely well that it deserves the rise. Several years ago I remember reading a very very old GSC ubers dex analysis of how Ho-Oh was the "worst uber" and I'm glad that not only has its dex entry been updated but also that other players are supporting it to be pushed up the rankings. Being able to spread toxic, shuffle pokemon around after spikes have been laid, and having fire stab is HUGE in this meta. Although it would prefer to pivot into a teammate in the face of a threat, its absurd SpDef means it can take a hit in a pinch, and while it will never outright defeat Snorlax it can find opportunities to whirlwind it out. There's also some potential in being cute and running EQ over whirlwind; I have not personally tried this myself but it seems interesting if your build can accommodate it. It adds another layer to what is already a great Pokemon, but it arguably doesn't even need to experiment because it does its job so damn well.
---
Final quicker bullet point notes:
- I really don't see what Rhydon has to offer this tier and rarely if ever ran into it on ladder, but I am open to having my mind changed on what value it holds (if any)
- Suicune feels like it's on the fringe of low tier viability by offering some great defensive value but seems to suffer from the opportunity cost of simply using better builds with pokemon that can do more than it, this disappoints me as I stress tested it a lot this month and nuked my GXE in the process
- Heracross is cool if a little hard to pilot
- if absolutely forced to pick a #1 I would put Mewtwo just slightly above Snorlax due to the versatility in its movesets; the gap between the two is incredibly small though
- Curse EQ Lax feels rare but deadly and I love using it when I can
- Rest Gengar stonks will probably be going up but honestly Gengar stonks are just great in general
- If you didn't read this whole monologue I want to float the idea of Mew in either the third S tier spot or an S-, it just seems a cut above the other mons to me personally.