re: Clear Guidelines
If the world were perfect and Pokemon weren't nearly as intricate as it is, these guidelines would already have been established. Unfortunately, because of competitive Pokemon's incredibly complex nature, it is virtually impossible to give the word "broken" a perfectly explicit definition that is devoid of subjectivity. Doug's Characteristics of a Desirable Metagame and Jumpman's Portrait of an Uber are two valuable documents that I believe any informed player should peruse before voting, and they introduce and address some interesting conundrums in attempting to define Pokemon jargon. Just think about this for a minute: these topics use definitions to reduce the subjectivity of the words "desirable" and "broken", but the definitions in and of themselves are open to interpretation. This isn't some grand discovery though -- most people are well aware of the paradox I just outlined, and Doug actually addresses the numerous issues of subjectivity that his definitions introduce. Anyway, the purpose of their topics isn't exactly to lay down 100% irrefutable definitions, but rather to narrow the scope of these frequently used terms. No matter how hard we try, I honestly don't believe we'll ever arrive at a universally accepted definition of "broken" for competitive Pokemon because of the complex ways the game's variables interact with one another.
Want my personal opinion? I think "broken" can be summed up with the phrase "bad for the metagame". Now clearly this is an EXTREMELY broad interpretation, but if you combine that sentiment with the definitions Doug and Jump outlined, hopefully you can get a clearer picture of what I mean. Anything that has a net negative (again, broad, but this is unavoidable) impact on the metagame when you consider the majority of its effects (this includes Pokemon it counters, Pokemon that beat it, how it interacts with weather, how it interacts with various playstyles, luck factors, etc) is "broken" in my view. Each and every voter has their own unique view of the term "broken", but I have faith in the voters that they have thought critically and rationally enough to arrive at something similar to what I just explained. After all, I wouldn't have pushed so hard for the return of Suspect Testing if I didn't believe that the voters as a whole were capable of this.
re: Supermajority/Percentages
Admittedly this is an issue we didn't address as thoroughly as we should have when Suspect Testing returned, as it was low on the priority list of things to do in order to get Suspect Testing off the ground. Without going into too much detail I can just say that we will probably return to the old system of requiring two consecutive votes of 50%+ or one supermajority vote of 66%+1 in order for a Suspect to be banned. This in effect means Landorus will be tested again since it received a majority ban but didn't reach a supermajority. As long as it receives a second majority ban, it will be kicked out of OU (EDIT: it is possible that we do not apply this to Landorus because its vote is already done, but I'll have to sort this out with the Council). Aldaron explained the reasoning for this system better than I ever could so I will just leave his words here (spoiler: it boils down to math... so fun!):
As anyone with any statistical knowledge can tell you, analysis over any dataset has some error variable. In Pokemon suspect voting, this error could encompass rating system deficiencies, voter stupidity, or even acts of god (site / sim being down for large amounts of time). One of the purposes of a supermajority is to make sure we are above 50% + error, to ensure a ban. Note, the reason this is done this way, and not as 50% - error for keeping the suspect, is that banning a suspect changes the status quo, which logically requires we be sure we want to change it.
Granted, nobody has ever bothered to measure this error, but I figured I should give you objective reasoning for why some percentage over 50% is preferred. The exact, true, 50 + error % might never be known, but it is for sure above 50%, hence the desire for some percentage over 50%. Since measuring stuff like the errors I mentioned is probably impossible, and since we know the 50 + error percentage is above 50%, the best option we have is to use a judgment to determine what the "best" percentage is...basically hope that the 5 council members are somewhat competent in math and logic :p
You can apply some principles like a larger voter pool will probably have a lower error range (standard error is standard deviation divided by the square root of sample size) or a larger time range to qualify will probably have a lower error range (more time for ladders to stabilize, more time for metagame in general to stabilize), but in the end it will end up being a judgment call by the council.
tl;dr - laws of stats require a meaningful change of the status quo to be above 50+error%; it's very difficult to quantify that error in Pokemon voting, so it will ultimately end up as a council judgment call.
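To make the numbers above a bit more concrete, here is a rough Python sketch of the ban rule described earlier (two consecutive majority votes, or one supermajority) together with the standard-error formula from the quote. It is only an illustration under stated assumptions: the function names are made up for this example, "66%+1" is read as "at least two thirds", and the standard error shown covers sampling error only, not the rating-system or downtime issues the quote also mentions.

```python
import math

# Thresholds as described in the post. The exact rule is still a Council
# decision, so treat these values as assumptions for illustration.
MAJORITY = 0.50          # "50%+": strictly more than half of the votes
SUPERMAJORITY = 2 / 3    # "66%+1" is read here as at least two thirds


def is_banned(vote_history):
    """Decide whether a suspect is banned.

    vote_history holds the ban share (0.0 to 1.0) of each consecutive
    vote on the same suspect, oldest first.
    """
    if not vote_history:
        return False
    latest = vote_history[-1]
    # One supermajority vote is enough on its own.
    if latest >= SUPERMAJORITY:
        return True
    # Otherwise the last two votes must both be simple majorities.
    if len(vote_history) >= 2:
        previous = vote_history[-2]
        return latest > MAJORITY and previous > MAJORITY
    return False


def standard_error(ban_share, voters):
    """Sampling standard error of a vote share.

    This is standard deviation divided by the square root of the sample
    size, where a yes/no vote has standard deviation sqrt(p * (1 - p)).
    """
    if voters <= 0:
        raise ValueError("voters must be positive")
    return math.sqrt(ban_share * (1 - ban_share)) / math.sqrt(voters)
```

Under these assumptions, a single vote with a hypothetical 55% ban share gives `is_banned([0.55]) == False`, while a second majority gives `is_banned([0.55, 0.58]) == True`. For an evenly split vote, `standard_error(0.5, 100)` is about 0.05 (5 percentage points) and `standard_error(0.5, 400)` is about 0.025, which is the "larger voter pool, lower error range" point from the quote.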
re: Speed
Speed has certainly been a tricky issue in the past (iirc Jumpman mentioned somewhere that the slowness of the DPP Suspect Testing process was one of its biggest pitfalls but don't quote me on this). However, I think the problem is being exaggerated as a break in between tests is certainly not crippling to the process. There is actually a lot of merit to a small break in between tests, especially after a ban has taken place, in order to (re)evaluate the course of action. I'm certainly opposed to creating a rigid schedule of Suspect Tests because philosophically and pragmatically it makes very little sense.
re: Retesting OU Pokemon
I have never been opposed to retesting Ubers in OU, but I am certainly opposed to testing Pokemon for the sake of it. It's always wise to consider the prospect of retesting Ubers based on metagame changes, but there has to be a ton of supporting material in order to even contemplate changing the status quo. I mean, the OU Council often has discussions regarding the possibility of retesting some Ubers, but we generally agree that it isn't even worth a test. I think Aldaron has mentioned retesting Thundurus and someone may have dropped Excadrill's name at some point, but given the current state of the metagame I don't think I could get behind any retest. Speaking in massive hypotheticals, if something such as a full weather ban takes place, then we will obviously look into Manaphy and Excadrill retests, but given the current metagame I don't think anything similar will be happening. I think it's dangerous to view a stable metagame as boring and thus grounds for a new Suspect Test, because this idea is at odds with the fundamental principles of Suspect Testing.
re: Paragraphs
Again, this has been a tricky issue for the entirety of Suspect Testing's lifespan. The merits of paragraphs are so obvious that I don't need to repeat them, but the concerns they introduce are extremely problematic. The way I see it, paragraphs introduce another unnecessary layer of subjectivity when it comes to voting on suspects. Not only do voters have to state their subjective (see my response to point #1 as to why I think this subjectivity is inherent) views on the Suspect, but now the Council has to subjectively evaluate whether there is merit to the voter's assertions. I think the Council as a whole is smart enough to discern between intelligent and blatantly stupid lines of reasoning, but we'll probably run into problems when we disagree with controversial principles from which the voter's opinion was formed. Statements like "Landorus is broken because the Rock Polish variant destroys most well-built offensive teams" require us to determine whether this justification is acceptable and thorough enough to allow the player to vote. But how do you evaluate this acceptability? Well, the Council would have to do this subjectively, but at this point the opinions of the paragraph writer lose a lot of their meaning because this process puts so much power in the subjective views of the Council that we may as well scrap the Suspect Testing system altogether.
I am definitely not explaining this as well as I could (sorry it's late!), but basically what I'm trying to convey is that I just don't think the benefits of paragraphs outweigh the repercussions. Sure, it would let us weed out extremely stupid voters (who I maintain are anomalous, though that's beside the point), but remember the whole reason why Suspect Testing works is because we believe that voters AS A WHOLE are smart enough to make good decisions for the metagame. I just cannot stress this enough... without this trust the whole thing falls apart. If we did not trust that voters AS A WHOLE were smart enough to do this, Suspect Testing would never have been conceived in the first place.
------------------
If you can take anything from this post it's that I acknowledge that Suspect Testing is an imperfect process, but all things considered, I think it's pretty darn good. Given the number of variables competitive Pokemon presents us with, there are literally an infinite number of ways to produce a balanced metagame and therefore an infinite number of ways to run tiering. But we operate under a series of principles for a reason, and unfortunately there are far too many of them for me to list and explain individually. Speaking broadly though, all I can say is that I personally believe in a system that maximizes efficiency (time, resources, etc) to the best of its ability without compromising the integrity of the final product, all the while ensuring public involvement remains at the forefront of priorities for the individuals who organize the tiering process.
I'm sure there are things that could be improved with Suspect Testing that fall in line with this philosophy, but before you criticize the current system, keep in mind that we are very much following a path that was carved out by the DPP Tiering Leaders (a very good path if you ask me) and that it is far more difficult to alter this paradigm than it is to continue using its basic principles. On the topic of being critical, I would also like to stress that while I love reading posts that are at odds with my views because I find them insightful, people should be careful not to get so hung up on their opinions that they become aggressive or inflammatory. While I don't post too often here, I read most threads that are relevant to tiering, and far too often I see people who get so caught up in winning an argument that their points become clouded by their emotions. Be happy and enjoy, because after all the point is to have fun.
I just wanted to conclude by saying I didn't consult with the rest of the Council in making this post, and so other than Aldaron's post I quoted, these are simply my views and nothing else.