Considering that one possible outcome is allowing both at the same time (correct me if that's wrong), we need to be able to test what the game is like with both of them in it, methinks.
The same can be said of any two suspects though, which is why Stage 3 exists, and also why I completely agree with you that the process by which we decide what is/isn't banned from standard is "backwards." The fact is, we can't turn around now and say "well never mind, let's do it the right way now!" (which is pretty much exactly why Amazing Ampharos, probably before DP was even released stateside, was already saying that we should start off DP with no bans/clauses whatsoever). So the people who don't have this fixation on the "real OU metagame" (the same mentality that has helped keep the "UU reset" controversial for several months, because "they're not the real UU!") are kind of stuck hoping that Stage 3 serves as OU's "reset," so to speak.
At this point I honestly think that the best bet for those people is to push for Stage 3 to be as fleshed-out and thorough as possible. I'm obviously being very presumptuous here, but I think it's imperative that, when the time comes, we don't just look at Stage 3 as a confirmation of our good Stage 1/2 decisions and a nice little band-aid fix for our bad ones. Those who are already skeptical of the initial stages of this process might find it more useful to focus on Stage 3's potential too; I think it's inevitable that, whether we go back to using a Bold Vote, keep the current voting system, or use a hybrid of the two, we're going to run into a lot of bias and a lot of controversy. I think the voting system has a lot less to do with our problems than the system as a whole does (though that isn't to say that I particularly like it the way it is).
We're given a month or less to really figure out how each of our Suspects changes the metagame. Even if people actually used that entire month of testing (which they don't), such a short period of time gives us plenty of ways to vote something Uber when it in fact "wouldn't be" in the long run, once things actually settled. There are plenty of votes in both the Deoxys-S and Shaymin-S polls that either preemptively deem something to be too strong based on a couple weeks of experience, or literally predict that "once the metagame settles it will dominate." Judging from the utter lack of controversy most of these types of votes have generated, I don't think this problem would be solved by implementing a Bold Vote system, or really by doing anything short of giving the "OU side" some kind of artificial advantage ("only needs 40% of the vote to win"). It's just the way things go when we've already committed to the notion that testing suspects one at a time (which suddenly makes time an issue) is the way to go... but since that obviously can't be the case for Stage 3, Stage 3 seems rather ideal to me.
So yeah, sorry for veering things somewhat off topic like that. There's just a lot to think about with this crazy system, and with Aldaron's old testing proposal getting the shaft, I'd love to see someone reassure me that Stage 3 could viably do that justice.