Analysis of IV Distribution of Reset Legendaries in Emerald

Status
Not open for further replies.
For some time there has been confusion about the unusual number of clones you get when resetting legendaries. It has been speculated that the cause is the way the game generates random numbers but, to my knowledge, there has never been a deeper analysis of the matter until now.

Theory
The Pokemon games use a linear congruential random number generator with a 32-bit state. As with all pseudorandom number generators, the series of values is actually deterministic: if you know the state, you know what the next “random” number will be. Normally this problem is avoided by seeding the generator, meaning the initial value depends on, for example, real time.

How Pokegames seed
RSFL seed from something in the savegame as soon as you start the game. I currently don’t know what exactly.
DP seeds when you load your game. The seed is based on how long you spend in the menu and on the current real DS time.
Emerald is different. Emerald never seeds, apart from one time when you enter your nickname. Normally, when loading a game, Emerald does not seed. The reason events in Emerald are not completely static is that there is an otherwise useless rand call about every 1/60 s.
This is the main reason why I chose Emerald for further examination.

Emerald’s initial state is always 0. I implemented my own version of the Pokemon random number generator. For reference, this is what it looks like:

u32 randbuf = 0;  /* generator state; Emerald starts at 0 */
u16 pokerand() {
    randbuf = (randbuf * 0x41C64E6D) + 0x6073;  /* wraps modulo 2^32 */
    return (randbuf >> 16);  /* the upper halfword is what gets returned */
}

Basically you can see randbuf as a recursive series (A_n) with A_0 = 0 and A_n = f(A_(n-1)) (sorry for the crappy formatting, it’s a shame that LaTeX is not standard on boards…). The value actually returned is always the upper halfword of randbuf.

It should be fairly obvious at this point that after catching the legendary (in my case Ho-oh) you can calculate how often pokerand() was called before the Pokemon was generated.
That offers a great advantage, because you no longer have a cryptic-looking pseudorandom number but a plain n. You know that n increases with time in Emerald, and that it increases faster if your actions require a lot of rand() calls. If you always do the same things, you can treat n as a time axis with a small correction. Each unique Ho-oh can be identified with one n. That’s what I did with 70 Ho-oh caught on German Emerald.
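
Recovering n can be sketched as a plain forward search. This is only a minimal illustration, assuming the full 32-bit state at the moment of generation has already been reconstructed from the caught Pokemon (how to do that is not covered here). The names pokerand_step, state_after and find_n are made up for this sketch.

```c
#include <stdint.h>

/* One step of the generator from the post: state -> next state. */
static uint32_t pokerand_step(uint32_t state)
{
    return state * 0x41C64E6Du + 0x6073u;  /* wraps modulo 2^32 */
}

/* State after n calls, starting from Emerald's initial state 0. */
static uint32_t state_after(uint32_t n)
{
    uint32_t state = 0;
    while (n-- > 0)
        state = pokerand_step(state);
    return state;
}

/* Smallest n <= limit whose state equals target, or -1 if there is none.
   ASSUMPTION: target is a full 32-bit state, already known. */
static int64_t find_n(uint32_t target, uint32_t limit)
{
    uint32_t state = 0;
    for (uint32_t n = 0; ; n++) {
        if (state == target)
            return (int64_t)n;
        if (n == limit)
            return -1;
        state = pokerand_step(state);
    }
}
```

For example, find_n(state_after(900), 100000) gives back 900. Since n stays small in practice, the linear search costs next to nothing.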

Results
The results are what I expected, but I was a little surprised to find that all 70 Ho-oh lie in the interval n ∈ [860, 980]. So even if I caught 1000 Ho-oh, chances are I would only get about 130 different ones. I expected a roughly Gaussian distribution, and that fits quite nicely if you look at the plot.

[Plot: number of Ho-oh caught per n, with a Gaussian fit shown as a blue line]

The plot shows the number of Ho-oh I caught at each n. As you can see, the number of clones I got is really big. The blue line is a Gaussian fit. The error margins are quite large, but that’s probably because 70 is not enough (and there is no way I’d want to repeat this even more often).

Practical implications
The obvious one: when resetting legendaries, take your time before you run into them. That way n will evolve a little further, giving you a bigger interval. And change times. Otherwise you will end up with the same IV distribution very often, wasting even more time. In RSFL, walk around a little and save about once every 20 attempts so you get a different seed. Only the new games don’t have this problem at all.
 
Great research and all, but I'm having trouble understanding it because of the combination of not being a maths expert and English not being my first language. ): However, I'm sure this is quite spectacular if you know what you're doing.
 
It should be fairly obvious at this point that after catching the legendary (in my case ho-ho). You can calculate how often pokerand() was called before the Pokemon was generated.

...

Practical implications
The obvious one would be when reseting legendarys before you run into them take your time. That way n will evolve a little further giving you a bigger interval. And change times. Otherwise you will end up with the same iv distribution very often and thus wasting even more time. In RSFL walk around a little and save like once every 20 attempts so you get a different seed. Only the new games don’t have this problem at all.
One instance of ho-ho rather than Ho-oh. Also, thanks for the tips on changing intervals and walking. I'm going to be resetting for a better Groudon and Kyogre soon, so this should save me a bit of time. Other than that, I'm afraid I don't understand much either, my excuse being that I'm only entering high school :P Great work though!
 
Can you calculate the IVs associated to a given n? If you could, the best strategy would be to find the n that yields the best possible IVs and wait just the right amount of time such that the peak of the gaussian is located at that n.

Similarly, if the game is reseeded in DP according to the current time and the amount of time spent in the menu, and the random number generator is known, one could calculate the time with the best expected IVs.
 

obi

formerly david stone
Just how far apart in time are we talking here? I mean to say, does n change every millisecond, every second, or what?
 

X-Act

np: Biffy Clyro - Shock Shock
u32 randbuf = 0;
u16 pokerand() {
    randbuf = (randbuf * 0x41C64E6D) + 0x6073;
    return (randbuf >> 16);
}
As an aside, in mathematics, using decimal numbers, this can be written as

pokerand(0) = 0
pokerand(n) = ((pokerand(n-1) * 1103515245) + 24691) mod 4294967296

Remember though, that after each call of pokerand(), the number returned is the number in the formula above divided by 65536 and floored.
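
To make the correspondence concrete, here is a small sketch that puts the hexadecimal version and the decimal version side by side. The function names are invented for this illustration. The decimal form is evaluated in 64-bit arithmetic so the explicit "mod 4294967296" is visible, while the hexadecimal form relies on 32-bit wraparound.

```c
#include <stdint.h>

/* Hex form from the first post; uint32_t arithmetic wraps modulo 2^32. */
static uint32_t next_hex(uint32_t state)
{
    return state * 0x41C64E6Du + 0x6073u;
}

/* Decimal form with an explicit modulus, computed in 64 bits so the
   intermediate product cannot overflow. */
static uint32_t next_dec(uint32_t state)
{
    uint64_t wide = (uint64_t)state * 1103515245ULL + 24691ULL;
    return (uint32_t)(wide % 4294967296ULL);
}

/* The value the game sees: the state divided by 65536 and floored. */
static uint16_t output_of(uint32_t state)
{
    return (uint16_t)(state / 65536u);
}
```

Starting from 0, both forms give 24691 as the first internal value, and the number actually returned for it is 0, because 24691 is smaller than 65536.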

I have personally researched this formula a bit. It repeats itself after 4294967296 numbers, i.e. it generates every number between 0 and 4294967295 once before repeating the same list again. Because of this, an algorithm can be written to generate the list of random numbers in reverse order. For Pokemon, this can let us go 'back in time' to get Pokemon with ideal IVs (according to loadingNOW). loadingNOW and I have written such an algorithm, but it takes an average of 30 seconds to find the previous number in the list (his version is slightly more efficient than mine). I have also found a formula to get pokerand(n-m) for any m given pokerand(n), but the numbers involved can become so big that it's easy to get an overflow error if you implement it on a computer. :[
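
One possible way around the overflow, sketched here as an illustration and not as the formula mentioned above: treat a step as the map s -> mul*s + add modulo 2^32 and build the m-step map by repeated squaring. Every intermediate value stays inside 32 bits because unsigned arithmetic wraps. Going back m steps then uses the full period stated above: m steps back is the same as 2^32 - m steps forward. All names below are hypothetical.

```c
#include <stdint.h>

/* An affine map s -> mul * s + add, taken modulo 2^32. */
typedef struct { uint32_t mul, add; } affine_t;

/* compose(f, g) applies g first, then f. */
static affine_t compose(affine_t f, affine_t g)
{
    affine_t r = { f.mul * g.mul, f.mul * g.add + f.add };
    return r;
}

/* The map equal to applying `step` m times, by repeated squaring. */
static affine_t power(affine_t step, uint32_t m)
{
    affine_t result = { 1u, 0u };          /* identity map */
    while (m > 0) {
        if (m & 1u)
            result = compose(step, result);
        step = compose(step, step);
        m >>= 1;
    }
    return result;
}

/* State reached after m further calls. */
static uint32_t jump_ahead(uint32_t state, uint32_t m)
{
    const affine_t step = { 0x41C64E6Du, 0x6073u };
    affine_t jump = power(step, m);
    return jump.mul * state + jump.add;
}

/* State m calls earlier. Relies on the period being exactly 2^32,
   so that (0 - m) wraps to 2^32 - m forward steps. */
static uint32_t jump_back(uint32_t state, uint32_t m)
{
    return jump_ahead(state, 0u - m);
}
```

Each jump needs at most 32 squarings, so it is effectively instant for any m.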

Results
The results are what I expected but I was a little surprised when I found that all 70 Ho-oh are in the Interval n element [860:980] so even if I catch 1000 Ho-oh’s chances are that I only get like 130 different ones.
I find this extremely intriguing, as there's nothing to suggest this fact from the random number generator alone. There must be some other hidden mechanism to kind of 'shift' all random numbers to the range 860 to 980 somehow.

Awesome research though.
 
The evolution from n to n+1 takes about 1/80 s. This is a very rough estimate (there are corrections depending on what you do, of course; it means that my clicking intervals were somewhere in the 1-2 s max range, which sounds reasonable), and it applies to Emerald only. Further research will be necessary to achieve true random manipulation. From a theory standpoint I'd assume a real formula for n would look something like n = f*t + bootup + A*entering-new-areas + B*steps-in-grass + C*random-pokemon-encountered + ...
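
As a back-of-the-envelope check of that estimate, here is a tiny sketch that keeps only the f*t term and ignores the bootup offset and every action-dependent correction. The rate of 80 steps per second is the rough figure given above, and the helper names are invented for this example.

```c
/* Rough rate from the post: one step of n takes about 1/80 s. */
#define STEPS_PER_SECOND 80.0

/* Leading-order model only: n = f * t, all corrections ignored. */
static double n_to_seconds(double n)
{
    return n / STEPS_PER_SECOND;
}

static double seconds_to_n(double t)
{
    return t * STEPS_PER_SECOND;
}
```

Under these assumptions n = 860 corresponds to about 10.75 s and n = 980 to about 12.25 s, a window of roughly 1.5 s, which is in line with the 1-2 s range mentioned above.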

@X-Act: no, no, the random values were actually "random". I just took the random values I got on the caught Pokemon and calculated their n myself (calculating pokerand^(-1), so to speak). That way it's easier to compare what I get and to see which Pokemon are actually "next" to each other on the time (well, n) scale. The interesting part is that because of the slow evolution of n you only get n from 860 to 980, and not the full interval in n. (That was to be expected; that it's just about 100 wide, however, surprised me.)

To take that a step further, one would have to find n with good IVs (trivial) and then find good times to catch. That is not trivial, because it would require a theoretical formula to calculate n(t, your-actions) as exactly as possible (missing the value by 10 would not matter much, but being off by 500 would obviously suck).
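
The "trivial" half could look roughly like the sketch below. Note the big assumption, which does not come from this thread: the Pokemon is taken to be generated by four consecutive calls, two for the PV and then two that each pack three 5-bit IVs (HP/Attack/Defense, then Speed/Sp. Attack/Sp. Defense). If the game orders its calls differently, the numbers change. All names are made up for the example.

```c
#include <stdint.h>

typedef struct { unsigned hp, atk, def, spe, spa, spd; } ivs_t;

/* Advance the state once and return the upper halfword, like pokerand(). */
static uint16_t next16(uint32_t *state)
{
    *state = *state * 0x41C64E6Du + 0x6073u;
    return (uint16_t)(*state >> 16);
}

/* IVs of a Pokemon whose generation starts after n earlier calls.
   ASSUMED layout (not from the thread): calls 1-2 form the PV,
   calls 3-4 each pack three 5-bit IVs. */
static ivs_t ivs_at(uint32_t n)
{
    uint32_t state = 0;                  /* Emerald's initial state */
    for (uint32_t i = 0; i < n; i++)
        next16(&state);
    next16(&state);                      /* PV, low half (assumed)  */
    next16(&state);                      /* PV, high half (assumed) */
    uint16_t w1 = next16(&state);
    uint16_t w2 = next16(&state);
    ivs_t v = {
        w1 & 31u, (w1 >> 5) & 31u, (w1 >> 10) & 31u,
        w2 & 31u, (w2 >> 5) & 31u, (w2 >> 10) & 31u
    };
    return v;
}

static unsigned iv_total(ivs_t v)
{
    return v.hp + v.atk + v.def + v.spe + v.spa + v.spd;
}

/* n in [lo, hi] with the highest IV total (the first one on ties). */
static uint32_t best_n(uint32_t lo, uint32_t hi)
{
    uint32_t best = lo;
    for (uint32_t n = lo; n <= hi; n++)
        if (iv_total(ivs_at(n)) > iv_total(ivs_at(best)))
            best = n;
    return best;
}
```

Calling best_n(860, 980) would then name the most promising n inside the observed interval. Hitting that n in practice is the hard part described above.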
 
Now this is strictly IVs, right? Say I found a Ho-oh with great IVs, but a Timid nature. That Ho-oh could still show up with a Jolly nature with the same IVs, correct?
 
Practical implications
The obvious one would be when reseting legendarys before you run into them take your time. That way n will evolve a little further giving you a bigger interval. And change times. Otherwise you will end up with the same iv distribution very often and thus wasting even more time. In RSFL walk around a little and save like once every 20 attempts so you get a different seed. Only the new games don’t have this problem at all.
What do you mean by "And change times."? You can't change the time in Emerald.

Edit: Does this only work for legendaries, or can it work for other Pokemon (such as starters and other one-offs)?
 
Hmm, yes, that was not really clear, I guess. What I meant was: wait a little longer before you encounter or get your Pokemon. E.g. don't always boot up, skip the intro, load, and press A to start the battle instantly. A good method to make sure you get something different every time would be to wait one second more each time you start the battle.

It basically works with everything random.
 
To a limited extent this also applies to the other GBA generation games. Emerald is a little different (and simpler to calculate). The general ideas also work with RSFL, but they seed. Maybe the distribution interval is a little bigger or smaller, and because of the seeding you will have a random (but constant for identical saves) offset on the n-axis, but that is it.
 
Interesting research. I've known about this issue for some time so I usually try to change the time before I start the battle, walk a few random steps or whatever to remedy it, like you suggest. But it's still nice to see some actual numbers on this stuff although there's clearly much more to discover and probably no practical way to put the information to use in order to get better IVs.
 
I did a little research after reading all this. Before I had read this, all I usually got when catching Dittos were the same ones with exactly the same IVs, and I was wasting time. So after the game started, I would already be in the cave where you can catch Dittos. To me, it seems that if you wait for a couple of minutes, the random number generator gives the Dittos different IVs. I have started to catch Dittos with 2-3 flawless IVs by just waiting 2-3 minutes each time. Maybe if I wait longer, I will catch even better Dittos at a better rate. I'll update and tell you if I find anything else out.


EDIT: I found that when I wait a little too long, around 5-8 minutes, I get pretty low IVs for Dittos. It may just be really random, but you never know.
 

obi

We'll need more than just 'some' number of Dittos with 'bad' IVs. Just how many are we talking about here? Basically, without more in-depth information, that is meaningless.
 

X-Act

I managed to find a formula that reverses the random number generator, i.e. produces the list of random numbers in reverse order. This should have implications about finding Pokemon with perfect IVs in the wild.
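
For readers who want to try this themselves: one common way to step such a generator backwards (not necessarily the formula meant above) is to multiply by the modular inverse of the multiplier. The multiplier is odd, so it has an inverse modulo 2^32, and Newton's iteration finds it in a handful of steps. The names here are placeholders for this sketch.

```c
#include <stdint.h>

#define MULT 0x41C64E6Du
#define INC  0x6073u

/* Forward step, as in pokerand(). */
static uint32_t rng_next(uint32_t s)
{
    return s * MULT + INC;
}

/* Multiplicative inverse of an odd number modulo 2^32. */
static uint32_t inverse_mod_2_32(uint32_t a)
{
    uint32_t inv = a;              /* a*a == 1 mod 8, so 3 bits are right */
    for (int i = 0; i < 5; i++)
        inv *= 2u - a * inv;       /* each round doubles the correct bits */
    return inv;
}

/* Backward step: undo the addition, then undo the multiplication. */
static uint32_t rng_prev(uint32_t s)
{
    return (s - INC) * inverse_mod_2_32(MULT);
}
```

With this, rng_prev(rng_next(s)) gives back s for every 32-bit s, and a single backward step costs about the same as a forward one.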
 
I'm finding through repeated breeding that I'm also coming out with very similar IV spreads for hatched Pokemon, particularly those that are received and hatch within a few minutes of each other (I use the fast bike to get the steps quickly outside of the daycare).

More interestingly, I'm finding that abilities, genders and natures are passed at the same time as well. For example, I hatched 6 Chanseys in a row that were all Bold with Serene Grace, and I hatched 8 Adamant Larvitars that were all female. While these two do not link all 3 categories, I similarly hatched 4 consecutive Ralts that were male, Modest, with Trace.

I would think that the ability to breed perfect pokemon outweighs the ability to find a perfect legendary. My research is inconclusive at the moment, but I'm still looking at it. The game seems to extremely limit the variety of pokemon you receive, which should be used for breeding, since breeding for IVs in the previous versions seems much harder.
 
All of my tested parent Pokemon were max IV, and approximately 1/4 of the offspring hatched were max IV, which is significantly better than the estimated 50,000+ from your applet. Some parents were hacked, none of the offspring were hacked, and no hacking devices were used during the breeding.

2 observations: I have never seen fewer than 4 IVs passed; it's always either 5 or 6. Also, when breeding 2 legal max IV Pokemon, the trend seems to repeat, as in the example with my Dragon Dance Larvitars. I hacked a Charmander and a Dratini to start the process, but used no hacking tools afterwards. The Dragon Dance Charmander that I used as a parent to the Larvitars should be legit that way, so I've concluded that the multiple occurrences of passed dynamic values are not caused by the hacking device. Even then, it shouldn't be so easy that I have a box of 20 lv 1s with max IVs from 4 hacked parents, so I'll have to look into it. The 4 hacked Pokemon are a Mudkip (used for breeding Endeavor Treecko), a Charmander and a Dratini for the Larvitar, and a Duskull for breeding Will-O-Wisp onto the Ralts. Nothing was hacked on the Pokemon except IVs, and the wild IVs were set to all 31 before capture.

However, this would not explain why my completely non-hacked 6 bold chanseys all came out with serene grace, since neither parent was hacked.

on a side note, all 4 of the charmander that I hatched for the larvitar breeding came out adamant male, no other nature and no other gender.

I will start doing tests on non-max IV Pokemon sometime during this week, since I'll be in a car for about 40 hours of it.
 
Okay, let me make one thing clear: similar n does NOT mean similar-looking IVs, and similar IVs do not mean similar n. In fact, unless you do an exact calculation you cannot tell how far apart the states are. Pokemon you catch in a similar time range will have a similar n, but not necessarily similar IVs. At least as of now I don't have a theoretical explanation for that, and unless you have statistically relevant data I'd seriously doubt that it's a non-random effect.

Or, for more math-savvy people: let's call one IV-space and the other one tau-space (tau resembles t but is not exactly t). It may be that you can transform from one into the other with a mathematical formula given by the RNG and the way the game uses it during generation, and both representations are equivalent, but besides that they "look" completely different, and the n-space is far more useful if you want to compare your results. The only thing you know for sure is that identical Pokemon will also be at the same position in n-space; if only one IV is different, they can be at completely different positions in n-space. (The whole thing is kind of like a Fourier transformation, if you will.)

On the other hand, it might be that at "good" times there are more good IVs. I currently don't think so, but maybe there is some clustering. It's quite easy to check and I am going to do that sometime today. If we find that, it would be really useful, as it would translate into a simple way of getting maximum output, like "for best IVs, catch your Pokemon from minutes 4-7".

Other things which might be worthwhile research topics:
1) Calculate the n of a lot of Emerald Pokemon to test what n are reasonable in iv. As I said before, I do not expect to see a whole period.
2) Find the n for max IVs.
3) Find a proper way to estimate the lower-order effects on n (the major effect is time, which is easy to estimate, but that will probably not be enough if you want to abuse (2)).
4
 
This subject kind of finally convinced me to register here because it's of much interest to me. I've been attempting now and then to soft reset for shinies of the "gift" Pokémon (so starters, Game Corner pokes, legendaries, ...), and have been concerned that bad RNG seeding has been hampering my progress. I've always suspected this and I'm glad that someone's finally backed it up with cold facts. Now, I have a few questions:

  1. I'd presume that the same "pokerand" is used for the PV (controls nature, ability, shininess, ...) as well as IVs. Is this true?
  2. Is there any way, short of restarting the entire game, to reseed the RNG on Emerald? (other than sitting around idle for an undetermined period of time after each reset .__.)
  3. How could Game Freak have done such an awful job with the random number calculations in the Advance games?

I can think of a whole bunch of ways to avoid RNG cloning, any of which would alone have solved the problem:
  1. Seed from the real-time clock, dammit! This would make the "phase" in the Pokerand function continue to progress between soft resets.
  2. Rewrite the random numbers the game seeds from to the save each time you load your save, even if you don't resave. (Even if the power is lost mid-write, the results are still "random".)
  3. Carry over some value between soft resets. It could be the amount of game time that passed before resetting or anything else. It doesn't matter, so long as it's unique. Soft resetting is entirely a software process, so there's no need to wipe the memory 100%. Almost no one resets their game by turning off and on the GBA/DS.

Now, if you don't mind, I'm going to go assemble an Excel sheet containing Beldum stats and go do a couple thousand SRs on Emerald to populate it. I'll see if walking around a bit and saving changes the set of clones that appear.
 
To your questions:
1. Yes, of course
2. Actually, restarting Emerald sets the RNG back to the same state every time. The only ways to get other seeds are a) to wait a long time or b) when starting new games, to use different nicknames.
3. To be honest, other games are not that different, but it does not matter because most are not really based on high-level randomness.
Your suggestions:
1. D/P does.
2. This is what Linux does (?). However, it's not really usable for games, because saving takes a lot of time on EEPROMs/flash and you normally don't "shut down" your GBA. The seeding in RSFL is comparable, but with some severe limitations.
3. How do you do soft resets on a GBA? (Unless it's programmed into some game menu.)

- If you collect data, it would be a good idea to use some savedump method, since that's fast and you get all the shiny/gender/nature data.
 
I don't know if this is of any use to you, but in ViralKite's trading topic, I found three others (Including the topic maker) who had Rayquazas with exact IVs and Natures as mine... And this is certainly too much to be coincidence. Does this mean all games share a similar IV plot on the time of manufacture?
 