Procedural System Generation

robject · Aug 21, 2013

My proposal is to have system data available for every sector in the universe. This proposal consists of (1) a random number algorithm, and (2) a generation procedure.

Pseudo-Random Number Generator (PRNG)

The PRNG is a modified version of Bob Jenkins' "small noncryptographic PRNG", posted in a separate thread in Software Solutions.

The rule is: a new random number is used to generate every value (e.g. 1d6, 1d9, or 1d10) as needed.

2d6 is two calls to the random number generator. Similarly flux.
3d6 is three calls.

And so on.

These random integers are big (32 bits), but the period in the PRNG is huge. The average cycle length is expected to be about 85,070,591,730,234,615,865,843,651,857,942,052,864 results. So, it's less of a logistical nightmare to just get a new number than to try to use up the current one.

Generation Procedure

The procedure follows the Traveller5 rules for sector and system generation, in order, and as needed.

1. Seed the PRNG with the unique hex ID, which is just any string that uniquely identifies a sector and its hex, be it "Faraway/1910" or "TashAlt/54A7/B9910/1910".

2. Supply external information about the sector, such as system density.

3. Using page 421, determine if there's system in this hex. If not, then stop. This requires from one to three dice to be rolled.

4. Follow the master system generation checklist on page 431.

The full process may require anywhere from as few as 40 to upwards of 400 dice rolls, depending on the number of stars and planets show up in the system. On the average, each star requires 6 to 7 d6 rolled, and each world requires 45 d6 rolled (due mostly to satellites).

robject · Aug 23, 2013

Updated and generalized my process.

rancke · Aug 23, 2013

Any entirely automatic world generation system will lack the final and most crucial step:

?. Vet the results!!!

This step ought to contain a number of substeps with the format "Such and such a result is possible but unlikely for such and such reasons. If you get such a result, you can explain it thus and so. If none of those explanations suit you, change the result."

(Not to forget, of course, a few substeps with the format: "Such and such a result is impossible; change it.)

Hans

robject · Sep 11, 2013

My current Procedural System Generation program follows the Traveller5 rules, and is here:

http://eaglestone.pocketempires.com/survey/t5-prog/t5sysgen.pl

I've started to add GG and planetoid placement, as well as satellite generation. The additional data is hidden by default, but can be included in the output by checking the "Full" checkbox.

The Sector UID / Sector Key seeds the generator: enter in the same key, get out the same listing. The key is hashed by MD5, and MD5's first half is used as a 32 bit seed to a very small and portable random number generator with a plenty-long period.

esampson · Sep 11, 2013

I like the idea of what you are doing, but if I could make one suggestion.

Rather than using a PRNG you might find it better to use a hashing algorithm. Find one that you're comfortable produces a nice random hash. For the purposes of demonstration I'll use SHA-512 although there are plenty of other choices.

SHA-512 takes a specific input and returns 512 bits of apparently random data. Since 3 bit will generate a number between 0 and 7 we can break those 512 bits into 170 dice. About 25% of those dice will be 'out of range' but that will still leave us with over 120 dice, which is more than enough.

For any hex you will generate a set of coordinate numbers based off some fixed point. For the sake of argument I'll chose Capital, but you could pick anywhere you like. You will generate a SHA-512 hash of the coordinate of that hex (for capital I would generate a hash of "00,00". That gives me 120 dice with which to determine if there is a system in the hex and if there is the number of stars, what those stars are, gas giants, asteroid belts, and the placement of planets around those stars.

You can then work outwards from each star and generate a new hash based on the coordinate of the system, the star you are working with, and the orbit of the planet. For example the planet in orbit 3 of Capital's primary would use the hash for "00,00:1:3" (I'm going to toss in colons to separate things, but it doesn't actually matter as long as you're consistent). You've now got another 120 dice to determine the UWP of the planet, presence of bases, number of moons, and what orbits those moons fill.

Now you start working out through each filled orbit and you generate a new hash based on the planet's hash seed plus the orbit the moon is in, so a moon in orbit Eff would use "00,00:1:3:6" for the hash.

There are numerous advantages to this approach. The first is that this system is that it will never cycle. Yes, it is possible that you will have a collision between two given values (such as "00,00" and "-134267,8534290" but statistically those collisions are as common as rolling the same 120 dice twice, so nothing is lost. Additionally as soon as you move to your first planet the hashes for those two planets will be different, thus no cycle. 85,070,591,730,234,615,865,843,651,857,942,052,864 is a really long cycle but infinity is more.

The second, and bigger advantage, is that you are much more freed up from making sure you go through a specific order.

As an example, if I want to see what the planet in the habitable zone of hex 2435 of a given subsector is you have to seed the PRNG and then check each hex from 0000 to 2435 for a system. For each system that exists you have to generate all the stars, gas giants, and asteroid belts for each system, the number of planets for each system, their placements, their UWPs, and all their moons, which is an awful lot of work considering you weren't even asking about any of those hexes.

Using the hash method you start off with the hash for the hex to see if the system even exists. If the hex is empty, you're done. If not then you generate the stars of the system, gas giants, asteroid belts, and the placement of planets. If there's no planet in the habitable zone you are once again done. If there is you generate up the planet.

In essence the hash method becomes a sort of MOARN.

The final advantage is that the system is already built to support additional generators. As an example I could add a new step so that I generate up the map for the third planet from the primary using the hash seed of "00,00:1:3:Tri:32,55:W:B66:T:W32:L:W16" to see what is in the White 16 hex of the White 32 Local Hex of the Black 66 Terrain hex of the 55 World hex of triangle 32 of the third planet out from the primary star in hex 0000. Adding in this new generator will have no impact at all on anything that has already been generated and actually would probably be capable of running rather fast considering the level of detail provided (of course if you generate complete maps down to the sub-local hex for every planet and moon of every system in a subsector that might take a lot of time, but you would have to ask yourself why you're doing that since the data is there waiting for you to call it when you need it).

robject · Sep 11, 2013

So, Evan has a great suggestion. Basically, it is to use the hashcode as THE random numbers themselves, and add orbit number to the input string to help randomize the hash (which is a nice touch, by the way). I had thought of this, too, and played around with hashes, including SHA512, and also realized that we'd need to do multiple hashes, but your solution worked better than what I had in mind.

Anyway, it turns out that chopping up 512 bits into 3s, and discarding 25% of them, is more complicated than simply using Bob Jenkins' small (as in 20 essential lines) non-cryptographic PRNG... and as you can see, creative hashing is useful regardless of algorithm, and we could use your idea of tacking on orbit # to the key regardless.

And yes, 85,070,591,730,234,615,865,843,651,857,942,052,864 results aren't infinite, but of course we can always re-seed in the same way as you re-hash... and in fact I do re-seed at strategic points.

Code:

#######################################################################
#
#  As seen on http://www.burtleburtle.net/bob/rand/smallprng.html
#
#  The average cycle length is expected to be
#  85,070,591,730,234,615,865,843,651,857,942,052,864 results.
#
#######################################################################
use integer;
my ($a, $b, $c, $d);

sub rot { ($_[0] << $_[1]) | ($_[0] >> (32-$_[1])) }

sub randint
{
   my $e = ( $a - rot( $b, 27 ) );
      $a = $b ^ rot( $c, 17 );
      $b = ( $c + $d );
      $c = ( $d + $e );
      $d = ( $e + $a );

   return $d;
}

#######################################################################
#
#  t5srand(): Seeds the random number generator with a string.
#
#  For world-building, I suggest the following format:
#
#  <galaxy>/<arm>/<sector>/<hex>/<orbit>
#
#  Orbit 1 of Hex 1910 in Sector X411 in the Orion Arm of the Milky Way galaxy:
#
#  t5srand( "MW/Orion/X411/1910/1" );
#
#######################################################################
sub t5srand
{
   my $UID  = shift;
   my $hash = '0x' . substr( md5_hex( $UID ), 0, 16 ); # 64 bits

   $a = 0xf1ea5eed;
   $b = $c = $d = eval $hash;
   randint() for 0..19;
}

esampson · Sep 11, 2013

Yeah, using the technique I outlined for when to generate a hash and then using that hash as a seed should give you results just as reproducible without the headaches of having to write a parser to turn the hash into dice rolls. The only thing I didn't like about that approach is that as a pure hash it would be easier for me to dig out 'roll #67 of has "X"'. However considering you could just hash X and then feed it into the PRNG and quickly make 67 rolls that advantage is probably minimal.

(in fact you could just make a routine so you call something like generateSpecificPRN(String hash, Integer roll) which would do exactly that. As long as you aren't being ridiculous with the value for 'roll' you shouldn't even notice anything and if you were allowing values of 'roll' that were that high you would need to extend the original hashing mechanism to generate hash values past 512 bits, anyway).

robject · Sep 11, 2013

esampson said:
Now you start working out through each filled orbit and you generate a new hash based on the planet's hash seed plus the orbit the moon is in, so a moon in orbit Eff would use "00,00:1:3:6" for the hash.

This is worth discussing further, because you're right -- even my procedure works discretely, even though one seed gets us a bazillion numbers. This is what allows me to have a "Full" generation checkbox on my web page: if you don't want the extra data, I don't have to generate it.

The question, however, is Where does it make sense to break up those sets of results? Here's how I decided to break it up for as far as I've gotten.

System presence. I use the sector UID as the seed. This is only used for system presence.

Basic data. I use the sector UID/hex. This is used to generate the UWP, bases, PBG, mainworld type, HZ variance, and mainworld satellites and their sub-orbits (or, alternately, the UWP of the planet/GG around which this mainworld orbits). Trade codes are deterministic. The only trade codes that can't be generated at this stage are orbit-related, and that's taken care of later.

Extensions. I use sector UID/hex/ext (literally, the string 'ext'). This determines Importance, Ex, and Cx.

Stars. I use sector UID/hex/stars (literally, the string 'stars'). This determines all star data, including orbits. (Once that's done I go ahead and recalculate the trade codes to catch the ones that were missing.)

Gas Giants. This data seeds with sector UID/hex/gg (literally 'gg'). This determines the satellites around each GG, their sub-orbits and their UWPs, the GG's size, and its orbit, for each GG.

Belts. This data seeds with sector UID/hex/belts (that's 'belts'), determining its spaceport, orbit, pop, gov, law level, and TL.

Those are my breakpoints. As you can probably tell, they're largely broken down by method: generate all the gas giants, check. Generate all the belts, check. And so on.

robject · Sep 11, 2013

By far the most valuable thing about just using a hashcode is that many of them are optimized into binary libraries. For example, Perl's hashing library extension is written in C, not Perl, and therefore MD5 runs blazingly fast. Similarly its SHA functions. Hand in hand with that benefit is that every modern computer language has standardized libraries for these.

If Jenkins hadn't found that cute little PRNG I would be more likely to use hashes as-is, although he also has some very easy and decent S-Box generators.

esampson · Sep 12, 2013

robject said:
This is worth discussing further, because you're right -- even my procedure works discretely, even though one seed gets us a bazillion numbers. This is what allows me to have a "Full" generation checkbox on my web page: if you don't want the extra data, I don't have to generate it.

The question, however, is Where does it make sense to break up those sets of results? Here's how I decided to break it up for as far as I've gotten.

System presence. I use the sector UID as the seed. This is only used for system presence.

Basic data. I use the sector UID/hex. This is used to generate the UWP, bases, PBG, mainworld type, HZ variance, and mainworld satellites and their sub-orbits (or, alternately, the UWP of the planet/GG around which this mainworld orbits). Trade codes are deterministic. The only trade codes that can't be generated at this stage are orbit-related, and that's taken care of later.

Extensions. I use sector UID/hex/ext (literally, the string 'ext'). This determines Importance, Ex, and Cx.

Stars. I use sector UID/hex/stars (literally, the string 'stars'). This determines all star data, including orbits. (Once that's done I go ahead and recalculate the trade codes to catch the ones that were missing.)

Gas Giants. This data seeds with sector UID/hex/gg (literally 'gg'). This determines the satellites around each GG, their sub-orbits and their UWPs, the GG's size, and its orbit, for each GG.

Belts. This data seeds with sector UID/hex/belts (that's 'belts'), determining its spaceport, orbit, pop, gov, law level, and TL.

Those are my breakpoints. As you can probably tell, they're largely broken down by method: generate all the gas giants, check. Generate all the belts, check. And so on.

What I was thinking is that you could actually break it down to almost the level of an individual roll. For example, let's assume we have a sector UID of 1004 and a hex of 2410. Your check to see if there is a system in the hex would be the first random number generated from the seed "1004/2410". If there is a system then you would use the first random number generated from the seed "1004/2401/stars/". The roll for the primary would be "1004/2401/star/1". This continues on to the planets which use seeds such as "1004/2401/planet/1/3/starport1" and "1004/2401/planet/1/3/starport2" (or alternately "1004/2401/planet/1/3/starport" which generates a number between 1 and 36 which is then translated to a 2-12 value).

The reason for the complexity is one of speed. If I want to get information about the moon in orbit F of the plant in orbit 3 of the primary I don't need to generate any data that isn't necessary to the moon. I'll need to see if there is a system and if there is a system I'll need to see if there's a planet in orbit 3. At the moment I don't care about the starport, hydrographics, atmosphere, population, etc. of the planet so I don't generate any of those. All I do is generate the bare minimum of the data to see if the moon exists and anything that might affect the moon (for instance, the size of the planet). I can go back at any time I like to find out what the hydrographics for the planet are because the unique string can be assembled and hashed.

I know this seems like a crazy amount of seeding, but there's a method to my madness. If you use a more general seeding method such as just seeding with the UID and hex and then you make a modification to anything within system generation the dice rolls will shift and suddenly you are no longer producing the same results.

As an example, lets say that the roll for the number of belts in the system changes. Since that is made very early on in the process a system that was previously generated might find the number of belts in the system changed by 1. Now there are a different number of rolls for placing the belts which means that the rolls for placing the planets will be completely different. So will all the UWP rolls for the planets, the number of moons generated, etc..

On the other hand using my 'heavy seeding' method what will happen is that you might see one planet shift around as either an orbit that was used by the asteroid belt becomes open or else an orbit that was open and used by the planet is now used by the new belt. Lets assume our rolls to place planets put them in orbit 5, 9, 2, and 7 and that the new asteroid belt is put into orbit 9. The rolls to place planets will still produce 5, which is fine, then 9, which is now occupied. Since it is occupied you roll again and now place the planet in orbit 2. The third planet gets placed in orbit 7 and the last planet gets placed in orbit 4 (the number generated by using the seed "UID/HEX/Planet Placement/5"). Because of the heavy seeding method the planets in orbits 2, 5, and 7 will remain just like they were. You've got a new asteroid belt in orbit 9 as defined by the new method and the planet that was in 9 has now been regenerated as a new planet in orbit 7.

Procedural System Generation

robject

SOC-14 10K

robject

SOC-14 10K

rancke

Absent Friend

robject

SOC-14 10K

esampson

SOC-13

robject

SOC-14 10K

esampson

SOC-13

robject

SOC-14 10K

robject

SOC-14 10K

esampson

SOC-13

Similar threads