Sector Data File Formats (.sec)

forthekill · Mar 24, 2009

I've been looking at a lot of different utilities and programs for Traveller, and the thing that bothers me the most is how inconsistent the various data formats are.

The .sec file formats in use are enough on their own to drive someone nuts.

travellermap.com, the gensec utility, and H&E all use a different format for the sec data they output. I don't know what Universe outputs (or if it can?) since I don't yet have it to play with.

The standard .sec format is supposed to be:

01-13: Name
15-18: HexNbr
20-28: UWP
31: Bases
33-47: Codes & Comments
49: Zone
52-54: PBG
56-57: Allegiance
59-74: Stellar Data

Now to me the 13 character name field is way too small, and apparently others feel the same since only H&E outputs to that format, and the two others I mention here expand the name field.

gensec uses:

01-18: Name
20-23: HexNbr
25-33: UWP
35: Bases
37-51: Codes & Comments
53-55: PBG
57-58: Allegiance
60: Zone

travellermap.com uses (this is the current output, but there is no guarantee that this won't change):

01-25: Name
27-30: HexNbr
32-40: UWP
42: Bases
44-68: Codes & Comments
70: Zone
72-74: PBG
76-77: Allegiance
79-98: Stellar Data

Heaven & Earth (and Galactic) uses almost the standard .sec, but for one change in the length of the Name field:

01-14: Name
15-18: HexNbr
20-28: UWP
31: Bases
33-47: Codes & Comments
49: Zone
52-54: PBG
56-57: Allegiance
59-74: Stellar Data

And for its .HES format:

01-04: HexNbr
07-20: Name
23-31: UWP
34-44: Codes & Comments
48-50: PBG
53: Bases
56-57: Allegiance
60: Zone
63: Satellite
66-85: Stellar Data

Personally, I like the travellermap.com format the best, as I like freedom in naming.

Ultimately though, I'd like to come up with a single XML format and then create a tool that can go from any of the common .sec formats to XML, and vice-versa.
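
To make the idea concrete, here is a minimal Python sketch of a round trip between a parsed world and XML. The element and attribute names (`world`, `hex`, and so on) are purely hypothetical, not an existing or proposed schema, and the sample world is invented.

```python
import xml.etree.ElementTree as ET

XML_FIELDS = ("name", "uwp", "bases", "codes", "zone", "pbg",
              "allegiance", "stellar")


def world_to_xml(world):
    """Build a <world> element from a dict of .sec field values."""
    elem = ET.Element("world", hex=world["hex"])
    for field in XML_FIELDS:
        value = world.get(field, "")
        if value:                      # omit empty fields entirely
            ET.SubElement(elem, field).text = value
    return elem


def xml_to_world(elem):
    """Inverse of world_to_xml: turn the element back into a flat dict."""
    world = {child.tag: child.text or "" for child in elem}
    world["hex"] = elem.get("hex")
    return world


sample = {"name": "Exampleworld", "hex": "0101", "uwp": "A123456-7",
          "bases": "N", "codes": "Ni", "zone": "", "pbg": "123",
          "allegiance": "Im", "stellar": ""}
xml_text = ET.tostring(world_to_xml(sample), encoding="unicode")
print(xml_text)
```

Leaving out empty fields keeps the output short, at the cost that a reader has to treat a missing element as blank.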

I'd love to hear what people think of the various data formats, which people use, what people like or dislike about them, etc.

Merxiless · Mar 24, 2009

Jim V's Galactic uses the following, which I believe matches H&E exactly, since H&E came later:

The data in the sector text files is laid out in column format.

1-14: Name
15-18: HexNbr
20-28: UWP
31: Bases
33-47: Codes & Comments
49: Zone
52-54: PBG
56-57: Allegiance
59-74: Stellar Data

I'm a fan of that, since I've used it over 12 years now.

Hemdian · Mar 25, 2009

forthekill said:
I've been looking at a lot of different utilities and programs for Traveller, and the thing that bothers me the most is how inconsistent the various data formats are.

The .sec file formats in use are enough on their own to drive someone nuts.

travellermap.com, the gensec utility, and H&E all use a different format for the sec data they output. I don't know what Universe outputs (or if it can?) since I don't yet have it to play with.

Ultimately though, I'd like to come up with a single XML format and then create a tool that can go from any of the common .sec formats to XML, and vice-versa.

I'd love to hear what people think of the various data formats, which people use, what people like or dislike about them, etc.

Universe uses a relational database internally, but it does include a flat file parser for import/export that is parameter driven. Parameter files for several formats are included and you can easily write your own. Thus you can use Universe to convert from any flat file format to any flat file format.

Universe 2 (currently in development) will add support for XML, Galactic's SAR format, and web services.

forthekill · Mar 25, 2009

Hemdian said:
Universe uses a relational database internally, but it does include a flat file parser for import/export that is parameter driven. Parameter files for several formats are included and you can easily write your own. Thus you can use Universe to convert from any flat file format to any flat file format.

That's a pretty nice feature to have. I hope to get a hold of the software soon (damn budget!) and start using it.

Universe 2 (currently in development) will add support for XML, Galactic's SAR format, and web services.

Will you be able to define the XML as you can with the current version as mentioned above? Or will you create a Universe specific XML format?

I was thinking it might be nice to have an XML schema that has support for a ton of information (without being overly complicated), with style sheets for displaying the data in the various old styles for readability.

robject · Mar 25, 2009

Forthekill:

Most of the .sec file formats are parseable with a regular expression and a bit of decision logic. Anchor the search on the UWP and (perhaps) the PBG, and the rest will fall into place. Hex number is distinguished from world name by its four digits (and woe to anyone who gives their world a four-digit name... I've never seen one, though there's no rule against it; the rule here is to worry about the 0.0001% case only when it crops up).
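
A simplified Python sketch of that anchoring idea follows. It is not the regex anyone in this thread actually uses: the pattern, the `middle` catch-all group and the sample lines are assumptions made for illustration, and bases, codes and zone are deliberately left unparsed.

```python
import re

# Anchors: four-digit hex, UWP (7 characters, dash, tech level), PBG.
WORLD_RE = re.compile(r"""
    ^\s*(?P<name>.*?)[\s.]*          # name: everything before the hex
    (?P<hex>\d{4})\s+                # hex number, e.g. 0101
    (?P<uwp>\w{7}-\w)                # UWP, e.g. A123456-7
    (?P<middle>.*?)\s                # bases, codes and zone, left unparsed
    (?P<pbg>\d[0-9A-F]{2})\s+        # PBG, e.g. 123
    (?P<allegiance>\w\w|--)          # two-character allegiance
    """, re.VERBOSE)


def parse_world(line):
    """Return a dict of fields, or None if the line is not a world line."""
    match = WORLD_RE.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["middle"] = fields["middle"].strip()
    return fields


# Two invented lines with different name widths.
line_a = "Exampleworld  0101 A123456-7  N Ni              A  123 Im G2 V"
line_b = ("Exampleworld".ljust(25) + " 0101 A123456-7 N "
          + "Ni".ljust(25) + " A 123 Im")
for line in (line_a, line_b):
    print(parse_world(line))
```

Because the hex must be followed directly by something shaped like a UWP, a world named with four digits still parses: the digits that sit next to the UWP win.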

As for XML... it's overkill when representing Traveller data, except when your application uses it. It is most definitely overkill for the general representation of .sec files (again, except where your application makes use of it).

In short, XML must be targeted to your application. Acceptance of an existing, applied schema will take time, and may never happen.

coliver988 · Mar 25, 2009

For what it is worth, here's the C# version of the regex to extract/validate .sec files. Someone on this website originally listed it, and I've been using it. The actual regex should be universal; some of the options are probably .NET-specific.

Code:

 public static readonly Regex worldRegex = new Regex(@"^" +
            @"( \s*             (?<name>        [^\s.](.*?[^\+\s.])?  ) )? \+?\.* " +    // Name
            @"( \s*             (?<hex>         \d\d\d\d              ) )      " +    // Hex
            @"( \s+             (?<uwp>         \w{7}-\w              ) )      " +    // UWP (Universal World Profile)
            @"( \s+             (?<base>        \w | \*               ) )?     " +    // Base
            @"( \s{1,3}         (?<codes>       .{10,}?               ) )      " +    // Codes
            @"( \s+             (?<zone>        \w                    ) )?     " +    // Zone
            @"( \s+             (?<pbg>         \d[0-9A-F][0-9A-F]    ) )      " +    // PBG (Population multiplier, Belts, Gas giants)
            @"( \s+  (\w\w\/)?  (?<allegiance>  (\w\w\b|\w-|--)       ) )      " +    // Allegiance
            @"( \s*             (?<stellar>     .*                    ) )      "        // Stellar data (etc)
            , RegexOptions.Compiled | RegexOptions.CultureInvariant | RegexOptions.Singleline | RegexOptions.ExplicitCapture | RegexOptions.IgnorePatternWhitespace);

in use it is like:

Code:

  Match worldMatch = worldRegex.Match(line);
  if (worldMatch.Success)  // header, comment and blank lines will not match
  {
      world.hex = worldMatch.Groups["hex"].Value;
      world.bases = worldMatch.Groups["base"].Value;
      world.travelclass = worldMatch.Groups["zone"].Value;
  }

where line is the .sec line. This sample then gives the world hex code, the bases, and the travel classification. (world is a class with the various Traveller attributes and some functions, so world.hex, for instance, contains the hex code for the position, such as 0123, and world.travelclass could be G, A or R, or whatever.)

Seems to work on at least 3 different .sec file formats I have used.

forthekill · Mar 25, 2009

robject said:
Forthekill:

Most of the .sec file formats are parseable with a regular expression and a bit of decision logic. Anchor the search on the UWP and (perhaps) the PBG, and the rest will fall into place. Hex number is distinguished from world name by its four digits (and woe to anyone who gives their world a four-digit name... I've never seen one, though there's no rule against it; the rule here is to worry about the 0.0001% case only when it crops up).

My biggest problem is in using more than one tool with my .sec files, and the fact that I need to transform the .sec from one style to another to use the same data, or I need to maintain multiple copies every time I make a change.

As for XML... it's overkill when representing Traveller data, except when your application uses it. It is most definitely overkill for the general representation of .sec files (again, except where your application makes use of it).

I agree that it is overkill for the simple data that is contained in .sec files, however it may be useful in storing additional, more comprehensive data that programs such as Galactic, H&E, and Universe allow you to store.

XML is not easy to read compared to the simple columnar .sec data files, nor is it as easy to deal with programmatically, but it is at least better labelled and, if it were used to store more data, might be easier to process in the long run.

In short, XML must be targeted to your application. Acceptance of an existing, applied schema will take time, and may never happen.

I agree. It is the widespread acceptance and use of an XML format that would be the stumbling block to such a thing, and perhaps Universe 2, when released, will be the something that could guide that.

coliver988 · Mar 25, 2009

forthekill said:
My biggest problem is in using more than one tool with my .sec files, and the fact that I need to transform the .sec from one style to another to use the same data, or I need to maintain multiple copies every time I make a change.

See the regex in my earlier post; it seems to work for multiple variations of the .sec files.

I agree that it is overkill for the simple data that is contained in .sec files, however it may be useful in storing additional, more comprehensive data that programs such as Galactic, H&E, and Universe allow you to store.

I use it for expanding out stuff (such as starports as per some T5 discussions). I can add data to the file and it won't hurt older versions of the application. For that, XML is really good.

Gadrin · Mar 25, 2009

I'm not sure XML is necessary. But it might be a fun project.

I've adopted Travellermap.com as my default. I do recall I had
to do some finagling to get the fixed-width import right as you
mentioned every other data block is a different format.

Building tables for MySQL 5.0.67 I've come up with:

Code:

DROP TABLE IF EXISTS `spinward_marches`;
SET @saved_cs_client     = @@character_set_client;
SET character_set_client = utf8;
CREATE TABLE `spinward_marches` (
  `system` varchar(26) default NULL,
  `hex` varchar(5) default NULL,
  `uwp` varchar(9) default NULL,
  `bases` varchar(2) default NULL,
  `notes` varchar(28) default NULL,
  `pop` varchar(4) default NULL,
  `align` varchar(3) default NULL,
  `subsector` varchar(100) default NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
SET character_set_client = @saved_cs_client;

I made it a bit "roomy" on purpose.

Had to write a special script to take a start hex (top left of the subsector)
and then build a SQL query to get all the worlds in that subsector so I
could name them correctly. I used another language for that. So now I can
do queries on subsector = "District 268" and get filtered results.
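
For anyone wanting to do the same grouping outside the database, here is a small Python sketch. It is not the script described above; it assumes the usual 32 x 40 hex sector split into sixteen 8 x 10 subsectors lettered A to P, and the function names are made up.

```python
SUBSECTOR_LETTERS = "ABCDEFGHIJKLMNOP"


def subsector_letter(hex_number):
    """Map a sector hex such as '0101' to its subsector letter A-P."""
    col, row = int(hex_number[:2]), int(hex_number[2:])
    if not (1 <= col <= 32 and 1 <= row <= 40):
        raise ValueError(f"hex {hex_number!r} is outside a 32x40 sector")
    # Subsectors are 8 hexes wide and 10 tall, four per row of the sector.
    index = ((row - 1) // 10) * 4 + (col - 1) // 8
    return SUBSECTOR_LETTERS[index]


def subsector_bounds(letter):
    """Return (top-left hex, bottom-right hex) for a subsector letter."""
    index = SUBSECTOR_LETTERS.index(letter)
    col0, row0 = (index % 4) * 8 + 1, (index // 4) * 10 + 1
    return f"{col0:02d}{row0:02d}", f"{col0 + 7:02d}{row0 + 9:02d}"


print(subsector_letter("0101"), subsector_bounds("A"))
```

The bounds are enough to build a range query per subsector, or the letter can be stored next to each world once and looked up against a table of subsector names.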

I also ended up writing a MySQL Stored Procedure to create UWP details
and then another to take an entire subsector and spit out UWP details for
each world.

Code:

+-------------------------------------------------------------------------------------------------------------------------------+
| System Details                                                                                                                |
+-------------------------------------------------------------------------------------------------------------------------------+
| Tulena System Information - D7463007                                                                                          | 
| ------------------------------------                                                                                          | 
| Starport:      D - Poor Quality Installation. Only unrefined fuel is available. No repair or shipyard facilities are present. | 
| Size:          7 - 7000 miles (11200 km).                                                                                     | 
| Atmosphere:    4 - Thin, tainted.                                                                                             | 
| Hydrographics: 6 - 60% water.                                                                                                 | 
| Population:    3 - Thousands of inhabitants.                                                                                  | 
| Government:    0 - No government.                                                                                             | 
| Law Level:     0 - No laws affecting weapons possession or ownership.                                                         | 
| Tech Level:    7 - circa 1970 to 1989.                                                                                        | 
+-------------------------------------------------------------------------------------------------------------------------------+
10 rows in set (2.39 sec)
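
The first step of such a report, splitting the UWP into its digits, can be sketched in a few lines of Python. This is not the stored procedure above; it assumes the common extended-hex convention in which the letters I and O are skipped, and the names are invented.

```python
# Extended hex as commonly used for Traveller codes: digits, then
# letters with I and O skipped (an assumption about the data at hand).
EHEX = "0123456789ABCDEFGHJKLMNPQRSTUVWXYZ"
UWP_FIELDS = ("size", "atmosphere", "hydrographics", "population",
              "government", "law_level")


def decode_uwp(uwp):
    """Split a UWP like 'D746300-7' into starport plus numeric fields."""
    body, _, tech = uwp.partition("-")
    if len(body) != 7 or len(tech) != 1:
        raise ValueError(f"not a UWP: {uwp!r}")
    result = {"starport": body[0]}
    for field, digit in zip(UWP_FIELDS, body[1:]):
        # str.index raises ValueError for characters outside EHEX.
        result[field] = EHEX.index(digit.upper())
    result["tech_level"] = EHEX.index(tech.upper())
    return result


print(decode_uwp("D746300-7"))
```

Descriptions such as "Thin, tainted" would then be plain lookups keyed on these numbers.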


robject · Mar 25, 2009

forthekill said:
My biggest problem is in using more than one tool with my .sec files, and the fact that I need to transform the .sec from one style to another to use the same data, or I need to maintain multiple copies every time I make a change.

Yes. What I've done is write a parser that reads in anything and spits out the format I use, much like you've already discussed.

I agree that it is overkill for the simple data that is contained in .sec files, however it may be useful in storing additional, more comprehensive data that programs such as Galactic, H&E, and Universe allow you to store.

Quite so.

XML is not easy to read compared to the simple columnar .sec data files, nor is it as easy to deal with programmatically, but it is at least better labelled and, if it were used to store more data, might be easier to process in the long run.

Yep. And most languages support it. I prefer YAML for readability and parse speed, but that's beside the point and makes no nevermind.

inexorabletash · Mar 25, 2009

Ooh boy, one of my favorite topics.

First off, I just put this page up a few days ago:

http://www.travellermap.com/formats.htm

This is the portion of MWM's article from Challenge #26 that describes the Standard UPP Format for sector data interchange. I figure this represents the "1.0" standard. It's quite retro - check it out! Note that the two halves of the article (data format, data interpretation) don't exactly match.

I would retroactively call the format used for the GEnie uploads, which forthekill describes as the "standard .sec format", the "2.0" standard. The differences are: add Name, replace G with PBG, drop Tradeworld (app-specific), drop Explored? (app-specific), and reorganize fields (alleg/zone/gg to zone/pbg/alleg/stellar). Note that the field sizes themselves can be determined by parsing the header, and this allows for extensibility (field lengths can be changed or added), although the format doesn't strictly dictate how to identify the start or end of the header or where the data starts.
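
A minimal Python sketch of header-driven parsing follows. It assumes the header lists its fields in the "start-end: Name" style shown earlier in this thread, one per line; how to tell where the header stops is left open, as noted above. The function names and sample data are invented.

```python
import re

HEADER_LINE = re.compile(r"^\s*(\d+)(?:-(\d+))?\s*:\s*(.+?)\s*$")


def read_layout(lines):
    """Collect {field name: (start, end)} from 'start-end: Name' lines."""
    layout = {}
    for line in lines:
        match = HEADER_LINE.match(line)
        if match:
            start = int(match.group(1))
            end = int(match.group(2) or start)   # '31: Bases' is one column
            layout[match.group(3)] = (start, end)
    return layout


def slice_fields(line, layout):
    """Cut a data line into fields using 1-based inclusive columns."""
    return {name: line[start - 1:end].strip()
            for name, (start, end) in layout.items()}


header = ["1-14: Name", "15-18: HexNbr", "20-28: UWP", "31: Bases"]
layout = read_layout(header)
data = "Exampleworld".ljust(14) + "0101" + " " + "A123456-7" + "  " + "N"
print(slice_fields(data, layout))
```

A file with a wider Name field then needs no code change, only a different header.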

But as noted, it all goes higgledy-piggledy after that. There are several other variations as well:

* Inconsistent delimiting of world names from hex location (sometimes '.' terminates a world name, sometimes no space at all)
* Sometimes stellar data is collapsed (G2V)
* Funky stellar data like * and []
* Some funky allegiances like Im[V or JP/Jr
* Inclusion of LRX fields
* Era-specific zones (e.g. B in TNE data); sometimes no space delimiter
* Novel comments like O:1234 ("owned by hex 1234")
* Routes with $ prefix

And so on. Jeepers!

I used to try and parse everything, but eventually gave up and cleaned up the data I cared about. I still use a regex; as robject points out, you can match on the Hex, UWP, and PBG pretty reliably. The one case you really need to watch out for is:

Code:

NavalAndScoutBase    0101 A123456-7 A                         123 IM
AmberZone            0101 A123456-7                         A 123 IM

That's why the regex coliver988 quotes restricts the codes to starting within a few characters of the UWP.
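
The same positional rule can be sketched on its own in Python: a lone character within a couple of columns of the UWP is read as a base, and a lone character just before the PBG is read as a zone. The two-column threshold and all names here are assumptions for illustration; this is not the quoted regex.

```python
import re

UWP_RE = re.compile(r"\b\w{7}-\w\b")
PBG_RE = re.compile(r"\s(\d[0-9A-F]{2})\s+\S")


def split_base_and_zone(line, max_offset=2):
    """Return (base, zone) for a line that contains a UWP and a PBG."""
    uwp = UWP_RE.search(line)
    pbg = PBG_RE.search(line, uwp.end()) if uwp else None
    if uwp is None or pbg is None:
        raise ValueError("line has no UWP/PBG anchors")
    between = line[uwp.end():pbg.start()]
    tokens = [(m.start(), m.group()) for m in re.finditer(r"\S+", between)]
    base = zone = ""
    # A single character hugging the UWP is taken to be the base code.
    if tokens and len(tokens[0][1]) == 1 and tokens[0][0] <= max_offset:
        base = tokens.pop(0)[1]
    # A single character left at the far end is taken to be the zone.
    if tokens and len(tokens[-1][1]) == 1:
        zone = tokens.pop()[1]
    return base, zone


line_base = "NavalAndScoutBase    0101 A123456-7 A" + " " * 25 + "123 IM"
line_zone = "AmberZone            0101 A123456-7" + " " * 25 + "A 123 IM"
print(split_base_and_zone(line_base), split_base_and_zone(line_zone))
```

A one-letter trade code or comment in the last position would fool this rule, so it only suits data that has already been cleaned up.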

BTW, the only reason TravellerMap.com doesn't spew out stellar data is that the map doesn't consume it, so I haven't made a robust parser and/or cleaned up the data. I'll do that at some point.

TravellerMap.com doesn't care about field lengths at all; if I was feeling clever I'd have coded it to spit out the fields with dynamic length (just enough to fit the longest name).

forthekill · Mar 27, 2009

The variable lengths wouldn't be an issue if the data was just somehow delimited with something other than spaces.
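
As a sketch of what that could look like, here is a tab-delimited round trip in Python using the standard csv module. The choice of tabs, the header row and the field names are assumptions for illustration; none of the tools discussed here is known to read this.

```python
import csv
import io

FIELDS = ["name", "hex", "uwp", "bases", "codes", "zone", "pbg",
          "allegiance", "stellar"]


def write_worlds(worlds, stream):
    """Write world dicts as tab-separated text with a header row."""
    writer = csv.DictWriter(stream, fieldnames=FIELDS, delimiter="\t",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(worlds)


def read_worlds(stream):
    """Read the same tab-separated text back into a list of dicts."""
    return [dict(row) for row in csv.DictReader(stream, delimiter="\t")]


worlds = [{"name": "A Very Long Example World Name", "hex": "0101",
           "uwp": "A123456-7", "bases": "N", "codes": "Ni Po", "zone": "",
           "pbg": "123", "allegiance": "Im", "stellar": "G2 V"}]
buffer = io.StringIO()
write_worlds(worlds, buffer)
buffer.seek(0)
print(read_worlds(buffer) == worlds)
```

Names of any length and blank fields both survive, which is exactly what the space-padded layouts struggle with.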

inexorabletash, your post, and that page, is pretty informative.

I like the idea of "versioning" the different .sec formats to make it easier to reference them, even if it is an informal label. It might make it easier to get people writing or maintaining Traveller software to decide to support one version or another, or many, or at least for others to understand exactly what format the software expects or spits out.

Part of the reason this is of interest to me is that I've been rewriting the old gensec software (in C++ rather than C) so that it is more modular (and adding a couple additional features), and it will allow you to choose the output from among the most common .sec formats.

The other reason is that I don't want to have to reprogram every tool I use to add support for various formats, so I'd like to write a simple converter that:

a. is independent of other tools
b. allows non-programmers to convert between .sec formats for use with whatever tools they need
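
The core of such a converter can be quite small if each format is described as a table of column ranges. The Python sketch below uses the gensec and travellermap.com columns listed earlier in this thread; the function names are invented, and values that do not fit the target width are simply truncated.

```python
# Column ranges (1-based, inclusive) as listed earlier in this thread.
LAYOUTS = {
    "gensec": {"name": (1, 18), "hex": (20, 23), "uwp": (25, 33),
               "bases": (35, 35), "codes": (37, 51), "pbg": (53, 55),
               "allegiance": (57, 58), "zone": (60, 60)},
    "travellermap": {"name": (1, 25), "hex": (27, 30), "uwp": (32, 40),
                     "bases": (42, 42), "codes": (44, 68), "zone": (70, 70),
                     "pbg": (72, 74), "allegiance": (76, 77),
                     "stellar": (79, 98)},
}


def parse(line, layout):
    """Slice a fixed-width line into a dict of stripped values."""
    return {f: line[s - 1:e].strip() for f, (s, e) in layout.items()}


def render(world, layout):
    """Lay a world dict out in the given columns (left-aligned)."""
    width = max(end for _start, end in layout.values())
    chars = [" "] * width
    for field, (start, end) in layout.items():
        text = world.get(field, "")[:end - start + 1]   # truncate to fit
        chars[start - 1:start - 1 + len(text)] = text
    return "".join(chars).rstrip()


def convert(line, source, target):
    """Re-emit one line from the source layout in the target layout."""
    return render(parse(line, LAYOUTS[source]), LAYOUTS[target])


world = {"name": "Exampleworld", "hex": "0101", "uwp": "A123456-7",
         "bases": "N", "codes": "Ni Po", "pbg": "123",
         "allegiance": "Im", "zone": "A"}
gensec_line = render(world, LAYOUTS["gensec"])
print(convert(gensec_line, "gensec", "travellermap"))
```

Adding another format is then a matter of adding one more entry to the table, which is the part a non-programmer could maintain.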

I would like to be able to label these formats by something as simple as a version number, so maybe we can come up with a good system.

forthekill · Mar 27, 2009

inexorabletash,

I noticed that page you link to does not say what bytes the stellar data resides in. It lists the tables for the stellar descriptions, but that's it.

Is something missing?

Space Hamster · Mar 27, 2009

Fixed-width fields are the old way of doing file formats. XML is not overkill. It is the logical successor to fixed-width fields. Most modern programming languages have built-in support for reading and writing XML files. XML is by far easier to read than a fixed-width file, as you know what values you're working with because it has the information wrapping the values.

Fixed width fields are so 80’s. Get with the 2000’s and embrace variable length fields.

inexorabletash · Mar 27, 2009

forthekill said:
I noticed that page you link to does not say what bytes the stellar data resides in. It lists the tables for the stellar descriptions, but that's it.

Is something missing?

Nope - but the page is from MWM's Challenge #26 article, which itself has two halves:

* Description of the file format (used in Trader, for the Apple II, and intended as an interchange format)
* Compilation of what the fields mean, from the various sources.

As noted on the page, these don't agree - the tables list nIn and nAg whereas the file format stipulates 2-letter codes (Ni and Na). Stellar data is not found in the file format.

Interestingly, in the first part of the article (not included, I could type it up), MWM laments the paucity of good Traveller software available at the time, and suggests that a lack of a standard file format is one cause.

Regardless of what standard is suggested for the future, there is a lot of legacy data out there. It seems a worthwhile exercise to document that, so that tools that produce and consume some future standard format can also interoperate with old data. (This is analogous to the HTML 3.2 standard, which was basically "what do all of the current browsers do?" rather than attempting to forge any new ground.)

coliver988 · Mar 28, 2009

Space Hamster said:
Fixed-width fields are the old way of doing file formats. XML is not overkill. It is the logical successor to fixed-width fields. Most modern programming languages have built-in support for reading and writing XML files. XML is by far easier to read than a fixed-width file, as you know what values you're working with because it has the information wrapping the values.

Fixed width fields are so 80’s. Get with the 2000’s and embrace variable length fields.

Depends. I work with banks, credit card companies, and telcos for data exchange. ALL use fixed-width fields. I don't see that changing anytime soon, and I've been doing this since 1986. Inertia has a lot going for it.

Gadrin · Mar 28, 2009

inexorabletash said:
TravellerMap.com doesn't care about field lengths at all; if I was feeling clever I'd have coded it to spit out the fields with dynamic length (just enough to fit the longest name)

Yeah, I thought so.

I did 3 imports by hand and not script last summer and I think they were
all different: Spinward Marches, Far Frontiers and Reaver's Deep.


Gadrin · Mar 28, 2009

coliver988 said:
Depends. I work with banks, credit card companies, and telcos for data exchange. ALL use fixed-width fields. I don't see that changing anytime soon, and I've been doing this since 1986. Inertia has a lot going for it.

Yeah, I did a parser for a credit card strip reader and all that info was
fixed width data.


inexorabletash · Mar 28, 2009

Gadrin said:
I did 3 imports by hand and not script last summer and I think they were
all different: Spinward Marches, Far Frontiers and Reaver's Deep.

Hrm... well, looking at the code (I'm rigging up stellar data), it appears that I'm using a fixed-width format right now for the SEC.aspx output. You may have been clicking on the links at the bottom of the main map page (credits section) which just give you the source data (often from the original publisher). Those are definitely random.

The SEC.aspx output format is:

Code:

"{0,-25:name} {0,4:hex} {0,9:uwp} {0,1:base} {0,-25:codes} {0,1:zone} {0,3:pbg} {0,2:allegiance} {0,-20:stellar}";

(The stellar chunk isn't live yet)
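
For readers more at home in Python, the same widths and alignments could be written roughly as below. This only mirrors the format string quoted above; it is not the site's code, and the function name and sample world are invented.

```python
def format_sec_line(world):
    """Render one world using the widths of the format string above."""
    # '<' pads on the right (left-aligned), '>' pads on the left.
    return ("{name:<25} {hex:>4} {uwp:>9} {base:>1} {codes:<25} "
            "{zone:>1} {pbg:>3} {allegiance:>2} {stellar:<20}").format(**world)


sample = {"name": "Exampleworld", "hex": "0101", "uwp": "A123456-7",
          "base": "N", "codes": "Ni Po", "zone": "A", "pbg": "123",
          "allegiance": "Im", "stellar": "G2 V"}
line = format_sec_line(sample)
print(repr(line))
```

With single spaces between the fields, this puts the hex in columns 27-30 and the stellar data in columns 79-98.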

I also plan to add a "v2" compliant header, i.e. names of the fields and their lengths.

aramis · Mar 28, 2009

coliver988 said:
Depends. I work with banks, credit card companies, and telcos for data exchange. ALL use fixed-width fields. I don't see that changing anytime soon, and I've been doing this since 1986. Inertia has a lot going for it.

Inertia and clarity of coding...

Then again, I recently (last year) saw an ad for a bank looking for a person familiar with COBOL and C++ to rework one of their programs into C++...
