
System Data Exchange Structure

robject

Regardless of the format (here I'm using XML), structure is important for data exchange.

Also important is intelligibility of course. Does it make sense? If you look at a structure 6 months from now and go "huh??" then it's a failure. If others look at your structure and wonder what you've been smoking, then it's a failure.

So, my current preference is here. The reason I use XML here is that most people speak it. We can fight over interchange formats later, elsewhere. Internally there's no difference, and popular formats have readers and writers that operate directly on internal memory structures, so it's a non-issue as far as I'm concerned.

I'll insert comments.

Code:
[color=red]
System data exchange (SYS) informal proposal version 0.6
[/color]

[color=blue]
# The outer element for a system contains the data extracted 
# from the "UWP line", aka a SEC 2.0 line.  I'm not going to 
# bother including derivable trade codes here.  Special trade 
# codes ought to go in their own tags under system, perhaps 
# with a reference.
#
# Feel free to add (and ignore) "overview" text.
[/color]

  <system allegiance="Na" bases="B" gasGiants="4" hex="0101" mainworld="Fetters Alpha"
          planetoids="2" rockballs="4" stars="2" uwp="B373665-B" zone="R">

     <overview>This is a boring system.</overview>

[color=red]
# [b]NOTE[/b]: Updated to include "milieu".  If the data change is
# that important, it will affect the entire system.
#
# I consider this element optional, and perhaps even unneeded,
# assuming a program is serving data from an agreed-upon milieu
# to another program.
#
# In other words, milieu might be beyond the scope of this structure...
[/color] 

     <milieu>M1100</milieu>

[color=blue]
#  I decided to group worlds (and close binary companions) 
# by the star they orbit.  Is this a bad idea?
[/color]

    <primary type="M8 III">
      <companion type="F D" />
      <worlds>
        <world>
          <name>Fetters Alpha</name>

[color=blue]
# Under the UWP is a detailed breakdown for each element --
# if you have data to share, that is.
# Otherwise, feel free to ignore it.  I suppose you don't even 
# need the "code" attributes.  So consider it all being optional 
# but potentially useful.
#
# The descriptive text under the UWP elements is taken from 
# Traveller's standard tables.
#
# The population text is in western-human-usable format, 
# with commas separating thousands places.  Is this acceptable?
#
# And feel free to add WorldBuilder elements and comment text, 
# too, such as "vitals", "starport details", "physical description" 
# and "social description".
[/color]

          <UWP code="B373665-B">
            <atmosphere code="7" />
            <government code="6" />
            <hydrographics code="3" />
            <lawLevel code="5" />
            <population code="6" popMult="9" >9,000,000</population>
            <size code="3" />
            <starport code="B" />
            <techLevel code="B">11</techLevel>
          </UWP>
          <allegiance code="Na">Non-aligned</allegiance>

[color=blue]
#  The bases are broken down in a manner similar to the UWP.
[/color]

          <bases code="B">
            <navy>Naval Base</navy>
            <scouts>Scout Way Station</scouts>
          </bases>

[color=blue]
# Orbit and trade codes have the "code" separated from the "value", 
# just like with the UWP and Base code.
[/color]

          <orbit code="5">2.5 AU</orbit>
          <tradeCodes>
            <tradeCode code="Ni">Non-industrial</tradeCode>
            <tradeCode code="Ri" />
          </tradeCodes>
          <zone code="R">Interdicted</zone>
        </world>
        <world>
          <name>Fetters Beta</name>
          <UWP code="A888899-C">
            <atmosphere code="8">
              <density>Thick</density>
              <taint>None</taint>
            </atmosphere>
            <government code="9" />
            <hydrographics code="8">80%</hydrographics>
            <lawLevel code="9" />
            <population code="8" popMult="4">400,000,000</population>
            <size code="8" />
            <starport code="A" />
            <techLevel code="C">12</techLevel>
          </UWP>
          <allegiance code="Im">Imperial</allegiance>
          <bases code="A">
            <navy>Naval Base</navy>
            <scouts>Scout Base</scouts>
          </bases>
          <orbit>6</orbit>
          <tradeCodes>
            <tradeCode code="Ag">Agricultural</tradeCode>
            <tradeCode code="Ri">Rich</tradeCode>
          </tradeCodes>
        </world>
      </worlds>
    </primary>
    <secondary type="G D" />
  </system>
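
To try reading the attributed structure above, here is a minimal Python sketch using the standard library's `xml.etree.ElementTree`. It embeds a trimmed copy of the sample (forum annotations removed) and groups worlds by the star they orbit. The function name `worlds_by_star` is made up for illustration and is not part of the proposal.

```python
import xml.etree.ElementTree as ET

# A trimmed copy of the proposal's sample, without the forum annotations.
SAMPLE = """
<system hex="0101" mainworld="Fetters Alpha" uwp="B373665-B" zone="R">
  <primary type="M8 III">
    <companion type="F D" />
    <worlds>
      <world>
        <name>Fetters Alpha</name>
        <UWP code="B373665-B">
          <starport code="B" />
          <techLevel code="B">11</techLevel>
        </UWP>
        <orbit code="5">2.5 AU</orbit>
      </world>
      <world>
        <name>Fetters Beta</name>
        <UWP code="A888899-C">
          <starport code="A" />
          <techLevel code="C">12</techLevel>
        </UWP>
        <orbit>6</orbit>
      </world>
    </worlds>
  </primary>
  <secondary type="G D" />
</system>
"""


def worlds_by_star(system):
    """Map each star's type to the (name, UWP code) pairs of its worlds."""
    result = {}
    for star in system:
        if star.tag not in ("primary", "secondary"):
            continue  # skip overview, milieu, and anything else
        result[star.get("type")] = [
            (world.findtext("name"), world.find("UWP").get("code"))
            for world in star.iter("world")
        ]
    return result


system = ET.fromstring(SAMPLE)
grouped = worlds_by_star(system)
print(grouped)
```

Because the reader looks elements up by tag, the optional pieces (overview, milieu, the per-digit breakdown) can be present or absent without changing this code.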


Or, here it is "inverted", with the source data as content:

Code:
<system>

[color=blue]
# Basic metadata is tagged and available.  Easy to get at.
[/color]

    <mainworld>Fetters Alpha</mainworld>
    <hex>0101</hex>
    <uwp>B373665-B</uwp>
    <bases>B</bases>
    <zone>R</zone>
    <planetoids>2</planetoids>
    <gasGiants>4</gasGiants>
    <rockballs>4</rockballs>
    <allegiance>Na</allegiance>
    <stars>2</stars>

    <primary>
      <companion>F D</companion>
      <orbits>


[color=blue]
# World data is a bit flatter: the UWP digits 
# are on the same level as the UWP itself.
[/color]

        <world>
          <name>Fetters Alpha</name>
          <allegiance>Na</allegiance>
          <atmosphere>7</atmosphere>
          <bases>B</bases>
          <codes>
            <code>Ni</code>
            <code>Ri</code>
          </codes>
          <government>6</government>

[color=blue]
# Derived, supplemental information is 
# in attributes.  The core data is the main content.
# How do you feel about that?  Worried?  
# Seems less flexible.  Is it?
[/color]

          <hydrographics display="30%">3</hydrographics>
          <lawLevel>5</lawLevel>
          <orbit>5</orbit>
          <popMult>4</popMult>
          <population display="4,000,000">6</population>
          <size>3</size>
          <starport>B</starport>

[color=blue]
# Still have a "decoded" tech level.
[/color]

          <techLevel>11</techLevel>
          <uwp>B373665-B</uwp>
          <zone>R</zone>
        </world>

        <world>
          <name>Fetters Beta</name>
          <orbit>6</orbit>
        </world>

      </orbits>
      <star>M8 III</star>
    </primary>

  </system>
 
Whoa! I came back to the thread to comment and you had updated the code with the comments! It was surprising!

Anyway, I like it - I want to incorporate it, maybe generate a schema file from that. Whenever I get free time, that is. Which looks to be sometime in 2011. ;)
 
:) Thank you. That's an encouraging sign: maybe some of this is right.

As you implied, though, the proof is in using it. That's the best way to shake out the bugs.
 
# Under the UWP is a detailed breakdown for each element --
# if you have data to share, that is.
# Otherwise, feel free to ignore it. I suppose you don't even
# need the "code" attributes. So consider it all being optional
# but potentially useful.

I think I like having the separate code values for each UWP attribute. IIRC, I split them out so that it would make sorting a bit easier once the sector data was laid out in a datagrid. I didn't have to parse out any values at runtime from the UWP string as they were already represented.
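
For comparison, this is roughly the runtime parsing that pre-split codes let a consumer skip. It is a small Python sketch, assuming the usual UWP layout of seven characters, a hyphen, and the tech level; `split_uwp` and `UWP_FIELDS` are illustrative names only.

```python
from typing import Dict

# Field order within a UWP string such as "B373665-B".
UWP_FIELDS = ("starport", "size", "atmosphere", "hydrographics",
              "population", "government", "lawLevel")


def split_uwp(uwp: str) -> Dict[str, str]:
    """Split a UWP into one code per field, tech level last."""
    digits, sep, tech = uwp.strip().partition("-")
    if not sep or len(digits) != len(UWP_FIELDS) or len(tech) != 1:
        raise ValueError(f"not a UWP: {uwp!r}")
    fields = dict(zip(UWP_FIELDS, digits))
    fields["techLevel"] = tech
    return fields


print(split_uwp("B373665-B"))
```

It is not much code, but every consumer would have to repeat it before sorting or filtering, which is the case for shipping the split values.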

The only thing I'm not sure about and have to think over is the atmospheric breakdown. I'm not sure whether it is needed there or belongs wherever the data is being consumed. For physical attributes there is little, if any, variation between classification meanings, so the data consumer will (or should) always know that atmosphere 7 is standard tainted, for example. A Government 7, by contrast, is Balkanized in most Traveller versions, unless you are running a TNE campaign in the wilds, where it is a Mystic Dictatorship.
 
Hi there,

I would recommend not including the values. Just provide the codes, which will reduce the size of your file by quite a bit, and provide a separate file with the values in it. Normalizing your data this way cuts the number of characters needed to represent it. If the data from the link http://www.travellermap.com/formats.htm you provided in the other thread was included in another file, you could link them together in code.
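
One way to picture the "codes only, values elsewhere" idea is a Python sketch like the one below. The lookup tables hold only the code and text pairs that already appear in the sample at the top of the thread; in practice they would live in their own file. The names `describe` and `TABLES` are hypothetical.

```python
# Lookup tables, which in practice would be loaded from a separate file.
TRADE_CODES = {"Ni": "Non-industrial", "Ri": "Rich", "Ag": "Agricultural"}
ALLEGIANCES = {"Na": "Non-aligned", "Im": "Imperial"}
ZONES = {"R": "Interdicted"}

# Which table decodes which field of a world record.
TABLES = {"allegiance": ALLEGIANCES, "zone": ZONES, "tradeCodes": TRADE_CODES}


def describe(record, tables):
    """Return a copy of record with each known code paired with its text."""
    out = {}
    for field, value in record.items():
        table = tables.get(field)
        if table is None:
            out[field] = value  # no table for this field: pass it through
        elif isinstance(value, list):
            out[field] = [(code, table.get(code, "?")) for code in value]
        else:
            out[field] = (value, table.get(value, "?"))
    return out


# The exchanged record carries codes only.
record = {"name": "Fetters Alpha", "allegiance": "Na", "zone": "R",
          "tradeCodes": ["Ni", "Ri"]}
print(describe(record, TABLES))
```

The trade-off is that both sides must agree on the tables, which is the point raised elsewhere in the thread about code meanings differing between Traveller versions.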
 
One observation on the data formats: XML and YAML are both generating more overhead per record than the carried data. From the above, it's looking like 2:1 or worse...
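
The overhead can be measured rather than eyeballed. Here is a rough Python sketch that counts the characters of actual data (element text and attribute values) in one small record and treats everything else, including indentation, as overhead. The record is a cut-down world built from values in the sample above; `payload_chars` is an illustrative name.

```python
import xml.etree.ElementTree as ET

RECORD = """<world>
  <name>Fetters Alpha</name>
  <uwp>B373665-B</uwp>
  <bases>B</bases>
  <zone>R</zone>
  <allegiance>Na</allegiance>
</world>"""


def payload_chars(xml_text):
    """Count characters of element text and attribute values, ignoring markup."""
    root = ET.fromstring(xml_text)
    total = 0
    for element in root.iter():
        total += len((element.text or "").strip())
        total += sum(len(value) for value in element.attrib.values())
    return total


data = payload_chars(RECORD)
overhead = len(RECORD) - data
print(f"{data} data chars, {overhead} overhead chars, "
      f"ratio {overhead / data:.1f}:1")
```

The exact ratio depends heavily on tag names, indentation, and how short the values are, so treat any single figure as indicative only.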
 
That's the problem with these -ML formats, they are very inefficient. It comes from this truly bizarre idea that the data should include its own format definition. Every field in every record includes both data and metadata. Horrible, ain't it. However, it can be very powerful when used appropriately. The real question is: is it appropriate here?
 
Given the need to parse the data encoded in the above examples before machine use, not IMO.

And, they are not readily human readable, either, due to tag clutter and low visual density of information.

In short, it's a bloated format that isn't suitable for interim storage or ready use, as it's going to be processor intensive to open and read; its only suitability is as an interchange format, and even then, it's bloated, given that the normative mode for generating the information is already encoded highly.
 
I disagree with the readability part of this. The tags in XML let you know what the fields are. I do admit that optional fields can confuse a user when they are missing from the file being looked at and then show up in a later one. But optional fields are the power of XML: you can add fields to your file without breaking older readers, because readers read by tag, not position. XML also shows how fields relate to each other. If one field is contained in a collection of other fields, it is easy to see.

With today’s hardware, multi-gigabyte files are doable, and multi-megabyte files are a drop in the bucket. There is stuff you can do to reduce data size, like normalizing your data. You could also reduce your tags to a set of codes, but then you lose readability.

Fixed width fields are hard to read with extreme character clutter. Could you imagine what these posts would look like without spaces and paragraph blocks?
 
Maybe look at all recorded space.

The TravellerMap Guy could give us a better idea.

For the entire known space, meaning all sectors that have official data, how big is the database?

It's probably not that massive, and as long as you're not loading everything into memory at once...

Let's see: my MySQL database for 3 sectors and some other data is about 396KB.

 
Think of it as an RSS feed for applications. This isn't about XML, but rather about a structure that lets applications exchange data.

The proof of the pudding, as they say, is in the eating.



Back during another time when COTI talked about a data standard, we had back-and-forthed over a text-based representation of system data, basically containing the kind of data we have here.

One solution was to standardize text formats for the data, including headers. It was human readable and seemed fine.

But, it boiled down to the same structure to the data. Maybe me using XML is distracting you from this, or maybe I'm once again being confused by internal storage versus transmission.

Whether you use XML or YAML or flat files, you still have to represent the same data. That's really what I'm trying to nail down, because once you have that, you can transmit it any way you like. Most applications will prefer to speak XML, hence my example. But that shouldn't matter to anyone else, and in fact to worry about that is a red herring.
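
To make the "same data, any transport" point concrete, here is a small Python sketch: one in-memory structure and three interchangeable writers. The `World` class has just three fields for brevity, and the flat layout is made up for the example (it is not the SEC format).

```python
import json
import xml.etree.ElementTree as ET
from dataclasses import dataclass, asdict


@dataclass
class World:
    """The agreed-upon structure; the wire format is a separate decision."""
    name: str
    uwp: str
    orbit: int


def to_xml(world: World) -> str:
    element = ET.Element("world")
    for key, value in asdict(world).items():
        ET.SubElement(element, key).text = str(value)
    return ET.tostring(element, encoding="unicode")


def to_json(world: World) -> str:
    return json.dumps(asdict(world))


def to_flat(world: World) -> str:
    # Illustrative fixed-width layout only, not a real sector file format.
    return f"{world.name:<15}{world.uwp} {world.orbit}"


alpha = World(name="Fetters Alpha", uwp="B373665-B", orbit=5)
for writer in (to_xml, to_json, to_flat):
    print(writer(alpha))
```

Once the fields and their nesting are settled, adding another writer (YAML, say) touches nothing else.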
 
XML can become hard to read when attributes are used, which is the tendency of most people, even though most best practices agree that attributes should be used very rarely.

Advantages:

- It is plain text.
- Every field is labelled.
- It is collected in a logical tree.
- There are tools to allow easy parsing.
- It can be easily added to.

Sure XML files are going to be bigger than straight data because of the tags, but they are not unmanageable. We aren't using 286 machines anymore. How fast do you need to process sector data anyway?

And honestly, if we are talking about multi-milieu data, or system and world data, they really need to be broken up into multiple files anyway.
 
The best part about XML is the extensibility: you can add new parts and, if done correctly, it won't affect previous consumers (it just gets ignored).

For instance, as per another discussion on starports, suppose you want to extend out the starport data. Simply adding a starport tag structure you can do that, and it won't hurt software reading in that format, it just gets ignored. Can't do that with fixed length (well, you may if you just add to the end, and you are reading in a line at a time rather than x bytes at a time, and you are not worried about buffer space...)
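
A quick Python sketch of that behaviour: a reader written against the original tags keeps working when a document gains an extra element. The `<starportDetails>` element and its `<berths>` child are hypothetical, invented purely to stand in for a later extension.

```python
import xml.etree.ElementTree as ET

OLD_DOC = "<world><name>Fetters Alpha</name><starport>B</starport></world>"

# Hypothetical extension: a <starportDetails> element added later.
NEW_DOC = ("<world><name>Fetters Alpha</name><starport>B</starport>"
           "<starportDetails><berths>12</berths></starportDetails></world>")


def read_world(xml_text):
    """An 'old' reader: picks out the tags it knows and ignores the rest."""
    world = ET.fromstring(xml_text)
    return {"name": world.findtext("name"),
            "starport": world.findtext("starport")}


print(read_world(OLD_DOC))
print(read_world(NEW_DOC))
```

This only holds if readers look fields up by name; a reader that walks children by position would break in the same way a fixed-width reader does.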

I'm liking where this is going, BTW. I may revise my XML to the one being generated here once it gets more finalized. So that the 1 person using my trade software can have more options (well, it already reads SEC files and my XML format, but if there was a more standard XML format, I'm all for that)
 
Maybe look at all recorded space.

For the entire known space, meaning all sectors that have official data, how big is the database?

I parsed the data file from The Traveller Map into an XML document. It went from about 1MB plain text to about 9MB in XML format:

Code:
<?xml version="1.0" encoding="utf-8"?>
<Imperium Name="Charted Space">
  <World Sector="eC">
    <HexLocation>0101</HexLocation>
    <Name>Zeycude</Name>
    <UWP>C330698-9</UWP>
    <Starport>C</Starport>
    <Size>3</Size>
    <Atmosphere>3</Atmosphere>
    <Hydrosphere>0</Hydrosphere>
    <Population>6</Population>
    <Government>9</Government>
    <LawLevel>8</LawLevel>
    <TechLevel>9</TechLevel>
    <Bases></Bases>
    <TradeCodes>Na Ni Po De</TradeCodes>
    <TravelZone></TravelZone>
    <PBG>613</PBG>
    <PopMultiplier>6</PopMultiplier>
    <Belts>1</Belts>
    <GasGiants>3</GasGiants>
    <Allegiance>Zh</Allegiance>
    <CargoBaseCost>6900</CargoBaseCost>
    <CargoUCP>Zeycude  C-9 Na Ni Po De Cr6900 Zh</CargoUCP>
  </World>
  <!-- ...remaining worlds omitted... -->
</Imperium>

The reason for the cargo info and lack of stellar info is that I use this in my personal Merchant Prince trading game.

Also, perhaps this is not the most efficient XML since I should probably have the sector as a higher level, but it works for me. (I also have a similar XML file with GDP and Naval Budget calc'd for all the worlds)
 
That's a good example of non-attribute XML. Let's see what the entire structure would look like that way.

We can turn the notation on its head, and have un-attributed XML for the base data, then add derived values as attributes to suit the application. Note the sample use of an optional "display" attribute below.

I've flattened out the structure slightly, placing the UWP digits on the same level as the UWP itself.

I ran it through Perl's parser and emitter, so I'm sorry to say that the fields have become sorted by key alpha.

If this turns out to work better, then I'll modify the top post's example accordingly.

Code:
  <system>

    <allegiance>Na</allegiance>
    <bases>B</bases>
    <gasGiants>4</gasGiants>
    <hex>0101</hex>
    <mainworld>Fetters Alpha</mainworld>
    <planetoids>2</planetoids>

    <primary>
      <companion>F D</companion>
      <orbits>

        <world>
          <name>Fetters Alpha</name>
          <allegiance>Na</allegiance>
          <atmosphere>7</atmosphere>
          <bases>B</bases>
          <codes>
            <code>Ni</code>
            <code>Ri</code>
          </codes>
          <government>6</government>
          <hydrographics display="30%">3</hydrographics>
          <lawLevel>5</lawLevel>
          <orbit>5</orbit>
          <popMult>4</popMult>
          <population display="4,000,000">6</population>
          <size>3</size>
          <starport>B</starport>
          <techLevel>11</techLevel>
          <uwp>B373665-B</uwp>
          <zone>R</zone>
        </world>

        <world>
          <name>Fetters Beta</name>
          <orbit>6</orbit>
        </world>

      </orbits>
      <star>M8 III</star>
    </primary>

    <rockballs>4</rockballs>
    <stars>2</stars>
    <uwp>B373665-B</uwp>
    <zone>R</zone>
  </system>
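
Since the "display" attributes are derived, a consumer could compute them from the base data instead of receiving them. Here is a Python sketch under two assumptions: population is the multiplier times ten to the power of the population code, and hydrographics is the code times 10%. The function names are illustrative, and codes are read as hexadecimal digits, which is only safe up to F.

```python
import xml.etree.ElementTree as ET


def population_display(code: int, pop_mult: int) -> str:
    """Assumed rule: popMult x 10^code, with thousands separators."""
    return f"{pop_mult * 10 ** code:,}"


def hydrographics_display(code: int) -> str:
    """Assumed rule: each step is 10% of surface coverage, capped at 100%."""
    return f"{min(code, 10) * 10}%"


def annotate(world: ET.Element) -> None:
    """Add optional display attributes to a <world> in the flattened layout."""
    population = world.find("population")
    pop_mult = world.findtext("popMult")
    if population is not None and pop_mult is not None:
        population.set("display", population_display(
            int(population.text, 16), int(pop_mult)))
    hydrographics = world.find("hydrographics")
    if hydrographics is not None:
        hydrographics.set("display", hydrographics_display(
            int(hydrographics.text, 16)))


world = ET.fromstring(
    "<world><hydrographics>3</hydrographics>"
    "<popMult>4</popMult><population>6</population></world>")
annotate(world)
print(ET.tostring(world, encoding="unicode"))
```

With the sample values (population code 6, multiplier 4, hydrographics 3) this reproduces the "4,000,000" and "30%" shown in the listing.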
 
I parsed the data file from The Traveller Map into an XML document. It went from about 1MB plain text to about 9MB in XML format:

Interesting.

What does it zip to using a basic zip program ?
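
For a rough feel of how well this kind of XML compresses, here is a Python sketch using the standard `zlib` module on synthetic data. The document is generated (1,000 near-identical worlds that differ only in hex number), so the ratio it prints says nothing certain about the real 9MB file, which has far more varied content.

```python
import zlib

# Synthetic record: only the hex location changes from world to world.
WORLD = ("  <World>\n"
         "    <HexLocation>{hex}</HexLocation>\n"
         "    <UWP>C330698-9</UWP>\n"
         "    <Allegiance>Zh</Allegiance>\n"
         "  </World>\n")


def build_document(count):
    """Build a repetitive test document of `count` worlds as UTF-8 bytes."""
    body = "".join(WORLD.format(hex=f"{i:04d}") for i in range(count))
    return f"<Imperium>\n{body}</Imperium>\n".encode("utf-8")


raw = build_document(1000)
packed = zlib.compress(raw, 9)
print(f"{len(raw)} bytes raw, {len(packed)} bytes compressed")
```

Repeated tag names are exactly what deflate-style compressors handle well, so tag overhead matters much less on the wire than on disk uncompressed.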

I'll send you a PM with my email address; if you're so inclined,
send me a copy of the XML.

I generally only work with MSXML (the old freebie from MSIE). Maybe I
can see how it works with that. I believe it uses x2 RAM for documents
of a given size.

I also have XMLStarlet from Sourceforge and can see if that can work
with it.

Anyhoo, if you feel like it send me a copy of the zip and I'll see how
cumbersome it is to work with on my end, what MySQL thinks of it, etc.

Boy, it's been a while since I worked with XML. I've got the new v6.0.X of MySQL that supposedly can work with XML files and maybe I can give that a shot this weekend or in the coming week. I haven't twittered with MySQL's XML yet.

 
I don't know what I'd do without XML. And with the addition of LINQ and XML Literals in VB it's super easy. No more XPath nightmare...

I've uploaded the XML file to file front, in case anyone else was interested:

http://files.filefront.com/13530623

This is for the 35 sectors of the Imperium.

BTW, I'm sure there are a few missing worlds. In fact I have only 438 worlds in the Marches. Any corrections are welcome.
 
snagged. Gives me an excuse to play with LINQ.

I did notice (once I finally got it to open...) that you have both the UWP and a breakdown of each field:
Code:
    <HexLocation>0101</HexLocation>
    <Name>Zeycude</Name>
    <UWP>C330698-9</UWP>
    <Starport>C</Starport>
    <Size>3</Size>
    <Atmosphere>3</Atmosphere>
    <Hydrosphere>0</Hydrosphere>
    <Population>6</Population>
    <Government>9</Government>
    <LawLevel>8</LawLevel>
    <TechLevel>9</TechLevel>
    <Bases></Bases>
    <TradeCodes>Na Ni Po De</TradeCodes>
    <TravelZone></TravelZone>
    <PBG>613</PBG>
    <PopMultiplier>6</PopMultiplier>
    <Belts>1</Belts>
    <GasGiants>3</GasGiants>
    <Allegiance>Zh</Allegiance>
    <CargoBaseCost>6900</CargoBaseCost>
    <CargoUCP>Zeycude  C-9 Na Ni Po De Cr6900 Zh</CargoUCP>
Since this is a Traveller format, you could probably do away with the breakdown of the actual UWP stats & the trade codes, as the consumer (program) would know them (unless it was GURPS, in which case there are a few differences, but you'd have either a marker in the XML for that or a flag in the pgm to indicate which version). It would reduce that file by 9 lines per world. Anyway, that would reduce the file size by about 57% or so at the minimum.

But that's just me. (The trade pgm I play with has options for classic or T5 & sets the trade codes according to that flag. The T5 parts of that pgm are table-based; the LBB parts are hard-coded. Can't have everything!) My XML is similar but does not break out the stats.

Now to redo the back-end of my pgm to use LINQ sometime! I thought it looked interesting & useful.
 
snagged. Gives me an excuse to play with LINQ.

Since this is a Traveller format, you could probably do away with the breakdown of the actual UWP stats & the trade codes as the consumer (program) would know...

The reason I have the breakdown is to make it easier to sort and filter worlds without having to parse the UWP or some kind of ugly regex. This XML doc is really just a database of worlds.
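
With the breakdown in place, filtering and sorting becomes a few lines. The poster used LINQ; the same idea as a Python sketch with `xml.etree.ElementTree` follows. Zeycude is taken from the file excerpt above, the other two worlds are hypothetical, and the `EHEX` digit string assumes Traveller's extended hex convention of skipping I and O.

```python
import xml.etree.ElementTree as ET

# Assumption: Traveller extended hex, which skips the letters I and O.
EHEX = "0123456789ABCDEFGHJKLMNPQRSTUVWXYZ"

DOC = """<Imperium Name="Charted Space">
  <World><Name>Zeycude</Name><Starport>C</Starport><TechLevel>9</TechLevel></World>
  <World><Name>Hypothetical A</Name><Starport>B</Starport><TechLevel>B</TechLevel></World>
  <World><Name>Hypothetical B</Name><Starport>A</Starport><TechLevel>C</TechLevel></World>
</Imperium>"""


def tech_level(world):
    """Numeric tech level read straight from the split-out element."""
    return EHEX.index(world.findtext("TechLevel"))


def worlds_matching(root, min_tech, starports):
    """Worlds with an acceptable starport and tech level, best tech first."""
    hits = [world for world in root.iter("World")
            if world.findtext("Starport") in starports
            and tech_level(world) >= min_tech]
    return sorted(hits, key=tech_level, reverse=True)


root = ET.fromstring(DOC)
names = [world.findtext("Name")
         for world in worlds_matching(root, 10, {"A", "B"})]
print(names)
```

No UWP string is ever taken apart here, which is the convenience being described.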

If anything the UWP could be removed.

LINQ is awesome for querying data (especially XML).
 