Okay, I think I have stellar data close to ready to go. There are a handful of worlds with bogus data, but it's not too bad.
I ended up writing a recursive descent parser for the star system data. It also captures Malenfant's Stellar Generation notation, which encodes the system structure as a superset of the classic data; several files use that.
Here's the grammar I ended up with:
Code:
// Basic:
// system ::= star ( w star )*
//
// Extended: Malenfant's Stellar Generation
//
// system ::= unit ( w companion )*
// companion ::= near | far
// near ::= unit
// far ::= "[" system "]"
// unit ::= star | pair
// pair ::= "(" star w star ")"
//
// star ::= type tenths w* size main?
// | dwarf
// | unknown
// type ::= "O" | "B" | "A" | "F" | "G" | "K" | "M"
// tenths ::= "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9"
// size ::= "D" | "Ia" | "Ib" | "II" | "III" | "IV" | "V" | "VI" | "VII"
// dwarf ::= "DB" | "DA" | "DF" | "DG" | "DK" | "DM" | "D"
// unknown ::= "Un"
//
// main ::= "*"
//
// w ::= " "
(The extended grammar includes all productions of the basic grammar, so I just parse as extended.)
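For anyone who wants to experiment, here is a minimal sketch of how the extended grammar could map onto a recursive descent parser, one method per production. It is not my actual code: the names (`StellarParser`, `parse_stellar`, `ParseError`) and the output shape (nested lists and tuples) are made up for illustration, and it assumes longest-match on the size and dwarf codes so that "VII" is not read as "VI".

```python
import re

class ParseError(ValueError):
    """Raised when the stellar data does not match the grammar."""

# Longest codes first so "VII" wins over "VI" and "V", "DM" over "D".
SIZES = sorted(["D", "Ia", "Ib", "II", "III", "IV", "V", "VI", "VII"],
               key=len, reverse=True)
DWARFS = sorted(["DB", "DA", "DF", "DG", "DK", "DM", "D"],
                key=len, reverse=True)
TYPE_TENTHS = re.compile(r"([OBAFGKM])([0-9])")

class StellarParser:
    def __init__(self, text):
        self.text = text
        self.pos = 0

    def accept(self, literal):
        # Consume the literal if it is next in the input.
        if self.text.startswith(literal, self.pos):
            self.pos += len(literal)
            return True
        return False

    def expect(self, literal):
        if not self.accept(literal):
            raise ParseError(f"expected {literal!r} at offset {self.pos}")

    def system(self):
        # system ::= unit ( w companion )*
        parts = [self.unit()]
        while self.accept(" "):
            parts.append(self.companion())
        return parts

    def companion(self):
        # companion ::= near | far ; far ::= "[" system "]" ; near ::= unit
        if self.accept("["):
            inner = self.system()
            self.expect("]")
            return ("far", inner)
        return self.unit()

    def unit(self):
        # unit ::= star | pair ; pair ::= "(" star w star ")"
        if self.accept("("):
            first = self.star()
            self.expect(" ")
            second = self.star()
            self.expect(")")
            return ("pair", first, second)
        return self.star()

    def star(self):
        # star ::= type tenths w* size main? | dwarf | unknown
        if self.accept("Un"):
            return "Un"
        m = TYPE_TENTHS.match(self.text, self.pos)
        if m:
            self.pos = m.end()
            while self.accept(" "):
                pass
            for size in SIZES:
                if self.accept(size):
                    main = "*" if self.accept("*") else ""
                    return f"{m.group(1)}{m.group(2)} {size}{main}"
            raise ParseError(f"expected size at offset {self.pos}")
        for dwarf in DWARFS:
            if self.accept(dwarf):
                return dwarf
        raise ParseError(f"expected star at offset {self.pos}")

def parse_stellar(text):
    """Parse a whole stellar data string; trailing input is an error."""
    parser = StellarParser(text)
    result = parser.system()
    if parser.pos != len(text):
        raise ParseError(f"unexpected input at offset {parser.pos}")
    return result
```

With this sketch, `parse_stellar("F7 V [M3 V M8 D]")` gives a primary plus a far companion system, while a string such as 'K8 VI M4' raises `ParseError` because the trailing M4 has no size code.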
Errata:
"Un" - this is present as a star code in the Mendan sector data from Challenge #49 (Mike Mikesh) for 0221. Possibly a typo, but I interpret it as "Unknown star type", i.e. there's something weird going on.
Using this grammar to scrub my data files, and ignoring LRX and some strange single-letter suffixes (h, f), I end up with just these errors. Corrections appreciated:
Code:
Zhodane
Suffix Data Error: 'F3 V M4 V M9 D M4'
* Oythepru 2034 A66586B-C Q Ri Mr 324 Dr F3 V M4 V M9 D M4
Suffix Data Error: 'M9 Zh G6 D'
* 2924 E333223-B Lo Ni Po 923 Zh M9 Zh G6 D
Core
Suffix Data Error: 'K8 VI M4'
* Night 0839 A5749C9-F N Hi In 320 Im K8 VI M4
Ley Sector
Suffix Data Error: '2 V A6 D'
* Tender Mercy 0314 E447210-5 Lo Ni 200 IL 2 V A6 D
A small number of errors, which is encouraging, but many sector files have no stellar data so that doesn't mean much.