So the other night I had an idea for training an AI opponent for a starship combat system.
Basically the training is hands-on, requiring a human to perform the actions until he's confident that the computer knows his style. Here's how I think it might work.
First, we assume a realtime combat system that has 2D non-vector movement, generalized distances such as range bands, and of course weapons and defenses. We assume that all of these elements are reasonably accessible to the human player. And we assume that each tick of the clock is a training moment.
The human player plays one ship. The computer plays the opponent.
The computer observes the human's behavior. At a given tick, a player's state is represented by his ship's current maneuver rating, the bearing of its weapons and defenses, its vector to the opponent, and the opponent ship's maneuver, weapons, and defense status.
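Here's a rough sketch of what that per-tick state might look like in code (Python, with placeholder field names -- the real combat system would dictate what actually goes in):

```python
from dataclasses import dataclass

@dataclass
class ShipState:
    """One ship's status at a clock tick (placeholder fields)."""
    maneuver: float          # current maneuver rating
    weapons_bearing: float   # bearing of the weapons
    defenses_bearing: float  # bearing of the defenses

@dataclass
class TickState:
    """Everything one player's network sees at a single tick."""
    own: ShipState
    opponent: ShipState
    range_band: float            # generalized distance to the opponent
    bearing_to_opponent: float   # vector to the opponent
    # ...plus whatever else brings the input count to roughly fifteen

    def to_inputs(self) -> list[float]:
        """Flatten into the input vector fed to the network."""
        return [
            self.own.maneuver,
            self.own.weapons_bearing,
            self.own.defenses_bearing,
            self.opponent.maneuver,
            self.opponent.weapons_bearing,
            self.opponent.defenses_bearing,
            self.range_band,
            self.bearing_to_opponent,
        ]
```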
These data serve as input nodes, multiplied through a first set of weights to produce a small number of central (hidden) values, which are in turn multiplied through a second set of weights to produce the output nodes, representing behavior as a movement vector and weapons activity. The computer plugs its own ship's state into the network to determine its behavior, and acts accordingly. At first, since the weights start out random, its behavior will be effectively random.
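As a minimal sketch of that network, here's the feed-forward pass in Python. The sigmoid squashing function is an assumption on my part; any similar function would do. The weights are randomly initialized, which is why the early behavior is effectively random:

```python
import math
import random

def sigmoid(x: float) -> float:
    """Squash a weighted sum into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

class TinyNet:
    """A small feed-forward net: inputs -> central (hidden) values -> outputs.
    No bias terms, to keep the weight count at the 120 estimated below."""

    def __init__(self, n_in: int = 15, n_hidden: int = 4, n_out: int = 15):
        rand = lambda: random.uniform(-0.5, 0.5)
        # Input-to-hidden and hidden-to-output weight matrices, random at first.
        self.w_ih = [[rand() for _ in range(n_in)] for _ in range(n_hidden)]
        self.w_ho = [[rand() for _ in range(n_hidden)] for _ in range(n_out)]

    def forward(self, inputs: list[float]) -> tuple[list[float], list[float]]:
        """Multiply the inputs through the first set of weights to get the
        central values, then through the second set to get the outputs."""
        hidden = [sigmoid(sum(w * x for w, x in zip(row, inputs)))
                  for row in self.w_ih]
        outputs = [sigmoid(sum(w * h for w, h in zip(row, hidden)))
                   for row in self.w_ho]
        return hidden, outputs
```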
At the same time, the computer plugs the human player's state into the network to predict the human's behavior. The computer compares this prediction with the human's actual behavior; the difference is used to adjust the weights into the output nodes, and is then propagated backward to adjust the weights from the input nodes (in other words, standard backpropagation). At the end of each training moment, the computer is a little better at simulating the human's reaction to states in the game arena. The more the computer plays, the closer its imitation should get. In addition, training against different humans ought to produce a blend of playing styles, though not necessarily a superior one.
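Continuing the TinyNet sketch above, one training moment might look like the following: a single backpropagation step toward the human's actual action. The learning rate is a placeholder and would need tuning:

```python
def train_step(net: TinyNet, inputs: list[float], target: list[float],
               lr: float = 0.1) -> None:
    """One training moment: predict the human's action from his state,
    compare with what he actually did (target), backpropagate the difference."""
    hidden, outputs = net.forward(inputs)
    # Error terms at the output nodes (sigmoid derivative is y * (1 - y)).
    d_out = [(t - y) * y * (1.0 - y) for t, y in zip(target, outputs)]
    # Propagate the error back to the central (hidden) values.
    d_hid = [h * (1.0 - h) * sum(d_out[o] * net.w_ho[o][j]
                                 for o in range(len(d_out)))
             for j, h in enumerate(hidden)]
    # Adjust the hidden-to-output weights...
    for o, d in enumerate(d_out):
        for j, h in enumerate(hidden):
            net.w_ho[o][j] += lr * d * h
    # ...then the input-to-hidden weights.
    for j, d in enumerate(d_hid):
        for i, x in enumerate(inputs):
            net.w_ih[j][i] += lr * d * x
```

Each tick, the computer would run forward() on its own state to pick an action, and train_step() on the human's state and actual action to learn.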
In short, the computer trains a software neural network in order to build a customized decision-making process. My theory is that the computer will learn how to behave like the trainer in ship combat.
I estimate fifteen input nodes (plus or minus), four hidden nodes, and fifteen output nodes (plus or minus). That's 15 x 4 input-to-hidden weights plus 4 x 15 hidden-to-output weights, or about 120 double-precision floating-point numbers -- at 8 bytes each, a bit under 1K of memory.
Thoughts?