JFORCES Executive

Overview

JFORCES is more accurately described as a code integration framework than as a simulation. From its inception, the objective of JFORCES was to provide a framework for integrating code across a LAN or WAN into a single execution. From a practical standpoint JFORCES has always been used as a simulation, but the executive is not limited to simulation support. In fact, the executive is unaware that it is running a simulation. The executive is a combination of runtime services and data wrappers autogenerated prior to scenario execution. During runtime, the executive provides the following services:

It does this through platform-independent communications services that permit executives to coordinate activities between different processes on the same machine or distributed machines. Communications networking is the core of how the executive controls all other functions. When an executive is started it does the following:

  1. Performs local initialization

  2. Calls application-specific initialization

  3. Establishes a list of public interfaces

  4. Establishes communications

A single executive can establish multiple communications links. Each of these links can be either a server or a client. A communications server link opens a port on the local machine that other processes can communicate with. When a server link is opened, the executive simply registers the port ID it is listening on and a routine to be called when a client logs in on that port. A client, on the other hand, specifies the machine and port that it wants to communicate with. A client communications link will wait until the server on the other end opens a communications channel with the client. At that point a dedicated communications channel is established between the two (this permits other clients to communicate with the same server) and application details are shared between the processes. Internally TCP/IP is used throughout the executive, although some executive applications have employed alternate message transfer mechanisms (e.g. UDP, serial, shared memory, etc.) as an informal “back door” to transfer large amounts of data beyond what can be handled via TCP/IP.
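The server and client roles described above can be illustrated with a minimal sketch using plain TCP sockets. This is not the JFORCES implementation; the names ServerLink, accept_one and open_client_link are hypothetical, and the loopback address and OS-assigned port are assumptions made only so the example is self-contained.

```python
import socket


class ServerLink:
    """Hypothetical server link: listens on a port and calls on_login per client."""

    def __init__(self, on_login, host="127.0.0.1", port=0):
        # port=0 asks the OS for any free port (an assumption for this sketch;
        # a real deployment would register a fixed, known port).
        self.listener = socket.create_server((host, port))
        self.port = self.listener.getsockname()[1]
        self.on_login = on_login
        self.channels = []

    def accept_one(self):
        # accept() returns a new socket dedicated to this client, so the
        # listening port remains available to further clients.
        channel, peer = self.listener.accept()
        self.channels.append(channel)
        self.on_login(channel, peer)
        return channel

    def close(self):
        for channel in self.channels:
            channel.close()
        self.listener.close()


def open_client_link(host, port):
    """Hypothetical client link: names the machine and port of the server."""
    return socket.create_connection((host, port), timeout=5)
```

A caller would construct a ServerLink with a login routine, have each client call open_client_link with the server's host and port, and call accept_one once per client to obtain that client's dedicated channel.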

The following key features give this system tremendous flexibility:

  1. One executive can host multiple communications channels

  2. A single executive can host a mixture of communications clients and servers

  3. All connections are dynamic - that is, the executive does not need to be told before execution which clients will connect; instead they can connect and disconnect at will throughout an execution.

  4. The executive does not try to force application initialization or handshakes - instead APIs to the applications are provided so each application can be tailored according to its specific needs

Together these features permit complex but reliable communications routing. For example, the communications between the simulation and the Man Machine Interface often used in a single-process simulation looks like this:

Note that the simulation in this diagram starts 4 servers. All clients must use a unique combination of net address and port. By starting 4 servers on the sim host, up to 4 interfaces can be started on any client node. For example, a single client node can host an MMI, an AFATDS interface, a real-time interface to another simulation, and an OTH-Gold interface. This configuration has been used for experiments in the past. It is possible to run more servers on the simulation node, but in practice the maximum employed at any node has not exceeded 20. Intermediate relay executives have, however, been used to push data out to additional satellite applications in a router-like mode, permitting a simulation to support about 50 applications. Because of transmission latencies, maintaining tight time control becomes more difficult when relays are used.

Runtime Interface APIs

All applications hosted under the executive must provide the following APIs (though any could be null functions):

Runtime Message Pass

After initialization is completed, applications communicate transparently through an executive interface. This means that an application does not (or at least should not) know where other applications are located, nor the particulars of how they execute. For example, a sensor can send a message to a tracker by calling the sendmsg function with a destination (aka mailbox) of "trackapp". The sensor model will not know whether the tracker is located on the same machine or a remote machine. It will also not need to know whether the tracker is a simple model or an actual tracker involved in a software/hardware-in-the-loop execution. Clearly, sometimes the message formats must be modified for different destination modules. This is accomplished by providing a bridging function between the simulation and the destination, as follows:

This is accomplished via a router table developed by the executive network based on the application registration that occurred during application initialization. This ties a port on a computer to a specific software package. Each software package can consist of one or multiple applications. An example of a single-application package would be the JFORCES map. On the other hand, when a single simulation node is used it typically incorporates all of the following applications within a single package:

  1. objects

  2. sensors

  3. tracker

  4. environment

  5. communications

  6. engagement

In this case all communications are sent to the same node. Once there, the messages are parceled out according to the callback function associated with each application. Typically these callback functions are large switch statements that execute the appropriate subroutines according to the message type sent.
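The parceling-out step can be sketched as follows. This is an illustration only: the Package class, the make_tracker helper and the message types DETECTION and RESET are hypothetical, and a dictionary lookup stands in for the switch statement an actual callback would use. Only the mailbox name "trackapp" comes from the text above.

```python
class Package:
    """Hypothetical software package hosting several applications on one node."""

    def __init__(self):
        self.callbacks = {}  # mailbox name -> callback(msg_type, payload)

    def register(self, mailbox, callback):
        self.callbacks[mailbox] = callback

    def deliver(self, mailbox, msg_type, payload):
        # Hand the message to the callback of the addressed application.
        callback = self.callbacks.get(mailbox)
        if callback is None:
            return False  # no such application in this package
        callback(msg_type, payload)
        return True


def make_tracker(tracks):
    """Build a tracker callback that dispatches on message type."""

    def on_detection(payload):
        tracks.append(payload["target"])

    def on_reset(payload):
        tracks.clear()

    # Dictionary dispatch plays the role of the large switch statement.
    handlers = {"DETECTION": on_detection, "RESET": on_reset}

    def tracker_callback(msg_type, payload):
        handler = handlers.get(msg_type)
        if handler is not None:  # unknown message types are ignored
            handler(payload)

    return tracker_callback
```

In this sketch a sensor would call deliver with the mailbox "trackapp"; whether the tracker lives in the same package or on another node is decided by the routing layer, not by the sender.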

The application routing list is dynamic. When an application falls off the network, its death is detected and information is sent throughout the remaining network telling each executive that the application is no longer available. Messages sent to the application after that time will fail non-destructively. Whenever this situation is detected, each executive is responsible for calling its applications' app_remote_delete function with a specification of the departing application and instance, so the local applications can perform any appropriate actions when a remote application disappears.

On the other hand, when a new application starts up in mid-run, this information is also broadcast throughout the system and each executive calls the app_remote_init function of its local applications so they can perform any required maintenance. This call includes the application type(s) and instances of the new application. For example, when a user interface checks into the objects module, the objects module immediately broadcasts the initial state of any assets that user interface should know about, so the UI becomes fully operational immediately.

Together these functions permit applications to drop off and reconnect with an ongoing simulation seamlessly, providing a high degree of fault-tolerance and permitting the flexibility required to support intermediate connections to real-world systems.
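A minimal sketch of this join and leave handling is shown below. The function names app_remote_init and app_remote_delete come from the text, but their argument lists here are assumptions, and the RouterTable class and its methods are hypothetical names introduced for illustration.

```python
class RouterTable:
    """Hypothetical per-executive routing list keyed by application and instance."""

    def __init__(self, local_app):
        # local_app is any object providing app_remote_init and
        # app_remote_delete (signatures assumed for this sketch).
        self.local_app = local_app
        self.routes = {}  # (app_type, instance) -> (host, port)

    def on_join(self, app_type, instance, host, port):
        # A new application was announced: record it, then notify local code.
        self.routes[(app_type, instance)] = (host, port)
        self.local_app.app_remote_init(app_type, instance)

    def on_leave(self, app_type, instance):
        # Notify local code only if the application was actually known.
        if self.routes.pop((app_type, instance), None) is not None:
            self.local_app.app_remote_delete(app_type, instance)

    def send(self, app_type, instance, message, transmit):
        route = self.routes.get((app_type, instance))
        if route is None:
            return False  # destination is gone: fail non-destructively
        transmit(route, message)
        return True
```

Returning False rather than raising an exception mirrors the non-destructive failure described above: the sender keeps running when the destination has disappeared.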

Clock Control

The JFORCES executive also provides conservative clock control. Conservative means that the clock can only advance; it never runs backwards. This is the only place where there is a “master” node in JFORCES. There is exactly one master clock, which broadcasts simulation time in heartbeats. This system maintains the simulation clock and can be run in any of the following modes:

  1. real-time

  2. ratios of real-time (e.g. 100 times faster than real-time)

  3. unbound by real-time clock

  4. pause
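The four modes can be sketched as a single time-advance function. This is an illustrative sketch, not the JFORCES clock: the function name, the mode strings and the parameters are hypothetical, and treating the unbound mode as a jump to the next pending event time is an assumption about how that mode behaves.

```python
def next_sim_time(sim_time, wall_dt, mode, ratio=1.0, next_event_time=None):
    """Return the simulation time after wall_dt seconds of wall-clock time."""
    if mode == "pause":
        new_time = sim_time
    elif mode == "real-time":
        new_time = sim_time + wall_dt
    elif mode == "ratio":
        # e.g. ratio=100.0 runs 100 times faster than real time
        new_time = sim_time + ratio * wall_dt
    elif mode == "unbound":
        # Assumption: with no tie to the wall clock, jump to the next event.
        new_time = sim_time if next_event_time is None else next_event_time
    else:
        raise ValueError(f"unknown clock mode: {mode}")
    # Conservative clock control: the clock may only advance.
    return max(sim_time, new_time)
```

The final max call enforces the conservative rule regardless of mode, so a stale event time can never move the clock backwards.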

The system relays the heartbeat throughout the network. Each executive creates a list of all executives “down stream” from it and relays the information. To date, this has been a simple mechanism, although some experiments have been performed with determining the “bounce back” times to each executive and having each executive adjust its clock accordingly. The initial experiments indicated that the mechanism offered only a small payoff. Given its complexity and the possibility of heading into situations requiring local applications to “roll back” their clocks, this mechanism was abandoned.

In addition to the simulation clock, each local executive also maintains a real-time clock. This is used to maintain communications with live systems. As such, it is never paused, sped up or slowed down. Note that each package has two periodic call functions, namely app_simtime_event_loop and app_realtime_event_loop, which perform periodic processing based on the clock time from the simulation clock or the system clock, respectively. Typically most JFORCES processes are controlled by the simulation clock, and only realtime communications and realtime processing (e.g. airspace monitoring) are performed in the app_realtime_event_loop, but this mixture can be changed for special applications.

Event Stack

Part and parcel with clock control is the event stack. An event stack receives instructions from an application to deliver a message to another instance or application at a specified time and holds the message until that time, when it delivers the message to the destination. The delivery time is based upon the simulation clock, not the realtime clock. The event stack exists locally at every exec and broadcasts the message when its executive's simulation clock reaches the prescribed time. Note that this means that if a satellite application drops offline, any messages that it stacked for later delivery will be canceled. This is an aspect of the JFORCES peer-to-peer approach (i.e. nobody is in absolute control, so all packages must complete their own processing).

Be aware that the message will not be lost in this system (unless the application drops off). This is not a substitute for a communications or processing queue model. These models are applications in their own right, and if imperfect communications are desired the appropriate application should be called; that application will be responsible for forwarding (or dropping) the message (and optionally logging the result). This is done to maintain the separation of the exec from the application and to permit the application to tailor both the processing and data collection for the specific study (or exercise) needs.
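The hold-and-deliver behavior of an event stack can be sketched with a priority queue ordered by simulation time. This is a minimal illustration, not the JFORCES code; the EventStack class and its method names are hypothetical.

```python
import heapq
import itertools


class EventStack:
    """Hypothetical local event stack ordered by simulation delivery time."""

    def __init__(self):
        self._heap = []
        # Sequence numbers break ties so that messages scheduled for the
        # same time are released in the order they were stacked.
        self._seq = itertools.count()

    def schedule(self, deliver_at, destination, message):
        entry = (deliver_at, next(self._seq), destination, message)
        heapq.heappush(self._heap, entry)

    def release_due(self, sim_time):
        """Pop and return every (destination, message) due at sim_time."""
        due = []
        while self._heap and self._heap[0][0] <= sim_time:
            _, _, destination, message = heapq.heappop(self._heap)
            due.append((destination, message))
        return due
```

Because the stack lives inside the local executive, its pending entries vanish with the process, which matches the cancellation behavior described above. Nothing in the stack itself drops or delays a message; any such modeling belongs in an application.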

Logging and Replay

Each executive can log every message sent to it for later analysis and/or replay. Typically to save time this is not done, but any package can turn this option on by calling the StartReplay function. The saved messages can be:

  1. Replayed at the node without alteration

  2. Replayed at the node with alteration (e.g. a UI attaches to a replaying simulation and alters the commands to examine what might happen if alternate actions are taken)

  3. Reviewed for completeness (a simple method for debugging messages and traffic flows)

  4. Analyzed for message content and types.

Before proceeding, it should be mentioned that the last function is intended to be used to analyze system performance, not to analyze scenario or mission performance. While it could be used that way in a pinch, application data collection has proven much better tailored to capture the data required for situation analysis. And JFORCES employs a relational database to collect, filter, collate and analyze this data; see the section on data analysis.

Reviewing the messages is performed through the auto-generated analyze_messages function, described below. Suffice it to say for now that the registered message fields are employed to provide a relatively simple interface to convert the binary TCP/IP messages to user-intelligible message dumps.

The first two functions, namely replaying with or without interaction, are extremely powerful exercise and analysis tools. The first is typically used to “reset” an experiment when something goes wrong, by rerunning to just before the problem and then turning off the replay and restarting the experiment from that point. While we all hope nothing will go wrong in an exercise, it undoubtedly will, and this capability has proven invaluable many times.

The second function provides the ability to alter a prior execution at some point to examine what would happen if an alternate action is taken. To date this has been done only for limited changes, although occasionally a group of users has logged back into a replaying scenario and run the wargame a different way. Large re-enactments have not been common, however, and have been more in the nature of proving JFORCES than enhancing an exercise debriefing (which is its intent).
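Both replay styles can be sketched with one function over a list of logged messages. This is an illustration under stated assumptions: the replay function, its parameters and the (sim_time, message) record layout are hypothetical and do not reflect the actual JFORCES log format or the StartReplay interface.

```python
def replay(records, deliver, until=None, alter=None):
    """Re-deliver logged (sim_time, message) records in their logged order.

    until: stop before the first record at or after this simulation time,
           e.g. to rerun to just before a problem occurred.
    alter: optional function (sim_time, message) -> message or None, used
           to rewrite a message or to drop it by returning None.
    Returns the number of messages delivered.
    """
    delivered = 0
    for sim_time, message in records:
        if until is not None and sim_time >= until:
            break  # hand over to live execution from this point
        if alter is not None:
            message = alter(sim_time, message)
            if message is None:
                continue  # the alteration removed this message
        deliver(sim_time, message)
        delivered += 1
    return delivered
```

Calling replay with only until set corresponds to the unaltered “reset” use; supplying alter corresponds to exploring an alternate course of action.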

Checkpoint/Restart Services

Checkpoint functionality consists of making a memory image of one or more packages and storing it for later execution. The other half of this is “restart”, which reloads the image and restarts execution from the same point. These services have largely been moved out of the executive in favor of employing the Berkeley Lab Checkpoint System (BLCS), as described on the jforces.info website. The only executive functions left are:

  1. Holding the system in a pause state when the checkpoint is created and restarting the system from a loaded checkpoint file.

  2. Closing and reopening executive communications to accommodate the altered sockets that are expected when the system restarts.

Presimulation Autogeneration Functions

These are the functions that generate message wrappers and data definitions shared across various machine architectures and languages. This document focuses on the runtime executive functions; the autogeneration functions will be described in a later document.