JFORCES is more accurately described as a code integration framework than as a simulation. The objective of JFORCES from its inception was to provide a framework for integrating code across a LAN or WAN into a single execution. From a practical standpoint JFORCES has always been used as a simulation, but the executive is not limited to simulation support; in fact, the executive is unaware that it is running a simulation. The executive is a combination of runtime services and data wrappers autogenerated prior to scenario execution. During runtime, the executive provides the following services:
Process and interprocess control based upon event-based and time-based processing
Common clock control
Transparent interprocess communications
Logging and Replay functionality
It does this through platform-independent communications services that permit executives to coordinate activities between different processes on the same machine or distributed machines. Communications networking is the core of how the executive controls all other functions. When an executive is started it does the following:
Performs local initialization
Calls application-specific initialization
Establishes a list of public interfaces
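The startup sequence above can be sketched as follows. This is a minimal illustration only; all of the function and variable names here are hypothetical stand-ins, not the actual JFORCES API.

```c
#include <assert.h>

/* Hypothetical sketch of the executive startup sequence: local
 * initialization, then the application-specific hook, then publication of
 * the public interfaces.  None of these names are the real JFORCES API. */

static int local_ready;       /* set by local initialization            */
static int app_ready;         /* set by the application-specific hook   */
static int n_interfaces;      /* how many public interfaces were listed */

static void exec_local_init(void)         { local_ready = 1; }
static void app_init_(void)               { app_ready = local_ready; }
static void exec_publish_interfaces(void) { n_interfaces = 3; /* e.g. */ }

void exec_startup(void)
{
    exec_local_init();           /* 1. local initialization        */
    app_init_();                 /* 2. application initialization  */
    exec_publish_interfaces();   /* 3. list of public interfaces   */
}
```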
A single executive can establish multiple communications links. Each of these links can be either a server or a client. A communications server link opens a port on the local machine that other processes can communicate with. When a server link is opened the executive simply registers the port ID it is listening to and a routine to be called when a client logs in on that port. A client, on the other hand, specifies the machine and port that it wants to communicate with. A client communications link will wait until the server on the other end opens a communications channel with the client. At that point a dedicated communications channel will be established between the two (this permits other clients to communicate with the same server) and application details will be shared between the processes. Internally TCP/IP is used throughout the executive, although some executive applications have employed alternate message transfer mechanisms (e.g., UDP, serial, shared memory, etc.) as an informal “back door” to transfer large amounts of data beyond what can be handled via TCP/IP.
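The link bookkeeping described above might be sketched as below. The struct layouts and function names are invented for illustration; the real executive also handles the underlying TCP/IP sockets, which are omitted here.

```c
#include <assert.h>
#include <stddef.h>

/* Sketch of link bookkeeping (hypothetical names and layout).  Registering
 * a server link records the port the executive listens on and the routine
 * to call when a client logs in; a client link instead names the remote
 * host and port it wants to reach. */

typedef void (*login_cb)(int client_fd);

struct server_link { int port; login_cb on_login; };
struct client_link { const char *host; int port; };

#define MAX_LINKS 8
static struct server_link servers[MAX_LINKS];
static int n_servers;

static void noop_login(int client_fd) { (void)client_fd; }

/* Returns the link index, or -1 if no slot is free. */
int open_server_link(int port, login_cb cb)
{
    if (n_servers >= MAX_LINKS)
        return -1;
    servers[n_servers].port = port;
    servers[n_servers].on_login = cb;
    return n_servers++;
}
```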
The following key features give this system tremendous flexibility:
One executive can host multiple communications channels
A single executive can host a mixture of communications clients and servers
All connections are dynamic; that is, the executive does not need to be told which clients will connect before execution. Instead, clients can connect and disconnect at will throughout an execution.
The executive does not try to force application initialization or handshakes; instead, APIs to the applications are provided so that each application can be tailored to its specific needs
Together these features permit complex but reliable communications routing. For example, the communications between the simulation and the Man Machine Interface often used in a single-process simulation looks like this:
Note that the simulation in this diagram starts four servers. All clients must use a unique combination of network address and port. By starting four servers on the sim host, up to four interfaces can be started on any client node. For example, a single client node can host an MMI, an AFATDS interface, a real-time interface to another simulation, and an OTH-Gold interface; this configuration was used in past experiments. It is possible to run more servers on the simulation node, but in practice the maximum employed at any node has not exceeded 20. Intermediate relay executives have also been used to push data out to additional satellite applications in a router-like mode, permitting a simulation to support about 50 applications. Because of transmission latencies, however, maintaining tight time control becomes more difficult when relays are used.
Runtime Interface APIs
All applications hosted under the executive must provide the following APIs (though any could be null functions):
app_init_ - The application initialization function. This function typically includes the following executive-related functions in addition to local initialization:
1) registration of local application(s) for public calls. This consists of calls to the executive function register_app with the following arguments:
The application instance (if known, default 0)
The application ID (typically a mnemonic as identified in the public constants and included in the execlist files)
A callback routine - that is, a routine to be called when a message is received for this application. Typically this routine is a switchboard that calls other routines based upon the message type received.
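A registration call and its switchboard callback might be sketched as below. The register_app signature mirrors the three arguments just listed (instance, application ID, callback); the message struct, the registry internals, and the case values are invented for illustration.

```c
#include <assert.h>

/* Hypothetical sketch of app_init_ registering with the executive. */

struct msg { int type; };
typedef void (*msg_cb)(const struct msg *m);

struct reg_entry { int instance; const char *app_id; msg_cb cb; };
static struct reg_entry registry[16];
static int n_reg;

static int last_handled;   /* records which branch the switchboard took */

/* A typical callback is a switchboard dispatching on message type. */
static void trackapp_switchboard(const struct msg *m)
{
    switch (m->type) {
    case 1:  last_handled = 1;  break;   /* e.g. a detection report */
    case 2:  last_handled = 2;  break;   /* e.g. a drop-track order */
    default: last_handled = -1; break;
    }
}

static void register_app(int instance, const char *app_id, msg_cb cb)
{
    registry[n_reg].instance = instance;
    registry[n_reg].app_id   = app_id;
    registry[n_reg].cb       = cb;
    n_reg++;
}

void app_init_(void)
{
    register_app(0, "trackapp", trackapp_switchboard);  /* default instance */
}
```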
app_simtime_event_loop - A routine called on each advance of the scenario clock. Used to support time-based continuous applications. Note that event-based processing is handled by the registration function described above, so there is usually little to do in this loop (sometimes nothing). Despite this, this procedure is the normal mode for implementing a timer loop in the simulation, as compared to the app_realtime_event_loop procedure (described next).
app_realtime_event_loop - A routine called on each advance of the realtime clock. Used to support realtime-based continuous applications. This is NOT a commonly used interface for time-based applications because it ignores the scenario clock; typically the app_simtime_event_loop procedure (described above) should be used. This interface is provided for time-critical interfaces to live operational systems or feeds. To date it has only been used for UDP connections to operational systems and the RAP tracker interface.
app_checkpoint - A routine to handle application-specific checkpointing functions
app_remote_init - A routine called whenever remote applications "check in". This routine typically sends initial data to the remote application as required. This is how the simulation can dynamically initialize new MMI applications or specific external interfaces as required.
app_remote_delete - A routine called whenever an application checks out. This is sometimes used for cleanup. It can also be used to instantiate a default fill-in application when a key process drops out and the desire is to keep an exercise running.
app_version – This procedure is called when the application is run with the –version option (e.g., wn –version). In this mode the application does not actually execute but instead executes the printf found in this procedure. This is simply a method that allows programmers to specify versioning information that can be checked by users.
app_handle_clock_control – This procedure is called whenever the executive receives a new pause, resume, stopsim or time ratio command. The MSG_MASTER_TIME message corresponding to the event is passed to the called application. Embedded in this message is the clock change type (msgratio, msgpause, msgresume or msgstopsim) and the required additional information (e.g. New time ratio value associated with the msgratio command).
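A clock-control handler along these lines might look as follows. The change-type names come from the text (msgratio, msgpause, msgresume, msgstopsim); the message layout and the state variables are hypothetical.

```c
#include <assert.h>

/* Sketch of clock-control handling; the struct is a stand-in for
 * MSG_MASTER_TIME, not the real message layout. */

enum clock_change { msgratio, msgpause, msgresume, msgstopsim };

struct master_time_msg {
    enum clock_change change;
    double ratio;                 /* only meaningful for msgratio */
};

static int paused;
static double time_ratio = 1.0;
static int stopping;

void app_handle_clock_control(const struct master_time_msg *m)
{
    switch (m->change) {
    case msgpause:   paused = 1;            break;
    case msgresume:  paused = 0;            break;
    case msgratio:   time_ratio = m->ratio; break;  /* e.g. 100x real time */
    case msgstopsim: stopping = 1;          break;
    }
}
```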
app_verify_msg_validity – This procedure permits any application to do a final check on the validity of a message prior to execution. The purpose of this check is to support time management schemes in which an event has been invalidated by locally known events which have not yet been propagated to remote applications.
app_debug_interface – This procedure is provided to support application debugging by calling an application routine immediately before and after the injection of any message. This is called regardless of whether the message is injected from a remote or local source, and whether it's a standard message or one from an event stack. This routine is very useful in debugging application memory errors.
Runtime Message Pass
After initialization is completed, applications communicate transparently through an executive interface. This means that an application does not (or at least should not) know where other applications are located, nor the particulars of their execution. For example, a sensor can send a message to a tracker by calling the sendmsg function with a destination (aka mailbox) of "trackapp". The sensor model will not know whether the tracker is located on the same machine or a remote machine. It also does not need to know whether the tracker is a simple model or an actual tracker involved in a software/hardware-in-the-loop execution. Clearly, sometimes the message formats must be modified for different destination modules. This is accomplished by providing a bridging function between the simulation and the destination, as follows:
This is accomplished via a router table developed by the executive network based on the application registration that occurred during application initialization. This ties a port on a computer to a specific software package. Each software package can consist of one or multiple applications. An example of a single-application package would be the JFORCES map. On the other hand, when a single simulation node is used it typically incorporates all of the following applications within a single package:
In this case all communications are sent to the same node. Once there the messages are parceled out according to the callback function associated within each application. Typically these callback functions are large switch statements that execute appropriate subroutines according to the message type sent.
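Location-transparent delivery through a router table can be sketched as below. The names are hypothetical, and the real router table also maps mailboxes to remote nodes; here everything is local for illustration.

```c
#include <assert.h>
#include <string.h>

/* Sketch: delivery looks the destination mailbox up in a router table and
 * invokes whatever callback registered for it, so the sender never learns
 * where the receiver actually runs. */

struct msg { int type; };
typedef void (*deliver_cb)(const struct msg *m);

static int tracker_got;                       /* what the tracker received */
static void tracker_cb(const struct msg *m) { tracker_got = m->type; }

struct route { const char *mailbox; deliver_cb deliver; };
static struct route router[] = { { "trackapp", tracker_cb } };

/* Returns 0 on delivery, or -1 (a non-destructive failure) for an unknown
 * destination, as when an application has dropped off the network. */
int send_to_mailbox(const char *mailbox, const struct msg *m)
{
    for (size_t i = 0; i < sizeof router / sizeof router[0]; i++)
        if (strcmp(router[i].mailbox, mailbox) == 0) {
            router[i].deliver(m);
            return 0;
        }
    return -1;
}
```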
The application routing list is dynamic. When an application falls off the network its death is detected and information is sent throughout the remaining network telling each executive that the application is no longer available. Messages sent to the application after that time will fail non-destructively. Whenever this situation is detected, each executive is responsible for calling its applications' app_remote_delete function with a specification of the departing application and instance, so the local applications can perform any appropriate actions when a remote application disappears.
On the other hand, when a new application starts up mid-run this information is also broadcast throughout the system, and each executive calls the app_remote_init function of its local applications so they can perform any required maintenance. This call includes the application type(s) and instances of the new application. For example, when a user interface checks into the objects module, the objects module immediately broadcasts the initial state of any assets that user interface should know about, so the UI becomes immediately fully operational.
Together these functions permit applications to drop off and reconnect with an ongoing simulation seamlessly, providing a high degree of fault-tolerance and permitting the flexibility required to support intermediate connections to real-world systems.
The JFORCES executive also provides conservative clock control. Conservative means that the clock can only advance, never run backwards. This is the only place where there is a “master” node in JFORCES: there is exactly one master clock, which broadcasts simulation time in heartbeats. This system maintains the simulation clock and can be run in any of the following modes:
ratios of real-time (e.g. 100 times faster than real-time)
unbound by real-time clock
The system relays the heartbeat throughout the network. Each executive creates a list of all executives “downstream” from it and relays the information. To date this has been a simple mechanism, although some experiments have been performed with determining the “bounce back” times to each executive and having each executive adjust its time clock accordingly. The initial experiments indicated only a small payoff for this mechanism, so given its complexity and the possibility of creating situations that require local applications to “roll back” their clocks, it was abandoned.
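The heartbeat relay can be sketched as a simple fan-out over each executive's downstream list. The structure below is hypothetical; the real mechanism forwards the heartbeat over the TCP/IP links described earlier.

```c
#include <assert.h>

/* Sketch of the heartbeat relay: each executive records the broadcast
 * simulation time and forwards it to the executives "downstream" of it. */

#define MAX_DOWN 4

struct exec_node {
    double sim_time;                     /* last heartbeat received    */
    int n_down;                          /* number of downstream execs */
    struct exec_node *down[MAX_DOWN];
};

void relay_heartbeat(struct exec_node *e, double sim_time)
{
    e->sim_time = sim_time;                          /* accept the beat */
    for (int i = 0; i < e->n_down; i++)
        relay_heartbeat(e->down[i], sim_time);       /* ...and relay it */
}
```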
In addition to the simulation clock, each local executive also maintains a real-time clock. This is used to maintain communications with live systems. As such, it is never paused, sped up or slowed down. Note that each package has two periodic call functions, namely app_simtime_event_loop and app_realtime_event_loop, which perform periodic processing based on the clock time from the simulation or system clock, respectively. Typically most JFORCES processes are controlled by the simulation clock, and only realtime communications and realtime processing (e.g., airspace monitoring) are performed in the app_realtime_event_loop, but this mixture can be changed for special applications.
Part and parcel with clock control is the event stack. The event stack receives a message from an application, along with a delivery time and a destination instance or application, and holds the message until that time, when it delivers it to the destination. The delivery time is based upon the simulation clock, not the realtime clock. The event stack exists locally at every exec and broadcasts the message when its executive's simulation clock reaches the prescribed time. Note that this means that if a satellite application drops offline, any messages that it stacked for later delivery will be canceled. This is an aspect of the JFORCES peer-to-peer approach (i.e., nobody is in absolute control, so all packages must complete their own processing).
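An event stack of this kind might be sketched as below. The layout is hypothetical; the point is that release is driven by the simulation clock, never the realtime clock.

```c
#include <assert.h>

/* Sketch of the local event stack: a message is held with a
 * simulation-time stamp and released only when the local executive's
 * simulation clock reaches that time. */

#define STACK_MAX 32

struct stacked_msg { double deliver_at; int msg_type; int delivered; };

static struct stacked_msg stack[STACK_MAX];
static int stack_n;

static int delivered_types[STACK_MAX];   /* delivery order, for inspection */
static int delivered_n;

void stack_event(double deliver_at, int msg_type)
{
    stack[stack_n].deliver_at = deliver_at;
    stack[stack_n].msg_type   = msg_type;
    stack[stack_n].delivered  = 0;
    stack_n++;
}

/* Called on each advance of the simulation clock. */
void advance_sim_clock(double now)
{
    for (int i = 0; i < stack_n; i++)
        if (!stack[i].delivered && stack[i].deliver_at <= now) {
            stack[i].delivered = 1;
            delivered_types[delivered_n++] = stack[i].msg_type;
        }
}
```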
Be aware that the message will not otherwise be lost in this system (it is lost only if the application drops off). This is not a substitute for a communications or processing queue model. Those models are applications in their own right; if imperfect communications are to be modeled, the appropriate application should be called, and that application will be responsible for forwarding (or dropping) the message (and optionally logging the result). This is done to maintain the separation of the exec from the application and to permit the application to tailor both the processing and data collection for the specific study (or exercise) needs.
Logging and Replay
Each executive can log every message sent to it for later analysis and/or replay. Typically to save time this is not done, but any package can turn this option on by calling the StartReplay function. The saved messages can be:
Replayed at the node without alteration
Replayed at the node with alteration (e.g., a UI attaches to a replaying simulation and alters the commands to examine what might happen if alternate actions are taken)
Reviewed for completeness (a simple method for debugging messages and traffic flows)
Analyzed for message content and types.
Before proceeding, it should be mentioned that the last function is intended to be used to analyze system performance, not to analyze scenario or mission performance. While it could be used that way in a pinch, application data collection has proven much better tailored to capture the data required for situation analysis. And JFORCES employs a relational database to collect, filter, collate and analyze this data; see the section on data analysis.
Reviewing the messages is performed through the auto-generated analyze_messages function, described below. Suffice it to say for now that the registered message fields are employed to provide a relatively simple interface for converting the binary TCP/IP messages into user-intelligible message dumps.
The first two functions, namely replaying with or without alteration, are extremely powerful exercise and analysis tools. The first is typically used to “reset” an experiment when something goes wrong by rerunning to just before the problem, then turning off the replay and restarting the experiment from that point. While we all hope nothing will go wrong in an exercise, it undoubtedly will, and this capability has proven invaluable many times.
The second function provides the ability to alter a prior execution at some point to examine what would happen if an alternate action were taken. To date this has been done only for limited changes, although occasionally a group of users has logged back into a replaying scenario and run the wargame a different way. Large re-enactments have not been common, and have been more in the nature of proving JFORCES than of enhancing an exercise debriefing (which is its intent).
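The logging and replay flow described above can be sketched as follows. StartReplay is named in the text; the journal layout and the replay helper are hypothetical.

```c
#include <assert.h>

/* Sketch of logging and replay: once logging is on, each message is
 * journaled with its simulation time so it can later be re-injected in
 * order, e.g. to rerun to just before a problem occurred. */

struct logged_msg { double sim_time; int type; };

static struct logged_msg journal[64];
static int journal_n;
static int logging_on;

void StartReplay(void) { logging_on = 1; }

void log_message(double sim_time, int type)
{
    if (logging_on && journal_n < 64) {
        journal[journal_n].sim_time = sim_time;
        journal[journal_n].type = type;
        journal_n++;
    }
}

static int injected;                      /* counts re-injected messages */
static void count_inject(int type) { (void)type; injected++; }

/* Replay without alteration up to (but not including) 'cutoff'; returns
 * the number of messages re-injected. */
int replay_until(double cutoff, void (*inject)(int type))
{
    int n = 0;
    for (int i = 0; i < journal_n; i++)
        if (journal[i].sim_time < cutoff) { inject(journal[i].type); n++; }
    return n;
}
```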
Checkpoint and Restart
Checkpoint functionality consists of making a memory image of one or more packages and storing it for later execution. The other half of this is “restart”, which reloads the image and restarts execution from the same point. These services have largely been moved out of the executive in favor of employing the Berkeley Lab Checkpoint System (BLCS), as described on the jforces.info website. The only executive functions left are:
Holding the system in a pause state when the checkpoint is created and restarting the system from a loaded checkpoint file.
Closing and reopening executive communications to accommodate the altered sockets that are expected when the system restarts.
Presimulation Autogeneration Functions
These are the functions that generate message wrappers and data definitions shared across the various machine architectures and languages. This document focuses on the runtime executive functions; the autogeneration functions will be described in a later document.