usage

Usage Guide

To run a simulation model, it must first be compiled so that the program can interact correctly with the ROOT-Sim runtime environment. ROOT-Sim significantly mangles the code generated by standard compilers, so building a ROOT-Sim aware executable by hand would be tedious. We provide rootsim-cc, a wrapper around the standard C/MPI compiler, which automatically performs all the required compilation steps.

Currently, rootsim-cc can only rely on gcc, or on an MPI compiler that in turn uses gcc.

Using the compiler

To compile a model, simply use rootsim-cc as you would a standard compiler. All flags passed to rootsim-cc are forwarded to the backend compiler. After several compilation steps, mangling of the code, and incremental linking of the libraries that make up the ROOT-Sim runtime environment, an executable is generated.

A standard Makefile can be used to automate the compilation process. The following example is the same one used for the example models provided in the source tree. This Makefile is intended for debugging purposes, as it generates debug symbols (-g) and performs extra checks on the source code (-Wall -Wextra). The output is a ROOT-Sim aware executable called model.

CC = rootsim-cc
CFLAGS = -g -Wall -Wextra
TARGET = model

SRCS = model.c general.c agent.c
OBJS = $(SRCS:.c=.o)

.PHONY: all clean

all: $(TARGET)

$(TARGET): $(OBJS)
	$(CC) $(CFLAGS) -o $(TARGET) $(OBJS)

.c.o:
	$(CC) $(CFLAGS) $(INCLUDES) -c $<  -o $@

clean:
	$(RM) *.o $(TARGET)

Running on a Single Node

As will be discussed later, a simulation model developed for ROOT-Sim only has to implement event handlers: all the runtime support, including the main function, is provided by the library. Therefore, the executable generated by rootsim-cc accepts many configuration flags that tell the runtime environment how to support the execution. These are the flags understood by the runtime environment:

wt: This option accepts a numerical value corresponding to the number of worker threads used on the node. At simulation startup, the runtime environment spawns this number of threads, which take care of scheduling LPs and events according to the specified scheduling policy. For performance reasons, this value cannot be set higher than the number of cores available on the node.
lp: Number of logical processes (LPs) to be used in the simulation, across all threads (or across all compute nodes, in a distributed run). Legal values are in the range [1, 65536].
output-dir: This specifies the name of the folder where runtime statistics are dumped. This defaults to outputs.
scheduler: This option specifies which scheduler is used to determine the next LP to be activated on a worker thread. Legal values are stf (Smallest Timestamp First, an O(n) scheduler) and star (an O(1) scheduler). As a rule of thumb, if many LPs are handled by a single worker thread, the star scheduler is expected to be more efficient. If few LPs are on a single node, the reduced overhead of the stf scheduler might provide better results.
npwd: "Non piece-wise deterministic execution". This option tells the simulation engine that replaying an event might lead to different results (note that this never concerns statistical samples drawn through the ROOT-Sim library). In that case, this option forces the runtime environment to take a checkpoint after the execution of every simulation event. While this hampers performance, it guarantees correct results. This is equivalent to using --p 1.
p: Checkpointing interval. This tells ROOT-Sim to take a snapshot of the simulation state after this number of events. The default value is 10.
full: Full checkpointing. Checkpoints taken by ROOT-Sim are full. It cannot be used in conjunction with --inc. This is the default checkpointing strategy.
inc: Incremental checkpointing. Most of the checkpoints are incremental (periodically, for performance reasons, a full checkpoint is taken anyhow). The cost of executing an event might increase, so this should be used only when simulation states are expected to be large but seldom updated (or when only small portions of the state are updated). It cannot be used in conjunction with --full.
A: "Autonomic checkpointing". A runtime performance model is periodically evaluated, telling the runtime library what the best checkpointing interval is, and whether to switch between incremental and full checkpointing. This feature is disabled by default.
gvt: The time period to wait before a new GVT reduction is performed (in msec). The legal range is [500, 5000], the default is 1000.
cktrm-mode: ROOT-Sim can ask the simulation model to determine (by inspecting the simulation state) whether the simulation should be considered completed. Two termination detection modes are available: normal and incremental. normal asks every LP in the run to check its state every time a new GVT value is computed (see the option gvt-snapshot-cycles for further information). With incremental, on the other hand, once an LP has told the runtime that it considers the simulation completed, it is never queried again during the run. While this reduces the overhead of termination detection, if the termination condition fluctuates the correct termination instant may be missed, leading to incorrect results. Therefore, incremental should only be used when an LP that has determined that the simulation can be halted will never change its mind.
gvt-snapshot-cycles: Number of consecutive GVT calculations before rebuilding a state for termination detection. There is no limit on this value, provided that it is non-negative. The default is 2. Note that the higher this number, the longer the user might wait before the simulator notices that the simulation can be stopped.
simulation-time: If the simulation should be halted after a certain amount of simulation time, this value can be passed to this option. A value of zero (the default) means infinity, therefore to halt a simulation LPs must agree on a different condition, specified in the model.
lps-distribution: Distribution of LPs across worker threads (and nodes, in a distributed run). Legal values are block (the number of LPs is divided by the total number of worker threads, and each thread is assigned that many LPs in bulk, accounting also for leftovers) and circular (a "round robin" policy, in which the first LP is assigned to the first worker thread, the second LP to the second worker thread, and so on). Tinkering with this option might reduce inter-thread and inter-kernel communication, depending on the model, and can affect performance. The default value is block.
deterministic-seed: Every time a user runs a model, a new pseudo-random seed is generated. This makes it possible to simulate slightly different scenarios across runs. If this behaviour is not wanted, namely if all runs should use the very same pseudo-random sequence of numbers, the option --deterministic-seed can be passed to the executable. This option is disabled by default.
verbose: Accepts info or debug. info prints a minimal set of statistics on screen. debug dumps a lot of text describing message interaction and runtime dynamics, which can be used to build execution traces.
stats: This option selects which information is collected at runtime and dumped in the output directory. global only generates a text file with average information about the performance of all worker threads and nodes. performance generates a log of the relevant performance metrics at each GVT computation. lp dumps detailed information for each LP in the run. all is the most verbose level, which dumps all of the above.
seed: It is possible to specify an integer which is used to seed the pseudo-random number generator provided by the ROOT-Sim library.
serial: Runs the simulation sequentially, on a single core of a single node. This can be used to perform initial debugging of the model, or to measure the parallel/distributed speedup.
sequential: An alias for --serial.
no-core-binding: By default, to reduce interference, ROOT-Sim pins worker threads to specific cores. This option disables this behaviour. It can be useful when running a sequential simulation, or when running multiple MPI ranks on a single node for debugging purposes.
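
The difference between the block and circular values of lps-distribution can be illustrated with a small C sketch. This is not ROOT-Sim's actual code: the function names block_owner and circular_owner are hypothetical, and the way leftovers are handed out (one extra LP to each of the first worker threads) is an assumption, since the description above only states that leftovers are accounted for.

```c
#include <assert.h>

/* Hypothetical helper: index of the worker thread owning LP `lp` under a
 * block policy. Assumption: when n_lps is not a multiple of n_threads, the
 * first (n_lps % n_threads) threads each receive one extra LP. */
unsigned int block_owner(unsigned int lp, unsigned int n_lps, unsigned int n_threads)
{
    assert(n_threads > 0 && lp < n_lps);

    unsigned int base = n_lps / n_threads;      /* LPs that every thread gets */
    unsigned int extra = n_lps % n_threads;     /* threads that get one more LP */
    unsigned int boundary = extra * (base + 1); /* first LP owned by a thread without an extra LP */

    if (lp < boundary)
        return lp / (base + 1);
    return extra + (lp - boundary) / base;
}

/* Hypothetical helper: round-robin assignment, LP 0 to thread 0,
 * LP 1 to thread 1, and so on, wrapping around. */
unsigned int circular_owner(unsigned int lp, unsigned int n_threads)
{
    assert(n_threads > 0);
    return lp % n_threads;
}
```

With 10 LPs and 4 worker threads, this sketch places LPs 0 to 2 on thread 0 and LPs 3 to 5 on thread 1 under block, while circular places LPs 0, 4 and 8 on thread 0. Neighbouring LPs therefore share a thread under block and are spread apart under circular, which is why the choice can change the amount of inter-thread communication.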

The minimum set of parameters to pass to a model is the number of worker threads and the number of logical processes, as follows:

./model --wt 1 --lp 16

This runs the simulation model with 16 LPs on a single worker thread. Note that this is fundamentally different from:

./model --sequential --lp 16

Indeed, the sequential scheduler is a completely different subsystem from ROOT-Sim's speculative engine. The latter command schedules events using a Calendar Queue, which is expected to be much better optimized for a sequential run. Comparing the two commands shows the performance overhead induced by the speculative infrastructure, if any.

Running Distributed

ROOT-Sim ultimately relies on MPI for distributed processing. If MPI support has been compiled in the library, a distributed run can be started as:

mpiexec -n 2 --hostfile hosts --map-by node ./model --wt 2 --lp 16

This tells the MPI runtime installed on the machine (the one used to compile ROOT-Sim) that we want to use two compute nodes (-n 2), which can be reached using the information in the hosts file. --map-by node ensures that an equal amount of computing resources is taken from both nodes.

Any option which is legal for MPI can be passed to mpiexec.

Debugging the core

If you want to debug the core library, you can attach gdb to any model compiled against ROOT-Sim. Please note that ROOT-Sim must have been configured with the --enable-debug flag, which disables optimizations and generates debug symbols.

Debugging on a single node, therefore, can be done with the following command:

gdb --args ./model --wt 2 --lp 2

setting the number of worker threads (--wt) and the number of LPs (--lp) to any suitable value.

Debugging distributed runs, for example to track down problems in distributed algorithms, is a bit trickier. The problem is that you cannot simply run gdb on mpiexec, as you would be debugging mpiexec rather than the model. Therefore, these are the usual steps for debugging the distributed version:

launch the model
get the PID of the model (e.g., by running pidof model in a shell) on each node where an instance of the distributed deployment is running
launch gdb on each node where an instance of the distributed deployment is running
attach to the process using attach PID
look at all debugger instances to see what's going on.

The problem here is that all these steps take time, so by the time the debugger is attached the bug has almost certainly already occurred. ROOT-Sim, when compiled using --enable-debug at configure time, checks for the environment variable WGDB (wait for gdb). If this variable is set to 1, the process enters an infinite loop right after entering main(), giving you time to attach and debug everything that happens from simulation startup onwards. Additionally, when the variable is set, the PID of the process is printed on screen, to speed up attaching the debugger. Therefore, to debug the distributed version of the simulation model, you can launch it as:

WGDB=1 mpiexec -n 2 ./model --wt 2 --lp 4

Please note that when you attach to the PIDs in gdb, they are spinning in an infinite loop implemented like this:

if((getenv("WGDB")) != NULL && *(getenv("WGDB")) == '1') {
    ...
    while (i == __wait) {
        sleep(5);
    }
}

Therefore, there is a high likelihood that the process will be inside the sleep() function when you attach the debugger.

To continue the execution, you have to issue the following commands in gdb:

(gdb) up
(gdb) set var __wait = 1
(gdb) continue

You might need to issue the up command more than once, until you reach the main function.
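
The interplay between the wait loop and the gdb commands above can be reproduced in a stand-alone sketch. This is not ROOT-Sim's code: the snippet shown earlier elides its declarations, so the flag name wait_flag and the helper names wgdb_requested and wait_for_debugger are hypothetical. The flag is declared volatile here so that a value written from the debugger is re-read at every iteration.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical flag, flipped by hand from the debugger.
 * volatile prevents the compiler from caching it across loop iterations. */
static volatile int wait_flag = 0;

/* Returns 1 if the WGDB environment variable is set and starts with '1'. */
int wgdb_requested(void)
{
    const char *value = getenv("WGDB");
    return value != NULL && value[0] == '1';
}

/* Hypothetical helper: call it first thing in main(). It returns at once
 * unless WGDB=1, in which case it prints the PID and spins until a
 * debugger sets wait_flag to a non-zero value. */
void wait_for_debugger(void)
{
    if (!wgdb_requested())
        return;

    printf("PID %d is waiting for gdb\n", (int)getpid());
    fflush(stdout);

    while (wait_flag == 0)
        sleep(5);
}
```

In this sketch the loop is left by issuing set var wait_flag = 1 followed by continue in gdb, mirroring the commands shown above for __wait.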
