NAME
       ompi-restart, orte-restart - Restart a previously checkpointed parallel job using the Open PAL Checkpoint/Restart Service (CRS)

       NOTE: ompi-restart and orte-restart are exact synonyms for each other. Using either name will result in identical behavior.

SYNOPSIS
       ompi-restart [ options ] <GLOBAL SNAPSHOT HANDLE>

OPTIONS
       ompi-restart will attempt to restart a previously checkpointed parallel job from the global snapshot handle reference returned by ompi-checkpoint.

       <GLOBAL SNAPSHOT HANDLE>
              The global snapshot handle reference returned by ompi-checkpoint, used to restart the job. This must be the last argument to this command.

       -h | --help
              Display help for this command.

       -p | --preload
              Preload the checkpoint files on the remote systems before restarting the application. Disabled by default.

       --fork
              Fork off a new process, which is the restarted process. By default, the restarted process will replace ompi-restart.

       -s | --seq
              The sequence number of the checkpoint to restart from. By default, the most recent sequence number is used (specified by -1).

       -hostfile | --hostfile
              The hostfile from which to restart the application. Useful in unscheduled environments. (Same behavior as the --machinefile option.)

       -machinefile | --machinefile
              The machinefile from which to restart the application. Useful in unscheduled environments. (Same behavior as the --hostfile option.)

       -v | --verbose
              Enable verbose output for debugging.

       -gmca | --gmca <key> <value>
              Pass global MCA parameters that are applicable to all contexts. <key> is the parameter name; <value> is the parameter value.

       -mca | --mca <key> <value>
              Send arguments to various MCA modules.

DESCRIPTION
       ompi-restart can be invoked multiple times, provided the invocations do not overlap. This allows the user to restart a previously running parallel job.
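
A typical invocation can be sketched as follows. This is only an illustration of how the documented options combine: the snapshot handle name, the hostfile path, and the sequence number are hypothetical placeholders, and the helper function build_restart_cmd is not part of Open MPI. The script prints the assembled command and runs it only when RUN_RESTART=1 is set.

```shell
#!/bin/sh
# Sketch: assemble an ompi-restart command line from documented options.
# All three values below are hypothetical placeholders, not real defaults.
SNAPSHOT="ompi_global_snapshot_1234.ckpt"  # handle reported at checkpoint time (assumed name)
HOSTFILE="my_hostfile"                     # hostfile to restart on (assumed path)
SEQ=2                                      # checkpoint sequence number to restart from

# Hypothetical helper: options come first, and the global snapshot
# handle is placed last, as the OPTIONS section requires.
build_restart_cmd() {
    printf '%s' "ompi-restart -v -s $1 --hostfile $2 $3"
}

CMD=$(build_restart_cmd "$SEQ" "$HOSTFILE" "$SNAPSHOT")
echo "$CMD"

# Execute only on explicit request, so the sketch is safe to run
# on a machine without Open MPI installed.
if [ "${RUN_RESTART:-0}" = "1" ]; then
    $CMD
fi
```

Omitting -s restarts from the most recent sequence number, and adding --fork keeps ompi-restart alive as the parent of the restarted process instead of being replaced by it.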

SEE ALSO
       orte-ps(1), orte-clean(1), ompi-checkpoint(1), opal-checkpoint(1), opal-restart(1), opal_crs(7)

OMPI-RESTART(1)