This vignette serves as a reminder about a possible scheduling strategy for the computing jobs that are needed for the marker effect estimation.
Currently all the computing runs are listed in a single file called gsRuns.txt. Then the list of runs is sorted according to the estimated run time from the previous evaluation and the list is split into smaller batches of runs. These batches are defined an stored in files that are named gsSortedRuns.txt.{i} where i is a number between 1 and N and N can be specified as an input to the script that does the sorting and the split. The batches of runs are then run in parallel using trivial parallelization. This parallelization is implemented using different screens on the same machine.
An alternative is to omit the creation of the batches of runs files.
The runs in gsRuns.txt are sorted according to the estimated run time
from the previous evaluation and are written to one file
gsSortedRuns.txt. Then we use a larger number of screens for the
parallelisation. The number of screens should be about equal to the
number of cores that a server has. Within each screen we do a loop over
the lines of gsSortedRuns.txt and start the individual runs one after
each other. After starting a given run, we write a file to a
subdirectory called started
which notifies this run to be
started. Before starting a different new run, we first check, whether it
has not already been started.
Latest Changes: 2022-06-27 14:09:46 (pvr)