app-build is used to train polynomial or rational approximations. The only mandatory input is the data to train on. This can be either the top level of a hierarchical directory structure full of YODA files or a suitably formatted HDF5 file.
Conversion to HDF5 can be done either with app-yoda2h5 or with the --convert option of app-build.
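As a minimal sketch of the two input modes (the directory and file names are placeholders):
# Train from the top level of a directory tree full of YODA files
app-build inputdata -o approx.json
# Train from a suitably formatted HDF5 file instead
app-build inputdata.h5 -o approx.json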
Command line options
-o output.json
All trained approximations are written as a dictionary to JSON. The output file name can be set with -o.
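Since the output is plain JSON, it can be inspected with any JSON-aware tool; for instance (the file name is a placeholder and Python is assumed to be available):
# Pretty-print the dictionary of trained approximations
python -m json.tool app_41.json | less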
--order m,n
By default, polynomial approximations of order 3 are trained. To change the desired orders, use --order m,n, where m,n is a comma-separated string and m and n are expected to be integers. m determines the polynomial order of the numerator and n the order of the denominator. A polynomial of order 0 is a constant.
# These are incomplete examples that show the syntax of --order
# Pure polynomials of order 2
app-build inputdata --order 2,0 -o app_20.json
# Rational approximations --- numerator is order 4, denominator is order 1
app-build inputdata --order 4,1 -o app_41.json
--errs
By default, the bin values (means of MC simulation) are used to train approximations. To train on the bins' statistical uncertainties, use the switch --errs.
# Typical example, first we train the bin values
app-build inputdata --order 4,1 -o vals_41.json
# then we train on the uncertainties
app-build inputdata --order 2,0 -o errs_20.json --errs
-w
If only a subset of histograms is to be trained on, a text file with histogram names can be supplied with -w.
Lines starting with a comment character (#) are ignored. Only the first whitespace-separated entry of each line, the histogram name, is read.
The syntax is:
# Example format for a file that is understood when using -w
/ATLAS_2011_I919017/d01-x01-y01 1.0 # 33 bins
/ATLAS_2011_I919017/d01-x01-y02 1.0 # 33 bins
/ATLAS_2011_I919017/d01-x01-y03 1.0 # 33 bins
#/ATLAS_2011_I919017/d01-x01-y04 1.0 # 32 bins
A pragmatic set of steps is to run
app-build inputdata --order 0,0 -o dummy.json
app-ls dummy.json -w > mytextfile
then edit mytextfile to keep only the desired histograms and pass it back to app-build with -w.
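Assuming -w takes the edited file as its argument when training (this invocation is a sketch, not taken verbatim from the tool's help), a subsequent run could look like:
# Train only the histograms kept in mytextfile
app-build inputdata --order 4,1 -o vals_41.json -w mytextfile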
--mode la or sip
This switches between the linear algebra (la) and constrained optimisation (sip) approaches. It only affects rational approximations. The default is constrained optimisation.
The linear algebra approach is not a good idea in the presence of noisy input data due to the occurrence of spurious poles. We automatically check for the presence of such poles and alert the user; in that case --mode sip is the better option.
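For illustration, the two modes differ only in the flag that is passed (the output names are placeholders):
# Constrained optimisation, the default
app-build inputdata --order 4,1 --mode sip -o vals_sip_41.json
# Linear algebra approach, only advisable for noise-free input data
app-build inputdata --order 4,1 --mode la -o vals_la_41.json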
--convert inputdata.h5
This option converts the data read in from a hierarchical directory structure full of YODA files to HDF5. The file name for storage is the argument provided to --convert.
This is not mandatory, but it is convenient when the parsing of YODA files is time consuming, as the conversion only has to be done once. It is particularly beneficial to read from HDF5 when using mpirun.
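A possible workflow, assuming --convert can be combined with a regular training run (the file names are placeholders):
# First run: parse the YODA directory tree and additionally store it as HDF5
app-build inputdata --order 4,1 -o vals_41.json --convert inputdata.h5
# Subsequent runs can read the HDF5 file directly, skipping the YODA parsing
app-build inputdata.h5 --order 2,0 -o errs_20.json --errs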
MPI parallelism
The task of training approximations is embarrassingly parallel and hence very accommodating to distributed computations with MPI. All one needs to do is to prepend an mpirun or srun instruction to app-build.
# On a laptop, run parallel on 4 ranks
mpirun -np 4 app-build inputdata --order 4,1 -o vals_sip_41.json
# On a typical institute node run parallel on 64 ranks
mpirun -np 64 app-build inputdata --order 4,1 -o vals_sip_41.json
# At an HPC facility such as NERSC run parallel on 1024 ranks
srun -n 1024 -c 2 --cpu_bind=cores app-build inputdata --order 4,1 -o vals_sip_41.json