We assume that the input data for tuning has already been generated.
Preparing input data with app-yoda2h5 and app-datadirtojson
Loading input data from YODA files is supported but can be time-consuming. We suggest converting the hierarchical directory structure to HDF5.
# Convert yoda files found in INPUTDIR to hdf5. Parameter information is
# extracted from files matching params.dat
mpirun -np 4 app-yoda2h5 INPUTDIR --pname params.dat -o inputdata.h5

# Convert all yoda files recursively found in DATADIR containing /REF data to json
app-datadirtojson DATADIR -o data.json
Loading data from YODA files in directories, as was done in Professor, is supported. This is true for the parameterisation inputs as well as for the reference data files.
Training approximations with app-build
Note that in contrast to Professor, approximations of the bin contents and bin errors are trained separately.
mpirun -np 4 python3 app-build inputdata.h5 --order 3,0 -o val_30.json
mpirun -np 4 python3 app-build inputdata.h5 --order 2,0 -o err_20.json --errs

# NERSC, i.e. slurm --- this example computes a rational approximation with slsqp
srun -n 1000 python3 app-build inputdata.h5 --order 4,1 --mode sip -o val_41.json
Envelope plotting
To see how the inputs to the approximation compare with the experimental data, we provide the script app-yodaenvelopes. It takes the extreme input values per bin and stores that information as two separate YODA files, suitable for plotting with e.g. rivet-mkhtml. This makes it easy to check quickly whether, for instance, the chosen parameter space is suitable for minimising a goodness-of-fit measure given the data.
If the inputs do not envelop the data, neither will the approximation. By default, the tuning stage discards bins where this is the case. If one does not filter these bins, the minimisation is in danger of yielding untrustworthy results, since it operates in a regime of extrapolation.
Envelope plot example.
Inspecting approximations with app-ls
A summary of the built approximations can be obtained with app-ls. The script also allows to produce a standard weight file later used as input for the optimisation.
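The weight file itself is plain text; a minimal sketch, assuming the Professor-style convention of one observable path and weight per line (the analysis path below is a placeholder, and the exact format accepted by the tools should be checked against the app-ls help output):

```
# One observable path and its weight per line (placeholder paths)
/ANALYSIS_NAME/d01-x01-y01  1.0
/ANALYSIS_NAME/d02-x01-y01  2.0
```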
Optimisation with app-tune2
This script loads approximations, experimentally observed data and a weight file to define an objective. The objective is minimised numerically. All output is written to a folder specified with -o.
The outputs vary depending on options and available packages. If YODA is found on the system,
a representation of the approximations evaluated at the best fit point is stored as a YODA file for convenient plotting with e.g. rivet-mkhtml.
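As a sketch, a minimal invocation might look as follows; the positional argument order (weight file, reference data, value approximation) and the -e flag are assumptions carried over from the app-nest example shown later and should be checked against the app-tune2 help output:

```
# Hypothetical call: weight file, reference data, value approximation,
# error approximation (-e), output folder (-o)
app-tune2 allweights data.json val_30.json -e err_20.json -o tuneout
```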
A typical output looks like this:
The base file name of all outputs is automatically generated from the minimisation options.
It contains information about the objective function at the found minimum and of course the best fit point. Further, a comment is added if e.g. a parameter was fixed or ended up at the domain boundary.
Selecting a minimisation algorithm
The default minimiser is a Truncated Newton method (scipy.optimize "tnc").
The command line option for choosing an algorithm is -a; the following arguments are valid:
tnc (default)
lbfgsb (scipy.optimize "lbfgsb")
ncg (scipy.optimize "ncg")
trust (scipy.optimize "trust-exact")
ncg and trust do not have a concept of domain limits; however, both use second-order information.
Note that, by default, a check is currently performed to test whether the minimisation ended up in a saddle point. If that is detected, the minimisation is restarted up to 10 times. To override this behaviour, use the option --no-check.
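For example, the following sketch selects L-BFGS-B and disables the saddle-point check; only the -a and --no-check flags are documented above, while the positional file names are placeholders following the app-nest example:

```
# Use L-BFGS-B and skip the saddle-point check (hypothetical file names)
app-tune2 allweights data.json val_30.json -a lbfgsb --no-check -o tuneout
```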
Multistart options
The strategy to select a good start point for the minimisers is to evaluate the objective for a randomly selected set of points. The size of this survey can be adjusted with the command line option -s.
By default, the minimisation runs once. To increase the number of restarts (that is separate minimisations each starting from a different start point) use the command line option -r.
For each restart, a new random survey is performed to select a start point. The final result of the optimisation is the best result from all restarts.
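Combining the two options might look as follows; the positional file names are placeholders, while -s and -r are the survey-size and restart options described above:

```
# Survey 1000 random points per start and perform 10 independent restarts
app-tune2 allweights data.json val_30.json -s 1000 -r 10 -o tuneout
```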
Limits and fixed parameters
By default, the domain of the approximation is used to set bounds for the minimisation (except for ncg and trust which do not support this to begin with). Manual limits can be supplied via a simple text file.
The command line option is -l.
In the following example, the parameter "PARAM_A" is bound to values between 3 and 5.
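One way to create such a file from the shell (the file name limits.txt is a placeholder; the line format follows the parameter limit file shown in the app-nest section):

```shell
# Write a limits file: a name followed by two numbers sets bounds
cat > limits.txt <<'EOF'
# comments and empty lines are ignored
PARAM_A 3 5
EOF
```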
Specifying limits overrides the default bounds --- but only for the parameters specified in the file, i.e. the domain bounds for all other dimensions stay at their defaults.
To fix individual parameters, the same option -l can be used. Fixing parameters and setting manual limits can be mixed.
Plot options
By default, the correlation of the parameters is inferred from the inverse of the Hessian at the minimum. A colour map plot is stored in the output folder.
To produce profiles, that is 1D projections of the objective function onto the parameter axes, the command line switch -p can be used.
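A sketch of such a call (the positional file names are placeholders following the app-nest example; only the -p switch is documented here):

```
# Also produce 1D profile plots of the objective
app-tune2 allweights data.json val_30.json -p -o tuneout
```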
Scan of the objective in one direction ("pickQuarkNorm"); all other parameters are fixed at their best fit values.
Optimisation with app-nest
Instead of numerical minimisation, we can use MultiNest (https://github.com/JohannesBuchner/PyMultiNest) to sample the domain and use Bayesian inference to learn about best fit points. MultiNest can be pip-installed but requires libmultinest.so to be built. The documentation on their web page is excellent. All MultiNest options are available as options for app-nest. It further supports setting limits and fixing parameters in the same way as app-tune2.
# Parameter limit file, comments and empty lines are ignored
# A string followed by two numbers is interpreted as bounds
PARAM_A 3 5
# A string followed by a single number is interpreted as a fixed value
PARAM_B 3.145
app-nest allweights data.json val_30.json -e err_30.json -o nestout
# If mpi4py is available and libmultinest is built with MPI support
mpirun -np 4 app-nest allweights data.json val_30.json -e err_30.json -o nestout