5 minute tuning tutorial

We assume that the input data for tuning has already been generated.

Preparing input data with app-yoda2h5 and app-datadirtojson

Loading input data from YODA files is supported but can be time-consuming. We suggest converting the hierarchical directory structure to HDF5 first.

# Convert yoda files found in INPUTDIR to hdf5. Parameter information is
# extracted from files matching params.dat
mpirun -np 4 app-yoda2h5 INPUTDIR --pname params.dat -o inputdata.h5

# Convert all yoda files recursively found in DATADIR containing /REF data to json
app-datadirtojson DATADIR -o data.json

Loading data directly from YODA files in directories, as was done in Professor, is also supported. This holds for the parameterisation inputs as well as the reference data files.

Training approximations with app-build

Note that in contrast to Professor, approximations of the bin contents and bin errors are trained separately.

mpirun -np 4 python3 app-build inputdata.h5  --order 3,0 -o val_30.json
mpirun -np 4 python3 app-build inputdata.h5  --order 2,0 -o err_20.json --errs

# NERSC, i.e. slurm --- this example computes rational approximation with slsqp
srun -n 1000 python3 app-build inputdata.h5  --order 4,1 --mode sip -o val_41.json
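
As a rough guide to how many input points a given --order requires, the following sketch counts the free coefficients of the approximation. It assumes that the two numbers passed to --order are the polynomial orders of numerator and denominator, and it uses the standard count of monomials up to total degree m in d variables, C(d+m, m). The function names and the number of parameters are illustrative and not part of app-build. Whether app-build enforces exactly this minimum is not stated here.

```python
from math import comb


def n_coeffs(dim: int, order: int) -> int:
    """Number of monomials of total degree <= order in dim variables."""
    return comb(dim + order, order)


def n_coeffs_rational(dim: int, m: int, n: int) -> int:
    """Coefficients of numerator (order m) plus denominator (order n).

    For n == 0 the denominator is a constant and adds no free coefficient.
    """
    if n == 0:
        return n_coeffs(dim, m)
    return n_coeffs(dim, m) + n_coeffs(dim, n)


if __name__ == "__main__":
    dim = 5  # hypothetical number of tuning parameters
    for m, n in [(3, 0), (2, 0), (4, 1)]:
        print(f"order {m},{n}: {n_coeffs_rational(dim, m, n)} coefficients")
```

The number of input points should be at least as large as the coefficient count, otherwise the fit is underdetermined.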

Envelope plotting

To see how the inputs to the approximation compare with the experimental data, we provide the script app-yodaenvelopes. It takes the extreme input values per bin and stores them as two separate YODA files, suitable for plotting with e.g. rivet-mkhtml. This makes it quick to check whether, for instance, the chosen parameter space is suitable for minimising a goodness-of-fit measure given the data.

If the inputs do not envelope the data, neither will the approximation. By default, the tuning stage discards bins where this is the case. If these bins are not filtered out, the minimisation risks producing untrustworthy results because it operates in a regime of extrapolation.
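
The envelope criterion can be sketched in a few lines of NumPy. This is an illustration of the idea only, not the code used by app-yodaenvelopes or the tuning stage. The array layout (one row per input run, one column per bin) and the function name are assumptions, and data uncertainties are ignored for simplicity.

```python
import numpy as np


def envelope_mask(inputs: np.ndarray, data: np.ndarray) -> np.ndarray:
    """Return True for every bin whose data value lies inside the input envelope.

    inputs: shape (n_runs, n_bins), bin values of all input runs
    data:   shape (n_bins,), measured bin values
    """
    lower = inputs.min(axis=0)  # smallest input value per bin
    upper = inputs.max(axis=0)  # largest input value per bin
    return (data >= lower) & (data <= upper)


# Three hypothetical runs with three bins each
inputs = np.array([[1.0, 10.0, 5.0],
                   [2.0, 12.0, 6.0],
                   [3.0, 11.0, 7.0]])
data = np.array([2.5, 13.0, 5.0])

mask = envelope_mask(inputs, data)
print(mask)  # the second bin is not enveloped and would be discarded
```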

Envelope plot example.

Inspecting approximations with app-ls

A summary of the built approximations can be obtained with app-ls. The script can also produce a standard weight file, which is later used as input for the optimisation.

Optimisation with app-tune2

This script loads approximations, experimentally observed data and a weight file to define an objective, which is then minimised numerically. All output is written to a folder specified with -o. The outputs vary depending on the options and the available packages. If YODA is found on the system, the approximations evaluated at the best fit point are stored as a YODA file for convenient plotting with e.g. rivet-mkhtml.
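
To illustrate how approximations, data and weights can combine into an objective, here is a minimal sketch of a weighted chi-squared. The exact form used by app-tune2 is not spelled out here (for example, whether the approximated bin errors enter the denominator), so the formula and the function name are assumptions.

```python
import numpy as np


def weighted_chi2(pred, data, data_err, weights) -> float:
    """Sum over bins of w_b * (f_b - d_b)^2 / sigma_b^2."""
    pred, data, data_err, weights = (
        np.asarray(a, dtype=float) for a in (pred, data, data_err, weights)
    )
    return float(np.sum(weights * (pred - data) ** 2 / data_err ** 2))


# Hypothetical values for three bins
pred = [1.0, 2.0, 3.0]      # approximation evaluated at some parameter point
data = [1.0, 3.0, 5.0]      # measured values
data_err = [1.0, 1.0, 2.0]  # measurement uncertainties

print(weighted_chi2(pred, data, data_err, [1, 1, 1]))  # all bins count equally
print(weighted_chi2(pred, data, data_err, [1, 1, 0]))  # weight 0 switches the last bin off
```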

A typical output looks like this:

The base file name of all outputs is automatically generated from the minimisation options.

The output contains information about the objective function at the minimum found and the best fit point. A comment is also added if, for example, a parameter was fixed or ended up at the domain boundary.

Selecting a minimisation algorithm

The default minimiser is a Truncated Newton method (scipy.optimize "tnc"). The command line option for choosing an algorithm is -a. The following arguments are valid:

  • tnc (default)

  • lbfgsb (scipy.optimize "L-BFGS-B")

  • ncg (scipy.optimize "Newton-CG")

  • trust (scipy.optimize "trust-exact")

ncg and trust have no concept of domain limits. Both, however, use second-order information.
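
The difference between the bounded and unbounded methods can be seen by calling scipy.optimize.minimize directly. The sketch below uses a toy quadratic and a hypothetical domain; it does not go through app-tune2, and the mapping of the -a arguments onto these SciPy method names is assumed from the list above.

```python
import numpy as np
from scipy.optimize import minimize


def objective(x):
    # Toy quadratic with its unconstrained minimum at (3, -1)
    return (x[0] - 3.0) ** 2 + (x[1] + 1.0) ** 2


def gradient(x):
    return np.array([2.0 * (x[0] - 3.0), 2.0 * (x[1] + 1.0)])


def hessian(x):
    return 2.0 * np.eye(2)


start = np.array([1.0, 1.0])
domain = [(0.0, 2.0), (0.0, 2.0)]  # hypothetical parameter domain

# TNC and L-BFGS-B respect the bounds and stop at the domain boundary
bounded = {}
for method in ("TNC", "L-BFGS-B"):
    bounded[method] = minimize(objective, start, jac=gradient,
                               method=method, bounds=domain)

# Newton-CG and trust-exact take no bounds but use the Hessian
unbounded = {}
for method in ("Newton-CG", "trust-exact"):
    unbounded[method] = minimize(objective, start, jac=gradient,
                                 hess=hessian, method=method)

for name, res in {**bounded, **unbounded}.items():
    print(name, res.x)
```

The bounded methods end up at the corner of the domain closest to the true minimum, while the second-order methods leave the domain to reach it.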

Note that, by default, a check is currently performed to test whether the minimisation ended up at a saddle point. If one is detected, the minimisation is restarted, up to 10 times. To override this behaviour, use the option --no-check.
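
One common way to detect a saddle point is to look at the signs of the Hessian eigenvalues at the stationary point. The sketch below shows that idea. How app-tune2 implements its check is not described here, so this is an assumption about the general technique, and the function name and tolerance are illustrative.

```python
import numpy as np


def classify_stationary_point(hessian, tol: float = 1e-8) -> str:
    """Classify a stationary point from the eigenvalues of its symmetric Hessian."""
    eigenvalues = np.linalg.eigvalsh(np.asarray(hessian, dtype=float))
    if np.all(eigenvalues > tol):
        return "minimum"
    if np.all(eigenvalues < -tol):
        return "maximum"
    if np.any(eigenvalues > tol) and np.any(eigenvalues < -tol):
        return "saddle"
    return "degenerate"  # at least one eigenvalue is numerically zero


# f(x, y) = x^2 - y^2 has a saddle point at the origin
print(classify_stationary_point([[2.0, 0.0], [0.0, -2.0]]))
# f(x, y) = x^2 + y^2 has a minimum at the origin
print(classify_stationary_point([[2.0, 0.0], [0.0, 2.0]]))
```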

Multistart options

The strategy to select a good start point for the minimisers is to evaluate the objective for a randomly selected set of points. The size of this survey can be adjusted with the command line option -s.

By default, the minimisation runs once. To increase the number of restarts (that is, separate minimisations, each starting from a different start point), use the command line option -r.

For each restart, a new random survey is performed to select a start point. The final result of the optimisation is the best result from all restarts.
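
The survey and restart strategy can be sketched as follows. The function multistart and its arguments n_survey and n_restarts are illustrative stand-ins for the roles of -s and -r; the objective is a toy function with two local minima, and the choice of L-BFGS-B as the local minimiser is an assumption made for the example.

```python
import numpy as np
from scipy.optimize import minimize


def objective(x):
    # Toy 1D objective with a global minimum near x = -1.30
    # and a shallower local minimum near x = 1.13
    return float(x[0] ** 4 - 3.0 * x[0] ** 2 + x[0])


def multistart(fun, bounds, n_survey=20, n_restarts=3, seed=0):
    """Run n_restarts minimisations, each started from the best of a random survey."""
    rng = np.random.default_rng(seed)
    lower = np.array([b[0] for b in bounds])
    upper = np.array([b[1] for b in bounds])
    best = None
    for _ in range(n_restarts):
        # New random survey for every restart
        survey = rng.uniform(lower, upper, size=(n_survey, len(bounds)))
        start = min(survey, key=fun)  # survey point with the lowest objective
        result = minimize(fun, start, method="L-BFGS-B", bounds=bounds)
        # Keep the best result over all restarts
        if best is None or result.fun < best.fun:
            best = result
    return best


best = multistart(objective, [(-2.0, 2.0)])
print(best.x, best.fun)
```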

Limits and fixed parameters

By default, the domain of the approximation is used to set bounds for the minimisation (except for ncg and trust which do not support this to begin with). Manual limits can be supplied via a simple text file. The command line option is -l.

In the following example, the parameter "PARAM_A" is bound to values between 3 and 5.

Specifying limits overrides the default bounds, but only for the parameters listed in the file. That is, the domain bounds for all other dimensions stay at their defaults.

To fix individual parameters, the same option -l can be used. Fixing parameters and setting manual limits can be mixed.
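
The override logic described above can be sketched with plain dictionaries. The format of the limits file is not reproduced here, so the sketch does not parse one. The function merge_limits, the parameter PARAM_B and all numerical values other than the 3 to 5 range for PARAM_A are hypothetical, and representing a fixed parameter as a degenerate interval is a choice made for this example only.

```python
def merge_limits(default_bounds, manual_limits=None, fixed=None):
    """Combine default domain bounds with manual limits and fixed parameters.

    default_bounds: {name: (low, high)} taken from the approximation domain
    manual_limits:  {name: (low, high)} overriding the defaults
    fixed:          {name: value} parameters held constant
    """
    bounds = dict(default_bounds)
    for name, (low, high) in (manual_limits or {}).items():
        if name not in bounds:
            raise KeyError(f"unknown parameter: {name}")
        bounds[name] = (low, high)
    for name, value in (fixed or {}).items():
        if name not in bounds:
            raise KeyError(f"unknown parameter: {name}")
        bounds[name] = (value, value)  # degenerate interval marks a fixed parameter
    return bounds


defaults = {"PARAM_A": (0.0, 10.0), "PARAM_B": (-1.0, 1.0), "PARAM_C": (0.0, 2.0)}
merged = merge_limits(defaults,
                      manual_limits={"PARAM_A": (3.0, 5.0)},
                      fixed={"PARAM_C": 1.5})
print(merged)  # PARAM_B keeps its default bounds
```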

Plot options

By default, the correlation of the parameters is inferred from the inverse of the Hessian at the minimum. A colour map plot of the correlation matrix is stored in the output folder.

Parameter correlations

Example visualisation of correlation matrix.
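
The step from the Hessian to a correlation matrix is short: invert the Hessian to obtain a matrix proportional to the covariance, then normalise by the square roots of its diagonal. Any overall scale factor cancels in the normalisation. The sketch below uses a hypothetical 2x2 Hessian and is not the plotting code of app-tune2.

```python
import numpy as np


def correlation_from_hessian(hessian) -> np.ndarray:
    """Correlation matrix from the inverse of the Hessian at the minimum."""
    covariance = np.linalg.inv(np.asarray(hessian, dtype=float))
    sigma = np.sqrt(np.diag(covariance))
    return covariance / np.outer(sigma, sigma)


hessian = np.array([[2.0, 1.0],
                    [1.0, 2.0]])  # hypothetical Hessian at the best fit point
corr = correlation_from_hessian(hessian)
print(corr)
```

A positive off-diagonal Hessian entry leads to a negative correlation between the two parameters, as the example shows.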

Histogram plotting

If YODA is available, the predictions at the minimum found are written to a YODA file. The latter can be plotted using e.g. rivet-mkhtml (see https://rivet.hepforge.org/trac/wiki/RivetHistogramming).

Profiles of objective

To produce profiles, that is 1D projections of the objective function onto the parameter axes, the command line switch -p can be used.

Scan of the objective in one direction ("pickQuarkNorm"). All other parameters are fixed at their best fit values.
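
Such a scan amounts to varying one coordinate on a grid while holding the others at the best fit point. The following sketch shows this with a toy objective. The function profile_1d, the grid size and all numerical values are illustrative and do not reflect the implementation behind -p.

```python
import numpy as np


def profile_1d(fun, best_fit, index, low, high, n_points=11):
    """Scan fun along parameter `index`, keeping all others at best_fit."""
    best_fit = np.asarray(best_fit, dtype=float)
    grid = np.linspace(low, high, n_points)
    values = np.empty(n_points)
    for i, value in enumerate(grid):
        point = best_fit.copy()
        point[index] = value
        values[i] = fun(point)
    return grid, values


def objective(x):
    # Toy objective with its minimum at (1, -0.5)
    return (x[0] - 1.0) ** 2 + 2.0 * (x[1] + 0.5) ** 2


grid, values = profile_1d(objective, best_fit=[1.0, -0.5], index=0, low=0.0, high=2.0)
for g, v in zip(grid, values):
    print(f"{g:.1f}  {v:.3f}")
```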

Optimisation with app-nest

Instead of numerical minimisation, we can use MultiNest (https://github.com/JohannesBuchner/PyMultiNest) to sample the domain and use Bayesian inference to learn about best fit points. MultiNest can be pip installed but requires libmultinest.so to be built. The documentation on the project's web page is excellent. All options for MultiNest are available as options for app-nest. It also supports setting limits and fixing parameters in the same way as app-tune2.

Cornerplot made from output of app-nest.
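
Nested sampling needs two ingredients: a map from the unit hypercube to the parameter domain and a log-likelihood. The sketch below shows both for a uniform prior and a Gaussian likelihood. It deliberately does not call PyMultiNest, and how app-nest sets these up internally is not shown here, so the function names, the toy model and the likelihood form are assumptions.

```python
import numpy as np


def prior_transform(cube, lower, upper) -> np.ndarray:
    """Map a point of the unit hypercube to the domain (uniform prior)."""
    cube = np.asarray(cube, dtype=float)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    return lower + cube * (upper - lower)


def log_likelihood(params, predict, data, data_err) -> float:
    """Gaussian log-likelihood up to a constant: -chi^2 / 2."""
    residuals = (predict(params) - np.asarray(data)) / np.asarray(data_err)
    return -0.5 * float(np.sum(residuals ** 2))


def predict(params):
    # Hypothetical two-bin model standing in for the trained approximations
    return np.array([params[0], params[0] + params[1]])


params = prior_transform([0.5, 0.25], lower=[3.0, 0.0], upper=[5.0, 1.0])
print(params)
print(log_likelihood(params, predict, data=[4.0, 5.25], data_err=[1.0, 1.0]))
```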
