Differences to Professor

Code base

Professor2 uses a compiled C++ library for training and evaluation, with Cython bindings to integrate with Python. Apprentice is written entirely in Python 3 and relies heavily on NumPy arrays for its computations, thus benefiting from the vectorisation and thread parallelism of a high-level maths library. We further use Numba's just-in-time compilation to accelerate pure Python functions where it is reasonable.
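As a rough illustration of the pattern (not Apprentice's actual code), a pure-Python loop can be JIT-compiled with Numba; the helper name and array shapes below are purely illustrative:

```python
# Minimal sketch: a pure-Python double loop accelerated with Numba's JIT.
# The function and its inputs are illustrative, not part of Apprentice.
import numpy as np
from numba import njit

@njit(cache=True)
def eval_monomials(x, exponents):
    """Evaluate all monomials prod_d x[d]**e[d] for a single parameter point x."""
    out = np.ones(exponents.shape[0])
    for i in range(exponents.shape[0]):
        for d in range(x.shape[0]):
            out[i] *= x[d] ** exponents[i, d]
    return out

x = np.array([0.5, 1.2, -0.3])
exponents = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 1], [2, 0, 0]])
print(eval_monomials(x, exponents))
```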

Parameterisation

Input

The sole input to the parameterisation step in prof2 is a hierarchical directory structure containing YODA files and text files with the parameter points. Apprentice supports that too, but in addition allows that structure to be converted to HDF5 and read back from HDF5.

Most importantly, this reduces the time needed to read the input data, as parsing hundreds of YODA files, each with hundreds of analysis objects, can be quite time consuming. Reading from HDF5, on the other hand, is very fast and allows for parallel access.
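A minimal sketch of what reading such a pre-converted file could look like with h5py; the file name and dataset names ("params", "values", "errors") are assumptions for illustration, not Apprentice's documented layout:

```python
# Hedged sketch of reading pre-converted input from HDF5 with h5py.
# The dataset names below are illustrative assumptions.
import h5py

with h5py.File("tuningdata.h5", "r") as f:
    params = f["params"][:]   # one row of parameter values per MC run
    values = f["values"][:]   # bin values, one row per bin
    errors = f["errors"][:]   # bin uncertainties, same shape as values

print(params.shape, values.shape, errors.shape)
```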

Output

Apprentice stores the computed coefficients of the polynomial and rational approximations as JSON. This makes it easier to use the approximations in other languages such as C++ and Julia. Further, the bin values and the bin uncertainties are now trained separately, allowing more flexibility in the type of approximation used and in the kind of uncertainties that are interpolated.

This makes it possible to capture, for example, PDF and scale-variation uncertainties.
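A hedged sketch of consuming such a JSON file from Python; the key names used here are illustrative assumptions, not the exact schema written by Apprentice:

```python
# Illustrative only: load JSON-stored coefficients and inspect them.
# The keys "pcoeff" (numerator) etc. are assumed names for this sketch.
import json

with open("approximation.json") as f:
    approx = json.load(f)

for binid, data in approx.items():
    print(binid, "first numerator coefficients:", data["pcoeff"][:3])
```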

Rational approximations

Apprentice goes beyond purely polynomial approximations by also training multivariate rational functions. These are particularly well suited when the underlying functional form has rational character, such as the 1/m dependence of propagators or parts of the fragmentation and underlying-event models.

In the absence of statistical noise, these can be fitted with linear-algebra techniques similar to those used for polynomials. Noisy input data, however, tends to introduce spurious poles, i.e. the denominator polynomial can have roots inside the interpolation domain. For these cases, Apprentice provides a training method based on the SLSQP algorithm, a constrained optimisation that ensures the denominator always stays positive.
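The idea can be sketched in one dimension as follows: an illustrative least-squares fit of p(x)/q(x) with SciPy's SLSQP and a positivity constraint on the denominator at the training points. This is not Apprentice's implementation; the degrees, the normalisation q(0)=1 and the threshold 1e-6 are arbitrary choices for the sketch.

```python
# Hedged sketch: fit a rational approximation p(x)/q(x) to noisy data with
# SLSQP, constraining q(x_i) > 0 on the training points to avoid spurious poles.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
x = np.linspace(0.1, 2.0, 40)
y = 1.0 / (x + 0.2) + rng.normal(scale=0.02, size=x.size)  # noisy 1/m-like data

def split(c):
    a, b = c[:3], c[3:]                  # numerator and denominator coefficients
    p = a[0] + a[1] * x + a[2] * x**2
    q = 1.0 + b[0] * x                   # denominator normalised so that q(0) = 1
    return p, q

def residual(c):
    p, q = split(c)
    return np.sum((p / q - y) ** 2)

constraints = {"type": "ineq", "fun": lambda c: split(c)[1] - 1e-6}
res = minimize(residual, x0=np.array([1.0, 0.0, 0.0, 0.0]),
               method="SLSQP", constraints=constraints)
print(res.x, res.fun)
```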

Parallelism

Professor used simple thread-based parallelism for the training phase, while Apprentice supports MPI, allowing the computation to be distributed across many network-connected machines, which is desirable for more complex problems.
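The distribution pattern can be sketched with mpi4py roughly as follows; fit_bin is a placeholder for the per-bin training, not an Apprentice API:

```python
# Hedged sketch of distributing per-bin training across MPI ranks with mpi4py.
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

bins = list(range(100))                  # placeholder list of bins to train
my_bins = bins[rank::size]               # simple round-robin work split

def fit_bin(b):
    return (b, "coefficients-for-bin-%d" % b)   # placeholder for the actual fit

local = [fit_bin(b) for b in my_bins]
gathered = comm.gather(local, root=0)    # collect all partial results on rank 0
if rank == 0:
    results = dict(r for chunk in gathered for r in chunk)
    print(len(results), "bins trained")
```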

Minimisation

Apprentice lets the user choose between four minimisation algorithms available through SciPy:

  • TNC (default)

  • LBFGSB

  • NCG

  • trust

All algorithms use exact gradients. NCG and trust additionally use second-order information (Hessians), but they are unaware of the concept of parameter limits. We chose TNC as the default mainly for its speed and because LBFGSB tends to end up in saddle points. Since we have access to the exact Hessian of the tuning objective, we can easily check for saddle points and, if one is found, trigger a restart of the minimisation from a different starting point.
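A simplified sketch of this logic on a toy objective (illustrative only, not Apprentice's code): minimise with SciPy's TNC using the exact gradient and parameter bounds, then inspect the eigenvalues of the exact Hessian and restart if a saddle point is detected.

```python
# Hedged sketch: bounded TNC minimisation with exact gradient, followed by a
# saddle-point check on the exact Hessian and a restart from a new random start.
import numpy as np
from scipy.optimize import minimize

def objective(p):   return (p[0]**2 - 1)**2 + p[1]**2       # toy stand-in objective
def gradient(p):    return np.array([4*p[0]*(p[0]**2 - 1), 2*p[1]])
def hessian(p):     return np.array([[12*p[0]**2 - 4, 0.0], [0.0, 2.0]])

bounds = [(-2, 2), (-2, 2)]
rng = np.random.default_rng(0)

for attempt in range(10):
    start = rng.uniform(-2, 2, size=2)
    res = minimize(objective, start, jac=gradient, method="TNC", bounds=bounds)
    if np.all(np.linalg.eigvalsh(hessian(res.x)) > 0):
        break                           # all Hessian eigenvalues positive: a proper minimum
    # otherwise: saddle point (or maximum), restart from a different point

print(res.x, res.fun)
```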

Professor2 used Minuit and estimated the gradients numerically with finite-difference methods.

Minimising the tuning objective is particularly prone to local minima and in general may require several restarts to get close to global optimality. The objective function in Apprentice evaluates about two orders of magnitude faster, which makes this drawback much more bearable.
