Final Blog Post For Google Summer Of Code


It’s finally the last week of Google Summer of Code 2018. Before I start discussing my work over the summer, I would like to highlight my general experience with the GSoC program.

GSoC gives students all over the world the opportunity to connect and collaborate with some of the best programmers involved in open source. I found the program tremendously enriching, both in the depth to which I got to explore some of the areas involved in my project and in the exposure it gave me to areas I had no previous experience with. The mentor’s role in GSoC is easily the most important one, and I consider myself very lucky to have had Dr. Chris Rackauckas and Dr. Vijay Ivaturi as my mentors. Chris has been tremendously encouraging and patient throughout the summer. I would also like to mention the importance of the wider community: just being part of the Julia community and following the many exciting projects in Julia has been a learning experience in itself.

My proposal was mainly concerned with parameter estimation in differential equations, but because of my work before GSoC (and also because we had to drop one algorithm, more on this ahead) we were able to finish the work mentioned in the proposal within the first month itself and then expand the scope of my work to cover Global Sensitivity Analysis too.

Parameter Estimation

I successfully covered the following points from my proposal.

  • First Differencing in loss objective.

The first issue we tackled was #72 which involved adding first differencing to the L2Loss objective.

L2Loss(t,data,differ_weight=0.3,data_weight=0.7)

First differencing incorporates the differences between data points at consecutive time points, which adds information about the trajectory to the loss function. You can now assign a weight (vector or scalar) to the first differencing term in L2Loss. Adding first differencing is helpful in cases where L2Loss alone leads to non-identifiable parameters but a first differencing term makes them identifiable. This is notable for stochastic differential equation models, where the term aims to capture the autocorrelation in the data and therefore helps us avoid wrong parameter estimates that arise when different trajectories share the same stationary distribution.
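The idea can be sketched in a few lines of plain Julia. This is a minimal illustration of the weighted loss, not the actual L2Loss implementation, and the function name is hypothetical:

```julia
# Sketch of an L2 loss with a first-differencing term, mirroring the
# weights in L2Loss(t, data, differ_weight=0.3, data_weight=0.7).
# Illustrative only -- not the DiffEqParamEstim implementation.
function l2_with_differencing(data, sol; data_weight=0.7, differ_weight=0.3)
    loss = data_weight * sum(abs2, data .- sol)
    # Penalize mismatch between consecutive differences, i.e. the
    # "shape" of the trajectory, not just the pointwise values.
    loss += differ_weight * sum(abs2, diff(data) .- diff(sol))
    return loss
end
```

Note that a constant vertical shift of the solution leaves the differencing term at zero; only the pointwise data term picks it up, while a distorted shape is penalized by both terms.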

  • Maximum A Posteriori

Next on the list was adding a MAP estimator to the parameter estimation tooling. We decided to implement it by adding a prior option to build_loss_objective, which essentially turns the objective into MAP by adding the negative log-prior to the cost (the negative log-likelihood). The option was added as a keyword argument priors; it can take either an array of univariate distributions, one for each parameter, or a multivariate distribution.
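In log space the prior is simply an additive penalty on the cost. A dependency-free sketch of the construction, where the normal log-density is hand-rolled and the cost function is a hypothetical stand-in for what build_loss_objective returns:

```julia
# Hand-rolled Normal(mu, sigma) log-density to keep the sketch
# dependency-free; with Distributions.jl you would call
# logpdf(Normal(mu, sigma), x) instead.
normlogpdf(mu, sigma, x) = -0.5 * ((x - mu) / sigma)^2 - log(sigma) - 0.5 * log(2pi)

# Stand-in for the negative log-likelihood cost (hypothetical toy example).
cost(p) = sum(abs2, p .- [1.5, 1.0])

# One univariate prior per parameter, given here as (mu, sigma) pairs.
priors = [(1.5, 0.5), (1.0, 0.5)]

# MAP objective: negative log-likelihood minus the log-prior.
map_cost(p) = cost(p) - sum(normlogpdf(priors[i]..., p[i]) for i in eachindex(p))
```

The prior pulls the optimum toward its mean: points far from the prior means pay a quadratic penalty on top of the data misfit.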

  • Multiple Shooting version of the objective function

Next was issue #22. Multiple shooting is generally used in Boundary Value Problems (BVPs) and is more robust than the regular objective function used in these problems. It proceeds as follows:

  1. Divide the time span into short intervals and solve the equation over each with the current parameters, which here consist of both the parameters of the differential equation and the initial values for the short intervals.
  2. The objective additionally involves a discontinuity error term that imposes a higher cost if the end of the solution on one interval doesn’t match the beginning of the next.
  3. Merge the solutions from the shorter intervals and then calculate the loss.

For consistency, multiple_shooting_objective takes exactly the same arguments as build_loss_objective. It also has a discontinuity_weight kwarg which assigns a weight to the error arising from the discontinuities introduced by breaking up the time span.

using OrdinaryDiffEq, DiffEqParamEstim

ms_f = function (du,u,p,t)
  du[1] = p[1] * u[1] - p[2] * u[1]*u[2]
  du[2] = -3.0 * u[2] + u[1]*u[2]
end
ms_u0 = [1.0;1.0]
tspan = (0.0,10.0)
ms_p = [1.5,1.0]
ms_prob = ODEProblem(ms_f,ms_u0,tspan,ms_p)
t = collect(range(0, stop=10, length=200))
data = Array(solve(ms_prob,Tsit5(),saveat=t,abstol=1e-12,reltol=1e-12))
bound = Tuple{Float64, Float64}[(0, 10),(0, 10),(0, 10),(0, 10),
                                (0, 10),(0, 10),(0, 10),(0, 10),
                                (0, 10),(0, 10),(0, 10),(0, 10),
                                (0, 10),(0, 10),(0, 10),(0, 10),(0, 10),(0, 10)]


ms_obj = multiple_shooting_objective(ms_prob,Tsit5(),L2Loss(t,data);discontinuity_weight=1.0,abstol=1e-12,reltol=1e-12)

This creates the objective function that can be passed to an optimizer from which we can then get the parameter values and the initial values of the short time periods.

  • Stochastic Approximate Expectation Maximization

SAEM is a modification of the Expectation Maximization (EM) algorithm that uses a stochastic approximation procedure in the E-step, coupled with a gradient-type updating procedure (e.g. Newton–Raphson) in the M-step.

Though the SAEM algorithm was discussed earlier, we had missed a slight detail: it is meant for mixed effects models, and currently the DiffEq ecosystem doesn’t support these models. So we had to drop this algorithm and decided to change the direction of my work a bit.

Sensitivity Analysis

We moved on to a different but related area in model analysis, Global Sensitivity Analysis. Sensitivity analysis is the study of how the uncertainty in the output of a mathematical model or system (numerical or otherwise) can be apportioned to different sources of uncertainty in its inputs.

For an ODE, the sensitivity problem is the variation in x(t), the solution of the ODE, with respect to variation in θ, the parameters of the ODE.

Dynamical systems modelled by differential equations contain parameters whose values are not precisely known. Uncertainty in those parameter values produces uncertainty in the output of the model. Understanding and quantifying this uncertainty using sensitivity analysis is an important part of the development and use of models. Sensitivity analysis methods are applicable to a variety of fields, like systems pharmacology, and because of this wide range of applications, several different broad techniques and specific methods exist.

  • Morris Method

The Morris method, also known as Morris’s OAT method (where OAT stands for One At a Time), can be described in the following steps:

  1. Calculate local sensitivity measures known as “elementary effects” by measuring the perturbation in the output of the model when one parameter is changed.
  2. Evaluate these at various points in the input space, chosen such that a wide “spread” of the parameter space is explored and considered in the analysis, to provide an approximate global importance measure.
  3. Compute the mean and variance of these elementary effects. A high value of the mean implies that a parameter is important; a high variance implies that its effects are non-linear or the result of interactions with other inputs.

The key to the Morris method is an efficient design for the selection of the input points which optimises coverage of the space and minimises the number of model evaluations required to calculate the elementary effects. The implementation can be found here.
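The elementary-effect computation above can be sketched in pure Julia. This is a toy illustration of the screening idea, not the DiffEqSensitivity.jl implementation, and all names here are hypothetical:

```julia
using Statistics, Random

# Elementary effect of parameter i at base point p with step d:
# the one-at-a-time finite-difference perturbation of the output.
function elementary_effect(f, p, i, d)
    p2 = copy(p)
    p2[i] += d
    return (f(p2) - f(p)) / d
end

# Sample r random base points in [lb, ub] and collect, per parameter,
# the mean of the absolute effects (importance) and their variance
# (non-linearity / interaction).
function morris_screening(f, lb, ub, r, d)
    dim = length(lb)
    effects = [Float64[] for _ in 1:dim]
    for _ in 1:r
        p = lb .+ rand(dim) .* (ub .- lb .- d)  # keep p + d inside the bounds
        for i in 1:dim
            push!(effects[i], elementary_effect(f, p, i, d))
        end
    end
    return [mean(abs.(e)) for e in effects], [var(e) for e in effects]
end
```

For a toy model like f(p) = p[1]^2 + 0.5p[2], the higher mean singles out p[1] as important, while the zero variance for p[2] reflects its purely linear effect.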

  • Sobol Method

Sobol is a variance-based method: it decomposes the variance of the output of the model or system into fractions which can be attributed to inputs or sets of inputs. This gives not just the individual parameters’ sensitivities but also a way to quantify the effect of, and sensitivity from, the interactions between parameters.

The Sobol indices are “order”ed: the first order index for an input $X_i$ is given by the contribution of the main effect of $X_i$ to the output variance,

$$S_i = \frac{V_{X_i}\left(E_{X_{\sim i}}[Y \mid X_i]\right)}{V(Y)},$$

so it measures the effect of varying $X_i$ alone, but averaged over variations in the other input parameters. It is standardised by the total variance $V(Y)$ to provide a fractional contribution. Higher-order interaction indices $S_{ij}$, $S_{ijk}$ and so on can be formed by dividing the other terms in the variance decomposition by $V(Y)$.
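The first-order index can be estimated by Monte Carlo with the usual pick-and-freeze construction. This is a toy sketch of a Saltelli-style estimator under the assumption of independent uniform inputs on the unit hypercube; the names are hypothetical and this is not the DiffEqSensitivity.jl code:

```julia
using Statistics, Random

# Pick-and-freeze estimate of the first-order Sobol indices of f over
# the unit hypercube, using n samples per sample matrix.
function sobol_first_order(f, d, n)
    A = rand(n, d)
    B = rand(n, d)
    yA = [f(A[k, :]) for k in 1:n]
    yB = [f(B[k, :]) for k in 1:n]
    V = var(vcat(yA, yB))              # total output variance
    S = zeros(d)
    for i in 1:d
        ABi = copy(A)
        ABi[:, i] = B[:, i]            # freeze column i from B, vary the rest
        yABi = [f(ABi[k, :]) for k in 1:n]
        S[i] = mean(yB .* (yABi .- yA)) / V   # estimates V_i / V(Y)
    end
    return S
end
```

For a linear model y = p₁ + 2p₂ with independent uniform inputs, the exact first-order indices are 1/5 and 4/5, which the estimator recovers up to Monte Carlo error.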

  • Regression Method

If a sample of inputs and outputs $(X, Y)$ is available, it is possible to fit a linear model explaining the behaviour of Y given the values of X, provided that the sample size n is sufficiently large (at least n > d).

The measures provided for this analysis by us in DiffEqSensitivity.jl are

a) Pearson Correlation Coefficient:

$$r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2} \sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}$$

b) Standard Regression Coefficient (SRC):

$$SRC_j = \beta_j \sqrt{\frac{V(X_j)}{V(Y)}}$$

where $\beta_j$ is the linear regression coefficient associated to $X_j$.

c) Partial Correlation Coefficient (PCC):

$$PCC_j = \rho\left(X_j - \hat{X}_{\sim j},\; Y - \hat{Y}_{\sim j}\right)$$

where $\hat{X}_{\sim j}$ is the prediction of the linear model expressing $X_j$ with respect to the other inputs, and $\hat{Y}_{\sim j}$ is the prediction of the linear model where $X_j$ is absent. PCC measures the sensitivity of $Y$ to $X_j$ when the effects of the other inputs have been canceled.
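These regression-based measures can be sketched with a plain least-squares fit. The helper names below are hypothetical stand-alone illustrations, not the DiffEqSensitivity.jl API:

```julia
using Statistics

# Standardized regression coefficients: fit Y against the columns of X
# by least squares (with an intercept), then rescale each coefficient
# by std(X_j) / std(Y).
function src(X, y)
    n, d = size(X)
    beta = [ones(n) X] \ y                  # regression with an intercept
    return [beta[j + 1] * std(X[:, j]) / std(y) for j in 1:d]
end

# Pearson correlation between a single input column and the output
# (equivalent to Statistics.cor, spelled out for clarity).
pearson(x, y) = cov(x, y) / (std(x) * std(y))
```

For a noiseless linear model y = 3x₁ + x₂ with independent uniform inputs, the SRCs come out to 3/√10 and 1/√10, and their squares sum to one, as expected when the linear fit is exact.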

JuliaCon and the Upgrade to v0.7 and v1.0

In the last few days I was involved in upgrading the code of the packages I had worked on to v0.7 and then v1.0 of Julia following its release. I was also busy preparing my talk for JuliaCon’18 and travelling to London for it. Attending JuliaCon gave me the opportunity to get even more exposure to all the great stuff happening in the Julia community, and also a chance to make the community aware of my work through my talk and poster presentation, so I would like to extend my appreciation to the Julia Project and NumFOCUS for sponsoring my trip to JuliaCon!

All of my work has been mainly concentrated in DiffEqParamEstim.jl, DiffEqSensitivity.jl and DiffEqBayes.jl and can be found in these repositories and my forks of these at https://github.com/Vaibhavdixit02/DiffEqParamEstim.jl, https://github.com/Vaibhavdixit02/DiffEqSensitivity.jl, https://github.com/Vaibhavdixit02/DiffEqBayes.jl.

Cheers!
