Modelling with ODEs
Ordinary Differential Equations (ODEs) are a powerful tool for modelling a wide range of physical systems. Unlike purely data-driven models, ODEs are based on the underlying physics, biology, or chemistry of the system being modelled. This makes them particularly useful for predicting the behaviour of a system under conditions that have not been observed. In this section, we will introduce the basics of ODE modelling, and illustrate their use with a series of examples written using the DiffSol crate.
The topics covered in this section are:
- First Order ODEs: First order ODEs are the simplest type of ODE. Any ODE system can be written as a set of first order ODEs, so libraries like DiffSol are designed such that the user provides their equations in this form.
- Example: Population Dynamics: A simple example of a first order ODE system, modelling the interaction of predator and prey populations.
- Higher Order ODEs: Higher order ODEs are equations that involve derivatives of order greater than one. These can be converted to a system of first order ODEs, which is the form that DiffSol expects.
- Example: Spring-mass systems: A simple example of a higher order ODE system, modelling the motion of a damped spring-mass system.
- Discrete Events: Discrete events are events that occur at specific times or when the system is in a particular state, rather than continuously. These can be modelled by treating the events as changes in the ODE system's state. DiffSol provides an API to detect and handle these events.
- Example: Compartmental models of Drug Delivery: Pharmacokinetic models describe how a drug is absorbed, distributed, metabolised, and excreted by the body. They are a common example of systems with discrete events, as the drug is often administered at discrete times.
- Example: Bouncing Ball: A simple example of a system where the discrete event occurs when the ball hits the ground, instead of at a specific time.
- DAEs via the Mass Matrix: Differential Algebraic Equations (DAEs) are a generalisation of ODEs that include algebraic equations as well as differential equations. DiffSol can solve DAEs by treating them as ODEs with a mass matrix. This section explains how to use the mass matrix to solve DAEs.
- Example: Electrical Circuits: Electrical circuits are a common example of DAEs, here we will model a simple low-pass LRC filter circuit.
- PDEs: Partial Differential Equations (PDEs) are a generalisation of ODEs that involve derivatives with respect to more than one variable (e.g. a spatial variable). DiffSol can be used to solve PDEs using the method of lines, where the spatial derivatives are discretised to form a system of ODEs.
- Example: Heat Equation: The heat equation describes how heat diffuses in a domain over time. We will solve the heat equation in a 1D domain with Dirichlet boundary conditions.
- Example: Physics-based Battery Simulation: A more complex example of a PDE system, modelling the charge and discharge of a lithium-ion battery. For this example we will use the PyBaMM library to form the ODE system, and DiffSol to solve it.
Explicit First Order ODEs
Ordinary Differential Equations (ODEs) are often called rate equations because they describe how the rate of change of a system depends on its current state. For example, let's assume we wish to model the growth of a population of cells within a petri dish. We could define the state of the system as the concentration of cells in the dish, and assign this state to a variable \(c\). The rate of change of the system would then be the rate at which the concentration of cells changes with time, which we could denote as \(\frac{dc}{dt}\). We know that our cells will grow at a rate proportional to the current concentration of cells, so this can be written as:
\[ \frac{dc}{dt} = k c \]
where \(k\) is a constant that describes the growth rate of the cells. This is a first order ODE, because it involves only the first derivative of the state variable \(c\) with respect to time.
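This particular equation is simple enough to solve by hand, which gives a useful check on any numerical solution: separating variables and integrating gives exponential growth from the initial concentration \(c(0)\),
\[ c(t) = c(0) e^{k t} \]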
We can extend this further to solve multiple equations simultaneously, in order to model the rate of change of more than one quantity. For example, say we had two populations of cells in the same dish that grow with different rates. We could define the state of the system as the concentrations of the two cell populations, and assign these states to variables \(c_1\) and \(c_2\). We could then write down both equations as:
\[ \begin{align*} \frac{dc_1}{dt} &= k_1 c_1 \\ \frac{dc_2}{dt} &= k_2 c_2 \end{align*} \]
and then combine them in a vector form as:
\[ \begin{bmatrix} \frac{dc_1}{dt} \\ \frac{dc_2}{dt} \end{bmatrix} = \begin{bmatrix} k_1 c_1 \\ k_2 c_2 \end{bmatrix} \]
By defining a new vector of state variables \(\mathbf{y} = [c_1, c_2]\) and a vector valued function \(\mathbf{f}(\mathbf{y}, t) = \begin{bmatrix} k_1 c_1 \\ k_2 c_2 \end{bmatrix}\), we are left with the standard form of an explicit first order ODE system:
\[ \frac{d\mathbf{y}}{dt} = \mathbf{f}(\mathbf{y}, t) \]
This is an explicit equation for the derivative of the state, \(\frac{d\mathbf{y}}{dt}\), as a function only of the state variables \(\mathbf{y}\) and of time \(t\).
We need one more piece of information to solve this system of ODEs: the initial conditions for the populations at time \(t = 0\). For example, if we started with a concentration of 10 for the first population and 5 for the second population, we would write:
\[ \mathbf{y}(0) = \begin{bmatrix} 10 \\ 5 \end{bmatrix} \]
Many ODE solver libraries, like DiffSol, require users to provide their ODEs in the form of a set of explicit first order ODEs. Given both the system of ODEs and the initial conditions, the solver can then integrate the equations forward in time to find the solution \(\mathbf{y}(t)\). This is the general process for solving ODEs, so it is important to know how to translate your problem into a set of first order ODEs, and thus into the general form of an explicit first order ODE system shown above. In the next two sections, we will look at an example of a first order ODE system in the area of population dynamics, and then solve it using DiffSol.
Population Dynamics - Predator-Prey Model
In this example, we will model the population dynamics of a predator-prey system using a set of first order ODEs. The Lotka-Volterra equations are a classic example of a predator-prey model, and describe the interactions between two species in an ecosystem. The first species is a prey species, which we will call \(x\), and the second species is a predator species, which we will call \(y\).
The rate of change of the prey population is governed by two terms: growth and predation. The growth term represents the natural increase in the prey population in the absence of predators, and is proportional to the current population of prey. The predation term represents the rate at which the predators consume the prey, and is proportional to the product of the prey and predator populations. The rate of change of the prey population can be written as:
\[ \frac{dx}{dt} = a x - b x y \]
where \(a\) is the growth rate of the prey population, and \(b\) is the predation rate.
The rate of change of the predator population is also governed by two terms: death and growth. The death term represents the natural decrease in the predator population in the absence of prey, and is proportional to the current population of predators. The growth term represents the rate at which the predators reproduce, and is proportional to the product of the prey and predator populations, since the predators need to consume the prey to reproduce. The rate of change of the predator population can be written as:
\[ \frac{dy}{dt} = c x y - d y \]
where \(c\) is the reproduction rate of the predators, and \(d\) is the death rate.
The Lotka-Volterra equations are a simple model of predator-prey dynamics, and make several assumptions that may not hold in real ecosystems. For example, the model assumes that the prey population grows exponentially in the absence of predators, that the predator population decreases linearly in the absence of prey, and that the spatial distribution of the species has no effect. Despite these simplifications, the Lotka-Volterra equations capture some of the essential features of predator-prey interactions, such as oscillations in the populations and the dependence of each species on the other. When modelling with ODEs, it is important to consider the simplest model that captures the behaviour of interest, and to be aware of the assumptions that underlie the model.
Putting the two equations together, we have a system of two first order ODEs:
\[ \frac{dx}{dt} = a x - b x y \\ \frac{dy}{dt} = c x y - d y \]
which can be written in vector form as:
\[ \begin{bmatrix} \frac{dx}{dt} \\ \frac{dy}{dt} \end{bmatrix} = \begin{bmatrix} a x - b x y \\ c x y - d y \end{bmatrix} \]
or in the general form of a first order ODE system:
\[ \frac{d\mathbf{y}}{dt} = \mathbf{f}(\mathbf{y}, t) \]
where
\[\mathbf{y} = \begin{bmatrix} x \\ y \end{bmatrix} \]
and
\[\mathbf{f}(\mathbf{y}, t) = \begin{bmatrix} a x - b x y \\ c x y - d y \end{bmatrix}\]
We also have initial conditions for the populations at time \(t = 0\). We can set both populations to 1 at the start like so:
\[ \mathbf{y}(0) = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \]
Let's solve this system of ODEs using the DiffSol crate. We will use the DiffSL domain-specific language to specify the problem. We could have also used Rust closures, but this allows us to illustrate the modelling process with a minimum of Rust syntax.
fn main() {
    use std::fs;
    use diffsol::{DiffSl, CraneliftModule, OdeBuilder, OdeSolverMethod};
    use plotly::{Plot, Scatter, common::Mode, layout::Layout, layout::Axis};
    type M = nalgebra::DMatrix<f64>;
    type LS = diffsol::NalgebraLU<f64>;
    type CG = CraneliftModule;

    let eqn = DiffSl::<M, CG>::compile("
        a { 2.0/3.0 }
        b { 4.0/3.0 }
        c { 1.0 }
        d { 1.0 }
        u_i {
            y1 = 1,
            y2 = 1,
        }
        F_i {
            a * y1 - b * y1 * y2,
            c * y1 * y2 - d * y2,
        }
    ").unwrap();

    let problem = OdeBuilder::<M>::new()
        .build_from_eqn(eqn)
        .unwrap();
    let mut solver = problem.bdf::<LS>().unwrap();
    let (ys, ts) = solver.solve(40.0).unwrap();

    let prey: Vec<_> = ys.row(0).into_iter().copied().collect();
    let predator: Vec<_> = ys.row(1).into_iter().copied().collect();
    let time: Vec<_> = ts.into_iter().collect();

    let prey = Scatter::new(time.clone(), prey).mode(Mode::Lines).name("Prey");
    let predator = Scatter::new(time, predator).mode(Mode::Lines).name("Predator");
    let mut plot = Plot::new();
    plot.add_trace(prey);
    plot.add_trace(predator);

    let layout = Layout::new()
        .x_axis(Axis::new().title("t"))
        .y_axis(Axis::new().title("population"));
    plot.set_layout(layout);
    let plot_html = plot.to_inline_html(Some("prey-predator"));
    fs::write("../src/primer/images/prey-predator.html", plot_html).expect("Unable to write file");
}
A phase plane plot of the predator-prey system is a useful visualisation of the dynamics of the system. This plot shows the prey population on the x-axis and the predator population on the y-axis. Trajectories in the phase plane represent the evolution of the populations over time. Let's reframe the equations to introduce a new parameter \(y_0\), which is the initial predator and prey population. We can then plot the phase plane for different values of \(y_0\) to see how the system behaves for different initial conditions.
Our initial conditions are now:
\[ \mathbf{y}(0) = \begin{bmatrix} y_0 \\ y_0 \end{bmatrix} \]
so we can solve this system for different values of \(y_0\) and plot the phase plane for each case. We will use similar code to the above, but we will introduce our new parameter and loop over different values of \(y_0\):
fn main() {
    use std::fs;
    use diffsol::{DiffSl, CraneliftModule, OdeBuilder, OdeSolverMethod, OdeEquations};
    use plotly::{Plot, Scatter, common::Mode, layout::Layout, layout::Axis};
    type M = nalgebra::DMatrix<f64>;
    type LS = diffsol::NalgebraLU<f64>;
    type CG = CraneliftModule;

    let eqn = DiffSl::<M, CG>::compile("
        in = [ y0 ]
        y0 { 1.0 }
        a { 2.0/3.0 }
        b { 4.0/3.0 }
        c { 1.0 }
        d { 1.0 }
        u_i {
            y1 = y0,
            y2 = y0,
        }
        F_i {
            a * y1 - b * y1 * y2,
            c * y1 * y2 - d * y2,
        }
    ").unwrap();

    let mut problem = OdeBuilder::<M>::new().p([1.0]).build_from_eqn(eqn).unwrap();
    let mut plot = Plot::new();
    for y0 in (1..6).map(f64::from) {
        let p = nalgebra::DVector::from_element(1, y0);
        problem.eqn_mut().set_params(&p);
        let mut solver = problem.bdf::<LS>().unwrap();
        let (ys, _ts) = solver.solve(40.0).unwrap();

        let prey: Vec<_> = ys.row(0).into_iter().copied().collect();
        let predator: Vec<_> = ys.row(1).into_iter().copied().collect();
        let phase = Scatter::new(prey, predator)
            .mode(Mode::Lines)
            .name(format!("y0 = {}", y0));
        plot.add_trace(phase);
    }

    let layout = Layout::new()
        .x_axis(Axis::new().title("x"))
        .y_axis(Axis::new().title("y"));
    plot.set_layout(layout);
    let plot_html = plot.to_inline_html(Some("prey-predator2"));
    fs::write("../src/primer/images/prey-predator2.html", plot_html).expect("Unable to write file");
}
Higher Order ODEs
The order of an ODE is the highest derivative that appears in the equation. So far, we have only looked at first order ODEs, which involve only the first derivative of the state variable with respect to time. However, many physical systems are described by higher order ODEs, which involve second or higher derivatives of the state variable. A simple example of a second order ODE is the motion of a mass under the influence of gravity. The equation of motion for the mass can be written as:
\[ \frac{d^2x}{dt^2} = -g \]
where \(x\) is the position of the mass, \(t\) is time, and \(g\) is the acceleration due to gravity. This is a second order ODE because it involves the second derivative of the position with respect to time.
Higher order ODEs can always be rewritten as a system of first order ODEs by introducing new variables. For example, we can rewrite the second order ODE above as a system of two first order ODEs by introducing a new variable for the velocity of the mass:
\[ \begin{align*} \frac{dx}{dt} &= v \\ \frac{dv}{dt} &= -g \end{align*} \]
where \(v = \frac{dx}{dt}\) is the velocity of the mass. This is a system of two first order ODEs, which can be written in vector form as:
\[ \frac{d\mathbf{y}}{dt} = \mathbf{f}(\mathbf{y}, t) \]
where
\[ \mathbf{y} = \begin{bmatrix} x \\ v \end{bmatrix} \]
and
\[ \mathbf{f}(\mathbf{y}, t) = \begin{bmatrix} v \\ -g \end{bmatrix} \]
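For this simple system we can also write down the exact solution, which is handy as a check on a numerical solver. Integrating twice from an initial position \(x(0) = x_0\) and initial velocity \(v(0) = v_0\) gives:
\[ v(t) = v_0 - g t, \qquad x(t) = x_0 + v_0 t - \frac{1}{2} g t^2 \]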
In the next section, we'll look at another example of a higher order ODE system: the spring-mass system, and solve this using DiffSol.
Example: Spring-mass systems
We will model a damped spring-mass system using a second order ODE. The system consists of a mass \(m\) attached to a spring with spring constant \(k\), and a damping force proportional to the velocity of the mass with damping coefficient \(c\).
The equation of motion for the mass can be written as:
\[ m \frac{d^2x}{dt^2} = -k x - c \frac{dx}{dt} \]
where \(x\) is the position of the mass, \(t\) is time, and the negative sign on the right hand side indicates that the spring force and damping force act in the opposite direction to the displacement of the mass.
We can convert this to a system of two first order ODEs by introducing a new variable for the velocity of the mass:
\[ \begin{align*} \frac{dx}{dt} &= v \\ \frac{dv}{dt} &= -\frac{k}{m} x - \frac{c}{m} v \end{align*} \]
where \(v = \frac{dx}{dt}\) is the velocity of the mass.
We can solve this system of ODEs using DiffSol with the following code:
fn main() {
    use std::fs;
    use diffsol::{DiffSl, CraneliftModule, OdeBuilder, OdeSolverMethod};
    use plotly::{Plot, Scatter, common::Mode, layout::Layout, layout::Axis};
    type M = nalgebra::DMatrix<f64>;
    type CG = CraneliftModule;
    type LS = diffsol::NalgebraLU<f64>;

    let eqn = DiffSl::<M, CG>::compile("
        k { 1.0 }
        m { 1.0 }
        c { 0.1 }
        u_i {
            x = 1,
            v = 0,
        }
        F_i {
            v,
            -k/m * x - c/m * v,
        }
    ").unwrap();

    let problem = OdeBuilder::<M>::new()
        .build_from_eqn(eqn)
        .unwrap();
    let mut solver = problem.bdf::<LS>().unwrap();
    let (ys, ts) = solver.solve(40.0).unwrap();

    let x: Vec<_> = ys.row(0).into_iter().copied().collect();
    let time: Vec<_> = ts.into_iter().collect();

    let x_line = Scatter::new(time.clone(), x).mode(Mode::Lines);
    let mut plot = Plot::new();
    plot.add_trace(x_line);

    let layout = Layout::new()
        .x_axis(Axis::new().title("t"))
        .y_axis(Axis::new().title("x"));
    plot.set_layout(layout);
    let plot_html = plot.to_inline_html(Some("spring-mass-system"));
    fs::write("../src/primer/images/spring-mass-system.html", plot_html).expect("Unable to write file");
}
Discrete Events
ODEs describe the continuous evolution of a system over time, but many systems also involve discrete events that occur at specific times. For example, in a compartmental model of drug delivery, the administration of a drug is a discrete event that occurs at a specific time. In a bouncing ball model, the collision of the ball with the ground is a discrete event that changes the state of the system. It is normally difficult to model these events using ODEs alone, as they require a different approach to handle the discontinuities in the system. While we can represent discrete events mathematically using delta functions, many ODE solvers are not designed to handle discontinuities, and may produce inaccurate results or fail to converge during the integration.
DiffSol provides a way to model discrete events in a system of ODEs by allowing users to manipulate the internal state of each solver during the time-stepping. Each solver has an internal state that holds information such as the current time \(t\), the current state of the system \(\mathbf{y}\), and other solver-specific information. When a discrete event occurs, the user can update the internal state of the solver to reflect the change in the system, and then continue the integration of the ODE as normal.
DiffSol also provides a way to stop the integration of the ODEs, either at a specific time or when a specific condition is met. This can be useful for modelling systems with discrete events, as it allows the user to control the integration of the ODEs and to handle the events in a flexible way.
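To make this concrete, the sketch below shows the general shape of this event-handling loop for a trivial decay equation \(dy/dt = -y\), with a single instantaneous jump applied to the state at \(t = 1\). It uses the same DiffSL, builder and BDF solver APIs as the examples in the following sections; the jump size and event time are arbitrary values chosen purely for illustration.

fn main() {
    use diffsol::{DiffSl, CraneliftModule, OdeBuilder, OdeSolverMethod, OdeSolverStopReason};
    type M = nalgebra::DMatrix<f64>;
    type LS = diffsol::NalgebraLU<f64>;
    type CG = CraneliftModule;

    // a trivial decay equation dy/dt = -y with y(0) = 1
    let eqn = DiffSl::<M, CG>::compile("
        u_i { y = 1, }
        F_i { -y, }
    ").unwrap();
    let problem = OdeBuilder::<M>::new().build_from_eqn(eqn).unwrap();
    let mut solver = problem.bdf::<LS>().unwrap();

    // integrate up to the event time, then apply a discrete jump to the state
    solver.set_stop_time(1.0).unwrap();
    loop {
        match solver.step() {
            Ok(OdeSolverStopReason::InternalTimestep) => continue,
            Ok(OdeSolverStopReason::TstopReached) => break,
            _ => panic!("unexpected solver error"),
        }
    }
    solver.state_mut().y[0] += 1.0; // the discrete event: add 1 to the state
    // ... after the event, set a new stop time and continue stepping as normal
}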
The Solving the Problem and Root Finding sections provide an introduction to the API for solving ODEs and detecting events with DiffSol. In the next two sections, we will look at two examples of systems with discrete events, compartmental models of drug delivery and a bouncing ball, and solve them with DiffSol using this API.
Example: Compartmental models of Drug Delivery
The field of Pharmacokinetics (PK) provides a quantitative basis for describing the delivery of a drug to a patient, the diffusion of that drug through the plasma/body tissue, and the subsequent clearance of the drug from the patient's system. PK is used to ensure that there is sufficient concentration of the drug to maintain the required efficacy of the drug, while ensuring that the concentration levels remain below the toxic threshold. Pharmacokinetic (PK) models are often combined with Pharmacodynamic (PD) models, which model the positive effects of the drug, such as the binding of a drug to the biological target, and/or undesirable side effects, to form a full PKPD model of the drug-body interaction. This example will only focus on PK, neglecting the interaction with a PD model.
PK enables the following processes to be quantified:
- Absorption
- Distribution
- Metabolism
- Excretion
These are often referred to as ADME, and taken together describe the drug concentration in the body when medicine is prescribed. These ADME processes are typically described by zeroth-order or first-order rate reactions modelling the dynamics of the quantity of drug \(q\), with a given rate parameter \(k\), for example:
\[ \frac{dq}{dt} = -k^*, \]
\[ \frac{dq}{dt} = -k q. \]
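Both of these rate laws can be integrated directly, which provides a useful check on a numerical solution. With an initial quantity \(q(0) = q_0\), the zeroth-order reaction decreases linearly and the first-order reaction decays exponentially:
\[ q(t) = q_0 - k^* t, \qquad q(t) = q_0 e^{-k t}. \]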
The body itself is modelled as one or more compartments, each of which is defined as a kinetically homogeneous unit (these compartments do not relate to specific organs in the body, unlike Physiologically based pharmacokinetic, PBPK, modelling). There is typically a main central compartment into which the drug is administered and from which the drug is excreted from the body, combined with zero or more peripheral compartments to which the drug can be distributed to/from the central compartment (See Fig 2). Each peripheral compartment is only connected to the central compartment.
The following example PK model describes the two-compartment model shown diagrammatically in the figure above. The time-dependent variables to be solved are the drug quantity in the central and peripheral compartments, \(q_c\) and \(q_{p1}\) (units: [ng]) respectively.
\[ \frac{dq_c}{dt} = \text{Dose}(t) - \frac{q_c}{V_c} CL - Q_{p1} \left ( \frac{q_c}{V_c} - \frac{q_{p1}}{V_{p1}} \right ), \]
\[ \frac{dq_{p1}}{dt} = Q_{p1} \left ( \frac{q_c}{V_c} - \frac{q_{p1}}{V_{p1}} \right ). \]
This model describes an intravenous bolus dosing protocol, with a linear clearance from the central compartment (non-linear clearance processes are also possible, but not considered here). The dose function \(\text{Dose}(t)\) will consist of instantaneous doses of \(X\) ng of the drug at one or more time points. The other input parameters to the model are:
- \(V_c\) [mL], the volume of the central compartment
- \(V_{p1}\) [mL], the volume of the first peripheral compartment
- \(CL\) [mL/h], the clearance/elimination rate from the central compartment
- \(Q_{p1}\) [mL/h], the transition rate between central compartment and peripheral compartment 1
We will solve this system of ODEs using the DiffSol crate. Rather than trying to write down the dose function as a mathematical function, we will omit it from the equations and instead use DiffSol's API to apply the dose at specific time points.
First, let's write down the equations in the standard form of a first order ODE system:
\[ \frac{d\mathbf{y}}{dt} = \mathbf{f}(\mathbf{y}, t) \]
where
\[ \mathbf{y} = \begin{bmatrix} q_c \\ q_{p1} \end{bmatrix} \]
and
\[ \mathbf{f}(\mathbf{y}, t) = \begin{bmatrix} - \frac{q_c}{V_c} CL - Q_{p1} \left ( \frac{q_c}{V_c} - \frac{q_{p1}}{V_{p1}} \right ) \\ Q_{p1} \left ( \frac{q_c}{V_c} - \frac{q_{p1}}{V_{p1}} \right ) \end{bmatrix} \]
We will also need to specify the initial conditions for the system:
\[ \mathbf{y}(0) = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \]
For the dose function, we will specify a dose of 1000 ng at regular intervals of 6 hours. We will also specify the other parameters of the model:
\[ V_c = 1000 \text{ mL}, \quad V_{p1} = 1000 \text{ mL}, \quad CL = 100 \text{ mL/h}, \quad Q_{p1} = 50 \text{ mL/h} \]
Let's now solve this system of ODEs using DiffSol. To implement the discrete dose events, we set a stop time for the simulation at each dose event using the OdeSolverMethod::set_stop_time method. During timestepping we can check the return value of the OdeSolverMethod::step method to see if the solver has reached the stop time. If it has, we can apply the dose and continue the simulation.
fn main() {
    use std::fs;
    use diffsol::{
        DiffSl, CraneliftModule, OdeBuilder, OdeSolverMethod, OdeSolverStopReason,
    };
    use plotly::{Plot, Scatter, common::Mode, layout::Layout, layout::Axis};
    type M = nalgebra::DMatrix<f64>;
    type CG = CraneliftModule;
    type LS = diffsol::NalgebraLU<f64>;

    let eqn = DiffSl::<M, CG>::compile("
        Vc { 1000.0 }
        Vp1 { 1000.0 }
        CL { 100.0 }
        Qp1 { 50.0 }
        u_i {
            qc = 0,
            qp1 = 0,
        }
        F_i {
            - qc / Vc * CL - Qp1 * (qc / Vc - qp1 / Vp1),
            Qp1 * (qc / Vc - qp1 / Vp1),
        }
    ").unwrap();

    let problem = OdeBuilder::<M>::new().build_from_eqn(eqn).unwrap();
    let mut solver = problem.bdf::<LS>().unwrap();
    let doses = vec![(0.0, 1000.0), (6.0, 1000.0), (12.0, 1000.0), (18.0, 1000.0)];

    let mut q_c = Vec::new();
    let mut q_p1 = Vec::new();
    let mut time = Vec::new();

    // apply the first dose and save the initial state
    solver.state_mut().y[0] = doses[0].1;
    q_c.push(solver.state().y[0]);
    q_p1.push(solver.state().y[1]);
    time.push(0.0);

    // solve and apply the remaining doses
    for (t, dose) in doses.into_iter().skip(1) {
        solver.set_stop_time(t).unwrap();
        loop {
            let ret = solver.step();
            q_c.push(solver.state().y[0]);
            q_p1.push(solver.state().y[1]);
            time.push(solver.state().t);
            match ret {
                Ok(OdeSolverStopReason::InternalTimestep) => continue,
                Ok(OdeSolverStopReason::TstopReached) => break,
                _ => panic!("unexpected solver error"),
            }
        }
        solver.state_mut().y[0] += dose;
    }

    let mut plot = Plot::new();
    let q_c = Scatter::new(time.clone(), q_c).mode(Mode::Lines).name("q_c");
    let q_p1 = Scatter::new(time, q_p1).mode(Mode::Lines).name("q_p1");
    plot.add_trace(q_c);
    plot.add_trace(q_p1);

    let layout = Layout::new()
        .x_axis(Axis::new().title("t [h]"))
        .y_axis(Axis::new().title("amount [ng]"));
    plot.set_layout(layout);
    let plot_html = plot.to_inline_html(Some("drug-delivery"));
    fs::write("../src/primer/images/drug-delivery.html", plot_html).expect("Unable to write file");
}
Example: Bouncing Ball
Modelling a bouncing ball is a simple and intuitive example of a system with discrete events. The ball is dropped from a height \(h\) and bounces off the ground with a coefficient of restitution \(e\). When the ball hits the ground, its velocity is reversed and scaled by the coefficient of restitution, and the ball rises and then continues to fall until it hits the ground again. This process repeats until halted.
The second order ODE that describes the motion of the ball is given by:
\[ \frac{d^2x}{dt^2} = -g \]
where \(x\) is the position of the ball, \(t\) is time, and \(g\) is the acceleration due to gravity. We can rewrite this as a system of two first order ODEs by introducing a new variable for the velocity of the ball:
\[ \begin{align*} \frac{dx}{dt} &= v \\ \frac{dv}{dt} &= -g \end{align*} \]
where \(v = \frac{dx}{dt}\) is the velocity of the ball. This is a system of two first order ODEs, which can be written in vector form as:
\[ \frac{d\mathbf{y}}{dt} = \mathbf{f}(\mathbf{y}, t) \]
where
\[ \mathbf{y} = \begin{bmatrix} x \\ v \end{bmatrix} \]
and
\[ \mathbf{f}(\mathbf{y}, t) = \begin{bmatrix} v \\ -g \end{bmatrix} \]
The initial conditions for the ball, including the height from which it is dropped and its initial velocity, are given by:
\[ \mathbf{y}(0) = \begin{bmatrix} h \\ 0 \end{bmatrix} \]
When the ball hits the ground, we need to update the velocity of the ball according to the coefficient of restitution, which is the ratio of the velocity after the bounce to the velocity before the bounce. The velocity after the bounce \(v'\) is given by:
\[ v' = -e v \]
where \(e\) is the coefficient of restitution. However, to implement this in our ODE solver, we need to detect when the ball hits the ground. We can do this by using DiffSol's event handling feature, which allows us to specify a function that is equal to zero when the event occurs, i.e. when the ball hits the ground. This function \(g(\mathbf{y}, t)\) is called an event or root function, and for our bouncing ball problem, it is given by:
\[ g(\mathbf{y}, t) = x \]
where \(x\) is the position of the ball. When the ball hits the ground, the event function will be zero and DiffSol will stop the integration, and we can update the velocity of the ball accordingly.
In code, the bouncing ball problem can be solved using DiffSol as follows:
fn main() {
    use std::fs;
    use diffsol::{
        DiffSl, CraneliftModule, OdeBuilder, OdeSolverMethod, OdeSolverStopReason,
    };
    use plotly::{Plot, Scatter, common::Mode, layout::Layout, layout::Axis};
    type M = nalgebra::DMatrix<f64>;
    type CG = CraneliftModule;
    type LS = diffsol::NalgebraLU<f64>;

    let eqn = DiffSl::<M, CG>::compile("
        g { 9.81 }
        h { 10.0 }
        u_i {
            x = h,
            v = 0,
        }
        F_i {
            v,
            -g,
        }
        stop {
            x,
        }
    ").unwrap();

    let e = 0.8;
    let problem = OdeBuilder::<M>::new().build_from_eqn(eqn).unwrap();
    let mut solver = problem.bdf::<LS>().unwrap();

    let mut x = Vec::new();
    let mut v = Vec::new();
    let mut t = Vec::new();
    let final_time = 10.0;

    // save the initial state
    x.push(solver.state().y[0]);
    v.push(solver.state().y[1]);
    t.push(0.0);

    // solve until the final time is reached
    solver.set_stop_time(final_time).unwrap();
    loop {
        match solver.step() {
            Ok(OdeSolverStopReason::InternalTimestep) => (),
            Ok(OdeSolverStopReason::RootFound(t)) => {
                // get the state when the event occurred
                let mut y = solver.interpolate(t).unwrap();
                // update the velocity of the ball
                y[1] *= -e;
                // make sure the ball is above the ground
                y[0] = y[0].max(f64::EPSILON);
                // set the state to the updated state
                solver.state_mut().y.copy_from(&y);
                solver.state_mut().dy[0] = y[1];
                *solver.state_mut().t = t;
            },
            Ok(OdeSolverStopReason::TstopReached) => break,
            Err(_) => panic!("unexpected solver error"),
        }
        x.push(solver.state().y[0]);
        v.push(solver.state().y[1]);
        t.push(solver.state().t);
    }

    let mut plot = Plot::new();
    let x = Scatter::new(t.clone(), x).mode(Mode::Lines).name("x");
    let v = Scatter::new(t, v).mode(Mode::Lines).name("v");
    plot.add_trace(x);
    plot.add_trace(v);

    let layout = Layout::new()
        .x_axis(Axis::new().title("t"))
        .y_axis(Axis::new());
    plot.set_layout(layout);
    let plot_html = plot.to_inline_html(Some("bouncing-ball"));
    fs::write("../src/primer/images/bouncing-ball.html", plot_html).expect("Unable to write file");
}
DAEs via the Mass Matrix
Differential-algebraic equations (DAEs) are a generalisation of ordinary differential equations (ODEs) that include algebraic equations, or equations that do not involve derivatives. Algebraic equations can arise in many physical systems and often are used to model constraints on the system, such as conservation laws or other relationships between the state variables. For example, in an electrical circuit, the current flowing into a node must equal the current flowing out of the node, which can be written as an algebraic equation.
DAEs can be written in the general implicit form:
\[ \mathbf{F}(\mathbf{y}, \mathbf{y}', t) = 0 \]
where \(\mathbf{y}\) is the vector of state variables, \(\mathbf{y}'\) is the vector of derivatives of the state variables, and \(\mathbf{F}\) is a vector-valued function that describes the system of equations. However, for the purposes of this primer and the capabilities of DiffSol, we will focus on a specific form of DAEs called index-1 or semi-explicit DAEs, which can be written as a combination of differential and algebraic equations:
\[ \begin{align*} \frac{d\mathbf{y}}{dt} &= \mathbf{f}(\mathbf{y}, t) \\ 0 &= \mathbf{g}(\mathbf{y}, t) \end{align*} \]
where \(\mathbf{f}\) is the vector-valued function that describes the differential equations and \(\mathbf{g}\) is the vector-valued function that describes the algebraic equations. The key difference between DAEs and ODEs is that DAEs include algebraic equations that must be satisfied at each time step, in addition to the differential equations that describe the rate of change of the state variables.
How does this relate to the standard form of an explicit ODE that we have seen before? Recall that an explicit ODE can be written as:
\[ \frac{d\mathbf{y}}{dt} = \mathbf{f}(\mathbf{y}, t) \]
Let's update this equation to include a matrix \(\mathbf{M}\) that multiplies the derivative term:
\[ M \frac{d\mathbf{y}}{dt} = \mathbf{f}(\mathbf{y}, t) \]
When \(M\) is the identity matrix (i.e. a matrix with ones on the diagonal and zeros elsewhere), this reduces to the standard form of an explicit ODE. However, when \(M\) has diagonal entries that are zero, this introduces algebraic equations into the system, and it reduces to the semi-explicit DAE equations shown above. The matrix \(M\) is called the mass matrix.
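For example, a system with one differential equation and one algebraic equation, \(\frac{dy_1}{dt} = f_1(\mathbf{y}, t)\) and \(0 = g_1(\mathbf{y}, t)\), can be written in this form using
\[ M = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \qquad \mathbf{f}(\mathbf{y}, t) = \begin{bmatrix} f_1(\mathbf{y}, t) \\ g_1(\mathbf{y}, t) \end{bmatrix}. \]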
Thus, we now have a general form of a set of differential equations that includes both ODEs and semi-explicit DAEs. This general form is used by DiffSol to allow users to specify a wide range of problems, from simple ODEs to more complex DAEs. In the next section, we will look at an example of a DAE and how to solve it using DiffSol and a mass matrix.
Example: Electrical Circuits
Let's consider the following low-pass LRC filter circuit:
+---L---+---C---+
| | |
V_s = R |
| | |
+-------+-------+
The circuit consists of a resistor \(R\), an inductor \(L\), and a capacitor \(C\) connected to a voltage source \(V_s\). The voltage across the resistor \(V\) is given by Ohm's law:
\[ V = R i_R \]
where \(i_R\) is the current flowing through the resistor. The voltage across the inductor is given by:
\[ \frac{di_L}{dt} = \frac{V_s - V}{L} \]
where \(di_L/dt\) is the rate of change of current with respect to time. The voltage across the capacitor is the same as the voltage across the resistor and the equation for an ideal capacitor is:
\[ \frac{dV}{dt} = \frac{i_C}{C} \]
where \(i_C\) is the current flowing through the capacitor. The sum of the currents flowing into and out of the top-center node of the circuit must be zero according to Kirchhoff's current law:
\[ i_L = i_R + i_C \]
Thus we have a system of two differential equations and two algebraic equations that describe the evolution of the currents through the resistor, inductor, and capacitor, and the voltage across the resistor. We can write these equations in the general form:
\[ M \frac{d\mathbf{y}}{dt} = \mathbf{f}(\mathbf{y}, t) \]
where
\[ \mathbf{y} = \begin{bmatrix} i_R \\ i_L \\ i_C \\ V \end{bmatrix} \]
and
\[ \mathbf{f}(\mathbf{y}, t) = \begin{bmatrix} V - R i_R \\ \frac{V_s - V}{L} \\ i_L - i_R - i_C \\ \frac{i_C}{C} \end{bmatrix} \]
The mass matrix \(M\) has ones on the diagonal for the differential equations and zeros for the algebraic equations:
\[ M = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \]
Instead of providing the mass matrix explicitly, the DiffSL language specifies the multiplication of the mass matrix with the derivative term, \(M \frac{d\mathbf{y}}{dt}\), which is given by:
\[ M \frac{d\mathbf{y}}{dt} = \begin{bmatrix} 0 \\ \frac{di_L}{dt} \\ 0 \\ \frac{dV}{dt} \end{bmatrix} \]
The initial conditions for the system are:
\[ \mathbf{y}(0) = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \end{bmatrix} \]
The voltage source \(V_s\) acts as a forcing function for the system, and we can specify it as a sinusoidal function of time:
\[ V_s(t) = V_0 \sin(\omega t) \]
where \(\omega\) is the angular frequency of the source. Since this is a low-pass filter, we will choose a relatively high frequency for the source, \(\omega = 100\) (as used in the code below), to demonstrate the filtering effect of the circuit.
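To see why this counts as a high frequency, note that the corner (resonant) angular frequency of the LC pair in this circuit, using the values \(L = 1.0\) and \(C = 0.001\) from the code below, is approximately
\[ \omega_0 = \frac{1}{\sqrt{L C}} = \frac{1}{\sqrt{1.0 \times 0.001}} \approx 31.6 \text{ rad/s}, \]
so the source frequency is well above the cutoff and we expect the current through the resistor to be strongly attenuated.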
We can solve this system of equations using DiffSol and plot the current and voltage across the resistor as a function of time.
fn main() {
    use std::fs;
    use diffsol::{DiffSl, CraneliftModule, OdeBuilder, OdeSolverMethod};
    use plotly::{Plot, Scatter, common::Mode, layout::Layout, layout::Axis};
    type M = nalgebra::DMatrix<f64>;
    type CG = CraneliftModule;
    type LS = diffsol::NalgebraLU<f64>;

    let eqn = DiffSl::<M, CG>::compile("
        R { 100.0 }
        L { 1.0 }
        C { 0.001 }
        V0 { 10 }
        omega { 100.0 }
        Vs { V0 * sin(omega * t) }
        u_i {
            iR = 0,
            iL = 0,
            iC = 0,
            V = 0,
        }
        dudt_i {
            diRdt = 0,
            diLdt = 0,
            diCdt = 0,
            dVdt = 0,
        }
        M_i {
            0,
            diLdt,
            0,
            dVdt,
        }
        F_i {
            V - R * iR,
            (Vs - V) / L,
            iL - iR - iC,
            iC / C,
        }
        out_i {
            iR,
        }
    ").unwrap();

    let problem = OdeBuilder::<M>::new()
        .build_from_eqn(eqn)
        .unwrap();
    let mut solver = problem.bdf::<LS>().unwrap();
    let (ys, ts) = solver.solve(1.0).unwrap();

    let ir: Vec<_> = ys.row(0).into_iter().copied().collect();
    let t: Vec<_> = ts.into_iter().collect();

    let ir = Scatter::new(t.clone(), ir).mode(Mode::Lines);
    let mut plot = Plot::new();
    plot.add_trace(ir);

    let layout = Layout::new()
        .x_axis(Axis::new().title("t"))
        .y_axis(Axis::new().title("current"));
    plot.set_layout(layout);
    let plot_html = plot.to_inline_html(Some("electrical-circuit"));
    fs::write("../src/primer/images/electrical-circuit.html", plot_html).expect("Unable to write file");
}
Partial Differential Equations (PDEs)
DiffSol is an ODE solver, but it can also be used to solve PDEs. The idea is to discretise the PDE in space, and then solve the resulting system of ODEs in time. This is called the method of lines.
Discretizing a PDE is a large topic, and there are many ways to do it. Common methods include finite difference, finite volume, finite element, and spectral methods. Finite difference methods are the simplest to understand and implement, so some of the examples in this book will demonstrate this method to give you a flavour of how to solve PDEs with DiffSol. However, in general we recommend that you use another package to discretise your PDE, and then import the resulting ODE system into DiffSol for solving.
Some useful packages
There are many packages in the Python and Julia ecosystems that can help you discretise your PDE. Here are a few, but there are many more out there:
Python
- FEniCS: A finite element package. Uses the Unified Form Language (UFL) to specify PDEs.
- FireDrake: A finite element package, uses the same UFL as FEniCS.
- FiPy: A finite volume package.
- scikit-fdiff: A finite difference package.
Julia:
- MethodOfLines.jl: A finite difference package.
- Gridap.jl: A finite element package.
Example: Heat equation
Let's consider a simple example, the heat equation. The heat equation is a PDE that describes how the temperature of a material changes over time. In one dimension, the heat equation is
\[ \frac{\partial u}{\partial t} = D \frac{\partial^2 u}{\partial x^2} \]
where \(u(x, t)\) is the temperature of the material at position \(x\) and time \(t\), and \(D\) is the thermal diffusivity of the material. To solve this equation, we need to discretise it in space (the time integration is handled by the ODE solver). We can use a finite difference method to discretise the spatial derivative, and then solve the resulting system of ODEs using DiffSol.
Finite difference method
The finite difference method is a numerical method for discretising a spatial derivative like \(\frac{\partial^2 u}{\partial x^2}\). It approximates this continuous term by a discrete term, in this case the multiplication of a matrix by a vector. We can use this discretisation method to convert the heat equation into a system of ODEs suitable for DiffSol.
We will not go into the details of the finite difference method here, but merely derive a single finite difference approximation for the term \(\frac{\partial^2 u}{\partial x^2}\), or \(u_{xx}\) in more compact notation.
The central FD approximation of \(u_{xx}\) is:
\[ u_{xx} \approx \frac{u(x + h) - 2u(x) + u(x-h)}{h^2} \]
where \(h\) is the spacing between points along the x-axis.
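This approximation follows from adding the Taylor expansions of \(u(x+h)\) and \(u(x-h)\) about \(x\): the odd-order terms cancel, leaving
\[ u(x+h) + u(x-h) = 2u(x) + h^2 u_{xx}(x) + \mathcal{O}(h^4), \]
so the central difference above approximates \(u_{xx}\) with an error of order \(h^2\).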
We will discretise \(u_{xx}\) at \(N\) regularly spaced interior points along \(x\) between 0 and 1, given by \(x_1\), \(x_2\), ..., \(x_N\):
+----+----+----------+----+> x
0 x_1 x_2 ... x_N 1
Using this set of points and the discrete approximation above, we obtain one equation at each of the \(N\) interior points of the domain:
\[ \frac{v_{i+1} - 2v_i + v_{i-1}}{h^2} \quad \text{ for } i = 1, ..., N \]
where \(v_i \approx u(x_i)\).
We will need additional equations at \(x=0\) and \(x=1\), known as the boundary conditions. For this example we will use \(u(x) = g(x)\) at \(x=0\) and \(x=1\) (also known as a non-homogeneous Dirichlet boundary condition), so that \(v_0 = g(0)\) and \(v_{N+1} = g(1)\), and the equation at \(x_1\) becomes:
\[ \frac{v_{2} - 2v_1 + g(0)}{h^2} \]
and the equation at \(x_N\) becomes:
\[ \frac{g(1) - 2v_N + v_{N-1}}{h^2} \]
We can therefore represent the final \(N\) equations in matrix form like so:
\[ \frac{1}{h^2} \begin{bmatrix} -2 & 1 & & & \\ 1 & -2 & 1 & & \\ &\ddots & \ddots & \ddots &\\ & & 1 & -2 & 1 \\ & & & 1 & -2 \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_{N-1}\\ v_{N} \end{bmatrix} + \frac{1}{h^2} \begin{bmatrix} g(0) \\ 0 \\ \vdots \\ 0 \\ g(1) \end{bmatrix} \]
The relevant sparse matrix here is \(A\), given by
\[ A = \begin{bmatrix} -2 & 1 & & & \\ 1 & -2 & 1 & & \\ &\ddots & \ddots & \ddots &\\ & & 1 & -2 & 1 \\ & & & 1 & -2 \end{bmatrix} \]
As you can see, the number of non-zero elements grows linearly with the size \(N\), so a sparse matrix format is much preferred over a dense matrix holding all \(N^2\) elements! The additional vector that encodes the boundary conditions is \(b\), given by
\[ b = \begin{bmatrix} g(0) \\ 0 \\ \vdots \\ 0 \\ g(1) \end{bmatrix} \]
Method of Lines Approximation
We can use our FD approximation of the spatial derivative to convert the heat equation into a system of ODEs. Starting from our original definition of the heat equation:
\[ \frac{\partial u}{\partial t} = D \frac{\partial^2 u}{\partial x^2} \]
and using our finite difference approximation and definition of the sparse matrix \(A\) and vector \(b\), this becomes:
\[ \frac{du}{dt} = \frac{D}{h^2} (A u + b) \]
where \(u\) is a vector of temperatures at each point in space. This is a system of ODEs that we can solve using DiffSol.
DiffSol Implementation
use diffsol::{
    DiffSl, OdeBuilder, CraneliftModule, SparseColMat, FaerSparseLU, OdeSolverMethod,
};
use plotly::{
    layout::{Axis, Layout},
    Plot, Surface,
};
use std::fs;

fn main() {
    type M = SparseColMat<f64>;
    type LS = FaerSparseLU<f64>;
    type CG = CraneliftModule;

    let eqn = DiffSl::<M, CG>::compile("
        D { 1.0 }
        h { 1.0 }
        g { 0.0 }
        m { 1.0 }
        A_ij {
            (0, 0): -2.0, (0, 1): 1.0,
            (1, 0): 1.0, (1, 1): -2.0, (1, 2): 1.0,
            (2, 1): 1.0, (2, 2): -2.0, (2, 3): 1.0,
            (3, 2): 1.0, (3, 3): -2.0, (3, 4): 1.0,
            (4, 3): 1.0, (4, 4): -2.0, (4, 5): 1.0,
            (5, 4): 1.0, (5, 5): -2.0, (5, 6): 1.0,
            (6, 5): 1.0, (6, 6): -2.0, (6, 7): 1.0,
            (7, 6): 1.0, (7, 7): -2.0, (7, 8): 1.0,
            (8, 7): 1.0, (8, 8): -2.0, (8, 9): 1.0,
            (9, 8): 1.0, (9, 9): -2.0, (9, 10): 1.0,
            (10, 9): 1.0, (10, 10): -2.0, (10, 11): 1.0,
            (11, 10): 1.0, (11, 11): -2.0, (11, 12): 1.0,
            (12, 11): 1.0, (12, 12): -2.0, (12, 13): 1.0,
            (13, 12): 1.0, (13, 13): -2.0, (13, 14): 1.0,
            (14, 13): 1.0, (14, 14): -2.0, (14, 15): 1.0,
            (15, 14): 1.0, (15, 15): -2.0, (15, 16): 1.0,
            (16, 15): 1.0, (16, 16): -2.0, (16, 17): 1.0,
            (17, 16): 1.0, (17, 17): -2.0, (17, 18): 1.0,
            (18, 17): 1.0, (18, 18): -2.0, (18, 19): 1.0,
            (19, 18): 1.0, (19, 19): -2.0, (19, 20): 1.0,
            (20, 19): 1.0, (20, 20): -2.0,
        }
        b_i {
            (0): g,
            (1:20): 0.0,
            (20): g,
        }
        u_i {
            (0:5): g,
            (5:15): g + m,
            (15:21): g,
        }
        heat_i { A_ij * u_j }
        F_i { D * (heat_i + b_i) / (h * h) }
    ").unwrap();

    let problem = OdeBuilder::<M>::new()
        .build_from_eqn(eqn)
        .unwrap();
    let times = (0..50).map(|i| i as f64).collect::<Vec<f64>>();
    let mut solver = problem.bdf::<LS>().unwrap();
    let sol = solver.solve_dense(&times).unwrap();

    let x = (0..21).map(|i| i as f64).collect::<Vec<f64>>();
    let y = times;
    let z = sol
        .col_iter()
        .map(|v| v.iter().copied().collect::<Vec<f64>>())
        .collect::<Vec<Vec<f64>>>();
    let trace = Surface::new(z).x(x).y(y);
    let mut plot = Plot::new();
    plot.add_trace(trace);

    let layout = Layout::new()
        .x_axis(Axis::new().title("x"))
        .y_axis(Axis::new().title("t"))
        .z_axis(Axis::new().title("u"));
    plot.set_layout(layout);
    let plot_html = plot.to_inline_html(Some("heat-equation"));
    fs::write("../src/primer/images/heat-equation.html", plot_html).expect("Unable to write file");
}
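Writing out every entry of \(A\) by hand quickly becomes tedious as the grid is refined. Since the DiffSL model is just a string, one option is to generate the A_ij block programmatically before compiling it. The sketch below (a hypothetical helper, assuming only the DiffSL sparse-matrix syntax used above) builds the tridiagonal block for an n-point grid:

// sketch: build the tridiagonal A_ij block of a DiffSL model for an n-point grid
fn laplacian_block(n: usize) -> String {
    let mut block = String::from("A_ij {\n");
    for i in 0..n {
        if i > 0 {
            block.push_str(&format!("    ({}, {}): 1.0,\n", i, i - 1));
        }
        block.push_str(&format!("    ({}, {}): -2.0,\n", i, i));
        if i + 1 < n {
            block.push_str(&format!("    ({}, {}): 1.0,\n", i, i + 1));
        }
    }
    block.push_str("}\n");
    block
}

The returned string can then be concatenated with the remaining blocks (b_i, u_i, heat_i and F_i) before passing the full model to DiffSl::compile.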
Physics-based Battery Simulation
Traditional battery models are based on equivalent circuit models, similar to the circuit modelled in section Electrical Circuits. These models are simple and computationally efficient, but they lack the ability to capture all of the complex electrochemical processes that occur in a battery. Physics-based models, on the other hand, are based on the electrochemical processes that occur in the battery, and can provide a more detailed description of the battery's behaviour. They are parameterized by physical properties of the battery, such as the diffusion coefficients of lithium ions in the electrodes, the reaction rate constants, and the surface area of the electrodes, and can be used to predict the battery's performance under different operating conditions, once these parameters are known.
The Single Particle Model (SPM) is a physics-based model of a lithium-ion battery. It describes the diffusion of lithium ions in the positive and negative electrodes of the battery over a 1D radial domain, assuming that the properties of the electrodes are uniform across the thickness of the electrode. Here we will describe the equations that govern the SPM, and show how to solve them at different current rates to calculate the terminal voltage of the battery.
The Single Particle Model state equations
The SPM model only needs to solve for the concentration of lithium ions in the positive and negative electrodes, \(c_n\) and \(c_p\). The diffusion of lithium ions in each electrode particle \(c_i\) is given by:
\[ \frac{\partial c_i}{\partial t} = \nabla \cdot (D_i \nabla c_i) \]
subject to the following boundary and initial conditions:
\[ \left.\frac{\partial c_i}{\partial r}\right\vert_{r=0} = 0, \quad \left.\frac{\partial c_i}{\partial r}\right\vert_{r=R_i} = -j_i, \quad \left.c_i\right\vert_{t=0} = c^0_i \]
where \(c_i\) is the concentration of lithium ions in the negative (\(i=n\)) or positive (\(i=p\)) electrode, \(D_i\) is the diffusion coefficient, \(j_i\) is the interfacial current density, and \(c^0_i\) is the initial concentration.
The fluxes of lithium ions in the positive and negative electrodes \(j_i\) are dependent on the applied current \(I\):
\[ j_n = \frac{I}{a_n \delta_n F \mathcal{A}}, \qquad j_p = \frac{-I}{a_p \delta_p F \mathcal{A}}, \]
where \(a_i = 3 \epsilon_i / R_i\) is the specific surface area of the electrode, \(\epsilon_i\) is the volume fraction of active material, \(\delta_i\) is the thickness of the electrode, \(F\) is the Faraday constant, and \(\mathcal{A}\) is the electrode surface area.
Output variables for the Single Particle Model
Now that we have defined the equations to solve, we turn to the output variables that we need to calculate from the state variables \(c_n\) and \(c_p\). The terminal voltage of the battery is given by:
\[ V = U_p(x_p^s) - U_n(x_n^s) + \eta_p - \eta_n \]
where \(U_i\) is the open circuit potential (OCP) of the electrode, \(x_i^s = c_i(r=R_i) / c_i^{max}\) is the surface stoichiometry, and \(\eta_i\) is the overpotential.
Assuming Butler-Volmer kinetics and \(\alpha_i = 0.5\), the overpotential is given by:
\[ \eta_i = \frac{2RT}{F} \sinh^{-1} \left( \frac{j_i F}{2i_{0,i}} \right) \]
where the exchange current density \(i_{0,i}\) is given by:
\[ i_{0,i} = k_i F \sqrt{c_e} \sqrt{c_i(r=R_i)} \sqrt{c_i^{max} - c_i(r=R_i)} \]
where \(c_e\) is the concentration of lithium ions in the electrolyte, and \(k_i\) is the reaction rate constant.
Stopping conditions
We wish to terminate the simulation if the terminal voltage exceeds an upper threshold \(V_{\text{max}}\) or falls below a lower threshold \(V_{\text{min}}\). DiffSol uses a root-finding algorithm to detect when the terminal voltage crosses these thresholds, using the following stopping conditions:
\[ V_{\text{max}} - V = 0, \qquad V - V_{\text{min}} = 0, \]
Solving the Single Particle Model using DiffSol
The equations above describe the Single Particle Model of a lithium-ion battery, but they are relatively complex and difficult to discretise compared with the simple heat equation PDE that we saw in the Heat Equation section.
Rather than derive and write down the discretised equations ourselves, we will instead rely on the PyBaMM library to generate the equations for us. PyBaMM is a Python library that can generate a wide variety of physics-based battery models, using different parameterisations, physics and operating conditions. Combined with a tool that takes a PyBaMM model and writes it out in the DiffSL language, we can generate a DiffSL file that can be used to solve the equations of the SPM model described above. We can then use the DiffSol crate to solve the model and calculate the terminal voltage of the battery over a range of current rates.
The code below reads in the DiffSL file, compiles it, and then solves the equation for different current rates. We wish to stop the simulation when either the final time is reached, or when one of the stopping conditions is met. We will output the terminal voltage of the battery at regular intervals during the simulation, because the terminal voltage can change more rapidly than the state variables \(c_n\) and \(c_p\), particularly during the "knee" of the discharge curve.
The discretised equations result in sparse matrices, so we use the sparse matrix and linear solver modules provided by the faer crate to solve the equations efficiently.
use diffsol::{
    DiffSl, OdeBuilder, CraneliftModule, SparseColMat, FaerSparseLU, OdeSolverMethod,
    OdeSolverStopReason, OdeEquations, NonLinearOp,
};
use plotly::{Plot, Scatter, common::Mode, layout::Layout, layout::Axis};
use std::fs;

fn main() {
    type M = SparseColMat<f64>;
    type LS = FaerSparseLU<f64>;
    type CG = CraneliftModule;

    let file = std::fs::read_to_string("../src/primer/src/spm.ds").unwrap();
    let eqn = DiffSl::<M, CG>::compile(&file).unwrap();
    let mut problem = OdeBuilder::<M>::new()
        .p([1.0])
        .build_from_eqn(eqn)
        .unwrap();
    let currents = vec![0.6, 0.8, 1.0, 1.2, 1.4];
    let final_time = 3600.0;
    let delta_t = 3.0;
    let mut plot = Plot::new();
    for current in currents {
        problem.eqn.set_params(&faer::Col::from_fn(1, |_| current));

        let mut solver = problem.bdf::<LS>().unwrap();
        let mut v = Vec::new();
        let mut t = Vec::new();

        // save the initial output
        let mut out = problem.eqn.out().unwrap().call(solver.state().y, solver.state().t);
        v.push(out[0]);
        t.push(0.0);

        // solve until the final time is reached
        // or we reach the stop condition
        solver.set_stop_time(final_time).unwrap();
        let mut next_output_time = delta_t;
        let mut finished = false;
        while !finished {
            let curr_t = match solver.step() {
                Ok(OdeSolverStopReason::InternalTimestep) => solver.state().t,
                Ok(OdeSolverStopReason::RootFound(t)) => {
                    finished = true;
                    t
                },
                Ok(OdeSolverStopReason::TstopReached) => {
                    finished = true;
                    final_time
                },
                Err(_) => panic!("unexpected solver error"),
            };
            while curr_t > next_output_time {
                let y = solver.interpolate(next_output_time).unwrap();
                problem.eqn.out().unwrap().call_inplace(&y, next_output_time, &mut out);
                v.push(out[0]);
                t.push(next_output_time);
                next_output_time += delta_t;
            }
        }

        let voltage = Scatter::new(t, v)
            .mode(Mode::Lines)
            .name(format!("current = {} A", current));
        plot.add_trace(voltage);
    }
    let layout = Layout::new()
        .x_axis(Axis::new().title("t [sec]"))
        .y_axis(Axis::new().title("voltage [V]"));
    plot.set_layout(layout);
    let plot_html = plot.to_inline_html(Some("battery-simulation"));
    fs::write("../src/primer/images/battery-simulation.html", plot_html).expect("Unable to write file");
}
DiffSol APIs for specifying problems
Most of the DiffSol user-facing API revolves around specifying the problem you want to solve, thus a large part of this book will be dedicated to explaining how to specify a problem. All the examples presented in the primer used the DiffSL DSL to specify the problem, but DiffSol also provides a pure Rust API for specifying problems. This API is sometimes more verbose than the DSL, but is more flexible, more performant, and easier to use if you have a model already written in Rust or another language that you can easily port or call from Rust.
ODE equations
The class of ODE equations that DiffSol can solve are of the form
\[M(t) \frac{dy}{dt} = f(t, y, p),\] \[y(t_0) = y_0(t_0, p),\] \[z(t) = g(t, y, p),\]
where:
- \(f(t, y, p)\) is the right-hand side of the ODE,
- \(y\) is the state vector,
- \(p\) are the parameters,
- \(t\) is the time,
- \(y_0(t_0, p)\) is the initial state vector at time \(t_0\),
- \(M(t)\) is the mass matrix (this is optional, and is implicitly the identity matrix if not specified),
- \(g(t, y, p)\) is an output function that can be used to calculate additional outputs from the state vector (this is optional, and is implicitly \(g(t, y, p) = y\) if not specified).
The user can also optionally specify a root function \(r(t, y, p)\) that can be used to find the time at which a root occurs.
DiffSol problem APIs
Specifying a problem in DiffSol is done via the OdeBuilder struct, using the OdeBuilder::new method to create a new builder, and then chaining methods to set the equations to be solved, initial time, initial step size, relative & absolute tolerances, and parameters, or leaving them at their default values. Then, call the build method to create a new problem.
Users can specify the equations to be solved via three main APIs, ranging from the simplest to the most complex (but also the most flexible):
- The DiffSl struct allows users to specify the equations above using the DiffSL Domain Specific Language (DSL). This API is behind a feature flag (diffsl if you want to use the slower cranelift backend, diffsl-llvm* if you want to use the faster LLVM backend). This is the easiest API to use as it can use automatic differentiation to calculate the necessary gradients, and is the best API if you want to use DiffSL from a higher-level language like Python or R while still having similar performance to Rust.
- The OdeBuilder struct also has methods to set the equations using Rust closures (see e.g. OdeBuilder::rhs_implicit). This API is convenient if you want to stick to pure Rust code without using DiffSL and the JIT compiler, but requires you to calculate the gradients of the equations yourself.
- Implementing the OdeEquations trait allows users to implement the equations on their own structs. This API is the most flexible as it allows users to use custom data structures and code that might not fit within the OdeBuilder API. However, it is the most verbose API and requires users to be more familiar with the various DiffSol traits.
ODE equations
To create a new ODE problem, use the OdeBuilder struct. You can create a builder using the OdeBuilder::new method, and then chain methods to set the equations to be solved, initial time, initial step size, relative & absolute tolerances, and parameters, or leave them at their default values. Then, call the build method to create a new problem.
Below is an example of how to create a new ODE problem using the OdeBuilder struct.
The specific problem we will solve is the logistic equation
\[\frac{dy}{dt} = r y (1 - y/K),\]
where \(r\) is the growth rate and \(K\) is the carrying capacity. To specify the problem, we need to provide the \(dy/dt\) function \(f(y, p, t)\), and a function giving the Jacobian of \(f\) multiplied by a vector \(v\), which we will call \(f'(y, p, t, v)\). That is
\[f(y, p, t) = r y (1 - y/K),\] \[f'(y, p, t, v) = rv (1 - 2y/K),\]
and the initial state
\[y_0(p, t) = 0.1\]
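As an aside, the logistic equation has a closed-form solution that is handy for checking the solver output:
\[ y(t) = \frac{K y_0 e^{r t}}{K + y_0 (e^{r t} - 1)}, \]
where \(y_0 = 0.1\) is the initial state given above.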
This can be done using the following code:
fn main() {
    use diffsol::OdeBuilder;
    use nalgebra::DVector;
    type M = nalgebra::DMatrix<f64>;

    let problem = OdeBuilder::<M>::new()
        .t0(0.0)
        .rtol(1e-6)
        .atol([1e-6])
        .p(vec![1.0, 10.0])
        .rhs_implicit(
            |x, p, _t, y| y[0] = p[0] * x[0] * (1.0 - x[0] / p[1]),
            |x, p, _t, v, y| y[0] = p[0] * v[0] * (1.0 - 2.0 * x[0] / p[1]),
        )
        .init(
            |_p, _t| DVector::from_element(1, 0.1),
        )
        .build()
        .unwrap();
}
The rhs_implicit method is used to specify the \(f(y, p, t)\) and \(f'(y, p, t, v)\) functions, whereas the init method is used to specify the initial state vector \(y_0(p, t)\). We also use the t0, rtol, atol, and p methods to set the initial time, relative tolerance, absolute tolerance, and parameters, respectively.
We have also specified the matrix type M to be nalgebra::DMatrix<f64>, using a generic parameter of the OdeBuilder struct. The nalgebra::DMatrix<f64> type is a dense matrix type from the nalgebra crate. Other options are:
- faer::Mat<T> from faer, which is a dense matrix type.
- diffsol::SparseColMat<T>, which is a thin wrapper around faer::sparse::SparseColMat<T>, a sparse compressed-column matrix type.
Each of these matrix types has an associated vector type that is used to represent the vectors in the problem (i.e. the state vector \(y\), the parameter vector \(p\), and the gradient vector \(v\)). You can see in the example above that the DVector type is explicitly used to create the initial state vector in the third closure.
For these matrix types the associated vector type is:
- nalgebra::DVector<T> for nalgebra::DMatrix<T>.
- faer::Col<T> for faer::Mat<T>.
- faer::Col<T> for diffsol::SparseColMat<T>.
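For example, a sketch of the same logistic problem built on the faer dense types (assuming the same OdeBuilder methods shown above, with faer::Col<f64> as the associated vector type) looks like:

fn main() {
    use diffsol::OdeBuilder;
    type M = faer::Mat<f64>;

    // the same logistic problem as above, but using faer's dense matrix and vector types
    let _problem = OdeBuilder::<M>::new()
        .p(vec![1.0, 10.0])
        .rhs_implicit(
            |x, p, _t, y| y[0] = p[0] * x[0] * (1.0 - x[0] / p[1]),
            |x, p, _t, v, y| y[0] = p[0] * v[0] * (1.0 - 2.0 * x[0] / p[1]),
        )
        // the init closure now returns the associated vector type, faer::Col<f64>
        .init(|_p, _t| faer::Col::from_fn(1, |_| 0.1))
        .build()
        .unwrap();
}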
Mass matrix
In some cases, it is necessary to include a mass matrix in the problem, such that the problem is of the form
\[M(t) \frac{dy}{dt} = f(t, y, p).\]
A mass matrix is useful for PDE discretisations that lead to a non-identity mass matrix, or for DAE problems that can be transformed into ODEs with a singular mass matrix. DiffSol can handle singular and non-singular mass matrices, and the mass matrix can be time-dependent.
Example
To illustrate the addition of a mass matrix to a problem, we will once again take the logistic equation, but this time we will add an additional variable that is set via an algebraic equation. This system is written as
\[\frac{dy}{dt} = r y (1 - y/K),\] \[0 = y - z,\]
where \(z\) is the additional variable with a solution \(z = y\). When this system is put in the form \(M(t) \frac{dy}{dt} = f(t, y, p)\), the mass matrix is
\[M(t) = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}.\]
As with the Jacobian, the DiffSol builder does not require the full mass matrix; instead, users provide a function that computes a GEMV (General Matrix-Vector) product of the mass matrix with a vector:
\[m(\mathbf{v}, \mathbf{p}, t, \beta, \mathbf{y}) = M(p, t) \mathbf{v} + \beta \mathbf{y}. \]
Thus, to specify this problem using DiffSol, we can use the OdeBuilder
struct and provide the functions:
\[f(\mathbf{y}, \mathbf{p}, t) = \begin{bmatrix} r y_0 (1 - y_0/K) \\ y_0 - y_1 \end{bmatrix},\] \[f'(\mathbf{y}, \mathbf{p}, t, \mathbf{v}) = \begin{bmatrix} r v_0 (1 - 2 y_0/K) \\ v_0 - v_1 \end{bmatrix},\] \[m(\mathbf{v}, \mathbf{p}, t, \beta, \mathbf{y}) = \begin{bmatrix} v_0 + \beta y_0 \\ \beta y_1 \end{bmatrix}.\]
where \(f\) is the right-hand side of the ODE, \(f'\) is the Jacobian of \(f\) multiplied by a vector, and \(m\) is the mass matrix multiplied by a vector.
```rust
fn main() {
    use diffsol::OdeBuilder;
    use nalgebra::{DMatrix, DVector};
    type M = DMatrix<f64>;
    type V = DVector<f64>;

    let problem = OdeBuilder::<M>::new()
        .t0(0.0)
        .rtol(1e-6)
        .atol([1e-6])
        .p(vec![1.0, 10.0])
        .rhs_implicit(
            |x, p, _t, y| {
                y[0] = p[0] * x[0] * (1.0 - x[0] / p[1]);
                y[1] = x[0] - x[1];
            },
            |x, p, _t, v, y| {
                y[0] = p[0] * v[0] * (1.0 - 2.0 * x[0] / p[1]);
                y[1] = v[0] - v[1];
            },
        )
        .mass(|v, _p, _t, beta, y| {
            y[0] = v[0] + beta * y[0];
            y[1] *= beta;
        })
        .init(|_p, _t| V::from_element(2, 0.1))
        .build()
        .unwrap();
}
```
Root finding
Root finding is the process of finding the values of the variables that make a set of equations equal to zero. This is a common problem where you want to stop the solver or perform some action when a certain condition is met.
Specifying the root finding function
Using the logistic example, we can add a root finding function \(r(y, p, t)\) that will stop the solver when the value of \(y\) is such that \(r(y, p, t) = 0\). For this example we'll use the root finding function \(r(y, p, t) = y - 0.5\), which will stop the solver when the value of \(y\) is 0.5.
\[\frac{dy}{dt} = r y (1 - y/K),\] \[r(y, p, t) = y - 0.5,\]
This can be done using the OdeBuilder
via the following code:
```rust
fn main() {
    use diffsol::OdeBuilder;
    use nalgebra::DVector;
    type M = nalgebra::DMatrix<f64>;

    let problem = OdeBuilder::<M>::new()
        .t0(0.0)
        .rtol(1e-6)
        .atol([1e-6])
        .p(vec![1.0, 10.0])
        .rhs_implicit(
            |x, p, _t, y| y[0] = p[0] * x[0] * (1.0 - x[0] / p[1]),
            |x, p, _t, v, y| y[0] = p[0] * v[0] * (1.0 - 2.0 * x[0] / p[1]),
        )
        .init(|_p, _t| DVector::from_element(1, 0.1))
        .root(|x, _p, _t, y| y[0] = x[0] - 0.5, 1)
        .build()
        .unwrap();
}
```
Here we have added the root finding function \(r(y, p, t) = y - 0.5\), and also let DiffSol know that we have one root function by passing 1 as the last argument to the root method.
If we had specified more than one root function, the solver would stop when any of the root functions is zero.
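For example, the following sketch stops the solver when \(y = 0.5\) or \(y = 0.9\), whichever occurs first. It assumes (consistent with the single-root example above) that the root closure fills one output element per root function and that the final argument gives the number of root functions:

```rust
fn main() {
    use diffsol::OdeBuilder;
    use nalgebra::DVector;
    type M = nalgebra::DMatrix<f64>;

    let problem = OdeBuilder::<M>::new()
        .p(vec![1.0, 10.0])
        .rhs_implicit(
            |x, p, _t, y| y[0] = p[0] * x[0] * (1.0 - x[0] / p[1]),
            |x, p, _t, v, y| y[0] = p[0] * v[0] * (1.0 - 2.0 * x[0] / p[1]),
        )
        .init(|_p, _t| DVector::from_element(1, 0.1))
        // Two root functions: the solver stops when either y - 0.5 or y - 0.9 crosses zero.
        .root(
            |x, _p, _t, y| {
                y[0] = x[0] - 0.5;
                y[1] = x[0] - 0.9;
            },
            2,
        )
        .build()
        .unwrap();
}
```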
Detecting roots during the solve
To detect the root during the solve, we can use the return type on the step
method of the solver.
If successful, the step method returns an OdeSolverStopReason enum that contains the reason the solver stopped.
```rust
fn main() {
    use diffsol::OdeBuilder;
    use nalgebra::DVector;
    type M = nalgebra::DMatrix<f64>;
    use diffsol::{OdeSolverMethod, OdeSolverStopReason, NalgebraLU};
    type LS = NalgebraLU<f64>;

    let problem = OdeBuilder::<M>::new()
        .p(vec![1.0, 10.0])
        .rhs_implicit(
            |x, p, _t, y| y[0] = p[0] * x[0] * (1.0 - x[0] / p[1]),
            |x, p, _t, v, y| y[0] = p[0] * v[0] * (1.0 - 2.0 * x[0] / p[1]),
        )
        .init(|_p, _t| DVector::from_element(1, 0.1))
        .root(|x, _p, _t, y| y[0] = x[0] - 0.5, 1)
        .build()
        .unwrap();

    let mut solver = problem.bdf::<LS>().unwrap();
    let t = loop {
        match solver.step() {
            Ok(OdeSolverStopReason::InternalTimestep) => continue,
            Ok(OdeSolverStopReason::TstopReached) => panic!("We didn't set a stop time"),
            Ok(OdeSolverStopReason::RootFound(t)) => break t,
            Err(e) => panic!("Solver failed to converge: {}", e),
        }
    };
    println!("Root found at t = {}", t);
    let _soln = &solver.state().y;
}
```
Forward Sensitivity
In this section we will discuss how to compute the forward sensitivity of the solution of an ODE problem. The forward sensitivity is the derivative of the solution with respect to the parameters of the problem. This is useful for many applications, such as parameter estimation, optimal control, and uncertainty quantification.
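Concretely, forward sensitivity analysis augments the original ODE with one additional ODE per parameter. Writing \(s_i = \frac{\partial y}{\partial p_i}\), the sensitivity equations are
\[\frac{ds_i}{dt} = J(y, p, t) \, s_i + \frac{\partial f}{\partial p_i}(y, p, t), \qquad s_i(0) = \frac{\partial y_0}{\partial p_i},\]
where \(J = \frac{\partial f}{\partial y}\) is the Jacobian of the right-hand side. This is why, in addition to the usual Jacobian action \(J v\), the builder also needs the parameter-derivative action \(J_p v\) and the derivative of the initial state, as described below.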
Specifying the sensitivity problem
We will again use the example of the logistic growth equation, but this time we will compute the sensitivity of the solution \(y\) with respect to the parameters \(r\) and \(K\) (i.e. \(\frac{dy}{dr}\) and \(\frac{dy}{dK}\)). The logistic growth equation is:
\[\frac{dy}{dt} = r y (1 - y/K),\] \[y(0) = 0.1\]
Recall from the ODE equations section that we also need to provide the Jacobian of the right-hand side of the ODE with respect to the state vector \(y\), multiplied by a vector \(v\), which we will call \(J\). This is:
\[J v = \begin{bmatrix} r v (1 - 2 y/K) \end{bmatrix}.\]
Using the logistic growth equation above, we can compute the partial derivative of the right hand side of the ODE with respect to the vector \([r, K]\) multiplied by a vector \(v = [v_r, v_K]\), which we will call \(J_p v\). This is:
\[J_p v = v_r y (1 - y/K) + v_K r y^2 / K^2 .\]
We also need the partial derivative of the initial state vector with respect to the parameters multiplied by a vector \(v\), which we will call \(J_{y_0} v\). Since the initial state vector is constant, this is just zero
\[J_{y_0} v = 0.\]
We can then use the OdeBuilder
struct to specify the sensitivity problem. The rhs_sens_implicit
and init_sens
methods are used to create a new problem that includes the sensitivity equations.
```rust
fn main() {
    use diffsol::OdeBuilder;
    use nalgebra::DVector;
    type M = nalgebra::DMatrix<f64>;

    let problem = OdeBuilder::<M>::new()
        .p(vec![1.0, 10.0])
        .rhs_sens_implicit(
            |x, p, _t, y| y[0] = p[0] * x[0] * (1.0 - x[0] / p[1]),
            |x, p, _t, v, y| y[0] = p[0] * v[0] * (1.0 - 2.0 * x[0] / p[1]),
            |x, p, _t, v, y| {
                y[0] = v[0] * x[0] * (1.0 - x[0] / p[1])
                    + v[1] * p[0] * x[0] * x[0] / (p[1] * p[1])
            },
        )
        .init_sens(
            |_p, _t| DVector::from_element(1, 0.1),
            |_p, _t, _v, y| y[0] = 0.0,
        )
        .build()
        .unwrap();
}
```
Solving the sensitivity problem
Once the sensitivity problem has been specified, we can solve it using the OdeSolverMethod
trait.
Let's imagine we want to solve the sensitivity problem up to a time \(t_o = 10\). We can use the OdeSolverMethod trait to solve the problem as normal, stepping forward in time until we reach \(t_o\).
```rust
fn main() {
    use diffsol::OdeBuilder;
    use nalgebra::DVector;
    type M = nalgebra::DMatrix<f64>;
    use diffsol::{OdeSolverMethod, NalgebraLU};
    type LS = NalgebraLU<f64>;

    let problem = OdeBuilder::<M>::new()
        .p(vec![1.0, 10.0])
        .rhs_sens_implicit(
            |x, p, _t, y| y[0] = p[0] * x[0] * (1.0 - x[0] / p[1]),
            |x, p, _t, v, y| y[0] = p[0] * v[0] * (1.0 - 2.0 * x[0] / p[1]),
            |x, p, _t, v, y| {
                y[0] = v[0] * x[0] * (1.0 - x[0] / p[1])
                    + v[1] * p[0] * x[0] * x[0] / (p[1] * p[1])
            },
        )
        .init_sens(
            |_p, _t| DVector::from_element(1, 0.1),
            |_p, _t, _v, y| y[0] = 0.0,
        )
        .build()
        .unwrap();

    let mut solver = problem.bdf::<LS>().unwrap();
    let t_o = 10.0;
    while solver.state().t < t_o {
        solver.step().unwrap();
    }
}
```
We can then obtain the sensitivity vectors at time \(t_o\) using the interpolate_sens
method on the OdeSolverMethod
trait.
This method returns a Vec<DVector<f64>>, where element \(i\) is the sensitivity vector with respect to parameter \(i\) of the parameter vector, evaluated at time \(t_o\).
If we need the sensitivity at the current internal time step, we can get this from the s
field of the OdeSolverState
struct.
```rust
fn main() {
    use diffsol::OdeBuilder;
    use nalgebra::DVector;
    type M = nalgebra::DMatrix<f64>;
    use diffsol::{OdeSolverMethod, OdeSolverState, NalgebraLU};
    type LS = NalgebraLU<f64>;

    let problem = OdeBuilder::<M>::new()
        .p(vec![1.0, 10.0])
        .rhs_sens_implicit(
            |x, p, _t, y| y[0] = p[0] * x[0] * (1.0 - x[0] / p[1]),
            |x, p, _t, v, y| y[0] = p[0] * v[0] * (1.0 - 2.0 * x[0] / p[1]),
            |x, p, _t, v, y| {
                y[0] = v[0] * x[0] * (1.0 - x[0] / p[1])
                    + v[1] * p[0] * x[0] * x[0] / (p[1] * p[1])
            },
        )
        .init_sens(
            |_p, _t| DVector::from_element(1, 0.1),
            |_p, _t, _v, y| y[0] = 0.0,
        )
        .build()
        .unwrap();

    let mut solver = problem.bdf::<LS>().unwrap();
    let t_o = 10.0;
    while solver.state().t < t_o {
        solver.step().unwrap();
    }
    let sens_at_t_o = solver.interpolate_sens(t_o).unwrap();
    let sens_at_internal_step = &solver.state().s;
}
```
Custom Problem Structs
While the OdeBuilder
struct is a convenient way to specify the problem, it may not be suitable in all cases.
Often users will want to provide their own structs that can hold custom data structures and methods for evaluating the right-hand side of the ODE, the jacobian, and other functions.
Traits
To create your own structs for the ode system, the final goal is to implement the OdeEquations
trait.
When you have done this, you can use the build_from_eqn
method on the OdeBuilder
struct to create the problem.
For each function in your system of equations, you will need to implement the appropriate trait for each function.
- Non-linear functions (rhs, out, root). In this case the NonLinearOp trait needs to be implemented.
- Linear functions (mass). In this case the LinearOp trait needs to be implemented.
- Constant functions (init). In this case the ConstantOp trait needs to be implemented.
Additionally, each function needs to implement the base operation trait Op
.
Once you have implemented the appropriate traits for your custom struct, you can use the OdeBuilder
struct to specify the problem.
Non-linear functions
To illustrate how to implement a custom problem struct, we will take the familiar logistic equation:
\[\frac{dy}{dt} = r y (1 - y/K),\]
Our goal is to implement a custom struct that can evaluate the rhs function \(f(y, p, t)\) and the jacobian multiplied by a vector \(f'(y, p, t, v)\). First we define an empty struct. For a more complex problem, this struct could hold data structures necessary to compute the rhs.
```rust
fn main() {
    type T = f64;
    type V = nalgebra::DVector<T>;

    struct MyProblem;
}
```
Now we will implement the base Op
trait for our struct. The Op
trait specifies the types of the vectors and matrices that will be used, as well as the number of states and outputs in the rhs function.
```rust
fn main() {
    use diffsol::Op;
    type T = f64;
    type V = nalgebra::DVector<T>;
    type M = nalgebra::DMatrix<T>;

    struct MyProblem;

    impl MyProblem {
        fn new() -> Self {
            MyProblem {}
        }
    }

    impl Op for MyProblem {
        type T = T;
        type V = V;
        type M = M;
        fn nstates(&self) -> usize { 1 }
        fn nout(&self) -> usize { 1 }
        fn nparams(&self) -> usize { 0 }
    }
}
```
Next we implement the NonLinearOp and NonLinearOpJacobian traits for our struct. These traits specify the functions that will be used to evaluate the rhs function and the jacobian multiplied by a vector.
```rust
fn main() {
    use diffsol::{NonLinearOp, NonLinearOpJacobian};
    use diffsol::Op;
    type T = f64;
    type V = nalgebra::DVector<T>;
    type M = nalgebra::DMatrix<T>;

    struct MyProblem;

    impl MyProblem {
        fn new() -> Self {
            MyProblem {}
        }
    }

    impl Op for MyProblem {
        type T = T;
        type V = V;
        type M = M;
        fn nstates(&self) -> usize { 1 }
        fn nout(&self) -> usize { 1 }
        fn nparams(&self) -> usize { 0 }
    }

    // Logistic rhs with r = K = 1 for simplicity
    impl NonLinearOp for MyProblem {
        fn call_inplace(&self, x: &V, _t: T, y: &mut V) {
            y[0] = x[0] * (1.0 - x[0]);
        }
    }

    impl NonLinearOpJacobian for MyProblem {
        fn jac_mul_inplace(&self, x: &V, _t: T, v: &V, y: &mut V) {
            y[0] = v[0] * (1.0 - 2.0 * x[0]);
        }
    }
}
```
There we go, all done! This demonstrates how to implement a custom struct to specify a rhs function.
Constant functions
Now we've implemented the rhs function, but how about the initial condition? We can implement the ConstantOp
trait to specify the initial condition. Since this is quite similar to the NonLinearOp
trait, we will do it all in one go.
```rust
fn main() {
    use diffsol::{Op, ConstantOp};
    type T = f64;
    type V = nalgebra::DVector<T>;
    type M = nalgebra::DMatrix<T>;

    struct MyInit {}

    impl Op for MyInit {
        type T = T;
        type V = V;
        type M = M;
        fn nstates(&self) -> usize { 1 }
        fn nout(&self) -> usize { 1 }
        fn nparams(&self) -> usize { 0 }
    }

    impl ConstantOp for MyInit {
        fn call_inplace(&self, _t: T, y: &mut V) {
            y[0] = 0.1;
        }
    }
}
```
Linear functions
Naturally, we can also implement the LinearOp
trait if we want to include a mass matrix in our model. A common use case for implementing this trait is to store the mass matrix in a custom struct, like so:
```rust
fn main() {
    use diffsol::{Op, LinearOp};
    type T = f64;
    type V = nalgebra::DVector<T>;
    type M = nalgebra::DMatrix<T>;

    struct MyMass {
        mass: M,
    }

    impl MyMass {
        fn new() -> Self {
            let mass = M::from_element(1, 1, 1.0);
            Self { mass }
        }
    }

    impl Op for MyMass {
        type T = T;
        type V = V;
        type M = M;
        fn nstates(&self) -> usize { 1 }
        fn nout(&self) -> usize { 1 }
        fn nparams(&self) -> usize { 0 }
    }

    impl LinearOp for MyMass {
        fn gemv_inplace(&self, x: &V, _t: T, beta: T, y: &mut V) {
            // y = M * x + beta * y
            y.gemv(1.0, &self.mass, x, beta)
        }
    }
}
```
ODE systems
So far we've focused on using custom structs to specify individual equations, now we need to put these together to specify the entire system of equations.
Implementing the OdeEquations trait
To specify the entire system of equations, you need to implement the Op
, OdeEquations
and OdeEquationsRef
traits for your struct.
Getting all your traits in order
The OdeEquations
trait requires methods that return objects corresponding to the right-hand side function, mass matrix, root function, initial condition, and output functions.
Therefore, you need to already have structs that implement the NonLinearOp
, LinearOp
, and ConstantOp
traits for these functions. For the purposes of this example, we will assume that
you have already implemented these traits for your structs.
Often, the structs that implement these traits will have to use data defined in the struct that implements the OdeEquations
trait. For example, they might wish to have a reference to the same parameter vector p
. Therefore, you will often need to define lifetimes for these structs to ensure that they can access the data they need.
Note that these structs will need to be lightweight and should not contain a significant amount of data. The data should be stored in the struct that implements the OdeEquations
trait. This is because these structs will be created and destroyed many times during the course of the simulation (e.g. every time the right-hand side function is called).
```rust
fn main() {
    type T = f64;
    type V = nalgebra::DVector<f64>;
    type M = nalgebra::DMatrix<f64>;

    struct MyRhs<'a> { p: &'a V }  // implements NonLinearOp
    struct MyMass<'a> { p: &'a V } // implements LinearOp
    struct MyInit<'a> { p: &'a V } // implements ConstantOp
    struct MyRoot<'a> { p: &'a V } // implements NonLinearOp
    struct MyOut<'a> { p: &'a V }  // implements NonLinearOp
}
```
Implementing the OdeEquations traits
Let's imagine we have a struct MyProblem
that we want to use to specify the entire system of equations. We can implement the Op
, OdeEquations
, and OdeEquationsRef
traits for this struct like so:
```rust
use diffsol::{Op, NonLinearOp, LinearOp, ConstantOp, OdeEquations, OdeEquationsRef};

fn main() {
    type T = f64;
    type V = nalgebra::DVector<f64>;
    type M = nalgebra::DMatrix<f64>;

    struct MyRhs<'a> { p: &'a V }  // implements NonLinearOp
    struct MyMass<'a> { p: &'a V } // implements LinearOp
    struct MyInit<'a> { p: &'a V } // implements ConstantOp
    struct MyRoot<'a> { p: &'a V } // implements NonLinearOp
    struct MyOut<'a> { p: &'a V }  // implements NonLinearOp

    impl Op for MyRhs<'_> {
        type T = T;
        type V = V;
        type M = M;
        fn nstates(&self) -> usize { 1 }
        fn nout(&self) -> usize { 1 }
        fn nparams(&self) -> usize { 2 }
    }
    impl NonLinearOp for MyRhs<'_> {
        fn call_inplace(&self, x: &V, _t: T, y: &mut V) {
            y[0] = x[0] * x[0];
        }
    }

    impl Op for MyMass<'_> {
        type T = T;
        type V = V;
        type M = M;
        fn nstates(&self) -> usize { 1 }
        fn nout(&self) -> usize { 1 }
        fn nparams(&self) -> usize { 0 }
    }
    impl LinearOp for MyMass<'_> {
        fn gemv_inplace(&self, x: &V, _t: T, beta: T, y: &mut V) {
            y[0] = x[0] * beta;
        }
    }

    impl Op for MyInit<'_> {
        type T = T;
        type V = V;
        type M = M;
        fn nstates(&self) -> usize { 1 }
        fn nout(&self) -> usize { 1 }
        fn nparams(&self) -> usize { 0 }
    }
    impl ConstantOp for MyInit<'_> {
        fn call_inplace(&self, _t: T, y: &mut V) {
            y[0] = 0.1;
        }
    }

    impl Op for MyRoot<'_> {
        type T = T;
        type V = V;
        type M = M;
        fn nstates(&self) -> usize { 1 }
        fn nout(&self) -> usize { 1 }
        fn nparams(&self) -> usize { 0 }
    }
    impl NonLinearOp for MyRoot<'_> {
        fn call_inplace(&self, x: &V, _t: T, y: &mut V) {
            y[0] = x[0] - 1.0;
        }
    }

    impl Op for MyOut<'_> {
        type T = T;
        type V = V;
        type M = M;
        fn nstates(&self) -> usize { 1 }
        fn nout(&self) -> usize { 1 }
        fn nparams(&self) -> usize { 0 }
    }
    impl NonLinearOp for MyOut<'_> {
        fn call_inplace(&self, x: &V, _t: T, y: &mut V) {
            y[0] = x[0];
        }
    }

    struct MyProblem {
        p: V,
    }
    impl MyProblem {
        fn new() -> Self {
            MyProblem { p: V::zeros(2) }
        }
    }
    impl Op for MyProblem {
        type T = T;
        type V = V;
        type M = M;
        fn nstates(&self) -> usize { 1 }
        fn nout(&self) -> usize { 1 }
        fn nparams(&self) -> usize { 2 }
    }

    impl<'a> OdeEquationsRef<'a> for MyProblem {
        type Rhs = MyRhs<'a>;
        type Mass = MyMass<'a>;
        type Init = MyInit<'a>;
        type Root = MyRoot<'a>;
        type Out = MyOut<'a>;
    }

    impl OdeEquations for MyProblem {
        fn rhs(&self) -> <MyProblem as OdeEquationsRef<'_>>::Rhs {
            MyRhs { p: &self.p }
        }
        fn mass(&self) -> Option<<MyProblem as OdeEquationsRef<'_>>::Mass> {
            Some(MyMass { p: &self.p })
        }
        fn init(&self) -> <MyProblem as OdeEquationsRef<'_>>::Init {
            MyInit { p: &self.p }
        }
        fn root(&self) -> Option<<MyProblem as OdeEquationsRef<'_>>::Root> {
            Some(MyRoot { p: &self.p })
        }
        fn out(&self) -> Option<<MyProblem as OdeEquationsRef<'_>>::Out> {
            Some(MyOut { p: &self.p })
        }
        fn set_params(&mut self, p: &V) {
            self.p.copy_from(p);
        }
    }
}
```
Creating the problem
Now that we have our custom OdeEquations struct, we can use it in an OdeBuilder to create the problem via the build_from_eqn method. The snippet below assumes the MyProblem struct and the Op, NonLinearOp, LinearOp, and ConstantOp implementations for the MyRhs, MyMass, MyInit, MyRoot, and MyOut structs from the previous example.
```rust
use diffsol::OdeBuilder;

fn main() {
    // ... MyProblem and its trait implementations as defined in the previous example ...

    let problem = OdeBuilder::<M>::new()
        .p(vec![1.0, 10.0])
        .build_from_eqn(MyProblem::new())
        .unwrap();
}
```
DiffSL
Thus far we have used Rust code to specify the problem we want to solve. This is fine if you are using DiffSol from Rust, but what if you want to use DiffSol from a higher-level language like Python or R? For this use case we have designed a Domain Specific Language (DSL) called DiffSL that can be used to specify the problem. DiffSL is not a general purpose language but is tightly constrained to the specification of a system of ordinary differential equations. It features a relatively simple syntax that consists of writing a series of tensors (dense or sparse) that represent the equations of the system. For more detail on the syntax of DiffSL see the DiffSL book. This section will focus on how to use DiffSL to specify a problem in DiffSol.
DiffSL Context
The main struct that is used to specify a problem in DiffSL is the DiffSl
struct. Creating this struct
Just-In-Time (JIT) compiles your DiffSL code into a form that can be executed efficiently by DiffSol.
```rust
fn main() {
    use diffsol::{DiffSl, CraneliftModule};
    type M = nalgebra::DMatrix<f64>;
    type CG = CraneliftModule;

    let eqn = DiffSl::<M, CG>::compile("
        in = [r, k]
        r { 1.0 }
        k { 1.0 }
        u { 0.1 }
        F { r * u * (1.0 - u / k) }
    ").unwrap();
}
```
The CG
parameter specifies the backend that you want to use to compile the DiffSL code. The CraneliftModule
backend is the default backend and is behind the diffsl
feature flag. If you want to use the faster LLVM backend you can use the LlvmModule
backend, which is behind one of the diffsl-llvm*
feature flags, depending on the version of LLVM you have installed.
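For example, switching backends only requires changing the CG type alias. The following is a minimal sketch; it assumes that diffsol was built with one of the diffsl-llvm* feature flags and that LlvmModule is exposed alongside CraneliftModule:

```rust
// A sketch: identical to the example above, but using the LLVM backend.
// Assumes one of the diffsl-llvm* feature flags is enabled.
fn main() {
    use diffsol::{DiffSl, LlvmModule};
    type M = nalgebra::DMatrix<f64>;
    type CG = LlvmModule;

    let eqn = DiffSl::<M, CG>::compile("
        in = [r, k]
        r { 1.0 }
        k { 1.0 }
        u { 0.1 }
        F { r * u * (1.0 - u / k) }
    ").unwrap();
}
```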
Once you have created the DiffSl
struct you can use it to create a problem using the build_from_eqn
method on the OdeBuilder
struct.
```rust
fn main() {
    use diffsol::{DiffSl, CraneliftModule};
    use diffsol::{OdeBuilder, OdeSolverMethod, OdeSolverState};
    type M = nalgebra::DMatrix<f64>;
    type CG = CraneliftModule;
    type LS = diffsol::NalgebraLU<f64>;

    let eqn = DiffSl::<M, CG>::compile("
        in = [r, k]
        r { 1.0 }
        k { 1.0 }
        u { 0.1 }
        F { r * u * (1.0 - u / k) }
    ").unwrap();

    let problem = OdeBuilder::<M>::new()
        .rtol(1e-6)
        .p([1.0, 10.0])
        .build_from_eqn(eqn)
        .unwrap();

    let mut solver = problem.bdf::<LS>().unwrap();
    let t = 0.4;
    let _soln = solver.solve(t).unwrap();
}
```
Sparse problems
Let's consider a large system of equations whose jacobian matrix is sparse. For simplicity we will start with the logistic equation from the "Specifying the Problem" section, but we will duplicate this equation 10 times to create a system of 10 equations. This system has a diagonal jacobian matrix with 10 non-zero diagonal elements; all other elements are zero.
Since this system is sparse, we choose a sparse matrix type to represent the jacobian matrix. We will use the diffsol::SparseColMat<T> type, which is a thin wrapper around faer::sparse::SparseColMat<T>, a compressed sparse column matrix type.
```rust
fn main() {
    use diffsol::OdeBuilder;
    type M = diffsol::SparseColMat<f64>;
    type V = faer::Col<f64>;

    let problem = OdeBuilder::<M>::new()
        .t0(0.0)
        .rtol(1e-6)
        .atol([1e-6])
        .p(vec![1.0, 10.0])
        .rhs_implicit(
            |x, p, _t, y| {
                for i in 0..10 {
                    y[i] = p[0] * x[i] * (1.0 - x[i] / p[1]);
                }
            },
            |x, p, _t, v, y| {
                for i in 0..10 {
                    y[i] = p[0] * v[i] * (1.0 - 2.0 * x[i] / p[1]);
                }
            },
        )
        .init(|_p, _t| V::from_fn(10, |_| 0.1))
        .build()
        .unwrap();
}
```
Note that we have not specified the jacobian itself, but instead we have specified the jacobian multiplied by a vector function \(f'(y, p, t, v)\). DiffSol will use this function to generate a jacobian matrix, and since we have specified a sparse matrix type, DiffSol will attempt to guess the sparsity pattern of the jacobian matrix and use this to efficiently generate the jacobian matrix.
To illustrate this, we can calculate the jacobian matrix from the rhs
function contained in the problem
object:
```rust
use diffsol::OdeBuilder;
use diffsol::{OdeEquations, NonLinearOp, NonLinearOpJacobian, Matrix, ConstantOp};
type M = diffsol::SparseColMat<f64>;
type V = faer::Col<f64>;

fn main() {
    let problem = OdeBuilder::<M>::new()
        .t0(0.0)
        .rtol(1e-6)
        .atol([1e-6])
        .p(vec![1.0, 10.0])
        .rhs_implicit(
            |x, p, _t, y| {
                for i in 0..10 {
                    y[i] = p[0] * x[i] * (1.0 - x[i] / p[1]);
                }
            },
            |x, p, _t, v, y| {
                for i in 0..10 {
                    y[i] = p[0] * v[i] * (1.0 - 2.0 * x[i] / p[1]);
                }
            },
        )
        .init(|_p, _t| V::from_fn(10, |_| 0.1))
        .build()
        .unwrap();

    // Evaluate the jacobian at the initial state and print its non-zero entries
    let t0 = problem.t0;
    let y0 = problem.eqn.init().call(t0);
    let jacobian = problem.eqn.rhs().jacobian(&y0, t0);
    for (i, j, v) in jacobian.triplet_iter() {
        println!("({}, {}) = {}", i, j, v);
    }
}
```
which will print the jacobian matrix in triplet format:
(0, 0) = 0.98
(1, 1) = 0.98
(2, 2) = 0.98
(3, 3) = 0.98
(4, 4) = 0.98
(5, 5) = 0.98
(6, 6) = 0.98
(7, 7) = 0.98
(8, 8) = 0.98
(9, 9) = 0.98
DiffSol attempts to guess the sparsity pattern of your jacobian matrix by calling the \(f'(y, p, t, v)\) function repeatedly with different one-hot vectors \(v\)
with a NaN
value at each hot index. The output of this function (i.e. which elements are 0
and which are NaN
) is then used to determine the sparsity pattern of the jacobian matrix.
Due to the fact that for IEEE 754 floating point numbers, NaN
is propagated through most operations, this method is able to detect which output elements are dependent on which input elements.
However, this method is not foolproof, and it may fail to detect the correct sparsity pattern in some cases, particularly if values of v
are used in control-flow statements.
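The following small, self-contained example (it does not use the DiffSol API) illustrates both why the NaN probing works and how control flow can defeat it:

```rust
fn main() {
    // NaN propagates through arithmetic, which is what the sparsity detection relies on ...
    let probe = f64::NAN;
    assert!((2.0 * probe + 1.0).is_nan());

    // ... but any comparison with NaN is false, so control flow can hide a dependency.
    // A jacobian-action closure like this hypothetical one would look independent of
    // v[0] when probed with NaN, even though the output depends on it for positive inputs:
    let jac_mul = |v: &[f64], y: &mut [f64]| {
        y[0] = if v[0] > 0.0 { v[0] } else { 0.0 };
    };
    let mut y = [0.0];
    jac_mul(&[f64::NAN], &mut y);
    // The NaN probe was swallowed by the branch, so this entry looks like a structural zero.
    assert!(!y[0].is_nan());
}
```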
If DiffSol does not detect the correct sparsity pattern, you can manually specify the jacobian. To do this, you need to use a custom struct that implements the OdeEquations trait; this is described in more detail in the "Custom Problem Structs" section.
Choosing a solver
Once you have defined the problem, you need to create a solver to solve the problem. The available solvers are:
- diffsol::Bdf: A Backwards Difference Formulae solver, suitable for stiff problems and singular mass matrices.
- diffsol::Sdirk: A Singly Diagonally Implicit Runge-Kutta (SDIRK or ESDIRK) solver. You can define your own Butcher tableau using Tableau or use one of the pre-defined tableaus.
For each solver, you will need to specify the linear solver type to use. The available linear solvers are:
- diffsol::NalgebraLU: An LU decomposition solver using the nalgebra crate.
- diffsol::FaerLU: An LU decomposition solver using the faer crate.
- diffsol::FaerSparseLU: A sparse LU decomposition solver using the faer crate.
Each solver can be created directly, but it is generally easier to use the methods on the OdeSolverProblem struct to create the solver. For example:
```rust
use diffsol::OdeBuilder;
use nalgebra::DVector;
use diffsol::{OdeSolverState, NalgebraLU, BdfState, Tableau, SdirkState};
type M = nalgebra::DMatrix<f64>;
type LS = NalgebraLU<f64>;

fn main() {
    let problem = OdeBuilder::<M>::new()
        .p(vec![1.0, 10.0])
        .rhs_implicit(
            |x, p, _t, y| y[0] = p[0] * x[0] * (1.0 - x[0] / p[1]),
            |x, p, _t, v, y| y[0] = p[0] * v[0] * (1.0 - 2.0 * x[0] / p[1]),
        )
        .init(|_p, _t| DVector::from_element(1, 0.1))
        .build()
        .unwrap();

    // Create a BDF solver with an initial state
    let solver = problem.bdf::<LS>();

    // Create a non-initialised state and manually set the values before
    // creating the solver
    let state = BdfState::new_without_initialise(&problem).unwrap();
    // ... set the state values manually
    let solver = problem.bdf_solver::<LS>(state);

    // Create a SDIRK solver with a pre-defined tableau
    let tableau = Tableau::<M>::tr_bdf2();
    let state = problem.sdirk_state::<LS, _>(&tableau).unwrap();
    let solver = problem.sdirk_solver::<LS, _>(state, tableau);

    // Create tr_bdf2 or esdirk34 solvers directly (both are SDIRK solvers with different tableaus)
    let solver = problem.tr_bdf2::<LS>();
    let solver = problem.esdirk34::<LS>();

    // Create a non-initialised state and manually set the values before
    // creating the solver
    let state = SdirkState::new_without_initialise(&problem).unwrap();
    // ... set the state values manually
    let solver = problem.tr_bdf2_solver::<LS>(state);
}
```
Initialisation
Each solver has an internal state that holds information like the current state vector, the gradient of the state vector, the current time, and the current step size. When you create a solver using the bdf
or sdirk
methods on the OdeSolverProblem
struct, the solver will be initialised with an initial state based on the initial conditions of the problem as well as satisfying any algebraic constraints. An initial time step will also be chosen based on your provided equations.
Each solver's state struct implements the OdeSolverState
trait, and if you wish to manually create and setup a state, you can use the methods on this trait to do so.
For example, say that you wish to bypass the initialisation of the state as you already have the algebraic constraints and so don't need to solve for them. You can use the new_without_initialise
method on the OdeSolverState
trait to create a new state without initialising it. You can then use the as_mut
method to get a mutable reference to the state and set the values manually.
Note that each state struct has as_ref and as_mut methods that return a StateRef or StateRefMut struct, respectively. These structs provide a solver-independent way to access the state values, so you can use the same code with different solvers.
```rust
use diffsol::OdeBuilder;
use nalgebra::DVector;
type M = nalgebra::DMatrix<f64>;
use diffsol::{OdeSolverState, NalgebraLU, BdfState};
type LS = NalgebraLU<f64>;

fn main() {
    let problem = OdeBuilder::<M>::new()
        .p(vec![1.0, 10.0])
        .rhs_implicit(
            |x, p, _t, y| y[0] = p[0] * x[0] * (1.0 - x[0] / p[1]),
            |x, p, _t, v, y| y[0] = p[0] * v[0] * (1.0 - 2.0 * x[0] / p[1]),
        )
        .init(|_p, _t| DVector::from_element(1, 0.1))
        .build()
        .unwrap();

    let mut state = BdfState::new_without_initialise(&problem).unwrap();
    // Set the initial state value directly, bypassing the usual initialisation
    state.as_mut().y[0] = 0.1;
    let mut solver = problem.bdf_solver::<LS>(state);
}
```
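Reading values back works through as_ref in the same solver-independent way. The following is a minimal sketch, assuming the StateRef struct exposes the same y and t fields used elsewhere in this book:

```rust
use diffsol::OdeBuilder;
use nalgebra::DVector;
use diffsol::{OdeSolverState, NalgebraLU, BdfState};
type M = nalgebra::DMatrix<f64>;
type LS = NalgebraLU<f64>;

fn main() {
    let problem = OdeBuilder::<M>::new()
        .p(vec![1.0, 10.0])
        .rhs_implicit(
            |x, p, _t, y| y[0] = p[0] * x[0] * (1.0 - x[0] / p[1]),
            |x, p, _t, v, y| y[0] = p[0] * v[0] * (1.0 - 2.0 * x[0] / p[1]),
        )
        .init(|_p, _t| DVector::from_element(1, 0.1))
        .build()
        .unwrap();

    let mut state = BdfState::new_without_initialise(&problem).unwrap();
    state.as_mut().y[0] = 0.2;

    // Read-only access through as_ref(): the same field names work for any solver state.
    let y0 = state.as_ref().y[0];
    let t0 = state.as_ref().t;
    println!("starting from y = {} at t = {}", y0, t0);

    let mut solver = problem.bdf_solver::<LS>(state);
}
```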
Solving the Problem
Each solver implements the OdeSolverMethod
trait, which provides a number of methods to solve the problem.
Solving the Problem
DiffSol has a few high-level solution functions on the OdeSolverMethod
trait that are the easiest way to solve your equations:
- solve - solve the problem from an initial state up to a specified time, returning the solution at all the internal timesteps used by the solver.
- solve_dense - solve the problem from an initial state, returning the solution at a Vec of times provided by the user.
- solve_dense_sensitivities - solve the forward sensitivity problem from an initial state, returning the solution at a Vec of times provided by the user.
- solve_adjoint - solve the adjoint sensitivity problem from an initial state to a final time, returning the integration of the output function over time as well as its gradient with respect to the initial state.
The following example shows how to solve a simple ODE problem using the solve
method on the OdeSolverMethod
trait.
```rust
use diffsol::OdeBuilder;
use nalgebra::DVector;
use diffsol::{OdeSolverMethod, NalgebraLU};
type M = nalgebra::DMatrix<f64>;
type LS = NalgebraLU<f64>;

fn main() {
    let problem = OdeBuilder::<M>::new()
        .p(vec![1.0, 10.0])
        .rhs_implicit(
            |x, p, _t, y| y[0] = p[0] * x[0] * (1.0 - x[0] / p[1]),
            |x, p, _t, v, y| y[0] = p[0] * v[0] * (1.0 - 2.0 * x[0] / p[1]),
        )
        .init(|_p, _t| DVector::from_element(1, 0.1))
        .build()
        .unwrap();

    let mut solver = problem.bdf::<LS>().unwrap();
    let (ys, ts) = solver.solve(10.0).unwrap();
}
```
solve_dense
will solve a problem from an initial state, returning the solution at a Vec
of times provided by the user.
```rust
use diffsol::OdeBuilder;
use nalgebra::DVector;
type M = nalgebra::DMatrix<f64>;
use diffsol::{OdeSolverMethod, NalgebraLU};
type LS = NalgebraLU<f64>;

fn main() {
    let problem = OdeBuilder::<M>::new()
        .p(vec![1.0, 10.0])
        .rhs_implicit(
            |x, p, _t, y| y[0] = p[0] * x[0] * (1.0 - x[0] / p[1]),
            |x, p, _t, v, y| y[0] = p[0] * v[0] * (1.0 - 2.0 * x[0] / p[1]),
        )
        .init(|_p, _t| DVector::from_element(1, 0.1))
        .build()
        .unwrap();

    let mut solver = problem.bdf::<LS>().unwrap();
    let times = vec![0.0, 1.0, 2.0, 3.0, 4.0, 5.0];
    let _soln = solver.solve_dense(&times).unwrap();
}
```
Stepping the Solution
The fundamental method to step the solver through a solution is the step
method on the OdeSolverMethod
trait, which steps the solution forward in time by a single step, with a step size chosen by the solver in order to satisfy the error tolerances in the problem
struct. The step
method returns a Result
that contains the new state of the solution if the step was successful, or an error if the step failed.
```rust
use diffsol::OdeBuilder;
use nalgebra::DVector;
type M = nalgebra::DMatrix<f64>;
use diffsol::{OdeSolverMethod, NalgebraLU};
type LS = NalgebraLU<f64>;

fn main() {
    let problem = OdeBuilder::<M>::new()
        .p(vec![1.0, 10.0])
        .rhs_implicit(
            |x, p, _t, y| y[0] = p[0] * x[0] * (1.0 - x[0] / p[1]),
            |x, p, _t, v, y| y[0] = p[0] * v[0] * (1.0 - 2.0 * x[0] / p[1]),
        )
        .init(|_p, _t| DVector::from_element(1, 0.1))
        .build()
        .unwrap();

    let mut solver = problem.bdf::<LS>().unwrap();
    while solver.state().t < 10.0 {
        if let Err(_) = solver.step() {
            break;
        }
    }
}
```
The step
method will return an error if the solver fails to converge to the solution or if the step size becomes too small.
Often you will want to get the solution at a specific time \(t_o\). There are two ways to do this, depending on your particular needs. The most lightweight way is to step the solution forward until you are beyond \(t_o\), and then interpolate the solution back to \(t_o\) using the interpolate method on the OdeSolverMethod trait.
```rust
use diffsol::OdeBuilder;
use nalgebra::DVector;
type M = nalgebra::DMatrix<f64>;
use diffsol::{OdeSolverMethod, NalgebraLU};
type LS = NalgebraLU<f64>;

fn main() {
    let problem = OdeBuilder::<M>::new()
        .p(vec![1.0, 10.0])
        .rhs_implicit(
            |x, p, _t, y| y[0] = p[0] * x[0] * (1.0 - x[0] / p[1]),
            |x, p, _t, v, y| y[0] = p[0] * v[0] * (1.0 - 2.0 * x[0] / p[1]),
        )
        .init(|_p, _t| DVector::from_element(1, 0.1))
        .build()
        .unwrap();

    let mut solver = problem.bdf::<LS>().unwrap();
    let t_o = 10.0;
    while solver.state().t < t_o {
        solver.step().unwrap();
    }
    let _soln = solver.interpolate(t_o).unwrap();
}
```
The second way is to use the set_stop_time method on the OdeSolverMethod trait to stop the solver at a specific time; this overrides the internal time step so that the solver stops exactly at the specified time.
Note that this can be less efficient if you wish to continue stepping forward after the specified time, as the solver will need to be re-initialised.
The enum returned by step
will indicate when the solver has stopped at the specified time.
Once the solver has stopped at the specified time, you can get the current state of the solution using the state
method on the solver, which returns an OdeSolverState
struct.
```rust
use diffsol::OdeBuilder;
use nalgebra::DVector;
type M = nalgebra::DMatrix<f64>;
use diffsol::{OdeSolverMethod, OdeSolverStopReason, NalgebraLU};
type LS = NalgebraLU<f64>;

fn main() {
    let problem = OdeBuilder::<M>::new()
        .p(vec![1.0, 10.0])
        .rhs_implicit(
            |x, p, _t, y| y[0] = p[0] * x[0] * (1.0 - x[0] / p[1]),
            |x, p, _t, v, y| y[0] = p[0] * v[0] * (1.0 - 2.0 * x[0] / p[1]),
        )
        .init(|_p, _t| DVector::from_element(1, 0.1))
        .build()
        .unwrap();

    let mut solver = problem.bdf::<LS>().unwrap();
    solver.set_stop_time(10.0).unwrap();
    loop {
        match solver.step() {
            Ok(OdeSolverStopReason::InternalTimestep) => continue,
            Ok(OdeSolverStopReason::TstopReached) => break,
            Ok(OdeSolverStopReason::RootFound(_)) => panic!("Root finding not used"),
            Err(e) => panic!("Solver failed to converge: {}", e),
        }
    }
    let _soln = &solver.state().y;
}
```
Benchmarks
The goal of this chapter is to compare the performance of the DiffSol implementation with other similar ode solver libraries.
The libraries we will compare against are:
- Sundials: A suite of advanced numerical solvers for ODEs and DAEs, implemented in C.
- Diffrax: A Python library for solving ODEs and SDEs implemented using JAX.
- Casadi: A C++ library with Python and MATLAB bindings, for solving ODEs and DAEs, nonlinear optimisation and algorithmic differentiation.
The comparison with Sundials will be done in Rust by wrapping C functions and comparing them to Rust implementations. The comparison with the Python libraries (Diffrax and Casadi) will be done by wrapping DiffSol in Python using the PyO3 library, and comparing the performance of the wrapped functions. As well as benchmarking the DiffSol solvers, this also serves as an example of how to wrap and use DiffSol in other languages.
Sundials Benchmarks
To begin with we have focused on comparing against the popular Sundials solver suite, developed by the Lawrence Livermore National Laboratory.
Test Problems
To choose the test problems we have used several of the examples provided in the Sundials library. The problems are:
- robertson: A stiff DAE system with 3 equations (2 differential and 1 algebraic). In Sundials this is part of the IDA examples and is contained in the file ida/serial/idaRoberts_dns.c. In Sundials the problem is solved using the Sundials dense linear solver and Sunmatrix_Dense; in DiffSol we use the dense LU linear solver and dense matrices and vectors from the nalgebra library.
- robertson_ode: The same problem as robertson but in the form of an ODE. This problem has a variable size implemented by duplicating the 3 original equations \(n^2\) times, where \(n\) is the size input parameter. In Sundials the problem is solved using the KLU sparse linear solver and the Sunmatrix_Sparse matrix, and in DiffSol we use the same KLU solver from the SuiteSparse library along with the faer sparse matrix. This example is part of the Sundials CVODE examples and is contained in the file cvode/serial/cvRoberts_block_klu.c.
- heat2d: A 2D heat DAE problem with boundary conditions imposed via algebraic equations. The size \(n\) input parameter sets the number of grid points along each dimension, so the total number of equations is \(n^2\). This is part of the IDA examples and is contained in the file ida/serial/idaHeat2D_klu.c. In Sundials this problem is solved using the KLU sparse linear solver and the Sunmatrix_Sparse matrix, and in DiffSol we use the same KLU solver along with the faer sparse matrix.
- foodweb: A predator-prey problem with diffusion on a 2D grid. The size \(n\) input parameter sets the number of grid points along each dimension and we have 2 species, so the total number of equations is \(2n^2\). This is part of the IDA examples and is contained in the file ida/serial/idaFoodWeb_bnd.c. In Sundials the problem is solved using a banded linear solver and the Sunmatrix_Band matrix. DiffSol does not have a banded linear solver, so we use the KLU solver for this problem along with the faer sparse matrix.
Method
In each case we have taken the example files from the Sundials library at version 6.7.0, compiling and linking them against the same version of the Sundials code.
We have made minimal modifications to the files to remove all printf
output and to change the main
functions to named functions to allow them to be called from rust.
We have then implemented the same problem in Rust using the DiffSol library, porting the residual functions defined in the Sundials examples to DiffSol-compatible functions representing the RHS, mass matrix and jacobian multiplication functions for the problem.
We have used the outputs published in the Sundials examples as the reference outputs for the tests to ensure that the implementations are equivalent.
The relative and absolute tolerances for the solvers were set to the same values in both implementations.
There are a number of differences between the Sundials and DiffSol implementations that may affect the performance of the solvers. The main differences are:
- The Sundials IDA solver has the problem defined as a general DAE system, while the DiffSol solver has access to the RHS and mass matrix functions separately. The Sundials CVODE solver has the problem defined as an ODE system and the mass is implicitly an identity matrix; this is also the case for the DiffSol implementation of the robertson_ode problem.
- In the Sundials examples that use a user-defined jacobian (robertson, robertson_ode, heat2d), the jacobian is provided as a sparse or dense matrix. In DiffSol the jacobian is provided as a function that multiplies the jacobian by a vector, so DiffSol needs to do additional work to generate a jacobian matrix from this function.
- Generally the types of matrices and linear solver are matched between the two implementations (see details in the "Test Problems" section above). However, the foodweb problem is slightly different in that it is solved in Sundials using a banded linear solver and banded matrices, and the jacobian is calculated using finite differences. In DiffSol we use the KLU sparse linear solver and sparse matrices, and the jacobian is calculated using the jacobian function provided by the user.
- The Sundials implementations make heavy use of indexing into arrays, as do the DiffSol implementations. In Rust this indexing is bounds checked, which affects performance slightly but was not found to be a significant factor.
Finally, we have also implemented the robertson
, heat2d
and foodweb
problems in the DiffSl language. For the heat2d
and foodweb
problems we
wrote out the diffusion matrix and mass matrix from the Rust implementations and
wrote the rest of the model by hand. For the robertson
problem we wrote the
entire model by hand.
Results
These results were generated using DiffSol v0.5.1, and were run on a Dell PowerEdge R7525 2U rack server, with dual AMD EPYC 7343 3.2Ghz 16C CPU and 128GB Memory.
The performance of each implementation was timed and includes all setup and solve time. The exception to this is for the DiffSl implementations, where the JIT compilation for the model was not included in the timings (since the compilation time for the C and Rust code was also not included). We have presented the results in the following graphs, where the x-axis is the size of the problem \(n\) and the y-axis is the time taken to solve the problem relative to the time taken by the Sundials implementation (so \(10^0\) is the same time as Sundials, \(10^1\) is 10 times slower etc.)
Bdf solver
The BDF solver is the same method as that used by the Sundials IDA and CVODE solvers so we expect the performance to be largely similar, and this is generally the case. There are differences due to the implementation details for each library, and the differences in the implementations for the linear solvers and matrices as discussed above.
For the small, dense, stiff robertson
problem the DiffSol implementation is very close and only slightly faster than Sundials (about 0.9).
For the sparse heat2d
problem the DiffSol implementation is slower than Sundials for smaller problems (about 2) but the performance improves significantly for larger problems until it is at about 0.3.
Since this improvement for larger systems is not seen in foodweb
or robertson_ode
problems, it is likely due to the fact that the heat2d
problem has a constant jacobian matrix and the DiffSol implementation has an advantage in this case.
For the foodweb
problem the DiffSol implementation is between 1.8 - 2.1 times slower than Sundials. It is interesting that when the problem is implemented in the DiffSl language (see benchmark below) the slowdown reduces to 1.5, indicating
that much of the performance difference is likely due to the implementation details of the Rust code for the rhs, jacobian and mass matrix functions.
The DiffSol implementation of the robertson_ode
problem ranges between 1.2 - 1.8 times slower than Sundials over the problem range.
tr_bdf2 & esdirk34 solvers (SDIRK)
The low-order tr_bdf2
solver is slower than the bdf
solver for all the problems, perhaps due to the generally tight tolerances used (robertson
and robertson_ode
have tolerances of 1e-6-1e-8, heat2d
was 1e-7 and foodweb
was 1e-5). The esdirk34
solver is faster than bdf
for the foodweb
problem, but slightly slower for the other problems.
Bdf + DiffSl
The main difference between this plot and the previous for the Bdf solver is the use of the DiffSl language for the equations, rather than Rust closures.
The trends in each case are mostly the same, and the DiffSl implementation only has a small slowdown compared with Rust closures for most problems. For some problems, such as foodweb
, the DiffSl implementation is actually faster than the Rust closure implementation.
This may be due to the fact that the rust closures bound-check the array indexing, while the DiffSl implementation does not.
This plot demonstrates that a DiffSL implementation can be comparable in speed to a hand-written Rust or C implementation, but much more easily wrapped and used from a high-level language like Python or R, where the equations can be passed down to the rust solver as a simple string and then JIT compiled at runtime.
Python (Diffrax & Casadi)
Diffrax is a Python library for solving ODEs and SDEs implemented using JAX. Casadi is a C++ library with Python and MATLAB bindings for solving ODE and nonlinear optimisation problems. In this benchmark we compare the performance of the DiffSol implementation with the Diffrax and Casadi libraries.
As well as demonstrating the performance of the DiffSol library, this benchmark also serves as an example of how to wrap and use DiffSol in other languages. The code for this benchmark can be found here. The maturin library was used to generate a template for the Python bindings and the CI/CD pipeline necessary to build the wheels ready for distribution on PyPI. The pyo3 library was used to wrap the DiffSol library in Python.
Problem setup
We will use the robertson_ode problem for this benchmark. This is a stiff ODE system with 3 equations and 3 unknowns, and is a common benchmark problem. To illustrate the performance over a range of problem sizes, the Robertson equations were duplicated by a factor of ngroups, so the total number of equations solved is 3 * ngroups.
The Diffrax implementation was based on the example in the Diffrax documentation, which was extended to include the ngroups parameter. As in that example, the Kvaerno5 method was used for the solver. You can see the final implementation of the model here.
The Casadi implementation was written from scratch using Casadi's python API. You can see the final implementation of the model here.
The DiffSol implementation of the model was written using the DiffSL language; you can see the final implementation of the model here.
The full implementation of the benchmark presented below can be seen here. The DiffSol benchmark is performed using the bdf solver. For ngroups < 20 it uses the nalgebra dense matrix and LU solver, and for ngroups >= 20 the faer sparse matrix and LU solver are used.
Differences between implementations
There are a few key differences between the Diffrax, Casadi and DiffSol implementations that may affect the performance of the solvers. The main differences are:
- The Casadi implementation uses sparse matrices, whereas the DiffSol implementation uses dense matrices for ngroups < 20 and sparse matrices for ngroups >= 20. This will provide an advantage for DiffSol for smaller problems.
- I'm unsure if the Diffrax implementation uses sparse or dense matrices, but it is most likely dense as JAX only has experimental support for sparse matrices. Treating the Jacobian as dense will be a disadvantage for Diffrax for larger problems, as the Jacobian is very sparse.
- The Diffrax implementation uses the Kvaerno5 method (a 5th order implicit Runge-Kutta method). This is different from the BDF method used by both the Casadi and DiffSol implementations.
- Each library was allowed to use multiple threads according to its default settings. The only part of the DiffSol implementation that takes advantage of multiple threads is the faer sparse LU solver and matrix; the nalgebra LU solver and matrix and the DiffSL-generated code are all single-threaded. Diffrax uses JAX, which takes advantage of multiple threads (CPU only, no GPUs were used in these benchmarks). The Casadi implementation also uses multiple threads.
Results
The benchmarks were run on a Dell PowerEdge R7525 2U rack server, with dual AMD EPYC 7343 3.2Ghz 16C CPU and 128GB Memory. Each benchmark was run using both a low (1e-8) and a high (1e-4) tolerance for both rtol and atol, and with ngroups ranging between 1 and 10,000. The results are presented in the following graph, where the x-axis is the size of the problem ngroups and the y-axis is the time taken to solve the problem relative to the time taken by the DiffSol implementation (so \(10^0\) is the same time as DiffSol, \(10^1\) is 10 times slower, etc.).
DiffSol is faster than both the Casadi and Diffrax implementations over the range of problem sizes and tolerances tested, although the Casadi and DiffSol implementations converge to be similar for larger problems (ngroups
> 100).
The DiffSol implementation outperforms the other implementations significantly for small problem sizes (ngroups < 5). E.g. at ngroups = 1, Casadi and Diffrax are between 3 - 40 times slower than DiffSol. At these small problem sizes, the dense matrix and solver used by DiffSol provide an advantage over the sparse solver used by Casadi. Casadi also has additional overhead for each function evaluation, as it needs to traverse a graph of operations to calculate each rhs or jacobian evaluation, whereas the DiffSL JIT compiler compiles to native code using the LLVM backend, along with low-level optimisations that are not available to Casadi. Diffrax is also significantly slower than DiffSol for smaller problems; this might be due to (a) Diffrax being a ML library and not optimised for solving stiff ODEs, (b) the use of double precision, which again is not a common use case for ML libraries, or (c) the different solver methods (Kvaerno5 vs BDF).
As the problem sizes get larger, the performance of Diffrax and Casadi improves rapidly relative to DiffSol, but after ngroups > 10 the performance of Diffrax drops off again, probably due to JAX not taking advantage of the sparse structure of the problem. The performance of Casadi continues to improve, and for ngroups > 100 it is comparable to DiffSol. By the time ngroups = 10,000, the performance of Casadi is identical to that of DiffSol.