Mathematical modeller

John is Professor of Infectious Disease Modelling and Dean of the Faculty of Epidemiology and Population Health at the London School of Hygiene and Tropical Medicine.

The devastating epidemic of Ebola virus disease (EVD) in West Africa has taken an enormous toll in terms of human suffering and economic loss. As of 18th January 2015, Sierra Leone is the worst affected country, with over 8000 confirmed and probable cases reported

In this study we used a mathematical model of Ebola virus transmission to estimate how the reproduction number,

We used a combination of patient and situation report data to obtain reliable and up-to-date time series of the number of confirmed and probable cases. The WHO publishes the weekly number of confirmed and probable cases at the subnational level for Sierra Leone on their website

To model disease transmission, we used a stochastic SEIR transmission model

We fitted the model to the time series of weekly reported cases (confirmed and probable) using a Bayesian approach

We then used the model to simulate 5000 potential epidemic trajectories from 18th January 2015 until 29 March 2015, and thus to predict the number of cases there would be in the community. Each simulation started with a value of the reproduction number sampled from the posterior distribution on the latest data point. We also conducted a sensitivity analysis by taking the averaged posterior distribution of

While this paper was under review, we collected data for two more weeks (weeks ending 25th January and 2nd February 2015). Here, instead of re-fitting to the latest data, we decided to assess how well our forecasts matched these two additional data points. Weekly updates of our fits and forecasts to the latest data are available online

We focused our analysis on the nine districts of Sierra Leone that have reported the most cases since 1st November 2014: Bo, Bombali, Kambia, Koinadugu, Kono, Moyamba, Port Loko, Tonkolili and Western Area (Figures 1 and 2). They have a combined population of 4.7 million, representing 75% of the total Sierra Leonean population. A comparison of data sources for each district shows that the WHO data published on 11th January 2015 give the more complete estimate of the number of confirmed and probable cases up to a cut-off date ranging from 30th November 2014 to 4th January 2015, depending on the district (see Figure 2, red lines). After this cut-off date, the situation reports provide more complete estimates of the number of confirmed and probable cases (Figure 2).

The map shows cumulative number of confirmed and probable cases reported up to 18 January 2015 in the fourteen districts of Sierra Leone. Darker shades of red indicate a greater number of cases.

The vertical red line indicates the cut-off date after which we used the situation report database since it provided more reliable information than the patient database.

The estimated number of weekly reported cases in the fitted model was consistent with the observed data, suggesting our framework was able to capture the overall pattern of transmission over time (Figures 3, 4 and 5, upper panels). The epidemic appears to be peaking or in decline in all districts. However, in the most heavily affected districts (Western Area, Port Loko) there were still more than 30 cases per week in January 2015 (Figure 3). The situation is less clear in Kambia, where the number of cases was stable in January at around 10 cases per week, but only one case was reported in the week ending 2nd February 2015 (Figure 4).

The reproduction number,

| District | Reproduction number (interval) |
|---|---|
| Bo | 0.35 (0.21 - 0.55) |
| Bombali | 0.28 (0.16 - 0.52) |
| Kambia | 0.97 (0.71 - 1.2) |
| Koinadugu | 0.098 (0.024 - 0.36) |
| Kono | 0.24 (0.078 - 0.63) |
| Moyamba | 0.39 (0.11 - 1.1) |
| Port Loko | 0.46 (0.34 - 0.62) |
| Tonkolili | 0.28 (0.15 - 0.49) |
| Western Area | 0.32 (0.2 - 0.47) |

The shaded area is the interquartile range on estimates (grey) and projections (blue), the red solid line is the bed capacity and the red dotted line on the lower panels represents

In Kambia,

Finally, the situation in Kono, Moyamba and Koinadugu, where

We used our fitted model to estimate the number of assessment and treatment beds needed over time and compared this with the number of beds available (Table 2). Our results suggest that bed capacity has remained below what was needed since the outset of the Ebola outbreak in most areas, but that this is now changing. In Western Area and Port Loko, for instance, the bed capacity increased dramatically in December, which coincides with the peaking of the epidemic curve (Figure 3). In Bombali and Tonkolili, where the epidemic first started to decline, current bed capacity is predicted to be sufficient (Figures 3 and 5). However, three districts still lack treatment beds (Kambia, Koinadugu, Kono). This is of particular concern in Kambia, where the assessment bed capacity would become insufficient to isolate all suspected cases if the epidemic were to grow in the near future (Figure 4).

| District | Actual assessment beds expected | Assessment beds needed (18th Jan 2015) | Assessment beds needed (29th March 2015) | Actual treatment beds expected | Treatment beds needed (18th Jan 2015) | Treatment beds needed (29th March 2015) |
|---|---|---|---|---|---|---|
| Bo | 20 | 112 (89 - 156) | 0 (0 - 22) | 50 | 2 (2 - 3) | 0 (0 - 0) |
| Bombali | 296 | 124 (104 - 156) | 0 (0 - 10) | 210 | 6 (5 - 8) | 0 (0 - 0) |
| Kambia | 55 | 64 (54 - 75) | 61 (17 - 150) | 0 | 10 (9 - 12) | 10 (2 - 24) |
| Koinadugu | 83 | 0 (0 - 1) | 0 (0 - 0) | 0 | 1 (0 - 1) | 0 (0 - 0) |
| Kono | 58 | 60 (50 - 79) | 0 (0 - 14) | 0 | 20 (17 - 27) | 0 (0 - 5) |
| Moyamba | 4 | 63 (47 - 79) | 0 (0 - 111) | 22 | 4 (3 - 5) | 0 (0 - 7) |
| Port Loko | 326 | 211 (191 - 234) | 16 (6 - 39) | 219 | 36 (32 - 40) | 2 (1 - 6) |
| Tonkolili | 199 | 108 (87 - 141) | 0 (0 - 10) | 100 | 5 (4 - 7) | 0 (0 - 0) |
| Western Area | 388 | 583 (529 - 651) | 13 (4 - 49) | 556 | 73 (66 - 81) | 1 (0 - 6) |

We compiled data from daily situation reports from Sierra Leone and fitted an EVD transmission model to these reports to estimate how the reproduction number changed in different parts of the country from August 2014 to January 2015. Our analysis suggests that the epidemic is peaking in Sierra Leone, particularly in the more heavily populated Western Area, and that the reproduction number is currently close to or below the epidemic control threshold of

We separated the bed demand for EHCs/CCCs from that for ETCs in the model. This is because EHC/CCC planning must anticipate a high proportion of suspected but non-EVD cases. By contrast, we have assumed that ETCs received only confirmed cases. In reality, this separation is subtler as many ETCs proceed to triage and can therefore fill the gap between EHC/CCC capacity and bed demand, such as in Bo.

Our forecast approach assumes that the situation remains unchanged from what is inferred from the last data point. Comparing our forecasts with two additional weeks of data, we found that this assumption held for the districts showing a steady decline in the number of cases (Bo, Bombali, Koinadugu, Moyamba, Tonkolili and Western Area). In the three other districts the number of cases dropped below our IQR forecast estimates during either the first (Kono and Port Loko) or second (Kambia) additional week. However, the increase in the number of cases during the following week in Kono and Port Loko suggests that one should be cautious in interpreting the recent decline in cases in Kambia. Fitting the model to these two additional data points would indicate whether a change in transmission and/or in the reporting of cases occurred recently in these districts. Finally, we conducted a sensitivity analysis on our forecast by using the average

In many areas the drop in the reproduction number has coincided with an increase in bed capacity. For instance, in Western Area the fall in the reproduction number in October occurred at the same time as several ETCs were opened, notably the Hastings-Freetown ETC organised at the Police Training School (125 beds). However, since we did not include an explicit mechanism by which bed capacity affected transmission in the model

Community transmission was represented using a single parameter in the model because it has been shown that it is not possible to robustly estimate multiple routes of transmission - such as the contribution from funerals - for Ebola from a single incidence curve

Real-time modelling is key to tracking changes in

Subnational time-series from the patient database were downloaded from the WHO website

To model Ebola virus disease (EVD) transmission, we used a stochastic SEIR framework accounting for hospitalisation and delay in case reporting (Figure 1 and Table 2 of the Appendix). We assumed that the population in each district was initially fully susceptible to infection. Based on published empirical estimates from the WHO Ebola response team … (I_{c}) and hospital (I_{h}) by assuming a mean time from onset to hospitalisation, 1/τ = 4.3 days

| Transition | Description | Rate | Note |
|---|---|---|---|
| S → E_{1} | Infection | β_{t}S(I_{c} + I_{h})/N | log(β_{t}) is a Wiener process |
| E_{1} → E_{2} | Progression of incubation | …E_{1} | |
| E_{2} → I_{c} | Onset of symptoms and infectiousness | …E_{2} | |
| I_{c} → I_{h} | Hospitalisation and notification | τI_{c} | Includes multiplicative Gamma noise |
| I_{h} → R | Removal | …I_{h} | |
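As an illustration of the compartmental structure in the table above, the following is a minimal sketch of a discrete-time, chain-binomial version of the model. It is not the authors' SSM implementation: the daily time step, the helper names (`binomial`, `simulate_seir`), the constant `beta`, and the values of `incubation_days` and `hospital_days` are assumptions made for the example, and the Gamma noise on the notification step is omitted. Only the 4.3-day mean time from onset to hospitalisation comes from the text.

```python
import math
import random


def binomial(rng, n, p):
    """Draw from Binomial(n, p) by summing Bernoulli trials (adequate for small n)."""
    return sum(1 for _ in range(n) if rng.random() < p)


def simulate_seir(beta, n_days, population, initial_infectious, rng,
                  incubation_days=9.0, onset_to_hosp_days=4.3, hospital_days=5.0):
    """Simulate daily notifications (I_c -> I_h transitions) for one district."""
    s = population - initial_infectious
    e1, e2, i_c, i_h, r = 0, 0, initial_infectious, 0, 0
    # Daily transition probabilities derived from exponential waiting times.
    p_incub = 1.0 - math.exp(-2.0 / incubation_days)    # two equal stages (assumption)
    p_hosp = 1.0 - math.exp(-1.0 / onset_to_hosp_days)  # 1/tau = 4.3 days, from the text
    p_remove = 1.0 - math.exp(-1.0 / hospital_days)     # hypothetical hospital stay
    notifications = []
    for _ in range(n_days):
        # Both community and hospitalised cases contribute to transmission.
        p_infect = 1.0 - math.exp(-beta * (i_c + i_h) / population)
        new_e1 = binomial(rng, s, p_infect)
        new_e2 = binomial(rng, e1, p_incub)
        new_ic = binomial(rng, e2, p_incub)
        new_ih = binomial(rng, i_c, p_hosp)
        new_r = binomial(rng, i_h, p_remove)
        s -= new_e1
        e1 += new_e1 - new_e2
        e2 += new_e2 - new_ic
        i_c += new_ic - new_ih
        i_h += new_ih - new_r
        r += new_r
        notifications.append(new_ih)
    return notifications, (s, e1, e2, i_c, i_h, r)


rng = random.Random(1)
cases, final_state = simulate_seir(beta=0.3, n_days=60, population=5000,
                                   initial_infectious=5, rng=rng)
print(sum(cases), final_state)
```

The sketch counts notifications at the I_{c} → I_{h} transition, which is the quantity the model compares with reported cases.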

We modelled the time-varying transmission parameter, β_{t}, by a Wiener process on log(β_{t}):

d log(β_{t}) = σ dB_{t},

where σ is the volatility of the Brownian motion B_{t} and was estimated when fitting the model to data. Intuitively, the higher the volatility, the larger the changes in β_{t}

R_{t} = β_{t}ΔS_{t}/N
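The random walk on the log of the transmission parameter can be illustrated with a simple Euler-Maruyama discretisation. This is a sketch only: the function name `simulate_beta_path`, the step size `dt` and the numerical values are assumptions chosen for the example, not values estimated in the study.

```python
import math
import random


def simulate_beta_path(beta0, sigma, n_steps, dt, rng):
    """Simulate beta_t when log(beta_t) follows a Wiener process with volatility sigma."""
    log_beta = math.log(beta0)
    path = [beta0]
    for _ in range(n_steps):
        # Brownian increment over a step of length dt has standard deviation sqrt(dt).
        log_beta += sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        path.append(math.exp(log_beta))
    return path


rng = random.Random(3)
low_volatility = simulate_beta_path(beta0=0.3, sigma=0.05, n_steps=20, dt=1.0, rng=rng)
high_volatility = simulate_beta_path(beta0=0.3, sigma=0.5, n_steps=20, dt=1.0, rng=rng)
print(min(low_volatility), max(low_volatility))
print(min(high_volatility), max(high_volatility))
```

Working on the log scale keeps the simulated transmission parameter positive whatever the size of the increments.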

It has been reported that the time from onset to notification of EVD cases is over-dispersed. We therefore added noise to the hospitalisation and notification transition (I_{c} → I_{h}). More precisely, we used a multiplicative Gamma noise

The second source of variability in the weekly incidence data compiled from the SitReps or collected by WHO comes from under reporting. This can be split further between:

The proportion, ρ,

The proportion, κ_{t}, … (κ_{t} = 1) for the time period where we use the SitReps instead of the patient database (see Figure 2 of the main text).

Based on these observations, we modelled the weekly incidence reported in the SitReps, X_{t}, given the simulated incidence Z_{t}, by a negative binomial distribution with mean E(X_{t}|Z_{t}) = ρκ_{t}Z_{t} and variance Var(X_{t}|Z_{t}) = ρκ_{t}(1 - ρκ_{t})Z_{t} + φ^{2}ρ^{2}κ_{t}^{2}Z_{t}^{2}.
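To make the observation model concrete, the sketch below computes the mean and variance of the reported incidence for a given simulated incidence. The variance follows the expression given in the authors' correction, Var(X_t|Z_t) = ρκ_t(1 - ρκ_t)Z_t + φ^2ρ^2κ_t^2Z_t^2. The mean ρκ_tZ_t is an assumption consistent with that expression, and the function name and numerical values are illustrative only.

```python
def reported_case_moments(z, rho, kappa, phi):
    """Return (mean, variance) of reported weekly cases given simulated incidence z."""
    p = rho * kappa                     # overall probability that a case is reported
    mean = p * z                        # assumed mean of the observation process
    binomial_part = p * (1.0 - p) * z   # variance from incomplete reporting
    overdispersion_part = (phi * p * z) ** 2
    return mean, binomial_part + overdispersion_part


# Illustrative values: 100 simulated cases, 60% reporting, kappa = 1, phi = 0.5.
mean, variance = reported_case_moments(z=100, rho=0.6, kappa=1.0, phi=0.5)
print(mean, variance)
```

With φ = 0 the variance reduces to that of binomial under-reporting, so φ controls how much extra dispersion the reported counts carry.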

For each of the nine districts, we fitted our model and estimated three parameters: over-dispersion of reported cases; volatility (i.e. standard deviation) of the Wiener process on log(β_{t})

Our inference framework, described below, also allowed us to estimate how the time dependent reproduction number, R_{t} = β_{t}Δ…

Model fitting was performed using the SSM library

In SSM, parameters are transformed to ensure positivity (log transform) or any boundedness (e.g. logit transform for probabilities) and the pMCMC is implemented with an adaptive multivariate normal proposal distribution on the transformed parameter space. The adaptive procedure of the proposal kernel operates in two steps. First, the size of the covariance matrix is adapted at each iteration to achieve an optimal acceptance rate of ~23%

The SSM library also implements a Kalman-simplex algorithm (ksimplex), which was used to maximise the (non-normalized) posterior distribution and thus initialise the pMCMC close to the mode of the target. Since the simplex algorithm only guarantees convergence to a local maximum, we ran 1000 independent ksimplex runs initialised from parameter sets sampled from the prior distribution. We selected the run that converged to the highest posterior density value, and used the resulting parameter set to initialise 2 independent pMCMC chains of 100,000 iterations. We visually checked that the 2 chains converged to the same stationary distribution and combined them after appropriate thinning and accounting for burn-in. For each posterior sample, a filtered trajectory was sampled from all particles (with probability equal to its overall likelihood). Figures 3, 4 and 5 of the main text show the median and interquartile range of these trajectories at each time point. Figures were plotted using the R software
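The post-processing of the chains described above (discarding burn-in, thinning, combining, then summarising by median and interquartile range) can be sketched as follows. The function names and the burn-in and thinning values are hypothetical, since the text only states that "appropriate" values were used, and the chains here are short lists standing in for real pMCMC output.

```python
import statistics


def combine_chains(chains, burn_in, thin):
    """Drop the first burn_in samples of each chain, keep every thin-th sample, then pool."""
    combined = []
    for chain in chains:
        combined.extend(chain[burn_in::thin])
    return combined


def median_and_iqr(values):
    """Return the median and the (lower, upper) quartiles of a list of samples."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q2, (q1, q3)


# Two short stand-in chains for one parameter.
chain_a = [0.50, 0.48, 0.41, 0.40, 0.39, 0.42, 0.38, 0.40]
chain_b = [0.30, 0.33, 0.37, 0.41, 0.40, 0.39, 0.43, 0.41]
pooled = combine_chains([chain_a, chain_b], burn_in=2, thin=2)
print(median_and_iqr(pooled))
```

Applying `median_and_iqr` at each time point of the sampled trajectories would give summaries of the kind plotted in the fit figures.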

To estimate future bed requirements, we simulated 5000 stochastic trajectories from 18th January 2015 until 29 March 2015. In order to propagate the uncertainty of the Bayesian posterior distribution, each simulation was started by sampling a set of parameters and states from the joint posterior distribution on 18th January 2015 (last fitted data point).
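The way posterior uncertainty is carried into the forecasts can be sketched as below: each simulated trajectory starts from one joint draw of parameters and state. The forward model here is a deliberately simple stand-in (weekly cases drawn from a Poisson distribution with mean R times the previous week's cases), not the SEIR model used in the study, and the posterior draws are made-up values for illustration.

```python
import math
import random


def poisson(rng, lam):
    """Knuth's Poisson sampler; adequate for the small means used here."""
    limit = math.exp(-lam)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def toy_simulator(draw, n_weeks, rng):
    """Stand-in forward model: weekly cases ~ Poisson(R * previous week's cases)."""
    cases, trajectory = draw["cases"], []
    for _ in range(n_weeks):
        cases = poisson(rng, draw["R"] * cases)
        trajectory.append(cases)
    return trajectory


def forecast(posterior_draws, n_sims, n_weeks, rng):
    """Start each simulation from a joint (parameter, state) draw of the posterior."""
    return [toy_simulator(rng.choice(posterior_draws), n_weeks, rng)
            for _ in range(n_sims)]


# Hypothetical joint posterior draws: reproduction number and last week's cases.
posterior_draws = [{"R": 0.4, "cases": 30}, {"R": 0.5, "cases": 35}, {"R": 0.6, "cases": 28}]
trajectories = forecast(posterior_draws, n_sims=5000, n_weeks=10, rng=random.Random(7))
week_one = sorted(t[0] for t in trajectories)
print("median cases in first forecast week:", week_one[len(week_one) // 2])
```

Sampling parameters and state together, rather than independently, preserves the correlations present in the joint posterior.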

In order to estimate bed requirements for EHCs/CCCs, we accounted for the number of suspected but non-EVD cases, who remain isolated until their test result is known. These numbers are reported in the MoHS SitReps and allowed us to compute the weekly proportion of positive EVD cases. Figure 2 of the Appendix shows how this proportion changed over time in each district. Overall, the proportion of EVD cases decreased over time in most districts and was around 30% in January. In the model, we assumed that the bed demand in EHCs/CCCs was equal to the number of EVD cases in their first three days post-notification divided by the empirical proportion of EVD cases at that time. We used the average value over January during the forecast.
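The bed-demand rule for EHCs/CCCs described above amounts to a short calculation, sketched here. The function name and the daily counts are hypothetical. The three-day window and the proportion of roughly 30% come from the text.

```python
def assessment_beds_needed(daily_notified_evd, proportion_positive, stay_days=3):
    """Beds needed for suspected cases, given recent EVD notifications.

    daily_notified_evd: EVD cases notified per day, most recent day last.
    proportion_positive: share of suspected cases that test positive for EVD.
    """
    if not 0.0 < proportion_positive <= 1.0:
        raise ValueError("proportion_positive must be in (0, 1]")
    # EVD cases still within their first stay_days after notification.
    evd_in_assessment = sum(daily_notified_evd[-stay_days:])
    # Scale up to include suspected cases who will test negative.
    return evd_in_assessment / proportion_positive


print(assessment_beds_needed([2, 3, 1, 4, 5], proportion_positive=0.3))
```

Dividing by the proportion positive means that a falling share of true EVD cases raises assessment bed demand even when EVD incidence is unchanged.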

In this study, we have used the estimate of the case fatality rate (CFR) for all cases, based on status outcome, which was reported to be 73.4%

Lower estimates of the contact rate (β) in order to obtain the same estimates of the reproduction number over time as when using a shorter infectious period.

An increase of the time spent in hospital and thus of the number of beds required.

Accordingly, although a lower CFR would not affect our conclusions regarding the temporal changes in the reproduction number, it would affect our estimates of the number of beds required. For instance, using the CFR of hospitalized cases (60.3% instead of 73.4%) for the whole population would translate into an increase of 22% of the time spent in hospital and thus of the number of beds required. Note however that in our study we assumed that all cases (including the 40% under-reported) would present to the hospital when calculating the number of beds required. As such, our estimates can already be seen as conservative, even without the additional 22% due to a lower CFR in hospitalized patients.

In the main analysis, we sampled from the posterior distribution of the reproduction number on the latest data point. To conduct a sensitivity analysis, we also took the averaged posterior distribution over the first three weeks of January, which smoothed the most recent changes in the reproduction number. Overall, the forecasts remained much the same, except for Kono and Moyamba, where using the average led to a noticeably larger projection of future case numbers, as well as increased uncertainty, which comes from the sudden changes in epidemic dynamics in early January.

Blue dashed line shows median forecast as in Figures 3-5, with the reproduction number sampled from the posterior at the last data point. Orange dashed line shows median forecast when the reproduction number is sampled from the average posterior distribution over the first three weeks of January. Shaded regions represent interquartile range. Fitted data are plotted as filled circles and the two additional, non-fitted, data as open triangles.

We would like to alert readers to a typo in the section Appendix > Model and parameters > Reporting of cases. The correct expression for the variance of the observation process is: Var(X_{t}|Z_{t}) = ρκ_{t}(1 - ρκ_{t})Z_{t} + φ^{2}ρ^{2}κ_{t}^{2}Z_{t}^{2}

In this study, we fitted time series for Sierra Leone up to 18th January 2015. We are also tracking the Ebola epidemic in Guinea and Liberia at the subnational level. Weekly updates of all our real-time analyses are published online: http://ntncmch.github.io/ebola/ Do not hesitate to contact us if you have any questions. Anton