Postdoc @ Hokkaido University, infectious disease modeling

Background: Japan experienced a multi-generation outbreak of measles from March to May 2018. The present study aimed to capture the transmission dynamics of measles using a simple mathematical model and to forecast the future incidence of cases.

Methods: Epidemiological data that consist of the date of illness onset and the date of laboratory confirmation were analysed. A functional model that captures the generation-dependent growth patterns of cases was employed, while accounting for the time delay from illness onset to diagnosis.

Results: As long as the number of generations is correctly captured, the model yielded a valid forecast of measles cases, explicitly addressing the reporting delay. Except for the first generation, the effective reproduction number was estimated by generation, assisting evaluation of public health control programs.

Conclusions: The variance of the generation time of measles is small relative to its mean, and thus the proposed model was able to identify the generation-dependent dynamics accurately during the early phase of the epidemic. Model comparison indicated the most likely number of generations, allowing us to assess how effectively public health interventions prevented secondary transmission.

Measles is a highly contagious viral infectious disease transmitted by aerosol.

In March 2018, an abrupt outbreak was notified in Okinawa ahead of the “Golden Week”, the longest vacation period of the year (i.e., from 28 April to 6 May 2018). The index case was a 30-year-old Taiwanese man with a history of travel to Thailand in early March. On 17 March 2018, he flew to Okinawa, and on the third day of his stay he sought medical care. Following an incubation period of 11-12 days after the diagnosis of the index case

During this outbreak, measles cases were confirmed at governmental diagnostic research facilities and reported in real time. Each report was regarded as a snapshot of the growing epidemic curve and was used to forecast the future course of the outbreak. To better understand the transmission dynamics during the course of the outbreak, we used these forecasts to inform the assessment of public health control activities. Although the control activities are not explicitly assessed in our exposition, the purpose of the present study is to capture the transmission dynamics of measles using a simple, parsimonious mathematical model and to forecast future generations of measles incidence.

Measles is clinically diagnosed by the presence of a generalized rash, fever, and catarrhal symptoms, such as cough, coryza, or conjunctivitis, and is then laboratory confirmed. Laboratory confirmation is performed by detection of measles-specific immunoglobulin M (IgM) antibodies

Owing to intensive contact tracing, we assumed that all cases were diagnosed and reported. To quantify the underlying epidemiological dynamics and the distribution of the delay from illness onset to laboratory confirmation, we used maximum likelihood estimation. Specifically, we considered the total (composite) likelihood function $L_\Sigma$, which combines a likelihood $L_n$ for the incidence data with a likelihood $L_h$ for the delay from illness onset to laboratory confirmation.

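We assumed that the delay from illness onset to laboratory confirmation is described by a distribution with the parameter set $\theta_h = \{\mu_h, \sigma_h\}$, which determines its mean and variance, and that it is discretized into the probability mass function $h_t$.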
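A discretized delay distribution of this kind can be sketched as follows. The gamma family is an assumption made here purely for illustration, the mean and variance are the point estimates reported in the Results (4.5 days and 6.1 days$^2$), and the function names and the 30-day cut-off are hypothetical choices of ours.

```python
import math


def gamma_pdf(x, shape, scale):
    """Density of a gamma distribution; zero for non-positive x."""
    if x <= 0:
        return 0.0
    log_pdf = ((shape - 1) * math.log(x) - x / scale
               - math.lgamma(shape) - shape * math.log(scale))
    return math.exp(log_pdf)


def discretized_delay_pmf(mean, variance, max_delay, steps_per_day=200):
    """Return h[0..max_delay], the probability that the delay lies in [t, t+1) days.

    The gamma family is an assumption of this sketch, not taken from the paper.
    """
    shape = mean ** 2 / variance      # moment matching
    scale = variance / mean
    dx = 1.0 / steps_per_day
    pmf = []
    for t in range(max_delay + 1):
        # Midpoint rule on the interval [t, t+1)
        mass = sum(gamma_pdf(t + (k + 0.5) * dx, shape, scale)
                   for k in range(steps_per_day)) * dx
        pmf.append(mass)
    total = sum(pmf)
    return [p / total for p in pmf]   # renormalize after truncating at max_delay


# Point estimates quoted in the Results; the 30-day cut-off is arbitrary.
h = discretized_delay_pmf(mean=4.5, variance=6.1, max_delay=30)
mean_whole_days = sum(t * p for t, p in enumerate(h))
```

Because each delay is rounded down to whole days, `mean_whole_days` comes out roughly half a day below the continuous mean.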

The incidence function $\Lambda_t$ (with parameter set $\theta_\Lambda$) is modeled by a sequential generation process (hereafter referred to as the generation-dependent model). Each new case can generate secondary infections, distributed in time according to the probability density function of the generation time, $g_t$. The index case generates $R_1$ secondary cases distributed in time according to the distribution $g_t$, and each secondary case generates $R_2$ tertiary cases according to the same distribution $g_t$. The numbers of secondary and tertiary cases at time $t$ are thus governed by $R_1$ and $R_2$, respectively,

where the second-generation term is equal to zero at time 0. The method described above is restricted to three generations; however, an arbitrary number of generations can be assumed to describe an epidemic curve. For example, if we account for up to four generations, the total number of cases is written accordingly as $R_1\left(g_t + R_2\left((g*g)_t + R_3\,(g*g*g)_t\right)\right)$, where $*$ denotes convolution.
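As a rough illustration of the sequential generation process described above, the sketch below builds the expected daily number of cases in each generation by repeated convolution of a generation-time distribution. The generation-time values and the reproduction numbers are made up for illustration, and the function names are ours rather than the authors'.

```python
def convolve(a, b):
    """Discrete convolution of two sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def generation_curves(g, reproduction_numbers):
    """Expected cases per day for each generation after the index case.

    g: generation-time probabilities by day (g[0] = 0).
    reproduction_numbers: [R1, R2, ...], one value per generation step.
    """
    curves = []
    kernel = [1.0]   # the index case, placed at day 0
    size = 1.0       # running product R1 * R2 * ...
    for r in reproduction_numbers:
        kernel = convolve(kernel, g)   # one more generation interval
        size *= r
        curves.append([size * p for p in kernel])
    return curves


def total_curve(curves):
    """Sum the generation-specific curves day by day."""
    length = max(len(c) for c in curves)
    return [sum(c[t] for c in curves if t < len(c)) for t in range(length)]


# Hypothetical generation time concentrated around 12 days (illustrative only).
g = [0.0] * 10 + [0.1, 0.2, 0.4, 0.2, 0.1]
curves = generation_curves(g, [10.0, 2.0, 0.5])   # hypothetical R1, R2, R3
total = total_curve(curves)
peak_day_third_generation = curves[1].index(max(curves[1]))
```

With a narrow generation-time distribution, the generation-specific curves barely overlap, which is what makes the generations separable in the early phase.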

However, the rate Λ_{t} needs to be normalized to the expected cumulative number of all cases

where the normalizing factor is $\Sigma_R = R_1 + R_1 R_2 + R_1 R_2 R_3$. See Appendix B for the derivation of the generation-based model. Because of the normalization, the parameter $R_1$ cannot be recovered; only its lower bound can be identified, by manually counting the secondary cases that can be attributed with certainty to the index case. Fitting the model over a shorter forecasting horizon may require fewer generations, in which case the formula above is modified accordingly (e.g. $R_3 = 0$ in the case of three generations only, and $R_2 = R_3 = 0$ in the case of two generations only, governing the entire dynamics of the observed epidemic data). As for the generation time distribution $g_t$, its mean and variance were taken as the average of two previously reported estimates.
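To see why the first reproduction number drops out after normalization, one possible reading of the normalized rate for four generations is written out below. Here $R_k$ are the generation-specific reproduction numbers, $g_t$ is the generation time distribution, $*$ denotes convolution, and $N$ is a placeholder symbol of ours for the expected cumulative number of all cases; the exact form used by the authors is not reproduced here.

```latex
\Lambda_t
  = N \,\frac{R_1 g_t + R_1 R_2\,(g*g)_t + R_1 R_2 R_3\,(g*g*g)_t}
             {R_1 + R_1 R_2 + R_1 R_2 R_3}
  = N \,\frac{g_t + R_2\,(g*g)_t + R_2 R_3\,(g*g*g)_t}
             {1 + R_2 + R_2 R_3}.
```

The common factor $R_1$ cancels between numerator and denominator, so under this reading the likelihood carries no information about $R_1$ beyond what is absorbed into $N$.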

The total likelihood is:

subject to maximization with respect to five parameters (including $R_2$, $R_3$, and the delay parameters $\mu_h$ and $\sigma_h$), which yields the maximum likelihood estimate $\theta_0$ as well as the Hessian matrix $H(\theta_0)$. To construct the confidence intervals and compute the prediction intervals, we use the variances $\sigma_0^2 = \mathrm{diag}(H^{-1}(\theta_0))$. For each simulated set of parameters, we obtain a possible variation in the estimated parameter values. Finally, by taking the 2.5th and 97.5th percentile points of the simulated distributions, we obtain 95% prediction intervals of the incidence function.
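The simulation step described above can be sketched for a two-parameter toy case. The sketch assumes that the Hessian is that of the negative log-likelihood, that parameters are drawn independently from normal distributions with variances given by the diagonal of the inverse Hessian, and that all numbers and function names are hypothetical.

```python
import random


def inverse_2x2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def percentile(sorted_values, q):
    """Percentile by linear interpolation; q is in [0, 100]."""
    pos = (len(sorted_values) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])


def interval_from_hessian(theta0, hessian, model, n_draws=2000, seed=1):
    """95% interval of model(theta) under normal draws centred at theta0."""
    cov = inverse_2x2(hessian)
    # Only the diagonal is used: sigma_0^2 = diag(H^{-1}(theta_0))
    sd = [cov[0][0] ** 0.5, cov[1][1] ** 0.5]
    rng = random.Random(seed)
    draws = sorted(
        model([rng.gauss(m, s) for m, s in zip(theta0, sd)])
        for _ in range(n_draws)
    )
    return percentile(draws, 2.5), percentile(draws, 97.5)


# Hypothetical estimate and Hessian of the negative log-likelihood.
theta0 = [2.0, 0.5]
hessian = [[50.0, 10.0], [10.0, 200.0]]
low, high = interval_from_hessian(theta0, hessian, model=lambda th: th[0] * th[1])
```

Using only the diagonal ignores correlation between parameters; drawing from a multivariate normal with the full inverse Hessian would be the alternative if those correlations matter.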

To perform the forecasting exercise, we used epidemic curves of new measles cases, which were routinely collected and updated every eight days. As a result, we obtained multiple snapshots of the epidemic curve, all starting from the date of exposure to the index case on 17 March 2018 but truncated at the date of publication (ranging from 1 April to 25 May with a time step of eight days). The data points of each epidemic curve were then input to our model to estimate the expected number of cases over the time interval of that epidemic curve. Furthermore, cases were forecasted for an extended period, up to 8 June.

The numbers of new measles cases by date of illness onset and by date of laboratory confirmation are shown in Figures 1A and 1B, respectively. As of 21 August 2018, a total of 124 laboratory-confirmed cases had been reported in Japan, of which 99 were in Okinawa, 23 in Aichi, 1 in Kanagawa prefecture, and 1 in Tokyo.

(A) Date of illness onset of measles cases reported in Okinawa, Aichi, Kanagawa prefectures, and Tokyo Metropolis, Japan. Illness onset was unknown for 6 cases notified in Okinawa prefecture and was therefore assumed to be 5 days before laboratory confirmation. (B) Date of laboratory confirmation of measles cases reported in Okinawa, Aichi, Kanagawa prefectures, and Tokyo Metropolis.

Using the observed epidemiological data from Okinawa, Aichi, Kanagawa, and Tokyo reported from 1 April to 25 May 2018, unknown parameters were estimated as shown in Figure 2. In addition, models with different numbers of generations were compared using the Akaike Information Criterion (AIC). The minimum AIC value was used to determine the best-fit number of generations for a given publication date of the dataset.
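The model selection step can be illustrated with the standard definition of AIC, $2k - 2\ln L$. The log-likelihood values and parameter counts below are hypothetical and serve only to show the comparison.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical (maximized log-likelihood, number of parameters) per model,
# keyed by the assumed number of generations.
candidates = {
    2: (-60.3, 4),
    3: (-52.1, 5),
    4: (-51.8, 6),
}

scores = {generations: aic(ll, k) for generations, (ll, k) in candidates.items()}
best_generations = min(scores, key=scores.get)
```

In this toy case the four-generation model has a slightly higher likelihood, but the extra parameter is not worth it, so the three-generation model has the lowest AIC.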

The course of the outbreak observed before 9 April was fitted well using only two generations. Afterwards, the third generation was identified, and the fourth generation appeared from 25 April. The effective reproduction number of the third generation, $R_2$, became greater than one on 9 April. When a new generation was identified to better explain the observed incidence pattern, the model with the greater number of generations fitted better than the model with fewer generations. Importantly, our model explicitly accounted for the time delay from illness onset to diagnosis, and thus the effective reproduction number of the most recent generation avoided serious underestimation. However, the expected value of $R_4$ during the early stage was smaller than during the later stage: the number of cases in the fifth generation was not substantial in May, and thus $R_4$ was accompanied by a wider confidence interval.
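The role of the reporting delay can be illustrated with a small sketch. At the time of a snapshot, cases with recent illness onset are only partly confirmed, so the raw count for the latest generation understates the true incidence unless the delay is modeled. The onset curve and delay probabilities below are hypothetical, and this is our illustration rather than the authors' implementation.

```python
def cumulative(pmf):
    """Running sum of a probability mass function."""
    out, total = [], 0.0
    for p in pmf:
        total += p
        out.append(total)
    return out


def expected_confirmed(onset_curve, delay_pmf, snapshot_day):
    """Expected cases, by day of onset, already confirmed at snapshot_day.

    delay_pmf[d] is the probability that confirmation occurs d days after onset.
    """
    cdf = cumulative(delay_pmf)
    observed = []
    for t, cases in enumerate(onset_curve):
        lag = snapshot_day - t
        if lag < 0:
            fraction = 0.0          # onset after the snapshot
        elif lag >= len(cdf):
            fraction = 1.0          # longer than the maximum delay
        else:
            fraction = cdf[lag]     # P(delay <= lag)
        observed.append(cases * fraction)
    return observed


# Hypothetical inputs: true cases by day of onset and a delay distribution.
onset_curve = [0, 1, 2, 4, 6, 5]
delay_pmf = [0.0, 0.1, 0.2, 0.3, 0.2, 0.1, 0.1]
observed = expected_confirmed(onset_curve, delay_pmf, snapshot_day=5)
```

Here none of the cases with onset on the snapshot day are confirmed yet, and only 30% of those with onset two days earlier, so the most recent part of the curve is heavily under-observed.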

"#" denotes the assumed number of generations in the model. ht is the probability mass function from the time of illness onset to laboratory confirmation. _{m}

Using the latest snapshot of the epidemic curve published on 25 May, the mean delay from illness onset to confirmation was 4.5 days (95% CI: 4.0-5.0), and the variance was 6.1 days$^2$ (95% CI: 4.3-9.0). The total number of symptomatic cases

Figure 3 shows the forecasted course of the measles epidemic obtained with the proposed generation-dependent mathematical model, together with the confirmed-case data of each epidemic curve by date of confirmation. In the first stage, the model describes only the initial part of the outbreak, but the estimates improve and the 95% prediction intervals progressively narrow as more data are used.

Performance of forecasting for each epicurve (legend) is compared to the number of reported cases in the latest update (bar chart in grey) by date of illness onset of measles cases (A) and date of laboratory confirmation of measles cases (B). Dashed lines denote the forecasting part for each snapshot of the epicurve.

The following Video available online (Figure 4,

The present study tackled real-time forecasting of measles, employing a generation-specific modelling approach. A simple functional model with generation structure was employed, and the time delay from illness onset to diagnosis was explicitly taken into account. The proposed model helped not only to forecast the future incidence but also to obtain generation-specific estimates of the effective reproduction number. AIC values helped to identify the most likely number of generations in real time, allowing us to assess how well public health interventions prevented transmission events during the outbreak. To our knowledge, the present study is the first to apply the functional generation-dependent model in the context of real-time forecasting.

There are two take-home messages. First, the generation-dependent mathematical model successfully helped to anticipate the likely size of the future epidemic in real time. Because the variance of the generation time of measles is small relative to the mean, the generation-specific number of cases could even be identified manually during the early phase of the outbreak

Second, the effective reproduction number is obtained as a by-product of the model, as the weight of the mixture distribution of the generation time (see Appendix B). The reproduction number helps to evaluate preventive measures during the outbreak. Moreover, by explicitly considering the time delay from illness onset to laboratory confirmation, our study addressed a possible underestimation of the effective reproduction number for the latest generation. Although we did not incorporate stochasticity in the functional model, our model was able to capture the mechanistic pattern of the transmission dynamics.

A few technical limitations must be noted. First, the absence of stochasticity in the transmission process is a systematic limitation of the proposed model. Capturing this stochasticity would require a stochastic process model of the transmission events, e.g., a branching process or a renewal process. Second, we did not explicitly use the susceptibility of the exposed population or background information on the traced contacts. Although those data were not routinely collected, their use could help increase the validity of the forecast. Third, the vaccination history of cases was not taken into consideration. Depending on residual immunity, a different clinical form of measles, i.e., modified measles, may be observed. This could lead to a different (potentially longer) time delay from illness onset to diagnosis compared with the primary form of measles. Lastly, we assumed a fixed delay distribution function over the whole period of the outbreak. As we additionally verified, the inclusion of a step-like temporal dependence of the mean and variance of the delay function with given switching times (e.g. 29 March and/or 3 April as the dates of raised awareness

In conclusion, we demonstrated a simple generation-dependent model that was able to adequately capture the observed transmission pattern of the measles outbreak in Japan, 2018. The proposed model also helped to predict the future incidence and to evaluate public health control measures. By refining the forecasting model further, routine forecasting and evaluation of outbreaks can eventually be achieved while keeping the model structure as simple as possible.

The authors declare no competing interests.

The code snippets used for simulations and generation of figures as well as the epidemiological count data are accessible from the GitHub repository:

Hiroshi Nishiura (

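Here we describe the fitting procedure when the delay distribution function is allowed to change during the outbreak and is described by two distributions. The first distribution describes all cases with the time of illness onset up to a given calendar time (the switching time) and has the parameter set $\theta_h^{(0)} = \{\mu_h^{(0)}, \sigma_h^{(0)}\}$. The second distribution describes all cases with the time of illness onset later than that calendar time and has the parameter set $\theta_h^{(1)} = \{\mu_h^{(1)}, \sigma_h^{(1)}\}$. The likelihood to describe the time delay from illness onset to laboratory confirmation is given by the formula: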

where _{n}

The total (composite) likelihood is then given by the product of the two likelihoods written above:

The total likelihood is maximized with respect to each parameter in the set _{Σ}_{n}_{n}, i_{t}

Model performance is shown in Figure 5, which can be compared with the previous case of a time-independent delay distribution.

For each epicurve, only the model with the minimal AIC value over the varied number of generations is shown. The switch in the delay function indicates the optimal switching time, i.e., the calendar time at which the distribution is considered to have changed. The mean and variance of the delay distribution function before the switching day are given for $h_t^{(0)}$, and those after the switching day for $h_t^{(1)}$. The AIC values for the model with a fixed delay distribution are shown in the last column, and the minimal AIC values are additionally indicated in red.

Our generation-dependent model rests on a well-known renewal equation, i.e.,

where

where _{m}

Here _{m}_{m}_{m}

Replacing the right-hand side of (B2) by that of (B3), we obtain Λ_{t} in the main text. As such, it should be noted that we perform forecasting by estimating the generation-dependent average number of secondary cases generated by a single primary case, which is interpreted as the cohort reproduction number (i.e., the average number of secondary cases generated by a primary case who was born at calendar time

The R script used for calculations and collected data are accessible as a part of the GitHub repository (https://github.com/aakhmetz/MeaslesJapan2018). The Jupyter notebook can be seen here: http://tiny.cc/MeaslesJapan2018.