The 2014 West African Ebola virus (EBOV) epidemic is the largest Ebola virus outbreak to date with 7492 cases (4108 confirmed) and 3439 deaths (2078 confirmed) as of 3 October 2014^{1}. While previous EBOV outbreaks remained localized, the current epidemic has spread across Guinea, Sierra Leone and Liberia with a localized outbreak in Nigeria. (Both Senegal and the USA have reported one imported case with no local transmission, as of 3 October 2014). Relief efforts have so far been ineffective at containing the disease, due largely to porous borders, a lack of education about the disease and degraded public health infrastructure^{2}^{,}^{3}^{,}^{4}. Moreover, the epidemic has spread to major urban areas, further facilitating its continued spread and complicating containment efforts.
Patients exposed to EBOV first undergo an incubation period of 2–21 days before becoming infectious^{3}^{,}^{5}^{,}^{6}^{,}^{7}. Once infectious, patients either die between days 6 and 16 or may begin to recover between days 6 and 11^{3}^{,}^{8}^{,}^{9}. Although patients who recover are generally noninfectious after convalescence, EBOV has been isolated from mucosal membranes 33 days after the onset of symptoms and from semen 61 days after the onset of symptoms^{10}^{,}^{11}. There is currently no known effective treatment or vaccine for Ebola virus disease, and relief efforts focus on bringing down the case fatality rate through supportive care and disease containment^{3}.
In Gire et al. (2014)^{12}, 99 Ebola genomes from 78 patients from the Sierra Leone outbreak are provided. This represents about 70% of confirmed cases during late May to mid June. Based on the phylogenies in Gire et al. (2014)^{12}, it is likely that the Sierra Leone outbreak was started by the simultaneous introduction of two genetically distinct viruses. The initial 14 confirmed cases in Sierra Leone have all been epidemiologically linked to the funeral of a traditional healer in Guinea, supporting a single introduction event. The first split of the Sierra Leone sequences, separating the two introductions, is supported in all posterior trees presented in Gire et al. (2014)^{12} as well as in our preliminary analyses. We focused on the introduction causing the larger outbreak (72 sampled patients) and ignored the smaller outbreak (6 sampled patients).
We use these genomic data to estimate epidemiological parameters. We employed the Bayesian MCMC framework BEAST2^{13}, applying a range of epidemiological tree priors to the sequencing data. The tree priors are based on both birth-death^{14} and coalescent^{15} models. Furthermore, we estimated epidemiological parameters based on the trees from Gire et al. (2014)^{12} using a maximum likelihood framework implemented in R^{16}.
The larger outbreak, consisting of 72 Ebola sequences, is analysed in BEAST2^{13} to estimate the epidemiological parameters relevant to the epidemic. We employ birth-death and coalescent approaches as models for epidemic spread.
Birth-death models assume a transmission rate at which infected individuals transmit, a becoming-noninfectious rate at which infected individuals recover or die, and a sampling probability, which is the probability that an infectious person is sampled and sequenced. Such a model naturally accounts for incomplete sampling and, since the sampling probability is a parameter in our model, this quantity may also be estimated. In particular, we run birth-death analyses using the models depicted in Figure 1. We explain the assumptions of these models in the following.
The birth-death (BD) model^{17} allows the three parameters (transmission rate, becoming-noninfectious rate and sampling probability) to change in a piecewise constant fashion.
To model the spread of EBOV more realistically, we further extend the birth-death model to allow for an exposed class of infected people. The exposed class is entered upon infection, and an exposed individual moves from the exposed to the infectious class with a constant incubation rate. This model is referred to as the birth-death exposed-infected (BDEI) model^{18}^{,}^{19}. In the BDEI model we assume that only infectious people are sampled, since exposed patients are asymptomatic.
BD and BDEI assume that individuals become noninfectious upon sampling. As Ebola may also be transmitted after sampling (transmission at funerals constitutes a major source of infection^{2}^{,}^{3}^{,}^{20}), we further run the birth-death sampled-ancestors (BDsa) model^{21}, which extends BD by assuming that sampled individuals become noninfectious upon sampling with probability r and remain infectious with probability 1 − r. When r < 1 the phylogeny may contain sampled ancestors, meaning that samples do not have to coincide with tips in the tree, but a sample in the tree may have sampled descendants.
The BDSIR model^{22} is a variant of the BD model in which we explicitly account for susceptible hosts, meaning the epidemic slows down once the number of susceptible hosts declines. This model includes an explicit susceptible class and the number of initial susceptible hosts as a parameter, which was estimated using a LogNormal(8,4) prior distribution.
We also fit a deterministic coalescent model to the EBOV sequence data. We use the structured coalescent framework of Volz (2012)^{15}, assuming an exposed and infectious class (as in the BDEI model), to probabilistically take into account whether lineages reside in exposed or infectious individuals. This coalescent SEIR model (coalSEIR) was implemented in BEAST2 and epidemiological parameters were estimated along with the genealogy from the sequence data, with the initial number of susceptible hosts set to 1 million, following Althaus (2014)^{23}.
In all analyses we first assumed a constant basic reproductive number R_{0}, which is the ratio of the transmission rate over the becoming-noninfectious rate. Second, we allowed the reproductive number to change twice: at the time of the oldest sample (May 26) and midway between the oldest and youngest samples (June 6). The becoming-noninfectious rate and the sampling probability were assumed to remain constant throughout the epidemic outbreak.
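As a minimal illustration of this parameterisation (a sketch, not the BEAST2 implementation), the snippet below computes a reproductive number as the ratio of a transmission rate to a becoming-noninfectious rate, and recovers the second change point as the midpoint between the oldest and youngest sample dates. The function names and the numerical rates are hypothetical; only the dates are taken from the text.

```python
from datetime import date

def reproductive_number(transmission_rate, becoming_noninfectious_rate):
    """Ratio of the transmission rate to the becoming-noninfectious rate."""
    return transmission_rate / becoming_noninfectious_rate

def midpoint(first, last):
    """Calendar date halfway between two dates (a fractional day is dropped)."""
    return first + (last - first) / 2

# Sample dates quoted in the text; the oldest sample is given as May 26 here.
oldest_sample = date(2014, 5, 26)
youngest_sample = date(2014, 6, 18)
second_change_point = midpoint(oldest_sample, youngest_sample)

# Hypothetical rates (per day), chosen only to illustrate the ratio.
example_r0 = reproductive_number(0.3, 0.15)
```

With these dates the midpoint falls on June 6, matching the second change point stated above.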
We assumed the following Bayesian prior distributions for our analyses. The prior for R_{0} is LogNormal(0,1.25). The time of origin, i.e. the time of infection of the first person in the Sierra Leone outbreak, was assumed to be uniform during the 6 (and for computational reasons in some analyses, 3) months prior to the most recent sample at time 18 June 2014; thus any start time of the Sierra Leone outbreak from 18 December 2013 (or 18 March 2014) onwards was equally likely. For the incubation rate and the becoming-noninfectious rate we assumed a Gamma prior with shape 0.5 and scale 1/6 days^{-1}, truncated such that the periods of being exposed and infectious lie between 1 and 26 days, and such that all times in this interval have considerable support. The median of these priors is 0.11 days^{-1}, meaning that the expected time of being exposed and infectious is 9 days each. As no sequencing effort has been performed prior to the oldest sample, collected on 25 May 2014, we assume that the sampling probability is 0 prior to that date and constant afterwards. After that date, we assume a uniform prior on [0,1] for the sampling probability in the analyses without an exposed class. To improve computational performance in the more complex BDEI model, we assume a Beta(70,30) prior distribution, supporting a sampling proportion around 70%, based on our own results as well as Gire et al. (2014)^{12}, and also fix the mean clock rate to 1.984 × 10^{-3}/site/year^{12}. The priors on all epidemiological parameters as well as the mean clock rate were identical between the coalSEIR and BDEI models.
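The stated prior median can be checked numerically. The sketch below assumes that the truncation bounds on the periods (1 and 26 days) translate into rate bounds of 1 and 1/26 per day, and uses the fact that a Gamma distribution with shape 0.5 has a closed-form CDF in terms of the error function. The function names are hypothetical and this is not the code used for the analyses.

```python
import math

SCALE = 1.0 / 6.0      # Gamma scale stated in the text (per day); shape is 0.5
RATE_MIN = 1.0 / 26.0  # assumption: a 26-day period gives the smallest rate
RATE_MAX = 1.0         # assumption: a 1-day period gives the largest rate

def gamma_half_cdf(x, scale):
    """CDF of a Gamma(shape=0.5, scale) variable, which equals erf(sqrt(x/scale))."""
    return math.erf(math.sqrt(x / scale))

def truncated_median(lo, hi, scale, tol=1e-12):
    """Median of the Gamma(0.5, scale) prior truncated to [lo, hi], by bisection."""
    target = 0.5 * (gamma_half_cdf(lo, scale) + gamma_half_cdf(hi, scale))
    a, b = lo, hi
    while b - a > tol:
        mid = 0.5 * (a + b)
        if gamma_half_cdf(mid, scale) < target:
            a = mid
        else:
            b = mid
    return 0.5 * (a + b)

median_rate = truncated_median(RATE_MIN, RATE_MAX, SCALE)
median_period_days = 1.0 / median_rate
```

Under these assumptions the median rate comes out at roughly 0.11 per day, i.e. a period of about 9 days, in line with the values quoted above.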
Instead of reporting the becoming-noninfectious rate and the incubation rate, we report their inverse values, which are the expected times of being exposed (incubation time) and being infectious. We report the median posterior value for each parameter together with the shortest interval containing 95% of the posterior samples.
Maximum likelihood analysis using birth-death models
As a comparison, we performed maximum likelihood (ML) parameter estimation using the posterior trees from Gire et al. (2014)^{12}. Again, we first eliminated the Guinea samples and the 6 samples from the second Sierra Leone outbreak. Thus all trees analyzed consist of 72 tips. From the 10001 posterior trees provided by the authors of Gire et al. (2014)^{12}, we eliminated the first 1001 trees as burn-in, and then chose every 100th tree from the remaining 9000 trees, yielding a set of 90 trees. For these 90 trees, we performed an analysis under the BD model with constant and time-varying reproductive number and BDEI with constant R_{0} using the R package TreePar v3.1^{16}. Additionally, we applied a birth-death model to the trees quantifying the amount of superspreading in the population, BDss^{18}. This model extends the constant-rate BD model, assuming that individuals belong to one of two classes, each with its own R_{0}. Individuals transmit to both classes. We report the median maximum likelihood value together with the shortest interval containing 95% of the ML estimates from all 90 trees.
Figure 2 displays the estimated R_{0} values for the different phylodynamic methods. Overall, the different Bayesian methods simultaneously inferring trees and parameters yield median estimates between 1.65 and 2.18. The maximum likelihood methods inferring parameters based on fixed trees obtain lower estimates. In the following we discuss the results in detail.
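The reported summaries (median plus shortest 95% interval) can be computed from a list of posterior samples as sketched below. This is a generic sliding-window implementation with hypothetical function names and toy data, not the code used to produce the tables.

```python
import math
from statistics import median

def shortest_interval(samples, mass=0.95):
    """Shortest interval containing at least `mass` of the samples."""
    ordered = sorted(samples)
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one sample")
    k = max(1, math.ceil(mass * n))  # number of samples the interval must cover
    # Slide a window over k consecutive order statistics and keep the narrowest.
    start = min(range(n - k + 1), key=lambda i: ordered[i + k - 1] - ordered[i])
    return ordered[start], ordered[start + k - 1]

# Toy data (not from the analyses): one low outlier and 19 clustered values.
toy_samples = [0.0] + [10.0 + 0.1 * i for i in range(19)]
toy_median = median(toy_samples)
toy_low, toy_high = shortest_interval(toy_samples)
```

In the toy example the interval excludes the outlier, because the window covering the 19 clustered values is narrower than any window that contains it.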
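The burn-in removal and thinning described above amounts to a single slice. The sketch below uses integer indices as stand-ins for the posterior trees; whether the authors started the thinning at the first or a later post-burn-in tree is not stated, so starting at the first one is an assumption, and the function name is hypothetical.

```python
def thin_posterior(trees, burn_in=1001, step=100):
    """Drop the first `burn_in` trees, then keep every `step`-th remaining tree."""
    return trees[burn_in::step]

# Stand-in for the 10001 posterior trees: integer indices instead of real trees.
posterior_trees = list(range(10001))
kept = thin_posterior(posterior_trees)
```

Either starting point leaves 90 trees out of the remaining 9000.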
Bayesian birth-death analysis
| Model | R_{0}/R_{e} initial | R_{e} middle | R_{e} recent | Incubation time (days) | Infectious time (days) | Sampling probability | Epidemic origin | Tree MRCA |
|---|---|---|---|---|---|---|---|---|
| BD 1 | 1.65 (1.02–2.70) | – | – | – | 6.09 (2.84–18.84) | 0.65 (0.20–1.00) | May 7 (7/4–22/5) | May 15 (3/5–22/5) |
| BD 3 | 0.95 (0.22–2.56) | 1.57 (0.73–2.91) | 1.81 (1.07–3.03) | – | 6.15 (3.22–17.94) | 0.70 (0.27–1.00) | April 8 (30/12–21/5) | May 12 (24/4–23/5) |
| BDsa 1 | 1.75 (1.04–2.95) | – | – | – | 6.75 (3.14–24.10) | 0.60 (0.17–1.00) | May 8 (10/4–22/5) | May 15 (3/5–23/5) |
| BDsa 3 | 0.96 (0.20–2.65) | 1.61 (0.74–3.00) | 1.88 (1.09–3.23) | – | 6.54 (3.24–22.10) | 0.65 (0.19–1.00) | April 9 (31/12–20/5) | May 12 (24/4–23/5) |
| BDSIR | 1.81 (1.12–2.84) | – | – | – | 6.64 (3.61–18.78) | 0.70 (0.24–1.00) | May 4 (11/4–19/5) | May 15 (3/5–22/5) |
| BDEI 1^{*} | 2.18 (1.46–3.22) | – | – | 5.6 (fixed) | 2.29 (1.23–5.62) | 0.72 (0.63–0.80) | May 10 (13/4–23/5) | May 14 (3/5–22/5) |
| BDEI 3^{*} | 1.77 (0.59–4.35) | 1.92 (0.80–3.64) | 2.86 (1.58–4.78) | 5.6 (fixed) | 2.75 (1.41–7.07) | 0.71 (0.62–0.79) | May 8 (14/3–22/5) | May 13 (28/4–22/5) |
| BDEI 1^{*} | 1.85 (1.17–2.76) | – | – | 2.3 (fixed) | 3.92 (2.15–9.47) | 0.71 (0.62–0.79) | May 9 (15/4–21/5) | May 14 (4/5–22/5) |
| BDEI 3^{*} | 1.63 (0.54–4.09) | 1.66 (0.71–3.13) | 2.45 (1.28–4.17) | 2.3 (fixed) | 4.72 (2.46–10.74) | 0.71 (0.62–0.79) | May 5 (12/3–23/5) | May 13 (29/4–22/5) |
| BDEI 1^{*} | 2.18 (1.24–3.55) | – | – | 4.92 (2.11–23.20) | 2.58 (1.24–6.98) | 0.71 (0.62–0.80) | May 8 (10/4–21/5) | May 14 (3/5–22/5) |
| BDEI 3^{*} | 2.00 (0.66–5.46) | 1.85 (0.57–3.71) | 3.15 (1.43–6.09) | 5.92 (2.49–24.92) | 2.71 (1.28–9.22) | 0.71 (0.63–0.80) | May 5 (3/4–21/5) | May 13 (30/4–22/5) |
Table 1 shows the results of the Bayesian birth-death analyses, including the times of origin and of the most recent common ancestor (MRCA). Under the constant birth-death-sampling model (BD1), we estimate an R_{0} of 1.65 (1.02–2.70), a sampling proportion of 65% (20–100%) and an infectious period of 6 days (2.84–18.84). There is no indication of a change in the reproductive number before mid June.
Since the BD model does not account for an incubation period, we also perform a simulation study in which we simulate an outbreak with incubation periods and analyse it under BD. This simulation shows that we can robustly estimate R_{0} under the BD model even without including an explicit incubation period, and that the estimate of the infectious period is roughly equal to the sum of incubation and infectious period in the simulations (Supplementary Table 1).
Allowing individuals to stay infectious upon sampling using the sampled ancestors model (BDsa) leads to very similar estimates of the epidemiological parameters. In fact, we only estimate two sampled ancestors in our dataset and the probability of becoming noninfectious upon sampling is large, 0.93 (0.71–1.00).
The epidemiological parameters are also estimated similarly under the BDSIR model, in which incidence can decline over time due to depletion of susceptible hosts. The initial number of susceptible individuals is estimated at 46000 (median) with large uncertainty (95% HPD, 380–534000). Estimating a similar R_{0} under a model that explicitly allows for the depletion of susceptible hosts over time suggests that the epidemic had not surpassed the exponential growth phase by mid June.
Using the BDEI model, which takes the incubation period into account, leads to slightly larger estimates of the basic reproductive number, 2.18 (1.24–3.55). There is a lot of uncertainty in our estimate of the incubation period of 5 days (2.11–23.20 days). Figure 3B shows that there is only little deviation of the posterior from the prior. The infectious period is estimated to be rather short, 2.58 days (1.24–6.98). Here, the posterior deviates a lot from the prior (Figure 3C). When we fix the incubation time to a shorter (2.3 days) or longer (5.6 days, as in Althaus (2014)^{23}) period, we see a slight decrease or increase in the basic reproductive number, respectively. The times of origin (median May 8) and the MRCA (median May 14) show little variation.
Bayesian analysis in a coalescent framework
| Model | R_{0} | Incubation time (days) | Infectious time (days) | Epidemic origin | Tree MRCA |
|---|---|---|---|---|---|
| coalSEIR | 1.90 (1.00–4.50) | 6.23 (1.53–26.05) | 8.66 (1.076–26.07) | May 5 (24/3–20/5) | May 14 (29/4–22/5) |
Epidemiological estimates obtained under the coalSEIR model were generally very similar to those obtained under the BDEI model, which was expected given that both approaches include an incubation period and account for uncertainty in the genealogy. Table 2 shows the estimated medians and 95% HPD intervals for the coalSEIR model parameters. While the credible intervals for R_{0} were wider under the coalSEIR than under the BDEI model, the median R_{0} was estimated to be 1.90, slightly lower than under the BDEI model. Likewise, both methods returned a median epidemic origin time in the first weeks of May. We are not able to precisely estimate the duration of the exposed or infectious periods under the coalescent model, and our estimates appear to be largely informed by the prior (see Figure 3B and C).
Maximum likelihood birth-death analyses based on fixed trees
| Model | R_{0}/R_{e} initial | R_{e} middle | R_{e} recent | Incubation time (days) | Infectious time (days) | Sampling probability |
|---|---|---|---|---|---|---|
| BD 1 | 1.34 (1.12–1.55) | – | – | – | 4.45 (2.85–6.29) | 0.7 (fixed) |
| BD 3 | 1.18 (0.54–1.72) | 1.17 (0.87–1.59) | 1.62 (1.37–1.90) | – | 4.74 (3.26–6.99) | 0.7 (fixed) |
| BDEI 1 | 1.45 (1.25–1.70) | – | – | 2.29 (0.08–3.24) | 2.07 (0.94–4.80) | 0.7 (fixed) |
| BD 1 | 1.24 (1.08–1.37) | – | – | – | 3.04 (2.16–4.32) | 0.35 (fixed) |
| BD 3 | 1.02 (0.63–1.50) | 1.10 (0.87–1.47) | 1.44 (1.29–1.69) | – | 3.28 (2.26–4.60) | 0.35 (fixed) |
| BDEI 1 | 1.31 (1.19–1.45) | – | – | 1.81 (1.22–2.58) | 1.22 (0.61–2.22) | 0.35 (fixed) |

| Model | R_{0} (overall) | R_{0} (class 1) | R_{0} (class 2) | Fraction class 1 | Infectious time (days) | Sampling probability |
|---|---|---|---|---|---|---|
| BDss | 1.57 (1.28–1.91) | 2.63 (1.42–8.31) | 0.84 (0.00–1.40) | 0.45 (0.07–0.87) | 5.16 (3.50–7.35) | 0.7 (fixed) |
Finally, we performed maximum likelihood parameter inference on fixed trees from Gire et al. (2014)^{12}. Because not all four parameters are jointly identifiable^{24}, and because our Bayesian analysis confirmed previous estimates of the sampling probability, we fixed this parameter to 0.7 for times more recent than the oldest sample. Again, sampling probability was set to 0 prior to the oldest sample. To understand the sensitivity of our estimates with respect to this setting, we performed a second analysis fixing the sampling probability to 0.35.
For each of the 90 posterior trees, we obtained the maximum likelihood parameter estimates (see Table 3). Overall, assuming different fixed sampling probabilities did not significantly affect estimates. R_{0} was estimated slightly lower compared to the full Bayesian analyses above (medians 1.31–1.45). Again we did not find support for the reproductive number changing through time. A likelihood ratio test, comparing the results for three intervals for the reproductive number vs. a constant R_{0}, does not support three intervals for the effective reproductive number over one interval (for a sampling probability of 0.7, 9 trees out of 90 supported three intervals for R_{e} at the 95% level, and for a sampling probability of 0.35, 11 trees supported three intervals).
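A likelihood ratio test of this kind can be sketched as follows. The sketch assumes the usual chi-squared approximation with 2 degrees of freedom, since the three-interval model has two more reproductive-number parameters than the constant model; the text does not state the degrees of freedom used, and the log-likelihood values below are hypothetical, not results from the analyses.

```python
import math

def lrt_p_value(loglik_constant, loglik_three_intervals):
    """P-value of a likelihood ratio test with 2 extra parameters (assumed).

    For a chi-squared distribution with 2 degrees of freedom the survival
    function is exp(-x / 2), so no statistics library is needed.
    """
    statistic = 2.0 * (loglik_three_intervals - loglik_constant)
    statistic = max(statistic, 0.0)  # nested models: guard against numerical noise
    return math.exp(-statistic / 2.0)

def count_supporting(loglik_pairs, alpha=0.05):
    """Number of trees for which the three-interval model is preferred at level alpha."""
    return sum(1 for l0, l1 in loglik_pairs if lrt_p_value(l0, l1) < alpha)

# Hypothetical (constant, three-interval) log-likelihoods for two trees.
example_pairs = [(-100.0, -97.0), (-100.0, -99.0)]
n_supporting = count_supporting(example_pairs)
```

Applied per tree, the count of trees below the significance threshold gives the kind of summary reported above.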
The upper bound for the number of days in the infected class across all analyses is 6.99 days. Thus, both full Bayesian and maximum likelihood methods suggest a time in exposed and infectious class that is lower than previous estimates.
As in the Bayesian analyses, when applying the BDEI method to the 90 Sierra Leone Ebola trees we obtain a slightly higher R_{0} when including the incubation period into the model.
When applying a birth-death model assuming two population groups with unique transmission rates, we observe that half of the population appears to have a large R_{0} (median 2.63, 95% HPD 1.42–8.31), and the other half does not appear to effectively spread the disease (R_{0} median 0.84, 95% HPD 0.00–1.40). However, likelihood ratio tests do not strongly support the structured model over the unstructured model.
We used phylodynamic methods to estimate key epidemiological parameters of the current West African EBOV outbreak in Sierra Leone from sequencing data. Although we used a wide range of different models, we consistently recovered very similar estimates. In particular, we estimated the basic reproductive number of EBOV in Sierra Leone up to the time of the most recent sample (18 June 2014). The medians across the Bayesian methods were 1.65–2.18, with the most plausible model (BDEI) yielding a median estimate of 2.18 (95% HPD 1.24–3.55). We did not find any support for a reduction of the reproductive number prior to the most recent sample. Thus our results show that public health interventions during May and June were likely ineffective at reducing transmission in Sierra Leone. Furthermore, analyses suggest that there might be superspreaders among the infected population; however, the significance of the population structure results should be re-evaluated once larger datasets are available. We estimate expected incubation and infectious periods of 4.92 (2.11–23.20) and 2.58 (1.24–6.98) days, respectively. Using our birth-death methods, we confirm the previously estimated sampling proportion of 70%.
Our R_{0} estimates are within the range of estimates for previous outbreaks and other estimates for the current epidemic. For the 1995 EBOV Kikwit outbreak in the Democratic Republic of the Congo, R_{0} was estimated as 1.359±0.128^{25}, 1.83±0.06^{26} or 2.7 (1.9–2.8)^{20}. Towers et al. (2014)^{27} estimate an R_{0} of about 1.5 for the current West African EBOV epidemic, but only R_{0}=1.2 (1.0,1.5) for the Sierra Leone epidemic, assuming incubation and infectious time periods of at most 7 days. Gomes et al. (2014)^{28} estimate an R_{0} of 1.8 (1.5–2.0) for the current West African EBOV outbreak while Althaus (2014)^{23} estimates an R_{0} of 2.53 (2.41–2.67) for the epidemic in Sierra Leone. Althaus (2014)^{23} further provides estimates of R_{0} for Guinea, 1.51 (1.50–1.52), and Liberia, 1.59 (1.57–1.60). Moreover he estimates that the R_{e} in Sierra Leone has been declining since the onset of control measures and dropped below 1 during July. During the period from our samples his estimates of R_{e} vary between 2.7 and 1.47. Nishiura and Chowell (2014)^{29} give estimates of R_{e} in Sierra Leone and Liberia of between 1.4 and 1.7 during June and July, with R_{e} in Guinea fluctuating erratically around 1 during the same period. Fisman et al. (2014)^{30} estimate values of R_{0} between 1.66 and 2.19 for the West African epidemic; however, they estimate an R_{0} of 8.33 for the epidemic in Sierra Leone alone, which is clearly outside our HPD intervals. The WHO Ebola Response Team estimated an R_{0} of 2.02 (1.79–2.26) for Sierra Leone from empirical data^{32}. They also provide estimates for Guinea, 1.71 (1.44–2.01), and Liberia, 1.83 (1.72–1.94). These estimates are consistent with our estimates of R_{0}.
We estimate short incubation and infectious periods with all of our birth-death methods. Estimates of the exposed and infectious periods for the 1995 EBOV Kikwit outbreak range from 5.3±0.23 and 5.61±0.19 days, respectively^{26}, to 10.11±0.713 and 6.52±0.56 days^{25}. However, for the 2000 Sudan Ebola virus (SUDV) outbreak in Uganda, Chowell et al. (2004)^{26} estimated the exposed and infectious periods to be 3.35±0.19 and 3.5±0.67 days. To the best of our knowledge the only estimates of the exposed and infectious periods for the current West African EBOV epidemic are by the WHO Ebola Response Team^{32}, based on observational data. They estimate an incubation period of 9.0±8.1 days (median of 8 days) for 201 patients in Sierra Leone with single exposures. No overall infectious period is estimated, but instead the authors provide separate estimates for the infectious period based on disease outcome. From the onset of symptoms in patients sampled in Sierra Leone they estimate a period of 8.6±6.9 days (from 128 patients) until death, 17.2±6.2 days (from 70 patients) until hospital discharge and 4.6±5.1 days (from 395 patients) until hospitalisation.
Our HPD interval estimates for the incubation period are in line with other estimates, however our estimates for the infectious period are substantially shorter than estimates from the current epidemic^{32}. Judging by the amount of variation in both the estimates from observational and genetic data we conclude that the incubation and infectious periods are highly variable and difficult to estimate accurately. However, we recover consistent estimates for the total time of infection (incubation + infectious periods), meaning there is a significant amount of information in the present dataset on the length of the infection. Sequencing data from more patients might help to get more confined credible intervals.
We see that a method accounting for an incubation period yields higher R_{0} estimates compared to a method assuming all infected individuals are infectious. As having an exposed class/incubation period will slow the initial growth of the epidemic, it is likely that estimates obtained under models that do not include the incubation period are lower to compensate for the slower growth rate. Thus it makes sense that R_{0} was estimated to be higher when the incubation period is included.
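This argument can be illustrated with the standard deterministic relations between the exponential growth rate r and R_{0}, assuming exponentially distributed exposed and infectious periods. These textbook formulas are not the birth-death likelihoods used in our analyses, and the growth rate below is a hypothetical value chosen purely for illustration; the periods are the BDEI medians reported above.

```python
def r0_without_incubation(growth_rate, infection_days):
    """R0 = 1 + r * D when every infected individual is infectious for D days."""
    return 1.0 + growth_rate * infection_days

def r0_with_incubation(growth_rate, incubation_days, infectious_days):
    """R0 = (1 + r * E) * (1 + r * I) with an exposed period E and infectious period I."""
    return (1.0 + growth_rate * incubation_days) * (1.0 + growth_rate * infectious_days)

# Hypothetical growth rate (per day); periods are the reported BDEI medians.
growth_rate = 0.1
incubation_days = 4.92
infectious_days = 2.58

# Same total time of infection in both models.
r0_no_exposed = r0_without_incubation(growth_rate, incubation_days + infectious_days)
r0_exposed = r0_with_incubation(growth_rate, incubation_days, infectious_days)
```

For the same growth rate and the same total time of infection, the model with an exposed class implies the larger R_{0}, because the product adds a positive cross term r²·E·I.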
It is also noteworthy that both our BDEI and coalSEIR analyses converged on similar estimates for R_{0} and the epidemic origin. Thus our epidemiological estimates appear robust to the specific assumptions of these two models. Nonetheless, we do observe that our credible intervals for R_{0} and the exposed and infectious periods are considerably wider under the coalescent than the birth-death model. This may seem counterintuitive as the deterministic coalescent models used here ignore demographic stochasticity and should therefore underestimate the true level of uncertainty about the parameters. The credible intervals being in fact wider under the coalescent may reflect the fact that the birth-death models are using information entering from the sampling times (while the coalescent conditions on sampling) to obtain more precise estimates of the epidemiological parameters.
Overall, we show that our inferences of the epidemiological dynamics of the current West African EBOV outbreak are robust to the model used and also consistent with estimates from previous outbreaks as well as other estimates of the current epidemic. Our hope is that more sequencing data from the epidemic will be made available in the immediate future. New data will allow us to estimate how the effective reproductive number has changed since June and allow us to estimate the incubation and infectious periods more reliably. Such estimates would be invaluable not only for evaluating the success of containment efforts, but also for planning future interventions.
All Bayesian methods will become available within the BEAST2^{13} add-on “phylodynamics” (https://github.com/BEAST2Dev/phylodynamics) and are available directly from us prior to the official release. The maximum likelihood methods are available within our R packages TreeSim v2.1^{31} and TreePar v3.1^{16}. We provide an R script specific to the Ebola analyses on our website (www.bsse.ethz.ch/cevo).
The authors have declared that no competing interests exist.