The largest outbreak of Ebola virus ever recorded continues to spread in West Africa. In response, the World Health Organisation (WHO) has declared a public health emergency of international concern. The potential for international dissemination of the Ebola virus, via international air travel, is an obvious risk that has already generated considerable interest^{1}^{,}^{2}^{,}^{3}^{,}^{4}.

Preliminary assessments of the risk of international spread focused solely on the historic volume of international passenger flight traffic between countries ^{1}^{,}^{2}. This has been followed by more detailed analyses, using historic passenger flight itinerary data to evaluate the expected number of internationally exported Ebola virus infections ^{4}, and using a globally-connected metapopulation epidemic model that allows for epidemic outbreaks to be seeded via importation from passengers and then to dynamically evolve ^{3}. Effectively, these latter two studies respectively consider the two distinct scenarios that we focus on herein: (i) assuming that significant numbers of cases of Ebola remain confined to Guinea, Liberia and Sierra Leone, and using historic Australian Customs passenger arrival card data into Australia; and, (ii) assuming potential global spread based upon historic international flight data.

On 27 October 2014 the Australian Government announced a policy change indicating that, effective immediately, new visas would not be granted, and existing temporary visas for individuals who had not yet departed to Australia would be cancelled ^{5}. Similar measures have since been proposed in Canada ^{6}. Visa restrictions, and in particular restrictions on humanitarian visas, pose a significant ethical challenge and have the potential for wide-ranging political ramifications. As such it is important to determine the efficacy of these policies in reducing the risk of Ebola arrival.

The relative risk to Australia, in comparison to countries such as Ghana, Senegal, and the United Kingdom, is small, and hence an assessment of the risk of Ebola importation to Australia has not been previously reported in existing studies, which were not focussed on any specific country. Such an assessment is of obvious benefit to decision makers within Australia, both in its own right, and to allow for the assessment of the new visa restriction policy.

We developed data-informed models appropriate to each scenario, and parameterised these models using passenger arrival card or international flight data, and WHO case data from West Africa as at 3 December 2014 ^{21}. An assessment of the risk under each scenario is reported. This consists of an estimate of the probability of importation at the beginning of each month between January 2015 and July 2015. This time period was chosen because wide-spread vaccination is unlikely to become available until April at the earliest ^{8}, and hence continued spread within West Africa is highly probable. In addition, we report a comparison between the predicted risk based on WHO case data reported on 17 October 2014 and 3 December 2014.

**Direct travel model**

In order to assess the risk of an individual with Ebola travelling directly from West Africa to Australia, the direct travel model was constructed. Epidemic dynamics in each of Liberia, Sierra Leone, and Guinea were evolved via a discrete-time stochastic Susceptible-Exposed-Infectious-Removed (SEIR) ^{9}^{,}^{10} epidemic model (with timesteps of one day; details in Appendix 1), and then the risk of entry into Australia was calculated based on the number of individuals flying into Australia from each of these countries (i.e., irrespective of stopovers). All passenger data was compiled and provided by the Australian Department of Immigration and Border Protection (www.immi.gov.au). Australia is one of the few countries in the world which is able to accurately measure the number of people entering and leaving the nation ^{20}. This arises from Australia’s global isolation and its island geography. With modern border surveillance systems this has meant that all movements into and out of the country are channelled through a relatively small number of sea and air ports, where there is an advanced electronic system to record their movements and some basic characteristics. These include: (i) their origin and destination; (ii) passenger status (i.e., Australian resident, tourist, or immigrant); and (iii) whether it is a permanent, long term (one year or more, but temporary) or short term movement.

The SEIR-type model is a standard epidemiological model for diseases with dynamics like those of Ebola, with the exposed period in particular necessary to account for the latency between initial exposure to the disease and the later onset of symptoms and infectiousness. More complex models have been used to analyse Ebola dynamics in some studies^{3}^{,}^{11}; however, the SEIR model provides sufficient detail in this context.

For a given day *t*, the probability that no exposed individual travelled to Australia that day was:

$$p_t = \left(1 - \frac{E_t}{N_t}\right)^{n},$$

with $N_t$ the total number of individuals eligible to fly, $E_t$ the number of exposed individuals, and $n$ the number of passengers arriving per day, and thus the cumulative probability that at least one exposed individual travelled to Australia on or before day *T* is:

$$P_T = 1 - \prod_{t=1}^{T} p_t.$$
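The daily and cumulative probabilities described here can be sketched in a few lines of Python. This is a minimal illustration, not the authors' R code: it assumes each arriving passenger is an independent draw from the eligible population, so that the chance of no exposed arrival on a day is (1 - exposed/eligible) raised to the number of passengers per day. The function name and argument names are hypothetical.

```python
from typing import List, Sequence


def cumulative_entry_probability(
    exposed: Sequence[float],
    eligible: Sequence[float],
    passengers_per_day: float,
) -> List[float]:
    """Return P(at least one exposed arrival on or before day T) for each T."""
    prob_no_entry_so_far = 1.0
    cumulative = []
    for e_t, n_t in zip(exposed, eligible):
        # Assumption: each passenger is exposed independently with probability
        # e_t / n_t, so no exposed passenger arrives that day with probability
        # (1 - e_t / n_t) ** passengers_per_day.
        prob_no_entry_today = (1.0 - e_t / n_t) ** passengers_per_day
        prob_no_entry_so_far *= prob_no_entry_today
        cumulative.append(1.0 - prob_no_entry_so_far)
    return cumulative
```

Because the passenger rate is an average, it may be fractional; the exponent form handles that without modification.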

As a baseline case, we assumed a mean latent period of 5.3 days (i.e., σ = 1/5.3) and a mean infectious period of 5.61 days (i.e., γ = 1/5.61), based on parameters reported by Althaus ^{12}. We estimated the contact rate β = 0.21 so as to ensure a resulting doubling time of approximately 45 days. This doubling time was calculated based on weekly new confirmed cases data for Sierra Leone ^{22}. The average daily rate of passenger arrivals into Australia was calculated from each of Liberia, Sierra Leone and Guinea, both total arrivals and limited to solely Australian residents, from Australian Customs arrival data from 2004-05 to 2013-14.
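As a rough consistency check on these parameters, the doubling time implied by β, σ and γ can be computed from the early exponential growth rate of a deterministic, continuous-time SEIR model. That analogue is an assumption made here for illustration; the study itself used a discrete-time stochastic model, so the numbers need not match exactly. The function name is hypothetical.

```python
import math


def seir_doubling_time(beta: float, sigma: float, gamma: float) -> float:
    """Doubling time (days) of early growth in a deterministic SEIR model."""
    # Early in an outbreak (nearly everyone susceptible), the growth rate r
    # is the positive root of (r + sigma) * (r + gamma) = beta * sigma.
    b = sigma + gamma
    c = sigma * gamma - beta * sigma
    r = (-b + math.sqrt(b * b - 4.0 * c)) / 2.0
    if r <= 0:
        raise ValueError("no exponential growth for these parameters")
    return math.log(2.0) / r


# Baseline values quoted in the text.
doubling_time = seir_doubling_time(beta=0.21, sigma=1 / 5.3, gamma=1 / 5.61)
```

Under this approximation the baseline parameters give a doubling time of about 44 days, in line with the approximately 45 days quoted.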

We considered: a baseline scenario with these parameters and historical transport levels; a scenario in which transport from West Africa was reduced by 50%; a scenario under which visas from West Africa are cancelled and no longer granted (i.e., limiting entry to only Australian residents); and a scenario under which the Ebola contact rate within West Africa was reduced by 20%. We also considered model sensitivity to increases or decreases in mean latent period, in particular demonstrating the impact of increasing latent period to 10 days or decreasing to 3 days. We report median cumulative probabilities of a case entering Australia, based on 1000 simulation runs for each scenario, along with 95% prediction intervals in tables/figures.
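The summary step (a median and a 95% prediction interval across simulation runs) might look like the sketch below. It assumes each run yields one cumulative probability for the date of interest and uses linearly interpolated 2.5th and 97.5th percentiles; the interpolation rule and the function names are illustrative choices, not taken from the study.

```python
import statistics
from typing import Sequence, Tuple


def percentile(sorted_values: Sequence[float], q: float) -> float:
    """Linearly interpolated percentile (q in [0, 100]) of sorted data."""
    position = (len(sorted_values) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] * (1.0 - fraction) + sorted_values[upper] * fraction


def summarise_runs(run_probabilities: Sequence[float]) -> Tuple[float, float, float]:
    """Return (median, 2.5th percentile, 97.5th percentile) across runs."""
    ordered = sorted(run_probabilities)
    return (
        statistics.median(ordered),
        percentile(ordered, 2.5),
        percentile(ordered, 97.5),
    )
```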

All modelling and analysis was performed using R version 3.1.0^{19}. Baseline model code is available at github.com/robert-cope/simEbola. We are unable to make specific flight or passenger data available at this time.

**Global network secondary outbreak model**

In order to assess the risk of an Ebola case entering Australia via an outbreak in a secondary source location (i.e., via an outbreak in a country that does not currently have an outbreak), the global network secondary outbreak model was constructed. Each country worldwide was treated as an individual population, connected through the global flight network. Within each country, spread of Ebola was modelled via the same discrete-time stochastic SEIR epidemic model as in the previous section. Each day, the number of individuals in each class was updated, and individuals were allowed to fly between countries: the number of individuals flying between each pair of countries was the average daily number of individuals flying between each pair of airports in the countries in question. Data on the annual number of international flights per airport and the number of seats, per airplane per airport, travelling worldwide for the year 2013, were obtained from OAG Aviation Worldwide Ltd (www.oag.com/). Susceptible and exposed individuals (i.e., those either not infected or infected and not yet showing symptoms) were allowed to fly, and the number of exposed individuals flying was modelled as a binomial random variable with probability being the proportion of exposed individuals of those eligible to fly.
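The daily travel step can be illustrated as follows. This is a simplified sketch under stated assumptions, not the study's implementation: routes are processed one after another, the binomial draw is built from Bernoulli trials using only the standard library, and the country labels, dictionary layout and function name are all hypothetical.

```python
import random
from typing import Dict, Tuple


def fly_one_day(
    susceptible: Dict[str, int],
    exposed: Dict[str, int],
    daily_passengers: Dict[Tuple[str, str], int],
    rng: random.Random,
) -> None:
    """Move one day's travellers between countries, updating counts in place."""
    for (origin, destination), seats in daily_passengers.items():
        eligible = susceptible[origin] + exposed[origin]
        if eligible == 0:
            continue
        travellers = min(seats, eligible)
        # Probability that a traveller is exposed: the proportion of exposed
        # individuals among those eligible to fly (susceptible + exposed).
        p_exposed = exposed[origin] / eligible
        # Binomial(travellers, p_exposed) drawn as a sum of Bernoulli trials.
        e_moved = sum(rng.random() < p_exposed for _ in range(travellers))
        # Cap both groups so that no compartment can go negative.
        e_moved = min(e_moved, exposed[origin])
        s_moved = min(travellers - e_moved, susceptible[origin])
        susceptible[origin] -= s_moved
        exposed[origin] -= e_moved
        susceptible[destination] += s_moved
        exposed[destination] += e_moved
```

Infectious and removed individuals are left out of the step entirely, which mirrors the restriction that only susceptible and exposed individuals fly.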

Simulations of this model were progressed 211 days (3 December 2014 — 1 July 2015) and the spread and growth of Ebola virus cases into each country recorded. Disease parameters were as described above. We report results of: (i) a baseline model with historical infection and transport rates and uniform infection rates in each country; (ii) a scenario under which countries that have experienced at least 100 cases then have 50% reduced outgoing traffic; and (iii) a scenario in which higher economic status countries have reduced contact rate. We report, for each scenario, the cumulative probability of entry into Australia at each timestep based on 50 simulations, i.e., the proportion of those simulations for which an entry into Australia had occurred.
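The reported cumulative probability is simply the fraction of runs in which an entry has occurred by each timestep. A minimal sketch, assuming each run is summarised by the day of its first entry into Australia (or `None` if no entry occurred); the function name is hypothetical:

```python
from typing import List, Optional, Sequence


def cumulative_entry_proportion(
    first_entry_days: Sequence[Optional[int]], horizon_days: int
) -> List[float]:
    """Proportion of runs with an entry on or before each day in the horizon."""
    n_runs = len(first_entry_days)
    return [
        sum(1 for day in first_entry_days if day is not None and day <= t) / n_runs
        for t in range(horizon_days)
    ]
```

With 50 runs the estimate moves in steps of 0.02, so a reported 6% corresponds to 3 runs and 12% to 6 runs.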

Specifically, for the economically-moderated contact rate scenario, countries were classified into four classes based on existing World Bank income classifications ^{16}: low income, low-mid income, mid-high income, and high income. The contact rate within each country was modified based on this classification: low income countries used an unmodified contact rate; high income countries used a decreased contact rate, such that in these countries each case resulted, on average, only in its own replacement and the epidemic thus would not experience unmitigated growth; and low-mid and mid-high income countries were assigned contact rates equally spaced between these two extremes.
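One way to read this assignment is sketched below. It rests on two assumptions that are not spelled out in this passage: that the high income contact rate is the one giving replacement only (as described in the limitations section), which under a simple SEIR model with R0 = β/γ means β equal to γ; and that "equidistant" means equal steps between the low and high income values. The function name and class labels are illustrative.

```python
from typing import Dict


def contact_rates_by_income(beta_low: float, gamma: float) -> Dict[str, float]:
    """Assign a contact rate to each World Bank income class."""
    # Assumption: R0 = beta / gamma, so replacement only (R0 = 1) in high
    # income countries corresponds to beta equal to gamma.
    beta_high = gamma
    # Assumption: the two middle classes sit at equal steps between extremes.
    step = (beta_low - beta_high) / 3.0
    return {
        "low": beta_low,
        "low-mid": beta_low - step,
        "mid-high": beta_low - 2.0 * step,
        "high": beta_high,
    }
```

For example, `contact_rates_by_income(0.21, 1 / 5.61)` uses the baseline values quoted earlier in the text.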

**Time-series comparison**

These models were initially constructed based on WHO case data reported on 17 October 2014, and projected forward 200 days. New data became available while the study was in review, and results were subsequently updated to reflect these more recent data, as reported at 3 December 2014. Initial projections from 17 October data were based on a doubling time of 30 days, a conservative choice given the range of doubling times reported at the time^{13}^{,}^{14}^{,}^{15}. Comparisons were made between projected risk into Australia based on this initial analysis (17 October) and the updated analysis (3 December).

**Direct travel model**

Under the baseline scenario of unchanged epidemic conditions and traffic from West Africa to Australia, the probability of a case entering Australia by 1 July 2015 is 0.34 (Figure 1, Figure 2). Under the scenario of 50% reduced traffic, the probability of a case by 1 July 2015 falls to 0.19 (Figure 3, Figure 4). The new Australian Government policy, restricting/cancelling visas from West Africa into Australia, reduces the probability of entry by 1 July 2015 to 0.16 (Figure 2, Figure 3).

Alternatively, when we consider the potential impact of reduced Ebola contact rates within existing outbreaks, a reduction of 20% results in a substantial reduction in risk, with the probability of a case entering by 1 July 2015 being only 0.03 (Figure 4, Figure 5).

Increasing the latent period for Ebola to 10 days (provided the doubling time remains constant) increased the probability that a case enters Australia within a given time (Figure 6, Figure 7). The converse is also true – a decrease in latent period to 3 days decreased the probability of entry (Figure 6, Figure 7).

**Global network secondary outbreak model**

Under a global outbreak model, with baseline parameters unchanged (infection rates globally uniform, consistent international air traffic), and based on 50 simulation runs, the first date a case entered Australia via an outbreak in a secondary source location was 23 May 2015, and cases had entered Australia by 1 June 2015 in 6% of simulation runs and by 1 July 2015 in 12% of simulation runs (Figure 8).

Simulations were also performed under two alternate scenarios: (a) the rate of air traffic leaving infected countries was decreased by 50% for each country that has experienced at least 100 cases, and (b) contact rates were decreased within higher-income countries. Under both of these scenarios, no Ebola cases entered Australia by 1 July 2015 under 50 simulations of the global network secondary outbreak model.

**Time-series comparison**

Under historic traffic levels from West Africa to Australia (i.e., the direct travel model), and epidemic parameters and initial conditions as reported on 17 October 2014, the probability of a case entering Australia by 1 April 2015 was 0.97. The predicted risk under the same model, with parameters and initial conditions as reported on 3 December 2014, was 0.09 (Figure 9). The probability of a case entering within 200 days of 17 October 2014 was 1.00, compared to a probability of 0.30 within 200 days of 3 December 2014.

Under the global network secondary outbreak model, the probability of a case entering Australia via an outbreak in a secondary source location within 200 days of 17 October 2014 was 0.76. With updated parameters and initial conditions, the probability of a case entering Australia within 200 days of 3 December 2014 was 0.10 (Figure 10).

**Direct travel model**

Under current epidemic conditions and historic travel levels into Australia, it is possible that an Ebola case will enter Australia within the first six months of 2015, having travelled directly from West Africa, with a probability of 0.34.

The cessation of granting visas/cancelling existing visas is effectively equivalent to a traffic reduction of approximately 60% (i.e., 83% reduction from Guinea, 60% reduction from Liberia, 56% reduction from Sierra Leone), and its impact is in line with this: the probability of a case entering Australia by 1 July 2015 is reduced by 53% (slightly more than under the 50% reduction in traffic scenario). However, the probability of an eventual case entering Australia within the first six months of 2015 is still sufficiently high as to warrant caution (16%).

It is possible that there may be some decrease in the number of Australian residents travelling to and from affected countries, which may further decrease the probability of a case arriving. Alternatively there may be, within the short term, an increase, if for example visitors to West Africa are returning to Australia at a greater rate than they may previously have in an attempt to avoid Ebola.

A 20% decrease in contact rate within affected West African countries reduced the probability of an eventual case entering Australia substantially (3% chance of introduction by 1 July 2015, vs. 34% under the baseline scenario). It is possible that public health research to determine effective ways to reduce infection rates, combined with foreign aid contributing to increased availability of hospital beds and high-quality treatment, could feasibly result in a decrease in contact rate of this magnitude. Note that at this level of reduced contact, the number of cases no longer increases exponentially, or, rather, the exponential growth is so slow that within the time period considered it is close to linear (Figure 5). If the contact rate is reduced even further than this, the number of Ebola cases will begin to decrease within West Africa. This is consistent with CDC predictions that Ebola infection decreases under potential control and hospitalization scenarios ^{14}.

**Global network secondary outbreak model**

We found that, under existing Ebola transmission parameters and historic global flight conditions, it is possible but not likely that Australia may see an Ebola case via an outbreak in a secondary source country within the first six months of 2015, with a probability of approximately 0.12 by 1 July 2015. It is very unlikely that this happens early during this time period, given the time it would take for outbreaks to be established in countries with significant direct air traffic to Australia.

Under a model with global control of air traffic leaving each country in which a significant outbreak has occurred, the probability of a case reaching Australia within the first six months of 2015 is further reduced, such that no simulation runs (from 50) had cases enter Australia within this interval. Some reduction in air traffic to and from affected countries is a reasonable assumption, either due to mandated restrictions, or just the natural desire of people to avoid travelling where epidemic risk is significant.

When the assumption is made that contact rates are likely to be reduced in higher-income countries, which may be reasonable due to a combination of high-quality healthcare, and education relating to disease transmission, global outbreak spread slows significantly. As a result of this, no simulation runs had a case enter Australia within the first six months of 2015 under this scenario.

It may appear unintuitive that there would be less risk of an Ebola case entering Australia within the first six months of 2015 from the global outbreak model than from direct travel. The discrepancy is due to the time scale involved: under the global outbreak model, secondary outbreaks would need to occur and grow in countries with direct connections to Australia for a case to then enter, which would take a significant amount of time. If the time scale were longer, the risk due to global spread would increase and eventually be greater than due to direct travel, and also be less susceptible to control measures such as visa restrictions.

**Time-series comparison**

Modelling based on updated parameters and initial conditions, based on data available at 3 December 2014 ^{21}, projected substantially lower risk of a case entering Australia than modelling based on parameters and initial conditions from 17 October 2014 ^{7}. For any given date, it is natural that risk under a model beginning in December would be lower than risk under a model beginning in October given that these probabilities are conditional on not having seen a case, i.e., projections from the October model include some risk that the case may have arrived in November. In addition to this, data available in December implied a slower doubling time of 45 days, whereas 17 October models relied on a doubling time of 30 days, chosen conservatively based on a variety of figures reported in the literature ^{13}^{,}^{14}^{,}^{15}. Furthermore, initial case numbers based on 3 December data were lower, now taken from new weekly case counts consisting only of confirmed cases. Case numbers in Sierra Leone were higher than those in Liberia under 3 December data, resulting in greater risk of a case entering Australia due to direct travel from Sierra Leone, whereas under 17 October data Liberia presented more risk (Figure 8).

It is likely that the strong difference between results based on these two datasets is primarily due to two factors: (1) efforts to control the spread of Ebola in West Africa, and (2) more accurate data, restricted to confirmed cases. Significant public health measures for the control of Ebola are underway, and show promising signs. The number of new reported incidences in Liberia was stable or declining by 3 December 2014, and protocols were in place throughout the region to effectively isolate patients, and to ensure safe burial practices ^{21}. These control measures would directly influence the rate at which the outbreak is growing, i.e., the doubling time. Data on new weekly cases, restricted to confirmed cases only, were not available when this analysis was performed on 17 October, and as a result the 3 December model uses these lower, more accurate initial estimates, which further slows outbreak growth and results in reduced projected risk to Australia. Overall, it is clear that there can be significant variability in estimated risk due to the parameter estimates used, and the reduction in risk projected here is likely due to both control efforts and improved data.

**Study assumptions and limitations**

The stochastic SEIR model used here effectively represents the necessary components of Ebola dynamics for this study. More complex models have been applied in other studies, incorporating e.g., specific hospitalisation dynamics or separate removal classes (death vs. recovery) allowing specific incorporation of post-death contact. However, in this study it was most parsimonious to use a simple model with fewer assumptions as to disease dynamics or model parameters. There is some variation in reported parameter values in the literature, e.g., in terms of reported latent period (^{12} vs. ^{13}), and for the case of latent period we investigated a selection of values to determine sensitivity to this parameter.

One assumption made here, that is likely to significantly influence our predictions, is of consistency, i.e., the assumption that in general future disease dynamics and/or transport dynamics will follow past dynamics. If measures to control Ebola within West Africa are successful in the near future, or if air traffic trends from affected nations have been decreased significantly, then the risk of transport will be decreased. The best case scenario is of control within West Africa such that disease cases decrease to the point of eventual extinction without extensive outbreaks elsewhere (i.e., any individual cases that emerge elsewhere are controlled quickly). In a sense, the status-quo is the most conservative scenario.

We assumed here that 50% of removed individuals die, and 50% recover. Estimates of mortality rates for Ebola have varied considerably ^{13}^{,}^{15}, and tend to change quickly within this outbreak, in part due to estimates being biased during the early stages of an outbreak ^{17}. A higher mortality rate would, in this model, mean a faster increase in case numbers, and hence would lead to higher probabilities of introduction into Australia. As such, 50% is a conservative choice of mortality rate.

Overall, we have made a large number of assumptions in each of the alternate scenarios we have chosen. The extent to which air traffic, or disease contact rates might decrease is uncertain and will have a nontrivial impact on model results. In particular the choice of contact rate for countries within different economic groups essentially defines that model. In this case, we assumed that high income countries would have contact rates that result only in replacement, on average, in terms of outbreak growth (and as such outbreaks in these countries will die out via stochasticity). This seems reasonable, and is not inconsistent with high quality medical care, contact tracing, etc., but control could certainly be stronger or weaker than this.

Finally, it should be noted that these projections are based upon WHO infection numbers, which, it has been suggested, may significantly under-report the true number of cases ^{14}^{,}^{18}. If existing case numbers in West Africa are significantly higher than recorded, the disease would propagate more quickly and the probability of entry into Australia within a given timeframe would be higher.

**Conclusion**

Based on two alternate models for the spread of Ebola, either via direct travel from West Africa or through spread to secondary sources, we conclude that under existing conditions it is possible that a case of Ebola will enter Australia within the first six months of 2015, with a probability of entry of 0.34 by 1 July 2015 under the baseline direct travel scenario. Reduced traffic due to new government visa restrictions will decrease the probability of this occurring. Comparison between data from 17 October 2014 and 3 December 2014 suggests that control measures within this period have had a positive impact, resulting in reduced risk of importation into Australia. Further control of existing outbreaks within West Africa, and in any further outbreaks in secondary locations, would provide the strongest decrease in risk to Australia. Medical professionals and policy makers should be prepared for the possible entry of an Ebola case into Australia, and continue to undertake public health research and supply aid in an effort to effectively reduce proliferation of Ebola in existing outbreaks.

As of mid-October 2014, the number of reported suspected cases of the Ebola epidemic in West Africa had exceeded 9,000 cases, which is likely a significant underestimate ^{1}^{,}^{2}^{,}^{3}. Markedly different dynamical behaviors can be observed for the growth curves of the epidemic in the countries of Guinea, Sierra Leone and Liberia (Fig. 1). Most immediately, the epidemic has been growing at a much faster rate in Liberia than in Guinea. Although the epidemic likely began much earlier in Guinea ^{4}, Liberia had approximately the same number of cases in early August, twice as many cases by the end of August and nearly three times as many cases by mid-September. Even more striking, the number of cases in Guinea appears to have been growing sub-exponentially until late-August (approximately linearly with a slope of about 3 cases per day) while the number of cases in Liberia has been growing exponentially (approximately 10 cases per day averaged for July, 40 cases per day averaged for August and 70 cases per day in September). The growth dynamics of the epidemic in Sierra Leone appear to be intermediate between these two. It would be helpful to understand these different growth patterns within the context of a single epidemic, since a better understanding of the source of these different patterns may yield productive ideas for curbing the exponential growth of the epidemic in Liberia.

We describe a stochastic network model with three levels of community structure (households and communities of households within a country population) on which we model SEIR transmission dynamics for the spread of Ebola infection. We are able to fit the WHO Ebola case data of each country by varying only the community mixing parameter (connectivity) for each country (see Fig. 3). Observing that the long term dynamics of epidemic spread within communities are linear due to local saturation effects, we are able to demonstrate that linear growth phases followed by exponential growth phases are consistent with seeding of the epidemic to new communities.

A variety of computational and statistical models have been used to help characterize and resolve the mechanisms underlying trends in the growth of this epidemic. The models of ^{5} and ^{6} include a parameter to estimate and predict the effect of control measures on the epidemic. SEIR models such as those of ^{5} and ^{7} are four-compartment models that resolve infectious dynamics between populations based on their susceptibility and infectiousness and account for the time scales of viral incubation and infectiousness. SEIR models with seven compartments ^{8}^{,}^{9}^{,}^{2} further resolve the effects of varying degrees of transmission among, for example, community, hospital, and funeral populations.

These computational models focus on different aspects of the epidemic to explain or observe the marked differences of the growth curves for the epidemic in each country. The models of ^{5} and ^{6}, accounting for the effect of control measures, identify slowing of the growth of the epidemic only for Guinea and Sierra Leone. The model of ^{8}, accounting for different community, hospital and funereal transmission rates, predicts that a higher number of transmissions from funerals in Liberia could account for the faster rate of growth of the epidemic in Liberia compared to Sierra Leone. Likewise, the model of ^{2} predicted a higher fraction of patients with no effective measures to limit transmission, including burial transmission, in Liberia. Especially interesting differences among the three countries were described by ^{7} in their methodology to observe changes in the effective reproductive number over time. In particular, they found that the effective reproductive number rose for Liberia and Guinea. The authors observe that this increase occurred somewhat early on during the Liberia outbreak in mid-July, when the outbreak spread to densely populated regions in Monrovia, and during the Guinean outbreak in mid-August, around the time the outbreak spread to densely populated regions in Conakry.

We would like to provide a proof-of-principle explanation for the differences in the dynamics of the 2014 Ebola epidemic based on differences in the community network properties of the affected regions, even when the number of daily interactions, the transmission rates and, in particular, the average number of people infected by each person within a naïve population, *R_{0}*, are the same for all three countries. A prediction of our model is that the effective reproductive number

**The Importance of Network Interactions**

Incomplete mixing within communities, heterogeneity in contact transmission, local saturation and the co-incidence of multiple transmission chains can have significant effects on epidemic dynamics ^{13}^{,}^{14}^{,}^{15}. The importance of network considerations, in particular the importance of reducing the contacts between exposed and unexposed groups, for controlling this epidemic has been described ^{16}. One of the most important features to capture about the Ebola virus (in particular for a model resolved at an individual, mechanistic level) is that there is a high probability of transmission between close contacts, but a lower probability among casual contacts. For example, in a formal study of transmission chains in the 1979 Ebola virus outbreak in Sudan, it was found that care-takers of the sick had a 5.1-fold higher rate of transmission than other family members with more casual contact ^{17}. Likewise, in the 1976 Ebola outbreak in Zaire, the probability of transmission was 27.3% among very close contacts (spouses, parents and children) but only 8% for other relatives ^{18}. In our model we organize individuals within households (a broadly defined term meant to represent the set of potential “close” contacts) and households are organized within local communities (a larger, modularly structured network of the population of less likely and less infectious interactions). The term “household” has been used in the literature for networks of close interactions, so we use this term here for convenience, but in the case of Ebola virus transmission, a network of close contacts would include overlapping household, hospital and funereal networks, and the concept of ‘household’ should be expanded to include these.

Two-scale community models with different transmission rates for close (or local) contacts and casual (or global) contacts have been studied previously (^{19}^{,}^{20}^{,}^{21} and references therein), where the smaller scale compartment may be called households, clusters or sub-graphs. With this paradigm, there are two transmission rates, and thus two scale-dependent reproductive numbers that would sum to the global reproductive number *R_{0}* of the epidemic. We define

We implement a discrete, probabilistic SEIR model on a social connectivity network that combines elements of a two-scale network model by ^{20} and the modular network of ^{24} to create what is technically a three-scale community network (individuals are organized within families, and families within modular local communities that are subsets of the entire population). This connectivity structure is simple by construction (more regular than small-world networks and lacking long-range connections of different lengths) yet nevertheless enables a parametrization of the level of mixing that exists between communities. Specifically, a population that is “well-mixed” has a larger number of families interacting within each local community. Details of the model are described in the Methods section. The effects of incomplete community mixing are significant even for moderate population sizes ^{19} and cannot be reproduced by unstructured mean-field models that assume complete mixing (e.g., see ^{25}). This emphasizes the role of individual and network-based models to resolve these effects.

A stochastic, individual-based SEIR model is implemented for a population with a network structure of two edge types: close contacts among members of a household and casual contacts among members of a local community. This component of the model is comparable to the two-scale SIR community model described by ^{20} (we use a discrete lattice-based simulation approach instead of a Markov approach and we add an exposed period during which an individual is exposed but not infectious). The local community of each household is modeled as the set of *r* nearest households; for this we follow the modular lattice approach of ^{24}. See Fig. 2 for a schematic of our three-scale network. This last component enables us to systematically vary the extent of community mixing.

**Households of size H: Modeled on an**

A population of size *P=L·H* is modeled as an *L×H* lattice where *H*=|*h _{i}*| corresponds to a fixed number of individuals in a household

**Communities of size C=2r+1 households: Modeled on an L×H**

The population of households is further organized within communities *c _{i}* (

**Network Structure of Close and Casual Contacts on the Lattice**

A network of two edge types representing close versus casual contacts is defined for the population (see Fig. 2). Individuals within a household are connected by edges that represent potential close contacts, and individuals within a community are connected by edges that represent potential casual contacts. Thus, each household may be thought of as a well-mixed (completely connected) graph of *H* vertices and each community may be thought of as a well-mixed (completely connected) graph of *C·H* vertices. From an individual’s perspective, an individual located at the (*i*,*j*)_{th} node is connected to each member of the *i*_{th} household by potential close contacts and to each member of the *i*_{th} community by potential casual contacts. The described network structure is completely homogeneous: every individual is centered within a network that is identical to that of every other individual.
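The contact structure described above can be illustrated with a short Python sketch. This is not the authors' Matlab implementation: the function name `contacts`, the periodic (wrap-around) boundary in the household index, and the convention that the community of household *i* consists of households *i−r* through *i+r* (so that *C*=2*r*+1) are assumptions made for illustration. Household members are also counted as casual contacts here, following the description of the community as a completely connected graph of *C·H* vertices.

```python
def contacts(i, j, L, H, r):
    """Return (close, casual) contact lists for the individual at node (i, j).

    i indexes the household (0..L-1) and j the member within it (0..H-1).
    Assumptions (not from the source): periodic boundary in the household
    index, and community of household i = households i-r .. i+r (C = 2r + 1).
    Requires L >= 2r + 1 so that no household is listed twice.
    """
    # Close contacts: every other member of the same household.
    close = [(i, m) for m in range(H) if m != j]
    # Households that make up the local community of household i.
    community = [(i + d) % L for d in range(-r, r + 1)]
    # Casual contacts: every other individual in those households.
    casual = [(h, m) for h in community for m in range(H) if (h, m) != (i, j)]
    return close, casual


close, casual = contacts(i=0, j=0, L=100, H=16, r=4)
print(len(close), len(casual))  # H - 1 = 15 close, C*H - 1 = 143 casual
```

Because every node sees the same pattern of neighbors, the sketch reproduces the homogeneity noted in the text: the two counts do not depend on the choice of (*i*, *j*).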

**SEIR Dynamics**

Initially, all individuals (lattice nodes/network vertices) are susceptible (state **S**) except one individual that is exposed (state **E**) representing “patient 0”. In simulations, time steps are discrete and correspond to exactly one day. States are updated at each time step with the following transition probabilities:

*p*(*S*→*E*)=*probability that a susceptible individual becomes exposed*

*=(1-probability of no exposures from any infected contacts)*

=(1-(1-*t _{h}*)

where *t _{H}* is the transmission probability within a household (probability of exposure per day per infectious household contact),

*p*(*E*→*I*)=*probability that an exposed individual becomes infectious*

=1/γ,

where *γ* is the average incubation period.

*p*(*I*→*R*)=*probability that an infectious individual will become refractory*

=1/λ,

where *λ* is the average infectious period.

For the incubation and infectious periods, we follow recent modeling groups ^{5}^{,}^{2} based on data in ^{23}^{,}^{24}^{,}^{25}^{,}^{26} using *γ*=5.3 days and *λ*=5.61 days. We observe that these periods may be longer ^{4}^{,}^{26}, as used in ^{9}^{,}^{6}^{,}^{8} and varied in ^{7}.
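A minimal Python sketch of one daily update may help make these transition rules concrete. It is an illustration under stated assumptions, not the authors' Matlab code. In particular, the exposure probability is assumed to take the form 1 − (1 − *t*_H)^(n_H) · (1 − *t*_C)^(n_C), where n_H and n_C are the numbers of infectious household and community contacts and *t*_C is an assumed per-day community transmission probability. The names `p_exposure` and `step`, the periodic boundary, and the numerical values of `t_h` and `t_c` in the usage lines are all hypothetical.

```python
import random


def p_exposure(n_h, n_c, t_h, t_c):
    """Assumed form of p(S -> E): one minus the probability of escaping
    exposure from every infectious household and community contact."""
    return 1.0 - (1.0 - t_h) ** n_h * (1.0 - t_c) ** n_c


def step(state, r, t_h, t_c, gamma=5.3, lam=5.61, rng=random):
    """Advance an L x H lattice of states ('S', 'E', 'I', 'R') by one day.

    state[i][j] is member j of household i. The update is synchronous:
    all transitions are computed from the state at the start of the day.
    Assumes a periodic boundary and L >= 2r + 1.
    """
    L = len(state)
    infectious = [row.count("I") for row in state]  # per household
    new = [row[:] for row in state]
    for i, row in enumerate(state):
        # Infectious individuals anywhere in the community of household i.
        n_c = sum(infectious[(i + d) % L] for d in range(-r, r + 1))
        for j, s in enumerate(row):
            u = rng.random()
            if s == "S" and u < p_exposure(infectious[i], n_c, t_h, t_c):
                new[i][j] = "E"
            elif s == "E" and u < 1.0 / gamma:  # end of incubation
                new[i][j] = "I"
            elif s == "I" and u < 1.0 / lam:    # end of infectious period
                new[i][j] = "R"
    return new


# Usage: 200 households of 16, one exposed "patient 0", 30 simulated days.
rng = random.Random(1)
state = [["S"] * 16 for _ in range(200)]
state[0][0] = "E"
for day in range(30):
    state = step(state, r=4, t_h=0.02, t_c=0.0005, rng=rng)
print(sum(row.count("S") for row in state), "individuals still susceptible")
```

Using a constant daily probability of 1/*γ* or 1/*λ* makes the incubation and infectious periods geometrically distributed with the stated means, which is the natural reading of the transition probabilities above.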

**Reparametrization of transmission rates in terms of R_{0} and expected number of contacts η**

The extent of mixing increases with the size of the community *C*. A community of size *C* means that each individual in that community has an equi-probable chance of interacting with each other member of the community. However, it becomes increasingly clear as the community size increases that members of the community will not interact with every member of the community every day. Rather, we assume that individuals interact with an average number of people η each day, where η is a fixed value independent of community size.

We define the reproductive numbers *R _{0H}* and

*R _{0H }*≈ (

*R _{0C }*≈ (

Since* R _{0 }*is a key epidemiological parameter, it is helpful to make this a control parameter of the model rather than the transmission rate. Thus, we solve for the transmission rates in terms of

so that

**Four Free Model Parameters: R_{0H}, R_{0C}, H, C**

Since our incubation and infection periods are fixed, our model is completely prescribed by four intuitive free parameters: the household reproductive number *R _{0H}*, the community reproductive number R

**Model Response Parameters**

Our model tracks the states of individuals over time. In simulations, an individual is defined as an Ebola case when they become infectious. This assumes that an individual is not recognized as a case until they are infectious and that there is no delay in identifying infectious individuals.

Many response variables can be calculated, including the fraction of infections not occurring due to a contact already being infected (saturation), the rate of spread of the infection through the population (as, for example, a function of the average distance from the initial infected individual), the structure of the chain of transmission from any single exposed individual, etc. In the results presented here, we focus on the cumulative number of infectious cases per day. We also calculate the effective reproductive number *R*_{e}, which is the average number of infections resulting from each infectious individual. At the conclusion of a simulation,

**Case Data and Matching Epidemiological Curves for Ebola Case Number versus Time**

We compare simulated Ebola cases per simulation day with Ebola cases per day for Guinea, Sierra Leone and Liberia. Case data were taken from Wikipedia ^{11}, as compiled from WHO case reports ^{12}, and were retrieved on October 15^{th}.

Since the date of the first case is not given, and especially since the number of days between the first case and the *n _{th}* case can be highly variable, we synchronize the simulation day and the calendar date by using the first day that there are 48, 95 or 51 cases in Guinea, Sierra Leone and Liberia, corresponding with March 22

A comparison of general *R*-square coefficients of determination was used to identify parameter values providing a good fit to the empirical data. In particular, a locally optimal parameter value for *R*_{0C} for given values of *H* and *R*_{0H} was verified, and a globally optimal value of *C* for given values of *H*, *R*_{0H} and *R*_{0C} (Fig. 3) was verified, by changing each parameter one at a time and confirming that these values maximized *R*-square while all other parameters were held fixed.
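The one-at-a-time check can be sketched in Python as follows. The sketch assumes the standard coefficient of determination, 1 − SS_res/SS_tot, which may differ in detail from the "general" *R*-square computed by the Matlab script used by the authors. The function names and the toy `simulate` callable are hypothetical stand-ins for the averaged simulation output.

```python
def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def best_value(candidates, simulate, observed):
    """Scan one parameter while all others are held fixed inside `simulate`,
    and return the candidate value that maximizes R-square."""
    return max(candidates, key=lambda v: r_squared(observed, simulate(v)))


# Toy usage: `simulate` stands in for the mean simulated case curve
# obtained with one parameter set to `slope`.
observed = [1.0, 2.0, 3.0, 4.0]
simulate = lambda slope: [slope * day for day in (1, 2, 3, 4)]
print(best_value([0.5, 1.0, 1.5], simulate, observed))  # 1.0
```

A scan of this kind only confirms a local optimum along each parameter axis; it does not rule out equally good fits elsewhere in parameter space.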

Simulations were completed using the software package Matlab. The script for calculating *R*-square was written by Jered Wells. Matlab scripts used to generate all figures can be found at:

http://www.southalabama.edu/mathstat/personal_pages/byrne/PLoS_MATLABscripts.htm

[F1] Accounting for saturation effects that occur even over the infectious period of the first infectious individual in a naïve infectious population, more accurate but less penetrable descriptions for the community network can be found in Appendix 1.

**Fitting Country Case Data for the First Five Months of the Epidemic**

The number of cases over time for Guinea, Sierra Leone and Liberia over the first five months of the epidemic (March 22^{nd} – August 22^{nd}) is shown in Fig. 3. We sought to describe these dynamics in the context of the spread of a single epidemic within a contiguous region, so that key epidemiological parameters, in particular transmission rates and the basic reproductive number, would be the same in each country. We also assumed average household size would be the same for each country and required that the difference between countries be described solely by changing the community size, *C*, a measure of the size of the community within which infectious and susceptible individuals interact by casual contacts. It is thus natural to refer to *C* as the *community mixing size*.

For the fixed set of parameters *H*=16, *R*_{0H}=1.8 and *R*_{0C}=0.55, average simulation results yielded good fits to the empirical increase in case number over time as the community mixing size *C* was increased from *C*=9 to *C*=33 to *C*=51 for Guinea, Sierra Leone and Liberia (Fig. 3). This result provides a *proof of principle* that the differences in the growth dynamics of the epidemic can be explained by different levels of community mixing.

The particular set of values in Fig. 3 was chosen to be within likely ranges of their empirical values (see Discussion). A local sensitivity analysis was done to establish that *R*_{0C} and *C* were locally optimized for the given choice of *H*=16, *R*_{0H}=1.8, but a systematic search of parameter space was not done to determine the shape of the set of all parameters, still within the likely range of their empirical values, that would provide equally good fits of the data. Since the epidemic growth rate generally increases with increases of any one parameter, different parameter values may yield similar results if one parameter is raised while another is decreased, though we note there are complex effects on the shape of the transient behavior in each case. In the Appendix, we include several examples of alternative fits to the data in which, for example, the household size *H* is decreased or increased, or the household reproductive number *R*_{0H} is increased or decreased. These changes to *H* and *R*_{0H} require commensurate changes in *C* and *R*_{0C} in order to achieve good fitting of the data. In these results, we focus on the parameter set {*H*, *R*_{0H}, *R*_{0C}}={16,1.8,0.55} since these are consistent with previously described epidemiological data for Ebola virus (see Discussion for further details).

The effective epidemic reproduction number *R*_{e} is calculated from the model simulations. The value of *R*_{e} is time-dependent (plots are provided in the Appendix) but decreases to values close to one for each country (*R*_{e}=1.03±0.01, 1.10±0.02 and 1.17±0.02 for Guinea, Sierra Leone and Liberia, respectively). These values indicate very strong saturation effects, as many members of a household do not infect anyone after all members of their household have been exposed.

**Stochastic Variability of Individual Simulations and Stochastic Spread of the Epidemic**

The simulation curves in Figs. 3 and 7 show the average number of Ebola virus cases versus time for simulations that did not die out before reaching a threshold number of cases. However, our model is probabilistic and individual simulation curves are quite variable (Fig. 4a). A significant source of variability is whether the infection spreads beyond a small number of cases. Since stochastic effects are significant (Fig. 4a), and the epidemic fails to reach the threshold number of cases for a large fraction of the simulations, we describe the likely spread of an epidemic with a histogram of the distribution of outbreak sizes in Fig. 4*bcd*. The distributions of outbreak sizes for our model parameters describing the growth dynamics of Guinea and Liberia (Fig. 4*cd*) are bimodal: simulations frequently result in no spread (23±1% of all simulations result in no secondary infection), but if there is spread, epidemics are likely to become quite large (>950 infections). For Guinea, there is some non-negligible probability that an intermediately sized outbreak does not become an epidemic (note the low frequency of outbreaks of size 150 to 950 in Fig. 4*c*). For the parameters that describe Liberian growth dynamics (Fig. 4*d*), in contrast, the probability that an intermediately sized outbreak will spontaneously extinguish is negligible (not once occurring in 2000 simulations). To underscore the possibility that Guinea is a population that may be in transition from one likely to have only small outbreaks to one that would always have large epidemics like Liberia, we provide a histogram of the distribution of outbreak sizes for a population with even smaller community mixing size (*C*=5) (Fig. 4*b*). For this population, there is a sizable probability of small outbreaks, with a very small probability of a large one. Insets of each panel in Fig. 4*bcd* show that the early dynamics are similar for all community mixing sizes.

**Location of Country Parameters in Phase Space and Predictions for Curbing Epidemic Growth**

The locations of parameter values in phase space that fit the Guinea, Sierra Leone and Liberia data are shown in Fig. 5. In Fig. 5a, the community reproductive number *R*_{0C} is varied along the *x*-axis from 0.05 to 0.85, and the community mixing size *C* is varied along the *y*-axis from 1 to 65. The three countries are located along the *x*-axis at *R*_{0C}=0.55, and along the *y*-axis at *C*=9 (Guinea), *C*=33 (Sierra Leone) or *C*=51 (Liberia). The shaded regions indicate increasing numbers of cases in the first 100 days after the outbreak has been established (defined by reaching 50 cases). This diagram shows the effect of decreases in *C* and *R*_{0C} on the growth of the epidemic during an early time period. For example, the phase diagram of Fig. 5a predicts that Liberia could have the slower Guinea-type growth dynamic if the reproductive number within communities was decreased from *R*_{0C}=0.55 to *R*_{0C}=0.25 (approximately a 50% reduction in community transmission).

**Extrapolating from the Five Months March 22^{nd} – August 22^{nd} to Interpret Epidemic Dynamics through October 15^{th}**

During the writing of this manuscript, the growth dynamic of the epidemic in Guinea changed abruptly, with an increase in the average number of Ebola cases per day. Our model parameters that provided a good fit to the Guinea data for March 22^{nd} – August 22^{nd} (Fig. 3a) predicted that the number of cases in Guinea would continue with its previous trend of 3.3±0.5 new cases per day (Fig. 6a, the solid black line). However, the abrupt change in slope suggested a new source of cases. We calculated the difference between the empirical number of Ebola cases and the predicted number of Ebola cases over time to find the number of ‘new cases’ that would be supplied by a putative second outbreak. In particular, there were 823 cases reported for Guinea on September 3^{rd} whereas our model predicted only 623. We thus used our model to fit a second Guinean outbreak with 200 cases on September 3^{rd}. By interpolation, the empirical case count on August 29^{th} (703±14 cases) was approximately 100 cases larger than the simulation prediction for that date (599±15 cases), and we used this excess to initialize a new outbreak in simulations (dotted line). The sum of the two outbreaks fit the data for Guinea very well when the community mixing parameter was *C*=51 for the second Guinean outbreak. Our model predicted that this second outbreak began with its first case in early July. In the absence of any further seeding events, the model prediction for the number of Ebola virus cases in Guinea is 2,083 (standard error 34 cases) on November 1^{st} and 2,990 (standard error 42 cases) on December 1^{st}, with a predicted linear growth dynamic of 30±1 cases per day.

Likewise, our model parameters that provided a reasonably good fit for the growth of the epidemic in Liberia over the earlier time period made a poor prediction for the growth over the next six weeks (Fig. 6b). Our model generally predicts a growth dynamic that is linear after a transient exponential period. We repeated the method described for Guinea for fitting the Ebola cases that were in excess of our modeling predictions and these results are summarized in Figs. 6b and 6c.

In Fig. 6b, a guiding exponential is drawn (dashed line) to show that the cumulative case data for Liberia are fit well by an exponential. Since early case data are quite noisy, we used the exponential fit to predict that the first 100 cases would have occurred on July 10^{th}. The predictions of simulations, in which the day of the 100^{th} case was defined to be July 10^{th}, are shown in Fig. 6b (solid gray line). Simulation predictions are indistinguishable from the exponential through mid-August, confirming that simulations predict a growth dynamic that is transiently exponential. After this transient exponential period, the predicted growth dynamic was linear (with a slope of approximately 27±1 cases per day).

We repeated the procedure described for fitting the Guinean case data, in which a new outbreak with 100 cases is simulated each time the number of reported cases exceeds the number of predicted cases by 100 cases. This method predicted three independent outbreaks, all with community mixing size *C*=51, beginning in late June, mid-July and late July. While the superposition of these simulation outbreaks resulted in a very good fit of the data, especially the transient growth dynamics, we note that equally good fits could certainly be achieved by other linear combinations of increasingly smaller outbreaks. In the absence of any further seeding events, the model prediction for the number of Ebola virus cases in Liberia is 6,097 (standard error 55 cases) on November 1^{st} and 8,529 (standard error 65 cases) on December 1^{st}, with a predicted linear growth dynamic of 81±1 cases per day.
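The seeding rule used for Guinea and Liberia can be sketched in Python. This is a simplified illustration rather than the authors' procedure. The function name `fit_superposition` is hypothetical, and `outbreak_curve(k)` is an assumed callable returning the mean cumulative cases of a newly seeded outbreak *k* days after it reaches the threshold of 100 cases. The sketch also ignores the cases that such an outbreak contributes before reaching the threshold.

```python
def fit_superposition(observed, base_prediction, outbreak_curve, threshold=100):
    """Add a new simulated outbreak whenever the reported cumulative cases
    exceed the current prediction by at least `threshold` cases.

    observed, base_prediction: cumulative cases per day (equal length).
    outbreak_curve(k): cumulative cases of a new outbreak k days after it
        has reached `threshold` cases, so outbreak_curve(0) == threshold.
    Returns (seed_days, total_prediction).
    """
    total = list(base_prediction)
    seed_days = []
    for day in range(len(observed)):
        if observed[day] - total[day] >= threshold:
            seed_days.append(day)
            # Superpose the new outbreak on every later day.
            for later in range(day, len(total)):
                total[later] += outbreak_curve(later - day)
    return seed_days, total


# Toy usage with made-up numbers (not case data): a flat base prediction
# and a second outbreak growing by 50 cases per day.
observed = [0, 50, 100, 150, 200, 250]
base = [0] * 6
seeds, total = fit_superposition(observed, base, lambda k: 100 + 50 * k)
print(seeds, total)  # [2] [0, 0, 100, 150, 200, 250]
```

As the text notes, such a decomposition is not unique: a different threshold or a different set of smaller outbreaks could reproduce the same cumulative curve.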

Generally, our model predicts a growth dynamic that is linear after a transient exponential growth period. In the context of our modeling framework, without the simulation of secondary outbreaks, the linear growth can be understood as the depletion of susceptibles within participating communities, so that new infections occur only with the exposure of households with overlapping communities. The slope of this increase will depend in a complex way on the propagation of saturation effects from infected sources and on the connectivity of individuals between adjacent communities.

Fitting the growth dynamic of the epidemic for each country over the five months March 22^{nd} – August 22^{nd} of the epidemic while varying only the community mixing size *C* provides a *proof of principle* that accounting for local effects such as saturation and heterogeneous transmission among contacts can account for the differences in the rate of growth of the epidemics in different countries, and that they enable us to view qualitatively different growth trends as different facets of a single epidemic.

**Confidence in and Interpretation of Model Parameters**

Our model had three parameters (*R*_{0H}, *R*_{0C}, *H*) which needed to be estimated to constrain the model and a fourth parameter (the community mixing size, *C*) which was varied to fit the different trends in case data over time for Guinea, Sierra Leone and Liberia. The reproduction numbers within households

For the parameter values (*R _{0H}=1.8, R_{0c}=0.55, H=16*), the model predicted community mixing sizes of

Our model provides a proof of principle that the differences in growth dynamics in different regions of West Africa can be explained by differences in community mixing sizes. This should not be interpreted as a prediction that there are not significantly different levels of epidemic control in the different regions. On the contrary, effective and ineffective epidemic controls can have significant effects, positively or negatively, respectively, on the community mixing size. In particular, similar to households ^{31}, hospitals can have an amplifying effect by encouraging transmission among patients and through health care workers ^{18}, or can limit community mixing by isolating patients. Indeed, while isolation may be modeled as decreasing the transmission rate between infected and susceptible individuals, it is arguably a more accurate description that the transmission rate per contact is fixed, but the population of potential contacts is greatly reduced. In contrast, measures such as frequent hand washing and chlorine stations reduce the transmission rate without affecting community mixing size. Since our model assumes constant transmission rates throughout the region, the effects of such measures would be (erroneously) averaged within apparent differences in the community mixing size.

We found that to model increases in the epidemic growth rate in the sixth month of the epidemic, we needed to model the spread of the epidemic into new communities. This might initially be interpreted as a failure of our model to predict the epidemic beyond the five months to which we fit our parameters. On the contrary, the relatively constant trends over those five months enabled us to determine reasonable parameters, while the subsequent spread of the epidemic into new communities, as in the sixth month, is not only intuitive but eventually inevitable in an as yet uncontrolled epidemic. Previous epidemiologists (^{35} and ^{23}) have reasoned that an epidemic over a large geographical region is best considered as the superposition of many smaller outbreaks, with transmission and saturation effects occurring at this local level, with occasional long-range interactions driving the spread across the region.

**The Importance of Community Mixing Size and Saturation Effects**

Our individual-based model accounting for saturation effects was able to fit early outbreak data for Guinea, Sierra Leone and Liberia by changing one community parameter, the community mixing size *C*, while holding other important epidemiological parameters constant. Potential saturation effects in the empirical data may be observed by the difference in the case data curves from exponential; Guinea had a linear growth trend in cases over the first five months of the epidemic, while Liberia was best fit by a simple exponential ^{6}. Heuristically, saturation can be understood by considering the extent to which limited community mixing results in overlapping (and redundant) chains of transmission as two infected families have a higher probability of being located within the same mixing community, and infecting families within the same community, than two randomly chosen families ^{15}.

Saturation effects are reflected in the model by differences between the effective reproductive number *R*_{e} and the basic reproductive number *R*_{0}, and especially by decreases in *R*_{e} over time. Without saturation effects, *R*_{e} and *R*_{0} will be equal; however, as saturation effects increase, *R*_{e} approaches 1. Our model predictions for *R*_{e} decreased to values close to 1, indicating significant saturation effects. In particular, saturation effects within households will reduce *R*_{e} towards 1, since the household reproduction number *R*_{0H}=1.8 is the main contribution to the basic reproduction number *R*_{0}=2.35 and most infected individuals within households will find that some members of their households are already exposed. Note that saturation effects cannot drive *R*_{e} below 1. (When saturation effects are maximized, everyone is infected and *R*_{e}=1.)
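As a rough consistency check, assuming the additive approximation for the basic reproductive number and the fitted values *R*_{0H}=1.8 and *R*_{0C}=0.55 reported in the Results, the arithmetic behind these statements is:

```latex
R_0 \approx R_{0H} + R_{0C} = 1.8 + 0.55 = 2.35,
\qquad
\frac{R_{0H}}{R_0} \approx \frac{1.8}{2.35} \approx 0.77 .
```

Under this approximation roughly three quarters of the transmission potential lies within households, which is why saturation at the household level dominates the decline of *R*_{e}.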

**Basic Reproductive Number R_{0}**

When an individual is infectious in an entirely naïve community, they will on average infect a number of people in their household and a number of people in their community, so that *R _{0 }≈ R_{0H} + R*

**Model Predictions for Limiting Exponential Growth of the Epidemic in Liberia**

Our model provides a proof-of-principle that sub-exponential growth is possible, and indeed explains the linear growth of the epidemic in Guinea over the first five months of the epidemic, if community mixing is limited. Epidemic control measures can have a variable effect on community mixing sizes. Our model results indicate that epidemic control measures can be further optimized by considering implementations that decrease community mixing. Optimal ethical and pragmatic epidemic control measures are beyond the scope of this manuscript. Instead, we generalize that the control measures considered will reduce transmission by decreasing transmission rates and/or decreasing community mixing rates (e.g., chlorine protocols, quarantines, isolation, etc.). The phase diagrams in Fig. 5 show the location of Liberia, Sierra Leone and Guinea in *C*–*R*_{0C} and *R*_{0H}–*R*_{0C} phase space (monotonically mapping to differences in community mixing size and community transmission rates), and movements within this space show the effect on the number of cases from 50 infectious individuals over a particular 100 day period.

**Differences in Ebola Virus Epidemic Growth Dynamics over Time and Geographic Region**

Our model results show that the spread of an epidemic due to the introduction of an infected individual within the community is not inevitable, since even for communities with relatively large community mixing sizes, there is a significant probability that the infection fails to spread from one infected individual. However, our model also shows that as the community mixing size increases, once the epidemic begins spreading, it becomes increasingly inevitable that the outbreak will be very large. A similar dependence of the distribution of outbreak sizes was observed in small-world networks ^{15}. Our model prediction that communities with small mixing sizes will have a distribution of smaller outbreaks is consistent with the historical observation of an outbreak in 1976 that the Ebola virus did not seem to spread well and no transmission chains longer than three were found in the outbreak ^{17}. A recent study estimated that the probability of a large outbreak of more than 1,000 cases in 1976 was 3% ^{36}. If community mixing sizes increase with historical time, which seems likely with increased population density and urbanization, then the probability of very large outbreaks increases with time and Guinea may be an example of a population that is in transition between a population where small outbreaks would occur versus large epidemics. As noted in ^{36}, these trends would be further exacerbated by concurrent increases in interactions between humans and animal carriers as humans spread into new animal habitats.

Focusing on the dynamics of the epidemic in the five months March 22^{nd} – August 22^{nd} might have suggested that Guinea and Liberia were quite different in some aspect, since the epidemic was not growing exponentially in Guinea but was growing exponentially in Liberia. However, an abrupt change in the growth dynamic of Guinea, and our model prediction that the Liberian growth may be best described as the superposition of several outbreaks staggered in time, suggest that the specific early dynamics of the outbreak may be probabilistic, depending on the characteristics of the community where the outbreak begins. As the epidemic spreads to new communities, the growth dynamics represent the sum of these contributions and are dominated by the fastest growing outbreaks. Our model predicts that in a single community outbreak, there is a transient period of exponential growth followed by a linear increase in cases. This reflects the wave-like spread of the infection and has been previously observed in small-world networks ^{15}. While our model predicts that each individual outbreak will saturate, the epidemic will remain exponential if the virus continues to seed in new communities. Likewise, it was observed that even a small number of long-range links in a small-world network results in a dramatic increase in the growth rate of the epidemic ^{15}.

**Model Scope, Limitations and Future Directions**

Our three-scale model is a model of intermediate complexity that accounts for heterogeneous transmission and, like more complex models such as small-world networks ^{15}, permits systematic variation of the extent of community mixing (modeled with one other network scale). The model results predict a household size 2-3 times larger than a typical household and thus this cluster likely represents a connected network of households and other groupings where close contacts would be expected. This indicates that a hierarchical network model might be more appropriate. Our network choice was preliminarily simplistic, and other types of networks, such as hierarchical and scale-free networks, are better descriptions of community connectivity in real-world populations. In particular, simulating the ‘seeding’ of the outbreak in new communities as a discrete event is only appropriate for a small number of events. Small-world networks with a heterogeneous distribution of edge lengths (many short-range interactions and a small number of long-range interactions) would be able to model seeding events probabilistically, without requiring explicit inputs for when these events occur. Modeling the Ebola virus epidemic in the context of different network formulations could verify that these described results are not dependent on specific choices of model implementation. Since our model parameters were not fully constrained, virtually any individual-based data, such as the average fraction of immediate family members that become ill, or information about the distribution of the number of infections resulting from infected individuals, can be used to constrain these parameters. 
Further, a network model with additional spatial information could be used to explore the effectiveness of various epidemic control strategies, as the spatial movement of the epidemic through new communities, and resurgence through previously exposed communities, will have a profound effect on the persistence of the epidemic ^{15}^{,}^{37}^{,}^{38}.

**Conclusions**

Our model demonstrates that sub-exponential growth is a possible long-term trend if community mixing is limited, and provides predictions on how changes in transmission rates will result in decreases in the growth of the epidemic. Our results suggest that limiting community mixing over this scale would be an important consideration while designing epidemic control strategies. Community mixing sizes consistent with model fits of the data are quite small, with 100 to 1000 individuals. It is especially important to prevent the seeding of the outbreak into new communities, as the diffusion of transmission chains into new communities is what maintains the exponential growth of the epidemic.