The early molecular epidemiology of the swine-origin A/H1N1 human influenza pandemic

Andrew Rambaut (University of Edinburgh, UK) and Eddie Holmes (The Pennsylvania State University, USA)

PLoS Currents: Influenza, ISSN 2157-3999. Public Library of Science, San Francisco, USA. Published 18 August 2009. doi:10.1371/currents.RRN1003 (ecurrents.RRN1003)

2019 Rambaut, Holmes, et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Swine-origin pandemic human influenza A virus (H1N1pdm) has spread rapidly around the world since its initial documentation in April 2009. Here we have updated initial estimates of the rate of molecular evolution and estimates of the time of origin of this virus in the human population using the large number of viral sequences made available as part of the public health response to this global pandemic. Currently sampled H1N1pdm sequences share a most recent common ancestor in the first 7 weeks of 2009 with the implication that the virus was transmitting cryptically for up to 3 months prior to recognition. A phylogenetic reconstruction of the data shows that the virus has been circling the globe extensively with multiple introductions into most geographical areas.

Introduction

H1N1pdm (also referred to as S-OIV) is a newly emergent human influenza A virus that is closely related to a number of currently circulating pig viruses in the 'classic North American' and 'Eurasian' swine influenza virus lineages [1][2]. Since the first reports of the virus in humans in April 2009, H1N1pdm has spread to 168 countries and overseas territories, with >177,000 reported cases [3]. To reveal the early molecular epidemiology of H1N1pdm, particularly its spatial patterning and evolutionary dynamics, we performed an evolutionary analysis of the available genome sequence data sampled globally. The aim of our study was to provide updated estimates of the rate of evolutionary change in H1N1pdm, its date of origin, and its growth rate in human populations.


Methods

H1N1pdm sequences were collated from the NCBI Influenza Database on 6th August 2009. These sequences were then filtered to produce a data set that met the following criteria: that the exact date (day) of collection was given, that both the hemagglutinin (HA) and neuraminidase (NA) gene sequences were available (any other sequenced genes were also included), and that the viruses had been isolated from humans. This resulted in a data set of 377 isolates. In a previous application of these approaches to sequences collected early in the outbreak [4], it was possible to trace epidemiologically-linked clusters and correct for this sampling bias explicitly by picking one isolate from each cluster. Given the large number of cases globally, this is no longer possible. Therefore, to reduce any effect of over-sampling of epidemiologically-linked isolates (such as those from New York State, USA), a further filter was applied in which, for each day of isolation, only one virus from a given location (country and state) was retained; this resulted in a total of 242 isolates. This final set included isolates from 23 different countries, with the majority (118) coming from the USA.
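The second filtering step (one virus per location per day of isolation) can be sketched as follows. This is a minimal illustration, not the authors' script: the record fields (`name`, `date`, `country`, `state`) and the example isolates are hypothetical, and because the text does not say which isolate was retained within each group, the sketch simply keeps the first one encountered.

```python
from datetime import date

def thin_by_day_and_location(isolates):
    """Keep one isolate per (collection date, country, state).

    Assumption: the first isolate seen in each group is the one kept;
    the source does not specify the selection rule.
    """
    seen = set()
    kept = []
    for iso in isolates:
        key = (iso["date"], iso["country"], iso["state"])
        if key not in seen:
            seen.add(key)
            kept.append(iso)
    return kept

# Hypothetical records, for illustration only.
isolates = [
    {"name": "example_1", "date": date(2009, 4, 25), "country": "USA", "state": "New York"},
    {"name": "example_2", "date": date(2009, 4, 25), "country": "USA", "state": "New York"},
    {"name": "example_3", "date": date(2009, 4, 25), "country": "USA", "state": "Texas"},
    {"name": "example_4", "date": date(2009, 4, 26), "country": "USA", "state": "New York"},
]

thinned = thin_by_day_and_location(isolates)
kept_names = [iso["name"] for iso in thinned]
```

Here `example_2` is dropped because it shares a date and location with `example_1`; the other three records differ in either date or state and are all retained.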

These sequences were then analysed using BEAST v1.5.0 [5], an analysis package that uses a Bayesian Markov chain Monte Carlo (MCMC) approach to sample time-structured evolutionary trees from their joint posterior probability distribution derived from a combination of molecular evolutionary and population genetic models. The data were analysed under an exponential-growth coalescent model as a prior on the tree, the HKY+gamma model of nucleotide substitution and a relaxed molecular clock [6]. Four independent runs of 10 million steps were performed, compared for convergence, and combined after removing a 10% burn-in from each.
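As a rough illustration of the final step, the sketch below discards the first 10% of each run as burn-in and pools what remains. It is an assumption-laden toy: each run is represented as a plain list of sampled values for one parameter, and the function name `combine_runs` is hypothetical rather than part of BEAST.

```python
def combine_runs(runs, burnin_fraction=0.10):
    """Drop the leading burn-in samples from each run, then pool the rest."""
    combined = []
    for samples in runs:
        n_discard = int(len(samples) * burnin_fraction)
        combined.extend(samples[n_discard:])
    return combined

# Four toy "runs" of 20 logged samples each (values are placeholders).
runs = [list(range(20)) for _ in range(4)]
pooled = combine_runs(runs)

# Each run loses its first 2 samples, leaving 18 per run.
n_pooled = len(pooled)
```

Pooling is only meaningful once the runs have been checked for convergence, for example by confirming that their marginal parameter estimates agree.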

Results & Discussion

The results of the Bayesian MCMC analysis are summarized in Table 1. The analysis of this much larger sample of H1N1pdm virus confirms that the date of the most recent common ancestor (MRCA) of circulating human lineages pre-dates the first sampled case by approximately 2 months. Assuming that a single cross-species transfer from pigs to humans gave rise to all the sampled H1N1pdm diversity, the date of the MRCA marks the latest time at which this event could have occurred. The estimate presented here is considerably more precise than that published previously [2], reflecting the longer period of sampling and the greater number of viruses in this study (we estimate a credible interval spanning approximately the first 7 weeks of 2009). In sum, the virus may have been circulating in humans for up to about 3 months prior to initial characterization (cryptic transmission).

The substitution rate for H1N1pdm is significantly higher than that estimated by Smith et al. [2] for each of the genomic segments for a panel of related swine viruses (at ~3 x 10^-3 substitutions/site/year). This higher rate in H1N1pdm was noted by Smith et al., and as an explanation it was suggested that the rapid epidemic spread and short timescale had resulted in a proportion of mildly-deleterious mutations being maintained in the population. It will be possible to test this hypothesis when H1N1pdm sequences are sampled over a longer time-period.

Table 1. Marginal posterior estimates of model parameters.

| Parameter | Mean estimate | Bayesian credible interval |
| Rate of molecular evolution (x10^-3 subs/site/year) | 5.02 | (4.17, 5.95) |
| Date of most recent common ancestor | 27-Jan-2009 | (30-Dec-2008, 22-Feb-2009) |
| Exponential growth rate | 14.92 | (10.44, 19.61) |
| Doubling time (days) | 17.0 | (12.9, 24.2) |
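The doubling times in Table 1 follow from the exponential growth rates through the standard relation t_d = ln(2) / r. The sketch below checks that arithmetic. It assumes the growth rate is expressed per year and uses a 365-day year; the table does not state the unit, but the reported values agree under that assumption.

```python
import math

DAYS_PER_YEAR = 365.0  # assumption: growth rate is per year

def doubling_time_days(growth_rate_per_year):
    """Doubling time under exponential growth: ln(2) / r, converted to days."""
    return math.log(2) / growth_rate_per_year * DAYS_PER_YEAR

mean_td = round(doubling_time_days(14.92), 1)
# A faster growth rate gives a shorter doubling time, so the interval bounds swap.
lower_td = round(doubling_time_days(19.61), 1)
upper_td = round(doubling_time_days(10.44), 1)
```

With these inputs the function gives 17.0 days for the mean rate and 12.9 and 24.2 days for the interval bounds, matching the table.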

Figure 1 shows the maximum clade credibility tree (the tree sampled from the MCMC with the highest product of individual clade probabilities). This tree is intended as a representative tree from the posterior sample; however, the ages of each node in the tree are set to the mean across the entire sample.
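The selection rule described above can be sketched on a toy posterior sample. This is an illustrative simplification, not the procedure as implemented in any particular software: each sampled tree is reduced to the set of its non-trivial clades (each clade a `frozenset` of tip names), clade probabilities are taken as frequencies in the sample, and the four tips A to D are hypothetical.

```python
import math
from collections import Counter

def maximum_clade_credibility(trees):
    """Return the index of the tree with the highest product of clade
    probabilities, plus the clade counts across the sample."""
    n = len(trees)
    counts = Counter(clade for tree in trees for clade in tree)

    def log_score(tree):
        # Sum of logs is the log of the product of clade probabilities.
        return sum(math.log(counts[clade] / n) for clade in tree)

    best = max(range(n), key=lambda i: log_score(trees[i]))
    return best, counts

AB, CD = frozenset("AB"), frozenset("CD")
ABC, AC = frozenset("ABC"), frozenset("AC")

# Toy sample of five rooted trees on tips A, B, C, D.
trees = [
    {AB, CD},
    {AB, CD},
    {AB, CD},
    {AB, ABC},
    {AC, ABC},
]

best, counts = maximum_clade_credibility(trees)
```

In this sample the clade {A, B} appears in 4 of 5 trees and {C, D} in 3 of 5, so the first tree scores 0.8 x 0.6 = 0.48 and is selected. Setting node ages to their mean across the sample is a separate annotation step not shown here.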

In blue is the marginal posterior probability density of the time of the most recent common ancestors of the sampled lineages (95% credible interval shown by darker blue). Clades are labelled with their posterior probability where greater than 0.5. Lineages are coloured using a parsimonious reconstruction based on the locations of the sampled viruses.
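The text does not say which parsimony method was used to colour the lineages. As one common possibility, the sketch below shows the bottom-up pass of Fitch parsimony on a tree written as nested tuples. The tip names and locations are hypothetical, and the top-down pass that assigns a single state to each internal node is omitted.

```python
def fitch_sets(node, tip_locations):
    """Bottom-up Fitch pass on a nested-tuple binary tree.

    Returns (candidate location set at this node, minimum number of
    location changes required below it).
    """
    if isinstance(node, str):  # a tip, identified by its name
        return {tip_locations[node]}, 0
    left, right = node
    left_set, left_changes = fitch_sets(left, tip_locations)
    right_set, right_changes = fitch_sets(right, tip_locations)
    shared = left_set & right_set
    if shared:
        return shared, left_changes + right_changes
    # No shared location: take the union and count one change.
    return left_set | right_set, left_changes + right_changes + 1

# Hypothetical four-tip tree and sampling locations.
tree = (("t1", "t2"), ("t3", "t4"))
tip_locations = {"t1": "USA", "t2": "USA", "t3": "Mexico", "t4": "USA"}

root_set, n_changes = fitch_sets(tree, tip_locations)
```

For this toy tree the root is reconstructed as USA with a single location change, which corresponds to one inferred movement between locations.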


The color-coded branches highlight the rapid spatial diffusion of H1N1pdm, with multiple entries into countries from Asia and Europe. Such spatial mixing is typical for influenza A virus [7] and suggests that H1N1pdm exhibits similar spatial dynamics to those of seasonal influenza. It is also notable that multiple lineages of H1N1pdm are circulating in a single geographical region (the United States being a clear example), providing the raw material for intra-subtype reassortment.

The population growth rates and epidemic doubling times inferred here are similar to those estimated previously for H1N1pdm [4] and suggest that the virus will continue to spread globally. However, it is difficult to compare these coalescent-based estimates of epidemiological dynamics with those of seasonal influenza viruses, as multiple introductions into any locality in any season [8][9] mean that equivalent point-source outbreaks are rarely observed.


Thanks to Gytis Dudas for help with collating the sequence data.

Funding information

AR was funded by The Royal Society of London and the Interdisciplinary Centre for Human and Avian Influenza Research (ICHAIR). ECH was funded by NIH grant R01 GM080533. Both AR and ECH thank the Fogarty International Center, NIH, for continued support.

Competing interests

The authors have declared that no competing interests exist.

References

1. Novel Swine-Origin Influenza A (H1N1) Virus Investigation Team, Dawood FS, Jain S, Finelli L, Shaw MW, Lindstrom S, Garten RJ, Gubareva LV, Xu X, Bridges CB, Uyeki TM. Emergence of a novel swine-origin influenza A (H1N1) virus in humans. N Engl J Med. 2009 Jun 18;360(25):2605-15. Epub 2009 May 7. Erratum in: N Engl J Med. 2009 Jul 2;361(1):102.
2. Smith GJ, Vijaykrishna D, Bahl J, Lycett SJ, Worobey M, Pybus OG, Ma SK, Cheung CL, Raghwani J, Bhatt S, Peiris JS, Guan Y, Rambaut A. Origins and evolutionary genomics of the 2009 swine-origin H1N1 influenza A epidemic. Nature. 2009 Jun 25;459(7250):1122-5.
3. World Health Organization (2009) Pandemic (H1N1) 2009 - update 61. August 12th 2009.
4. Fraser C, Donnelly CA, Cauchemez S, Hanage WP, Van Kerkhove MD, Hollingsworth TD, Griffin J, Baggaley RF, Jenkins HE, Lyons EJ, Jombart T, Hinsley WR, Grassly NC, Balloux F, Ghani AC, Ferguson NM, Rambaut A, Pybus OG, Lopez-Gatell H, Alpuche-Aranda CM, Chapela IB, Zavala EP, Guevara DM, Checchi F, Garcia E, Hugonnet S, Roth C; WHO Rapid Pandemic Assessment Collaboration. Pandemic potential of a strain of influenza A (H1N1): early findings. Science. 2009 Jun 19;324(5934):1557-61. Epub 2009 May 11.
5. Drummond AJ, Rambaut A. BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol Biol. 2007 Nov 8;7:214.
6. Drummond AJ, Ho SY, Phillips MJ, Rambaut A. Relaxed phylogenetics and dating with confidence. PLoS Biol. 2006 May;4(5):e88. Epub 2006 Mar 14.
7. Nelson MI, Simonsen L, Viboud C, Miller MA, Holmes EC. Phylogenetic analysis reveals the global migration of seasonal influenza A viruses. PLoS Pathog. 2007 Sep 14;3(9):1220-8.
8. Holmes EC, Ghedin E, Miller N, Taylor J, Bao Y, St George K, Grenfell BT, Salzberg SL, Fraser CM, Lipman DJ, Taubenberger JK. Whole-genome analysis of human influenza A virus reveals multiple persistent lineages and reassortment among recent H3N2 viruses. PLoS Biol. 2005 Sep;3(9):e300. Epub 2005 Jul 26.
9. Rambaut A, Pybus OG, Nelson MI, Viboud C, Taubenberger JK, Holmes EC. The genomic and epidemiological dynamics of human influenza A virus. Nature. 2008 May 29;453(7195):615-9. Epub 2008 Apr 16.