The Tree of Life and a New Classification of Bony Fishes

·

The tree of life of fishes is in a state of flux because we still lack a comprehensive phylogeny that includes all major groups. The situation is most critical for a large clade of spiny-finned fishes, traditionally referred to as percomorphs, whose uncertain relationships have plagued ichthyologists for over a century. Most of what we know about the higher-level relationships among fish lineages has been based on morphology, but a rapid influx of molecular studies is changing many established systematic concepts. We report a comprehensive molecular phylogeny for bony fishes that includes representatives of all major lineages. DNA sequence data for 21 molecular markers (one mitochondrial and 20 nuclear genes) were collected for 1410 bony fish taxa, plus four tetrapod species and two chondrichthyan outgroups (total 1416 terminals). Bony fish diversity is represented by 1093 genera, 369 families, and all traditionally recognized orders. The maximum likelihood tree provides unprecedented resolution and high bootstrap support for most backbone nodes, defining for the first time a global phylogeny of fishes. The general structure of the tree is in agreement with expectations from previous morphological and molecular studies, but significant new clades arise. Most interestingly, the high degree of uncertainty among percomorphs is now resolved into nine well-supported supraordinal groups. The order Perciformes, considered by many a polyphyletic taxonomic waste basket, is defined for the first time as a monophyletic group in the global phylogeny. A new classification that reflects our phylogenetic hypothesis is proposed to facilitate communication about the newly found structure of the tree of life of fishes. Finally, the molecular phylogeny is calibrated using 60 fossil constraints to produce a comprehensive time tree.
The new time-calibrated phylogeny will provide the basis for and stimulate new comparative studies to better understand the evolution of the amazing diversity of fishes.

Multi-locus phylogenetic analysis reveals the pattern and tempo of bony fish evolution

·

Over half of all vertebrates are “fishes”, which exhibit enormous diversity in morphology, physiology, behavior, reproductive biology, and ecology. Investigation of fundamental areas of vertebrate biology depends critically on a robust phylogeny of fishes, yet evolutionary relationships among the major actinopterygian and sarcopterygian lineages have not been conclusively resolved. Although a consensus phylogeny of teleosts has been emerging recently, it has been based on analyses of various subsets of actinopterygian taxa, not on a full sample of all bony fishes. Here we conducted a comprehensive phylogenetic study on a broad taxonomic sample of 61 actinopterygian and sarcopterygian lineages (with a chondrichthyan outgroup) using a molecular data set of 21 independent loci. These data yielded a resolved phylogenetic hypothesis for extant Osteichthyes, including 1) reciprocally monophyletic Sarcopterygii and Actinopterygii, as currently understood, with polypteriforms as the first diverging lineage within Actinopterygii; 2) a monophyletic group containing gars and bowfin (= Holostei) as sister group to teleosts; and 3) the earliest diverging lineage among teleosts being Elopomorpha, rather than Osteoglossomorpha. Relaxed-clock dating analysis employing a set of 24 newly applied fossil calibrations reveals divergence times that are more consistent with paleontological estimates than those of previous studies. Establishing a new phylogenetic pattern with accurate divergence dates for bony fishes illustrates several areas where the fossil record is incomplete and provides critical new insights on diversification of this important vertebrate group.

Phylogenetic Analysis of Six-Domain Multi-Copper Blue Proteins

·

Multicopper blue proteins, composed of several repetitive copper-binding domains similar to one-domain cupredoxin-like proteins, are found in almost all organisms. They are classified into three groups based on their two-, three- or six-domain organization. We found orthologs of chordate six-domain copper-binding proteins in animals, plants, bacteria and archaea. The phylogenetic analysis of 183 multicopper blue proteins and a comparison of their copper-binding sites suggest that all modern six-domain blue proteins originated from a common ancestral six-domain protein through gene duplication and the loss of copper-binding sites caused by amino acid substitutions.

The Ideas Lab Concept, Assembling the Tree of Life, and AVAToL

·

In August 2011, a week-long NSF-sponsored workshop focusing on the Tree of Life (ToL) took place in Lake Placid, New York. This workshop, called AVAToL (Assembling, Visualizing, and Analyzing the Tree of Life), was the first application of NSF’s Ideas Lab concept to systematics. In this article we outline the history and motivation for the Ideas Lab approach and its application to the ToL, explain the nuts and bolts of the Ideas Lab process, and look to the potential contributions of AVAToL-funded projects to help enable the future of ToL and, more broadly, comparative biological research.

An Algorithm for Calculating the Probability of Classes of Data Patterns on a Genealogy

·

Felsenstein’s pruning algorithm allows one to calculate the probability of any particular data pattern arising on a phylogeny given a model of character evolution. Here we present a similar dynamic programming algorithm. Our algorithm treats the tree and model as known. The algorithm makes it feasible to calculate the probability that a randomly selected character will be a member of a particular class of character patterns. Specifically, we are interested in binning patterns by the number of parsimony steps and the set of states observed at the tips of the tree. This algorithm was developed to expand the range of data set sizes that can be used with Waddell et al.’s marginal testing approach for assessing the adequacy of a model. The algorithms introduced can also be used in likelihood calculations that correct for ascertainment biases. For example, Lewis introduced an Mkv model that corrects for the lack of constant sites. The probability of a constant pattern arising can be calculated using the algorithm that we present, or by enumerating all possible constant patterns and calculating the probability of each one. Because the number of constant data patterns is small, both methods are efficient. However, elaborations of the Mkv model (such as those in Nylander et al.) require calculating the probability of parsimony-uninformative patterns arising. For large trees and characters with many possible character states, the number of possible parsimony-uninformative patterns is immense. In these cases, the algorithms introduced here will be more efficient. The algorithm has been implemented in open source software written in C++.
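
To make the enumeration route mentioned in this abstract concrete, the following is a minimal Python sketch of Felsenstein's pruning algorithm under a k-state Mk model, used to sum the probabilities of the k constant patterns and to condition a pattern's probability on being variable (the Mkv correction). It is not the authors' new algorithm, which bins patterns by parsimony steps and observed state sets; the tree encoding, the function names, the example tree and its branch lengths are all assumptions made for illustration.

```python
import math

def mk_transition(k, t):
    """Mk model: (P[same state], P[one specific different state]) after branch length t.
    Assumes equal rates and t measured in expected changes per character."""
    e = math.exp(-k * t / (k - 1))
    return 1.0 / k + (k - 1) / k * e, 1.0 / k - e / k

def partial_likelihoods(node, pattern, k):
    """Felsenstein pruning: L[s] = P(tip data below node | node is in state s).
    A tip is a string (its name); an internal node is a list of (child, branch_length)."""
    if isinstance(node, str):
        return [1.0 if s == pattern[node] else 0.0 for s in range(k)]
    partial = [1.0] * k
    for child, brlen in node:
        same, diff = mk_transition(k, brlen)
        child_partial = partial_likelihoods(child, pattern, k)
        total = sum(child_partial)
        for s in range(k):
            # Sum over child states x of P(s -> x) * child_partial[x]
            partial[s] *= same * child_partial[s] + diff * (total - child_partial[s])
    return partial

def pattern_probability(tree, pattern, k):
    """Probability of one tip pattern, with uniform state frequencies at the root."""
    return sum(partial_likelihoods(tree, pattern, k)) / k

def tip_names(node):
    if isinstance(node, str):
        return [node]
    return [name for child, _ in node for name in tip_names(child)]

def constant_probability(tree, k):
    """Enumerate the k constant patterns and add up their probabilities."""
    tips = tip_names(tree)
    return sum(pattern_probability(tree, {t: s for t in tips}, k) for s in range(k))

def mkv_probability(tree, pattern, k):
    """Probability of a variable pattern conditional on the character being variable."""
    return pattern_probability(tree, pattern, k) / (1.0 - constant_probability(tree, k))

# Hypothetical four-taxon tree ((A:0.1,B:0.2):0.05,C:0.3,D:0.15) with binary characters.
tree = [([("A", 0.1), ("B", 0.2)], 0.05), ("C", 0.3), ("D", 0.15)]
print(constant_probability(tree, 2))
print(mkv_probability(tree, {"A": 0, "B": 1, "C": 0, "D": 0}, 2))
```

Enumeration stays cheap here because there are only k constant patterns, one pruning pass each. The same approach would not scale to the parsimony-uninformative class, whose size grows quickly with the number of tips and states, which is the case the abstract says its algorithm targets.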

Standard maximum likelihood analyses of alignments with gaps can be statistically inconsistent

·

Background
Most statistical methods for phylogenetic estimation in use today treat a gap (generally representing an insertion or deletion, i.e., indel) within the input sequence alignment as missing data. However, the statistical properties of this treatment of indels have not been fully investigated.

Results
We prove that maximum likelihood phylogeny estimation, treating indels as missing data, can be statistically inconsistent for a general (and rather simple) model of sequence evolution, even when given the true alignment. Therefore, accurate phylogeny estimation cannot be guaranteed for maximum likelihood analyses, even given arbitrarily long sequences, when indels are present and treated as missing data.

Conclusions
Our result shows that the standard statistical techniques used to estimate phylogenies from sequence alignments may have unfavorable statistical properties, even when the sequence alignment is accurate and the assumed substitution model matches the generation model. This suggests that the recent research focus on developing statistical methods that treat indel events properly is an important direction for phylogeny estimation.

Phylogenetic discordance of human and canine carcinoembryonic antigen (CEA, CEACAM) families, but striking identity of the CEA receptors will impact comparative oncology studies.

·

Comparative oncology aims at speeding up developments for both human and companion animal cancer patients. Following this line, carcinoembryonic antigen (CEA, CEACAM5) could be a therapeutic target not only for human but also for canine (Canis lupus familiaris; dog) patients. CEACAM5 interacts with CEA-receptor (CEAR) in the cytoplasm of human cancer cells. Our aim was, therefore, to phylogenetically verify the antigenic relationship of CEACAM molecules and CEAR in human and canine cancer.
Anti-human CEACAM5 antibody Col-1, previously applied for cancer diagnosis in dogs, reacted immunohistochemically with 23 out of 30 canine mammary cancer samples. In immunoblot analyses, Col-1 specifically detected human CEACAM5 at 180 kDa in human colon cancer cells HT29, and the canine antigen at 60, 120, or 180 kDa in CF33 and CF41 mammary carcinoma cells as well as in spontaneous mammary tumors. Although, according to the phylogeny, canine CEACAM1 molecules should be most closely related to human CEACAM5, Col-1 did not react with canine CEACAM1, -23, -24, -25, -28 or -30 transfected into canine TLM-1 cells. By flow cytometry, the Col-1 target molecule was localized intracellularly in canine CF33 and CF41 cells, in contrast to membranous and cytoplasmic expression of human CEACAM5 in HT29. Col-1 incubation had no effect on either canine or human cancer cell proliferation. Yet, Col-1 treatment decreased AKT-phosphorylation in canine CF33 cells, possibly suggestive of anti-apoptotic function, whereas Col-1 increased AKT-phosphorylation in human HT29 cells. We further report a 99% amino acid similarity of human and canine CEA receptor (CEAR) within the phylogenetic tree. CEAR could be detected in four canine cancer cell lines by immunoblot and intracellularly in 10 out of 10 canine mammary cancer specimens by immunohistochemistry. Whether the specific canine Col-1 target molecule may act, as a functional analogue of human CEACAM5, as a ligand for canine CEAR remains to be determined. This study demonstrates the limitations of comparative oncology due to the complex functional evolution of the different CEACAM molecules in humans versus dogs. In contrast, CEAR may be a comprehensive interspecies target for novel cancer therapeutics.

Neotropical and North American Vaccinioideae (Ericaceae) share their mycorrhizal Sebacinales – an indication for concerted migration?

·

Neotropical Vaccinioideae (Ericaceae) are evolutionarily rather young and presumably of Northern Hemisphere origin. Vaccinioideae are highly dependent on their mycorrhizal symbionts and Sebacinales (basidiomycetes) were previously found to be the dominant mycobionts of Andean Clade Vaccinioideae (Neotropical Vaccinieae). We were interested to see whether the North American Vaccinioideae reached the Neotropics with their mycobionts or whether they acquired new, local Sebacinales.

We investigated Sebacinales of 58 individuals of Vaccinioideae from Ecuador, Panama and North America to examine whether mycobionts of each region are distantly or closely related.
We isolated the ITS region of the nuclear ribosomal DNA in order to infer a molecular phylogeny of Sebacinales and to determine Molecular Operational Taxonomic Units (MOTUs). MOTU delimitation was based on a 3% threshold of ITS variability and conducted with complete linkage clustering. The analyses revealed that most Sebacinales from Ecuador, Panama and North America are closely related and that two MOTUs out of 33 have a distribution ranging from the Neotropics to the Pacific Northwest of North America. The data suggest that Neotropical and temperate Vaccinioideae of North America share their Sebacinales communities and that plants and fungi migrated together.
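
As an illustration of the MOTU delimitation step, here is a minimal Python sketch of complete linkage clustering with a 3% distance cutoff. It is not the software used in the study: the uncorrected p-distance, the rule of merging while the largest pairwise distance between two clusters is at most the threshold, the function names and the toy sequences are all assumptions made for the example.

```python
def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.
    Columns with a gap in either sequence are skipped (an assumption);
    the sequences must share at least one ungapped column."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    return sum(x != y for x, y in pairs) / len(pairs)

def complete_linkage_motus(seqs, threshold=0.03):
    """Group aligned sequences into MOTUs by complete linkage clustering.
    Two clusters are merged only if their MOST distant pair of members
    is within the threshold, so every pair inside a MOTU is within it."""
    names = list(seqs)
    dist = {frozenset((i, j)): p_distance(seqs[i], seqs[j])
            for i in names for j in names if i < j}
    clusters = [[n] for n in names]

    def linkage(c1, c2):
        return max(dist[frozenset((a, b))] for a in c1 for b in c2)

    while len(clusters) > 1:
        # Find the closest pair of clusters under complete linkage.
        d, i, j = min((linkage(clusters[i], clusters[j]), i, j)
                      for i in range(len(clusters))
                      for j in range(i + 1, len(clusters)))
        if d > threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Toy aligned sequences, 100 sites each (hypothetical, not real ITS data).
example = {
    "s1": "A" * 100,
    "s2": "A" * 98 + "CC",    # 2% from s1
    "s3": "A" * 96 + "CCCC",  # 2% from s2 but 4% from s1
    "s4": "G" * 100,          # unrelated
}
print(complete_linkage_motus(example))
```

In the toy data, s3 is within 3% of s2 but not of s1, so complete linkage keeps it out of the s1/s2 cluster; single linkage would instead chain all three together. That behavior is why complete linkage yields MOTUs whose internal variability never exceeds the chosen threshold.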

Cocos: Constructing multi-domain protein phylogenies

·

Phylogenies of multi-domain proteins have to incorporate macro-evolutionary events, which dramatically increases the complexity of their construction.
We present an application to infer ancestral multi-domain proteins given a species tree and domain phylogenies. As the individual domain phylogenies are often incongruent, we provide diagnostics for the identification and reconciliation of implausible topologies. We implement and extend an algorithmic approach suggested by Behzadi and Vingron (2006).

Resolving the phylogenetic and taxonomic relationship of Xanthomonas and Stenotrophomonas strains using complete rpoB gene sequence

·

The phytopathogenic genus Xanthomonas comprises numerous species and pathovars described primarily on the basis of their host and tissue specificities. Stenotrophomonas maltophilia, which is non-phytopathogenic and taxonomically closely related to Xanthomonas, has undergone several reclassifications, from Pseudomonas to Xanthomonas and finally to Stenotrophomonas. In this study, we have investigated the phylogenetic and taxonomic status of these members using the complete RNA polymerase beta-subunit (rpoB) gene sequences available from their sequenced genomes. Not only did we obtain a phylogenetic tree for xanthomonads, but rpoB gene sequence information has also resolved the taxonomic relationship of X. axonopodis pathovars, X. albilineans and other Xanthomonas strains, with the most marked evidence being that Stenotrophomonas is synonymous with Xanthomonas. This study has revealed the power and potential of the complete rpoB gene sequence in taxonomic, phylogenetic and evolutionary studies on the Xanthomonas and Stenotrophomonas generic complex.