Protist Genomics

Deep Eukaryotic Phylogeny

Modeling Protein and Genomic Evolution

The development of technology for sequencing genes has revolutionized evolutionary biology. By comparing the gene sequences of distantly related organisms, we are now able to reconstruct their genealogy (phylogeny). A "tree of life" derived from comparisons of the small subunit ribosomal RNA (SSU rRNA) genes indicates that the living world can be divided into three major groups: the Eukaryota (with nucleated cells), the Archaebacteria and the Eubacteria (both with non-nucleated cells).

Eukaryotes include the multicellular kingdoms of animals, plants and fungi, as well as a myriad of single-celled organisms, the protists. The SSU rRNA gene tree of eukaryotes indicates that several protist groups with simple cell structures diverged first, followed by a series of branches leading to a vast radiation of multicellular and protistan organisms. However, this picture of early evolution is increasingly challenged by new data. Trees of genes encoding proteins conflict with the rRNA-based phylogeny in reconstructing the deepest relationships amongst eukaryotes. Moreover, recent studies indicate that the deep structure of the SSU rRNA and protein trees of eukaryotes could be dominated by artifacts arising from inadequacies in our tree-building methods. These problems hint that single-gene trees of eukaryotes may be inadequate. In this proposal we will address these problems in two ways:

First, we will obtain the sequences of three additional protein genes from a variety of well-described and newly discovered protists that are potentially deeply branching. By combining these sequences with several other gene sequences from the same organisms, we can assemble a large multi-gene dataset containing much more information than was previously available. Sophisticated analyses of this information-rich dataset will then be performed to derive a more robust tree of eukaryotes.

Second, we will systematically investigate to what degree the deep structure of molecular trees of eukaryotes (and conflict between these trees) is due to tree-building artifacts. To accomplish this, we will develop and apply new computational methods to identify, and correct for, known sources of error in individual and combined gene phylogenies of eukaryotes.
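As a rough illustration of the multi-gene dataset described in the first aim, the sketch below joins several per-gene alignments into a single combined alignment, padding any organism that lacks a given gene with gap characters. It is only a minimal sketch under stated assumptions: the function name `concatenate_alignments`, the dictionary representation of an alignment, and the toy taxa and sequences are all hypothetical and are not taken from the proposal, which does not specify how the dataset will be assembled.

```python
from typing import Dict, List

# Hypothetical representation: one alignment maps taxon name -> aligned sequence.
Alignment = Dict[str, str]


def concatenate_alignments(genes: List[Alignment], missing: str = "-") -> Alignment:
    """Join per-gene alignments end to end, padding taxa that lack a gene."""
    # Every taxon seen in at least one gene gets a row in the combined dataset.
    taxa = sorted({taxon for gene in genes for taxon in gene})
    combined = {taxon: [] for taxon in taxa}

    for gene in genes:
        # All sequences within one aligned gene must share the same length.
        lengths = {len(seq) for seq in gene.values()}
        if len(lengths) != 1:
            raise ValueError("sequences within one gene alignment must have equal length")
        width = lengths.pop()
        for taxon in taxa:
            # A taxon missing from this gene is filled with gap characters.
            combined[taxon].append(gene.get(taxon, missing * width))

    return {taxon: "".join(parts) for taxon, parts in combined.items()}


# Toy, made-up data purely for illustration (not real sequences).
gene_a = {"Giardia": "MKV", "Trichomonas": "MRV", "Homo": "MKI"}
gene_b = {"Giardia": "AT", "Homo": "AS"}

supermatrix = concatenate_alignments([gene_a, gene_b])
for taxon, sequence in supermatrix.items():
    print(taxon, sequence)
```

Padding with gaps keeps every organism in the combined dataset even when one of its genes has not been sequenced, at the cost of introducing missing data that the subsequent phylogenetic analysis has to tolerate.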

These studies will lead to a better understanding of the early history of life on Earth, the causes of conflicts between deep molecular phylogenies, and the forces that influence molecular evolution on the deepest level.