only two domains, not three: changing views on the tree of life

issue: what is life?

10 may 2016 article


in 1857, charles darwin sent a letter to thomas huxley in which he wrote: “the time will come i believe, though i shall not live to see it, when we shall have very fairly true genealogical trees of each great kingdom of nature.”

a tree for all of life – the three-domains tree

genealogical or evolutionary trees show the relationships between organisms based upon common ancestry, like the family trees that we use to investigate our own parentage. for many biologists, darwin’s dream was realised on the grandest scale when, in 1990, carl woese and colleagues proposed that all cellular life could be placed into one of three separate fundamental groups or ‘domains’ – the bacteria, the archaea and the eukarya – based upon comparisons of small subunit (ssu) ribosomal rna (rrna) sequences. according to the ‘three-domains tree’, the eukarya and archaea are more closely related to each other than they are to the bacteria (fig. 1). hence, in this tree our closest cousins are the archaea, a group of micro-organisms once thought to be restricted to anaerobic and other hostile habitats like hot springs and thermal vents in the deep ocean. however, although the three-domains tree of life has dominated debate about how to organise life’s diversity at the highest level for the past 20 years or so, there is now increasing evidence that it is not the best-supported hypothesis for the evolutionary relationship between eukaryotes and archaea.

data and evolutionary models – how are trees made?

although morphology has long been used to classify animals and plants, archaea and bacteria – which between them comprise much of earth’s genetic and biochemical diversity – lack the wealth of morphological characters needed to reconstruct their relationships to each other or to eukaryotes. in 1965, double nobel prize winner linus pauling and his collaborator emile zuckerkandl proposed that the sequences of the dna, rna and proteins found in all cells were “documents of evolutionary history” and hence were the best source of data for making global evolutionary trees. the basic procedure is to collect sequences from different species and to use a mathematical model of how we think sequences evolve to infer the evolutionary relationships between them, typically expressed in a tree diagram. because these models are central to the process of making trees, it is important to appreciate that all of them use simplifying assumptions to make the analyses computationally tractable, and that they are not accurate representations of how sequences really evolve: in the words of statistician george box, “all models are wrong, but some are useful.” in the early days of computational molecular evolution, the models used were very simple because the computers of the time were so slow. it was during this period that the three-domains tree first came to prominence, so it is interesting to ask how the tree has fared as both computers and models have improved.
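
as a rough illustration of what “a mathematical model of how we think sequences evolve” means in practice, the sketch below uses the jukes-cantor (1969) model, one of the simplest models of the kind used in the early days. it assumes that every nucleotide changes into every other at the same rate, and it converts the observed proportion of differing sites between two aligned sequences into an estimated number of substitutions per site. the sequences and function names are hypothetical and chosen only for illustration; this is not the method used in the analyses discussed in this article.

```python
import math


def p_distance(seq_a, seq_b):
    """Proportion of aligned, gap-free sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Skip any column where either sequence has a gap character.
    pairs = [(x, y) for x, y in zip(seq_a, seq_b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no gap-free sites to compare")
    return sum(x != y for x, y in pairs) / len(pairs)


def jc69_distance(seq_a, seq_b):
    """Jukes-Cantor estimate of substitutions per site.

    The model assumes equal base frequencies and equal rates for all
    substitutions, which is exactly the kind of simplifying assumption
    the text warns about.
    """
    p = p_distance(seq_a, seq_b)
    if p >= 0.75:
        # Sequences are as different as random ones: distance is undefined.
        return math.inf
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


if __name__ == "__main__":
    # Hypothetical toy alignment, not real ssu rrna data.
    alignment = {
        "taxon_a": "ACGTACGTAC",
        "taxon_b": "ACGTACGTAA",
        "taxon_c": "ACGAACCTAA",
    }
    names = sorted(alignment)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            d = jc69_distance(alignment[first], alignment[second])
            print(first, second, round(d, 4))
```

the corrected distance is always at least as large as the raw proportion of differences, because the model allows for sites that have changed more than once. a matrix of such distances can then be passed to a tree-building method; more realistic models change how the correction is calculated, which is why the choice of model can change the tree.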

most of the models traditionally used to make the three-domains tree have assumed that the same sequences in all organisms evolve in much the same way, but this is not supported by real sequence data. for example, the nucleotide composition of ssu rrna sequences, which are by far the most widely used molecules for making broad-scale evolutionary trees, varies dramatically in different species. this provides strong evidence that the ways in which ssu rrna sequences have evolved in different species have changed over time. in tree building, using a model of sequence evolution that does not fit the data being analysed can often produce an incorrect tree with strong support. recent work now suggests that this can explain why past analyses have recovered the ‘three-domains’ tree.
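
the variation in nucleotide composition described above can be checked with a few lines of code. the sketch below computes the g+c content of each sequence and a simple chi-square statistic for whether all sequences share the same base composition. the function names and example sequences are hypothetical, and the statistic is only a rough screen: it treats sites as independent and ignores the fact that related species share history, so it is no substitute for fitting a model that allows composition to vary across the tree.

```python
from collections import Counter

BASES = "ACGU"


def base_counts(seq):
    """Counts of A, C, G and U in a sequence (T is treated as U)."""
    counts = Counter(seq.upper().replace("T", "U"))
    return [counts[base] for base in BASES]


def gc_content(seq):
    """Fraction of unambiguous bases that are G or C."""
    a, c, g, u = base_counts(seq)
    total = a + c + g + u
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return (g + c) / total


def composition_chi_square(seqs):
    """Chi-square statistic for homogeneity of base composition.

    Rows of the contingency table are sequences, columns are bases.
    A value of zero means every sequence has identical proportions;
    larger values indicate more compositional heterogeneity.
    """
    table = [base_counts(s) for s in seqs]
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for row, row_total in zip(table, row_totals):
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / grand_total
            if expected > 0:
                stat += (observed - expected) ** 2 / expected
    return stat


if __name__ == "__main__":
    # Hypothetical toy sequences, not real ssu rrna data.
    seqs = ["GGGGCCCC", "AAAAUUUU", "ACGUACGU"]
    for s in seqs:
        print(s, gc_content(s))
    print("chi-square:", composition_chi_square(seqs))
```

with real alignments the statistic would be compared against a chi-square distribution with (number of sequences − 1) × 3 degrees of freedom, bearing in mind the caveats above.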

fig. 1. the three-domains and two-domains trees – competing hypotheses for the origin of eukaryotes. the iconic three-domains tree appears in most textbooks and divides cellular life into three separate major groups or 'domains': the bacteria, the archaea and the eukaryotes. in this tree the eukaryotes are held to have originated from a common prokaryotic ancestor shared with the archaea (enclosed in the shaded box). by contrast, the two-domains/eocyte tree recovers eukaryotes nested inside the archaea with the newly discovered lokiarchaeota currently thought to be the closest archaeal relatives of the eukaryotes. in the two-domains/eocyte tree the eukaryotic lineage had an ancestor that was already an archaeon. studying uncultured archaeal diversity in nature thus holds the promise of finding ever-closer relatives of eukaryotes. the genomic and cellular features of these lineages could potentially illuminate important stages in the evolution of eukaryotic cells like our own. the thaumarchaeota, aigarchaeota, crenarchaeota and korarchaeota are commonly called the tack archaea in the literature.

two domains is better supported than three when new methods are used

over the past few years, a number of new models have been developed by statisticians to better accommodate aspects of real molecular sequence evolution. for example, models are now available that recognise that the same sequences in different species can evolve differently in terms of their amino acid or nucleotide compositions, and other models have been developed that allow individual sites in molecular sequences to evolve differently from one another. although it is widely recognised that even the best currently available models have important limitations, they fit real sequence data much better than the simpler models used in the past. interestingly, when the new models were first used to analyse the molecular sequence data commonly taken to support the three-domains tree, an alternative hypothesis for the relationship between archaea and eukaryotes called the ‘eocyte tree’ was better supported. in the long-neglected eocyte tree, which was first proposed by james lake and colleagues in 1984 based upon ribosome structure, the bacteria and archaea can still be considered distinct primary domains but the eukaryotes originate from within the domain archaea (fig. 1). in other words, in the ‘two-domains/eocyte tree’, the eukaryotic lineage has an archaeal parent.

adding new groups of archaea increases confidence in the new tree

microbiologists have long suspected that the micro-organisms that have been studied in the laboratory are only a tiny fraction of natural microbial diversity. the original eocyte archaea (named the crenarchaeota by woese and colleagues in 1990) included species like sulfolobus that live in hot acidic springs, so they were seen as rather unusual and exotic micro-organisms. in the past few years, sampling of the natural microbial world has greatly increased, driven by the availability of new molecular methods to investigate uncultured microbial diversity (fig. 2). recently discovered archaea related to the eocytes include a variety of new lineages that have been informally grouped together as the ‘tack’ archaea. some of the tack archaea have major roles in the soil and marine nitrogen cycle, suggesting that their discovery and further study are important not just because of their potential relationship to eukaryotes, but also for understanding globally important nutrient cycles. improved sampling of lineages often has a positive impact on the accuracy of tree reconstruction, particularly if the new sequences populate parts of the tree that were previously poorly sampled. importantly, all of the recent analyses that have included a broad sample of the new tack archaea have supported the two-domains/eocyte tree (fig. 1).

fig. 2. molecular methods can be used to identify environmental micro-organisms without cultivation. the figure on the right shows a light micrograph of an anaerobic ciliate protozoan called trimyema that is commonly found in freshwater ponds in the uk and elsewhere. trimyema is the host for a particular type of archaeon called a methanogen because it makes methane. like many environmental archaea the intracellular methanogens have not yet been isolated into laboratory culture but they can nevertheless be identified as a single new species of methanocorpusculum based upon their ssu rrna sequences, which can be isolated and read using modern dna technology. on the left a fluorescent dna probe (green) was used to confirm that all of the many methanogens living inside trimyema have the same ssu rrna sequence. similar probes can facilitate isolation experiments because they can be used to identify samples enriched in the target species and also to confirm when a target species has been successfully cultured. bars, 10 μm.

can the new tree help us to better understand eukaryotic origins?

the perspective on eukaryotic evolution provided by the two-domains/eocyte tree of life has already had a profound influence on ideas about how eukaryotes first evolved from their prokaryotic ancestors. eukaryotic cells have an internal structural complexity that is not found in prokaryotes and the origins of this complexity have long been a major evolutionary puzzle. a key prediction of the two-domains/eocyte tree is that archaea can be discovered that are more closely related to eukaryotes than the species that we already know about, and, because of this closer common ancestry, that their genomes will be more similar to eukaryotes in their protein repertoires. this prediction appears to have been vindicated by the discovery of a new archaeal lineage called the lokiarchaeota (fig. 3). the lokiarchaeota are the closest archaeal relatives of eukaryotes in evolutionary trees and, consistent with that closer relationship (fig. 1), their reconstructed genomes contain more genes for proteins that were previously thought to be eukaryote-specific. these include proteins that, in eukaryotes, are used for the cytoskeleton, in membrane remodelling and in phagocytosis, all features long held to be unique to eukaryotic cells. at present, the evidence for the existence of lokiarchaeota comes from metagenomes constructed from environmental dna samples, so it is now critically important to isolate viable cultures into the laboratory to determine the cellular roles of their eukaryote-like proteins. achieving that goal may be difficult and will require all of the classic tools of microbiology, including selective isolation, microbial physiology and cell biology, and cutting-edge microscopy. however, the prize to be gained is potentially enormous because success will bring the study of eukaryotic origins much more firmly into the realm of experimental science.

fig. 3. part of the soria moria hydrothermal vent field along the arctic mid-ocean ridge. the picture was taken close to the loki's castle sampling site from which the dna samples used to recover the genome of lokiarchaeota were isolated. the detailed methods used to reconstruct the lokiarchaeota genome are described by spang et al. (2015) nature 521, 173–179.
martin embley

institute for cell and molecular biosciences, university of newcastle, newcastle-upon-tyne, tyne and wear ne1 7ru, uk

tom williams

school of earth sciences, university of bristol, senate house, tyndall avenue, bristol bs8 1th, uk

further reading

embley, t. m. & williams, t. a. (2015). evolution: steps on the road to eukaryotes. nature 521, 169–170.

pester, m., schleper, c. & wagner, m. (2011). the thaumarchaeota: an emerging view of their phylogeny and ecophysiology. curr opin microbiol 14, 300–306.

spang, a. & others (2015). complex archaea that bridge the gap between prokaryotes and eukaryotes. nature 521, 173–179.

williams, t. a., foster, p. g., cox, c. j. & embley, t. m. (2013). an archaeal origin of eukaryotes supports only two primary domains of life. nature 504, 231–236.


image: the human family tree from haeckel's anthropogenie (1874) (french copy from 1886). paul d. stewart/science photo library. fig. 1. modified from williams et al. (2013) nature 504, 231–236. fig. 2. will lewis (newcastle university). fig. 3. rolf birger pedersen (centre for geobiology, university of bergen, norway).