Microbial phylogenetics

From Wikipedia, the free encyclopedia

Microbial phylogenetics is the study of how various groups of microorganisms are genetically related, which helps to trace their evolution.[1][2] To study these relationships, biologists rely on comparative genomics, because comparative anatomy and physiology are not practical methods for microorganisms.[3]



Microbial phylogenetics emerged as a field of study in the 1960s, when scientists started to build genealogical trees based on differences in the order of amino acids in proteins and of nucleotides in genes, instead of using comparative anatomy and physiology.[4][5]

One of the most important figures in the early stage of this field was Carl Woese, whose research focused on Bacteria and examined RNA instead of proteins. More specifically, he chose to compare oligonucleotides of the small subunit ribosomal RNA (16S rRNA). Matching oligonucleotides from different bacteria could be compared with one another to determine how closely the organisms were related. In 1977, after collecting and comparing 16S rRNA fragments from almost 200 species of bacteria, Woese and his team concluded that Archaebacteria were not part of Bacteria but completely independent organisms.[3][6]
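The idea of scoring relatedness from matching oligonucleotides can be illustrated with a short Python sketch. It assumes each organism is summarized by a catalog (a list) of oligonucleotide strings and scores two catalogs with a Dice-style coefficient weighted by oligonucleotide length. The function name, the minimum-length cutoff of six residues, and the toy catalogs are assumptions made for illustration; the article does not state the exact coefficient used in the original studies.

```python
def catalog_similarity(catalog_a, catalog_b, min_length=6):
    """Dice-style similarity between two oligonucleotide catalogs.

    Only oligonucleotides of at least `min_length` residues are counted
    (the cutoff is an illustrative assumption). Each oligonucleotide is
    weighted by its length.
    """
    a = {oligo for oligo in catalog_a if len(oligo) >= min_length}
    b = {oligo for oligo in catalog_b if len(oligo) >= min_length}
    n_a = sum(len(oligo) for oligo in a)
    n_b = sum(len(oligo) for oligo in b)
    if n_a + n_b == 0:
        raise ValueError("no oligonucleotides long enough to compare")
    n_shared = sum(len(oligo) for oligo in a & b)
    return 2 * n_shared / (n_a + n_b)


# Hypothetical catalogs, not real 16S rRNA data.
organism_1 = ["AACCUG", "CCUAAG", "UUACCAG"]
organism_2 = ["AACCUG", "UUACCAG", "ACUCAAG"]

similarity = catalog_similarity(organism_1, organism_2)
```

A value of 1 means the two catalogs are identical above the length cutoff, and 0 means they share nothing; a matrix of such pairwise values can then be clustered to group organisms.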


In the 1980s microbial phylogenetics entered its golden age, as the techniques for sequencing RNA and DNA improved greatly.[7][8] For example, comparing the nucleotide sequences of whole genes was facilitated by the development of DNA cloning, which made it possible to create many copies of a sequence from minute samples. The invention of the polymerase chain reaction (PCR) had an especially large impact on the field.[9][10] Together, these new techniques led to the formal proposal of the three domains of life: Bacteria, Archaea (Woese himself proposed this name to replace the older name Archaebacteria), and Eukarya, arguably one of the key moments in the history of taxonomy.[11]

One of the intrinsic problems of studying microorganisms was the dependence of these studies on pure cultures grown in a laboratory. Biologists tried to overcome this limitation by sequencing rRNA genes obtained from DNA isolated directly from the environment.[12][13] This technique made it possible to fully appreciate that bacteria not only have the greatest diversity but also constitute the greatest biomass on Earth.[14]

In the late 1990s the sequencing of genomes from various microorganisms began, and by 2005, 260 complete genomes had been sequenced, resulting in the classification of 33 eukaryotes, 206 eubacteria, and 21 archaea.[15]


In the early 2000s, scientists started building phylogenetic trees based not on rRNA but on other genes with different functions (for example, the gene for the enzyme RNA polymerase[16]). The resulting genealogies differed greatly from those based on rRNA. These gene histories differed so much from one another that the only hypothesis that could explain the divergences was a major influence of horizontal gene transfer (HGT), a mechanism that allows a bacterium to acquire one or more genes from a completely unrelated organism.[17] Because of HGT, similarities and differences in some genes have to be studied carefully before they are used as a measure of genealogical relationship among microorganisms.[18]

Studies aimed at understanding how widespread HGT is suggested that the ease with which genes are transferred among bacteria makes it impossible to apply ‘the biological species concept’ to them.[19][20]

Phylogenetic representation

Since Darwin, phylogenies have traditionally been represented in the form of a tree. Nonetheless, because of the large role that HGT plays in microbes, some evolutionary microbiologists have suggested abandoning this classical view in favor of a representation of genealogies that more closely resembles a web, also known as a network. However, the network representation has its own problems, such as the inability to precisely establish the donor organism of an HGT event and the difficulty of determining the correct path across organisms when multiple HGT events have occurred. Therefore, there is still no consensus among biologists on which representation better fits the microbial world.[21]

Methods for microbial phylogenetic analysis

Most microbial taxa have never been cultivated or experimentally characterized. Taxonomy and phylogeny are essential tools for organizing the diversity of life. Collecting gene sequences, aligning them based on homology, and then using models of mutation to infer evolutionary history are the common steps in estimating microbial phylogenies.[22] The small subunit (SSU) rRNA gene has revolutionized microbial classification since the 1970s and has become the most sequenced gene.[23] Phylogenetic inferences depend on the genes chosen: for example, the 16S rRNA gene is commonly selected for Bacteria and Archaea, while the 18S rRNA gene is most commonly used for microbial eukaryotes.[24]
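As a minimal sketch of comparing aligned sequences under a model of mutation, the following Python snippet computes the proportion of differing sites between two sequences that are assumed to be already aligned, and then applies the Jukes-Cantor (JC69) correction, d = -(3/4) ln(1 - 4p/3). JC69 is chosen here only because it is the simplest substitution model, not because the cited sources prescribe it; the function names and the toy alignment are hypothetical.

```python
import math


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != "-" and b != "-"
    ]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def jc69_distance(seq_a, seq_b):
    """Jukes-Cantor estimate of substitutions per site."""
    p = p_distance(seq_a, seq_b)
    if p >= 0.75:
        # The correction is undefined once sequences look fully saturated.
        return math.inf
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Hypothetical toy alignment, not real rRNA gene data.
alignment = {
    "taxon_A": "ACGTACGTAC",
    "taxon_B": "ACGTACGTTC",
    "taxon_C": "ACGAACCTTC",
}

names = sorted(alignment)
distances = {
    (x, y): jc69_distance(alignment[x], alignment[y])
    for i, x in enumerate(names)
    for y in names[i + 1:]
}
```

The resulting pairwise distances could then be passed to a tree-building method such as neighbor joining; in practice, established phylogenetics software and richer substitution models would normally be used.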

Phylogenetic comparative methods

Phylogenetic comparative methods (PCMs) are commonly used to compare multiple traits across organisms. PCMs are not yet common in microbiome studies; however, recent studies have successfully used them to identify genes associated with colonization of the human gut.[22] This was done by measuring the statistical association between whether a species harbors a gene and the probability that the species is present in the gut microbiome. The analyses showcase the combination of shotgun metagenomics with phylogenetically aware models.[25]

Ancestral state reconstruction

Ancestral state reconstruction is commonly used to estimate the genetic and metabolic profiles of extant communities from a set of reference genomes; in microbiome studies it is commonly performed with PICRUSt (Phylogenetic Investigation of Communities by Reconstruction of Unobserved States).[22] PICRUSt is a computational approach that predicts the functional composition of a metagenome from marker gene data and a database of reference genomes. To predict which gene families are present, PICRUSt uses an extended ancestral-state reconstruction algorithm and then combines the gene families to estimate the composite metagenome.[26]
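The final step described above, combining per-taxon gene family predictions into a composite metagenome, can be sketched in Python as an abundance-weighted sum. This is an illustration of the idea, not the PICRUSt implementation or its API. It assumes that the gene content of each taxon has already been predicted, and that marker gene counts are divided by a per-taxon marker copy number to approximate organism abundance. The function name, taxon labels, gene family labels, and all numbers are hypothetical.

```python
def predict_metagenome(marker_counts, copy_numbers, gene_content):
    """Estimate community-wide gene family abundances for one sample.

    marker_counts: marker gene read counts per taxon
    copy_numbers:  assumed marker gene copies per genome, per taxon
    gene_content:  assumed (already predicted) gene family copies per genome
    """
    metagenome = {}
    for taxon, count in marker_counts.items():
        # Convert marker counts to an approximate number of organisms.
        organisms = count / copy_numbers[taxon]
        for family, copies in gene_content[taxon].items():
            metagenome[family] = metagenome.get(family, 0.0) + organisms * copies
    return metagenome


# Hypothetical inputs for a single sample.
marker_counts = {"taxon_1": 40, "taxon_2": 10}
copy_numbers = {"taxon_1": 4, "taxon_2": 1}
gene_content = {
    "taxon_1": {"family_X": 1, "family_Y": 2},
    "taxon_2": {"family_X": 3},
}

result = predict_metagenome(marker_counts, copy_numbers, gene_content)
# result == {"family_X": 40.0, "family_Y": 20.0}
```

In this toy sample both taxa contribute ten organisms after copy-number normalization, so the taxon with fewer reads still dominates the estimate for family_X because it carries more copies per genome.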

Analysis of phylogenetic variables and distances

Phylogenetic variables are variables constructed from features of the phylogeny to summarize and contrast data on the species in a phylogenetic tree. Microbiome datasets can be simplified with phylogenetic variables by reducing the dimensions of the data to a few variables that carry biological information.[22] Recent methods such as PhILR and phylofactorization address the challenges of analyzing phylogenetic variables. The PhILR transform combines microbial evolutionary models with the isometric log-ratio transform to overcome the challenges of compositional data.[27] Phylofactorization is a dimensionality-reducing tool used to identify edges in the phylogeny from which putative functional ecological traits may have arisen.[28]
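To make the isometric log-ratio idea concrete, the sketch below computes a single unweighted balance for one internal node of a tree: the log-ratio of the geometric mean abundances of the taxa on each side of the node, scaled by sqrt(rs / (r + s)), where r and s are the numbers of taxa on the two sides. This is a simplified illustration and not the PhILR package itself; the function names and the four-taxon example are hypothetical, and any weighting schemes are omitted.

```python
import math


def geometric_mean(values):
    """Geometric mean of strictly positive values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def ilr_balance(left, right):
    """Unweighted isometric log-ratio balance between two groups of taxa.

    left, right: abundances of the taxa descending from the two children
    of an internal node. All values must be strictly positive.
    """
    r, s = len(left), len(right)
    if r == 0 or s == 0:
        raise ValueError("both groups must contain at least one taxon")
    if any(v <= 0 for v in list(left) + list(right)):
        raise ValueError("abundances must be strictly positive")
    scale = math.sqrt(r * s / (r + s))
    return scale * math.log(geometric_mean(left) / geometric_mean(right))


# Hypothetical node separating taxa {A, B} from taxa {C, D}.
clade_ab = [0.4, 0.4]
clade_cd = [0.1, 0.1]

balance = ilr_balance(clade_ab, clade_cd)  # ln(4), about 1.386
```

Because the balance depends only on ratios, multiplying every abundance by the same constant leaves it unchanged, which is why such coordinates suit compositional data. Zero counts have to be handled before the transform, for example by adding a small pseudocount; that step is left out of this sketch.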


Phylogenetic inference requires the assumption of common ancestry, or homology; when this assumption is violated, the signal can be disrupted by noise.[23] Because of horizontal gene transfer, microbial traits can be decoupled from phylogeny, in which case the taxonomic composition reveals little about the function of a system.[29]

References


  1. Oren, A. (2010). Papke, R.T. (ed.). Molecular Phylogeny of Microorganisms. Caister Academic Press. ISBN 978-1-904455-67-7.
  2. Blum, P., ed. (2010). Archaea: New Models for Prokaryotic Biology. Caister Academic Press. ISBN 978-1-904455-27-1.
  3. Sapp, J. (2007). "The structure of microbial evolutionary theory". Stud. Hist. Phil. Biol. & Biomed. Sci. 38 (4): 780–795. doi:10.1016/j.shpsc.2007.09.011. PMID 18053933.
  4. Dietrich, M. (1998). "Paradox and persuasion: Negotiating the place of molecular evolution within evolutionary biology". Journal of the History of Biology. 31 (1): 85–111. doi:10.1023/A:1004257523100. PMID 11619919. S2CID 29935487.
  5. Dietrich, M. (1994). "The origins of the neutral theory of molecular evolution". Journal of the History of Biology. 27 (1): 21–59. doi:10.1007/BF01058626. PMID 11639258. S2CID 367102.
  6. Woese, C.R.; Fox, G.E. (1977). "Phylogenetic structure of the procaryote domain: The primary kingdoms". Proceedings of the National Academy of Sciences. 74 (11): 5088–5090. Bibcode:1977PNAS...74.5088W. doi:10.1073/pnas.74.11.5088. PMC 432104. PMID 270744.
  7. Sanger, F.; Nicklen, S.; Coulson, A.R. (1977). "DNA sequencing with chain-terminating inhibitors". Proceedings of the National Academy of Sciences. 74 (12): 5463–5467. Bibcode:1977PNAS...74.5463S. doi:10.1073/pnas.74.12.5463. PMC 431765. PMID 271968.
  8. Maxam, A.M.; Gilbert, W. (1977). "A new method for sequencing DNA". Proceedings of the National Academy of Sciences. 74 (2): 560–564. Bibcode:1977PNAS...74..560M. doi:10.1073/pnas.74.2.560. PMC 392330. PMID 265521.
  9. Mullis, K.B.; et al. (1986). "Specific enzymatic amplification of DNA in vitro: The polymerase chain reaction". Cold Spring Harbor Symposia on Quantitative Biology. 51: 263–273. doi:10.1101/SQB.1986.051.01.032. PMID 3472723. S2CID 26180176.
  10. Mullis, K.B.; Faloona, F.A. (1989). Recombinant DNA Methodology. Academic Press. pp. 189–204. ISBN 978-0-12-765560-4.
  11. Woese, C.R.; et al. (1990). "Towards a natural system of organisms: Proposal for the domains Archaea, Bacteria, and Eucarya". Proceedings of the National Academy of Sciences. 87 (12): 4576–4579. Bibcode:1990PNAS...87.4576W. doi:10.1073/pnas.87.12.4576. PMC 54159. PMID 2112744.
  12. Pace, N. (1997). "A molecular view of microbial diversity and the biosphere". Science. 276 (5313): 734–740. doi:10.1126/science.276.5313.734. PMID 9115194.
  13. Pace, N.R.; et al. (1985). "Analyzing natural microbial populations by rRNA sequences". American Society of Microbiology News. 51: 4–12.
  14. Whitman, W.B.; et al. (1998). "Procaryotes: The unseen majority". Proceedings of the National Academy of Sciences. 95 (12): 6578–6583. Bibcode:1998PNAS...95.6578W. doi:10.1073/pnas.95.12.6578. PMC 33863. PMID 9618454.
  15. Delsuc, F.; Brinkmann, H.; Philippe, H. (2005). "Phylogenomics and the reconstruction of the tree of life" (PDF). Nature Reviews Genetics. 6 (5): 361–375. doi:10.1038/nrg1603. PMID 15861208. S2CID 16379422.
  16. Doolittle, W.F. (1999). "Phylogenetic classification and the universal tree". Science. 284 (5423): 2124–2128. doi:10.1126/science.284.5423.2124. PMID 10381871.
  17. Bushman, F. (2002). Lateral DNA transfer: mechanisms and consequences. New York: Cold Spring Harbor Laboratory Press. ISBN 0879696036.
  18. Andam, Cheryl P.; Williams, David; Gogarten, J. Peter (2010-06-08). "Biased gene transfer mimics patterns created through shared ancestry". Proceedings of the National Academy of Sciences. 107 (23): 10679–10684. doi:10.1073/pnas.1001418107. ISSN 0027-8424. PMC 2890805. PMID 20495090.
  19. Ochman, H.; Lawrence, J.G.; Groisman, E.A. (2000). "Lateral gene transfer and the nature of bacterial innovation". Nature. 405 (6784): 299–304. Bibcode:2000Natur.405..299O. doi:10.1038/35012500. PMID 10830951. S2CID 85739173.
  20. Eisen, J. (2000). "Horizontal gene transfer among microbial genomes: new insights from complete genome analysis". Current Opinion in Genetics & Development. 10 (6): 606–611. doi:10.1016/S0959-437X(00)00143-X. PMID 11088009.
  21. Kunin, V.; Goldovsky, L.; Darzentas, N.; Ouzounis, C. A. (2005). "The net of life: Reconstructing the microbial phylogenetic network". Genome Research. 15 (7): 954–959. doi:10.1101/gr.3666505. PMC 1172039. PMID 15965028.
  22. Washburne, Alex D.; Morton, James T.; Sanders, Jon; McDonald, Daniel; Zhu, Qiyun; Oliverio, Angela M.; Knight, Rob (2018-05-24). "Methods for phylogenetic analysis of microbiome data". Nature Microbiology. 3 (6): 652–661. doi:10.1038/s41564-018-0156-0. ISSN 2058-5276. PMID 29795540. S2CID 43962376.
  23. Wu, Martin; Eisen, Jonathan A (2008). "A simple, fast, and accurate method of phylogenomic inference". Genome Biology. 9 (10): R151. doi:10.1186/gb-2008-9-10-r151. ISSN 1465-6906. PMC 2760878. PMID 18851752.
  24. Hillis, David M.; Dixon, Michael T. (1991). "Ribosomal DNA: Molecular Evolution and Phylogenetic Inference". The Quarterly Review of Biology. 66 (4): 411–453. doi:10.1086/417338. ISSN 0033-5770. PMID 1784710. S2CID 32027097.
  25. Bradley, Patrick H.; Nayfach, Stephen; Pollard, Katherine S. (2018-08-09). "Phylogeny-corrected identification of microbial gene families relevant to human gut colonization". PLOS Computational Biology. 14 (8): e1006242. doi:10.1371/journal.pcbi.1006242. ISSN 1553-7358. PMC 6084841. PMID 30091981.
  26. Langille, Morgan G I; Zaneveld, Jesse; Caporaso, J Gregory; McDonald, Daniel; Knights, Dan; Reyes, Joshua A; Clemente, Jose C; Burkepile, Deron E; Vega Thurber, Rebecca L; Knight, Rob; Beiko, Robert G; Huttenhower, Curtis (2013). "Predictive functional profiling of microbial communities using 16S rRNA marker gene sequences". Nature Biotechnology. 31 (9): 814–821. doi:10.1038/nbt.2676. ISSN 1087-0156. PMC 3819121. PMID 23975157.
  27. Silverman, Justin D; Washburne, Alex D; Mukherjee, Sayan; David, Lawrence A (2017-02-15). "A phylogenetic transform enhances analysis of compositional microbiota data". eLife. 6. doi:10.7554/eLife.21887. ISSN 2050-084X. PMC 5328592. PMID 28198697.
  28. Washburne, Alex D.; Silverman, Justin D.; Leff, Jonathan W.; Bennett, Dominic J.; Darcy, John L.; Mukherjee, Sayan; Fierer, Noah; David, Lawrence A. (2017-02-09). "Phylogenetic factorization of compositional data yields lineage-level associations in microbiome datasets". PeerJ. 5: e2969. doi:10.7717/peerj.2969. ISSN 2167-8359. PMC 5345826. PMID 28289558.
  29. Martiny, Jennifer B. H.; Jones, Stuart E.; Lennon, Jay T.; Martiny, Adam C. (2015-11-06). "Microbiomes in light of traits: A phylogenetic perspective". Science. 350 (6261). doi:10.1126/science.aac9323. ISSN 0036-8075.