Characterizing gene family evolution

Abstract

Gene families are widely used in comparative genomics, molecular evolution, and systematics. However, they are constructed in different ways, and their data are analyzed and interpreted differently, under different assumptions, sometimes leading to divergent conclusions. In systematics, concepts such as monophyly and the dichotomy between homoplasy and homology have been central to the analysis of phylogenies. We critique the traditional use of these concepts as applied to gene families and give examples of the incorrect inferences they may lead to. Operational definitions that have emerged within functional genomics are contrasted with the common formal definitions derived from systematics. Lastly, we question the utility of layers of homology and the meaning of homology at the character state level in the context of sequence evolution. Building on this critique, we present an idealized strategy for characterizing gene family evolution for both systematic and functional purposes, including recent methodological improvements.

Author information

Corresponding author

Correspondence to David A. Liberles.

Rights and permissions

Open Access: This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Liberles, D.A., Dittmar, K. Characterizing gene family evolution. Biol. Proced. Online 10, 66–73 (2008). https://doi.org/10.1251/bpo144
