Abstract

Rapidly improving sequencing technology, coupled with computational developments in sequence assembly, is making reference-quality genome assembly economical. Hundreds of vertebrate genome assemblies are now publicly available, and projects are being proposed to sequence thousands of additional species in the next few years. Such dense sampling of the tree of life should give an unprecedented new understanding of evolution and allow a detailed determination of the events that led to the wealth of biodiversity around us. To gain this knowledge, these new genomes must be compared through genome alignment (at the sequence level) and comparative annotation (at the gene level). However, different alignment and annotation methods have different characteristics; before starting a comparative genomics analysis, it is important to understand the nature of the chosen methods and the biases and limitations inherent in them. This review is intended as a technical but high-level overview of the field that should provide this understanding. We briefly survey the state of the genome alignment and comparative annotation fields and their potential future directions in a new, large-scale era of comparative genomics.

/content/journals/10.1146/annurev-animal-020518-115005
2019-02-15
2024-03-29
  • Article Type: Review Article