HISAT: a fast spliced aligner with low memory requirements

Daehwan Kim; Ben Langmead; Steven L Salzberg

doi:10.1038/nmeth.3317

Nat Methods. 2015 Apr;12(4):357-60. doi: 10.1038/nmeth.3317. Epub 2015 Mar 9.

Authors

Daehwan Kim¹, Ben Langmead², Steven L Salzberg²

Affiliations

¹ [1] Center for Computational Biology, McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA. [2] Department of Biostatistics, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, Maryland, USA.
² [1] Center for Computational Biology, McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA. [2] Department of Biostatistics, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, Maryland, USA. [3] Department of Computer Science, Johns Hopkins University, Baltimore, Maryland, USA.

Abstract

HISAT (hierarchical indexing for spliced alignment of transcripts) is a highly efficient system for aligning reads from RNA sequencing experiments. HISAT uses an indexing scheme based on the Burrows-Wheeler transform and the Ferragina-Manzini (FM) index, employing two types of indexes for alignment: a whole-genome FM index to anchor each alignment and numerous local FM indexes for very rapid extensions of these alignments. HISAT's hierarchical index for the human genome contains 48,000 local FM indexes, each representing a genomic region of ∼64,000 bp. Tests on real and simulated data sets showed that HISAT is the fastest system currently available, with equal or better accuracy than any other method. Despite its large number of indexes, HISAT requires only 4.3 gigabytes of memory. HISAT supports genomes of any size, including those larger than 4 billion bases.
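
As a rough illustration of the two-level scheme described in the abstract, the following Python sketch builds a toy FM index, searches it with backward search, and then uses per-window local indexes to look up a short seed near an anchor. It is not HISAT's implementation. The function names (`build_fm_index`, `find`, `build_local_indexes`, `locate_near`), the fixed non-overlapping windows, and the example sequence are all assumptions made for illustration, and the naive suffix sort is only suitable for tiny inputs. For scale, 48,000 local indexes of about 64,000 bp each give 48,000 × 64,000 = 3,072,000,000 bp, which is on the order of the size of the human genome.

```python
def build_fm_index(text):
    """Build a toy FM index: suffix array, C table, and Occ table."""
    text += "$"  # sentinel; sorts before A, C, G, T
    # Naive suffix array construction (fine for toy inputs only).
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # Burrows-Wheeler transform: the character preceding each sorted suffix.
    bwt = "".join(text[i - 1] for i in sa)
    alphabet = sorted(set(text))
    # C[c] = number of characters in text strictly smaller than c.
    C, total = {}, 0
    for c in alphabet:
        C[c] = total
        total += text.count(c)
    # occ[c][i] = number of occurrences of c in bwt[:i].
    occ = {c: [0] * (len(bwt) + 1) for c in alphabet}
    for i, ch in enumerate(bwt):
        for c in alphabet:
            occ[c][i + 1] = occ[c][i] + (1 if ch == c else 0)
    return sa, C, occ


def find(pattern, index):
    """Return sorted start positions of pattern, using backward search."""
    sa, C, occ = index
    lo, hi = 0, len(sa)  # current suffix array interval [lo, hi)
    for ch in reversed(pattern):
        if ch not in C:
            return []
        lo = C[ch] + occ[ch][lo]
        hi = C[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(sa[lo:hi])


def build_local_indexes(genome, window):
    """Hypothetical helper: one small FM index per fixed-size window."""
    return [build_fm_index(genome[s:s + window])
            for s in range(0, len(genome), window)]


def locate_near(anchor_pos, seed, local_indexes, window):
    """Hypothetical helper: search seed only in the window holding the anchor.

    Simplification: matches that cross a window boundary are not found.
    """
    w = anchor_pos // window
    return [w * window + off for off in find(seed, local_indexes[w])]


genome = "GATTACAGATTACC"            # made-up example sequence
global_index = build_fm_index(genome)
local_indexes = build_local_indexes(genome, window=7)

anchors = find("GATTAC", global_index)                 # global anchoring step
nearby = locate_near(anchors[1], "TAC", local_indexes, 7)  # local lookup
print(anchors, nearby)
```

The point of the sketch is the division of labour: the global index is queried once to place an anchor, and the short remaining piece is then sought in a much smaller index covering only the anchor's neighbourhood.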

Publication types

Research Support, N.I.H., Extramural

MeSH terms

Humans
Limit of Detection
Pseudogenes / genetics
Sequence Alignment / methods*
Sequence Analysis, DNA / instrumentation*
Sequence Analysis, RNA
