Accelerating read mapping with FastHASH

Xin, Hongyi; Lee, Donghyuk; Hormozdiari, Farhad; Yedkar, Samihan; Mutlu, Onur; Alkan, Can

doi:10.1184/R1/6468272.v1

journal contribution

posted on 2013-01-01, 00:00 authored by Hongyi Xin, Donghyuk Lee, Farhad Hormozdiari, Samihan Yedkar, Onur Mutlu, Can Alkan

With the introduction of next-generation sequencing (NGS) technologies, we are facing an exponential increase in the amount of genomic sequence data. The success of all medical and genetic applications of next-generation sequencing critically depends on the existence of computational techniques that can process and analyze the enormous amount of sequence data quickly and accurately. Unfortunately, the current read mapping algorithms have difficulties in coping with the massive amounts of data generated by NGS.

We propose a new algorithm, FastHASH, which drastically improves the performance of the seed-and-extend type hash table based read mapping algorithms, while maintaining the high sensitivity and comprehensiveness of such methods. FastHASH is a generic algorithm compatible with all seed-and-extend class read mapping algorithms. It introduces two main techniques, namely Adjacency Filtering, and Cheap K-mer Selection.

We implemented FastHASH and merged it into the codebase of the popular read mapping program, mrFAST. Depending on the edit distance cutoffs, we observed up to 19-fold speedup while still maintaining 100% sensitivity and high comprehensiveness.
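The abstract names the two techniques without describing them, so the following is only an illustrative sketch of how a seed-and-extend mapper with a k-mer hash table could apply ideas of this kind. It is not the authors' implementation and not mrFAST code. The function names (`build_index`, `candidate_locations`), the parameters `num_seeds` and `min_adjacent`, the use of non-overlapping seeds, and the specific selection and filtering rules are all assumptions made for illustration: seeds with the fewest hash-table hits are probed first, and a candidate location is kept only if enough of the read's other k-mers also occur at their expected offsets in the reference.

```python
from collections import defaultdict


def build_index(reference, k):
    """Hash table: k-mer -> list of start positions in the reference."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def candidate_locations(read, index, k, num_seeds, min_adjacent):
    """Return read start positions worth passing to the costly extend step.

    Hypothetical sketch, not the published algorithm:
      1. cut the read into non-overlapping k-mers (seeds);
      2. probe only the `num_seeds` seeds with the fewest index hits;
      3. keep a candidate only if at least `min_adjacent` seeds of the
         read are found at their expected offsets from that candidate.
    """
    seeds = [(off, read[off:off + k])
             for off in range(0, len(read) - k + 1, k)]

    # Step 2: cheapest seeds first (fewest locations in the hash table).
    cheapest = sorted(seeds, key=lambda s: len(index.get(s[1], ())))
    cheapest = cheapest[:num_seeds]

    candidates = set()
    for off, kmer in cheapest:
        for pos in index.get(kmer, ()):
            start = pos - off          # implied start of the whole read
            if start >= 0:
                candidates.add(start)

    # Step 3: cheap filter using hash-table lookups only, no alignment.
    kept = []
    for start in sorted(candidates):
        hits = sum(1 for off, kmer in seeds
                   if (start + off) in index.get(kmer, ()))
        if hits >= min_adjacent:
            kept.append(start)
    return kept
```

In a complete mapper, each position returned by `candidate_locations` would still be verified with an edit-distance alignment; the purpose of the sketch is only to show how index lookups can discard most candidates before that expensive step.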

Publisher Statement

© 2013 Xin et al.; licensee BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Date

2013-01-01

Keywords

Algorithms; Chromosome Mapping; Databases, Genetic; Genome, Human; High-Throughput Nucleotide Sequencing; Humans; Sequence Alignment; Software

Licence

CC BY 3.0
