Conference proceeding
A Comparison of Seed-and-Extend Techniques in Modern DNA Read Alignment Algorithms
2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)
Dec 2016
Abstract
DNA read alignment is a major step in genome analysis. However, as DNA reads continue to become longer, new approaches are needed to use these longer reads effectively in the alignment process. Modern aligners commonly use a two-step approach for read alignment: (1) seeding and (2) extension. In this paper, we investigate the seeding and extension techniques used in modern DNA read alignment algorithms to find the best seeding-extension combinations. We developed an open-source generic DNA read aligner that can be used to compare the alignment accuracy and total execution time of different combinations of seeding and extension algorithms. For extension, our results show that local alignment is the best approach, achieving up to 3.6x higher accuracy than other extension techniques for longer reads. For seeding, if BLAST-like seed extension is used, the best approach is to identify all SMEMs in the DNA read (e.g., the approach used by BWA-MEM). This combination is up to 6x more accurate than other seeding techniques for longer reads. With local alignment, we observed that the seeding technique does not affect the alignment accuracy. Furthermore, we show that an optimized implementation of local alignment using vector instructions, which gives a 4.5x speedup, makes it the fastest of all extension techniques. Overall, we show that local alignment with non-overlapping maximal exact matching seeds is the best seeding-extension combination, owing to its high accuracy and its greater potential for optimization and acceleration for future DNA reads.
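The two-step approach named in the abstract can be illustrated with a minimal sketch. It is not the authors' aligner: it assumes fixed-length, non-overlapping exact k-mer seeds (a simplification of maximal exact matches) and a plain, unvectorized Smith-Waterman local alignment for extension. All function names and the parameters `k` and `pad` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def build_index(ref, k):
    """Map every k-mer in the reference to the positions where it starts."""
    index = defaultdict(list)
    for i in range(len(ref) - k + 1):
        index[ref[i:i + k]].append(i)
    return index


def find_seeds(read, index, k):
    """Step 1 (seeding): exact matches of non-overlapping read k-mers.

    Returns (read_pos, ref_pos) pairs. Fixed-length k-mers are an
    assumption made here for brevity.
    """
    seeds = []
    for j in range(0, len(read) - k + 1, k):
        for i in index.get(read[j:j + k], []):
            seeds.append((j, i))
    return seeds


def smith_waterman(a, b, match=1, mismatch=-1, gap=-1):
    """Step 2 (extension): best local alignment score, linear gap penalty."""
    prev = [0] * (len(b) + 1)
    best = 0
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            diag = prev[j - 1] + (match if x == y else mismatch)
            score = max(0, diag, prev[j] + gap, cur[j - 1] + gap)
            cur.append(score)
            best = max(best, score)
        prev = cur
    return best


def align(read, ref, k=4, pad=2):
    """Seed, then locally align the read against a window around each seed.

    Returns (best_score, window_start); window_start is None if no seed hits.
    """
    index = build_index(ref, k)
    best = (0, None)
    for read_pos, ref_pos in find_seeds(read, index, k):
        start = max(0, ref_pos - read_pos - pad)
        end = min(len(ref), ref_pos - read_pos + len(read) + pad)
        score = smith_waterman(read, ref[start:end])
        if score > best[0]:
            best = (score, start)
    return best
```

Calling `align(read, ref)` returns the highest local alignment score found and the start of the reference window that produced it. Because the extension step scores the whole read against the window rather than only growing outward from the seed, the result depends little on which seed located the window, which is in line with the abstract's observation that seeding choice matters less when local alignment is used.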
Details
- Title
- A Comparison of Seed-and-Extend Techniques in Modern DNA Read Alignment Algorithms
- Creators
- N Ahmed; K.L.M Bertels; Z Al-Ars; Tianhai Tian; Qinghua Jiang; Yunlong Liu; Kevin Burrage; Jiangning Song; Yadong Wang; Xiaohua Hu; Shinichi Morishita; Qian Zhu; Guohua Wang
- Publication Details
- 2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)
- Conference
- 2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)
- Publisher
- IEEE
- Resource Type
- Conference proceeding
- Language
- English
- Academic Unit
- Information Science (Informatics)
- Identifiers
- 991019189142404721