Analysis of Subtelomeric REXTAL Assemblies Using QUAST
Journal article   Open access

Tunazzina Islam, Desh Ranjan, Mohammad Zubair, Eleanor Young, Ming Xiao and Harold Riethman
IEEE/ACM transactions on computational biology and bioinformatics, v 18(1), pp 365-372
Jan 2021
PMID: 31056507
url
https://digitalcommons.odu.edu/context/computerscience_fac_pubs/article/1277/viewcontent/Ranjan_2021_AnalysisofSubtelomericREXTALAssembliesUsingQuastOCR.pdf
Accepted manuscript (AM), Open Access (License Unspecified)

Abstract

Keywords: Bioinformatics; Computer science; DNA; genome gap; Genomics; misassembly; Pipelines; quality metric; Regional assembly; segmental duplication; subtelomere; tandem repeat; Visualization
Genomic regions of high segmental duplication content and/or structural variation have led to gaps and misassemblies in the human reference sequence, and are refractory to assembly from whole-genome short-read datasets. Human subtelomere regions are highly enriched in both segmental duplication content and structural variations, and as a consequence are both impossible to assemble accurately and highly variable from individual to individual. Recently, we developed a pipeline for improved region-specific assembly called Regional Extension of Assemblies Using Linked-Reads (REXTAL). In this study, we evaluate REXTAL and genome-wide assembly (Supernova) approaches on 10X Genomics linked-reads data sets partitioned and barcoded using the Gel Bead in Emulsion (GEM) microfluidic method. Our results describe the accuracy and relative performance of these two approaches using the reference-based assessment module of QUAST. We show that REXTAL dramatically outperforms the Supernova whole-genome assembler in subtelomeric segmental duplication regions, and results in highly accurate assemblies. Nearly all of the REXTAL "misassemblies" identified using default QUAST parameters simply pinpoint locations of tandem repeat arrays in the reference sequence where the repeat array length differs from that in the cognate REXTAL assembly by > 1000 bp.
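
The closing observation of the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline and not QUAST's actual algorithm, which works on alignment fragments and is considerably more involved. The sketch only applies the 1000 bp length-difference criterion quoted in the abstract to a hypothetical record type. The names `RepeatArray`, `would_be_flagged`, and `flagged_arrays` are assumptions made for this illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List

# Threshold taken from the abstract ("> 1000 bp"); treated here as a plain constant.
LENGTH_DIFF_THRESHOLD_BP = 1000


@dataclass(frozen=True)
class RepeatArray:
    """Hypothetical record: one tandem repeat array measured in both sequences."""
    name: str
    ref_length_bp: int  # array length in the reference sequence
    asm_length_bp: int  # array length in the cognate assembly


def length_difference(array: RepeatArray) -> int:
    """Absolute difference in array length between reference and assembly."""
    return abs(array.ref_length_bp - array.asm_length_bp)


def would_be_flagged(array: RepeatArray,
                     threshold_bp: int = LENGTH_DIFF_THRESHOLD_BP) -> bool:
    """True when the length difference strictly exceeds the threshold."""
    return length_difference(array) > threshold_bp


def flagged_arrays(arrays: Iterable[RepeatArray],
                   threshold_bp: int = LENGTH_DIFF_THRESHOLD_BP) -> List[str]:
    """Names of the arrays whose length difference exceeds the threshold."""
    return [a.name for a in arrays if would_be_flagged(a, threshold_bp)]
```

Under this simplification, an array that is 5000 bp in the reference and 3500 bp in the assembly would be flagged, while one that differs by exactly 1000 bp would not, because the comparison is strict.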

Metrics

17 Record Views
4 citations in Scopus

Details

UN Sustainable Development Goals (SDGs)

This publication has contributed to the advancement of the following goals:

#3 Good Health and Well-Being

InCites Highlights

Data related to this publication, from InCites Benchmarking & Analytics tool:

Collaboration types
Domestic collaboration
Web of Science research areas
Biochemical Research Methods
Computer Science, Interdisciplinary Applications
Mathematics, Interdisciplinary Applications
Statistics & Probability