Title
Quality assessment of genome scaffolding tools using a large vertebrate genome
Location
Guy Harvey Oceanographic Center Facility
Start
5-20-2016 10:15 AM
End
5-20-2016 10:30 AM
Abstract
Understanding the genomic fundamentals of organisms provides insights into the functional evolution of genes and allows us to elucidate previously unrecognized connections within and between species. However, robust conclusions require an accurate, high-quality genome assembly. Producing one involves filtering and assembling huge volumes of raw sequence data generated by next-generation sequencing. A draft assembly should then be improved using additional data, e.g., mate-pair reads. Several scaffolding tools are available for this step, and the assemblies they produce vary in quality. These tools are usually evaluated on simulated data or small subsets of real data (e.g., a single chromosome), which limits the usefulness of such assessments when choosing a method for large genomes. Here, I present a comparison of several scaffolding tools using a large (~1.78 Gbp) vertebrate genome, that of Ophisaurus gracilis, the Burmese glass lizard. This species was chosen because its genome is well assembled and a large volume of sequencing read data is available for it. The final assemblies were analyzed using multiple quality assessment methods, including QUAST and read mapping analysis. This assessment is part of a collaborative project between Nova Southeastern University and the Theodosius Dobzhansky Center for Genome Bioinformatics at Saint Petersburg State University (Russia).