Biology Faculty Articles
Document Type
Article
Publication Date
7-24-2013
Publication Title
BMC Genomics
Keywords
Theobroma cacao, Transposable elements, Next generation sequencing, Graph based clustering, Retrotransposon
ISSN
1471-2164
Volume
14
First Page
502
Abstract
Background
Transposable elements (TEs) and other repetitive elements are a large and dynamically evolving part of eukaryotic genomes, especially in plants, where they can account for a significant proportion of genome size. Their dynamic nature gives them the potential for use in identifying and characterizing crop germplasm. However, their repetitive nature makes them challenging to study using conventional methods of molecular biology. Next-generation sequencing and new computational tools have greatly facilitated the investigation of TE variation within species and among closely related species.
Results
(i) We generated low-coverage Illumina whole genome shotgun sequencing reads for multiple individuals of cacao (Theobroma cacao) and related species. These reads were analysed using both an alignment/mapping approach and a de novo (graph-based clustering) approach. (ii) A common set of ultra-conserved orthologous sequences (UCOS) was used to standardize TE data between samples and provided phylogenetic information on the relatedness of samples. (iii) The mapping approach proved highly effective within the reference species but underestimated TE abundance in interspecific comparisons relative to the de novo approach. (iv) Individual T. cacao accessions have unique patterns of TE abundance, indicating that the TE composition of the genome is evolving actively within this species. (v) LTR/Gypsy elements are the most abundant, comprising c. 10% of the genome. (vi) Within T. cacao the retroelement families show an order of magnitude greater sequence variability than the DNA transposon families. (vii) Theobroma grandiflorum has a similar TE composition to T. cacao, but the related genus Herrania is rather different, with LTRs making up a lower proportion of the genome, perhaps because of a massive presence (c. 20%) of distinctive low-complexity satellite-like repeats in this genome.
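The standardization in result (ii) can be illustrated with a minimal sketch. It assumes, as one plausible reading of the abstract and not necessarily the authors' exact procedure, that the number of reads assigned to each TE family is divided by the number of reads mapping to the UCOS set from the same sample, so that samples sequenced to different depths become comparable. The function name `normalize_by_ucos` and all read counts are hypothetical.

```python
def normalize_by_ucos(te_counts, ucos_reads):
    """Express TE read counts per UCOS-mapped read.

    Assumption for illustration: dividing by the UCOS read count removes
    differences in sequencing depth between samples.
    """
    if ucos_reads <= 0:
        raise ValueError("ucos_reads must be positive")
    return {family: count / ucos_reads for family, count in te_counts.items()}


# Hypothetical read counts for two samples sequenced to different depths.
sample_a = normalize_by_ucos({"LTR/Gypsy": 12000, "LTR/Copia": 6000}, ucos_reads=400)
sample_b = normalize_by_ucos({"LTR/Gypsy": 30000, "LTR/Copia": 9000}, ucos_reads=1000)

for family in sample_a:
    print(family, sample_a[family], sample_b[family])
```

In this made-up example the raw LTR/Gypsy counts differ by a factor of 2.5, but the normalized values are equal, which is the kind of depth effect the standardization is meant to remove.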
Conclusions
(i) Short read alignment/mapping to reference TE contigs provides a simple and effective method of investigating intraspecific differences in TE composition. It is not appropriate for comparing repetitive elements across species boundaries, for which de novo methods are better suited. (ii) Individual T. cacao accessions have unique spectra of TE composition, indicating active evolution of TE abundance within this species. TE patterns could potentially be used as a “fingerprint” to identify and characterize cacao accessions.
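The "fingerprint" idea in conclusion (ii) can be sketched as a comparison of relative TE abundance profiles. The distance used here (half the sum of absolute differences between relative abundances) is an illustrative choice, not a method stated in the abstract, and the function names and abundance values are hypothetical.

```python
def fingerprint(abundance):
    """Convert per-family TE abundances into relative proportions."""
    total = sum(abundance.values())
    if total <= 0:
        raise ValueError("total abundance must be positive")
    return {family: value / total for family, value in abundance.items()}


def profile_distance(a, b):
    """Half the sum of absolute differences between two fingerprints.

    Ranges from 0 (identical profiles) to 1 (no shared families).
    """
    families = set(a) | set(b)
    return sum(abs(a.get(f, 0.0) - b.get(f, 0.0)) for f in families) / 2


# Hypothetical abundances for two accessions.
accession_1 = fingerprint({"LTR/Gypsy": 3, "LTR/Copia": 1})
accession_2 = fingerprint({"LTR/Gypsy": 1, "LTR/Copia": 1})

distance = profile_distance(accession_1, accession_2)
print(distance)
```

A small distance would suggest similar TE composition, while accessions with distinct fingerprints would sit further apart. Any threshold for calling two accessions "different" would need to be calibrated on real data.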
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.
NSUWorks Citation
Sveinsson, Saemundur; Navdeep Gill; Nolan Kane; and Quentin Cronk. 2013. "Transposon Fingerprinting Using Low Coverage Whole Genome Shotgun Sequencing in Cacao (Theobroma cacao L.) and Related Species." BMC Genomics 14: 502. https://doi.org/10.1186/1471-2164-14-502.
12864_2013_5202_MOESM2_ESM.pdf (17595 kB)
12864_2013_5202_MOESM3_ESM.pdf (243 kB)
12864_2013_5202_MOESM4_ESM.zip (2882 kB)
12864_2013_5202_MOESM5_ESM.fsa (384 kB)
12864_2013_5202_MOESM6_ESM.nex (222 kB)
ORCID ID
http://orcid.org/0000-0003-3746-1866
DOI
https://doi.org/10.1186/1471-2164-14-502
Comments
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.