Biology Faculty Articles
Document Type
Article
Publication Date
1-14-2015
Publication Title
GigaScience
Keywords
Genome, Selection, Resampling, Evolution, Population, Galaxy
ISSN
2047-217X
Volume
3
Issue/No.
1
First Page
1
Last Page
6
Abstract
Background: Adaptive alleles may rise in frequency as a consequence of positive selection, creating a pattern of decreased variation in the neighboring loci, known as a selective sweep. When the region containing this pattern is compared to another population with no history of selection, a rise in variance of allele frequencies between populations is observed. One challenge presented by large genome-wide datasets is distinguishing patterns that are remnants of natural selection from those expected to arise at random and/or as a consequence of selectively neutral demographic forces acting in the population.
Findings: SmileFinder is a simple program that looks for diversity and divergence patterns consistent with selective sweeps by evaluating allele frequencies in windows that include neighboring loci from two or more populations of a diploid species against the genome-wide neutral expectation. The program calculates the mean heterozygosity and FST in a set of sliding windows of incrementally increasing sizes, and then builds a resampled distribution (the baseline) of random multi-locus sets matched to the sizes of the sliding windows, using unrestricted sampling. Percentiles of the values in the sliding windows are derived from the superimposed resampled distribution. The number of resamples can easily be scaled from 1 K to 100 M; the higher the number, the more precise the percentiles ascribed to the extreme observed values.
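The window-and-baseline procedure described in the Findings can be illustrated with a minimal Python sketch. It is not SmileFinder's own code, and it rests on several assumptions: loci are biallelic, heterozygosity is the expected value 2p(1-p), FST is the simple two-population (HT - HS)/HT ratio with equal weights, windows are centered on a locus and grow by a half-width, and "unrestricted sampling" is read as drawing loci from anywhere in the genome with replacement. All function names are hypothetical.

```python
import bisect
import random
from statistics import mean


def expected_heterozygosity(p):
    """Expected heterozygosity 2p(1-p) at a biallelic locus (assumed estimator)."""
    return 2.0 * p * (1.0 - p)


def fst(p1, p2):
    """Two-population FST = (HT - HS) / HT with equal weights (assumed estimator)."""
    h_t = expected_heterozygosity((p1 + p2) / 2.0)
    h_s = (expected_heterozygosity(p1) + expected_heterozygosity(p2)) / 2.0
    return 0.0 if h_t == 0 else (h_t - h_s) / h_t


def window_mean(values, center, half_width):
    """Mean of the per-locus values in a window centered on `center`, plus its size."""
    lo = max(0, center - half_width)
    hi = min(len(values), center + half_width + 1)
    return mean(values[lo:hi]), hi - lo


def resampled_baseline(values, size, n_resamples, rng):
    """Sorted means of random multi-locus sets of `size` loci, drawn with replacement."""
    return sorted(mean(rng.choices(values, k=size)) for _ in range(n_resamples))


def percentile(baseline, observed):
    """Percentage of baseline means that are less than or equal to `observed`."""
    return 100.0 * bisect.bisect_right(baseline, observed) / len(baseline)


def scan(values, half_widths, n_resamples=1000, seed=0):
    """Percentile of every window mean against a baseline matched to the window size."""
    rng = random.Random(seed)
    baselines = {}  # one resampled distribution per window size
    results = []
    for center in range(len(values)):
        row = []
        for half_width in half_widths:
            observed, size = window_mean(values, center, half_width)
            if size not in baselines:
                baselines[size] = resampled_baseline(values, size, n_resamples, rng)
            row.append(percentile(baselines[size], observed))
        results.append(row)
    return results
```

Under these assumptions, `scan` would be run once on per-locus heterozygosity and once on per-locus FST; a sweep candidate is a region where heterozygosity percentiles are very low and FST percentiles are very high. Raising `n_resamples` gives finer resolution in the tails, which is the point the abstract makes about scaling the resampling.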
Conclusions: The output from SmileFinder can be used to plot percentile values to look for population diversity and divergence patterns that may suggest past actions of positive selection along chromosome maps, and to compare lists of suspected candidate genes against random gene sets to test for the overrepresentation of these patterns among gene categories. Both applications of the algorithm have already been used in published studies. Here we present a publicly available, open source program that will serve as a useful tool for preliminary scans of selection using worldwide databases of human genetic variation, as well as population datasets for many non-human species, for which such data are rapidly emerging with the advent of new genotyping and sequencing technologies.
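The gene-set comparison mentioned in the Conclusions can be sketched in the same resampling spirit. The example below is an illustration only, not the procedure used in the published studies: it assumes that each gene has already been flagged as "extreme" or not from its percentile values, and it estimates how often a random gene set of the same size contains at least as many extreme genes as the candidate list. The function name and arguments are hypothetical.

```python
import random


def overrepresentation_p(all_genes, extreme_genes, gene_list, n_resamples=10000, seed=0):
    """Empirical p-value for the number of extreme genes found in `gene_list`.

    all_genes     -- list of every gene that was scanned
    extreme_genes -- genes whose windows fall in the extreme percentiles (assumed given)
    gene_list     -- candidate genes or gene category to be tested
    """
    rng = random.Random(seed)
    extreme = set(extreme_genes)
    observed = sum(gene in extreme for gene in gene_list)
    hits = 0
    for _ in range(n_resamples):
        # Random gene set of the same size, drawn without replacement
        sample = rng.sample(all_genes, len(gene_list))
        if sum(gene in extreme for gene in sample) >= observed:
            hits += 1
    # Add one to numerator and denominator so the estimate is never exactly zero
    return observed, (hits + 1) / (n_resamples + 1)
```

A small p-value here would suggest that the sweep-like pattern is more common in the tested list than in comparable random gene sets; as with the window percentiles, more resamples give a more precise estimate.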
Additional Comments
National Science Foundation grant #: MCB-1019454; Russian Ministry of Science grant #: 11.G34.31.0068; National Institute of General Medical Sciences grant #: R01GM092706
NSUWorks Citation
Guiblet, Wilfred M.; Kai Zhao; Stephen J. O'Brien; Steven E. Massey; Alfred L. Roca; and T. K. Oleksyk. 2015. "SmileFinder: A Resampling-Based Approach to Evaluate Signatures of Selection from Genome-Wide Sets of Matching Allele Frequency Data in Two or More Diploid Populations." GigaScience 3 (1): 1-6. https://nsuworks.nova.edu/cnso_bio_facarticles/740
ORCID ID
0000-0001-7353-8301
ResearcherID
N-1726-2015
Comments
© 2015 Guiblet et al.; licensee BioMed Central. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.