Biology Faculty Articles
Title
Cookiecutter: A Tool for Kmer-Based Read Filtering and Extraction
Document Type
Article
Publication Date
8-14-2015
Publication Title
bioRxiv
First Page
1
Last Page
6
Abstract
Motivation: K-mer-based analysis is a powerful method used in read error correction and implemented in various genome assembly tools. Many read-processing routines extract or remove sequence reads from the results of high-throughput sequencing experiments prior to further analysis. Here we present a new approach to sorting or filtering raw reads based on a provided list of k-mers.
Results: We developed Cookiecutter, a computational tool for rapid read extraction or removal according to a provided list of k-mers generated from a FASTA file. Cookiecutter is based on an implementation of the Aho-Corasick algorithm and is useful in routine processing of high-throughput sequencing datasets. Cookiecutter can be used both for removing undesirable reads and for extracting reads from a user-defined region of interest.
Availability: The open-source implementation with user instructions can be obtained from GitHub: https://github.com/ad3002/Cookiecutter
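Illustration: The idea of sorting reads against a provided k-mer list can be sketched in a few lines of Python. This is a minimal sketch and not Cookiecutter's code: it replaces the Aho-Corasick automaton with a plain hash-set lookup, which finds the same matches when all k-mers share one length, and it ignores reverse complements and FASTA/FASTQ parsing. All function names, the value of k, and the example sequences are hypothetical.

```python
def kmers_from_sequence(seq, k):
    """Yield every substring of length k from seq, uppercased."""
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_kmer_set(reference_sequences, k):
    """Collect all k-mers found in the reference sequences."""
    kmer_set = set()
    for seq in reference_sequences:
        kmer_set.update(kmers_from_sequence(seq, k))
    return kmer_set


def read_matches(read, kmer_set, k):
    """Return True if the read contains at least one listed k-mer."""
    return any(kmer in kmer_set for kmer in kmers_from_sequence(read, k))


def split_reads(reads, kmer_set, k):
    """Split reads into (matched, unmatched), preserving input order."""
    matched, unmatched = [], []
    for read in reads:
        if read_matches(read, kmer_set, k):
            matched.append(read)
        else:
            unmatched.append(read)
    return matched, unmatched


# Hypothetical toy data: one reference sequence and three reads.
reference = ["ACGTACGTGG"]
reads = ["TTTACGTACTT", "GGGGGGGGGG", "ccgtggaa"]

kmer_set = build_kmer_set(reference, k=5)
matched, unmatched = split_reads(reads, kmer_set, k=5)
```

Keeping the matched list corresponds to read extraction, and keeping the unmatched list corresponds to read removal.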
NSUWorks Citation
Starostina, Ekaterina; Gaik Tamazian; Pavel Dobrynin; Stephen J. O'Brien; and Aleksey Komissarov. 2015. "Cookiecutter: A Tool for Kmer-Based Read Filtering and Extraction." bioRxiv: 1-6. doi:10.1101/024679.
ORCID ID
0000-0001-7353-8301
ResearcherID
N-1726-2015
DOI
10.1101/024679
Comments
The copyright holder for this preprint is the author/funder. It is made available under a CC-BY 4.0 International license.