"Cookiecutter: A Tool for Kmer-Based Read Filtering and Extraction" by Ekaterina Starostina, Gaik Tamazian et al.

Biology Faculty Articles

Title

Cookiecutter: A Tool for Kmer-Based Read Filtering and Extraction

Authors

Ekaterina Starostina, Bioinformatics Institute - Russia; St. Petersburg State University - Russia
Gaik Tamazian, St. Petersburg State University - Russia
Pavel Dobrynin, St. Petersburg State University - Russia
Stephen J. O'Brien, St. Petersburg State University - Russia; Nova Southeastern University
Aleksey Komissarov, St. Petersburg State University - Russia

Document Type

Article

Publication Date

8-14-2015

Publication Title

bioRxiv

Abstract

Motivation: K-mer-based analysis is a powerful method used in read error correction and implemented in various genome assembly tools. A number of read processing routines include extracting or removing sequence reads from the results of high-throughput sequencing experiments prior to further analysis. Here we present a new approach to sorting or filtering raw reads based on a provided list of k-mers.

Results: We developed Cookiecutter, a computational tool for rapid read extraction or removal according to a provided list of k-mers generated from a FASTA file. Cookiecutter is based on an implementation of the Aho-Corasick algorithm and is useful in routine processing of high-throughput sequencing datasets. Cookiecutter can be used both for removing undesirable reads and for extracting reads from a user-defined region of interest.
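As an illustration of the general idea only, the following minimal Python sketch builds an Aho-Corasick automaton from the k-mers of a reference sequence and splits reads into those that contain at least one listed k-mer and those that do not. It is not Cookiecutter's code or command-line interface: the function names (`kmers_from_sequence`, `build_automaton`, `read_matches`, `partition_reads`), the in-memory `(name, sequence)` read format, and the choice to ignore reverse complements and quality values are all assumptions made for this example.

```python
from collections import deque


def kmers_from_sequence(seq, k):
    """Return the set of all length-k substrings of seq (hypothetical helper)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_automaton(patterns):
    """Build Aho-Corasick tables: transitions, failure links, terminal flags."""
    goto = [{}]         # goto[state][char] -> next state
    fail = [0]          # failure link of each state
    terminal = [False]  # True if a pattern ends at this state or at a suffix of it
    for pattern in patterns:
        state = 0
        for ch in pattern:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                terminal.append(False)
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        terminal[state] = True
    # Breadth-first pass: depth-1 states keep failure link 0.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Inherit matches that end at the failure target.
            terminal[nxt] = terminal[nxt] or terminal[fail[nxt]]
    return goto, fail, terminal


def read_matches(read, automaton):
    """Return True if the read contains at least one pattern."""
    goto, fail, terminal = automaton
    state = 0
    for ch in read:
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        if terminal[state]:
            return True
    return False


def partition_reads(reads, automaton):
    """Split (name, sequence) pairs into names of matching and non-matching reads."""
    matched, unmatched = [], []
    for name, seq in reads:
        (matched if read_matches(seq, automaton) else unmatched).append(name)
    return matched, unmatched


# Toy data, invented for this example.
automaton = build_automaton(kmers_from_sequence("ACGTACGGA", 4))
reads = [("r1", "TTTACGGTT"), ("r2", "TTTTTTTTT"), ("r3", "GGGTACGAA")]
print(partition_reads(reads, automaton))  # (['r1', 'r3'], ['r2'])
```

In this sketch, keeping the matched list corresponds to read extraction and keeping the unmatched list corresponds to read removal. Each read is scanned once, regardless of how many k-mers are in the list.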

Availability: The open-source implementation with user instructions can be obtained from GitHub: https://github.com/ad3002/Cookiecutter

Comments

The copyright holder for this preprint is the author/funder. It is made available under a CC-BY 4.0 International license.

NSUWorks Citation

Starostina, Ekaterina; Gaik Tamazian; Pavel Dobrynin; Stephen J. O'Brien; and Aleksey Komissarov. 2015. "Cookiecutter: A Tool for Kmer-Based Read Filtering and Extraction." bioRxiv: 1-6. doi:10.1101/024679.

ORCID ID

0000-0001-7353-8301

ResearcherID

N-1726-2015

DOI

10.1101/024679
