Slim-Filter: an interactive Windows-based application for Illumina Genome Analyzer data assessment and manipulation.
Abstract
BACKGROUND
The emergence of Next Generation Sequencing technologies has made it possible for individual investigators to generate gigabases
of sequencing data per week. Effective analysis and manipulation of these data is limited due to large file sizes, so even
simple tasks such as data filtration and quality assessment have to be performed in several steps. This requires (potentially
problematic) interaction between the investigator and a bioinformatics/computational service provider. Furthermore, such services
are often performed using specialized computational facilities.
RESULTS
We present a Windows-based application, Slim-Filter, designed to interactively examine the statistical properties of sequencing
reads produced by the Illumina Genome Analyzer and to perform a broad spectrum of data manipulation tasks, including: filtration
of low-quality and low-complexity reads; filtration of reads containing undesired subsequences (such as parts of adapters
and PCR primers used during the sample and sequencing library preparation steps); exclusion of duplicated reads (while keeping
each read's copy number information in a specialized data format); and sorting of reads by copy number, allowing for easy access
and manual editing of the resulting files. Slim-Filter is organized as a sequence of windows summarizing the statistical properties
of the reads. Each data manipulation step has roll-back abilities, allowing for return to previous steps of the data analysis
process. Slim-Filter is written in C++ and is compatible with FASTA, FASTQ, and specialized AS file formats presented in this
manuscript. Setup files and a user's manual are available for download at the supplementary web site (https://www.bioinfo.uh.edu/Slim_Filter/).
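The filtering, de-duplication, and copy-number sorting steps described above can be illustrated with a minimal Python sketch. This is not Slim-Filter's implementation (which is written in C++), and it does not reproduce the AS file format. The function names, the mean-quality threshold of 20, the Phred+33 quality encoding, and the example adapter string are all assumptions chosen for illustration, and the low-complexity filter is omitted.

```python
from collections import Counter


def mean_phred(quality, offset=33):
    """Mean Phred score of a FASTQ quality string.

    The Phred+33 offset is an assumption; older Illumina pipelines
    used a different offset.
    """
    return sum(ord(c) - offset for c in quality) / len(quality)


def filter_reads(reads, min_mean_quality=20, unwanted=()):
    """Return sequences that pass a quality check and contain no unwanted subsequence.

    `reads` is an iterable of (sequence, quality_string) pairs.
    The threshold and the choice of mean quality as the criterion
    are hypothetical, not taken from the paper.
    """
    kept = []
    for seq, qual in reads:
        if not seq or len(seq) != len(qual):
            continue  # skip malformed records
        if mean_phred(qual) < min_mean_quality:
            continue  # low-quality read
        if any(sub in seq for sub in unwanted):
            continue  # contains an adapter or primer fragment
        kept.append(seq)
    return kept


def collapse_and_sort(sequences):
    """Collapse duplicate reads, keeping copy numbers, sorted by copy number (descending)."""
    counts = Counter(sequences)
    # Ties are broken alphabetically so the order is deterministic.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Toy input: (sequence, quality) pairs made up for this example.
reads = [
    ("ACGTACGT", "IIIIIIII"),
    ("ACGTACGT", "IIIIIIII"),
    ("TTTTGGGG", "IIIIIIII"),
    ("ACGTACGT", "!!!!!!!!"),          # mean quality 0, filtered out
    ("GGAGATCGGAAG", "IIIIIIIIIIII"),  # contains the example adapter
]
adapter = "AGATCGGAAG"  # illustrative adapter fragment

collapsed = collapse_and_sort(filter_reads(reads, unwanted=[adapter]))
for seq, copies in collapsed:
    print(seq, copies)
```

Storing each distinct read once together with its copy number, as in `collapsed`, keeps the information about duplicates while shrinking the data that later steps have to handle.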
CONCLUSION
The presented Windows-based application has been developed with the goal of providing individual investigators with integrated
sequencing reads analysis, curation, and manipulation capabilities.
Authors
Golovko G, Khanipov K, Rojas M, Martinez-Alcántara A, Howard JJ, Ballesteros E, Gupta S, Widger W, Fofanov Y
Institution
Center for BioMedical and Environmental Genomics, University of Houston, Houston, TX, USA. ggolovko@bioinfo.uh.edu
Source
BMC Bioinformatics 13: 2012 pg 166
MeSH
Computational Biology
DNA Primers
Genome
Sequence Analysis, DNA
Software
Pub Type(s)
Journal Article
Research Support, Non-U.S. Gov't
Research Support, U.S. Gov't, Non-P.H.S.
Language
eng
PubMed ID
22800377