SeqKit - a cross-platform and ultrafast toolkit for FASTA/Q file
manipulation

FASTA and FASTQ are basic and ubiquitous formats for storing nucleotide
and protein sequences. Common manipulations of FASTA/Q files include
converting, searching, filtering, deduplication, splitting, shuffling,
and sampling. Existing tools only implement some of these manipulations,
and not particularly efficiently, and some are only available for
certain operating systems. Furthermore, the complicated installation
process of required packages and running environments can render these
programs less user-friendly.

SeqKit is a cross-platform, ultrafast, comprehensive toolkit for FASTA/Q
processing. It provides executable binary files for all
major operating systems, including Windows, Linux, and Mac OS X, and can
be directly used without any dependencies or pre-configurations. SeqKit
demonstrates competitive performance in execution time and memory usage
compared to similar tools. The efficiency and usability of SeqKit enable
researchers to rapidly accomplish common FASTA/Q file manipulations.

Note: This only repackages the binaries provided by upstream.

Please cite:
Wei Shen, Shuai Le, Yan Li, Fuquan Hu. SeqKit: A Cross-Platform and
Ultrafast Toolkit for FASTA/Q File Manipulation. October 5, 2016.
https://doi.org/10.1371/journal.pone.0163962