[Debian-med-packaging] Bug#1023262: ITP: any2fasta -- convert various sequence formats to FASTA
Andreas Tille
tille at debian.org
Tue Nov 1 11:14:02 GMT 2022
Package: wnpp
Severity: wishlist
Subject: ITP: any2fasta -- convert various sequence formats to FASTA
Owner: Andreas Tille <tille at debian.org>
* Package name : any2fasta
Version : 0.4.2
Upstream Author : Torsten Seemann
* URL : https://github.com/tseemann/any2fasta
* License : GPL-3+
Programming Lang: Perl
Description : convert various sequence formats to FASTA
Established tools such as readseq and seqret (from EMBOSS) both create mangled
IDs containing | or . characters, and there is no way to fix this behaviour.
This results in inconsistencies between .gbk and .fna versions of files in
pipelines.
.
This script uses only core Perl modules, has no other dependencies like
Bioperl or Biopython, and runs very quickly.
.
It supports the following input formats:
.
1. Genbank flat file, typically .gb, .gbk, .gbff (starts with LOCUS)
2. EMBL flat file, typically .embl (starts with ID)
3. GFF with sequence, typically .gff, .gff3 (starts with ##gff)
4. FASTA DNA, typically .fasta, .fa, .fna, .ffn (starts with >)
5. FASTQ DNA, typically .fastq, .fq (starts with @)
6. CLUSTAL alignments, typically .clw, .clu (starts with CLUSTAL or MUSCLE)
7. STOCKHOLM alignments, typically .sth (starts with # STOCKHOLM)
8. GFA assembly graph, typically .gfa (starts with ^[A-Z]\t)
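
The "starts with" signatures in the list above suggest that the format can be
guessed from the first line of the input alone. The following is only a rough
sketch of that idea in Python, not any2fasta's actual Perl code: the function
name guess_format, the format labels, and the exact regular expressions are
assumptions made for illustration.

```python
import re

# First-line signatures, transcribed from the list of supported formats.
# The labels on the left are illustrative names, not any2fasta identifiers.
SIGNATURES = [
    ("genbank", re.compile(r"^LOCUS")),
    ("embl", re.compile(r"^ID")),
    ("gff", re.compile(r"^##gff")),
    ("fasta", re.compile(r"^>")),
    ("fastq", re.compile(r"^@")),
    ("clustal", re.compile(r"^(CLUSTAL|MUSCLE)")),
    ("stockholm", re.compile(r"^# STOCKHOLM")),
    ("gfa", re.compile(r"^[A-Z]\t")),
]


def guess_format(first_line):
    """Return the label of the first matching signature, or None."""
    for name, pattern in SIGNATURES:
        if pattern.search(first_line):
            return name
    return None


if __name__ == "__main__":
    for line in ("LOCUS       example", ">seq1", "S\t1\tACGT", "not a sequence"):
        print(repr(line), "->", guess_format(line))
```

The table is checked in order and the first match wins, so a more specific
signature could be placed ahead of a more general one if further formats were
added.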
.
Files may be compressed with:
.
1. gzip, typically .gz
2. bzip2, typically .bz2
3. zip, typically .zip
Remark: This package is maintained by the Debian Med Packaging Team at
https://salsa.debian.org/med-team/any2fasta