[med-svn] [Git][med-team/seqkit][master] 3 commits: New upstream version 2.9.0+ds

Maytham Alsudany (@Maytha8) gitlab at salsa.debian.org
Sat Feb 22 11:13:08 GMT 2025



Maytham Alsudany pushed to branch master at Debian Med / seqkit


Commits:
3d2db80b by Maytham Alsudany at 2025-01-09T19:57:56+08:00
New upstream version 2.9.0+ds
- - - - -
e8f5ea54 by Maytham Alsudany at 2025-01-09T19:57:56+08:00
Update upstream source from tag 'upstream/2.9.0+ds'

Update to upstream version '2.9.0+ds'
with Debian dir cab183427bfed25b28445ed10df81490056ecef2
- - - - -
41d03f49 by Maytham Alsudany at 2025-01-09T20:17:53+08:00
Upload to unstable

- - - - -


14 changed files:

- CHANGELOG.md
- README.md
- debian/changelog
- doc/docs/download.md
- doc/docs/usage.md
- go.mod
- go.sum
- seqkit/cmd/grep.go
- seqkit/cmd/helper.go
- seqkit/cmd/locate.go
- seqkit/cmd/replace.go
- seqkit/cmd/stat.go
- seqkit/cmd/version.go
- seqkit/packaging.sh


Changes:

=====================================
CHANGELOG.md
=====================================
@@ -1,3 +1,13 @@
+- [SeqKit v2.9.0](https://github.com/shenwei356/seqkit/releases/tag/v2.9.0) - 2024-11-01
+[![Github Releases (by Release)](https://img.shields.io/github/downloads/shenwei356/seqkit/v2.9.0/total.svg)](https://github.com/shenwei356/seqkit/releases/tag/v2.9.0)
+    - `seqkit`:
+        - **Fix sequence ID parsing with the default regular expression (in this case, we actually use bytes.Index instead) for a rare case: "xxx\tyyy zzz" was wrongly parsed as "xxx\tyyy"**. [#486](https://github.com/shenwei356/seqkit/issues/486)
+    - `seqkit locate`:
+        - **Fix `-G/--non-greedy` for tandem repeats**, e.g., ATTCGATTCGATTCG (ATTCGx3).
+    - `seqkit grep/subseq`:
+        - Fix negative regions longer than sequence length. [#479](https://github.com/shenwei356/seqkit/issues/479).
+    - `seqkit stats`:
+        - Add an extra column `sum_n` to count the number of ambiguous characters. [#490](https://github.com/shenwei356/seqkit/issues/490)
 - [SeqKit v2.8.2](https://github.com/shenwei356/seqkit/releases/tag/v2.8.2) - 2024-05-17
 [![Github Releases (by Release)](https://img.shields.io/github/downloads/shenwei356/seqkit/v2.8.2/total.svg)](https://github.com/shenwei356/seqkit/releases/tag/v2.8.2)
     - `seqkit amplicon`:
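
The ID-parsing entry above can be illustrated with a minimal sketch. It assumes the default ID pattern is `^(\S+)\s?` (SeqKit's documented `--id-regexp` default); the three helper functions are hypothetical names for this illustration, not code from SeqKit or the `bio` package, and the fast path shown only handles space and tab.

```go
package main

import (
	"bytes"
	"fmt"
	"regexp"
)

// defaultIDRegexp is assumed to be the default ID pattern.
var defaultIDRegexp = regexp.MustCompile(`^(\S+)\s?`)

// idByRegexp extracts the ID with the regular expression (reference behaviour).
func idByRegexp(name []byte) []byte {
	m := defaultIDRegexp.FindSubmatch(name)
	if m == nil {
		return name
	}
	return m[1]
}

// idBySpaceOnly is a hypothetical fast path that only looks for a space,
// which reproduces the wrong result described in the changelog entry.
func idBySpaceOnly(name []byte) []byte {
	if i := bytes.IndexByte(name, ' '); i >= 0 {
		return name[:i]
	}
	return name
}

// idBySpaceOrTab is a hypothetical fast path that stops at the first
// space or tab, whichever comes first.
func idBySpaceOrTab(name []byte) []byte {
	if i := bytes.IndexAny(name, " \t"); i >= 0 {
		return name[:i]
	}
	return name
}

func main() {
	name := []byte("xxx\tyyy zzz")
	fmt.Printf("%q\n", idByRegexp(name))     // "xxx"
	fmt.Printf("%q\n", idBySpaceOnly(name))  // "xxx\tyyy"
	fmt.Printf("%q\n", idBySpaceOrTab(name)) // "xxx"
}
```
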


=====================================
README.md
=====================================
@@ -1,6 +1,6 @@
 # SeqKit - a cross-platform and ultrafast toolkit for FASTA/Q file manipulation
 
-
+- [**Try SeqKit in your browser**](https://sandbox.bio/tutorials/seqkit-intro) (Tutorials and Exercises provided by [sandbox.bio](https://sandbox.bio/tutorials/seqkit-intro))
 - **Documents:** [http://bioinf.shenwei.me/seqkit](http://bioinf.shenwei.me/seqkit)
 ([**Usage**](http://bioinf.shenwei.me/seqkit/usage/),
 [**FAQs**](http://bioinf.shenwei.me/seqkit/faq/),
@@ -18,6 +18,7 @@ and
 [![Citation Badge](https://api.juleskreuer.eu/citation-badge.php?doi=10.1371/journal.pone.0163962)](https://scholar.google.com/citations?view_op=view_citation&hl=en&user=wHF3Lm8AAAAJ&citation_for_view=wHF3Lm8AAAAJ:zYLM7Y9cAGgC)
 - **Others**: [![check in Biotreasury](https://img.shields.io/badge/Biotreasury-collected-brightgreen)](https://biotreasury.rjmart.cn/#/tool?id=10081)  
 
+
 <a href="https://doi.org/10.1002/imt2.191"><img src="seqkit2.jpg" alt="Subcommands of SeqKit2" width="700"/></a>
 
 ## Features


=====================================
debian/changelog
=====================================
@@ -1,3 +1,10 @@
+seqkit (2.9.0+ds-1) unstable; urgency=medium
+
+  * Team upload.
+  * New upstream version 2.9.0
+
+ -- Maytham Alsudany <maytha8thedev at gmail.com>  Thu, 09 Jan 2025 19:58:02 +0800
+
 seqkit (2.8.2+ds-2) unstable; urgency=medium
 
   * Fixup FTBFS with newer mpb (Closes: #1087554)


=====================================
doc/docs/download.md
=====================================
@@ -13,14 +13,17 @@ Please cite:
 
 ## Current Version
 
-- [SeqKit v2.8.2](https://github.com/shenwei356/seqkit/releases/tag/v2.8.2) - 2024-05-17
-[![Github Releases (by Release)](https://img.shields.io/github/downloads/shenwei356/seqkit/v2.8.2/total.svg)](https://github.com/shenwei356/seqkit/releases/tag/v2.8.2)
-    - `seqkit amplicon`:
-        - Fix a bug introduced in v2.7.0. When more than one pair of primers is given, only the last one is used. [#457](https://github.com/shenwei356/seqkit/issues/457)
-    - `seqkit translate`:
-        - Add option `-e/--skip-translate-errors` to skip translate error and output empty sequence. [#458](https://github.com/shenwei356/seqkit/pull/458)
-    - `seqkit split`:
-        - Add flag `-I/--ignore-case` for `-i/--by-id`. [#462](https://github.com/shenwei356/seqkit/issues/462)
+- [SeqKit v2.9.0](https://github.com/shenwei356/seqkit/releases/tag/v2.9.0) - 2024-11-01
+[![Github Releases (by Release)](https://img.shields.io/github/downloads/shenwei356/seqkit/v2.9.0/total.svg)](https://github.com/shenwei356/seqkit/releases/tag/v2.9.0)
+    - `seqkit`:
+        - **Fix sequence ID parsing with the default regular expression (in this case, we actually use bytes.Index instead) for a rare case: "xxx\tyyy zzz" was wrongly parsed as "xxx\tyyy"**. [#486](https://github.com/shenwei356/seqkit/issues/486)
+    - `seqkit locate`:
+        - **Fix `-G/--non-greedy` for tandem repeats**, e.g., ATTCGATTCGATTCG (ATTCGx3).
+    - `seqkit grep/subseq`:
+        - Fix negative regions longer than sequence length. [#479](https://github.com/shenwei356/seqkit/issues/479).
+    - `seqkit stats`:
+        - Add an extra column `sum_n` to count the number of ambiguous characters. [#490](https://github.com/shenwei356/seqkit/issues/490)
+
 
 
 
@@ -29,13 +32,13 @@ Please cite:
 
 OS     |Arch      |File, China mirror                                                                                                                                                                              |Download Count
 :------|:---------|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-Linux  |32-bit    |[seqkit_linux_386.tar.gz](https://github.com/shenwei356/seqkit/releases/download/v2.8.2/seqkit_linux_386.tar.gz), <br/> [China mirror](http://app.shenwei.me/data/seqkit/seqkit_linux_386.tar.gz)                            |[![Github Releases (by Asset)](https://img.shields.io/github/downloads/shenwei356/seqkit/latest/seqkit_linux_386.tar.gz.svg?maxAge=3600)](https://github.com/shenwei356/seqkit/releases/download/v2.8.2/seqkit_linux_386.tar.gz)
-Linux  |**64-bit**|[**seqkit_linux_amd64.tar.gz**](https://github.com/shenwei356/seqkit/releases/download/v2.8.2/seqkit_linux_amd64.tar.gz), <br/> [China mirror](http://app.shenwei.me/data/seqkit/seqkit_linux_amd64.tar.gz)                  |[![Github Releases (by Asset)](https://img.shields.io/github/downloads/shenwei356/seqkit/latest/seqkit_linux_amd64.tar.gz.svg?maxAge=3600)](https://github.com/shenwei356/seqkit/releases/download/v2.8.2/seqkit_linux_amd64.tar.gz)
-Linux  |**arm64** |[**seqkit_linux_arm64.tar.gz**](https://github.com/shenwei356/seqkit/releases/download/v2.8.2/seqkit_linux_arm64.tar.gz), <br/> [China mirror](http://app.shenwei.me/data/seqkit/seqkit_linux_arm64.tar.gz)                  |[![Github Releases (by Asset)](https://img.shields.io/github/downloads/shenwei356/seqkit/latest/seqkit_linux_arm64.tar.gz.svg?maxAge=3600)](https://github.com/shenwei356/seqkit/releases/download/v2.8.2/seqkit_linux_arm64.tar.gz)
-macOS  |**64-bit**|[**seqkit_darwin_amd64.tar.gz**](https://github.com/shenwei356/seqkit/releases/download/v2.8.2/seqkit_darwin_amd64.tar.gz), <br/> [China mirror](http://app.shenwei.me/data/seqkit/seqkit_darwin_amd64.tar.gz)               |[![Github Releases (by Asset)](https://img.shields.io/github/downloads/shenwei356/seqkit/latest/seqkit_darwin_amd64.tar.gz.svg?maxAge=3600)](https://github.com/shenwei356/seqkit/releases/download/v2.8.2/seqkit_darwin_amd64.tar.gz)
-macOS  |**arm64** |[**seqkit_darwin_arm64.tar.gz**](https://github.com/shenwei356/seqkit/releases/download/v2.8.2/seqkit_darwin_arm64.tar.gz), <br/> [China mirror](http://app.shenwei.me/data/seqkit/seqkit_darwin_arm64.tar.gz)               |[![Github Releases (by Asset)](https://img.shields.io/github/downloads/shenwei356/seqkit/latest/seqkit_darwin_arm64.tar.gz.svg?maxAge=3600)](https://github.com/shenwei356/seqkit/releases/download/v2.8.2/seqkit_darwin_arm64.tar.gz)
-Windows|32-bit    |[seqkit_windows_386.exe.tar.gz](https://github.com/shenwei356/seqkit/releases/download/v2.8.2/seqkit_windows_386.exe.tar.gz), <br/> [China mirror](http://app.shenwei.me/data/seqkit/seqkit_windows_386.exe.tar.gz)          |[![Github Releases (by Asset)](https://img.shields.io/github/downloads/shenwei356/seqkit/latest/seqkit_windows_386.exe.tar.gz.svg?maxAge=3600)](https://github.com/shenwei356/seqkit/releases/download/v2.8.2/seqkit_windows_386.exe.tar.gz)
-Windows|**64-bit**|[**seqkit_windows_amd64.exe.tar.gz**](https://github.com/shenwei356/seqkit/releases/download/v2.8.2/seqkit_windows_amd64.exe.tar.gz), <br/> [China mirror](http://app.shenwei.me/data/seqkit/seqkit_windows_amd64.exe.tar.gz)|[![Github Releases (by Asset)](https://img.shields.io/github/downloads/shenwei356/seqkit/latest/seqkit_windows_amd64.exe.tar.gz.svg?maxAge=3600)](https://github.com/shenwei356/seqkit/releases/download/v2.8.2/seqkit_windows_amd64.exe.tar.gz)
+Linux  |32-bit    |[seqkit_linux_386.tar.gz](https://github.com/shenwei356/seqkit/releases/download/v2.9.0/seqkit_linux_386.tar.gz), <br/> [China mirror](http://app.shenwei.me/data/seqkit/seqkit_linux_386.tar.gz)                            |[![Github Releases (by Asset)](https://img.shields.io/github/downloads/shenwei356/seqkit/latest/seqkit_linux_386.tar.gz.svg?maxAge=3600)](https://github.com/shenwei356/seqkit/releases/download/v2.9.0/seqkit_linux_386.tar.gz)
+Linux  |**64-bit**|[**seqkit_linux_amd64.tar.gz**](https://github.com/shenwei356/seqkit/releases/download/v2.9.0/seqkit_linux_amd64.tar.gz), <br/> [China mirror](http://app.shenwei.me/data/seqkit/seqkit_linux_amd64.tar.gz)                  |[![Github Releases (by Asset)](https://img.shields.io/github/downloads/shenwei356/seqkit/latest/seqkit_linux_amd64.tar.gz.svg?maxAge=3600)](https://github.com/shenwei356/seqkit/releases/download/v2.9.0/seqkit_linux_amd64.tar.gz)
+Linux  |**arm64** |[**seqkit_linux_arm64.tar.gz**](https://github.com/shenwei356/seqkit/releases/download/v2.9.0/seqkit_linux_arm64.tar.gz), <br/> [China mirror](http://app.shenwei.me/data/seqkit/seqkit_linux_arm64.tar.gz)                  |[![Github Releases (by Asset)](https://img.shields.io/github/downloads/shenwei356/seqkit/latest/seqkit_linux_arm64.tar.gz.svg?maxAge=3600)](https://github.com/shenwei356/seqkit/releases/download/v2.9.0/seqkit_linux_arm64.tar.gz)
+macOS  |**64-bit**|[**seqkit_darwin_amd64.tar.gz**](https://github.com/shenwei356/seqkit/releases/download/v2.9.0/seqkit_darwin_amd64.tar.gz), <br/> [China mirror](http://app.shenwei.me/data/seqkit/seqkit_darwin_amd64.tar.gz)               |[![Github Releases (by Asset)](https://img.shields.io/github/downloads/shenwei356/seqkit/latest/seqkit_darwin_amd64.tar.gz.svg?maxAge=3600)](https://github.com/shenwei356/seqkit/releases/download/v2.9.0/seqkit_darwin_amd64.tar.gz)
+macOS  |**arm64** |[**seqkit_darwin_arm64.tar.gz**](https://github.com/shenwei356/seqkit/releases/download/v2.9.0/seqkit_darwin_arm64.tar.gz), <br/> [China mirror](http://app.shenwei.me/data/seqkit/seqkit_darwin_arm64.tar.gz)               |[![Github Releases (by Asset)](https://img.shields.io/github/downloads/shenwei356/seqkit/latest/seqkit_darwin_arm64.tar.gz.svg?maxAge=3600)](https://github.com/shenwei356/seqkit/releases/download/v2.9.0/seqkit_darwin_arm64.tar.gz)
+Windows|32-bit    |[seqkit_windows_386.exe.tar.gz](https://github.com/shenwei356/seqkit/releases/download/v2.9.0/seqkit_windows_386.exe.tar.gz), <br/> [China mirror](http://app.shenwei.me/data/seqkit/seqkit_windows_386.exe.tar.gz)          |[![Github Releases (by Asset)](https://img.shields.io/github/downloads/shenwei356/seqkit/latest/seqkit_windows_386.exe.tar.gz.svg?maxAge=3600)](https://github.com/shenwei356/seqkit/releases/download/v2.9.0/seqkit_windows_386.exe.tar.gz)
+Windows|**64-bit**|[**seqkit_windows_amd64.exe.tar.gz**](https://github.com/shenwei356/seqkit/releases/download/v2.9.0/seqkit_windows_amd64.exe.tar.gz), <br/> [China mirror](http://app.shenwei.me/data/seqkit/seqkit_windows_amd64.exe.tar.gz)|[![Github Releases (by Asset)](https://img.shields.io/github/downloads/shenwei356/seqkit/latest/seqkit_windows_amd64.exe.tar.gz.svg?maxAge=3600)](https://github.com/shenwei356/seqkit/releases/download/v2.9.0/seqkit_windows_amd64.exe.tar.gz)
 
 
 *Notes*
@@ -157,6 +160,14 @@ fish:
 
 ## Release history
 
+- [SeqKit v2.8.2](https://github.com/shenwei356/seqkit/releases/tag/v2.8.2) - 2024-05-17
+[![Github Releases (by Release)](https://img.shields.io/github/downloads/shenwei356/seqkit/v2.8.2/total.svg)](https://github.com/shenwei356/seqkit/releases/tag/v2.8.2)
+    - `seqkit amplicon`:
+        - Fix a bug introduced in v2.7.0. When more than one pair of primers is given, only the last one is used. [#457](https://github.com/shenwei356/seqkit/issues/457)
+    - `seqkit translate`:
+        - Add option `-e/--skip-translate-errors` to skip translate error and output empty sequence. [#458](https://github.com/shenwei356/seqkit/pull/458)
+    - `seqkit split`:
+        - Add flag `-I/--ignore-case` for `-i/--by-id`. [#462](https://github.com/shenwei356/seqkit/issues/462)
 - [SeqKit v2.8.1](https://github.com/shenwei356/seqkit/releases/tag/v2.8.1) - 2024-04-07
 [![Github Releases (by Release)](https://img.shields.io/github/downloads/shenwei356/seqkit/v2.8.1/total.svg)](https://github.com/shenwei356/seqkit/releases/tag/v2.8.1)
     - `seqkit sana`:


=====================================
doc/docs/usage.md
=====================================
@@ -158,7 +158,7 @@ reproduced in different environments with same random seed.
 ``` text
 SeqKit -- a cross-platform and ultrafast toolkit for FASTA/Q file manipulation
 
-Version: 2.8.0
+Version: 2.9.0
 
 Author: Wei Shen <shenwei356 at gmail.com>
 
@@ -698,6 +698,7 @@ Columns:
   16. Q30(%)    percentage of bases with the quality score greater than 30
   17. AvgQual   average quality
   18. GC(%)     percentage of GC content
+  19. sum_n     number of ambiguous letters (N, n, X, x)
   
 Attention:
   1. Sequence length metrics (sum_len, min_len, avg_len, max_len, Q1, Q2, Q3)
@@ -788,13 +789,13 @@ Eexamples
 1. Extra information
 
         $ seqkit stats *.f{a,q}.gz -a
-        file               format  type  num_seqs    sum_len  min_len  avg_len  max_len   Q1   Q2   Q3  sum_gap  N50  N50_num  Q20(%)  Q30(%)  AvgQual  GC(%)
-        hairpin.fa.gz      FASTA   RNA     28,645  2,949,871       39      103    2,354   76   91  111        0  101      380       0       0        0  45.77
-        mature.fa.gz       FASTA   RNA     35,828    781,222       15     21.8       34   21   22   22        0   22       12       0       0        0   47.6
-        Illimina1.8.fq.gz  FASTQ   DNA     10,000  1,500,000      150      150      150  150  150  150        0  150        1   96.16   89.71    24.82  49.91
-        nanopore.fq.gz     FASTQ   DNA      4,000  1,798,723      153    449.7    6,006  271  318  391        0  395      585   40.79   12.63     9.48  46.66
-        reads_1.fq.gz      FASTQ   DNA      2,500    567,516      226      227      229  227  227  227        0  227        3   91.24   86.62    15.45  53.63
-        reads_2.fq.gz      FASTQ   DNA      2,500    560,002      223      224      225  224  224  224        0  224        2   91.06   87.66    14.62  54.77
+        file               format  type  num_seqs    sum_len  min_len  avg_len  max_len   Q1   Q2   Q3  sum_gap  N50  N50_num  Q20(%)  Q30(%)  AvgQual  GC(%)  sum_n
+        hairpin.fa.gz      FASTA   RNA     28,645  2,949,871       39      103    2,354   76   91  111        0  101      380       0       0        0  45.77    255
+        mature.fa.gz       FASTA   RNA     35,828    781,222       15     21.8       34   21   22   22        0   22       12       0       0        0   47.6      0
+        Illimina1.8.fq.gz  FASTQ   DNA     10,000  1,500,000      150      150      150  150  150  150        0  150        1   96.16   89.71    24.82  49.91     38
+        nanopore.fq.gz     FASTQ   DNA      4,000  1,798,723      153    449.7    6,006  271  318  391        0  395      585   40.79   12.63     9.48  46.66      0
+        reads_1.fq.gz      FASTQ   DNA      2,500    567,516      226      227      229  227  227  227        0  227        3   91.24   86.62    15.45  53.63     44
+        reads_2.fq.gz      FASTQ   DNA      2,500    560,002      223      224      225  224  224  224        0  224        2   91.06   87.66    14.62  54.77      2
 
 1. **Parallelize counting files, it's much faster for lots of small files, especially for files on SSD**
 
@@ -3091,7 +3092,9 @@ more on: http://bioinf.shenwei.me/seqkit/usage/#replace
 
 Special replacement symbols (only for replacing name not sequence):
 
-    {nr}    Record number, starting from 1
+    {fn}    File name
+    {fbn}   File base name
+    {fbne}  File base name without any extension
     {kv}    Corresponding value of the key (captured variable $n) by key-value file,
             n can be specified by flag -I (--key-capt-idx) (default: 1)
             
@@ -3236,6 +3239,13 @@ Examples
         >seq_00002
         ATTT
 
+1. Add file names.
+
+        $ seqkit replace ../tests/hairpin.fa -p '.+' -r '{fn}__{fbn}__{fbne}__{nr}' | seqkit seq -n | head -n 3
+        ../tests/hairpin.fa__hairpin.fa__hairpin__1
+        ../tests/hairpin.fa__hairpin.fa__hairpin__2
+        ../tests/hairpin.fa__hairpin.fa__hairpin__3
+
 1. Replace key with value by key-value file
 
         $ more test.fa


=====================================
go.mod
=====================================
@@ -20,12 +20,12 @@ require (
 	github.com/mattn/go-isatty v0.0.16
 	github.com/mitchellh/go-homedir v1.1.0
 	github.com/pkg/errors v0.9.1
-	github.com/shenwei356/bio v0.13.3
+	github.com/shenwei356/bio v0.13.6
 	github.com/shenwei356/breader v0.3.2
 	github.com/shenwei356/bwt v0.6.1
 	github.com/shenwei356/go-logging v0.0.0-20171012171522-c6b9702d88ba
 	github.com/shenwei356/stable v0.1.2
-	github.com/shenwei356/util v0.5.2
+	github.com/shenwei356/util v0.5.3
 	github.com/shenwei356/xopen v0.3.2
 	github.com/smallfish/simpleyaml v0.1.0
 	github.com/spf13/cobra v1.8.0


=====================================
go.sum
=====================================
@@ -123,8 +123,8 @@ github.com/rogpeppe/go-internal v1.9.0/go.mod h1:WtVeX8xhTBvf0smdhujwtBcq4Qrzq/f
 github.com/russross/blackfriday/v2 v2.1.0/go.mod h1:+Rmxgy9KzJVeS9/2gXHxylqXiyQDYRxCVz55jmeOWTM=
 github.com/ruudk/golang-pdf417 v0.0.0-20181029194003-1af4ab5afa58/go.mod h1:6lfFZQK844Gfx8o5WFuvpxWRwnSoipWe/p622j1v06w=
 github.com/ruudk/golang-pdf417 v0.0.0-20201230142125-a7e3863a1245/go.mod h1:pQAZKsJ8yyVxGRWYNEm9oFB8ieLgKFnamEyDmSA0BRk=
-github.com/shenwei356/bio v0.13.3 h1:tWspivWisj/kS9RmDv8N0WEovn007jUImk8OdtPYZ38=
-github.com/shenwei356/bio v0.13.3/go.mod h1:5TMT6kpb5lQsa1Uz6nh6PGLtvKi8fQ3SWO2sfiBEOnc=
+github.com/shenwei356/bio v0.13.6 h1:GoJDNHNFIE6824IEAzBTf2f8BGqqshrIxgVxjlEHLRk=
+github.com/shenwei356/bio v0.13.6/go.mod h1:5TMT6kpb5lQsa1Uz6nh6PGLtvKi8fQ3SWO2sfiBEOnc=
 github.com/shenwei356/breader v0.3.2 h1:GLy2clIMck6FdTwj8WLnmhv0PW/7Pp+Wcx7TVEHG0ks=
 github.com/shenwei356/breader v0.3.2/go.mod h1:BimwolkMTIr/O4iX7xXtjEB1z5y39G+8I5Tsm9guC3E=
 github.com/shenwei356/bwt v0.6.1 h1:BDS1nhhIxiC284OKtLBUgy+U5L/7WrvT/ehOExB/DSA=
@@ -135,8 +135,8 @@ github.com/shenwei356/natsort v0.0.0-20220117010048-580176ad49fb h1:pb0RhpaADsFr
 github.com/shenwei356/natsort v0.0.0-20220117010048-580176ad49fb/go.mod h1:SiiGiRFyRtV7S9RamOrmQR5gpGIRhWJM1w0EtmuQ1io=
 github.com/shenwei356/stable v0.1.2 h1:TCpfL8bFvZxbZcCm6XaSXpKYv4UopnRejIx6f+StiuM=
 github.com/shenwei356/stable v0.1.2/go.mod h1:KghgqlviHPiKn9AuSTpadb7ep74n42VsNtPLoZZ/JIc=
-github.com/shenwei356/util v0.5.2 h1:kU9bnkE3RRUAlya+hbfwy83iTMOJqIHOlYgejYPb7mU=
-github.com/shenwei356/util v0.5.2/go.mod h1:3tRAOfreWdgl/Zh1gE008h2lWocf5/YAxVSjgLKvd4k=
+github.com/shenwei356/util v0.5.3 h1:Yf9+rB3Kngnb4+K3xCo7Dg2d+C1CzGsWmv6L9aDFORg=
+github.com/shenwei356/util v0.5.3/go.mod h1:3tRAOfreWdgl/Zh1gE008h2lWocf5/YAxVSjgLKvd4k=
 github.com/shenwei356/xopen v0.3.2 h1:gD/0EvcMa6m2Y1XSdALs9WdhIgiZmn5wVZTjKldCCQo=
 github.com/shenwei356/xopen v0.3.2/go.mod h1:6EQUa6I7Zsl2GQKqcL9qGLrTzVE+oZyly+uhzovQYSk=
 github.com/smallfish/simpleyaml v0.1.0 h1:5uAZdLAiHxS9cmzkOxg7lH0dILXKTD7uRZbAhyHmyU0=


=====================================
seqkit/cmd/grep.go
=====================================
@@ -471,6 +471,9 @@ Examples:
 							}
 							if limitRegion {
 								target = sequence.SubSeq(start, end).Seq
+								if len(target) == 0 {
+									continue
+								}
 							} else if circular {
 								// concat two copies of sequence, and do not change original sequence
 								target = make([]byte, len(sequence.Seq)*2)
@@ -607,6 +610,9 @@ Examples:
 						}
 						if limitRegion {
 							target = sequence.SubSeq(start, end).Seq
+							if len(target) == 0 {
+								continue
+							}
 						} else if circular {
 							// concat two copies of sequence, and do not change original sequence
 							target = make([]byte, len(sequence.Seq)*2)
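
The guard added in both hunks skips a record when the requested region yields an empty target. A minimal sketch of the idea follows. `subSeq` is a hypothetical, simplified stand-in for `sequence.SubSeq` (1-based inclusive coordinates, negative values counting from the end) and may not match the real method's edge-case behaviour; `grepInRegion` is likewise an illustrative name.

```go
package main

import (
	"bytes"
	"fmt"
)

// subSeq is a hypothetical stand-in for sequence.SubSeq: 1-based inclusive
// coordinates, negative values count from the end (-1 is the last base).
// It returns nil when the resolved region lies outside the sequence.
func subSeq(seq []byte, start, end int) []byte {
	n := len(seq)
	if start < 0 {
		start = n + start + 1
	}
	if end < 0 {
		end = n + end + 1
	}
	if start < 1 || end > n || start > end {
		return nil
	}
	return seq[start-1 : end]
}

// grepInRegion returns the indexes of sequences whose region contains pattern.
func grepInRegion(seqs [][]byte, pattern []byte, start, end int) []int {
	var hits []int
	for i, seq := range seqs {
		target := subSeq(seq, start, end)
		if len(target) == 0 {
			continue // region falls outside this sequence: nothing to search
		}
		if bytes.Contains(target, pattern) {
			hits = append(hits, i)
		}
	}
	return hits
}

func main() {
	seqs := [][]byte{[]byte("ACGTACGT"), []byte("ACG")}
	// Region -5:-1 fits the first sequence but is longer than the second.
	fmt.Println(grepInRegion(seqs, []byte("CGT"), -5, -1)) // [0]
}
```
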


=====================================
seqkit/cmd/helper.go
=====================================
@@ -99,7 +99,7 @@ func getFileListFromFile(file string, checkFile bool) ([]string, error) {
 		return nil, fmt.Errorf("read file list from '%s': %s", file, err)
 	}
 
-	return lists, nil
+	return lists, fh.Close()
 }
 
 func getFileListFromArgsAndFile(cmd *cobra.Command, args []string, checkFileFromArgs bool, flag string, checkFileFromFile bool) []string {


=====================================
seqkit/cmd/locate.go
=====================================
@@ -857,7 +857,7 @@ Attention:
 						// }
 
 						if nonGreedy {
-							offset = offset + loc[1] + 1
+							offset = offset + loc[1]
 						} else {
 							offset = offset + loc[0] + 1
 						}
@@ -956,7 +956,7 @@ Attention:
 						// }
 
 						if nonGreedy {
-							offset = offset + loc[1] + 1
+							offset = offset + loc[1]
 						} else {
 							offset = offset + loc[0] + 1
 						}


=====================================
seqkit/cmd/replace.go
=====================================
@@ -24,6 +24,7 @@ import (
 	"bytes"
 	"fmt"
 	"io"
+	"path/filepath"
 	"regexp"
 	"runtime"
 	"strings"
@@ -61,6 +62,9 @@ more on: http://bioinf.shenwei.me/seqkit/usage/#replace
 Special replacement symbols (only for replacing name not sequence):
 
     {nr}    Record number, starting from 1
+    {fn}    File name
+    {fbn}   File base name
+    {fbne}  File base name without any extension
     {kv}    Corresponding value of the key (captured variable $n) by key-value file,
             n can be specified by flag -I (--key-capt-idx) (default: 1)
             
@@ -132,6 +136,21 @@ Filtering records to edit:
 			replaceWithNR = true
 		}
 
+		var replaceWithFN bool
+		if reFN.Match(replacement) {
+			replaceWithFN = true
+		}
+
+		var replaceWithFBN bool
+		if reFBN.Match(replacement) {
+			replaceWithFBN = true
+		}
+
+		var replaceWithFBNE bool
+		if reFBNE.Match(replacement) {
+			replaceWithFBNE = true
+		}
+
 		var replaceWithKV bool
 		var kvs map[string]string
 		if reKV.Match(replacement) {
@@ -328,10 +347,23 @@ Filtering records to edit:
 		var re *regexp.Regexp
 		var h uint64
 
+		var fileBase string
+
 		for _, file := range files {
 			fastxReader, err := fastx.NewReader(alphabet, file, idRegexp)
 			checkError(err)
+
 			nr := 0
+			bFile := []byte(file)
+			fileBase = filepath.Base(file)
+			bFileBase := []byte(fileBase)
+			var bFileBaseWithoutExtension []byte
+			if i := strings.Index(fileBase, "."); i >= 0 {
+				bFileBaseWithoutExtension = []byte(fileBase[:i])
+			} else {
+				bFileBaseWithoutExtension = []byte(fileBase)
+			}
+
 			for {
 				record, err = fastxReader.Read()
 				if err != nil {
@@ -429,6 +461,16 @@ Filtering records to edit:
 						r = reNR.ReplaceAll(r, []byte(fmt.Sprintf(nrFormat, nr)))
 					}
 
+					if replaceWithFN {
+						r = reFN.ReplaceAll(r, bFile)
+					}
+					if replaceWithFBN {
+						r = reFBN.ReplaceAll(r, bFileBase)
+					}
+					if replaceWithFBNE {
+						r = reFBNE.ReplaceAll(r, bFileBaseWithoutExtension)
+					}
+
 					if replaceWithKV {
 						founds = patternRegexp.FindAllSubmatch(record.Name, -1)
 						if len(founds) > 1 {
@@ -483,7 +525,7 @@ func init() {
 		"replacement. supporting capture variables. "+
 			" e.g. $1 represents the text of the first submatch. "+
 			"ATTENTION: for *nix OS, use SINGLE quote NOT double quotes or "+
-			`use the \ escape character. Record number is also supported by "{nr}".`+
+			`use the \ escape character. Record number and file name are also supported by "{nr}" and "{fn}". `+
 			`use ${1} instead of $1 when {kv} given!`)
 	replaceCmd.Flags().IntP("nr-width", "", 1, `minimum width for {nr} in flag -r/--replacement. e.g., formatting "1" to "001" by --nr-width 3`)
 	// replaceCmd.Flags().BoolP("by-name", "n", false, "replace full name instead of just id")
@@ -508,3 +550,6 @@ func init() {
 
 var reNR = regexp.MustCompile(`\{(NR|nr)\}`)
 var reKV = regexp.MustCompile(`\{(KV|kv)\}`)
+var reFN = regexp.MustCompile(`\{(FN|fn)\}`)
+var reFBN = regexp.MustCompile(`\{(FBN|fbn)\}`)
+var reFBNE = regexp.MustCompile(`\{(FBNE|fbne)\}`)
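
A standalone sketch of how the three file-name symbols expand, under stated assumptions: `expandFileSymbols` is a hypothetical helper, not a function in `replace.go`, and it uses `ReplaceAllLiteralString` so that a `$` in a file name is not treated as a capture reference (the hunk above uses `ReplaceAll`). As in the hunk, the base name is cut at the first dot, so `reads_1.fq.gz` becomes `reads_1`.

```go
package main

import (
	"fmt"
	"path/filepath"
	"regexp"
	"strings"
)

var (
	reFN   = regexp.MustCompile(`\{(FN|fn)\}`)
	reFBN  = regexp.MustCompile(`\{(FBN|fbn)\}`)
	reFBNE = regexp.MustCompile(`\{(FBNE|fbne)\}`)
)

// expandFileSymbols replaces {fn}, {fbn} and {fbne} in replacement with the
// file path, its base name, and its base name cut at the first dot.
func expandFileSymbols(replacement, file string) string {
	base := filepath.Base(file)
	baseNoExt := base
	if i := strings.Index(base, "."); i >= 0 {
		baseNoExt = base[:i]
	}
	r := reFN.ReplaceAllLiteralString(replacement, file)
	r = reFBN.ReplaceAllLiteralString(r, base)
	r = reFBNE.ReplaceAllLiteralString(r, baseNoExt)
	return r
}

func main() {
	fmt.Println(expandFileSymbols("{fn}__{fbn}__{fbne}", "../tests/hairpin.fa"))
	// ../tests/hairpin.fa__hairpin.fa__hairpin
	fmt.Println(expandFileSymbols("{FBNE}", "reads_1.fq.gz"))
	// reads_1
}
```
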


=====================================
seqkit/cmd/stat.go
=====================================
@@ -76,6 +76,7 @@ Columns:
   16. Q30(%)    percentage of bases with the quality score greater than 30
   17. AvgQual   average quality
   18. GC(%)     percentage of GC content
+  19. sum_n     number of ambiguous letters (N, n, X, x)
   
 Attention:
   1. Sequence length metrics (sum_len, min_len, avg_len, max_len, Q1, Q2, Q3)
@@ -109,6 +110,7 @@ Tips:
 		}
 		gapLettersBytes := []byte(gapLetters)
 		gcLettersBytes := []byte{'g', 'c', 'G', 'C'}
+		nLettersBytes := []byte{'X', 'x', 'N', 'n'}
 
 		skipFileCheck := getFlagBool(cmd, "skip-file-check")
 		all := getFlagBool(cmd, "all")
@@ -194,7 +196,7 @@ Tips:
 				"max_len",
 			}
 			if all {
-				colnames = append(colnames, []string{"Q1", "Q2", "Q3", "sum_gap", "N50", "N50_num", "Q20(%)", "Q30(%)", "AvgQual", "GC(%)"}...)
+				colnames = append(colnames, []string{"Q1", "Q2", "Q3", "sum_gap", "N50", "N50_num", "Q20(%)", "Q30(%)", "AvgQual", "GC(%)", "sum_n"}...)
 			}
 
 			if hasNX {
@@ -242,7 +244,7 @@ Tips:
 							info.lenAvg,
 							info.lenMax)
 						if all {
-							fmt.Fprintf(outfh, "\t%.1f\t%.1f\t%.1f\t%d\t%d\t%d\t%.2f\t%.2f\t%.2f\t%.2f",
+							fmt.Fprintf(outfh, "\t%.1f\t%.1f\t%.1f\t%d\t%d\t%d\t%.2f\t%.2f\t%.2f\t%.2f\t%d",
 								info.Q1,
 								info.Q2,
 								info.Q3,
@@ -252,7 +254,9 @@ Tips:
 								info.q20,
 								info.q30,
 								info.avgQual,
-								info.gc)
+								info.gc,
+								info.nSum,
+							)
 						}
 						if hasNX {
 							for _, x = range info.nx {
@@ -283,7 +287,7 @@ Tips:
 							info.lenAvg,
 							info.lenMax)
 						if all {
-							fmt.Fprintf(outfh, "\t%.1f\t%.1f\t%.1f\t%d\t%d\t%d\t%.2f\t%.2f\t%.2f\t%.2f",
+							fmt.Fprintf(outfh, "\t%.1f\t%.1f\t%.1f\t%d\t%d\t%d\t%.2f\t%.2f\t%.2f\t%.2f\t%d",
 								info.Q1,
 								info.Q2,
 								info.Q3,
@@ -293,7 +297,9 @@ Tips:
 								info.q20,
 								info.q30,
 								info.avgQual,
-								info.gc)
+								info.gc,
+								info.nSum,
+							)
 						}
 						if hasNX {
 							for _, x = range info.nx {
@@ -332,7 +338,7 @@ Tips:
 							info.lenAvg,
 							info.lenMax)
 						if all {
-							fmt.Fprintf(outfh, "\t%.1f\t%.1f\t%.1f\t%d\t%d\t%d\t%.2f\t%.2f\t%.2f\t%.2f",
+							fmt.Fprintf(outfh, "\t%.1f\t%.1f\t%.1f\t%d\t%d\t%d\t%.2f\t%.2f\t%.2f\t%.2f\t%d",
 								info.Q1,
 								info.Q2,
 								info.Q3,
@@ -342,7 +348,9 @@ Tips:
 								info.q20,
 								info.q30,
 								info.avgQual,
-								info.gc)
+								info.gc,
+								info.nSum,
+							)
 						}
 						if hasNX {
 							for _, x = range info.nx {
@@ -400,6 +408,7 @@ Tips:
 
 				var gapSum uint64
 				var gcSum uint64
+				var nSum uint64
 
 				lensStats := util.NewLengthStats()
 
@@ -478,6 +487,7 @@ Tips:
 
 						gapSum += uint64(byteutil.CountBytes(record.Seq.Seq, gapLettersBytes))
 						gcSum += uint64(byteutil.CountBytes(record.Seq.Seq, gcLettersBytes))
+						nSum += uint64(byteutil.CountBytes(record.Seq.Seq, nLettersBytes))
 					}
 				}
 
@@ -528,7 +538,7 @@ Tips:
 						file = stdinLabel
 					}
 					ch <- statInfo{file, seqFormat, t,
-						0, 0, 0, 0,
+						0, 0, 0, 0, 0,
 						0, 0, 0, 0,
 						0, 0, 0,
 						0, 0, 0, 0,
@@ -542,7 +552,7 @@ Tips:
 						file = stdinLabel
 					}
 					ch <- statInfo{file, seqFormat, t,
-						lensStats.Count(), lensStats.Sum(), gapSum, lensStats.Min(),
+						lensStats.Count(), lensStats.Sum(), gapSum, lensStats.Min(), nSum,
 						mathutil.Round(lensStats.Mean(), 1), lensStats.Max(), n50, l50,
 						q1, q2, q3,
 						mathutil.Round(float64(q20)/float64(lensStats.Sum())*100, 2),
@@ -601,6 +611,7 @@ Tips:
 				{Header: "Q30(%)", Align: stable.AlignRight, HumanizeNumbers: true},
 				{Header: "AvgQual", Align: stable.AlignRight, HumanizeNumbers: true},
 				{Header: "GC(%)", Align: stable.AlignRight, HumanizeNumbers: true},
+				{Header: "sum_n", Align: stable.AlignRight, HumanizeNumbers: true},
 				// {Header: "L50", AlignRight: true},
 			}...)
 		}
@@ -634,6 +645,7 @@ Tips:
 				row = append(row, info.q30)
 				row = append(row, info.avgQual)
 				row = append(row, info.gc)
+				row = append(row, info.nSum)
 			}
 			if hasNX {
 				for _, x = range info.nx {
@@ -656,6 +668,7 @@ type statInfo struct {
 	lenSum uint64
 	gapSum uint64
 	lenMin uint64
+	nSum   uint64
 
 	lenAvg float64
 	lenMax uint64
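
The new `sum_n` column is a running count of N, n, X and x across all records. A minimal standard-library sketch of that accumulation follows; `countBytes` here is a hypothetical stand-in written for this example, not the `byteutil.CountBytes` implementation.

```go
package main

import "fmt"

// countBytes counts how many bytes of seq occur in letters.
func countBytes(seq, letters []byte) int {
	var wanted [256]bool
	for _, b := range letters {
		wanted[b] = true
	}
	n := 0
	for _, b := range seq {
		if wanted[b] {
			n++
		}
	}
	return n
}

func main() {
	nLetters := []byte{'X', 'x', 'N', 'n'}
	seqs := [][]byte{[]byte("ACGTNNNNacgtn"), []byte("MKXxLV")}

	var nSum uint64
	for _, s := range seqs {
		nSum += uint64(countBytes(s, nLetters))
	}
	fmt.Println(nSum) // 7
}
```
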


=====================================
seqkit/cmd/version.go
=====================================
@@ -29,7 +29,7 @@ import (
 )
 
 // VERSION of seqkit
-const VERSION = "2.8.2"
+const VERSION = "2.9.0"
 
 // versionCmd represents the version command
 var versionCmd = &cobra.Command{


=====================================
seqkit/packaging.sh
=====================================
@@ -17,3 +17,5 @@ for f in seqkit_*; do
     rm -rf $f;
     cd ..;
 done;
+
+ls binaries/*.tar.gz | rush 'cd {/}; md5sum {%} > {%}.md5.txt'



View it on GitLab: https://salsa.debian.org/med-team/seqkit/-/compare/0d9b2783790c2961740f2003b84f44d50f7bcd05...41d03f49b7514777d0afaf7d67a9f3f33b240b4b
