[med-svn] [Git][med-team/fastp][upstream] New upstream version 0.24.0+dfsg

Dylan Aïssi (@daissi) gitlab at salsa.debian.org
Mon Feb 3 20:51:48 GMT 2025



Dylan Aïssi pushed to branch upstream at Debian Med / fastp


Commits:
a08b8141 by Dylan Aïssi at 2025-02-03T21:44:47+01:00
New upstream version 0.24.0+dfsg
- - - - -


10 changed files:

- README.md
- src/common.h
- src/main.cpp
- src/peprocessor.cpp
- src/read.cpp
- src/readpool.cpp
- src/seprocessor.cpp
- src/singleproducersingleconsumerlist.h
- src/threadconfig.cpp
- src/writerthread.cpp


Changes:

=====================================
README.md
=====================================
@@ -7,8 +7,12 @@ https://badges.debian.net/badges/debian/unstable/fastp/version.svg)](https://pac
 [![fastp ci](https://github.com/OpenGene/fastp/actions/workflows/ci.yml/badge.svg?branch=master)](https://github.com/OpenGene/fastp/actions/workflows/ci.yml)
 
 # fastp
-A tool designed to provide fast all-in-one preprocessing for FastQ files. This tool is developed in C++ with multithreading supported to afford high performance.
-- [fastp](#fastp)
+A tool designed to provide ultrafast all-in-one preprocessing and quality control for FastQ data.     
+
+This tool is designed for processing short reads (i.e. Illumina NovaSeq, MGI), if you are looking for tools to process long reads (i.e. Nanopore, PacBio, Cyclone), please use [fastplong](https://github.com/OpenGene/fastplong)  
+
+Citation: Shifu Chen. 2023. Ultrafast one-pass FASTQ data preprocessing, quality control, and deduplication using fastp. iMeta 2: e107. https://doi.org/10.1002/imt2.107
+
 - [features](#features)
 - [simple usage](#simple-usage)
 - [examples of report](#examples-of-report)
@@ -16,8 +20,8 @@ A tool designed to provide fast all-in-one preprocessing for FastQ files. This t
   - [install with Bioconda](#install-with-bioconda)
   - [or download the latest prebuilt binary for Linux users](#or-download-the-latest-prebuilt-binary-for-linux-users)
   - [or compile from source](#or-compile-from-source)
-    - [Step 1: download and build libisal](#step-1-download-and-build-libisal)
-    - [step 2: download and build libdeflate](#step-2-download-and-build-libdeflate)
+    - [Step 1: install isa-l](#step-1-install-isa-l)
+    - [step 2: install libdeflate](#step-2-install-libdeflate)
     - [Step 3: download and build fastp](#step-3-download-and-build-fastp)
 - [input and output](#input-and-output)
   - [output to STDOUT](#output-to-stdout)
@@ -103,28 +107,28 @@ This binary was compiled on CentOS, and tested on CentOS/Ubuntu
 wget http://opengene.org/fastp/fastp
 chmod a+x ./fastp
 
-# or download specified version, i.e. fastp v0.23.1
-wget http://opengene.org/fastp/fastp.0.23.1
-mv fastp.0.23.1 fastp
+# or download specified version, i.e. fastp v0.23.4
+wget http://opengene.org/fastp/fastp.0.23.4
+mv fastp.0.23.4 fastp
 chmod a+x ./fastp
 ```
 ## or compile from source
 `fastp` depends on `libdeflate` and `libisal`, while `libisal` is not compatible with gcc 4.8. If you use gcc 4.8, your fastp will fail to run. Please upgrade your gcc before you build the libraries and fastp.
 
-### Step 1: download and build libisal
-See https://github.com/intel/isa-l
-`autoconf`, `automake`, `libtools`, `nasm (>=v2.11.01)` and `yasm (>=1.2.0)` are required to build this isal
+### Step 1: install isa-l
+It's recommended that to install it using your package manager, for example `apt install isa-l` on ubuntu, or `brew install isa-l` on Mac. Otherwise you can compile it from source. Please be noted that `isa-l` is not compatible with gcc 4.8 or older versions. See https://github.com/intel/isa-l
+`autoconf`, `automake`, `libtools`, `nasm (>=2.11.01)` and `yasm (>=1.2.0)` are required to build isa-l.
 ```shell
 git clone https://github.com/intel/isa-l.git
 cd isa-l
 ./autogen.sh
 ./configure --prefix=/usr --libdir=/usr/lib64
-make
+make -j
 sudo make install
 ```
 
-### step 2: download and build libdeflate
-See https://github.com/ebiggers/libdeflate
+### step 2: install libdeflate
+It's recommended that to install it using your package manager, for example `apt install libdeflate` on ubuntu, or `brew install libdeflate` on Mac. Otherwise you can compile it from source. See https://github.com/ebiggers/libdeflate
 ```shell
 git clone https://github.com/ebiggers/libdeflate.git
 cd libdeflate
@@ -140,12 +144,11 @@ git clone https://github.com/OpenGene/fastp.git
 
 # build
 cd fastp
-make
+make -j
 
 # Install
 sudo make install
 ```
-You can add `-j8` option to `make/cmake` to use 8 threads for the compilation. 
 
 # input and output
 `fastp` supports both single-end (SE) and paired-end (PE) input/output.


=====================================
src/common.h
=====================================
@@ -1,7 +1,7 @@
 #ifndef COMMON_H
 #define COMMON_H
 
-#define FASTP_VER "0.23.4"
+#define FASTP_VER "0.24.0"
 
 #define _DEBUG false
 
@@ -29,20 +29,15 @@ const char ATCG_BASES[] = {'A', 'T', 'C', 'G'};
 
 #pragma pack() 
 
-// the limit of the queue to store the packs
-// error may happen if it generates more packs than this number
-static const int PACK_NUM_LIMIT  = 10000000;
 
 // how many reads one pack has
-static const int PACK_SIZE = 1000;
+static const int PACK_SIZE = 256;
 
 // if one pack is produced, but not consumed, it will be kept in the memory
 // this number limit the number of in memory packs
 // if the number of in memory packs is full, the producer thread should sleep
-static const int PACK_IN_MEM_LIMIT = 500;
+static const int PACK_IN_MEM_LIMIT = 128;
 
-// if read number is more than this, warn it
-static const int WARN_STANDALONE_READ_LIMIT = 10000;
 
 // different filtering results, bigger number means worse
 // if r1 and r2 are both failed, then the bigger one of the two results will be recorded
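
Read together, the two changed constants bound how many reads can sit in memory between producer and consumer. The sketch below works through that arithmetic. It assumes the limit applies per input list, which the diff does not state, and `maxBufferedReads` is a hypothetical helper, not a fastp function.

```cpp
// Upper bound on buffered reads: reads per pack times packs kept in memory.
// Assumption: the limit is applied per input list (not confirmed by the diff).
constexpr long maxBufferedReads(long packSize, long packInMemLimit) {
    return packSize * packInMemLimit;
}

// Values from fastp 0.23.4 (PACK_SIZE = 1000, PACK_IN_MEM_LIMIT = 500).
static_assert(maxBufferedReads(1000, 500) == 500000, "old bound");

// Values from fastp 0.24.0 (PACK_SIZE = 256, PACK_IN_MEM_LIMIT = 128).
static_assert(maxBufferedReads(256, 128) == 32768, "new bound");
```

Under that assumption the bound shrinks by roughly a factor of 15.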


=====================================
src/main.cpp
=====================================
@@ -319,7 +319,7 @@ int main(int argc, char* argv[]){
             || cmd.exist("cut_front_window_size") || cmd.exist("cut_front_mean_quality") 
             || cmd.exist("cut_tail_window_size") || cmd.exist("cut_tail_mean_quality") 
             || cmd.exist("cut_right_window_size") || cmd.exist("cut_right_mean_quality"))
-            cerr << "WARNING: you specified the options for cutting by quality, but forogt to enable any of cut_front/cut_tail/cut_right. This will have no effect." << endl;
+            cerr << "WARNING: you specified the options for cutting by quality, but forgot to enable any of cut_front/cut_tail/cut_right. This will have no effect." << endl;
     }
 
     // quality filtering


=====================================
src/peprocessor.cpp
=====================================
@@ -58,6 +58,8 @@ PairEndProcessor::~PairEndProcessor() {
         delete mRightReadPool;
         mRightReadPool = NULL;
     }
+    delete[] mLeftInputLists;
+    delete[] mRightInputLists;
 }
 
 void PairEndProcessor::initOutput() {


=====================================
src/read.cpp
=====================================
@@ -56,7 +56,6 @@ Read* Read::reverseComplement(){
 	string seq = Sequence::reverseComplement(mSeq);
 	string qual;
 	qual.assign(mQuality->rbegin(), mQuality->rend());
-	string* strand=new string("+");
 	return new Read(mName->c_str(), seq.c_str(), "+", qual.c_str());
 }
 
@@ -181,7 +180,9 @@ bool Read::fixMGI() {
 	int len = mName->length();
 	if((*mName)[len-1]=='1' || (*mName)[len-1]=='2') {
 		if((*mName)[len-2] == '/') {
-			mName = new string(mName->substr(0, len-2) + " " + mName->substr(len-2, 2));
+			string* newName = new string(mName->substr(0, len-2) + " " + mName->substr(len-2, 2));
+			delete mName;
+			mName = newName;
 			return true;
 		}
 	}
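
The `fixMGI` hunk above fixes a leak: the old code overwrote `mName` with a fresh allocation without freeing the previous string. A minimal, self-contained sketch of the corrected pattern follows. `ReadName` is a hypothetical stand-in for fastp's `Read` class, and the length guard is an addition that does not appear in the diff.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for fastp's Read; only the owned name is modelled.
struct ReadName {
    std::string* mName;

    explicit ReadName(const std::string& name) : mName(new std::string(name)) {}
    ~ReadName() { delete mName; }
    ReadName(const ReadName&) = delete;
    ReadName& operator=(const ReadName&) = delete;

    // Rewrites a trailing "/1" or "/2" as " /1" or " /2".
    bool fixMGI() {
        const std::size_t len = mName->length();
        if (len < 2) return false;  // guard added for this sketch only
        const char last = (*mName)[len - 1];
        if ((last == '1' || last == '2') && (*mName)[len - 2] == '/') {
            // Build the replacement first, then free the old string,
            // then take ownership of the new one.
            std::string* newName = new std::string(
                mName->substr(0, len - 2) + " " + mName->substr(len - 2, 2));
            delete mName;
            mName = newName;
            return true;
        }
        return false;
    }
};
```

Holding the name by value (`std::string mName;`) would avoid the manual `delete` entirely, but that would be a larger change than the one this commit makes.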


=====================================
src/readpool.cpp
=====================================
@@ -31,8 +31,15 @@ bool ReadPool::input(int tid, Read* data) {
 }
 
 void ReadPool::cleanup() {
-    //TODO: delete unused pooled Reads.
-    //But since this is only called when the program exits, the one-by-one deletion can be skipped to save time
+    for(int t=0; t<mOptions->thread; t++) {
+        while(mBufferLists[t]->canBeConsumed()) {
+            Read* r = mBufferLists[t]->consume();
+            mConsumed++;
+            delete r;
+        }
+        delete mBufferLists[t];
+    }
+    delete[] mBufferLists;
 }
 
 void ReadPool::initBufferLists() {
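
The new `ReadPool::cleanup` frees memory at three levels: the pooled items, each per-thread list, and the array that holds the lists. The sketch below shows the same ordering with standard containers. `Item` and `Pool` are hypothetical types, and `std::queue` replaces fastp's `SingleProducerSingleConsumerList` purely for illustration.

```cpp
#include <cstddef>
#include <queue>

// Hypothetical item that counts live instances so leaks are observable.
struct Item {
    static int live;
    Item() { ++live; }
    ~Item() { --live; }
};
int Item::live = 0;

// Hypothetical pool: a heap array of heap-allocated queues, one per thread,
// each queue owning the Item pointers stored in it.
class Pool {
public:
    explicit Pool(std::size_t threads)
        : mThreads(threads), mLists(new std::queue<Item*>*[threads]) {
        for (std::size_t t = 0; t < mThreads; t++)
            mLists[t] = new std::queue<Item*>();
    }
    ~Pool() { cleanup(); }
    Pool(const Pool&) = delete;
    Pool& operator=(const Pool&) = delete;

    void input(std::size_t tid, Item* item) { mLists[tid]->push(item); }

    void cleanup() {
        if (mLists == nullptr) return;      // already cleaned up
        for (std::size_t t = 0; t < mThreads; t++) {
            while (!mLists[t]->empty()) {   // 1. delete the leftover items
                delete mLists[t]->front();
                mLists[t]->pop();
            }
            delete mLists[t];               // 2. delete the list itself
        }
        delete[] mLists;                    // 3. delete the array of lists
        mLists = nullptr;                   // makes a second call harmless
    }

private:
    std::size_t mThreads;
    std::queue<Item*>** mLists;
};
```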


=====================================
src/seprocessor.cpp
=====================================
@@ -41,6 +41,7 @@ SingleEndProcessor::~SingleEndProcessor() {
         delete mReadPool;
         mReadPool = NULL;
     }
+    delete[] mInputLists;
 }
 
 void SingleEndProcessor::initOutput() {


=====================================
src/singleproducersingleconsumerlist.h
=====================================
@@ -83,6 +83,8 @@ public:
             blocks[recycled & blocksRingBufferSizeMask] = NULL;
             recycled++;
         }
+        delete[] blocks;
+        blocks = NULL;
     }
     inline size_t size() {
         return produced -  consumed;


=====================================
src/threadconfig.cpp
=====================================
@@ -41,6 +41,26 @@ void ThreadConfig::cleanup() {
         delete mRightInputList;
         mRightInputList = NULL;
     }
+    if(mPreStats1) {
+        delete mPreStats1;
+        mPreStats1 = NULL;
+    }
+    if(mPostStats1) {
+        delete mPostStats1;
+        mPostStats1 = NULL;
+    }
+    if(mPreStats2) {
+        delete mPreStats2;
+        mPreStats2 = NULL;
+    }
+    if(mPostStats2) {
+        delete mPostStats2;
+        mPostStats2 = NULL;
+    }
+    if(mFilterResult) {
+        delete mFilterResult;
+        mFilterResult = NULL;
+    }
 }
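
The five blocks added to `ThreadConfig::cleanup` repeat the same delete-then-null step. In C++, `delete` on a null pointer is a no-op, so the `if` guards are not strictly needed, and the step could be factored into a helper. The sketch below shows one way. `deleteAndNull` is a hypothetical name and is not part of fastp.

```cpp
// Hypothetical helper: frees the object and resets the pointer so that a
// repeated cleanup call does not double-free.
template <typename T>
void deleteAndNull(T*& ptr) {
    delete ptr;     // safe when ptr is already null
    ptr = nullptr;
}
```

With such a helper, each block would reduce to a single call, for example `deleteAndNull(mPreStats1);`.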
 
 


=====================================
src/writerthread.cpp
=====================================
@@ -55,6 +55,11 @@ void WriterThread::input(int tid, string* data) {
 
 void WriterThread::cleanup() {
     deleteWriter();
+    for(int t=0; t<mOptions->thread; t++) {
+        delete mBufferLists[t];
+    }
+    delete[] mBufferLists;
+    mBufferLists = NULL;
 }
 
 void WriterThread::deleteWriter() {



View it on GitLab: https://salsa.debian.org/med-team/fastp/-/commit/a08b8141edce713537662ccbeeebfd332f673f4a


