[med-svn] [Git][med-team/libssw][upstream] New upstream version 1.2.4

Andreas Tille (@tille) gitlab at salsa.debian.org
Tue May 24 10:42:17 BST 2022



Andreas Tille pushed to branch upstream at Debian Med / libssw


Commits:
c9c6106f by Andreas Tille at 2022-05-24T07:21:16+02:00
New upstream version 1.2.4
- - - - -


26 changed files:

- README.md
- + demo/blosum62.txt
- + demo/new.txt
- + demo/old.txt
- + demo/pRead.fa
- + demo/pRef.fa
- + demo/protein1.fa
- + demo/protein2.fa
- + demo/query.fastq
- + demo/query2.fa
- + demo/r1.fa
- + demo/r1_query.fq
- + demo/target.fastq
- + demo/target2.fa
- − src/.gitignore
- src/Makefile
- src/example.cpp
- src/kseq.h
- src/main.c
- src/pyssw.py
- + src/sse2neon.h
- src/ssw.c
- src/ssw.h
- src/ssw_cpp.cpp
- src/ssw_cpp.h
- src/ssw_lib.py


Changes:

=====================================
README.md
=====================================
@@ -1,4 +1,6 @@
-#SSW Library: An SIMD Smith-Waterman C/C++/Python/Java Library for Use in Genomic Applications
+# SSW Library
+
+## An SIMD Smith-Waterman C/C++/Python/Java/R Library for Use in Genomic Applications
 
 License: MIT
 
@@ -10,21 +12,24 @@ The above copyright notice and this permission notice shall be included in all c
 
 THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
 
-Author:  
-C implementation: Mengyao Zhao  
-C++ wrapper: Wan-Ping Lee  
-Python wrapper: Yongan Zhao  
-Java wrapper: Daniel Cameron
+Authors:
+* C implementation: Mengyao Zhao
+* C++ wrapper: Wan-Ping Lee
+* Python wrapper: Yongan Zhao
+* Java wrapper: Daniel Cameron
+* R package: Nan Xiao
+
+Contact:
+* Mengyao Zhao <zhaomengyao at gmail.com>
+* Wan-Ping Lee <wanping.lee at gmail.com>
+* Yongan Zhao <zhaoyanswill at gmail.com>
+* Daniel Cameron <cameron.d at wehi.edu.au>
+* Nan Xiao <me at nanx.me>
 
-Contact:  
-Mengyao Zhao <zhaomengyao at gmail.com>  
-Wan-Ping Lee <wanping.lee at gmail.com>  
-Yongan Zhao <zhaoyanswill at gmail.com>   
-Daniel Cameron <cameron.d at wehi.edu.au>  
+Last revision: 2022-May-20
 
-Last revision: 06/29/2016
+## Overview
 
-##Overview
 SSW is a fast implementation of the Smith-Waterman algorithm, which uses the Single-Instruction Multiple-Data (SIMD) instructions to parallelize the algorithm at the instruction level. SSW library provides an API that can be flexibly used by programs written in C, C++ and other languages. We also provide a software that can do protein and genome alignment directly. Current version of our implementation is ~50 times faster than an ordinary Smith-Waterman. It can return the Smith-Waterman score, alignment location and traceback path (cigar) of the optimal alignment accurately; and return the sub-optimal alignment score and location heuristically.
 
 The Debian package of this library can be achieved here:
@@ -34,45 +39,49 @@ Note: When SSW open a gap, the gap open penalty alone is applied.
 
 ## C/C++ interface
 
-###How to use the API
-The API files include ssw.h and ssw.c, which can be directly used by any C or C++ program. For the C++ users who are more comfortable to use a C++ style interface, an additional C++ wrapper is provided with the file ssw_cpp.cpp and ssw_cpp.h.
+### How to use the API
+
+The API files include `ssw.h` and `ssw.c`, which can be directly used by any C or C++ program. For the C++ users who are more comfortable to use a C++ style interface, an additional C++ wrapper is provided with the file `ssw_cpp.cpp` and `ssw_cpp.h`.
 
 To use the C style API, please:
 
-1. Download ssw.h and ssw.c, and put them in the same folder of your own program files.
-2. Write #include "ssw.h" into your file that will call the API functions.
+1. Download `ssw.h` and `ssw.c`, and put them in the same folder of your own program files.
+2. Write `#include "ssw.h"` into your file that will call the API functions.
 3. The API files are ready to be compiled together with your own C/C++ files.
 
-The API function descriptions are in the file ssw.h. One simple example of the API usage is example.c. The Smith-Waterman penalties need to be integers. Small penalty numbers such as: match: 2, mismatch: -1, gap open (the total penalty when one gap is opened): -3, gap extension: -1 are recommended, which will lead to shorter running time.  
+The API function descriptions are in the file `ssw.h`. One simple example of the API usage is `example.c`. The Smith-Waterman penalties need to be integers. Small penalty numbers such as: match: 2, mismatch: -1, gap open (the total penalty when one gap is opened): -3, gap extension: -1 are recommended, which will lead to shorter running time.
 
 To use the C++ style API, please:
 
-1. Download ssw.h, ssw.c, ssw_cpp.cpp and ssw_cpp.h and put them in the same folder of your own program files.
-2. Write #include "ssw_cpp.h" into your file that will call the API functions.
+1. Download `ssw.h`, `ssw.c`, `ssw_cpp.cpp` and `ssw_cpp.h`, put them in the same folder of your own program files.
+2. Write `#include "ssw_cpp.h"` into your file that will call the API functions.
 3. The API files are ready to be compiled together with your own C/C++ files.
 
-The API function descriptions are in the file ssw_cpp.h. A simple example of using the C++ API is example.cpp.
+The API function descriptions are in the file `ssw_cpp.h`. A simple example of using the C++ API is `example.cpp`.
+
+### Speed and memory usage of the API
 
-###Speed and memory usage of the API
-Test data set: 
-Target sequence: reference genome of E. coli strain 536 (4,938,920 nucleotides) from NCBI
-Query sequences: 1000 reads of Ion Torrent sequenced E. coli strain DH10B (C23-140, 318 PGM Run, 11/2011), read length: ~25-540 bp, most reads are ~200 bp
+Test data set:
+
+* Target sequence: reference genome of _E. coli_ strain 536 (4,938,920 nucleotides) from NCBI
+* Query sequences: 1000 reads of Ion Torrent sequenced _E. coli_ strain DH10B (C23-140, 318 PGM Run, 11/2011), read length: ~25-540 bp, most reads are ~200 bp
 
 CPU time:
 
 * AMD CPU: default penalties: ~880 seconds; -m1 -x3 -o5 -e2: ~460 seconds
-* Intel CPU: default penalties: ~960 seconds; -m1 -x3 -o5 -e2: ~500 seconds 
+* Intel CPU: default penalties: ~960 seconds; -m1 -x3 -o5 -e2: ~500 seconds
 
 Memory usage: ~40MB
- 
-###Install the software
+
+### Install the software
 
 1. Download the software from https://github.com/mengyao/Complete-Striped-Smith-Waterman-Library.
-2. cd src
-3. make
-4. The executable file will be ssw_test.
+2. `cd src`
+3. `make`
+4. The executable file will be `ssw_test`.
+
+### Run the software
 
-###Run the software
 ```
 Usage: ssw_test [options] ... <target.fasta> <query.fasta>(or <query.fastq>)
 Options:
@@ -89,15 +98,18 @@ Options:
 	-h	If -s is used, include header in SAM output.
 ```
 
-###Software input
+### Software input
+
 The input files can be in FASTA or FASTQ format. Both target and query files can contain multiple sequences. Each sequence in the query file will be aligned with all sequences in the target file. If your target file has N sequences and your query file has M sequences, the results will have M*N alignments.
 
-###Software output
-The software can output SAM format or BLAST like format results. 
+### Software output
+
+The software can output SAM format or BLAST like format results.
 
 1. SAM format output:
 
 Example:
+
 ```
 @HD VN:1.4  SO:queryname
 @SQ SN:chr1 LN:1001
@@ -111,7 +123,9 @@ Y:26750420:R:-132;None;None/1   0   chr1    120 4   2M1I4M3D3M1I7M2I9M2D6M1I8M
 ```
 
 What is the output?
-Please check the document "The SAM Format Specification" at: http://samtools.sourceforge.net/SAM1.pdf for the full description.
+
+Please check the document "The SAM Format Specification" at: http://samtools.github.io/hts-specs/SAMv1.pdf for the full description.
+
 The additional optional field "ZS" indicates the suboptimal alignment score. For example, the 1st record in the upper example means the optimal alignment score of the given sequence is 37; the suboptimal alignment score is 28; the mismatch and INDEL base count within the aligned fragment of the read is 11.
 
 2. An example of the BLAST like output:
@@ -135,24 +149,33 @@ Query:         3    GA-AGAGTTAATTTAAGTCACTTCAAACAGATTAC-GTA-TCTTTT-TTTTCCCT    5
 ...
 ```
 
-##Python interface
+## Python interface
+
+### How to use the Python wrapper `ssw_lib.py`
 
-###How to use the python wrapper ssw_lib.py
+`ssw_lib.py` is a Python wrapper that fully supports APIs of the C library. To use this Python library, C programming knowledge is not required.
 
-ssw_lib.py is a Python wrapper that fully supports APIs of the C library. To use this Python library, C programming knowledge is not required.
+To use the Python wrapper, please:
 
-To use the python wrapper, please:
+1. Compile the `src` folder by either using the `makefile` or by compiling a dynamic shared library with `gcc`:
+
+```
+gcc -Wall -O3 -pipe -fPIC -shared -rdynamic -o libssw.so ssw.c ssw.h
+```
+
+2. Put `libssw.so` and `ssw_lib.py` in the same directory of your own program files or directories in sys.paths.
+
+3. The `LD_LIBRARY_PATH` environment variable may need to be modified to include the directory of the dynamic library `libssw.so` by one of the two following mathods:
 
-1. Compile the src folder by either using the makefile or by compiling a dynamic shared library with gcc  
-```gcc -Wall -O3 -pipe -fPIC -shared -rdynamic -o libssw.so ssw.c ssw.h```
-2. Put libssw.so and ssw_lib.py in the same directory of your own program files or directories in sys.paths.
-3. The LD_LIBRARY_PATH environment variable may need to be modified to include the directory of the dynamic library libssw.so by one of the two following mathods:
     * ```export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:path_of_libssw.so```
-    * For a definitive inclusion edit /etc/ld.so.conf and add the path of the libssw.so. Then, update the cache by ```/sbin/ldconfig```.
-4. In a python script or in a interactive interpreter, import the CSsw class by: ```from ssw_lib import CSsw``` or ```import ssw_lib``` and then call ```ssw_lib.CSsw```
-5. Call APIs with input parameters and parse the results (Please see pyssw.py as an example).
+    * For a definitive inclusion edit `/etc/ld.so.conf` and add the path of the `libssw.so`. Then, update the cache by `/sbin/ldconfig`.
+
+4. In a Python script or in a interactive interpreter, import the CSsw class by: `from ssw_lib import CSsw` or `import ssw_lib` and then call `ssw_lib.CSsw`.
+
+5. Call APIs with input parameters and parse the results (Please see `pyssw.py` as an example).
+
+### Run pyssw standalone
 
-###Run pyssw standalone 
 ```
 usage: pyssw.py [-h] [-l SLIBPATH] [-m NMATCH] [-x NMISMATCH] [-o NOPEN]
                 [-e NEXT] [-p] [-a SMATRIX] [-c] [-f NTHR] [-r] [-s] [-header]
@@ -193,38 +216,46 @@ optional arguments:
   -header, --bHeader    If -s is used, include header in SAM output.
 ```
 
-###pyssw output
+### pyssw output
+
+The software can output SAM format or BLAST like format results.
 
-The software can output SAM format or BLAST like format results. 
+### Speed and memory usage of pyssw
 
-###Speed and memory usage of pyssw
+The speed and memory are about the same as the C library.
 
-The speed and memory are about the same as the c library.
-	
-##Java interface
+## Java interface
 
-The java wrapper is a thin JNI (Java Native Interface) wrapper around the native C implementation.
+The Java wrapper is a thin JNI (Java Native Interface) wrapper around the native C implementation.
 
-###Building
+### Building
 
-Only the C, C++ and C shared libraries are generated from the default make goal, and as such, the java interface must be built explicitly.
+Only the C, C++, and C shared libraries are generated from the default make goal, and as such, the Java interface must be built explicitly.
 
-1. Ensure `javac` and `jar` are on path
-2. Ensure JAVA_HOME is set to an installed JRE or JDK, or the JNI include directory is included in the C system library search path
+1. Ensure `javac` and `jar` are in `PATH`.
+2. Ensure `JAVA_HOME` is set to an installed JRE or JDK, or the JNI include directory is included in the C system library search path.
 3. `make java` from the src directory.
-4. libsswjni.so and ssw.jar should be built.
+4. `libsswjni.so` and `ssw.jar` should be built.
+
+### How to use the Java wrapper
+
+The Java wrapper consist of the following components:
+
+* `libsswjni.so`: native C library exposing the JNI entry points
+* `ssw.jar` is a Java library containing the Java interface to the native C library. This small wrapper library is composed of:
+
+    * `ssw.Aligner` Java class: a thread-safe static class that exposes two `align()` methods. The first exposes the SSW C library directly. No error checking is performed on arguments passed to this method and misuse is highly likely to crash the JVM. The second `align()` method is a more user-friendly entry point that exposes a simpler API and performs some basic error checking.
+    * `ssw.Alignment` Java class: this class stores alignment results. Each each field has a direct correspondence and identical meaning to the C s_align struct.
+    * `ssw.Example` Java class: Java version of the example_c sample code. Run `java -jar ssw.jar` to execute the sample.
 
-###How to use the java wrapper
+To use the library, either reference the `ssw.jar` or including the Aligner and Alignment classes directly. As for any JNI library, the native library must be loaded (using `System.loadLibrary("sswjni")` or similar) before invokation of native methods. For the JVM to find the library, ensure that either the library is included in the `LD_LIBRARY_PATH` environment variable, or `-Djava.library.path=<directory containing libsswjni.so>` is set on the Java command line.
 
-The java wrapper consist of the following components:
+## R package
 
-* libsswjni.so: native C library exposing the JNI entry points
-* ssw.jar is a java library containing the Java interface to the native C library. This small wrapper library is composed of:
-    * ssw.Aligner java class: a thread-safe static class that exposes two `align()` methods. The first exposes the SSW C library directly. No error checking is performed on arguments passed to this method and misuse is highly likely to crash the JVM. The second align() method is a more user-friendly entry point that exposes a simpler API and performs some basic error checking.
-    * ssw.Alignment java class: this class stores alignment results. Each each field has a direct correspondence and identical meaning to the C s_align struct.
-    * ssw.Example java class: java version of the example_c sample code. Run `java -jar ssw.jar` to execute the sample.
+Please check all the R package information here:
+https://github.com/nanxstats/ssw-r
 
-To use the library, either reference the ssw.jar or including the Aligner and Alignment classes directly. As for any JNI library, the native library must be loaded (using `System.loadLibrary("sswjni")` or similar) before invokation of native methods. For the JVM to find the library, ensure that either the library is included in the LD_LIBRARY_PATH environment variable, or `-Djava.library.path=<directory containing libsswjni.so>` is set on the java command line.
+## Citation
 
-###Please cite this paper, if you need:
-http://dx.plos.org/10.1371/journal.pone.0082138
+Please cite this paper, if you need:
+https://doi.org/10.1371/journal.pone.0082138


=====================================
demo/blosum62.txt
=====================================
@@ -0,0 +1,31 @@
+#  Matrix made by matblas from blosum62.iij
+#  * column uses minimum score
+#  BLOSUM Clustered Scoring Matrix in 1/2 Bit Units
+#  Blocks Database = /data/blocks_5.0/blocks.dat
+#  Cluster Percentage: >= 62
+#  Entropy =   0.6979, Expected =  -0.5209
+   A  R  N  D  C  Q  E  G  H  I  L  K  M  F  P  S  T  W  Y  V  B  Z  X  *
+A  4 -1 -2 -2  0 -1 -1  0 -2 -1 -1 -1 -1 -2 -1  1  0 -3 -2  0 -2 -1  0 -4 
+R -1  5  0 -2 -3  1  0 -2  0 -3 -2  2 -1 -3 -2 -1 -1 -3 -2 -3 -1  0 -1 -4 
+N -2  0  6  1 -3  0  0  0  1 -3 -3  0 -2 -3 -2  1  0 -4 -2 -3  3  0 -1 -4 
+D -2 -2  1  6 -3  0  2 -1 -1 -3 -4 -1 -3 -3 -1  0 -1 -4 -3 -3  4  1 -1 -4 
+C  0 -3 -3 -3  9 -3 -4 -3 -3 -1 -1 -3 -1 -2 -3 -1 -1 -2 -2 -1 -3 -3 -2 -4 
+Q -1  1  0  0 -3  5  2 -2  0 -3 -2  1  0 -3 -1  0 -1 -2 -1 -2  0  3 -1 -4 
+E -1  0  0  2 -4  2  5 -2  0 -3 -3  1 -2 -3 -1  0 -1 -3 -2 -2  1  4 -1 -4 
+G  0 -2  0 -1 -3 -2 -2  6 -2 -4 -4 -2 -3 -3 -2  0 -2 -2 -3 -3 -1 -2 -1 -4 
+H -2  0  1 -1 -3  0  0 -2  8 -3 -3 -1 -2 -1 -2 -1 -2 -2  2 -3  0  0 -1 -4 
+I -1 -3 -3 -3 -1 -3 -3 -4 -3  4  2 -3  1  0 -3 -2 -1 -3 -1  3 -3 -3 -1 -4 
+L -1 -2 -3 -4 -1 -2 -3 -4 -3  2  4 -2  2  0 -3 -2 -1 -2 -1  1 -4 -3 -1 -4 
+K -1  2  0 -1 -3  1  1 -2 -1 -3 -2  5 -1 -3 -1  0 -1 -3 -2 -2  0  1 -1 -4 
+M -1 -1 -2 -3 -1  0 -2 -3 -2  1  2 -1  5  0 -2 -1 -1 -1 -1  1 -3 -1 -1 -4 
+F -2 -3 -3 -3 -2 -3 -3 -3 -1  0  0 -3  0  6 -4 -2 -2  1  3 -1 -3 -3 -1 -4 
+P -1 -2 -2 -1 -3 -1 -1 -2 -2 -3 -3 -1 -2 -4  7 -1 -1 -4 -3 -2 -2 -1 -2 -4 
+S  1 -1  1  0 -1  0  0  0 -1 -2 -2  0 -1 -2 -1  4  1 -3 -2 -2  0  0  0 -4 
+T  0 -1  0 -1 -1 -1 -1 -2 -2 -1 -1 -1 -1 -2 -1  1  5 -2 -2  0 -1 -1  0 -4 
+W -3 -3 -4 -4 -2 -2 -3 -2 -2 -3 -2 -3 -1  1 -4 -3 -2 11  2 -3 -4 -3 -2 -4 
+Y -2 -2 -2 -3 -2 -1 -2 -3  2 -1 -1 -2 -1  3 -3 -2 -2  2  7 -1 -3 -2 -1 -4 
+V  0 -3 -3 -3 -1 -2 -2 -3 -3  3  1 -2  1 -1 -2 -2  0 -3 -1  4 -3 -2 -1 -4 
+B -2 -1  3  4 -3  0  1 -1  0 -3 -4  0 -3 -3 -2  0 -1 -4 -3 -3  4  1 -1 -4 
+Z -1  0  0  1 -3  3  4 -2  0 -3 -3  1 -1 -3 -1  0 -1 -3 -2 -2  1  4 -1 -4 
+X  0 -1 -1 -1 -2 -1 -1 -1 -1 -1 -1 -1 -1 -1 -2  0  0 -2 -1 -1 -1 -1 -1 -4 
+* -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4 -4  1


=====================================
demo/new.txt
=====================================
@@ -0,0 +1,400 @@
+target_name: chr3
+query_name: 6:163296599:F:198;None;None/1
+optimal_alignment_score: 53	suboptimal_alignment_score: 52	strand: +	target_end: 745864	query_end: 54
+
+target_name: chr3
+query_name: 3:153409880:F:224;None;3,153410143,G,A/1
+optimal_alignment_score: 47	suboptimal_alignment_score: 45	strand: +	target_end: 350986	query_end: 53
+
+target_name: chr3
+query_name: Y:26750420:R:-132;None;None/1
+optimal_alignment_score: 50	suboptimal_alignment_score: 49	strand: +	target_end: 319376	query_end: 52
+
+target_name: chr3
+query_name: 13:91170622:R:-276;None;None/1
+optimal_alignment_score: 49	suboptimal_alignment_score: 48	strand: +	target_end: 353946	query_end: 49
+
+target_name: chr3
+query_name: 15:37079528:R:-240;None;None/1
+optimal_alignment_score: 40	suboptimal_alignment_score: 39	strand: +	target_end: 257705	query_end: 53
+
+target_name: chr3
+query_name: 9:92308501:R:-176;None;None/1
+optimal_alignment_score: 80	suboptimal_alignment_score: 70	strand: +	target_end: 105983	query_end: 48
+
+target_name: chr3
+query_name: 3:109533106:R:-128;None;None/1
+optimal_alignment_score: 42	suboptimal_alignment_score: 42	strand: +	target_end: 198965	query_end: 47
+
+target_name: chr3
+query_name: 16:32366683:R:-158;16,32366730,G,A;None/1
+optimal_alignment_score: 50	suboptimal_alignment_score: 46	strand: +	target_end: 823370	query_end: 52
+
+target_name: chr3
+query_name: 13:111809970:R:-237;None;None/1
+optimal_alignment_score: 43	suboptimal_alignment_score: 40	strand: +	target_end: 509164	query_end: 53
+
+target_name: chr3
+query_name: 3:38871305:F:159;None;None/1
+optimal_alignment_score: 39	suboptimal_alignment_score: 37	strand: +	target_end: 841259	query_end: 54
+
+target_name: chr3
+query_name: 19:8434785:F:295;None;None/1
+optimal_alignment_score: 47	suboptimal_alignment_score: 43	strand: +	target_end: 330617	query_end: 53
+
+target_name: chr3
+query_name: 8:132563397:R:-144;None;None/1
+optimal_alignment_score: 49	suboptimal_alignment_score: 40	strand: +	target_end: 54827	query_end: 54
+
+target_name: chr3
+query_name: 3:190507952:R:-57;3,190507958,G,A;None/1
+optimal_alignment_score: 41	suboptimal_alignment_score: 41	strand: +	target_end: 138110	query_end: 54
+
+target_name: chr3
+query_name: 10:59883855:R:-120;None;None/1
+optimal_alignment_score: 43	suboptimal_alignment_score: 43	strand: +	target_end: 244247	query_end: 53
+
+target_name: chr3
+query_name: 6:148630669:R:-271;None;None/1
+optimal_alignment_score: 41	suboptimal_alignment_score: 39	strand: +	target_end: 658593	query_end: 54
+
+target_name: chr3
+query_name: 3:64518566:R:-93;None;None/1
+optimal_alignment_score: 46	suboptimal_alignment_score: 45	strand: +	target_end: 881546	query_end: 44
+
+target_name: chr3
+query_name: 14:42514746:R:-187;None;None/1
+optimal_alignment_score: 58	suboptimal_alignment_score: 58	strand: +	target_end: 206906	query_end: 48
+
+target_name: chr3
+query_name: 7:126497810:R:-191;None;None/1
+optimal_alignment_score: 45	suboptimal_alignment_score: 40	strand: +	target_end: 232372	query_end: 54
+
+target_name: chr3
+query_name: 21:15123085:R:-140;None;None/1
+optimal_alignment_score: 43	suboptimal_alignment_score: 43	strand: +	target_end: 492310	query_end: 51
+
+target_name: chr3
+query_name: 2:217172401:R:-226;None;None/1
+optimal_alignment_score: 42	suboptimal_alignment_score: 41	strand: +	target_end: 261738	query_end: 54
+
+target_name: chr3
+query_name: 6:129456169:R:-167;None;None/1
+optimal_alignment_score: 42	suboptimal_alignment_score: 39	strand: +	target_end: 750865	query_end: 53
+
+target_name: chr3
+query_name: 4:160333309:R:-131;None;4,160333230,T,G/1
+optimal_alignment_score: 42	suboptimal_alignment_score: 41	strand: +	target_end: 465814	query_end: 54
+
+target_name: chr3
+query_name: 3:37890126:R:-228;None;None/1
+optimal_alignment_score: 40	suboptimal_alignment_score: 40	strand: +	target_end: 661952	query_end: 53
+
+target_name: chr3
+query_name: 17:49899522:F:154;None;None/1
+optimal_alignment_score: 47	suboptimal_alignment_score: 46	strand: +	target_end: 94043	query_end: 51
+
+target_name: chr3
+query_name: 8:138662818:R:-211;None;8,138662609,C,T/1
+optimal_alignment_score: 43	suboptimal_alignment_score: 41	strand: +	target_end: 175380	query_end: 50
+
+target_name: chr3
+query_name: 3:97260940:R:-254;None;None/1
+optimal_alignment_score: 39	suboptimal_alignment_score: 38	strand: +	target_end: 903012	query_end: 54
+
+target_name: chr3
+query_name: 7:14401638:F:350;None;None/1
+optimal_alignment_score: 50	suboptimal_alignment_score: 47	strand: +	target_end: 265434	query_end: 48
+
+target_name: chr3
+query_name: 13:19068477:F:246;None;None/1
+optimal_alignment_score: 47	suboptimal_alignment_score: 45	strand: +	target_end: 628991	query_end: 54
+
+target_name: chr3
+query_name: 3:179655047:F:206;None;None/1
+optimal_alignment_score: 45	suboptimal_alignment_score: 44	strand: +	target_end: 476787	query_end: 51
+
+target_name: chr3
+query_name: 20:15001091:F:174;None;None/1
+optimal_alignment_score: 44	suboptimal_alignment_score: 43	strand: +	target_end: 226051	query_end: 51
+
+target_name: chr3
+query_name: 2:54967337:R:-153;None;None/1
+optimal_alignment_score: 43	suboptimal_alignment_score: 42	strand: +	target_end: 689523	query_end: 54
+
+target_name: chr3
+query_name: 18:39435195:F:202;None;None/1
+optimal_alignment_score: 51	suboptimal_alignment_score: 45	strand: +	target_end: 998370	query_end: 53
+
+target_name: chr3
+query_name: X:68133790:F:167;None;None/1
+optimal_alignment_score: 43	suboptimal_alignment_score: 40	strand: +	target_end: 612440	query_end: 50
+
+target_name: chr3
+query_name: 9:18241281:F:267;None;None/1
+optimal_alignment_score: 53	suboptimal_alignment_score: 49	strand: +	target_end: 872860	query_end: 53
+
+target_name: chr3
+query_name: 7:141812631:F:239;None;None/1
+optimal_alignment_score: 40	suboptimal_alignment_score: 40	strand: +	target_end: 510995	query_end: 44
+
+target_name: chr3
+query_name: 15:54133744:R:-269;None;None/1
+optimal_alignment_score: 41	suboptimal_alignment_score: 40	strand: +	target_end: 725603	query_end: 53
+
+target_name: chr3
+query_name: 4:66588719:R:-245;None;None/1
+optimal_alignment_score: 58	suboptimal_alignment_score: 58	strand: +	target_end: 163129	query_end: 54
+
+target_name: chr3
+query_name: 1:217587184:F:229;None;None/1
+optimal_alignment_score: 46	suboptimal_alignment_score: 46	strand: +	target_end: 196793	query_end: 52
+
+target_name: chr3
+query_name: 2:119213044:R:-184;None;None/1
+optimal_alignment_score: 49	suboptimal_alignment_score: 44	strand: +	target_end: 552663	query_end: 50
+
+target_name: chr3
+query_name: 2:36811349:F:265;None;None/1
+optimal_alignment_score: 44	suboptimal_alignment_score: 43	strand: +	target_end: 945971	query_end: 52
+
+target_name: chr3
+query_name: 1:68786835:F:349;None;None/1
+optimal_alignment_score: 45	suboptimal_alignment_score: 45	strand: +	target_end: 528144	query_end: 52
+
+target_name: chr3
+query_name: 4:126634771:F:185;None;None/1
+optimal_alignment_score: 45	suboptimal_alignment_score: 42	strand: +	target_end: 745309	query_end: 50
+
+target_name: chr3
+query_name: 4:114171402:R:-296;None;None/1
+optimal_alignment_score: 44	suboptimal_alignment_score: 42	strand: +	target_end: 751541	query_end: 53
+
+target_name: chr3
+query_name: 9:125206364:R:-140;None;None/1
+optimal_alignment_score: 41	suboptimal_alignment_score: 39	strand: +	target_end: 932480	query_end: 50
+
+target_name: chr3
+query_name: X:25208394:F:245;None;None/1
+optimal_alignment_score: 42	suboptimal_alignment_score: 41	strand: +	target_end: 696734	query_end: 52
+
+target_name: chr3
+query_name: 2:222217158:F:227;None;None/1
+optimal_alignment_score: 41	suboptimal_alignment_score: 40	strand: +	target_end: 704921	query_end: 53
+
+target_name: chr3
+query_name: 1:196837088:F:230;None;None/1
+optimal_alignment_score: 46	suboptimal_alignment_score: 45	strand: +	target_end: 563639	query_end: 53
+
+target_name: chr3
+query_name: 7:24300836:R:-208;7,24300863,*,+1T;None/1
+optimal_alignment_score: 44	suboptimal_alignment_score: 42	strand: +	target_end: 239063	query_end: 54
+
+target_name: chr3
+query_name: 1:76948473:F:230;None;None/1
+optimal_alignment_score: 45	suboptimal_alignment_score: 45	strand: +	target_end: 386767	query_end: 54
+
+target_name: chr3
+query_name: 8:62727783:F:135;None;None/1
+optimal_alignment_score: 47	suboptimal_alignment_score: 46	strand: +	target_end: 406541	query_end: 54
+
+target_name: chr3
+query_name: 7:94677617:R:-155;None;None/1
+optimal_alignment_score: 46	suboptimal_alignment_score: 46	strand: +	target_end: 345086	query_end: 54
+
+target_name: chr3
+query_name: 4:10707991:F:305;None;None/1
+optimal_alignment_score: 94	suboptimal_alignment_score: 86	strand: +	target_end: 198089	query_end: 53
+
+target_name: chr3
+query_name: 1:183203391:R:-141;None;None/1
+optimal_alignment_score: 48	suboptimal_alignment_score: 45	strand: +	target_end: 344874	query_end: 53
+
+target_name: chr3
+query_name: 15:63526307:R:-83;None;None/1
+optimal_alignment_score: 40	suboptimal_alignment_score: 40	strand: +	target_end: 78739	query_end: 52
+
+target_name: chr3
+query_name: 8:79068236:F:137;None;None/1
+optimal_alignment_score: 43	suboptimal_alignment_score: 42	strand: +	target_end: 326507	query_end: 47
+
+target_name: chr3
+query_name: 9:1290219:R:-251;None;None/1
+optimal_alignment_score: 84	suboptimal_alignment_score: 80	strand: +	target_end: 537411	query_end: 54
+
+target_name: chr3
+query_name: 5:122248402:R:-212;None;None/1
+optimal_alignment_score: 40	suboptimal_alignment_score: 39	strand: +	target_end: 40834	query_end: 52
+
+target_name: chr3
+query_name: X:153897575:R:-263;None;None/1
+optimal_alignment_score: 75	suboptimal_alignment_score: 74	strand: +	target_end: 223405	query_end: 49
+
+target_name: chr3
+query_name: 6:95453360:F:204;None;None/1
+optimal_alignment_score: 45	suboptimal_alignment_score: 42	strand: +	target_end: 182708	query_end: 52
+
+target_name: chr3
+query_name: 10:6951101:R:-212;None;None/1
+optimal_alignment_score: 41	suboptimal_alignment_score: 38	strand: +	target_end: 897043	query_end: 47
+
+target_name: chr3
+query_name: 13:59325230:F:98;None;None/1
+optimal_alignment_score: 39	suboptimal_alignment_score: 38	strand: +	target_end: 773481	query_end: 54
+
+target_name: chr3
+query_name: 3:37059536:F:184;None;None/1
+optimal_alignment_score: 49	suboptimal_alignment_score: 44	strand: +	target_end: 899262	query_end: 54
+
+target_name: chr3
+query_name: 9:91156368:F:149;None;None/1
+optimal_alignment_score: 43	suboptimal_alignment_score: 42	strand: +	target_end: 606747	query_end: 51
+
+target_name: chr3
+query_name: 12:55970708:R:-246;12,55970762,C,T;None/1
+optimal_alignment_score: 92	suboptimal_alignment_score: 88	strand: +	target_end: 771042	query_end: 54
+
+target_name: chr3
+query_name: 3:120828442:F:203;None;None/1
+optimal_alignment_score: 44	suboptimal_alignment_score: 41	strand: +	target_end: 595794	query_end: 50
+
+target_name: chr3
+query_name: 19:60104965:R:-147;None;None/1
+optimal_alignment_score: 65	suboptimal_alignment_score: 64	strand: +	target_end: 866783	query_end: 52
+
+target_name: chr3
+query_name: 11:98782460:R:-46;None;None/1
+optimal_alignment_score: 56	suboptimal_alignment_score: 42	strand: +	target_end: 819223	query_end: 51
+
+target_name: chr3
+query_name: 12:91111930:F:209;None;None/1
+optimal_alignment_score: 43	suboptimal_alignment_score: 42	strand: +	target_end: 679515	query_end: 51
+
+target_name: chr3
+query_name: 3:192601514:F:196;None;None/1
+optimal_alignment_score: 42	suboptimal_alignment_score: 40	strand: +	target_end: 392672	query_end: 40
+
+target_name: chr3
+query_name: X:47989386:R:-46;None;None/1
+optimal_alignment_score: 39	suboptimal_alignment_score: 36	strand: +	target_end: 793950	query_end: 54
+
+target_name: chr3
+query_name: 2:39824446:R:-127;None;None/1
+optimal_alignment_score: 72	suboptimal_alignment_score: 70	strand: +	target_end: 972352	query_end: 54
+
+target_name: chr3
+query_name: 1:227528596:F:219;None;None/1
+optimal_alignment_score: 44	suboptimal_alignment_score: 42	strand: +	target_end: 869543	query_end: 51
+
+target_name: chr3
+query_name: X:4440324:R:-90;None;None/1
+optimal_alignment_score: 42	suboptimal_alignment_score: 40	strand: +	target_end: 100289	query_end: 52
+
+target_name: chr3
+query_name: X:73634281:F:88;None;None/1
+optimal_alignment_score: 52	suboptimal_alignment_score: 41	strand: +	target_end: 922600	query_end: 49
+
+target_name: chr3
+query_name: 3:53909823:F:239;None;3,53910104,T,C/1
+optimal_alignment_score: 44	suboptimal_alignment_score: 44	strand: +	target_end: 640897	query_end: 47
+
+target_name: chr3
+query_name: 3:170059370:R:-173;None;3,170059212,G,C/1
+optimal_alignment_score: 44	suboptimal_alignment_score: 43	strand: +	target_end: 483764	query_end: 54
+
+target_name: chr3
+query_name: 20:8823533:F:251;None;None/1
+optimal_alignment_score: 39	suboptimal_alignment_score: 38	strand: +	target_end: 887639	query_end: 49
+
+target_name: chr3
+query_name: 5:106802036:F:200;None;None/1
+optimal_alignment_score: 44	suboptimal_alignment_score: 42	strand: +	target_end: 137797	query_end: 53
+
+target_name: chr3
+query_name: 5:74632980:R:-252;None;None/1
+optimal_alignment_score: 44	suboptimal_alignment_score: 43	strand: +	target_end: 297976	query_end: 54
+
+target_name: chr3
+query_name: 12:84751900:F:268;None;None/1
+optimal_alignment_score: 42	suboptimal_alignment_score: 42	strand: +	target_end: 183485	query_end: 48
+
+target_name: chr3
+query_name: X:135811790:R:-164;None;None/1
+optimal_alignment_score: 48	suboptimal_alignment_score: 44	strand: +	target_end: 178654	query_end: 54
+
+target_name: chr3
+query_name: 10:53108829:R:-138;None;None/1
+optimal_alignment_score: 45	suboptimal_alignment_score: 42	strand: +	target_end: 625392	query_end: 50
+
+target_name: chr3
+query_name: 11:83660431:R:-245;None;None/1
+optimal_alignment_score: 47	suboptimal_alignment_score: 44	strand: +	target_end: 226749	query_end: 54
+
+target_name: chr3
+query_name: 10:119650756:F:207;None;None/1
+optimal_alignment_score: 44	suboptimal_alignment_score: 42	strand: +	target_end: 529549	query_end: 41
+
+target_name: chr3
+query_name: 2:154848222:R:-185;None;None/1
+optimal_alignment_score: 45	suboptimal_alignment_score: 43	strand: +	target_end: 585761	query_end: 54
+
+target_name: chr3
+query_name: 3:63918850:F:276;None;None/1
+optimal_alignment_score: 42	suboptimal_alignment_score: 41	strand: +	target_end: 345340	query_end: 50
+
+target_name: chr3
+query_name: 10:11447534:F:239;None;None/1
+optimal_alignment_score: 38	suboptimal_alignment_score: 38	strand: +	target_end: 114931	query_end: 52
+
+target_name: chr3
+query_name: 1:57651555:R:-316;None;None/1
+optimal_alignment_score: 46	suboptimal_alignment_score: 44	strand: +	target_end: 104219	query_end: 54
+
+target_name: chr3
+query_name: X:91671697:R:-206;None;X,91671506,G,A/1
+optimal_alignment_score: 48	suboptimal_alignment_score: 47	strand: +	target_end: 771479	query_end: 49
+
+target_name: chr3
+query_name: 12:101622421:R:-187;None;None/1
+optimal_alignment_score: 45	suboptimal_alignment_score: 44	strand: +	target_end: 361502	query_end: 49
+
+target_name: chr3
+query_name: 13:43163355:R:-200;None;None/1
+optimal_alignment_score: 42	suboptimal_alignment_score: 42	strand: +	target_end: 167308	query_end: 53
+
+target_name: chr3
+query_name: 19:37349825:R:-160;None;None/1
+optimal_alignment_score: 70	suboptimal_alignment_score: 68	strand: +	target_end: 261459	query_end: 54
+
+target_name: chr3
+query_name: 8:41173199:R:-175;None;8,41173056,T,C/1
+optimal_alignment_score: 44	suboptimal_alignment_score: 43	strand: +	target_end: 644612	query_end: 47
+
+target_name: chr3
+query_name: 16:29068055:R:-232;None;None/1
+optimal_alignment_score: 44	suboptimal_alignment_score: 40	strand: +	target_end: 934198	query_end: 53
+
+target_name: chr3
+query_name: 1:4694340:F:179;None;None/1
+optimal_alignment_score: 48	suboptimal_alignment_score: 45	strand: +	target_end: 140768	query_end: 54
+
+target_name: chr3
+query_name: 17:76451446:F:221;None;None/1
+optimal_alignment_score: 42	suboptimal_alignment_score: 40	strand: +	target_end: 561765	query_end: 54
+
+target_name: chr3
+query_name: 1:242817485:F:158;None;None/1
+optimal_alignment_score: 52	suboptimal_alignment_score: 46	strand: +	target_end: 396383	query_end: 50
+
+target_name: chr3
+query_name: 18:29611708:R:-146;None;18,29611585,G,A/1
+optimal_alignment_score: 45	suboptimal_alignment_score: 40	strand: +	target_end: 157255	query_end: 47
+
+target_name: chr3
+query_name: 9:36493018:F:206;None;None/1
+optimal_alignment_score: 82	suboptimal_alignment_score: 80	strand: +	target_end: 872774	query_end: 54
+
+target_name: chr3
+query_name: 9:16274705:F:129;9,16274709,C,A:9,16274719,C,G:9,16274741,A,T;9,16274862,A,G/1
+optimal_alignment_score: 72	suboptimal_alignment_score: 64	strand: +	target_end: 524699	query_end: 54
+


=====================================
demo/old.txt
=====================================
@@ -0,0 +1,400 @@
+target_name: chr3
+query_name: 6:163296599:F:198;None;None/1
+optimal_alignment_score: 53	suboptimal_alignment_score: 52	strand: +	target_end: 745864	query_end: 54
+
+target_name: chr3
+query_name: 3:153409880:F:224;None;3,153410143,G,A/1
+optimal_alignment_score: 47	suboptimal_alignment_score: 45	strand: +	target_end: 350986	query_end: 53
+
+target_name: chr3
+query_name: Y:26750420:R:-132;None;None/1
+optimal_alignment_score: 50	suboptimal_alignment_score: 49	strand: +	target_end: 319376	query_end: 52
+
+target_name: chr3
+query_name: 13:91170622:R:-276;None;None/1
+optimal_alignment_score: 49	suboptimal_alignment_score: 48	strand: +	target_end: 353946	query_end: 49
+
+target_name: chr3
+query_name: 15:37079528:R:-240;None;None/1
+optimal_alignment_score: 40	suboptimal_alignment_score: 39	strand: +	target_end: 257705	query_end: 53
+
+target_name: chr3
+query_name: 9:92308501:R:-176;None;None/1
+optimal_alignment_score: 80	suboptimal_alignment_score: 70	strand: +	target_end: 105983	query_end: 48
+
+target_name: chr3
+query_name: 3:109533106:R:-128;None;None/1
+optimal_alignment_score: 42	suboptimal_alignment_score: 42	strand: +	target_end: 198965	query_end: 47
+
+target_name: chr3
+query_name: 16:32366683:R:-158;16,32366730,G,A;None/1
+optimal_alignment_score: 50	suboptimal_alignment_score: 46	strand: +	target_end: 823370	query_end: 52
+
+target_name: chr3
+query_name: 13:111809970:R:-237;None;None/1
+optimal_alignment_score: 43	suboptimal_alignment_score: 40	strand: +	target_end: 509164	query_end: 53
+
+target_name: chr3
+query_name: 3:38871305:F:159;None;None/1
+optimal_alignment_score: 39	suboptimal_alignment_score: 37	strand: +	target_end: 841259	query_end: 54
+
+target_name: chr3
+query_name: 19:8434785:F:295;None;None/1
+optimal_alignment_score: 47	suboptimal_alignment_score: 43	strand: +	target_end: 330617	query_end: 53
+
+target_name: chr3
+query_name: 8:132563397:R:-144;None;None/1
+optimal_alignment_score: 49	suboptimal_alignment_score: 40	strand: +	target_end: 54827	query_end: 54
+
+target_name: chr3
+query_name: 3:190507952:R:-57;3,190507958,G,A;None/1
+optimal_alignment_score: 41	suboptimal_alignment_score: 41	strand: +	target_end: 138110	query_end: 54
+
+target_name: chr3
+query_name: 10:59883855:R:-120;None;None/1
+optimal_alignment_score: 43	suboptimal_alignment_score: 43	strand: +	target_end: 244247	query_end: 53
+
+target_name: chr3
+query_name: 6:148630669:R:-271;None;None/1
+optimal_alignment_score: 41	suboptimal_alignment_score: 39	strand: +	target_end: 658593	query_end: 54
+
+target_name: chr3
+query_name: 3:64518566:R:-93;None;None/1
+optimal_alignment_score: 46	suboptimal_alignment_score: 45	strand: +	target_end: 881546	query_end: 44
+
+target_name: chr3
+query_name: 14:42514746:R:-187;None;None/1
+optimal_alignment_score: 58	suboptimal_alignment_score: 58	strand: +	target_end: 206906	query_end: 48
+
+target_name: chr3
+query_name: 7:126497810:R:-191;None;None/1
+optimal_alignment_score: 45	suboptimal_alignment_score: 40	strand: +	target_end: 232372	query_end: 54
+
+target_name: chr3
+query_name: 21:15123085:R:-140;None;None/1
+optimal_alignment_score: 43	suboptimal_alignment_score: 43	strand: +	target_end: 492310	query_end: 51
+
+target_name: chr3
+query_name: 2:217172401:R:-226;None;None/1
+optimal_alignment_score: 42	suboptimal_alignment_score: 41	strand: +	target_end: 261738	query_end: 54
+
+target_name: chr3
+query_name: 6:129456169:R:-167;None;None/1
+optimal_alignment_score: 42	suboptimal_alignment_score: 39	strand: +	target_end: 750865	query_end: 53
+
+target_name: chr3
+query_name: 4:160333309:R:-131;None;4,160333230,T,G/1
+optimal_alignment_score: 42	suboptimal_alignment_score: 41	strand: +	target_end: 465814	query_end: 54
+
+target_name: chr3
+query_name: 3:37890126:R:-228;None;None/1
+optimal_alignment_score: 40	suboptimal_alignment_score: 40	strand: +	target_end: 661952	query_end: 53
+
+target_name: chr3
+query_name: 17:49899522:F:154;None;None/1
+optimal_alignment_score: 47	suboptimal_alignment_score: 46	strand: +	target_end: 94043	query_end: 51
+
+target_name: chr3
+query_name: 8:138662818:R:-211;None;8,138662609,C,T/1
+optimal_alignment_score: 43	suboptimal_alignment_score: 41	strand: +	target_end: 175380	query_end: 50
+
+target_name: chr3
+query_name: 3:97260940:R:-254;None;None/1
+optimal_alignment_score: 39	suboptimal_alignment_score: 38	strand: +	target_end: 903012	query_end: 54
+
+target_name: chr3
+query_name: 7:14401638:F:350;None;None/1
+optimal_alignment_score: 50	suboptimal_alignment_score: 47	strand: +	target_end: 265434	query_end: 48
+
+target_name: chr3
+query_name: 13:19068477:F:246;None;None/1
+optimal_alignment_score: 47	suboptimal_alignment_score: 45	strand: +	target_end: 628991	query_end: 54
+
+target_name: chr3
+query_name: 3:179655047:F:206;None;None/1
+optimal_alignment_score: 45	suboptimal_alignment_score: 44	strand: +	target_end: 476787	query_end: 51
+
+target_name: chr3
+query_name: 20:15001091:F:174;None;None/1
+optimal_alignment_score: 44	suboptimal_alignment_score: 43	strand: +	target_end: 226051	query_end: 51
+
+target_name: chr3
+query_name: 2:54967337:R:-153;None;None/1
+optimal_alignment_score: 43	suboptimal_alignment_score: 42	strand: +	target_end: 689523	query_end: 54
+
+target_name: chr3
+query_name: 18:39435195:F:202;None;None/1
+optimal_alignment_score: 51	suboptimal_alignment_score: 45	strand: +	target_end: 998370	query_end: 53
+
+target_name: chr3
+query_name: X:68133790:F:167;None;None/1
+optimal_alignment_score: 43	suboptimal_alignment_score: 40	strand: +	target_end: 612440	query_end: 50
+
+target_name: chr3
+query_name: 9:18241281:F:267;None;None/1
+optimal_alignment_score: 53	suboptimal_alignment_score: 49	strand: +	target_end: 872860	query_end: 53
+
+target_name: chr3
+query_name: 7:141812631:F:239;None;None/1
+optimal_alignment_score: 40	suboptimal_alignment_score: 40	strand: +	target_end: 510995	query_end: 44
+
+target_name: chr3
+query_name: 15:54133744:R:-269;None;None/1
+optimal_alignment_score: 41	suboptimal_alignment_score: 40	strand: +	target_end: 725603	query_end: 53
+
+target_name: chr3
+query_name: 4:66588719:R:-245;None;None/1
+optimal_alignment_score: 58	suboptimal_alignment_score: 58	strand: +	target_end: 163129	query_end: 54
+
+target_name: chr3
+query_name: 1:217587184:F:229;None;None/1
+optimal_alignment_score: 46	suboptimal_alignment_score: 46	strand: +	target_end: 196793	query_end: 52
+
+target_name: chr3
+query_name: 2:119213044:R:-184;None;None/1
+optimal_alignment_score: 49	suboptimal_alignment_score: 44	strand: +	target_end: 552663	query_end: 50
+
+target_name: chr3
+query_name: 2:36811349:F:265;None;None/1
+optimal_alignment_score: 44	suboptimal_alignment_score: 43	strand: +	target_end: 945971	query_end: 52
+
+target_name: chr3
+query_name: 1:68786835:F:349;None;None/1
+optimal_alignment_score: 45	suboptimal_alignment_score: 45	strand: +	target_end: 528144	query_end: 52
+
+target_name: chr3
+query_name: 4:126634771:F:185;None;None/1
+optimal_alignment_score: 45	suboptimal_alignment_score: 42	strand: +	target_end: 745309	query_end: 50
+
+target_name: chr3
+query_name: 4:114171402:R:-296;None;None/1
+optimal_alignment_score: 44	suboptimal_alignment_score: 42	strand: +	target_end: 751541	query_end: 53
+
+target_name: chr3
+query_name: 9:125206364:R:-140;None;None/1
+optimal_alignment_score: 41	suboptimal_alignment_score: 39	strand: +	target_end: 932480	query_end: 50
+
+target_name: chr3
+query_name: X:25208394:F:245;None;None/1
+optimal_alignment_score: 42	suboptimal_alignment_score: 41	strand: +	target_end: 696734	query_end: 52
+
+target_name: chr3
+query_name: 2:222217158:F:227;None;None/1
+optimal_alignment_score: 41	suboptimal_alignment_score: 40	strand: +	target_end: 704921	query_end: 53
+
+target_name: chr3
+query_name: 1:196837088:F:230;None;None/1
+optimal_alignment_score: 46	suboptimal_alignment_score: 45	strand: +	target_end: 563639	query_end: 53
+
+target_name: chr3
+query_name: 7:24300836:R:-208;7,24300863,*,+1T;None/1
+optimal_alignment_score: 44	suboptimal_alignment_score: 42	strand: +	target_end: 239063	query_end: 54
+
+target_name: chr3
+query_name: 1:76948473:F:230;None;None/1
+optimal_alignment_score: 45	suboptimal_alignment_score: 45	strand: +	target_end: 386767	query_end: 54
+
+target_name: chr3
+query_name: 8:62727783:F:135;None;None/1
+optimal_alignment_score: 47	suboptimal_alignment_score: 46	strand: +	target_end: 406541	query_end: 54
+
+target_name: chr3
+query_name: 7:94677617:R:-155;None;None/1
+optimal_alignment_score: 46	suboptimal_alignment_score: 46	strand: +	target_end: 345086	query_end: 54
+
+target_name: chr3
+query_name: 4:10707991:F:305;None;None/1
+optimal_alignment_score: 94	suboptimal_alignment_score: 86	strand: +	target_end: 198089	query_end: 53
+
+target_name: chr3
+query_name: 1:183203391:R:-141;None;None/1
+optimal_alignment_score: 48	suboptimal_alignment_score: 45	strand: +	target_end: 344874	query_end: 53
+
+target_name: chr3
+query_name: 15:63526307:R:-83;None;None/1
+optimal_alignment_score: 40	suboptimal_alignment_score: 40	strand: +	target_end: 78739	query_end: 52
+
+target_name: chr3
+query_name: 8:79068236:F:137;None;None/1
+optimal_alignment_score: 43	suboptimal_alignment_score: 42	strand: +	target_end: 326507	query_end: 47
+
+target_name: chr3
+query_name: 9:1290219:R:-251;None;None/1
+optimal_alignment_score: 84	suboptimal_alignment_score: 80	strand: +	target_end: 537411	query_end: 54
+
+target_name: chr3
+query_name: 5:122248402:R:-212;None;None/1
+optimal_alignment_score: 40	suboptimal_alignment_score: 39	strand: +	target_end: 40834	query_end: 52
+
+target_name: chr3
+query_name: X:153897575:R:-263;None;None/1
+optimal_alignment_score: 75	suboptimal_alignment_score: 74	strand: +	target_end: 223405	query_end: 49
+
+target_name: chr3
+query_name: 6:95453360:F:204;None;None/1
+optimal_alignment_score: 45	suboptimal_alignment_score: 42	strand: +	target_end: 182708	query_end: 52
+
+target_name: chr3
+query_name: 10:6951101:R:-212;None;None/1
+optimal_alignment_score: 41	suboptimal_alignment_score: 38	strand: +	target_end: 897043	query_end: 47
+
+target_name: chr3
+query_name: 13:59325230:F:98;None;None/1
+optimal_alignment_score: 39	suboptimal_alignment_score: 38	strand: +	target_end: 773481	query_end: 54
+
+target_name: chr3
+query_name: 3:37059536:F:184;None;None/1
+optimal_alignment_score: 49	suboptimal_alignment_score: 44	strand: +	target_end: 899262	query_end: 54
+
+target_name: chr3
+query_name: 9:91156368:F:149;None;None/1
+optimal_alignment_score: 43	suboptimal_alignment_score: 42	strand: +	target_end: 606747	query_end: 51
+
+target_name: chr3
+query_name: 12:55970708:R:-246;12,55970762,C,T;None/1
+optimal_alignment_score: 92	suboptimal_alignment_score: 88	strand: +	target_end: 771042	query_end: 54
+
+target_name: chr3
+query_name: 3:120828442:F:203;None;None/1
+optimal_alignment_score: 44	suboptimal_alignment_score: 41	strand: +	target_end: 595794	query_end: 50
+
+target_name: chr3
+query_name: 19:60104965:R:-147;None;None/1
+optimal_alignment_score: 65	suboptimal_alignment_score: 64	strand: +	target_end: 866783	query_end: 52
+
+target_name: chr3
+query_name: 11:98782460:R:-46;None;None/1
+optimal_alignment_score: 56	suboptimal_alignment_score: 42	strand: +	target_end: 819223	query_end: 51
+
+target_name: chr3
+query_name: 12:91111930:F:209;None;None/1
+optimal_alignment_score: 43	suboptimal_alignment_score: 42	strand: +	target_end: 679515	query_end: 51
+
+target_name: chr3
+query_name: 3:192601514:F:196;None;None/1
+optimal_alignment_score: 42	suboptimal_alignment_score: 40	strand: +	target_end: 392672	query_end: 40
+
+target_name: chr3
+query_name: X:47989386:R:-46;None;None/1
+optimal_alignment_score: 39	suboptimal_alignment_score: 36	strand: +	target_end: 793950	query_end: 54
+
+target_name: chr3
+query_name: 2:39824446:R:-127;None;None/1
+optimal_alignment_score: 72	suboptimal_alignment_score: 70	strand: +	target_end: 972352	query_end: 54
+
+target_name: chr3
+query_name: 1:227528596:F:219;None;None/1
+optimal_alignment_score: 44	suboptimal_alignment_score: 42	strand: +	target_end: 869543	query_end: 51
+
+target_name: chr3
+query_name: X:4440324:R:-90;None;None/1
+optimal_alignment_score: 42	suboptimal_alignment_score: 40	strand: +	target_end: 100289	query_end: 52
+
+target_name: chr3
+query_name: X:73634281:F:88;None;None/1
+optimal_alignment_score: 52	suboptimal_alignment_score: 41	strand: +	target_end: 922600	query_end: 49
+
+target_name: chr3
+query_name: 3:53909823:F:239;None;3,53910104,T,C/1
+optimal_alignment_score: 44	suboptimal_alignment_score: 44	strand: +	target_end: 640897	query_end: 47
+
+target_name: chr3
+query_name: 3:170059370:R:-173;None;3,170059212,G,C/1
+optimal_alignment_score: 44	suboptimal_alignment_score: 43	strand: +	target_end: 483764	query_end: 54
+
+target_name: chr3
+query_name: 20:8823533:F:251;None;None/1
+optimal_alignment_score: 39	suboptimal_alignment_score: 38	strand: +	target_end: 887639	query_end: 49
+
+target_name: chr3
+query_name: 5:106802036:F:200;None;None/1
+optimal_alignment_score: 44	suboptimal_alignment_score: 42	strand: +	target_end: 137797	query_end: 53
+
+target_name: chr3
+query_name: 5:74632980:R:-252;None;None/1
+optimal_alignment_score: 44	suboptimal_alignment_score: 43	strand: +	target_end: 297976	query_end: 54
+
+target_name: chr3
+query_name: 12:84751900:F:268;None;None/1
+optimal_alignment_score: 42	suboptimal_alignment_score: 42	strand: +	target_end: 183485	query_end: 48
+
+target_name: chr3
+query_name: X:135811790:R:-164;None;None/1
+optimal_alignment_score: 48	suboptimal_alignment_score: 44	strand: +	target_end: 178654	query_end: 54
+
+target_name: chr3
+query_name: 10:53108829:R:-138;None;None/1
+optimal_alignment_score: 45	suboptimal_alignment_score: 42	strand: +	target_end: 625392	query_end: 50
+
+target_name: chr3
+query_name: 11:83660431:R:-245;None;None/1
+optimal_alignment_score: 47	suboptimal_alignment_score: 44	strand: +	target_end: 226749	query_end: 54
+
+target_name: chr3
+query_name: 10:119650756:F:207;None;None/1
+optimal_alignment_score: 44	suboptimal_alignment_score: 42	strand: +	target_end: 529549	query_end: 41
+
+target_name: chr3
+query_name: 2:154848222:R:-185;None;None/1
+optimal_alignment_score: 45	suboptimal_alignment_score: 43	strand: +	target_end: 585761	query_end: 54
+
+target_name: chr3
+query_name: 3:63918850:F:276;None;None/1
+optimal_alignment_score: 42	suboptimal_alignment_score: 41	strand: +	target_end: 345340	query_end: 50
+
+target_name: chr3
+query_name: 10:11447534:F:239;None;None/1
+optimal_alignment_score: 38	suboptimal_alignment_score: 38	strand: +	target_end: 114931	query_end: 52
+
+target_name: chr3
+query_name: 1:57651555:R:-316;None;None/1
+optimal_alignment_score: 46	suboptimal_alignment_score: 44	strand: +	target_end: 104219	query_end: 54
+
+target_name: chr3
+query_name: X:91671697:R:-206;None;X,91671506,G,A/1
+optimal_alignment_score: 48	suboptimal_alignment_score: 47	strand: +	target_end: 771479	query_end: 49
+
+target_name: chr3
+query_name: 12:101622421:R:-187;None;None/1
+optimal_alignment_score: 45	suboptimal_alignment_score: 44	strand: +	target_end: 361502	query_end: 49
+
+target_name: chr3
+query_name: 13:43163355:R:-200;None;None/1
+optimal_alignment_score: 42	suboptimal_alignment_score: 42	strand: +	target_end: 167308	query_end: 53
+
+target_name: chr3
+query_name: 19:37349825:R:-160;None;None/1
+optimal_alignment_score: 70	suboptimal_alignment_score: 68	strand: +	target_end: 261459	query_end: 54
+
+target_name: chr3
+query_name: 8:41173199:R:-175;None;8,41173056,T,C/1
+optimal_alignment_score: 44	suboptimal_alignment_score: 43	strand: +	target_end: 644612	query_end: 47
+
+target_name: chr3
+query_name: 16:29068055:R:-232;None;None/1
+optimal_alignment_score: 44	suboptimal_alignment_score: 40	strand: +	target_end: 934198	query_end: 53
+
+target_name: chr3
+query_name: 1:4694340:F:179;None;None/1
+optimal_alignment_score: 48	suboptimal_alignment_score: 45	strand: +	target_end: 140768	query_end: 54
+
+target_name: chr3
+query_name: 17:76451446:F:221;None;None/1
+optimal_alignment_score: 42	suboptimal_alignment_score: 40	strand: +	target_end: 561765	query_end: 54
+
+target_name: chr3
+query_name: 1:242817485:F:158;None;None/1
+optimal_alignment_score: 52	suboptimal_alignment_score: 46	strand: +	target_end: 396383	query_end: 50
+
+target_name: chr3
+query_name: 18:29611708:R:-146;None;18,29611585,G,A/1
+optimal_alignment_score: 45	suboptimal_alignment_score: 40	strand: +	target_end: 157255	query_end: 47
+
+target_name: chr3
+query_name: 9:36493018:F:206;None;None/1
+optimal_alignment_score: 82	suboptimal_alignment_score: 80	strand: +	target_end: 872774	query_end: 54
+
+target_name: chr3
+query_name: 9:16274705:F:129;9,16274709,C,A:9,16274719,C,G:9,16274741,A,T;9,16274862,A,G/1
+optimal_alignment_score: 72	suboptimal_alignment_score: 64	strand: +	target_end: 524699	query_end: 54
+


=====================================
demo/pRead.fa
=====================================
@@ -0,0 +1,3 @@
+>read
+AGGTAGGTAGGTAGGTAGGTAGGTAGGTAGGTAGGTAGGT
+


=====================================
demo/pRef.fa
=====================================
@@ -0,0 +1,3 @@
+>ref
+CCCAGCTAGCTAGCTAGCTAGCTAGCTAGCTAGCTAGCTAGCT
+


=====================================
demo/protein1.fa
=====================================
@@ -0,0 +1,2 @@
+>random1
+CLKQTQMRTDHAMCGDFWEESHHHFTLCIA


=====================================
demo/protein2.fa
=====================================
@@ -0,0 +1,2 @@
+>random1
+CLKQTQMRTDHARCGDFWEESHHHHHHFTLCIA


=====================================
demo/query.fastq
=====================================
@@ -0,0 +1,8 @@
+ at 6:163296599:F:198;None;None/1
+GTGTGAGCCACCACGCCCAGCCCAAAATCTGTTTTAATGGTGGATTTGTGTTTC
++
+%/978679:99788::8888689788888888986637/%/*%--/11,%*/*%
+ at 3:153409880:F:224;None;3,153410143,G,A/1
+AGGAAGAGTTAATTTAAGTCACTTCAAACAGATTACGTATCTTTTTTTTCCCTC
++
+%-;;8;;9289.73697758:5,2984,54687/*8,'.001/550'+'.(9((


=====================================
demo/query2.fa
=====================================
@@ -0,0 +1,3 @@
+>query2
+CTTTCCTCCCCCTTGCCCTCTAGCAGGGGCCACCGGAGGAGGAGGGAGGGCAAACGAAGGATGGGGGAAGATGTG
+


=====================================
demo/r1.fa
=====================================
@@ -0,0 +1,2 @@
+>HLA-B*08:01:01_2 AC:HLA00146.1 CR:AJ295294.1,AL669854.0,D84394.0,DQ249172.0,DQ249174.0,HG794374.0,L76093.1,LN624507.0,LN877354.0,LT220237.0,M24036.1,M28204.1,M59841.1
+gctcccactccatgaggtatttcgacaccgccatgtcccggcccggccgcggggagccccgcttcatctcagtgggctacgtggacgacacgcagttcgtgaggttcgacagcgacgccgcgagtccgagagaggagccgcgggcgccgtggatagagcaggaggggccggagtattgggaccggaacacacagatcttcaagaccaacacacagactgaccgagagagcctgcggaacctgcgcggctactacaaccagagcgaggccg


=====================================
demo/r1_query.fq
=====================================
@@ -0,0 +1,4 @@
+ at ST-E00185:49:H5LVWCCXX:5:1101:18001:19276 1:N:0:0
+GGATCCCAGACCCGGAGACTCGGGAGTACCCGGGGCGTCCGTGGGGCATGGAGGTGGGGGGTCGTGATCTGCGCCCTGGGCCGGGGTTACTCACTGGCCTCGCTCTGGTTGTAGTAGCCGCGCAGGGTCTGCAGGTTCATTCTGTCAGTC
++
+AAFFFKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKFKKFKKKKKKKKKKKKKKKKKKKKAKKKKKKKKKKKKKKKKFKAAKAKKKKKKKKKKKKKKKKKKFKKKKKKKAFFKKKKAK<FFA<K


=====================================
demo/target.fastq
=====================================
@@ -0,0 +1,8 @@
+ at 20:8823533:F:251;None;None/1
+GTGGTGGATGTCTCAGCAGAGGCCTGCATAGAAAGACGACATTGAGGATCAGTG
++
+CEEECEA?AEEEEE;>CEA5AEEEEB5?EEEEEEEAE>EE=AD at E?B>BBD at BB
+ at 5:106802036:F:200;None;None/1
+TTTTGCTCTCAGTTGTGTCCCAGCAACCATGAGCTGGAAATCCAGAGACTCCTT
++
+EECEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEED?.<EAABCBD=@>AB?D?=


=====================================
demo/target2.fa
=====================================
@@ -0,0 +1,2 @@
+TATCTCTCTCAACAACAACAACGGAGGAGGAGGAAAAGAGAGAGATATCTCTCTCAACAACAACAACGGAGGAGGAGGAAAAGAGAGAGA
+


=====================================
src/.gitignore deleted
=====================================
@@ -1,12 +0,0 @@
-*.*~
-example_c
-example_cpp
-libssw.so
-libsswjni.so
-ssw.jar
-ssw.o
-ssw/Aligner.class
-ssw/Alignment.class
-ssw/Example.class
-ssw_cpp.o
-ssw_test 


=====================================
src/Makefile
=====================================
@@ -1,6 +1,7 @@
 CC = gcc
 CXX = g++
-CFLAGS := -Wall -pipe -O2
+CFLAGS += -Wall -pipe -O2
+#CFLAGS += -Wall -pipe -g -fsanitize=address	# for debug
 CXXFLAGS := $(CFLAGS)
 LOBJS = ssw.o
 LCPPOBJS = ssw_cpp.o
@@ -10,17 +11,16 @@ EXAMPLE = example_c
 EXAMPLE_CPP = example_cpp
 JAVA_JAR = ssw.jar
 JAVA_LIB = libsswjni.so
-JAVA_INLCUDES = -I"$(JAVA_HOME)/include" -I"$(JAVA_HOME)/include/linux"
+JAVA_INLCUDES = -I$(JAVA_HOME)/include -I$(JAVA_HOME)/include/linux
 JAVA_OBJ = ssw/Aligner.class ssw/Alignment.class ssw/Example.class
 
 .PHONY: all default java clean
 
 default: $(PROG) $(EXAMPLE) $(EXAMPLE_CPP) $(LIB) 
-
+core: $(PROG)
+java: java_home $(JAVA_JAR) $(JAVA_LIB)
 all: default java
 
-java: $(JAVA_JAR) $(JAVA_LIB)
-
 $(LIB): ssw.c ssw.h
 	$(CC) $(CFLAGS) -fPIC -shared -rdynamic -o $@ $<
 
@@ -28,11 +28,16 @@ $(PROG): main.c kseq.h
 
 $(EXAMPLE): example.c
 
+ifdef __arm__ # (M1)
+$(PROG) $(EXAMPLE): $(LOBJS)
+	$(CC) -o $@ $(filter-out %.h,$^) $(CFLAGS) $(LDFLAGS) -lm -lz -march=armv8-a+fp+simd+crypto+crc
+else # x86(Intel)
 $(PROG) $(EXAMPLE): $(LOBJS)
-	$(CC) -o $@ $(filter-out %.h,$^) $(CFLAGS) -lm -lz
+	$(CC) -o $@ $(filter-out %.h,$^) $(CFLAGS) $(LDFLAGS) -lm -lz
+endif
 
 $(EXAMPLE_CPP): example.cpp $(LOBJS) $(LCPPOBJS)
-	$(CXX) -o $@ $^ $(CXXFLAGS) -lm -lz
+	$(CXX) -o $@ $^ $(CXXFLAGS) $(LDFLAGS) -lm -lz
 
 $(JAVA_LIB): sswjni.c ssw.c ssw.h
 	$(CC) $(CFLAGS) $(JAVA_INLCUDES) -fPIC -shared -rdynamic -o $@ $< ssw.c 
@@ -42,7 +47,13 @@ $(JAVA_JAR): $(JAVA_OBJ)
 
 %.class: %.java
 	javac -cp ./ $<
-	
+
+java_home:
+ifndef JAVA_HOME
+	@$(ECHO) "ERROR: Your JAVA_HOME environment variable is not set."
+	exit 1
+endif
+
 ssw.o: ssw.c ssw.h
 	$(CC) -c -o $@ $< $(CFLAGS)
 


=====================================
src/example.cpp
=====================================
@@ -5,10 +5,11 @@
 // 1) g++ -Wall ssw_cpp.cpp ssw.c example.cpp
 // 2) ./a.out
 // Created by Wan-Ping Lee on 09/04/12.
+// Last revision by Mengyao Zhao on 2017-06-05
 // ==========================
 
 #include <iostream>
-#include <string>
+#include <string.h>
 
 #include "ssw_cpp.h"
 
@@ -21,6 +22,8 @@ static void PrintAlignment(const StripedSmithWaterman::Alignment& alignment);
 int main() {
   const string ref   = "CAGCCTTTCTGACCCGGAAATCAAAATAGGCACAACAAA";
   const string query = "CTGAGCCGGTAAATC";
+  int32_t maskLen = strlen(query.c_str())/2;
+  maskLen = maskLen < 15 ? 15 : maskLen;
   //const string ref   = "CCGTTTATCGCA";
   //const string query = "CCTTTTATCGCA";
 
@@ -31,7 +34,7 @@ int main() {
   // Declares an alignment that stores the result
   StripedSmithWaterman::Alignment alignment;
   // Aligns the query to the ref
-  aligner.Align(query.c_str(), ref.c_str(), ref.size(), filter, &alignment);
+  aligner.Align(query.c_str(), ref.c_str(), ref.size(), filter, &alignment, maskLen);
 
   PrintAlignment(alignment);
 


=====================================
src/kseq.h
=====================================
@@ -25,7 +25,7 @@
 
 /* Contact: Heng Li <lh3 at sanger.ac.uk> */
 
-/* Last Modified: 12APR2009 */
+/* Last Modified by Mengyao: 2022Mar28 */
 
 #ifndef AC_KSEQ_H
 #define AC_KSEQ_H
@@ -149,11 +149,11 @@ typedef struct __kstring_t {
 		s->f = ks_init(fd);												\
 		return s;														\
 	}																	\
-	static inline void kseq_rewind(kseq_t *ks)							\
+/*	static inline void kseq_rewind(kseq_t *ks)							\
 	{																	\
 		ks->last_char = 0;												\
 		ks->f->is_eof = ks->f->begin = ks->f->end = 0;					\
-	}																	\
+	}	*/																\
 	static inline void kseq_destroy(kseq_t *ks)							\
 	{																	\
 		if (!ks) return;												\


=====================================
src/main.c
=====================================
@@ -1,12 +1,11 @@
 /*  main.c
  *  Created by Mengyao Zhao on 06/23/11.
- *	Version 1.0
- *  Last revision by Mengyao Zhao on 07/19/16.
+ *	Version 1.2.2
+ *  Last revision by Mengyao Zhao on 2022-May-21.
  */
 
 #include <stdlib.h>
 #include <stdint.h>
-#include <emmintrin.h>
 #include <zlib.h>
 #include <stdio.h>
 #include <time.h>
@@ -16,6 +15,13 @@
 #include "ssw.h"
 #include "kseq.h"
 
+#ifdef __ARM_NEON // (M1)
+#include "sse2neon.h"
+#else // x86 (Intel)
+#include <emmintrin.h>
+#endif
+
+
 #ifdef __GNUC__
 #define LIKELY(x) __builtin_expect((x),1)
 #define UNLIKELY(x) __builtin_expect((x),0)
@@ -60,6 +66,8 @@ static void ssw_write (s_align* a,
 			const kseq_t* ref_seq,
 			const kseq_t* read,
 			const char* read_seq,	// strand == 0: original read; strand == 1: reverse complement read
+			const int8_t* ref_num,
+			const int8_t* read_num,
 			const int8_t* table,
 			int8_t strand,	// 0: forward aligned ; 1: reverse complement aligned
 			int8_t sam) {	// 0: Blast like output; 1: Sam format output
@@ -155,56 +163,28 @@ end:
 		fprintf(stdout, "%s\t", read->name.s);
 		if (a->score1 == 0) fprintf(stdout, "4\t*\t0\t255\t*\t*\t0\t0\t*\t*\n");
 		else {
-			int32_t c, l = a->read_end1 - a->read_begin1 + 1, qb = a->ref_begin1, pb = a->read_begin1, p;
+			int32_t c, p;
 			uint32_t mapq = -4.343 * log(1 - (double)abs(a->score1 - a->score2)/(double)a->score1);
 			mapq = (uint32_t) (mapq + 4.99);
 			mapq = mapq < 254 ? mapq : 254;
 			if (strand) fprintf(stdout, "16\t");
 			else fprintf(stdout, "0\t");
 			fprintf(stdout, "%s\t%d\t%d\t", ref_seq->name.s, a->ref_begin1 + 1, mapq);
-			mismatch = mark_mismatch(a->ref_begin1, a->read_begin1, a->read_end1, ref_seq->seq.s, read_seq, read->seq.l, &a->cigar, &a->cigarLen);
+			mismatch = mark_mismatch(a->ref_begin1, a->read_begin1, a->read_end1, ref_num, read_num, read->seq.l, &a->cigar, &a->cigarLen);
 			for (c = 0; c < a->cigarLen; ++c) {
 				char letter = cigar_int_to_op(a->cigar[c]);
 				uint32_t length = cigar_int_to_len(a->cigar[c]);
 				fprintf(stdout, "%lu%c", (unsigned long)length, letter);
 			}
-	//		fprintf(stderr, "%s\tmismatch: %d\n", read->name.s, mismatch);
 			fprintf(stdout, "\t*\t0\t0\t");
-			for (c = a->read_begin1; c <= a->read_end1; ++c) fprintf(stdout, "%c", read_seq[c]);
+			fprintf(stdout, "%s", read_seq);
 			fprintf(stdout, "\t");
 			if (read->qual.s && strand) {
-				p = a->read_end1;
-				for (c = 0; c < l; ++c) {
-					fprintf(stdout, "%c", read->qual.s[p]);
-					--p;
-				}
-			}else if (read->qual.s){
-				p = a->read_begin1;
-				for (c = 0; c < l; ++c) {
-					fprintf(stdout, "%c", read->qual.s[p]);
-					++p;
-				}
-			} else fprintf(stdout, "*");
+				for (p = read->qual.l - 1; p >= 0; --p) fprintf(stdout, "%c", read->qual.s[p]);
+			}else if (read->qual.s) fprintf (stdout, "%s", read->qual.s);
+			else fprintf(stdout, "*");
 			fprintf(stdout, "\tAS:i:%d", a->score1);
-			mapq = 0;	// counter of difference
-			for (c = 0; c < a->cigarLen; ++c) {
-				char letter = cigar_int_to_op(a->cigar[c]);
-				uint32_t length = cigar_int_to_len(a->cigar[c]);
-				if (letter == 'M') {
-					for (p = 0; p < length; ++p){
-						if (table[(int)*(ref_seq->seq.s + qb)] != table[(int)*(read_seq + pb)]) ++mapq;
-						++qb;
-						++pb;
-					}
-				} else if (letter == 'I') {
-					pb += length;
-					mapq += length;
-				} else {
-					qb += length;
-					mapq += length;
-				}
-			}
-			fprintf(stdout,"\tNM:i:%d\t", mapq);
+			fprintf(stdout,"\tNM:i:%d\t", mismatch);
 			if (a->score2 > 0) fprintf(stdout, "ZS:i:%d\n", a->score2);
 			else fprintf(stdout, "\n");
 		}
@@ -217,13 +197,9 @@ int main (int argc, char * const argv[]) {
 	gzFile read_fp, ref_fp;
 	kseq_t *read_seq, *ref_seq;
 	int32_t l, m, k, match = 2, mismatch = 2, gap_open = 3, gap_extension = 1, path = 0, reverse = 0, n = 5, sam = 0, protein = 0, header = 0, s1 = 67108864, s2 = 128, filter = 0;
-	int8_t* mata = (int8_t*)calloc(25, sizeof(int8_t));
-	const int8_t* mat = mata;
-	char mat_name[16];
-	mat_name[0] = '\0';
-	int8_t* ref_num = (int8_t*)malloc(s1);
-	int8_t* num = (int8_t*)malloc(s2), *num_rc = 0;
-	char* read_rc = 0;
+    int8_t *mata, *ref_num, *num, *num_rc = 0;
+    const int8_t* mat;
+	char* read_rc = 0, *mat_name = 0;
 
 	static const int8_t mat50[] = {
 	//  A   R   N   D   C   Q   E   G   H   I   L   K   M   F   P   S   T   W   Y   V   B   Z   X   *
@@ -279,22 +255,6 @@ int main (int argc, char * const argv[]) {
 
 	int8_t* table = nt_table;
 
-	// Parse command line.
-	while ((l = getopt(argc, argv, "m:x:o:e:a:f:pcrsh")) >= 0) {
-		switch (l) {
-			case 'm': match = atoi(optarg); break;
-			case 'x': mismatch = atoi(optarg); break;
-			case 'o': gap_open = atoi(optarg); break;
-			case 'e': gap_extension = atoi(optarg); break;
-			case 'a': strcpy(mat_name, optarg); break;
-			case 'f': filter = atoi(optarg); break;
-			case 'p': protein = 1; break;
-			case 'c': path = 1; break;
-			case 'r': reverse = 1; break;
-			case 's': sam = 1; break;
-			case 'h': header = 1; break;
-		}
-	}
 	if (optind + 2 > argc) {
 		fprintf(stderr, "\n");
 		fprintf(stderr, "Usage: ssw_test [options] ... <target.fasta> <query.fasta>(or <query.fastq>)\n");
@@ -313,21 +273,52 @@ int main (int argc, char * const argv[]) {
 		return 1;
 	}
 
+	// Parse command line.
+	while ((l = getopt(argc, argv, "m:x:o:e:a:f:pcrsh")) >= 0) {
+		switch (l) {
+			case 'm': match = atoi(optarg); break;
+			case 'x': mismatch = atoi(optarg); break;
+			case 'o': gap_open = atoi(optarg); break;
+			case 'e': gap_extension = atoi(optarg); break;
+
+			case 'a': 
+                mat_name = (char*)malloc(strlen(optarg) + 1);
+                strcpy(mat_name, optarg); 
+                break;
+
+			case 'f': filter = atoi(optarg); break;
+			case 'p': protein = 1; break;
+			case 'c': path = 1; break;
+			case 'r': reverse = 1; break;
+			case 's': sam = 1; break;
+			case 'h': header = 1; break;
+		}
+	}
+
 	// initialize scoring matrix for genome sequences
+    mata = (int8_t*)calloc(25, sizeof(int8_t));
 	for (l = k = 0; LIKELY(l < 4); ++l) {
 		for (m = 0; LIKELY(m < 4); ++m) mata[k++] = l == m ? match : -mismatch;	/* weight_match : -weight_mismatch */
 		mata[k++] = 0; // ambiguous base
 	}
 	for (m = 0; LIKELY(m < 5); ++m) mata[k++] = 0;
+    mat = mata;
 
-	if (protein == 1 && (! strcmp(mat_name, "\0"))) {
+	if (protein == 1 && mat_name == 0) {
 		n = 24;
 		table = aa_table;
 		mat = mat50;
-	} else if (strcmp(mat_name, "\0")) {
+	} else if (mat_name != 0) {
 
 	// Parse score matrix.
 		FILE *f_mat = fopen(mat_name, "r");
+        free(mat_name);
+        if (f_mat == NULL) { 
+            fprintf(stderr, "Failed to open the weight matrix file.\n");
+            free(mata);
+            return 1;
+        }
+
 		char line[128];
 		mata = (int8_t*)realloc(mata, 1024 * sizeof(int8_t));
 		k = 0;
@@ -359,6 +350,7 @@ int main (int argc, char * const argv[]) {
 		}
 		if (k == 0) {
 			fprintf(stderr, "Problem of reading the weight matrix file.\n");
+            free(mata);
 			return 1;
 		}
 		fclose(f_mat);
@@ -367,7 +359,6 @@ int main (int argc, char * const argv[]) {
 		mat = mata;
 	}
 
-	//fprintf(stderr, "query: %s\n", argv[optind + 1]);
 	read_fp = gzopen(argv[optind + 1], "r");
 
     if (! read_fp) {
@@ -389,6 +380,8 @@ int main (int argc, char * const argv[]) {
 	}
 
 	// alignment
+	ref_num = (int8_t*)malloc(s1);
+	num = (int8_t*)malloc(s2);
 	if (reverse == 1 && n == 5) {
 		read_rc = (char*)malloc(s2);
 		num_rc = (int8_t*)malloc(s2);
@@ -416,7 +409,13 @@ int main (int argc, char * const argv[]) {
 			p_rc = ssw_init(num_rc, readLen, mat, n, 2);
 		}else if (reverse == 1 && n == 24) {
 			fprintf (stderr, "Reverse complement alignment is not available for protein sequences. \n");
-			return 1;
+            free(num);
+            free(ref_num);
+            if (num_rc) {
+                free(num_rc);
+                free(read_rc);
+            }
+            return 1;
 		}
 
 		ref_fp = gzopen(argv[optind], "r");
@@ -436,12 +435,17 @@ int main (int argc, char * const argv[]) {
 			if (reverse == 1 && protein == 0)
 				result_rc = ssw_align(p_rc, ref_num, refLen, gap_open, gap_extension, flag, filter, 0, maskLen);
 			if (result_rc && result_rc->score1 > result->score1 && result_rc->score1 >= filter) {
-				if (sam) ssw_write (result_rc, ref_seq, read_seq, read_rc, table, 1, 1);
-				else ssw_write (result_rc, ref_seq, read_seq, read_rc, table, 1, 0);
+                if (result_rc->flag == 2) fprintf(stderr, "Warning: The reverse compliment alignment of the following sequences may miss a small part.\nref_seq: %s\nread_seq: %s\n\n", ref_seq->name.s, read_seq->name.s);
+				if (sam) ssw_write (result_rc, ref_seq, read_seq, read_rc, ref_num, num_rc, table, 1, 1);
+				else ssw_write (result_rc, ref_seq, read_seq, read_rc, ref_num, num_rc, table, 1, 0);
 			}else if (result && result->score1 >= filter){
-				if (sam) ssw_write(result, ref_seq, read_seq, read_seq->seq.s, table, 0, 1);
-				else ssw_write(result, ref_seq, read_seq, read_seq->seq.s, table, 0, 0);
-			} else if (! result) return 1;
+                if (result->flag == 2) fprintf(stderr, "Warning: The alignment of the following sequences may miss a small part.\nref_seq: %s\nread_seq: %s\n\n", ref_seq->name.s, read_seq->name.s);
+				if (sam) ssw_write(result, ref_seq, read_seq, read_seq->seq.s, ref_num, num, table, 0, 1);
+				else ssw_write(result, ref_seq, read_seq, read_seq->seq.s, ref_num, num, table, 0, 0);
+			} else if (! result) {
+                fprintf(stderr, "Warning: Alignment between the following sequences is failed.\nref_seq: %s\nread_seq: %s\n\n", ref_seq->name.s, read_seq->name.s);
+                continue;
+            }
 			if (result_rc) align_destroy(result_rc);
 			align_destroy(result);
 		}


=====================================
src/pyssw.py
=====================================
@@ -3,6 +3,7 @@
 Simple python wrapper for SSW library
 Please put the path of libssw.so into LD_LIBRARY_PATH or pass it explicitly as a parameter
 By Yongan Zhao (March 2016)
+Revised by Mengyao Zhao on 2022-May-23
 """
 
 import sys
@@ -12,12 +13,9 @@ import ctypes as ct
 import timeit as ti
 import gzip
 import math
-
 import ssw_lib
 
 
-
-
 def read(sFile):
     """
     read a sequence file
@@ -52,9 +50,9 @@ def read(sFile):
         sQual = ''
         for l in f:
             sId = l.strip()[1:].split()[0]
-            sSeq = f.next()
-            s3 = f.next()
-            sQual = f.next()
+            sSeq = f.readline().strip()
+            s3 = f.readline().strip()
+            sQual = f.readline().strip()
 
             yield sId, sSeq, sQual
 
@@ -63,23 +61,23 @@ def read(sFile):
     ext = op.splitext(sFile)[1][1:].strip().lower()
     if ext == 'gz' or ext == 'gzip':
         with gzip.open(sFile, 'r') as f:
-            l = f.next()
+            l = f.readline()
             if l.startswith('>'):
                 bFasta = True
             elif l.startswith('@'):
                 bFasta = False
             else:
-                print >> sys.stderr, 'file format cannot be recognized'
+                sys.stderr.write('file format cannot be recognized\n')
                 sys.exit()
     else:
         with open(sFile, 'r') as f:
-            l = f.next()
+            l = f.readline()
             if l.startswith('>'):
                 bFasta = True
             elif l.startswith('@'):
                 bFasta = False
             else:
-                print >> sys.stderr, 'file format cannot be recognized'
+                sys.stderr.write('file format cannot be recognized\n')
                 sys.exit()
 
 # read
@@ -171,7 +169,7 @@ def buildPath(q, r, nQryBeg, nRefBeg, lCigar):
 
         if c == 'M':
             sQ += q[nQOff : nQOff+n]
-            sA += ''.join(['|' if q[nQOff+j] == r[nROff+j] else '*' for j in xrange(n)])
+            sA += ''.join(['|' if q[nQOff+j] == r[nROff+j] else '*' for j in range(n)])
             sR += r[nROff : nROff+n]
             nQOff += n
             nROff += n
@@ -185,36 +183,33 @@ def buildPath(q, r, nQryBeg, nRefBeg, lCigar):
             sA += ' ' * n
             sR += r[nROff : nROff+n]
             nROff += n
-
     return sCigar, sQ, sA, sR
 
 
-
-
 def main(args):
     lEle = []
     dRc = {} 
     dEle2Int = {}
     dInt2Ele = {}
-    if False == args.bProtien:
+    if False == args.bProtein:
 # init DNA score matrix
         if not args.sMatrix:
             lEle = ['A', 'C', 'G', 'T', 'N']
-            dRc = {'A':'C', 'C':'G', 'G':'C', 'T':'A', 'a':'C', 'c':'G', 'g':'C', 't':'A'} 
+            dRc = {'A':'T', 'C':'G', 'G':'C', 'T':'A', 'a':'T', 'c':'G', 'g':'C', 't':'A'} 
             for i,ele in enumerate(lEle):
                 dEle2Int[ele] = i
                 dEle2Int[ele.lower()] = i
                 dInt2Ele[i] = ele
             nEleNum = len(lEle)
-            lScore = [0 for i in xrange(nEleNum**2)]
-            for i in xrange(nEleNum-1):
-                for j in xrange(nEleNum-1):
+            lScore = [0 for i in range(nEleNum**2)]
+            for i in range(nEleNum-1):
+                for j in range(nEleNum-1):
                     if lEle[i] == lEle[j]:
                         lScore[i*nEleNum+j] = args.nMatch
                     else:
                         lScore[i*nEleNum+j] = -args.nMismatch
         else:
-            lEle, dEle2Int, dInt2Ele, lScore = ssw.read_matrix(args.sMatrix)
+            lEle, dEle2Int, dInt2Ele, lScore = ssw_lib.read_matrix(args.sMatrix)
     else:
 # load AA score matrix
         if not args.sMatrix:
@@ -227,10 +222,10 @@ def main(args):
             lScore = ssw_lib.lBlosum50
         else:
             # assume the format of the input score matrix is the same as that of http://www.ncbi.nlm.nih.gov/Class/FieldGuide/BLOSUM62.txt
-            lEle, dEle2Int, dInt2Ele, lScore = ssw.read_matrix(args.sMatrix)
+            lEle, dEle2Int, dInt2Ele, lScore = ssw_lib.read_matrix(args.sMatrix)
 
     if args.bBest and args.bProtien:
-        print >> sys.stderr, 'Reverse complement alignment is not available for protein sequences.'
+        sys.stderr.write('Reverse complement alignment is not available for protein sequences.\n')
 
 # translate score matrix to ctypes
     mat = (len(lScore) * ct.c_int8) ()
@@ -241,11 +236,11 @@ def main(args):
         nFlag = 2
 # print sam head
     if args.bSam and args.bHeader and args.bPath:
-        print '@HD\tVN:1.4\tSO:queryname'
-        for sRId,sRSeq,_ in read(sTarget):
-            print '@SQ\tSN:{}\tLN:{}'.format(sRId, len(sRSeq))
+        print('@HD\tVN:1.4\tSO:queryname')
+        for sRId,sRSeq,_ in read(args.target):
+            print('@SQ\tSN:{}\tLN:{}'.format(sRId, len(sRSeq)))
     elif args.bSam and not args.bPath:
-        print >> sys.stderr, 'SAM format output is only available together with option -c.\n'
+        sys.stderr.write('SAM format output is only available together with option -c.\n')
         args.bSam = False
 
     ssw = ssw_lib.CSsw(args.sLibPath)
@@ -255,25 +250,21 @@ def main(args):
         qNum = to_int(sQSeq, lEle, dEle2Int)
         qProfile = ssw.ssw_init(qNum, ct.c_int32(len(sQSeq)), mat, len(lEle), 2)
 # build rc query profile
-        if args.bBest and not args.bProtien:
+        if args.bBest and not args.bProtein:
             sQRcSeq = ''.join([dRc[x] for x in sQSeq[::-1]])
             qRcNum = to_int(sQRcSeq, lEle, dEle2Int)
             qRcProfile = ssw.ssw_init(qRcNum, ct.c_int32(len(sQSeq)), mat, len(lEle), 2)
 # set mask len
-        if len(sQSeq) > 30:
-            nMaskLen = len(sQSeq) / 2
-        else:
-            nMaskLen = 15
-
+        nMaskLen = len(sQSeq) // 2
+        
 # iter target sequence
         for sRId,sRSeq,_ in read(args.target):
             rNum = to_int(sRSeq, lEle, dEle2Int)
-
-# format ofres: (nScore, nScore2, nRefBeg, nRefEnd, nQryBeg, nQryEnd, nRefEnd2, nCigarLen, lCigar)
+# format of res: (nScore, nScore2, nRefBeg, nRefEnd, nQryBeg, nQryEnd, nRefEnd2, nCigarLen, lCigar)
             res = align_one(ssw, qProfile, rNum, len(sRSeq), args.nOpen, args.nExt, nFlag, nMaskLen)
 # align rc query
             resRc = None
-            if args.bBest and not args.bProtien:
+            if args.bBest and not args.bProtein:
                 resRc = align_one(ssw, qRcProfile, rNum, len(sRSeq), args.nOpen, args.nExt, nFlag, nMaskLen)
 
 # build cigar and trace back path
@@ -289,89 +280,89 @@ def main(args):
 
 # print results
             if not args.bSam:
-                print 'target_name: {}\nquery_name: {}\noptimal_alignment_score: {}\t'.format(sRId, sQId, resPrint[0]),
+                print('target_name: {}\nquery_name: {}\noptimal_alignment_score: {}\t'.format(sRId, sQId, resPrint[0])),
                 if resPrint[1] > 0:
-                    print 'suboptimal_alignment_score: {}\t'.format(resPrint[1]),
+                    print('suboptimal_alignment_score: {}\t'.format(resPrint[1])),
                 if strand == 0:
-                    print 'strand: +\t',
+                    print('strand: +\t'),
                 else: 
-                    print 'strand: -\t',
+                    print('strand: -\t'),
                 if resPrint[2] + 1:
-                    print 'target_begin: {}\t'.format(resPrint[2] + 1),
-                print 'target_end: {}\t'.format(resPrint[3] + 1),
+                    print('target_begin: {}\t'.format(resPrint[2] + 1)),
+                print('target_end: {}\t'.format(resPrint[3] + 1)),
                 if resPrint[4] + 1:
-                    print 'query_begin: {}\t'.format(resPrint[4] + 1),
-                print 'query_end: {}\n'.format(resPrint[5] + 1)
+                    print('query_begin: {}\t'.format(resPrint[4] + 1)),
+                print('query_end: {}\n'.format(resPrint[5] + 1))
                 if resPrint[-2] > 0:
                     n1 = 1 + resPrint[2]
                     n2 = min(60,len(sR)) + resPrint[2] - sR.count('-',0,60)
                     n3 = 1 + resPrint[4]
                     n4 = min(60,len(sQ)) + resPrint[4] - sQ.count('-',0,60)
                     for i in range(0, len(sQ), 60):
-                        print 'Target:{:>8}\t{}\t{}'.format(n1, sR[i:i+60], n2)
+                        print('Target:{:>8}\t{}\t{}'.format(n1, sR[i:i+60], n2))
                         n1 = n2 + 1
                         n2 = n2 + min(60,len(sR)-i-60) - sR.count('-',i+60,i+120)
 
-                        print '{: ^15}\t{}'.format('', sA[i:i+60])
+                        print('{: ^15}\t{}'.format('', sA[i:i+60]))
 
-                        print 'Query:{:>9}\t{}\t{}\n'.format(n3, sQ[i:i+60], n4)
+                        print('Query:{:>9}\t{}\t{}\n'.format(n3, sQ[i:i+60], n4))
                         n3 = n4 + 1
                         n4 = n4 + min(60,len(sQ)-i-60) - sQ.count('-',i+60,i+120)
             else:
-                print "{}\t".format(sQId),
+                print("{}\t".format(sQId)),
                 if resPrint[0] == 0:
-                    print "4\t*\t0\t255\t*\t*\t0\t0\t*\t*",
+                    print("4\t*\t0\t255\t*\t*\t0\t0\t*\t*"),
                 else:
                     mapq = int(-4.343 * math.log(1-abs(resPrint[0]-resPrint[1])/float(resPrint[0])))
                     mapq = int(mapq + 4.99);
                     if mapq >= 254:
                         mapq = 254
                     if strand == 1:
-                        print '16\t',
+                        print('16\t'),
                     else:
-                        print '0\t',
-                    print '{}\t{}\t{}\t'.format(sRId, resPrint[2]+1, mapq),
-                    print sCigar,
-                    print '\t*\t0\t0\t',
-                    print sQSeq[resPrint[4]:resPrint[5]+1] if strand==0 else sQRcSeq[resPrint[4]:resPrint[5]+1],
-                    print '\t',
+                        print('0\t'),
+                    print('{}\t{}\t{}\t'.format(sRId, resPrint[2]+1, mapq)),
+                    print(sCigar),
+                    print('\t*\t0\t0\t'),
+                    print(sQSeq[resPrint[4]:resPrint[5]+1]) if strand==0 else print(sQRcSeq[resPrint[4]:resPrint[5]+1]),
+                    print('\t'),
                     if sQQual:
                         if strand == 0:
-                            print sQQual[resPrint[4]:resPrint[5]+1],
+                            print(sQQual[resPrint[4]:resPrint[5]+1]),
                         else:
-                            print sQQual[-resPrint[4]-1:-resPrint[5]-1:-1]
+                            print(sQQual[-resPrint[4]-1:-resPrint[5]-1:-1])
                     else:
-                        print '*',
+                        print('*'),
 
-                    print '\tAS:i:{}'.format(resPrint[0]),
-                    print '\tNM:i:{}\t'.format(len(sA)-sA.count('|')),
+                    print('\tAS:i:{}'.format(resPrint[0])),
+                    print('\tNM:i:{}\t'.format(len(sA)-sA.count('|'))),
                     if resPrint[1] > 0:
-                        print 'ZS:i:{}'.format(resPrint[1])
+                        print('ZS:i:{}'.format(resPrint[1]))
                     else:
                         print
 
 
         ssw.init_destroy(qProfile)
-        if args.bBest and not args.bProtien:
+        if args.bBest and not args.bProtein:
             ssw.init_destroy(qRcProfile)
 
 
 if __name__ == '__main__':
 
     parser = ap.ArgumentParser()
-    parser.add_argument('-l', '--sLibPath', default='', help='path of libssw.so')
+    parser.add_argument('-l', '--sLibPath', default='./', help='path of libssw.so')
     parser.add_argument('-m', '--nMatch', type=int, default=2, help='a positive integer as the score for a match in genome sequence alignment. [default: 2]')
     parser.add_argument('-x', '--nMismatch', type=int, default=2, help='a positive integer as the score for a mismatch in genome sequence alignment. [default: 2]')
     parser.add_argument('-o', '--nOpen', type=int, default=3, help='a positive integer as the penalty for the gap opening in genome sequence alignment. [default: 3]')
     parser.add_argument('-e', '--nExt', type=int, default=1, help='a positive integer as the penalty for the gap extension in genome sequence alignment. [default: 1]')
-    parser.add_argument('-p', '--bProtien', action='store_true', help='Do protein sequence alignment. Without this option, the ssw_test will do genome sequence alignment. [default: False]')
+    parser.add_argument('-p', '--bProtein', action='store_true', help='Do protein sequence alignment. Without this option, pyssw will do genome sequence alignment. [default: False]')
     parser.add_argument('-a', '--sMatrix', default='', help='a file for either Blosum or Pam weight matrix. [default: Blosum50]')
     parser.add_argument('-c', '--bPath', action='store_true', help='Return the alignment path. [default: False]')
     parser.add_argument('-f', '--nThr', default=0, help='a positive integer. Only output the alignments with the Smith-Waterman score >= N.')
     parser.add_argument('-r', '--bBest', action='store_true', help='The best alignment will be picked between the original read alignment and the reverse complement read alignment. [default: False]')
     parser.add_argument('-s', '--bSam', action='store_true', help='Output in SAM format. [default: no header]')
     parser.add_argument('-header', '--bHeader', action='store_true', help='If -s is used, include header in SAM output.')
-    parser.add_argument('target', help='targe file')
+    parser.add_argument('target', help='target file')
     parser.add_argument('query', help='query file')
     if len(sys.argv) == 1:
         parser.print_help()
@@ -381,4 +372,5 @@ if __name__ == '__main__':
     t1 = ti.default_timer()
     main(args)
     t2 = ti.default_timer()
-    print >> sys.stderr, 'CPU time: {} seconds'.format(t2 - t1)
+    sys.stderr.write('CPU time: {} seconds\n'.format(t2 - t1))
+


=====================================
src/sse2neon.h
=====================================
The diff for this file was not included because it is too large.

=====================================
src/ssw.c
=====================================
@@ -1,6 +1,6 @@
 /* The MIT License
 
-   Copyright (c) 2012-1015 Boston College.
+   Copyright (c) 2012-2015 Boston College.
 
    Permission is hereby granted, free of charge, to any person obtaining
    a copy of this software and associated documentation files (the
@@ -23,19 +23,49 @@
    SOFTWARE.
 */
 
-/* Contact: Mengyao Zhao <zhangmp at bc.edu> */
+/* The 2-clause BSD License
+
+   Copyright 2006 Michael Farrar.  
+
+   Redistribution and use in source and binary forms, with or without
+   modification, are permitted provided that the following conditions are
+   met:
+   
+   1. Redistributions of source code must retain the above copyright
+      notice, this list of conditions and the following disclaimer.
+   
+   2. Redistributions in binary form must reproduce the above copyright
+      notice, this list of conditions and the following disclaimer in the
+      documentation and/or other materials provided with the distribution.
+   
+   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+   HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+*/
 
 /*
  *  ssw.c
  *
  *  Created by Mengyao Zhao on 6/22/10.
  *  Copyright 2010 Boston College. All rights reserved.
- *	Version 0.1.4
- *	Last revision by Mengyao Zhao on 07/19/16.
+ *	Version 1.2.4
+ *	Last revision by Mengyao Zhao on 2022-Apr-17.
+ *
+ *  The lazy-F loop implementation was derived from SWPS3, which is
+ *  MIT licensed under ETH Zürich, Institute of Computational Science.
  *
+ *  The core SW loop referenced the swsse2 implementation, which is
+ *  BSD licensed under Micharl Farrar.
  */
 
-#include <emmintrin.h>
 #include <stdint.h>
 #include <stdlib.h>
 #include <stdio.h>
@@ -43,6 +73,13 @@
 #include <math.h>
 #include "ssw.h"
 
+#ifdef __ARM_NEON // (M1)
+#include "sse2neon.h"
+#else // x86 (Intel)
+#include <emmintrin.h>
+#endif
+
+
 #ifdef __GNUC__
 #define LIKELY(x) __builtin_expect((x),1)
 #define UNLIKELY(x) __builtin_expect((x),0)
@@ -85,6 +122,43 @@ struct _profile{
 	uint8_t bias;
 };
 
+/* array index is an ASCII character value from a CIGAR, 
+   element value is the corresponding integer opcode between 0 and 8 */
+const uint8_t encoded_ops[] = {
+	0,         0,         0,         0,
+	0,         0,         0,         0,
+	0,         0,         0,         0,
+	0,         0,         0,         0,
+	0,         0,         0,         0,
+	0,         0,         0,         0,
+	0,         0,         0,         0,
+	0,         0,         0,         0,
+	0 /*   */, 0 /* ! */, 0 /* " */, 0 /* # */,
+	0 /* $ */, 0 /* % */, 0 /* & */, 0 /* ' */,
+	0 /* ( */, 0 /* ) */, 0 /* * */, 0 /* + */,
+	0 /* , */, 0 /* - */, 0 /* . */, 0 /* / */,
+	0 /* 0 */, 0 /* 1 */, 0 /* 2 */, 0 /* 3 */,
+	0 /* 4 */, 0 /* 5 */, 0 /* 6 */, 0 /* 7 */,
+	0 /* 8 */, 0 /* 9 */, 0 /* : */, 0 /* ; */,
+	0 /* < */, 7 /* = */, 0 /* > */, 0 /* ? */,
+	0 /* @ */, 0 /* A */, 0 /* B */, 0 /* C */,
+	2 /* D */, 0 /* E */, 0 /* F */, 0 /* G */,
+	5 /* H */, 1 /* I */, 0 /* J */, 0 /* K */,
+	0 /* L */, 0 /* M */, 3 /* N */, 0 /* O */,
+	6 /* P */, 0 /* Q */, 0 /* R */, 4 /* S */,
+	0 /* T */, 0 /* U */, 0 /* V */, 0 /* W */,
+	8 /* X */, 0 /* Y */, 0 /* Z */, 0 /* [ */,
+	0 /* \ */, 0 /* ] */, 0 /* ^ */, 0 /* _ */,
+	0 /* ` */, 0 /* a */, 0 /* b */, 0 /* c */,
+	0 /* d */, 0 /* e */, 0 /* f */, 0 /* g */,
+	0 /* h */, 0 /* i */, 0 /* j */, 0 /* k */,
+	0 /* l */, 0 /* m */, 0 /* n */, 0 /* o */,
+	0 /* p */, 0 /* q */, 0 /* r */, 0 /* s */,
+	0 /* t */, 0 /* u */, 0 /* v */, 0 /* w */,
+	0 /* x */, 0 /* y */, 0 /* z */, 0 /* { */,
+	0 /* | */, 0 /* } */, 0 /* ~ */, 0 /*  */
+};
+
 /* Generate query profile rearrange query sequence & calculate the weight of match/mismatch. */
 static __m128i* qP_byte (const int8_t* read_num,
 				  const int8_t* mat,
@@ -134,6 +208,7 @@ static alignment_end* sw_sse2_byte (const int8_t* ref,
 	 						 uint8_t bias,  /* Shift 0 point to a positive value. */
 							 int32_t maskLen) {
 
+// Put the largest number of the 16 numbers in vm into m.
 #define max16(m, vm) (vm) = _mm_max_epu8((vm), _mm_srli_si128((vm), 8)); \
 					  (vm) = _mm_max_epu8((vm), _mm_srli_si128((vm), 4)); \
 					  (vm) = _mm_max_epu8((vm), _mm_srli_si128((vm), 2)); \
@@ -159,7 +234,7 @@ static alignment_end* sw_sse2_byte (const int8_t* ref,
 	__m128i* pvE = (__m128i*) calloc(segLen, sizeof(__m128i));
 	__m128i* pvHmax = (__m128i*) calloc(segLen, sizeof(__m128i));
 
-	int32_t i, j;
+	int32_t i, j, k;
 	/* 16 byte insertion begin vector */
 	__m128i vGapO = _mm_set1_epi8(weight_gapO);
 
@@ -223,39 +298,21 @@ static alignment_end* sw_sse2_byte (const int8_t* ref,
 			vH = _mm_load_si128(pvHLoad + j);
 		}
 
-		/* Lazy_F loop: has been revised to disallow adjecent insertion and then deletion, so don't update E(i, j), learn from SWPS3 */
-        /* reset pointers to the start of the saved data */
-        j = 0;
-        vH = _mm_load_si128 (pvHStore + j);
-
-        /*  the computed vF value is for the given column.  since */
-        /*  we are at the end, we need to shift the vF value over */
-        /*  to the next column. */
-        vF = _mm_slli_si128 (vF, 1);
-        vTemp = _mm_subs_epu8 (vH, vGapO);
-		vTemp = _mm_subs_epu8 (vF, vTemp);
-		vTemp = _mm_cmpeq_epi8 (vTemp, vZero);
-		cmp  = _mm_movemask_epi8 (vTemp);
-
-        while (cmp != 0xffff)
-        {
-            vH = _mm_max_epu8 (vH, vF);
-			vMaxColumn = _mm_max_epu8(vMaxColumn, vH);
-            _mm_store_si128 (pvHStore + j, vH);
-            vF = _mm_subs_epu8 (vF, vGapE);
-            j++;
-            if (j >= segLen)
-            {
-                j = 0;
-                vF = _mm_slli_si128 (vF, 1);
-            }
-            vH = _mm_load_si128 (pvHStore + j);
-
-            vTemp = _mm_subs_epu8 (vH, vGapO);
-            vTemp = _mm_subs_epu8 (vF, vTemp);
-            vTemp = _mm_cmpeq_epi8 (vTemp, vZero);
-            cmp  = _mm_movemask_epi8 (vTemp);
-        }
+        /* Lazy_F loop: has been revised to disallow adjecent insertion and then deletion, so don't update E(i, j), learn from SWPS3 */
+		for (k = 0; LIKELY(k < 16); ++k) {
+			vF = _mm_slli_si128 (vF, 1);
+			for (j = 0; LIKELY(j < segLen); ++j) {
+				vH = _mm_load_si128(pvHStore + j);
+				vH = _mm_max_epu8(vH, vF);
+				vMaxColumn = _mm_max_epu8(vMaxColumn, vH);	// newly added line
+				_mm_store_si128(pvHStore + j, vH);
+				vH = _mm_subs_epu8(vH, vGapO);
+				vF = _mm_subs_epu8(vF, vGapE);
+				if (UNLIKELY(! _mm_movemask_epi8(_mm_cmpgt_epi8(vF, vH)))) goto end;
+			}
+		}
+
+end:		
 
 		vMaxScore = _mm_max_epu8(vMaxScore, vMaxColumn);
 		vTemp = _mm_cmpeq_epi8(vMaxMark, vMaxScore);
@@ -529,37 +586,6 @@ end:
 	return bests;
 }
 
-/*!     @function               Produce CIGAR 32-bit unsigned integer from CIGAR operation and CIGAR length
-        @param  length          length of CIGAR
-        @param  op_letter       CIGAR operation character ('M', 'I', etc)
-        @return                 32-bit unsigned integer, representing encoded CIGAR operation and length
-*/
-uint32_t to_cigar_int (uint32_t length, char op_letter)
-{
-        switch (op_letter) {
-                case 'M': /* alignment match (can be a sequence match or mismatch */
-                default:
-                        return length << BAM_CIGAR_SHIFT;
-                case 'S': /* soft clipping (clipped sequences present in SEQ) */
-                        return (length << BAM_CIGAR_SHIFT) | (4u);
-                case 'D': /* deletion from the reference */
-                        return (length << BAM_CIGAR_SHIFT) | (2u);
-                case 'I': /* insertion to the reference */
-                        return (length << BAM_CIGAR_SHIFT) | (1u);
-                case 'H': /* hard clipping (clipped sequences NOT present in SEQ) */
-                        return (length << BAM_CIGAR_SHIFT) | (5u);
-                case 'N': /* skipped region from the reference */
-                        return (length << BAM_CIGAR_SHIFT) | (3u);
-                case 'P': /* padding (silent deletion from padded reference) */
-                        return (length << BAM_CIGAR_SHIFT) | (6u);
-                case '=': /* sequence match */
-                        return (length << BAM_CIGAR_SHIFT) | (7u);
-                case 'X': /* sequence mismatch */
-                        return (length << BAM_CIGAR_SHIFT) | (8u);
-        }
-        return (uint32_t)-1; // This never happens
-}
-
 static cigar* banded_sw (const int8_t* ref,
 				 const int8_t* read,
 				 int32_t refLen,
@@ -572,11 +598,12 @@ static cigar* banded_sw (const int8_t* ref,
 				 int32_t n) {
 
 	uint32_t *c = (uint32_t*)malloc(16 * sizeof(uint32_t)), *c1;
-	int32_t i, j, e, f, temp1, temp2, s = 16, s1 = 8, l, max = 0;
+	int32_t i, j, e, f, temp1, temp2, s = 16, s1 = 8, l, max = 0, len;
 	int64_t s2 = 1024;
 	char op, prev_op;
 	int32_t width, width_d, *h_b, *e_b, *h_c;
 	int8_t *direction, *direction_line;
+    len = refLen > readLen ? refLen : readLen;
 	cigar* result = (cigar*)malloc(sizeof(cigar));
 	h_b = (int32_t*)malloc(s1 * sizeof(int32_t));
 	e_b = (int32_t*)malloc(s1 * sizeof(int32_t));
@@ -592,13 +619,9 @@ static cigar* banded_sw (const int8_t* ref,
 			e_b = (int32_t*)realloc(e_b, s1 * sizeof(int32_t));
 			h_c = (int32_t*)realloc(h_c, s1 * sizeof(int32_t));
 		}
-		while (width_d * readLen * 3 >= s2) {
+		while (width_d * readLen * 3 >= s2) {   // width_d*readLen* overflow before s2
 			++s2;
 			kroundup32(s2);
-			if (s2 < 0) {
-				fprintf(stderr, "Alignment score and position are not consensus.\n");
-				exit(1);
-			}
 			direction = (int8_t*)realloc(direction, s2 * sizeof(int8_t));
 		}
 		direction_line = direction;
@@ -621,6 +644,7 @@ static cigar* banded_sw (const int8_t* ref,
 
 				temp1 = i == 0 ? -weight_gapO : h_b[e] - weight_gapO;
 				temp2 = i == 0 ? -weight_gapE : e_b[e] - weight_gapE;
+                //fprintf(stderr, "e_b length: %d, u: %d\n", s1, u);
 				e_b[u] = temp1 > temp2 ? temp1 : temp2;
 				direction_line[de] = temp1 > temp2 ? 3 : 2;
 
@@ -643,7 +667,7 @@ static cigar* banded_sw (const int8_t* ref,
 			for (j = 1; j <= u; j ++) h_b[j] = h_c[j];
 		}
 		band_width *= 2;
-	} while (LIKELY(max < score));
+	} while (max < score && band_width <= len); // 2022-Apr-08
 	band_width /= 2;
 
 	// trace back
@@ -653,7 +677,7 @@ static cigar* banded_sw (const int8_t* ref,
 	l = 0;	// record length of current cigar
 	op = prev_op = 'M';
 	temp2 = 2;	// h
-	while (LIKELY(i > 0)) {
+    while (LIKELY(i >= 0 && j > 0)) {
 		set_d(temp1, band_width, i, j, temp2);
 		switch (direction_line[temp1]) {
 			case 1:
@@ -692,7 +716,7 @@ static cigar* banded_sw (const int8_t* ref,
 				free(e_b);
 				free(h_b);
 				free(c);
-				free(result);
+				free(result); 
 				return 0;
 		}
 		if (op == prev_op) ++e;
@@ -810,6 +834,7 @@ s_align* ssw_align (const s_profile* prof,
 	r->read_begin1 = -1;
 	r->cigar = 0;
 	r->cigarLen = 0;
+    r->flag = 0;
 	if (maskLen < 15) {
 		fprintf(stderr, "When maskLen < 15, the function ssw_align doesn't return 2nd best alignment information.\n");
 	}
@@ -835,8 +860,8 @@ s_align* ssw_align (const s_profile* prof,
 		return NULL;
 	}
 	r->score1 = bests[0].score;
-	r->ref_end1 = bests[0].ref;
-	r->read_end1 = bests[0].read;
+	r->ref_end1 = bests[0].ref; // 0_based, always count from the input seq begin
+	r->read_end1 = bests[0].read;   // 0_based, count from the alignment begin (aligned length of the read)
 	if (maskLen >= 15) {
 		r->score2 = bests[1].score;
 		r->ref_end2 = bests[1].ref;
@@ -860,7 +885,13 @@ s_align* ssw_align (const s_profile* prof,
 	free(read_reverse);
 	r->ref_begin1 = bests_reverse[0].ref;
 	r->read_begin1 = r->read_end1 - bests_reverse[0].read;
-	free(bests_reverse);
+
+    if (UNLIKELY(r->score1 > bests_reverse[0].score)) { // banded_sw result will miss a small part
+		fprintf(stderr, "Warning: The alignment path of one pair of sequences may miss a small part. [ssw.c ssw_align]\n");
+        r->flag = 2;  
+    }
+    free(bests_reverse);
+
 	if ((7&flag) == 0 || ((2&flag) != 0 && r->score1 < filters) || ((4&flag) != 0 && (r->ref_end1 - r->ref_begin1 > filterd || r->read_end1 - r->read_begin1 > filterd))) goto end;
 
 	// Generate cigar.
@@ -868,11 +899,18 @@ s_align* ssw_align (const s_profile* prof,
 	readLen = r->read_end1 - r->read_begin1 + 1;
 	band_width = abs(refLen - readLen) + 1;
 	path = banded_sw(ref + r->ref_begin1, prof->read + r->read_begin1, refLen, readLen, r->score1, weight_gapO, weight_gapE, band_width, prof->mat, prof->n);
-	if (path == 0) {
-		free(r);
-		r = NULL;
-	}
-	else {
+
+/*int32_t i, length;
+    char op;
+	for (i = 0; i < path->length; ++i) {
+		op = cigar_int_to_op(path->seq[i]);
+		length = cigar_int_to_len(path->seq[i]);
+        fprintf(stderr, "%d%c", length, op);
+    }
+    fprintf(stderr, "\n");*/
+
+    if (path == 0) r->flag = 1;    // banded_sw is failed.
+    else {
 		r->cigar = path->seq;
 		r->cigarLen = path->length;
 		free(path);
@@ -925,8 +963,8 @@ uint32_t* store_previous_m (int8_t choice,	// 0: current not M, 1: current match
 int32_t mark_mismatch (int32_t ref_begin1,
 					   int32_t read_begin1,
 					   int32_t read_end1,
-					   const char* ref,
-					   const char* read,
+					   const int8_t* ref,
+					   const int8_t* read,
 					   int32_t readLen,
 					   uint32_t** cigar,
 					   int32_t* cigarLen) {
@@ -943,10 +981,8 @@ int32_t mark_mismatch (int32_t ref_begin1,
 		length = cigar_int_to_len((*cigar)[i]);
 		if (op == 'M') {
 			for (j = 0; j < length; ++j) {
-				fprintf(stderr, "ref[%d]: %c\tread[%d]: %c\n", j, *ref, j, *read);
 				if (*ref != *read) {
 					++ mismatch_length;
-					fprintf(stderr, "length_m: %d\n", length_m);
 					// the previous is match; however the current one is mismatche
 					new_cigar = store_previous_m (2, &length_m, &length_x, &p, &s, new_cigar);			
 					++ length_x;


=====================================
src/ssw.h
=====================================
@@ -3,8 +3,8 @@
  *
  *  Created by Mengyao Zhao on 6/22/10.
  *  Copyright 2010 Boston College. All rights reserved.
- *	Version 0.1.4
- *	Last revision by Mengyao Zhao on 07/19/16.
+ *	Version 1.2.3
+ *	Last revision by Mengyao Zhao on 2022-Apr-15.
  *
  */
 
@@ -14,7 +14,13 @@
 #include <stdio.h>
 #include <stdint.h>
 #include <string.h>
+
+#ifdef __ARM_NEON // (M1)
+#include "sse2neon.h"
+#else // x86 (Intel)
 #include <emmintrin.h>
+#endif
+
 
 #ifdef __cplusplus
 extern "C" {
@@ -25,24 +31,26 @@ extern "C" {
 #define BAM_CIGAR_SHIFT 4u
 #endif
 
+extern const uint8_t encoded_ops[];
 
 /*!	@typedef	structure of the query profile	*/
 struct _profile;
 typedef struct _profile s_profile;
 
 /*!	@typedef	structure of the alignment result
-	@field	score1	the best alignment score
-	@field	score2	sub-optimal alignment score
-	@field	ref_begin1	0-based best alignment beginning position on reference;	ref_begin1 = -1 when the best alignment beginning
+    @field	score1	the best alignment score
+    @field	score2	sub-optimal alignment score
+    @field	ref_begin1	0-based best alignment beginning position on reference;	ref_begin1 = -1 when the best alignment beginning
 						position is not available
-	@field	ref_end1	0-based best alignment ending position on reference
-	@field	read_begin1	0-based best alignment beginning position on read; read_begin1 = -1 when the best alignment beginning
+    @field	ref_end1	0-based best alignment ending position on reference
+    @field	read_begin1	0-based best alignment beginning position on read; read_begin1 = -1 when the best alignment beginning
 						position is not available
-	@field	read_end1	0-based best alignment ending position on read
-	@field	read_end2	0-based sub-optimal alignment ending position on read
-	@field	cigar	best alignment cigar; stored the same as that in BAM format, high 28 bits: length, low 4 bits: M/I/D (0/1/2);
+    @field	read_end1	0-based best alignment ending position on read
+    @field	read_end2	0-based sub-optimal alignment ending position on read
+    @field	cigar	best alignment cigar; stored the same as that in BAM format, high 28 bits: length, low 4 bits: M/I/D (0/1/2);
 					cigar = 0 when the best alignment path is not available
-	@field	cigarLen	length of the cigar string; cigarLen = 0 when the best alignment path is not available
+    @field	cigarLen	length of the cigar string; cigarLen = 0 when the best alignment path is not available
+    @field  flag  If the alignment path is accurate (or has missing part). 0: accurate; 1: banded_sw is totally failed; 2: banded_sw returned path has missing part
 */
 typedef struct {
 	uint16_t score1;
@@ -54,6 +62,7 @@ typedef struct {
 	int32_t ref_end2;
 	uint32_t* cigar;
 	int32_t cigarLen;
+    uint16_t flag;
 } s_align;
 
 /*!	@function	Create the query profile using the query sequence.
@@ -148,8 +157,8 @@ void align_destroy (s_align* a);
 int32_t mark_mismatch (int32_t ref_begin1,
 					   int32_t read_begin1,
 					   int32_t read_end1,
-					   const char* ref,
-					   const char* read,
+					   const int8_t* ref,
+					   const int8_t* read,
 					   int32_t readLen,
 					   uint32_t** cigar, 
 					   int32_t* cigarLen);
@@ -159,28 +168,27 @@ int32_t mark_mismatch (int32_t ref_begin1,
 	@param	op_letter	CIGAR operation character ('M', 'I', etc)
 	@return			32-bit unsigned integer, representing encoded CIGAR operation and length
 */
-uint32_t to_cigar_int (uint32_t length, char op_letter);
+static inline uint32_t to_cigar_int (uint32_t length, char op_letter) {
+	return (length << BAM_CIGAR_SHIFT) | (encoded_ops[(int)op_letter]);
+}
 
 /*!	@function		Extract CIGAR operation character from CIGAR 32-bit unsigned integer
 	@param	cigar_int	32-bit unsigned integer, representing encoded CIGAR operation and length
 	@return			CIGAR operation character ('M', 'I', etc)
 */
 //char cigar_int_to_op (uint32_t cigar_int);
-static inline char cigar_int_to_op(uint32_t cigar_int) 
-{
+static inline char cigar_int_to_op(uint32_t cigar_int) {
 	return (cigar_int & 0xfU) > 8 ? 'M': MAPSTR[cigar_int & 0xfU];
 }
 
-
 /*!	@function		Extract length of a CIGAR operation from CIGAR 32-bit unsigned integer
 	@param	cigar_int	32-bit unsigned integer, representing encoded CIGAR operation and length
 	@return			length of CIGAR operation
 */
-//uint32_t cigar_int_to_len (uint32_t cigar_int);
-static inline uint32_t cigar_int_to_len (uint32_t cigar_int)
-{
+static inline uint32_t cigar_int_to_len (uint32_t cigar_int) {
 	return cigar_int >> BAM_CIGAR_SHIFT;
 }
+
 #ifdef __cplusplus
 }
 #endif	// __cplusplus


=====================================
src/ssw_cpp.cpp
=====================================
@@ -1,3 +1,7 @@
+// ssw_cpp.cpp
+// Created by Wan-Ping Lee
+// Last revision by Mengyao Zhao on 2017-05-30
+
 #include "ssw_cpp.h"
 #include "ssw.h"
 
@@ -323,7 +327,7 @@ int Aligner::TranslateBase(const char* bases, const int& length,
 
 
 bool Aligner::Align(const char* query, const Filter& filter,
-                    Alignment* alignment) const
+                    Alignment* alignment, const int32_t maskLen) const
 {
   if (!translation_matrix_) return false;
   if (reference_length_ == 0) return false;
@@ -342,7 +346,7 @@ bool Aligner::Align(const char* query, const Filter& filter,
   s_align* s_al = ssw_align(profile, translated_reference_, reference_length_,
                                  static_cast<int>(gap_opening_penalty_),
 				 static_cast<int>(gap_extending_penalty_),
-				 flag, filter.score_filter, filter.distance_filter, query_len);
+				 flag, filter.score_filter, filter.distance_filter, maskLen);
 
   alignment->Clear();
   ConvertAlignment(*s_al, query_len, alignment);
@@ -359,7 +363,7 @@ bool Aligner::Align(const char* query, const Filter& filter,
 
 
 bool Aligner::Align(const char* query, const char* ref, const int& ref_len,
-                    const Filter& filter, Alignment* alignment) const
+                    const Filter& filter, Alignment* alignment, const int32_t maskLen) const
 {
   if (!translation_matrix_) return false;
 
@@ -369,9 +373,6 @@ bool Aligner::Align(const char* query, const char* ref, const int& ref_len,
   TranslateBase(query, query_len, translated_query);
 
   // calculate the valid length
-  //int calculated_ref_length = static_cast<int>(strlen(ref));
-  //int valid_ref_len = (calculated_ref_length > ref_len)
-  //                    ? ref_len : calculated_ref_length;
   int valid_ref_len = ref_len;
   int8_t* translated_ref = new int8_t[valid_ref_len];
   TranslateBase(ref, valid_ref_len, translated_ref);
@@ -386,7 +387,7 @@ bool Aligner::Align(const char* query, const char* ref, const int& ref_len,
   s_align* s_al = ssw_align(profile, translated_ref, valid_ref_len,
                                  static_cast<int>(gap_opening_penalty_),
 				 static_cast<int>(gap_extending_penalty_),
-				 flag, filter.score_filter, filter.distance_filter, query_len);
+				 flag, filter.score_filter, filter.distance_filter, maskLen);
 
   alignment->Clear();
   ConvertAlignment(*s_al, query_len, alignment);


=====================================
src/ssw_cpp.h
=====================================
@@ -1,3 +1,7 @@
+// ssw_cpp.h
+// Created by Wan-Ping Lee
+// Last revision by Mengyao Zhao on 2017-05-30
+
 #ifndef COMPLETE_STRIPED_SMITH_WATERMAN_CPP_H_
 #define COMPLETE_STRIPED_SMITH_WATERMAN_CPP_H_
 
@@ -127,9 +131,12 @@ class Aligner {
   // @param    query     The query sequence.
   // @param    filter    The filter for the alignment.
   // @param    alignment The container contains the result.
+  // @param    maskLen   The distance between the optimal and suboptimal alignment ending position will >= maskLen. We suggest to 
+  //                     use readLen/2, if you don't have special concerns. Note: maskLen has to be >= 15, otherwise this function 
+  //                     will NOT return the suboptimal alignment information.
   // @return   True: succeed; false: fail.
   // =========
-  bool Align(const char* query, const Filter& filter, Alignment* alignment) const;
+  bool Align(const char* query, const Filter& filter, Alignment* alignment, const int32_t maskLen) const;
 
   // =========
   // @function Align the query againt the reference.
@@ -141,10 +148,13 @@ class Aligner {
   // @param    ref_len   The length of the reference sequence.
   // @param    filter    The filter for the alignment.
   // @param    alignment The container contains the result.
+  // @param    maskLen   The distance between the optimal and suboptimal alignment ending position will >= maskLen. We suggest to 
+  //                     use readLen/2, if you don't have special concerns. Note: maskLen has to be >= 15, otherwise this function 
+  //                     will NOT return the suboptimal alignment information.
   // @return   True: succeed; false: fail.
   // =========
   bool Align(const char* query, const char* ref, const int& ref_len,
-             const Filter& filter, Alignment* alignment) const;
+             const Filter& filter, Alignment* alignment, const int32_t maskLen) const;
 
   // @function Clear up all containers and thus the aligner is disabled.
   //             To rebuild the aligner please use Build functions.


=====================================
src/ssw_lib.py
=====================================
@@ -3,6 +3,7 @@
 Simple python wrapper for SSW library
 Please put the path of libssw.so into LD_LIBRARY_PATH or pass it explicitly as a parameter
 By Yongan Zhao (March 2016)
+Revised by Mengyao Zhao on 2022-May-19
 """
 
 import sys
@@ -101,10 +102,10 @@ class CSsw(object):
         """
 # load libssw
         sLibName = 'libssw.so'
-        if not sLibPath:
-# user doesn't give the path explicitly
+        if sLibPath:
+# user does give the path explicitly
             if not op.exists(op.join(sLibPath, sLibName)):
-                print >> sys.stderr, 'libssw.so does not exist in the input path'
+                sys.stderr.write('libssw.so does not exist in the input path.')
                 sys.exit()
             self.ssw = ct.cdll.LoadLibrary(op.join(sLibPath,sLibName))
         else:
@@ -115,7 +116,7 @@ class CSsw(object):
                     bFound = True
                     self.ssw = ct.cdll.LoadLibrary(op.join(s,sLibName))
             if bFound == False:
-                print >> sys.stderr, 'libssw.so does not exist in PATH'
+                sys.stderr.write('libssw.so does not exist in PATH')
                 sys.exit()
 
 # init ssw_init



View it on GitLab: https://salsa.debian.org/med-team/libssw/-/commit/c9c6106f955eac1ec9d8a40cc85495ccf3178810

-- 
View it on GitLab: https://salsa.debian.org/med-team/libssw/-/commit/c9c6106f955eac1ec9d8a40cc85495ccf3178810
You're receiving this email because of your account on salsa.debian.org.


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://alioth-lists.debian.net/pipermail/debian-med-commit/attachments/20220524/4f6bb5cc/attachment-0001.htm>


More information about the debian-med-commit mailing list