[med-svn] [Git][med-team/mindthegap][upstream] New upstream version 2.3.0

Nilesh Patra (@nilesh) gitlab at salsa.debian.org
Sun May 29 06:53:31 BST 2022



Nilesh Patra pushed to branch upstream at Debian Med / mindthegap


Commits:
b9057443 by Nilesh Patra at 2022-05-29T11:15:30+05:30
New upstream version 2.3.0
- - - - -


30 changed files:

- − .travis.yml
- CHANGELOG.md
- CMakeLists.txt
- README.md
- data/reads_r1.fastq
- data/reads_r2.fastq
- data/reference.fasta
- doc/MindTheGap_insertion_caller.md
- scripts/jenkins/tool-mindthegap-build-debian7-64bits-gcc-4.7.sh
- scripts/jenkins/tool-mindthegap-build-macos-10.9.5-gcc-4.2.1.sh
- scripts/jenkins/tool-mindthegap-release-debian.sh
- src/FindBreakpoints.hpp
- src/FindHeteroInsertion.hpp
- src/FindSNP.hpp
- + src/FindSmallInsertion.hpp
- src/Finder.cpp
- src/Finder.hpp
- src/main.cpp
- test/full_test/README
- test/full_test/allele1.fasta
- test/full_test/allele2.fasta
- test/full_test/gold.breakpoints
- test/full_test/gold.insertions.fasta
- test/full_test/gold.insertions.vcf
- test/full_test/gold.othervariants.vcf
- test/full_test/gold_bed.breakpoints
- test/full_test/gold_bed.othervariants.vcf
- test/full_test/gold_fill.output
- test/full_test/gold_find.output
- test/full_test/reference.fasta


Changes:

=====================================
.travis.yml deleted
=====================================
@@ -1,34 +0,0 @@
-language: cpp
-os:
-- linux
-- osx
-compiler:
-- clang
-- gcc
-addons:
-  apt:
-    sources:
-    - ubuntu-toolchain-r-test
-    - llvm-toolchain-precise-3.7
-    - george-edison55-precise-backports  # for cmake 3
-    packages:
-    - libcppunit-dev
-    - g++-4.8
-    - clang-3.7
-    - cmake
-    - cmake-data
-install:
-- if [ "`echo $CXX`" == "g++" ]     && [ "$TRAVIS_OS_NAME" == "linux" ]; then export CXX=g++-4.8; fi
-- if [ "`echo $CXX`" == "clang++" ] && [ "$TRAVIS_OS_NAME" == "linux" ]; then export CXX=clang++-3.7; fi
-matrix:
-  exclude:
-  - os: osx
-    compiler: gcc
-script:
-- mkdir build
-- cd build
-- cmake .. && make 
-- cd ../test && ./simple_full_test.sh
-env:
-  global:
-    - MAKEFLAGS="-j 4"


=====================================
CHANGELOG.md
=====================================
@@ -3,6 +3,16 @@
 --------------------------------------------------------------------------------
 ## [Unreleased]
 
+--------------------------------------------------------------------------------
+## [2.3.0] - 2022-04-20
+
+Improving the `Find` (insertion breakpoint finder) module:
+
+* very small insertions (1 or 2 bp) are now directly assembled in the `Find` module and are output in the `.othervariants.vcf` file. This may increase the running time of the `Find` module but the overall running time of MindTheGap (Find+Fill) is drastically reduced. Indeed, these numerous small insertions are no longer output in the breakpoint file, nor given as input for the `Fill` assembly module which performs a deeper traversal of the de Bruijn graph (designed for longer insertions). 
+* a new filter is implemented to reduce the number of false positive insertion sites. It is based on the number of branching kmers in a 100-bp window before a heterozygous site. It can be tuned with the new option `-branching-filter`. It is activated by default, so the number of heterozygous sites detected may differ from previous versions.
+
+With this new version, the running time of MindTheGap as an insertion variant caller is reduced for real large datasets, such as human genome re-sequencing data. 
+
 --------------------------------------------------------------------------------
 ## [2.2.3] - 2021-06-11
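The branching filter announced in the 2.3.0 entry can be pictured with a minimal sketch, assuming every reference position already carries a flag telling whether its k-mer is branching in the de Bruijn graph. The names `passes_branching_filter` and `is_branching` are hypothetical and do not come from the MindTheGap sources; only the 100-bp window, the default threshold of 15 and the `-1` (no filter) convention are taken from the documentation changed in this commit. Treating "count above the threshold" as the rejection rule is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not MindTheGap's actual implementation.
// is_branching[i] is true when the k-mer starting at reference position i
// is a branching node of the de Bruijn graph.
// Returns true when the candidate heterozygous site at `site` is kept.
bool passes_branching_filter(const std::vector<bool>& is_branching,
                             std::size_t site,
                             int threshold,             // -1 disables the filter
                             std::size_t window = 100)  // window length in bp
{
    if (threshold < 0) return true;  // filter switched off
    const std::size_t begin = site > window ? site - window : 0;
    int count = 0;
    // Count branching k-mers in the window that ends just before the site.
    for (std::size_t i = begin; i < site && i < is_branching.size(); ++i) {
        if (is_branching[i]) ++count;
    }
    return count <= threshold;  // assumed rule: reject when above the maximum
}
```

With the default threshold of 15, a site preceded by 20 branching k-mers would be discarded under this sketch, while lowering the threshold to 10 or 5 (as suggested for human data) discards more sites.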
 


=====================================
CMakeLists.txt
=====================================
@@ -27,8 +27,8 @@ cmake_minimum_required(VERSION 3.1)
 ################################################################################
 # The default version number is the latest official build
 SET (gatb-tool_VERSION_MAJOR 2)
-SET (gatb-tool_VERSION_MINOR 2)
-SET (gatb-tool_VERSION_PATCH 3)
+SET (gatb-tool_VERSION_MINOR 3)
+SET (gatb-tool_VERSION_PATCH 0)
 
 # But, it is possible to define another release number during a local build
 IF (DEFINED MAJOR)


=====================================
README.md
=====================================
@@ -2,7 +2,7 @@
 
 | **Linux** | **Mac OSX** |
 |-----------|-------------|
-[![Build Status](https://ci.inria.fr/gatb-core/view/MindTheGap/job/tool-mindthegap-build-debian7-64bits-gcc-4.7/badge/icon)](https://ci.inria.fr/gatb-core/view/MindTheGap/job/tool-mindthegap-build-debian7-64bits-gcc-4.7/) | [![Build Status](https://ci.inria.fr/gatb-core/view/MindTheGap/job/tool-mindthegap-build-macos-10.9.5-gcc-4.2.1/badge/icon)](https://ci.inria.fr/gatb-core/view/MindTheGap/job/tool-mindthegap-build-macos-10.9.5-gcc-4.2.1/)
+[![Build Status](https://ci.inria.fr/gatb-core/view/MindTheGap-gitlab/job/tool-mindthegap-build-debian7-64bits-gcc-4.7-gitlab/badge/icon)](https://ci.inria.fr/gatb-core/view/MindTheGap/job/tool-mindthegap-build-debian7-64bits-gcc-4.7/) | [![Build Status](https://ci.inria.fr/gatb-core/view/MindTheGap-gitlab/job/tool-mindthegap-build-macos-10.9.5-gcc-4.2.1-gitlab/badge/icon)](https://ci.inria.fr/gatb-core/view/MindTheGap/job/tool-mindthegap-build-macos-10.9.5-gcc-4.2.1/)
 
 [![install with bioconda](https://img.shields.io/badge/install%20with-bioconda-brightgreen.svg?style=flat)](http://bioconda.github.io/recipes/mindthegap/README.html)
 
@@ -194,27 +194,30 @@ MindTheGap is composed of two main modules : breakpoint detection (`find` module
 4. **MindTheGap Output**
 
     All the output files are prefixed either by a default name: "MindTheGap_Expe-[date:YY:MM:DD-HH:mm]" or by a user defined prefix (option `-out` of MindTheGap).
-    Both MindTheGap modules generate the graph file if reads were given as input: 
     
-* a graph file (`.h5`). This is a binary file, to obtain information stored in it, you can use the utility program `dbginfo` located in your bin directory or in ext/gatb-core/bin/.
-  
-    `MindTheGap find` generates the following output files:
+    The main results files are output by the Fill module, these are:
     
-    * a breakpoint file (`.breakpoints`) in fasta format. 
+    * an **insertion variant file** (`.insertions.vcf`) in vcf format, in the case of insertion variant detection (for insertions >2 bp).
+
+    * an **assembly graph file** (`.gfa`) in GFA format, in the case of contig gap-filling. It contains the original contigs and the obtained gap-fill sequences (nodes of the graph), together with their overlapping relationships (arcs of the graph).
+
+    Additional output files are:
     
-* a variant file (`.othervariants.vcf`) in vcf format. It contains SNPs and deletion events.
+	* a graph file (`.h5`), output by both MindTheGap modules. This is a binary file containing the de Bruijn graph data structure. To obtain information stored in it, you can use the utility program `dbginfo` located in your bin directory or in ext/gatb-core/bin/.
   
-    `MindTheGap fill` generates the following output files:
+    * Files output specifically by `MindTheGap find`:
     
-* a sequence file (`.insertions.fasta`) in fasta format. It contains the inserted sequences or contig gap-fills that were successfully assembled. 
-  
-* an insertion variant file (`.insertions.vcf`) in vcf format, in the case of insertion variant detection. 
+    	* a breakpoint file (`.breakpoints`) in fasta format. 
+    
+		* a variant file (`.othervariants.vcf`) in vcf format. It contains SNPs, deletions and very small insertions (1-2 bp).
   
-* an assembly graph file (`.gfa`) in GFA format, in the case of contig gap-filling. It contains the original contigs and the obtained gap-fill sequences (nodes of the graph), together with their overlapping relationships (arcs of the graph).
+    * Files output specifically by `MindTheGap fill`:
+    
+		* a sequence file (`.insertions.fasta`) in fasta format. It contains the inserted sequences (for insertions >2 bp) or contig gap-fills that were successfully assembled. 
   
-* a log file (`.info.txt`), a tabular file with some information about the filling process for each breakpoint/grap-fill. 
+		* a log file (`.info.txt`), a tabular file with some information about the filling process for each breakpoint/gap-fill. 
   
-* with option `-extend`, an additional sequence file (`.extensions.fasta`) in fasta format. It contains sequence extensions for failed insertion or gap-filling assemblies, ie. when the target kmer was not found, the first contig immediately after the source kmer is output.
+		* with option `-extend`, an additional sequence file (`.extensions.fasta`) in fasta format. It contains sequence extensions for failed insertion or gap-filling assemblies, ie. when the target kmer was not found, the first contig immediately after the source kmer is output.
   
   ​    
 
@@ -233,10 +236,14 @@ Either in your `bin/` directory or in `ext/gatb-core/bin/`, you can find additio
 
 ## Reference
 
+If you use MindTheGap, please cite: 
+
 MindTheGap: integrated detection and assembly of short and long insertions. Guillaume Rizk, Anaïs Gouin, Rayan Chikhi and Claire Lemaitre. Bioinformatics 2014 30(24):3451-3457. http://bioinformatics.oxfordjournals.org/content/30/24/3451
 
 [Web page](https://gatb.inria.fr/software/mind-the-gap/) with some updated results.
 
+MindTheGap was also evaluated in a recent benchmark exploring many different genomic features (size, nature, repeat context, junctional homology at breakpoints) of human insertion variants. Among other tested SV callers, MindTheGap was the only tool able to output sequence-resolved insertions for many types of insertions. Read more: [Towards a better understanding of the low recall of insertion variants with short-read based variant callers.](https://bmcgenomics.biomedcentral.com/articles/10.1186/s12864-020-07125-5) Delage W, Thevenon J, Lemaitre C. *BMC Genomics* **2020**, 21(1):762.
+
 
 # Contact
 


=====================================
data/reads_r1.fastq
=====================================
The diff for this file was not included because it is too large.

=====================================
data/reads_r2.fastq
=====================================
The diff for this file was not included because it is too large.

=====================================
data/reference.fasta
=====================================
@@ -8,3 +8,7 @@ TGCTGCCGATCGCTACGACGTCCTACCTTACACACAACGGGCCGCGTTCATACCCACGTATGAAGACATGCGGTTATCCG
 TATTGCGCCCTTCAAGAAGCTTCTGCTGACCGTAGGCGTCTCGGCGGTTTGTACTTTGAAAAATTAGCTGCACTACATCCGATGGGTATCCCTCCTCAATCTCAGCAGACCCGGAAAGCGATAGAATCAGCCACGCGGTCGTCCGGGCTAGGGGCCCTGCGCAAGGAAGGTTGGACAGGGCTAGACCCGGAAGCATCGGCTTTTCCTAAATGGTGACGGAGTTATATAGGGTAAGCCTGATAGCGCGGTAGGTGTAATGGCCATCCCCTCGCCTAGCGTGCGCGCAGACAAGTCCAGTCCCGGAGGAGGCATAGGCCTCATTATCATTTCCCTAGAATCGCTCTTGACATCTAGGTTGTACTAGGGACCAGGCGCCCAAAGCGGACGGTTCTCCGTGCTTTCGTGCCGTTTCAGCGTAAGATGCTATTTTTTGGGGAAATGGTCGGCGTGTGCGGGGGAGAACCACGGTACCAACTACGATAAGTCCGTCGTGTAACTTACGTGAAGGTGCTGTGAAGCAGGAATCCGTGCCAAAATGTCCGTGCGATATCCAACTTTCATAGTATTACACGAGAGCCTATGATTTGCCCAGGCGCGACCCGTGAATCGAGGTAATCGCCGACCAGATATTGCGAAACACCACATTACATGACTACTGTCCGCTTGAAGAGTTATATACTTGACAGTCCTGGTTGACGGCACAGCATATCTCCAATGTGTGGTTTAAAGTCTCACGTTCTTCATGCGCGCCGGCCCATGGGAACAGGTATCCTTACTTTCGGTACAAATGAGGCTCCAAAATAGCACGCTTGCAGCAGTCAAGTTGAACGCCTTAAAAGGCACCGCCGCTCGTTCATTGGGATTCCTTGAGAATCGTGACTTGTTACACTATAAGATCATGGATTGGACAAAATAGGCCAACTCCCGCACGCTGTGGCTATTCTTAAGTTGCATAGGTGGGAGTAGCCTTATACTCGATTTCTAAAAAGAGTAGGTGAGC
 >Seq4
 TTCCGGCGCCGCACTAATTGAAGTGGTGAGCTGACCAGTCGTTCAGGATCCGAAGGCGGGGATGGCGCTATAGGAGCCGGCAGGTATGCTTTGCCGCAAAATTTCGGGGTGGTGGAACCGTCTTACCGAAAGTTAGCTACAGCCTGGAATGTGAAATTCCATGACCTGCCCGTCCTGTGTCCACAGGGCGACATTTGCCACGTAGGTAGGGCGACCATTAGAATGCTGCATTATCGGGCGATAAAAAGTTTTATACCCAAGAATCCTACAAAGATGAAAATTTCGAAGAGCTGCACGCAGTTGTAAGTTGCTTTTCTGGGGTAATCGAGATTCTCCACCATAACCTGCGCAATGCATCGTGAAGCTTTACCGCGCCCAAGGGGAGCGTCTCAGTGGGGTTGCCTCCAGGGATATATTGAAAGTTGAAGAAGAAGATCACAGGTTAAGCGGTATGTTAAGTTAGAACTCACGGGGAGCCGCCTTGATTTTGTTCGACATGAACCAGAGACCAAGTGTGTTATGTTCTGGAACCTTAATACGTACGTCGCCAGCACCGAGCCGGCACTCCATCTCTTTTGGGTGCGCAACATTGCTATACTTAGGATCCATTGACATCTGTCAGCCGTCTTTCCAGAACGTTATAAGACTCGTGAGGAAATTATACAAATCGTTGCCATCATCCAAAGCAAAGTACTTCCGCTTAGGAGTGCCTTGAAGAACCGATTATCTCTGACAATGTAATGCCACAGCACCCTCGACAAAGTTCTACATTCGTTCCAGGTCATGATACAGCGCGCTAAATTACCGCTACGAGCCATACCCCGAACATTGAGACCTGGCCAGTAGGTAGGTGTCAAATCGATATCCACACCTGTCGAAGCAGCTAGGGACCTAGACGCAACAGTAACCGCCTCGGAGTAAGCCCTGGTAAAGATCGGTTGCGGCGGGAGTCCTCCATTCAGGCCAAACGTGCAGTGCTCGATGTGCTTCCTATCGCTCT
+>Seq5
+GATGTTTAGAAGTTTCCAGGTCACGCCAATGATTGGCATTTACACACGTGGATCAGCGGACATATCTAACCCTTAGTGTTCTTAAGAGCAACTCACTACTCATTTCCACTAACCCCGCCGGCGGTAATTCCAATCTAGTTGATCAGACTTCCCAGTCAATGAAAGCGACACCGTGCGTCTGTAATACCAACAAGACCCTGCTGTCGTCCCGCAGAGGACGCGGCACCTCCGGATTTTGAGTCCAGTCTGAACGATTTTCGATCACTCACCATGGATCTGGAAAACGGAGTCGAGTACTCAGAGCCAAATTGATGCATTTCCAATGACCCGATGCAGGTGCGACCGATCTTCGCCTATGCTTCCCGCCGTAATTATTGAGTCTGGGTCCCGGCCGCTAACGTTGACTCACGGGGAGGTACCCGTGCGTATTCTTCTCAAAGTGACGCTGGACAGCAGCGCATGTCCGAGCCCCATCGTCCTATCTGGTGTAGAGTCTTACCCTAATTAGAGTGATCGAACCAGTAGGTGTCGCGGTCTTAGGGCTCCCATTGTCCAAGGGAACGTGAACAGATATGAATCTGGGAGAATAGTGCAGCGTTGCCCTTCTGGTCGGTCAGCCCTTGCCTACGGCCCGTATGCGGAGAATGAAGGCGTGAAACATTCTGCTCTTTTAGAAGCAGCGGCTGCACCCGTATAACAATCGCACGATCGTACGTCTCATTTGCCGCGTTGGCGCGCCCGTGGATGATGGACCACGGTATGAACCTCTGCACTTCAAATTTGACGCAATCCTGCACTCACCGCACACAGTTCTAGTCTAACCGTCGCAGTGTCTGCTTTAAGGTAGAGATCGATACTTAGGATATGTTCATGTGTGTTTGTAGCGCTGGACCCTCTTATGGTGTGGTCACTTGTGATGGATCGAGGAACTTAGGCGGTTAACTTGTTTCGACGTCTCACCGACAATATCAGGATTTAGTATCG
+>Seq6
+ACCGAAAATGACAATGTTCACACGCATGCTCGGCGTGGAAAAGAGCCTTTTCTAAGACCGACTCGTTCCGGGCAGCAGGATTATTAGCCAATCAAAATTATCGACCGGTCATCAAGCTGCGATAGTGCAGGCGCATGCCGTCCAATGGGTCCACGGCGGAAGTGCGTTCGTCTACTCTGTCAAATCTTAACATTTTTTGAGGCTAATCCGGCCGGTAGTGTACCGTGAACCAAAGTCCTTCTACGAGCGTATTAGATTGCTCAAAAGATCCGGGAGAATTGACCAGGTCGTATCTTTAAAAACGCTGGTGCGAGCAGCTGCTGTTTTATCAACACCCATTTAGTCCTGTGAAGTTTGCTTAGCAGATACACCTTCCCGCGTGGTATGAGAGGCTGTTCTTTTAAAAACTATGAGGCTCTGGCACCTTCGACGCTAACAAAGTCCCCACGGACCATGATACCCTTACGCAACTCTCTTTGCACGCTAGGGCGAGAGTACTGCCCCTAGACTAGGTACACGCCGGGTAAACTCTCTCGCACACCTTTACGCTCGACTACAGGCTTCTAACCCTTCCGAACGCATATAATTCAAATGGCACTTAGTAACAGACGAATCACGGCTCACAGGCAGAATTCACTGGAGTAAAAGGATTCAGAACAATAGATAGTGTGTTAACTTTACAGTCATCCGTATTATAACGTAGCGAGAGGATTGAGTTCTTGTTAGGAAGGAAGGTCCTATAGACGAGTGCGGTAGCGCACCCGGTCGCCTTGCGTAGTCATGCCCGACGTGTTGATGGTTCCCTTTTAGCCGCCACACAAGGGATCCGAGGGTGAGAGACACATGGCCCTCACCGACGAGACTTACTCAGCCTGCCTCGCTATTGCCCTCTTTTTGATCGTCCCTTTGTGGCTCTCGAGGACTCGTGCAGCGTGTATCTGGGGATTTGTAAGCTTAAGACTACCTTCCATAGGA


=====================================
doc/MindTheGap_insertion_caller.md
=====================================
@@ -34,18 +34,28 @@ MindTheGap is composed of two main modules : breakpoint detection (`find` module
 4. **Find module specific options**
   
     In addition to the read or graph files, the `find` module has one mandatory option `-ref` and several optional options:
+    
     * `-ref`: the path to the reference genome file (in fasta format).
     * `-homo-only`: only homozygous insertions are reported (default: not activated).
     * `-max-rep`: maximal repeat size allowed for fuzzy sites  [default '5']. 
     * `-het-max-occ`: maximal number of occurrences of a (k-1)mer in the reference genome allowed for heterozygous insertion breakpoints  [default '1']. In order to detect a heterozygous insertion breakpoint, both flanking k-1-mers, at each side of the insertion site, must have strictly less than this number of occurrences in the reference genome. This prevents false positive predictions inside repeated regions. Warning : increasing this parameter may lead to numerous false positives (genomic approximate repeats).
-    * `-bed`: the path to a bed file defining genomic regions, to limit the find algorithm to particular regions of the genome. This can be usefull for exome data.
+    * `-branching-filter`: maximal number of branching kmers in a 100-bp window before a heterozygous site [default '15', '-1' means no filter applied]. This filter prevents numerous false positive predictions inside repeated regions. In large and complex genomes, such as human, this parameter can be set to lower values (10 or 5), in order to decrease the running time of the Fill module (but this may result in a loss of recall in repeat-rich regions).
+    * `-bed`: the path to a bed file defining genomic regions, to limit the find algorithm to particular regions of the genome. This can be useful for exome data. Important: the bed file must be sorted and overlapping intervals merged, for example with:
+    
+    	```
+    	sort -k1,1 -k2,2n file.bed > file_sorted.bed
+		bedtools merge -i file_sorted.bed > file_final.bed
+		```
+		
     
 5. **Fill module specific options**
   
-    In addition to the read or graph files, the `fill` module has one other mandatory option, `-bkpt` 	
-    * `-bkpt`: the breakpoint file path. This is one of the output of the `find` module and contains for each detected insertion site its left and right kmers from and to which the local assembly will be performed (see section E for details about the format).
+    In addition to the read or graph files, the `fill` module has one other mandatory option, `-bkpt`:
+     	
+    * `-bkpt`: the breakpoint file path. This is one of the outputs of the `find` module; for each detected insertion site it contains the left and right kmers from and to which the local assembly will be performed (see section [Output formats](#output-formats) for details about the format).
 	
 	The fill module has several optional options:
+	
 	* `-max-nodes`: maximum number of nodes in contig graph for each insertion assembly [default '100']. This argument limits the computational time, which is especially useful for complex genomes.
     * `-max-length`: maximum number of assembled nucleotides in the contig graph (nt)  [default '10000']. This argument limits the computational time, which is especially useful for complex genomes.
     * `-filter`: if set, insertions with multiple solutions are not output in the final vcf file (default : not activated).
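The requirement on the `-bed` file (sorted, with overlapping intervals merged) can be illustrated by a minimal sketch of what the `sort` and `bedtools merge` commands achieve. The `BedInterval` type and the `sort_and_merge` function are hypothetical names introduced here for illustration; they are not part of MindTheGap or bedtools. The sketch assumes, as `bedtools merge` does with its default distance of 0, that book-ended intervals are merged as well.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical record for one BED line (0-based start, end exclusive).
struct BedInterval {
    std::string chrom;
    std::uint64_t start;
    std::uint64_t end;
};

// Sort by chromosome name then start, and merge overlapping or
// book-ended intervals that lie on the same chromosome.
std::vector<BedInterval> sort_and_merge(std::vector<BedInterval> in)
{
    std::sort(in.begin(), in.end(),
              [](const BedInterval& a, const BedInterval& b) {
                  return std::tie(a.chrom, a.start) < std::tie(b.chrom, b.start);
              });
    std::vector<BedInterval> out;
    for (const BedInterval& iv : in) {
        if (!out.empty() && out.back().chrom == iv.chrom &&
            iv.start <= out.back().end) {
            // Overlaps (or touches) the current run: extend it.
            out.back().end = std::max(out.back().end, iv.end);
        } else {
            // Starts a new run.
            out.push_back(iv);
        }
    }
    return out;
}
```

For example, `chr1:50-120`, `chr1:100-200` and `chr1:200-250` collapse into a single `chr1:50-250` interval, which is the form the `find` module expects to read.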
@@ -55,21 +65,25 @@ MindTheGap is composed of two main modules : breakpoint detection (`find` module
   
     All the output files are prefixed either by a default name: "MindTheGap_Expe-[date:YY:MM:DD-HH:mm]" or by a user defined prefix (option `-out` of MindTheGap)
     Both MindTheGap modules generate the graph file if reads were given as input: 
+    
     * a graph file (`.h5`). This is a binary file, to obtain information stored in it, you can use the utility program dbginfo located in your bin directory or in ext/gatb-core/bin/.
     
     `MindTheGap find` generates the following output files:
+    
     * a breakpoint file (`.breakpoints`) in fasta format. It contains the breakpoint sequences of each detected insertion site. Each insertion site corresponds to 2 consecutive entries in the fasta file : sequences are the left and right side flanking kmers.
-    * a variant file (`.othervariants.vcf`) in vcf format. It contains SNPs and deletion events.
+    * a variant file (`.othervariants.vcf`) in vcf format. It contains SNPs, deletions and very small insertions (1-2 bp).
     
     `MindTheGap fill` generates the following output files:
-    * a sequence file (`.insertions.fasta`) in fasta format. It contains the inserted sequences or contig gap-fills that were successfully assembled. In the case of insertion variants, the location of each insertion on the reference genome can be found in its fasta header. The fasta header includes also information about each gap-fill such as its length, quality score and median kmer abundance.
-    * an insertion variant file (`.insertions.vcf`) in vcf format, in the case of insertion variant detection. This file contains all information of assembled insertion variants as in the `.insertions.fasta` file but in a different format. Here, insertion site positions are 1-based and left-normalized according to the VCF format specifications (contrary to positions indicated in the `.breakpoints` and `insertions.fasta` files which are right-normalized). Normalization occurs when multiple positions are possible for a single variation due to a small repeat. 
+    
+    * a sequence file (`.insertions.fasta`) in fasta format (for insertions >2 bp). It contains the inserted sequences or contig gap-fills that were successfully assembled. In the case of insertion variants, the location of each insertion on the reference genome can be found in its fasta header. The fasta header includes also information about each gap-fill such as its length, quality score and median kmer abundance.
+    * an insertion variant file (`.insertions.vcf`) in vcf format (for insertions >2 bp). This file contains all information of assembled insertion variants as in the `.insertions.fasta` file but in a different format. Here, insertion site positions are 1-based and left-normalized according to the VCF format specifications (contrary to positions indicated in the `.breakpoints` and `insertions.fasta` files which are right-normalized). Normalization occurs when multiple positions are possible for a single variation due to a small repeat. 
 	* a log file (`.info.txt`), a tabular file with some information about the filling process for each breakpoint/gap-fill. 
     
 
 
 
 ## Details on output formats
+<a name="output-formats"></a>
 
 1. Breakpoint format
   
@@ -98,6 +112,7 @@ MindTheGap is composed of two main modules : breakpoint detection (`find` module
 	FILTER field: can be `PASS` or `LOWQUAL` (for insertions with multiple solutions)
 	
 	INFO fields:  
+	
 	* `TYPE`: variant type, INS for insertion
 	* `LEN`: insertion size in bp
 	* `QUAL`: quality of the insertion (quality scores range from 0 to 50, 50 being the best quality, see the different quality scores [below](#quality))
@@ -128,6 +143,7 @@ MindTheGap is composed of two main modules : breakpoint detection (`find` module
 <a name="quality"></a>
   
     Each insertion is assigned a quality score ranging from 0 (low quality) to 50 (highest quality). This quality score reflects mainly repeat-associated criteria:
+    
     * `qual=5`: if one of the breakpoint kmer could not be found exactly but with 2 errors (mismatches)
     * `qual=10`: if one of the breakpoint kmer could not be found exactly but with 1 error (mismatch)
     * `qual=15`: if multiple sequences can be assembled for a given breakpoint (note that to output multiple sequences, they must differ from each other significantly, ie. <90% identity)
@@ -137,6 +153,7 @@ MindTheGap is composed of two main modules : breakpoint detection (`find` module
 5. Gap-filling information file:
 
     For each gap-fill, some information about the filling process is given in the file `.info.txt`, whether it has been successfully filled or not. This can help understand why some breakpoints could not be filled. Here is the description of the columns:
+    
     * column 1 : breakpoint name       
     * column 2-4 : number of nodes in the contig graph, total nt assembled, number of nodes containing the right breakpoint kmer
     * (optionally) columns 5-7 : same information as in columns 2-4 but for the filling process in the reverse direction from right to left kmer, activated only if the filling failed in the forward direction


=====================================
scripts/jenkins/tool-mindthegap-build-debian7-64bits-gcc-4.7.sh
=====================================
@@ -28,6 +28,8 @@ DO_NOT_STOP_AT_ERROR : ${DO_NOT_STOP_AT_ERROR}
  Jenkins build parameters (built in)
 -----------------------------------------
 BUILD_NUMBER         : ${BUILD_NUMBER}
+JENKINS_HOME         : ${JENKINS_HOME}
+WORKSPACE            : ${WORKSPACE}
 "
 
 error_code () { [ "$DO_NOT_STOP_AT_ERROR" = "true" ] && { return 0 ; } }
@@ -54,12 +56,15 @@ g++ --version
 
 [ `gcc -dumpversion` = 4.7 ] && { echo "GCC 4.7"; } || { echo "GCC version is not 4.7, we exit"; exit 1; }
 
-JENKINS_TASK=tool-${TOOL_NAME}-build-debian7-64bits-gcc-4.7
+JENKINS_TASK=tool-${TOOL_NAME}-build-debian7-64bits-gcc-4.7-gitlab
+JENKINS_WORKSPACE=$WORKSPACE/$JENKINS_TASK/
+
 GIT_DIR=/scratchdir/builds/workspace/gatb-${TOOL_NAME}
 BUILD_DIR=/scratchdir/$JENKINS_TASK/gatb-${TOOL_NAME}/build
 
 rm -rf $BUILD_DIR
 mkdir -p $BUILD_DIR
+mkdir -p $JENKINS_WORKSPACE
 
 #-----------------------------------------------
 # we need gatb-core submodule to be initialized
@@ -118,10 +123,19 @@ cd build
 #                       PACKAGING                              #
 ################################################################
 
-# Upload bin bundle to the forge
+#-- Upload bin bundle as a build artifact
+#   -> bin bundle *-bin-Linux.tar.gz will be archived as a build artifact
+#   -> source package is handled by the osx task
+
 if [ $? -eq 0 ] && [ "$INRIA_FORGE_LOGIN" != none ] && [ "$DO_NOT_STOP_AT_ERROR" != true ]; then
-	make package
-    scp ${ARCHIVE_NAME}-${BRANCH_TO_BUILD}-bin-Linux.tar.gz ${INRIA_FORGE_LOGIN}@scm.gforge.inria.fr:/home/groups/gatb-tools/htdocs/ci-inria
-    # source package is handled by the osx task
+    echo "Creating a binary archive... "
+    make package
+
+    pwd
+    ls -atlhrsF
+
+    #-- Move the generated bin bundle to the workspace (so that it can be uploaded as a Jenkins job artifact)
+    mv *-${BRANCH_TO_BUILD}-bin-Linux.tar.gz $JENKINS_WORKSPACE/
+
 fi
 


=====================================
scripts/jenkins/tool-mindthegap-build-macos-10.9.5-gcc-4.2.1.sh
=====================================
@@ -28,6 +28,8 @@ DO_NOT_STOP_AT_ERROR : ${DO_NOT_STOP_AT_ERROR}
  Jenkins build parameters (built in)
 -----------------------------------------
 BUILD_NUMBER         : ${BUILD_NUMBER}
+JENKINS_HOME         : ${JENKINS_HOME}
+WORKSPACE            : ${WORKSPACE}
 "
 
 error_code () { [ "$DO_NOT_STOP_AT_ERROR" = "true" ] && { return 0 ; } }
@@ -117,9 +119,12 @@ cd ../build
 
 # Prepare and upload bin and source bundle to the forge
 if [ $? -eq 0 ] && [ "$INRIA_FORGE_LOGIN" != none ] && [ "$DO_NOT_STOP_AT_ERROR" != true ]; then
-	make package
+    make package
     make package_source
-    scp ${ARCHIVE_NAME}-${BRANCH_TO_BUILD}-bin-Darwin.tar.gz ${INRIA_FORGE_LOGIN}@scm.gforge.inria.fr:/home/groups/gatb-tools/htdocs/ci-inria
-    scp ${ARCHIVE_NAME}-${BRANCH_TO_BUILD}-Source.tar.gz ${INRIA_FORGE_LOGIN}@scm.gforge.inria.fr:/home/groups/gatb-tools/htdocs/ci-inria
+
+    # make both tar.gz available as Jenkins build artifacts    
+    cp ${ARCHIVE_NAME}-${BRANCH_TO_BUILD}-bin-Darwin.tar.gz ${WORKSPACE}/
+    cp ${ARCHIVE_NAME}-${BRANCH_TO_BUILD}-Source.tar.gz ${WORKSPACE}/
+
 fi
 


=====================================
scripts/jenkins/tool-mindthegap-release-debian.sh
=====================================
@@ -85,18 +85,24 @@ if [ "$INRIA_FORGE_LOGIN" == none ]; then
 fi
 
 cd $BUILD_DIR
-git clone https://github.com/pgdurand/github-release-api.git
+git clone https://github.com/GATB/github-release-api.git
 
 ################################################################
 #                       RETRIEVE ARCHIVES FROM INRIA FORGE     #
 ################################################################
 
+CI_URL=https://ci.inria.fr/gatb-core/view/MindTheGap-gitlab/job
+JENKINS_TASK_DEB=tool-mindthegap-build-debian7-64bits-gcc-4.7-gitlab
+JENKINS_TASK_MAC=tool-mindthegap-build-macos-10.9.5-gcc-4.2.1-gitlab
+
 #retrieve last build from ci-inria (see tool-lean-build-XXX tasks)
-scp ${INRIA_FORGE_LOGIN}@scm.gforge.inria.fr:/home/groups/gatb-tools/htdocs/ci-inria/${ARCHIVE_NAME}-${BRANCH_TO_BUILD}-bin-Linux.tar.gz .
+wget $CI_URL/$JENKINS_TASK_DEB/lastSuccessfulBuild/artifact/$JENKINS_TASK_DEB/${ARCHIVE_NAME}-${BRANCH_TO_BUILD}-bin-Linux.tar.gz
 [ $? != 0 ] && exit 1
-scp ${INRIA_FORGE_LOGIN}@scm.gforge.inria.fr:/home/groups/gatb-tools/htdocs/ci-inria/${ARCHIVE_NAME}-${BRANCH_TO_BUILD}-bin-Darwin.tar.gz .
+
+wget $CI_URL/$JENKINS_TASK_MAC/lastSuccessfulBuild/artifact/${ARCHIVE_NAME}-${BRANCH_TO_BUILD}-bin-Darwin.tar.gz
 [ $? != 0 ] && exit 1
-scp ${INRIA_FORGE_LOGIN}@scm.gforge.inria.fr:/home/groups/gatb-tools/htdocs/ci-inria/${ARCHIVE_NAME}-${BRANCH_TO_BUILD}-Source.tar.gz .
+
+wget $CI_URL/$JENKINS_TASK_MAC/lastSuccessfulBuild/artifact/${ARCHIVE_NAME}-${BRANCH_TO_BUILD}-Source.tar.gz
 [ $? != 0 ] && exit 1
 
 ################################################################


=====================================
src/FindBreakpoints.hpp
=====================================
@@ -103,7 +103,7 @@ public :
     /** writes a given variant in the output vcf file
      */
     void writeVcfVariant(int bkt_id, string& chrom_name, uint64_t position, char* ref_char, char* alt_char, int repeat_size, string type);
-
+    void writeIndel(int bkt_id, string &chrom_name, uint64_t position, string ref_char, string alt_char, int repeat_size, string type);
 
     /*Getter*/
     /** Return the number of found breakpoints
@@ -130,7 +130,7 @@ public :
      */
     size_t kmer_size();
 
-    /** Return the numbre of max repeat 
+    /** Return the max repeat size at breakpoint
      */
     int max_repeat();
 
@@ -138,6 +138,9 @@ public :
      */
     int snp_min_val();
 
+    /** Return the threshold value of the branching filter
+    */
+    int branching_threshold();
     
     /** The last solid kmer before gap
      */
@@ -233,7 +236,18 @@ public :
     /** Incremente the value of backup_iterate
      */
     int backup_iterate();
-    
+
+    /** Increment the value of homo_clean_indel
+     */
+    int homo_clean_indel_iterate();
+
+    /** Increment the value of homo_fuzzy_indel
+     */
+    int homo_fuzzy_indel_iterate();
+    /** Increment the value of hetero_indel
+     */
+    int hetero_indel_iterate();
+
     /*Setter*/
     /** Set value of recent_hetero
      */
@@ -465,8 +479,12 @@ void FindBreakpoints<span>::operator()()
                     v.push_back(token);
                 }
                 if(v[0]==m_chrom_name){ // we are on the current chromosome
-                    interval=std::make_pair(std::stoi(v[1]),std::stoi(v[2]));
-                    interval_vector.push_back( tuple<uint64_t ,uint64_t>(interval));
+                    uint64_t bed_begin = std::stoi(v[1]);
+                    uint64_t bed_end = std::stoi(v[2]);
+                    if ((bed_end-bed_begin) > this->finder->_kmerSize){
+                        interval=std::make_pair(std::stoi(v[1]),std::stoi(v[2]));
+                        interval_vector.push_back( tuple<uint64_t ,uint64_t>(interval));
+                    }
                 }
                 iss.clear();
             }
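The hunk above skips BED intervals that are not longer than the k-mer size. A stand-alone sketch of that step is given below, under stated assumptions: `keep_bed_interval` is a hypothetical function, it splits the line on whitespace instead of reusing the tokenising loop of `FindBreakpoints`, and it parses coordinates with `std::stoull` instead of the `std::stoi` used in the commit, so that positions beyond the `int` range are accepted. It also tests `end > begin` before subtracting, because an unsigned subtraction would otherwise wrap around for a malformed interval.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-alone version of the BED filtering step.
// Appends (begin, end) to `kept` when the line belongs to `chrom` and the
// interval is longer than `kmer_size`; returns true if it was kept.
bool keep_bed_interval(const std::string& line, const std::string& chrom,
                       std::uint64_t kmer_size,
                       std::vector<std::pair<std::uint64_t, std::uint64_t>>& kept)
{
    std::istringstream iss(line);
    std::string name, begin_field, end_field;
    if (!(iss >> name >> begin_field >> end_field)) return false;  // malformed line
    if (name != chrom) return false;                               // other chromosome
    const std::uint64_t begin = std::stoull(begin_field);
    const std::uint64_t end = std::stoull(end_field);
    // Check end > begin first so the unsigned subtraction cannot wrap around.
    if (end <= begin || end - begin <= kmer_size) return false;
    kept.emplace_back(begin, end);
    return true;
}
```

With a k-mer size of 31, the line `chr1  100  200` is kept while `chr1  100  120` is dropped, since no complete k-mer fits strictly inside a 20-bp interval.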
@@ -489,18 +507,28 @@ void FindBreakpoints<span>::operator()()
                         end_pos=get<1>(interval_vector.front());
                     }
                     
-                    if(!(*m_it_kmer).isValid() || (m_position<start_pos))
+                    if(!(*m_it_kmer).isValid())
                     {
-                        //Reintialize stretch_size for each bed region
-                        
+                        //Re-initialize stretch_size
                         this->m_solid_stretch_size = 0;
                         this->m_gap_stretch_size = 0;
                         this->m_kmer_begin = KmerCanonical();
                         this->m_kmer_end = KmerCanonical();
-                        //DEBUG
-                        //cout<<"n";
+
                     }
-                    
+
+                    if(m_position==start_pos-1) //for each beginning of bed region
+                    {
+                        //Re-initialize stretch_size for each bed region
+                        this->m_solid_stretch_size = 0;
+                        this->m_gap_stretch_size = 0;
+                        this->m_kmer_begin = KmerCanonical();
+                        this->m_kmer_end = KmerCanonical();
+                        
+                        //Re-initialize het_kmer_history for each bed region
+                        memset(this->m_het_kmer_history, 0, sizeof(info_type)*256);
+                    }
+
                     
                     if(((*m_it_kmer).isValid()) && (m_position>=start_pos)) //inside the current bed interval
                     {
@@ -648,7 +676,30 @@ void FindBreakpoints<span>::writeVcfVariant(int bkt_id, string& chrom_name, uint
 			repeat_size
 	);
 }
-
+template <size_t span>
+void FindBreakpoints<span>::writeIndel(int bkt_id, string &chrom_name, uint64_t position, string ref_string, string alt_string, int repeat_size, string type)
+{
+    // NOTE : currently all positions coming from FindObservers are 0-based, VCF is supposed to be 1-based, so we add +1
+    int variant_size = alt_string.length() - 1;
+    string GT = "./.";
+    if (type == "HOM")
+    {
+        GT = "1/1";
+    }
+    if (type == "HET")
+    {
+        GT = "0/1";
+    }
+    fprintf(this->finder->_vcf_file, "%s\t%lli\tbkpt%i\t%s\t%s\t.\tPASS\tTYPE=INS;LEN=%i;FUZZY=%i\tGT\t%s\n",
+            chrom_name.c_str(),
+            (long long)(position + 1), //switch to 1-based; cast matches the %lli format
+            bkt_id,
+            ref_string.c_str(),
+            alt_string.c_str(),
+            variant_size,
+            repeat_size,
+            GT.c_str());
+}
 /*Getter*/
 template<size_t span>
 int FindBreakpoints<span>::node_in_branch(Node& kmer_node)
@@ -710,6 +761,12 @@ int FindBreakpoints<span>::snp_min_val()
     return this->finder->_snp_min_val;
 }
 
+template<size_t span>
+int FindBreakpoints<span>::branching_threshold()
+{
+    return this->finder->_branching_threshold;
+}
+
 /*Kmer related object*/
 template<size_t span>
 typename FindBreakpoints<span>::KmerCanonical& FindBreakpoints<span>::kmer_begin()
@@ -870,7 +927,22 @@ int FindBreakpoints<span>::backup_iterate()
 {
     return this->finder->_nb_backup++;
 }
+template <size_t span>
+int FindBreakpoints<span>::homo_clean_indel_iterate()
+{
+    return this->finder->_nb_homo_clean_indel++;
+}
 
+template <size_t span>
+int FindBreakpoints<span>::homo_fuzzy_indel_iterate()
+{
+    return this->finder->_nb_homo_fuzzy_indel++;
+}
+template <size_t span>
+int FindBreakpoints<span>::hetero_indel_iterate()
+{
+    return this->finder->_nb_hetero_indel++;
+}
 /*Setter*/
 template<size_t span>
 void FindBreakpoints<span>::recent_hetero(int value)

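For readers who want to check the record layout produced by writeIndel, here is a minimal sketch that builds the same VCF line as a string. The function name format_indel_record is hypothetical and not part of MindTheGap; the field order, the HOM/HET genotype mapping and the 0-based to 1-based shift are taken from the hunk above.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper (not in MindTheGap): returns the VCF record as a string
// instead of writing it to a FILE*, so that it can be inspected or tested.
std::string format_indel_record(const std::string& chrom, std::uint64_t pos0,
                                int bkpt_id, const std::string& ref,
                                const std::string& alt, int repeat_size,
                                const std::string& type)
{
    // Genotype as in writeIndel: HOM -> 1/1, HET -> 0/1, anything else unknown.
    std::string gt = "./.";
    if (type == "HOM") gt = "1/1";
    else if (type == "HET") gt = "0/1";

    // ALT holds the anchor base followed by the inserted bases.
    const int variant_size = static_cast<int>(alt.length()) - 1;

    // pos0 is 0-based; VCF positions are 1-based, hence pos0 + 1.
    return chrom + "\t" + std::to_string(pos0 + 1) + "\tbkpt" + std::to_string(bkpt_id)
         + "\t" + ref + "\t" + alt + "\t.\tPASS\tTYPE=INS;LEN=" + std::to_string(variant_size)
         + ";FUZZY=" + std::to_string(repeat_size) + "\tGT\t" + gt + "\n";
}
```

Building the string with std::to_string sidesteps the printf length-modifier question for 64-bit positions.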

=====================================
src/FindHeteroInsertion.hpp
=====================================
@@ -29,8 +29,10 @@ template<size_t span>
 class FindHeteroInsertion : public IFindObserver<span>
 {
 public :
-
-    /** \copydoc IFindObserver::IFindObserver
+	typedef typename gatb::core::kmer::impl::Kmer<span> Kmer;
+	typedef typename Kmer::ModelCanonical KmerModel;
+	typedef typename KmerModel::Iterator KmerIterator;
+	/** \copydoc IFindObserver::IFindObserver
      */
     FindHeteroInsertion(FindBreakpoints<span> * find);
 
@@ -47,12 +49,24 @@ bool FindHeteroInsertion<span>::update()
 {
 	if(!this->_find->homo_only())
 	{
+        // branching filter parameters
+        int branching_threshold = this->_find->branching_threshold(); //max number of branching kmers in the 100 bp window of previous kmers
+        int max_branching_kmers = branching_threshold;
+        bool filtering = true;
+        if (branching_threshold<0){
+            filtering = false;
+            max_branching_kmers = 100;
+        }
+        int filter_window_size = 100 ; //should not be larger than the size of het_kmer_history = 256
+  
+        
 		// hetero site detection
 		if(!this->_find->kmer_end_is_repeated() && this->_find->current_info().nb_in == 2 && !this->_find->recent_hetero())
 		{
 			//loop over putative repeat size (0=clean, >0 fuzzy), reports only the smallest repeat size found.
 			for(int i = 0; i <= this->_find->max_repeat(); i++)
 			{
+				bool found_base_one = false;
 				if(this->_find->het_kmer_history(this->_find->het_kmer_begin_index()+i).nb_out == 2 && !this->_find->het_kmer_history(this->_find->het_kmer_begin_index()+i).is_repeated)
 				{
 					//hetero breakpoint found
@@ -60,32 +74,102 @@ bool FindHeteroInsertion<span>::update()
 					//string kmer_end_str = this->_find->model().toString(this->_find->current_info().kmer);
                     //modif 15/06/2018 to check !!! (before in case of fuzzy>0, the end and right kmers overlapped, => insertion of wrong size (- fuzzy), missing the repeat + loss of recall if insertion of size < repeat)
                     string kmer_end_str = string(&(this->_find->chrom_seq()[this->_find->position() + i]), this->_find->kmer_size());
-                    if (!this->_find->model().codeSeed(&(this->_find->chrom_seq()[this->_find->position() +i]),Data::ASCII).isValid())
+					string ref = kmer_begin_str.substr(kmer_begin_str.size() - 1 - i, 1);
+                    
+                    //Tests if this can be a small (1-2 bp) insertion
+					char nucleo[20][6] = {"A", "C", "G", "T", "AA", "AC", "AG", "AT", "CA", "CC", "CG", "CT", "GA", "GC", "GG", "GT", "TA", "TC", "TG", "TT"};
+					KmerModel local_m(this->_find->kmer_size());
+					KmerIterator local_it(local_m);
+					std::string seq;
+					string inser_base_one;
+					if (!this->_find->model().codeSeed(&(this->_find->chrom_seq()[this->_find->position() +i]),Data::ASCII).isValid())
                     {
                                return false;
                     }
-                    this->_find->writeBreakpoint(this->_find->breakpoint_id(), this->_find->chrom_name(), this->_find->position()-1+i, kmer_begin_str, kmer_end_str,i, STR_HET_TYPE,  this->_find->het_kmer_history(this->_find->het_kmer_begin_index()+i).is_repeated,this->_find->kmer_end_is_repeated() );
-					
-					this->_find->breakpoint_id_iterate();
-					
-					if(i==0)
+
+					for (int a = 0; a < 20; a++) // for all possible 1-2 bp insertions, perform a micro-assembly
 					{
-						this->_find->hetero_clean_iterate();
+						seq = kmer_begin_str + nucleo[a] + kmer_end_str;
+						Data local_d(const_cast<char *>(seq.c_str()));
+						int sum_valid = 0;
+						//        // Init this variable
+						local_d.setRef(const_cast<char *>(seq.c_str()), (size_t)seq.length());
+						local_it.setData(local_d);
+						for (local_it.first(); !local_it.isDone(); local_it.next())
+						{
+							if (this->contains(local_it->forward()))
+							{
+								sum_valid++;
+							}
+							else
+							{
+								break;
+							}
+							if (sum_valid == this->_find->kmer_size())
+							{
+								inser_base_one = ref + nucleo[a];
+								found_base_one = true;
+							}
+						}
+						if (found_base_one == true)
+							break;
 					}
-					else
+					if (found_base_one)
 					{
-						this->_find->hetero_fuzzy_iterate();
+						this->_find->writeIndel(this->_find->breakpoint_id(), this->_find->chrom_name(), this->_find->position() - 1, ref, inser_base_one, i, STR_HET_TYPE);
+						this->_find->hetero_indel_iterate();
+						this->_find->breakpoint_id_iterate();
+						return true;
 					}
-					
-					this->_find->recent_hetero(this->_find->max_repeat()); // we found a breakpoint, the next hetero one mus be at least _max_repeat apart from this one.
-					return true; //reports only the smallest repeat size found.
+					else
+					{
+                        
+                        //this may be a large insertion
+                        
+                         int nb_branching = 0;
+                        //Applying the branching-filter :
+                        if (filtering){
+                            //counts the number of branching-kmers among the 100 previous ones
+                            int nb_prev = 0;
+                            unsigned char begin_index = this->_find->het_kmer_begin_index()-1;
+                            while ((nb_branching <= max_branching_kmers) && (nb_prev<filter_window_size)){
+                                //cout << "in loop" << nb_prev << "  " << begin_index-nb_prev << endl;
+                                if(this->_find->het_kmer_history(begin_index-nb_prev).nb_out >1 || this->_find->het_kmer_history(begin_index-nb_prev).nb_in >1 ){
+                                    nb_branching ++;
+                                }
+                                nb_prev++;
+                            }
+                        }
+                        
+                        if(nb_branching <= max_branching_kmers){
+                            this->_find->writeBreakpoint(this->_find->breakpoint_id(), this->_find->chrom_name(), this->_find->position() - 1 + i, kmer_begin_str, kmer_end_str, i, STR_HET_TYPE, this->_find->het_kmer_history(this->_find->het_kmer_begin_index() + i).is_repeated, this->_find->kmer_end_is_repeated());
+
+                            this->_find->breakpoint_id_iterate();
+
+                            if (i == 0)
+                            {
+                                this->_find->hetero_clean_iterate();
+                            }
+                            else
+                            {
+                                this->_find->hetero_fuzzy_iterate();
+                            }
+
+                            this->_find->recent_hetero(this->_find->max_repeat()); // we found a breakpoint, the next hetero one must be at least _max_repeat apart from this one.
+                            return true;										   //reports only the smallest repeat size found.
+                        }
+                        else{ // stop the loop over fuzzy sizes: the branching context will be just as unfavorable for the other fuzzy sizes
+                            this->_find->recent_hetero(max(0, this->_find->recent_hetero() - 1)); // when recent_hetero=0 : we are sufficiently far from the previous hetero-site
+                            return false;
+                        }
+                    }
 				}
 			}
 		}
-		
-		this->_find->recent_hetero(max(0, this->_find->recent_hetero() - 1));  // when recent_hetero=0 : we are sufficiently far from the previous hetero-site
+
+		this->_find->recent_hetero(max(0, this->_find->recent_hetero() - 1)); // when recent_hetero=0 : we are sufficiently far from the previous hetero-site
 	}
-	
+
 	return false;
 }
 

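The branching filter added to FindHeteroInsertion can be sketched on its own as follows. This is a simplified reading of the hunk above, not the upstream code: KmerInfo and passes_branching_filter are hypothetical names, and begin_index is assumed to be the history slot of the most recent previous k-mer. It relies on the history being a 256-entry circular buffer indexed by an unsigned char.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the per-k-mer record kept in het_kmer_history.
struct KmerInfo {
    int nb_in = 0;   // number of incoming branches in the graph
    int nb_out = 0;  // number of outgoing branches in the graph
};

// Returns true when at most `threshold` of the `window` previous k-mers are
// branching (more than one in- or out-neighbour). A negative threshold
// disables the filter, as the -branching-filter help text describes for -1.
bool passes_branching_filter(const KmerInfo (&history)[256], unsigned char begin_index,
                             int threshold, int window = 100)
{
    if (threshold < 0) return true;
    int nb_branching = 0;
    for (int prev = 0; prev < window && nb_branching <= threshold; ++prev) {
        // The cast wraps modulo 256, which walks the circular buffer backwards.
        const unsigned char idx = static_cast<unsigned char>(begin_index - prev);
        if (history[idx].nb_in > 1 || history[idx].nb_out > 1) ++nb_branching;
    }
    return nb_branching <= threshold;
}
```

The window must not exceed 256, otherwise the same history slots would be counted twice; the diff uses 100.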

=====================================
src/FindSNP.hpp
=====================================
@@ -146,9 +146,7 @@ bool FindSNP<span>::snp_at_end(unsigned char* beginpos, size_t limit, KmerType*
     nuc[un] = 0;
     nuc[deux] = 0;
     nuc[trois] = 0;
-	
-    unsigned char endpos = (*beginpos + limit) % 256;
-	
+		
     unsigned char  beginpos_init = (*beginpos);
     //this->remove_nuc(nuc, *beginpos);
     *ref_nuc = this->_find->het_kmer_history(*beginpos).kmer & 3; // obtain the reference nuc


=====================================
src/FindSmallInsertion.hpp
=====================================
@@ -0,0 +1,216 @@
+/*****************************************************************************
+*   MindTheGap: Integrated detection and assembly of insertion variants
+*   A tool from the GATB (Genome Assembly Tool Box)
+*   Copyright (C) 2022  INRIA
+*   Authors: C. Lemaitre, G. Rizk, P. Marijon, W. Delage
+*
+*  This program is free software: you can redistribute it and/or modify
+*  it under the terms of the GNU Affero General Public License as
+*  published by the Free Software Foundation, either version 3 of the
+*  License, or (at your option) any later version.
+*
+*  This program is distributed in the hope that it will be useful,
+*  but WITHOUT ANY WARRANTY; without even the implied warranty of
+*  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+*  GNU Affero General Public License for more details.
+*
+*  You should have received a copy of the GNU Affero General Public License
+*  along with this program.  If not, see <http://www.gnu.org/licenses/>.
+*****************************************************************************/
+
+#ifndef FINDSMALLINSERTION_HPP_
+#define FINDSMALLINSERTION_HPP_
+//**********************************
+#include <IFindObserver.hpp>
+#include <FindBreakpoints.hpp>
+
+
+template<size_t span>
+class FindSmallCleanInsertion : public IFindObserver<span>
+{
+public :
+
+    typedef typename gatb::core::kmer::impl::Kmer<span> Kmer;
+
+    typedef typename Kmer::ModelCanonical KmerModel;
+    typedef typename KmerModel::Iterator KmerIterator;
+
+public:
+
+    /** \copydoc IFindObserver<span>
+     */
+    /** \copydoc IFindObserver::IFindObserver
+     */
+    FindSmallCleanInsertion(FindBreakpoints<span> * find);
+
+
+    /** \copydoc IFindObserver::IFindObserver
+     */
+    bool update();
+};
+
+template<size_t span>
+FindSmallCleanInsertion<span>::FindSmallCleanInsertion(FindBreakpoints<span> * find) : IFindObserver<span>(find){}
+
+template<size_t span>
+bool FindSmallCleanInsertion<span>::update()
+{
+
+    if((this->_find->kmer_begin().isValid() && this->_find->kmer_end().isValid()) == false)
+    {
+        return false;
+    }
+
+    if(this->_find->gap_stretch_size() == (this->_find->kmer_size()-1)) //Check size of gap
+    {
+        // obtains the kmer sequence
+        string kmer_begin_str = this->_find->model().toString(this->_find->kmer_begin().forward());
+        string kmer_end_str = this->_find->model().toString(this->_find->kmer_end().forward());
+        string ref = kmer_begin_str.substr(kmer_begin_str.size()-1,1);
+
+        //All possible insertions of size 1 and 2
+        char nucleo[20][6] = {"A","C","G","T","AA","AC","AG","AT","CA","CC","CG","CT","GA","GC","GG","GT","TA","TC","TG","TT"};
+
+        KmerModel local_m(this->_find->kmer_size());
+        KmerIterator local_it(local_m);
+        std::string seq;
+        string inser_base_one;
+        bool found_base_one=false;
+        
+        //Test all possible insertions, by performing a micro-guided-assembly, ie. checks if all kmers of the insertion are present in the graph
+        for (int i=0; i<20; i++)
+        {
+            seq = kmer_begin_str+ nucleo[i] + kmer_end_str;
+            Data local_d(const_cast<char*>(seq.c_str()));
+            int sum_valid=0;
+            // Init this variable
+            local_d.setRef(const_cast<char*>(seq.c_str()), (size_t)seq.length());
+            local_it.setData(local_d);
+            for(local_it.first(); !local_it.isDone(); local_it.next())
+            {
+                if(this->contains(local_it->forward()))
+                {
+                    sum_valid++;
+                }
+                else
+                {
+                    break;
+                }
+                if (sum_valid==this->_find->kmer_size())
+                {
+                    inser_base_one=ref+nucleo[i];
+                    found_base_one=true;
+                }
+            }
+            if (found_base_one==true) break;
+        }
+        if (!found_base_one) return false;
+        
+        this->_find->writeIndel(this->_find->breakpoint_id(),this->_find->chrom_name(),this->_find->position()-2, ref, inser_base_one, 0, STR_HOM_TYPE);
+        this->_find->homo_clean_indel_iterate();
+        this->_find->breakpoint_id_iterate();
+        
+        return true;
+    }
+    return false;
+}
+
+///*
+template<size_t span>
+class FindSmallFuzzyInsertion : public IFindObserver<span>
+{
+public :
+
+    typedef typename gatb::core::kmer::impl::Kmer<span> Kmer;
+
+    typedef typename Kmer::ModelCanonical KmerModel;
+    typedef typename KmerModel::Iterator KmerIterator;
+
+public:
+
+    /** \copydoc IFindObserver<span>
+     */
+    /** \copydoc IFindObserver::IFindObserver
+     */
+    FindSmallFuzzyInsertion(FindBreakpoints<span> * find);
+
+
+    /** \copydoc IFindObserver::IFindObserver
+     */
+    bool update();
+};
+
+template<size_t span>
+FindSmallFuzzyInsertion<span>::FindSmallFuzzyInsertion(FindBreakpoints<span> * find):IFindObserver<span>(find){}
+
+template<size_t span>
+bool FindSmallFuzzyInsertion<span>::update()
+{
+    if((this->_find->kmer_begin().isValid() && this->_find->kmer_end().isValid()) == false)
+    {
+        return false;
+    }
+
+    if(this->_find->gap_stretch_size() < this->_find->kmer_size() - 1 && this->_find->gap_stretch_size() >= this->_find->kmer_size() - 1 - this->_find->max_repeat())
+    {
+        int repeat_size = this->_find->kmer_size() - 1 - this->_find->gap_stretch_size();
+        // obtains the kmer sequence
+        string kmer_begin_str = this->_find->model().toString(this->_find->kmer_begin().forward());
+        string kmer_end_str = string(&(this->_find->chrom_seq()[this->_find->position() - 1 + repeat_size]), this->_find->kmer_size());
+        if ((this->nb_out_branch(this->_find->kmer_begin().forward())==0) || (this->nb_in_branch(this->_find->kmer_end().forward())==0) || (!this->_find->model().codeSeed(&(this->_find->chrom_seq()[this->_find->position() - 1 + repeat_size]),Data::ASCII).isValid()))
+        {
+                   return false;
+        }
+        else
+        {
+            string ref = kmer_begin_str.substr(kmer_begin_str.size()-1-repeat_size,1);
+            
+            //All possible insertions of size 1 and 2
+            char nucleo[20][6] = {"A","C","G","T","AA","AC","AG","AT","CA","CC","CG","CT","GA","GC","GG","GT","TA","TC","TG","TT"};
+            KmerModel local_m(this->_find->kmer_size());
+            KmerIterator local_it(local_m);
+            std::string seq;
+            string inser_base_one;
+            bool found_base_one=false;
+            //std::list<char> fourth (nucleo, nucleo + sizeof(nucleo) / sizeof(char) );
+            for (int i=0; i<20; i++)
+            {
+                seq = kmer_begin_str+ nucleo[i] + kmer_end_str;
+                //std::cout << seq << endl;
+                Data local_d(const_cast<char*>(seq.c_str()));
+                int sum_valid=0;
+                // Init this variable
+                local_d.setRef(const_cast<char*>(seq.c_str()), (size_t)seq.length());
+                local_it.setData(local_d);
+                for(local_it.first(); !local_it.isDone(); local_it.next())
+                {
+                    if(this->contains(local_it->forward()))
+                    {
+                        sum_valid++;
+                    }
+                    else
+                    {
+                        break;
+                    }
+                    if (sum_valid==this->_find->kmer_size())
+                    {
+                        inser_base_one=ref+nucleo[i];
+                        found_base_one=true;
+                    }
+                }
+                if (found_base_one==true) break;
+            }
+            if (!found_base_one) return false;
+            this->_find->writeIndel(this->_find->breakpoint_id(),this->_find->chrom_name(),this->_find->position()- 2, ref, inser_base_one, repeat_size, STR_HOM_TYPE);
+            this->_find->homo_fuzzy_indel_iterate();
+            this->_find->breakpoint_id_iterate();
+
+            return true;
+        }
+    }
+    return false;
+}
+
+
+#endif // FINDSMALLINSERTION_HPP_
+

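The "micro-assembly" test used by FindSmallCleanInsertion and FindSmallFuzzyInsertion can be illustrated without GATB. The sketch below is an assumption-laden simplification: a std::unordered_set of plain (non-canonical) k-mer strings stands in for the de Bruijn graph, every k-mer of the candidate sequence is required to be present, and the names candidate_insertions and find_small_insertion are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// The 20 candidate insertions of 1 or 2 nucleotides, in the same order as
// the nucleo table in the diff: A, C, G, T, AA, AC, ..., TT.
std::vector<std::string> candidate_insertions()
{
    const std::string bases = "ACGT";
    std::vector<std::string> out;
    for (char a : bases) out.push_back(std::string(1, a));
    for (char a : bases)
        for (char b : bases) out.push_back(std::string{a, b});
    return out;
}

// Returns the first candidate for which every k-mer of left + candidate + right
// is found in `solid`, or an empty string when no candidate fits.
std::string find_small_insertion(const std::string& left, const std::string& right,
                                 const std::unordered_set<std::string>& solid, std::size_t k)
{
    for (const std::string& ins : candidate_insertions()) {
        const std::string seq = left + ins + right;
        if (seq.size() < k) continue;
        bool all_present = true;
        for (std::size_t i = 0; i + k <= seq.size(); ++i) {
            if (solid.count(seq.substr(i, k)) == 0) { all_present = false; break; }
        }
        if (all_present) return ins;
    }
    return "";
}
```

As in the diff, the search stops at the first candidate that assembles, so 1 bp candidates are tried before 2 bp ones.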

=====================================
src/Finder.cpp
=====================================
@@ -27,6 +27,7 @@
 #include <FindInsertion.hpp>
 #include <FindSNP.hpp>
 #include <limits> //for std::numeric_limits
+#include <FindSmallInsertion.hpp>
 
 //#define PRINT_DEBUG
 /********************************************************************************/
@@ -63,6 +64,7 @@ Finder::Finder ()  : Tool ("MindTheGap find")
     _max_repeat = 0;
     _het_max_occ = 1;
     _snp_min_val = 5;
+    _branching_threshold = 5;
     _nbCores = 0;
     _breakpoint_file_name = "";
     _vcf_file_name = "";
@@ -75,14 +77,17 @@ Finder::Finder ()  : Tool ("MindTheGap find")
     _nb_solo_snp = 0;
     _nb_multi_snp = 0;
     _nb_backup = 0;
-    
+    _nb_homo_clean_indel = 0;
+    _nb_homo_fuzzy_indel = 0;
+    _nb_hetero_indel = 0;
     _homo_only = false;
     _homo_insert = true;
     _hete_insert = true;
     _snp = true;
     _backup = false;
     _deletion = true;
-    
+    _small_homo = true;
+
     _bed_file_name="";
 	
 	setHelp(&HelpFinder);
@@ -112,6 +117,7 @@ Finder::Finder ()  : Tool ("MindTheGap find")
     finderParser->push_front (new OptionNoParam (STR_INSERT_ONLY, "search only insertion breakpoints (do not report other variants)", false));
     //finderParser->getParser(STR_INSERT_ONLY)->setVisible(false);
     finderParser->push_front (new OptionOneParam (STR_HET_MAX_OCC, "maximal number of occurrences of a kmer in the reference genome allowed for heterozyguous breakpoints", false,"1"));
+    finderParser->push_front (new OptionOneParam (STR_BRANCHING_FILTER, "branching filter parameter for heterozygous insertions, maximal number of branching kmers in a 100-bp window before a heterozygous site (-1 = no filter)", false,"15"));
     //allow to find heterozyguous breakpoints in n-repeated regions of the reference genome
     finderParser->push_front (new OptionOneParam (STR_MAX_REPEAT, "maximal repeat size detected for fuzzy sites", false, "5"));
     finderParser->push_front (new OptionNoParam (STR_HOMO_ONLY, "search only homozygous breakpoints", false));
@@ -305,7 +311,8 @@ void Finder::execute ()
     _max_repeat = getInput()->getInt(STR_MAX_REPEAT);
     _het_max_occ=getInput()->getInt(STR_HET_MAX_OCC);
     _snp_min_val=getInput()->getInt(STR_SNP_MIN_VAL);
-
+    _branching_threshold = getInput()->getInt(STR_BRANCHING_FILTER);
+    
     if(_het_max_occ<1){
     	_het_max_occ=1;
     }
@@ -318,6 +325,7 @@ void Finder::execute ()
 	_snp = true;
 	_backup = false;
 	_deletion = true;
+    _small_homo = true;
     }
     
     if(getInput()->get(STR_INSERT_ONLY) != 0)
@@ -328,6 +336,7 @@ void Finder::execute ()
 	_snp = false;
 	_backup = false;
 	_deletion = false;
+    _small_homo = true;
     }
 
     if(getInput()->get(STR_SNP_ONLY) != 0)
@@ -338,6 +347,7 @@ void Finder::execute ()
 	_snp = true;
 	_backup = false;
 	_deletion = false;
+    _small_homo = true;
     }
 
     if(getInput()->get(STR_DELETION_ONLY) != 0)
@@ -348,6 +358,7 @@ void Finder::execute ()
 	_snp = false;
 	_backup = false;
 	_deletion = true;
+    _small_homo = true;
     }
 
     if(getInput()->get(STR_HETERO_ONLY) != 0)
@@ -358,6 +369,7 @@ void Finder::execute ()
 	_snp = false;
 	_backup = false;
 	_deletion = false;
+    _small_homo = true;
     }
 
     if(getInput()->get(STR_WITH_BACKUP) != 0)
@@ -419,6 +431,10 @@ void Finder::resumeParameters(){
         getInfo()->add(2,"Graph",getInput()->getStr(STR_URI_GRAPH).c_str());
     }
     getInfo()->add(2,"Reference",getInput()->getStr(STR_URI_REF).c_str());
+    if(getInput()->get(STR_BED) != 0)
+    {
+        getInfo()->add(2,"Bed file",_bed_file_name.c_str());
+    }
     getInfo()->add(1,"Graph");
     getInfo()->add(2,"kmer-size","%i", _kmerSize);
 
@@ -456,6 +472,7 @@ void Finder::resumeParameters(){
     getInfo()->add(1,"Breakpoint detection options");
     getInfo()->add(2,"max_repeat","%i", _max_repeat);
     getInfo()->add(2,"hetero_max_occ","%i", _het_max_occ);
+    getInfo()->add(2,"branching filter value", "%i", _branching_threshold);
     getInfo()->add(2,"homo_insertions","%s", _homo_insert ? "yes" : "no");
     getInfo()->add(2,"hete_insertions","%s", _hete_insert ? "yes" : "no");
     getInfo()->add(2,"snp","%s", _snp ? "yes" : "no");
@@ -474,6 +491,8 @@ void Finder::resumeResults(double seconds){
     getInfo()->add(3,"fuzzy","%i", _nb_hetero_fuzzy);
     getInfo()->add(1,"Other variants");
     getInfo()->add(2,"deletions","%i", _nb_clean_deletion+_nb_fuzzy_deletion);
+    getInfo()->add(2, "Homozygous insertions 1-2 bp size", "%i", _nb_homo_clean_indel + _nb_homo_fuzzy_indel);
+    getInfo()->add(2, "Heterozygous insertions 1-2 bp size", "%i", _nb_hetero_indel);
     //getInfo()->add(3,"clean", "%i", _nb_clean_deletion);
     //getInfo()->add(3,"fuzzy", "%i", _nb_fuzzy_deletion);
     getInfo()->add(2,"SNPs","%i", _nb_solo_snp+_nb_multi_snp);
@@ -540,8 +559,12 @@ void Finder::runFindBreakpoints<span>::operator ()  (Finder* object)
 	{
 		findBreakpoints.addGapObserver(new FindDeletion<span>(&findBreakpoints));
 	}
-	
-	if(object->_homo_insert)
+    if (object->_small_homo)
+    {
+        findBreakpoints.addGapObserver(new FindSmallCleanInsertion<span>(&findBreakpoints));
+        findBreakpoints.addGapObserver(new FindSmallFuzzyInsertion<span>(&findBreakpoints));
+    }
+    if(object->_homo_insert)
 	{
 		findBreakpoints.addGapObserver(new FindCleanInsertion<span>(&findBreakpoints));
 		findBreakpoints.addGapObserver(new FindFuzzyInsertion<span>(&findBreakpoints));


=====================================
src/Finder.hpp
=====================================
@@ -31,6 +31,7 @@ static const char* STR_URI_REF = "-ref";
 static const char* STR_MAX_REPEAT = "-max-rep";;
 static const char* STR_HET_MAX_OCC = "-het-max-occ";
 static const char* STR_SNP_MIN_VAL = "-snp-min-val";
+static const char* STR_BRANCHING_FILTER = "-branching-filter";
 
 static const char* STR_HOMO_ONLY = "-homo-only";
 static const char* STR_INSERT_ONLY = "-insert-only";
@@ -65,10 +66,12 @@ public:
     const char* _mtg_version;
     size_t _kmerSize;
     Graph _graph;
-    //Graph _ref_graph; // no longer used
+
+    //parameters
     int _max_repeat;
     int _het_max_occ;
     int _snp_min_val;
+    int _branching_threshold;
     int _nbCores;
     bool _homo_only;
     bool _homo_insert;
@@ -76,6 +79,10 @@ public:
     bool _snp;
     bool _backup;
     bool _deletion;
+    bool _small_homo;
+    bool _small_hetero;
+    
+    //input/output files
     IBank* _refBank;
     string _breakpoint_file_name;
     FILE * _breakpoint_file;
@@ -84,6 +91,7 @@ public:
 
     string _bed_file_name;
 
+    //results statistics
     int _nb_homo_clean;
     int _nb_homo_fuzzy;
     int _nb_hetero_clean;
@@ -93,7 +101,9 @@ public:
     int _nb_solo_snp;
     int _nb_multi_snp;
     int _nb_backup;
-
+    int _nb_homo_clean_indel;
+    int _nb_homo_fuzzy_indel;
+    int _nb_hetero_indel;
     // Actual job done by the tool is here
     void execute ();
 


=====================================
src/main.cpp
=====================================
@@ -26,7 +26,7 @@
 
 using namespace std;
 
-static const char* MTG_VERSION = "2.2.3";
+static const char* MTG_VERSION = "2.3.0";
 
 static const char* STR_FIND        = "find";
 static const char* STR_FILL = "fill";


=====================================
test/full_test/README
=====================================
@@ -121,5 +121,46 @@ cp reference.fasta ../../data/reference.fasta
 
 # 7. Create Gold files for automated tests
 
-../../build/bin/MindTheGap find -in ../../data/reads_r1.fastq,../../data/reads_r2.fastq -ref ../../data/reference.fasta -out gold > gold_find.output
-../../build/bin/MindTheGap fill -graph gold.h5 -bkpt gold.breakpoints -out gold > gold_fill.output
\ No newline at end of file
+../../build/bin/MindTheGap find -in ../../data/reads_r1.fastq,../../data/reads_r2.fastq -ref ../../data/reference.fasta -out gold -nb-cores 1 > gold_find.output
+../../build/bin/MindTheGap fill -graph gold.h5 -bkpt gold.breakpoints -out gold -nb-cores 1 > gold_fill.output
+
+
+# 08/02/2022: added small 1-2 bp insertions
+
+or: added seq5 and seq6 from Projets/mindTheGap/test-small-indels (first 990 nt of chr1 and chr2, 6/10 HOM, 4/10 HET only in allele 1)
+
+# change the read simulation parameters: increase coverage (and lower the error rate)
+~/Bin/samtools-0.1.18/misc/wgsim -e 0.001 -d 200 -s 20 -N 1000 -1 100 -2 100 -r 0 -R 0 allele1.fasta allele1_r1.fq allele1_r2.fq
+~/Bin/samtools-0.1.18/misc/wgsim -e 0.001 -d 200 -s 20 -N 1000 -1 100 -2 100 -r 0 -R 0 allele2.fasta allele2_r1.fq allele2_r2.fq
+
+cat allele1_r1.fq allele2_r1.fq > reads_r1.fastq
+cat allele1_r2.fq allele2_r2.fq > reads_r2.fastq
+
+WARNING: the results change: othervariants.vcf loses 2 SNPs (Seq1 206, 219), gains 1 deletion (Seq0 297) but loses another one (Seq1 740); the results for large insertions are unchanged
+
+NOTE: misses 2 small insertions of size 2 (Seq6: pos 500 and 900)
+
+# List of all mutations :
+Seq0	101
+Seq0	123
+Seq0	816
+Seq1	206
+Seq1	219
+Seq1	342
+Seq1	740
+Seq2	320
+Seq2	344
+Seq2	379
+Seq2	535
+Seq2	834
+Seq3	256
+Seq3	511
+Seq3	766
+Seq3	781
+Seq4	257
+Seq4	349
+Seq4	512
+Seq4	600
+Seq4	841
+Seq4	884
+Seq4	821
\ No newline at end of file


=====================================
test/full_test/allele1.fasta
=====================================
@@ -8,3 +8,7 @@ TGCTGCCGATCGCTACGACGTCCTACCTTACACACAACGGGCCGCGTTCATACCCACGTATGAAGACATGCGGTTATCCG
 TATTGCGCCCTTCAAGAAGCTTCTGCTGACCGTAGGCGTCTCGGCGGTTTGTACTTTGAAAAATTAGCTGCACTACATCCGATGGGTATCCCTCCTCAATCTCAGCAGACCCGGAAAGCGATAGAATCAGCCACGCGGTCGTCCGGGCTAGGGGCCCTGCGCAAGGAAGGTTGGACAGGGCTAGACCCGGAAGCATCGGCTTTTCCTAAATGGTGACGGAGTTATATAGGGTAAGCCTGATAGCGCGGTAGGTGTTATGGCCATCCCCTCGCCTAGCGTGCGCGCAGACAAGTCCAGTCCCGGAGGAGGCATAGGCCTCATTATCATTTCCCTAGAATCGCTCTTGACATCTAGGTTGTACTAGGGACCAGGCGCCCAAAGCGGACGGTTCTCCGTGCTTTCGTGCCGTTTCAGCGTAAGATGCTATTTTTTGGGGAAATGGTCGGCGTGTGCGGGGGAGAACCACGGTACCAACTACGATAAGTCCGTCGTGTAACTTACGTGAAGGTGATGTGAAGCAGGAATCCGTGCCAAAATGTCCGTGCGATATCCAACTTTCATAGTATTACACGAGAGCCTATGATTTGCCCAGGCGCGACCCGTGAATCGAGGTAATCGCCGACCAGATATTGCGAAACACCACATTACATGACTACTGTCCGCTTGAAGAGTTATATACTTGACAGTCCTGGTTGACGGCACAGCATATCTCCAATGTGTGGTTTAAAGTCTCACGTTCTTCATGCGCGCCGGCCCATGGGAACAAGTATCCTTACTTTCGTTTGCAGCACTAGCCGTTCCTTGACATCTGCGGCCAACTTGTGCCTGAACCTGGAGTTTCGACAGCGTGGCGCTCTGGCCTAGTTCTTCGCTGGCACCTGGAAGAGCCGCCGTACAAATGAGGCTCCAAAATAGCACGCTTGCAGCAGTCAAGTTGAACGCCTTAAAAGGCACCGCCGCTCGTTCATTGGGATTCCTTGAGAATCGTGACTTGTTACACTATAAGATCATGGATTGGACAAAATAGGCCAACTCCCGCACGCTGTGGCTATTCTTAAGTTGCATAGGTGGGAGTAGCCTTATACTCGATTTCTAAAAAGAGTAGGTGAGC
 >Seq4
 TTCCGGCGCCGCACTAATTGAAGTGGTGAGCTGACCAGTCGTTCAGGATCCGAAGGCGGGGATGGCGCTATAGGAGCCGGCAGGTATGCTTTGCCGCAAAATTTCGGGGTGGTGGAACCGTCTTACCGAAAGTTAGCTACAGCCTGGAATGTGAAATTCCATGACCTGCCCGTCCTGTGTCCACAGGGCGACATTTGCCACGTAGGTAGGGCGACCATTAGAATGCTGCATTATCGGGCGATAAAAAGTTTTATACTCAAGAATCCTACAAAGATGAAAATTTCGAAGAGCTGCACGCAGTTGTAAGTTGCTTTTCTGGGGTAATCGAGATTCTCCACCATAACCTGCGCAATGCATCGTGAAGCTTTACCGCGCCCAAGGGGAGCGTCTCAGTGGGGTTGCCTCCAGGGATATATTGAAAGTTGAAGAAGAAGATCACAGGTTAAGCGGTATGTTAAGTTAGAACTCACGGGGAGCCGCCTTGATTTTGTTCGACATGAACCAGAGACCAGGTGTGTTATGTTCTGGAACCTTAATACGTACGTCGCCAGCACCGAGCCGGCACTCCATCTCTTTTGGGTGCGCAACATTGCTATACTTAGGTGTATTCCTGGGTTGAGTGGCAGGTTTCTCTTAATTCTTCCCTAAGTAGCTCCGAGGATCCATTGACATCTGTCAGCCGTCTTTCCAGAACGTTATAAGACTCGTGAGGAAATTATACAAATCGTTGCCATCATCCAAAGCAAAGTACTTCCGCTTAGGAGTGCCTTGAAGAACCGATTATCTCTGACAATGTAATGCCACAGCACCCTCGACAAAGTTCTACATTCGTTCCAGGTCATGATACAGCGCGCTAAATTACCGCTACGAGCCATACCGGATGGCGGCCGGAGAGCGCTGCAATCGCATGGCTCGGGACCGAACATTGAGACCTGGCTAGTAGGTAGGTGTCAAATCGATATCCACACCTGTCGAAGCAGCTAAAGATCGGTTGCGGCGGGAGTCCTCCATTCAGGCCAAACGTGCAGTGCTCGATGTGCTTCCTATCGCTCT
+>Seq5
+GATGTTTAGAAGTTTCCAGGTCACGCCAATGATTGGCATTTACACACGTGGATCAGCGGACATATCTAACCCTTAGTGTTCTTAAGAGCAACTCACTACTCCATTTCCACTAACCCCGCCGGCGGTAATTCCAATCTAGTTGATCAGACTTCCCAGTCAATGAAAGCGACACCGTGCGTCTGTAATACCAACAAGACCCTGGCTGTCGTCCCGCAGAGGACGCGGCACCTCCGGATTTTGAGTCCAGTCTGAACGATTTTCGATCACTCACCATGGATCTGGAAAACGGAGTCGAGTACTCACGAGCCAAATTGATGCATTTCCAATGACCCGATGCAGGTGCGACCGATCTTCGCCTATGCTTCCCGCCGTAATTATTGAGTCTGGGTCCCGGCCGCTAACGTTTGACTCACGGGGAGGTACCCGTGCGTATTCTTCTCAAAGTGACGCTGGACAGCAGCGCATGTCCGAGCCCCATCGTCCTATCTGGTGTAGAGTCTTACCTCTAATTAGAGTGATCGAACCAGTAGGTGTCGCGGTCTTAGGGCTCCCATTGTCCAAGGGAACGTGAACAGATATGAATCTGGGAGAATAGTGCAGCGTTGACCCTTCTGGTCGGTCAGCCCTTGCCTACGGCCCGTATGCGGAGAATGAAGGCGTGAAACATTCTGCTCTTTTAGAAGCAGCGGCTGCACCCGTATAACAACTCGCACGATCGTACGTCTCATTTGCCGCGTTGGCGCGCCCGTGGATGATGGACCACGGTATGAACCTCTGCACTTCAAATTTGACGCAATCCTGCACTCACCCGCACACAGTTCTAGTCTAACCGTCGCAGTGTCTGCTTTAAGGTAGAGATCGATACTTAGGATATGTTCATGTGTGTTTGTAGCGCTGGACCCTCTTATGGGTGTGGTCACTTGTGATGGATCGAGGAACTTAGGCGGTTAACTTGTTTCGACGTCTCACCGACAATATCAGGATTTAGTATCG
+>Seq6
+ACCGAAAATGACAATGTTCACACGCATGCTCGGCGTGGAAAAGAGCCTTTTCTAAGACCGACTCGTTCCGGGCAGCAGGATTATTAGCCAATCAAAATTATATCGACCGGTCATCAAGCTGCGATAGTGCAGGCGCATGCCGTCCAATGGGTCCACGGCGGAAGTGCGTTCGTCTACTCTGTCAAATCTTAACATTTTTTGAGCGGCTAATCCGGCCGGTAGTGTACCGTGAACCAAAGTCCTTCTACGAGCGTATTAGATTGCTCAAAAGATCCGGGAGAATTGACCAGGTCGTATCTTTAAAATAACGCTGGTGCGAGCAGCTGCTGTTTTATCAACACCCATTTAGTCCTGTGAAGTTTGCTTAGCAGATACACCTTCCCGCGTGGTATGAGAGGCTGTTCTTCATTAAAAACTATGAGGCTCTGGCACCTTCGACGCTAACAAAGTCCCCACGGACCATGATACCCTTACGCAACTCTCTTTGCACGCTAGGGCGAGAGTACTGTCCCCCTAGACTAGGTACACGCCGGGTAAACTCTCTCGCACACCTTTACGCTCGACTACAGGCTTCTAACCCTTCCGAACGCATATAATTCAAATGGCACTTCAAGTAACAGACGAATCACGGCTCACAGGCAGAATTCACTGGAGTAAAAGGATTCAGAACAATAGATAGTGTGTTAACTTTACAGTCATCCGTATTATAACGTGTAGCGAGAGGATTGAGTTCTTGTTAGGAAGGAAGGTCCTATAGACGAGTGCGGTAGCGCACCCGGTCGCCTTGCGTAGTCATGCCCGACGTGTTGATGGTGGTCCCTTTTAGCCGCCACACAAGGGATCCGAGGGTGAGAGACACATGGCCCTCACCGACGAGACTTACTCAGCCTGCCTCGCTATTGCCCTCTTTTTGATCACGTCCCTTTGTGGCTCTCGAGGACTCGTGCAGCGTGTATCTGGGGATTTGTAAGCTTAAGACTACCTTCCATAGGA


=====================================
test/full_test/allele2.fasta
=====================================
@@ -8,3 +8,7 @@ TGCTGCCGATCGCTACGACGTCCTACCTTACACACAACGGGCCGCGTTCATACCCACGTATGAAGACATGCGGTTATCCG
 TATTGCGCCCTTCAAGAAGCTTCTGCTGACCGTAGGCGTCTCGGCGGTTTGTACTTTGAAAAATTAGCTGCACTACATCCGATGGGTATCCCTCCTCAATCTCAGCAGACCCGGAAAGCGATAGAATCAGCCACGCGGTCGTCCGGGCTAGGGGCCCTGCGCAAGGAAGGTTGGACAGGGCTAGACCCGGAAGCATCGGCTTTTCCTAAATGGTGACGGAGTTATATAGGGTAAGCCTGATAGCGCGGTAGGTGTTATGGCCATCCCCTCGCCTAGCGTGCGCGCAGACAAGTCCAGTCCCGGAGGAGGCATAGGCCTCATTATCATTTCCCTAGAATCGCTCTTGACATCTAGGTTGTACTAGGGACCAGGCGCCCAAAGCGGACGGTTCTCCGTGCTTTCGTGCCGTTTCAGCGTAAGATGCTATTTTTTGGGGAAATGGTCGGCGTGTGCGGGGGAGAACCACGGTACCAACTACGATAAGTCCGTCGTGTAACTTACGTGAAGGTGATGTGAAGCAGGAATCCGTGCCAAAATGTCCGTGCGATATCCAACTTTCATAGTATTACACGAGAGCCTATGATTTGCCCAGGCGCGACCCGTGAATCGAGGTAATCGCCGACCAGATATTGCGAAACACCACATTACATGACTACTGTCCGCTTGAAGAGTTATATACTTGACAGTCCTGGTTGACGGCACAGCATATCTCCAATGTGTGGTTTAAAGTCTCACGTTCTTCATGCGCGCCGGCCCATGGGAACAAGTATCCTTACTTTCGTTTGCAGCACTAGCCGTTCCTTGACATCTGCGGCCAACTTGTGCCTGAACCTGGAGTTTCGACAGCGTGGCGCTCTGGCCTAGTTCTTCGCTGGCACCTGGAAGAGCCGCCGTACAAATGAGGCTCCAAAATAGCACGCTTGCAGCAGTCAAGTTGAACGCCTTAAAAGGCACCGCCGCTCGTTCATTGGGATTCCTTGAGAATCGTGACTTGTTACACTATAAGATCATGGATTGGACAAAATAGGCCAACTCCCGCACGCTGTGGCTATTCTTAAGTTGCATAGGTGGGAGTAGCCTTATACTCGATTTCTAAAAAGAGTAGGTGAGC
 >Seq4
 TTCCGGCGCCGCACTAATTGAAGTGGTGAGCTGACCAGTCGTTCAGGATCCGAAGGCGGGGATGGCGCTATAGGAGCCGGCAGGTATGCTTTGCCGCAAAATTTCGGGGTGGTGGAACCGTCTTACCGAAAGTTAGCTACAGCCTGGAATGTGAAATTCCATGACCTGCCCGTCCTGTGTCCACAGGGCGACATTTGCCACGTAGGTAGGGCGACCATTAGAATGCTGCATTATCGGGCGATAAAAAGTTTTATACTCAAGAATCCTACAAAGATGAAAATTTCGAAGAGCTGCACGCAGTTGTAAGTTGCTTTTCTGGGGTAATCGAGATTCTCCACCATAACCTGCGCAGTCTTAACCTTAAGACCGTTCATTGATAAAACTTGCTCACGCTCTAGATGGCGTGAAGCGAAACCTAGGAAAAAGTTTTGCAGATAATTAGATTATGCGCGATACTCCGCCGTGTGTTCAATGCATCGTGAAGCTTTACCGCGCCCAAGGGGAGCGTCTCAGTGGGGTTGCCTCCAGGGATATATTGAAAGTTGAAGAAGAAGATCACAGGTTAAGCGGTATGTTAAGTTAGAACTCACGGGGAGCCGCCTTGATTTTGTTCGACATGAACCAGAGACCAGGTGTGTTATGTTCTGGAACCTTAATACGTACGTCGCCAGCACCGAGCCGGCACTCCATCTCTTTTGGGTGCGCAACATTGCTATACTTAGGTGTATTCCTGGGTTGAGTGGCAGGTTTCTCTTAATTCTTCCCTAAGTAGCTCCGAGGATCCATTGACATCTGTCAGCCGTCTTTCCAGAACGTTATAAGACTCGTGAGGAAATTATACAAATCGTTGCCATCATCCAAAGCAAAGTACTTCCGCTTAGGAGTGCCTTGAAGAACCGATTATCTCTGACAATGTAATGCCACAGCACCCTCGACAAAGTTCTACATTCGTTCCAGGTCATGATACAGCGCGCTAAATTACCGCTACGAGCCATACCGGATGGCGGCCGGAGAGCGCTGCAATCGCATGGCTCGGGACCGAACATTGAGACCTGGCTAGTAGGTAGGTGTCAAATCGATATCCACACCTGTCGAAGCAGCTAAAGATCGGTTGCGGCGGGAGTCCTCCATTCAGGCCAAACGTGCAGTGCTCGATGTGCTTCCTATCGCTCT
+>Seq5
+GATGTTTAGAAGTTTCCAGGTCACGCCAATGATTGGCATTTACACACGTGGATCAGCGGACATATCTAACCCTTAGTGTTCTTAAGAGCAACTCACTACTCCATTTCCACTAACCCCGCCGGCGGTAATTCCAATCTAGTTGATCAGACTTCCCAGTCAATGAAAGCGACACCGTGCGTCTGTAATACCAACAAGACCCTGGCTGTCGTCCCGCAGAGGACGCGGCACCTCCGGATTTTGAGTCCAGTCTGAACGATTTTCGATCACTCACCATGGATCTGGAAAACGGAGTCGAGTACTCACGAGCCAAATTGATGCATTTCCAATGACCCGATGCAGGTGCGACCGATCTTCGCCTATGCTTCCCGCCGTAATTATTGAGTCTGGGTCCCGGCCGCTAACGTTTGACTCACGGGGAGGTACCCGTGCGTATTCTTCTCAAAGTGACGCTGGACAGCAGCGCATGTCCGAGCCCCATCGTCCTATCTGGTGTAGAGTCTTACCTCTAATTAGAGTGATCGAACCAGTAGGTGTCGCGGTCTTAGGGCTCCCATTGTCCAAGGGAACGTGAACAGATATGAATCTGGGAGAATAGTGCAGCGTTGCCCTTCTGGTCGGTCAGCCCTTGCCTACGGCCCGTATGCGGAGAATGAAGGCGTGAAACATTCTGCTCTTTTAGAAGCAGCGGCTGCACCCGTATAACAATCGCACGATCGTACGTCTCATTTGCCGCGTTGGCGCGCCCGTGGATGATGGACCACGGTATGAACCTCTGCACTTCAAATTTGACGCAATCCTGCACTCACCGCACACAGTTCTAGTCTAACCGTCGCAGTGTCTGCTTTAAGGTAGAGATCGATACTTAGGATATGTTCATGTGTGTTTGTAGCGCTGGACCCTCTTATGGTGTGGTCACTTGTGATGGATCGAGGAACTTAGGCGGTTAACTTGTTTCGACGTCTCACCGACAATATCAGGATTTAGTATCG
+>Seq6
+ACCGAAAATGACAATGTTCACACGCATGCTCGGCGTGGAAAAGAGCCTTTTCTAAGACCGACTCGTTCCGGGCAGCAGGATTATTAGCCAATCAAAATTATATCGACCGGTCATCAAGCTGCGATAGTGCAGGCGCATGCCGTCCAATGGGTCCACGGCGGAAGTGCGTTCGTCTACTCTGTCAAATCTTAACATTTTTTGAGCGGCTAATCCGGCCGGTAGTGTACCGTGAACCAAAGTCCTTCTACGAGCGTATTAGATTGCTCAAAAGATCCGGGAGAATTGACCAGGTCGTATCTTTAAAATAACGCTGGTGCGAGCAGCTGCTGTTTTATCAACACCCATTTAGTCCTGTGAAGTTTGCTTAGCAGATACACCTTCCCGCGTGGTATGAGAGGCTGTTCTTCATTAAAAACTATGAGGCTCTGGCACCTTCGACGCTAACAAAGTCCCCACGGACCATGATACCCTTACGCAACTCTCTTTGCACGCTAGGGCGAGAGTACTGTCCCCCTAGACTAGGTACACGCCGGGTAAACTCTCTCGCACACCTTTACGCTCGACTACAGGCTTCTAACCCTTCCGAACGCATATAATTCAAATGGCACTTAGTAACAGACGAATCACGGCTCACAGGCAGAATTCACTGGAGTAAAAGGATTCAGAACAATAGATAGTGTGTTAACTTTACAGTCATCCGTATTATAACGTAGCGAGAGGATTGAGTTCTTGTTAGGAAGGAAGGTCCTATAGACGAGTGCGGTAGCGCACCCGGTCGCCTTGCGTAGTCATGCCCGACGTGTTGATGGTTCCCTTTTAGCCGCCACACAAGGGATCCGAGGGTGAGAGACACATGGCCCTCACCGACGAGACTTACTCAGCCTGCCTCGCTATTGCCCTCTTTTTGATCGTCCCTTTGTGGCTCTCGAGGACTCGTGCAGCGTGTATCTGGGGATTTGTAAGCTTAAGACTACCTTCCATAGGA


=====================================
test/full_test/gold.breakpoints
=====================================
@@ -2,31 +2,31 @@
 CTCCGGATCTCCGTGTTCTTCGGAAGCTTAG
 >bkpt2_Seq0_pos_123_fuzzy_0_HET  right_kmer
 GTCACGCGCGTCATACTACAGTAAGTTACTG
->bkpt6_Seq1_pos_342_fuzzy_0_HET  left_kmer
+>bkpt5_Seq1_pos_342_fuzzy_0_HET  left_kmer
 GCCGCGCAAAGCCGGTCAACAGCGTTAGTAT
->bkpt6_Seq1_pos_342_fuzzy_0_HET  right_kmer
+>bkpt5_Seq1_pos_342_fuzzy_0_HET  right_kmer
 GTTGAAAGTTTACTCAGATCGCTTCTGTCGG
->bkpt11_Seq2_pos_535_fuzzy_0_HOM  left_kmer
+>bkpt9_Seq2_pos_535_fuzzy_0_HOM  left_kmer
 GGCATGCGTAAGTTATCGTGAAACCATGATG
->bkpt11_Seq2_pos_535_fuzzy_0_HOM  right_kmer
+>bkpt9_Seq2_pos_535_fuzzy_0_HOM  right_kmer
 GCCCCTTACTAGACCAAATGTACTGAATGCG
->bkpt12_Seq2_pos_835_fuzzy_1_HOM  left_kmer
+>bkpt10_Seq2_pos_835_fuzzy_1_HOM  left_kmer
 GAGCTACCCGCCCTCGGTGAGAAGGTAGTAT
->bkpt12_Seq2_pos_835_fuzzy_1_HOM  right_kmer
+>bkpt10_Seq2_pos_835_fuzzy_1_HOM  right_kmer
 ACCCAAACGCGTCCTATGCAGTTTTGGGCTT
->bkpt16_Seq3_pos_781_fuzzy_0_HOM  left_kmer
+>bkpt14_Seq3_pos_781_fuzzy_0_HOM  left_kmer
 CGGCCCATGGGAACAAGTATCCTTACTTTCG
->bkpt16_Seq3_pos_781_fuzzy_0_HOM  right_kmer
+>bkpt14_Seq3_pos_781_fuzzy_0_HOM  right_kmer
 GTACAAATGAGGCTCCAAAATAGCACGCTTG
->bkpt18_Seq4_pos_351_fuzzy_2_HET  left_kmer
+>bkpt16_Seq4_pos_351_fuzzy_2_HET  left_kmer
 GTAATCGAGATTCTCCACCATAACCTGCGCA
->bkpt18_Seq4_pos_351_fuzzy_2_HET  right_kmer
+>bkpt16_Seq4_pos_351_fuzzy_2_HET  right_kmer
 ATGCATCGTGAAGCTTTACCGCGCCCAAGGG
->bkpt20_Seq4_pos_603_fuzzy_3_HOM  left_kmer
+>bkpt18_Seq4_pos_603_fuzzy_3_HOM  left_kmer
 CTTTTGGGTGCGCAACATTGCTATACTTAGG
->bkpt20_Seq4_pos_603_fuzzy_3_HOM  right_kmer
+>bkpt18_Seq4_pos_603_fuzzy_3_HOM  right_kmer
 ATCCATTGACATCTGTCAGCCGTCTTTCCAG
->bkpt22_Seq4_pos_821_fuzzy_0_HOM  left_kmer
+>bkpt20_Seq4_pos_821_fuzzy_0_HOM  left_kmer
 AGCGCGCTAAATTACCGCTACGAGCCATACC
->bkpt22_Seq4_pos_821_fuzzy_0_HOM  right_kmer
+>bkpt20_Seq4_pos_821_fuzzy_0_HOM  right_kmer
 CCGAACATTGAGACCTGGCTAGTAGGTAGGT


=====================================
test/full_test/gold.insertions.fasta
=====================================
@@ -1,16 +1,16 @@
->bkpt2_Seq0_pos_123_fuzzy_0_HET_len_137_qual_50_avg_cov_8.38_median_cov_8.00   
+>bkpt2_Seq0_pos_123_fuzzy_0_HET_len_137_qual_50_avg_cov_21.59_median_cov_21.00   
 ATCTAAGCTGTGACCTTGTGGCCGAGGCGCTTTTCACGCCTACATTAACTCCTGGGAAGCTCTCTGCTCTAGTTTCAGTGCACATCTCCAGGTGAGCAACCCTGGCAAGCAGCCCCTTCCTGTAGAAATTACTTAGC
->bkpt6_Seq1_pos_342_fuzzy_0_HET_len_125_qual_50_avg_cov_9.91_median_cov_9.00   
+>bkpt5_Seq1_pos_342_fuzzy_0_HET_len_125_qual_50_avg_cov_25.17_median_cov_24.00   
 ATGGTTTATAGAACCCGGGCGTTCATGTCCGTCAGAACGATCTTGGCACGGTAGCCCCTGGTCCAGAGAGCCAAGGTGACTCAGCCCCACGATGGTGGTCTAGAGCGAAATAACCCTCGCCGAGA
->bkpt11_Seq2_pos_535_fuzzy_0_HOM_len_140_qual_50_avg_cov_21.63_median_cov_22.00   
+>bkpt9_Seq2_pos_535_fuzzy_0_HOM_len_140_qual_50_avg_cov_41.66_median_cov_43.00   
 TAACGTTCGCTGAACATCGACTCCGGTGACGACATACGATTCAAGAAGAGAGTGACTCTGTAGGATAACATCCCGCAACGCCTAATCCATCCAGCCTGGCACCATGTATAAAGGGCGTCAGGTATGTTAACGAGACTATT
->bkpt12_Seq2_pos_835_fuzzy_1_HOM_len_207_qual_50_avg_cov_16.50_median_cov_16.00   
+>bkpt10_Seq2_pos_835_fuzzy_1_HOM_len_207_qual_50_avg_cov_40.42_median_cov_42.00   
 GCACGCTGCAGGATTGGAACCACAATGTACGCCGATCCAAGCAGTAGTGGTTCATTGTATAAGTATCCTCCCTTGATTGGTCGAATATTAGGCATGCCCCGGGAGCATGTGGGCTCGAGCCACGGAGAGCAACTAATCGCGCATAAAACAAATACCTCATGGTTTTTGTGCGGAAAACCGTTGGGTGGACCATCAGCGGTTGTGATT
->bkpt16_Seq3_pos_781_fuzzy_0_HOM_len_111_qual_50_avg_cov_20.85_median_cov_20.50   
+>bkpt14_Seq3_pos_781_fuzzy_0_HOM_len_111_qual_50_avg_cov_50.54_median_cov_53.00   
 TTTGCAGCACTAGCCGTTCCTTGACATCTGCGGCCAACTTGTGCCTGAACCTGGAGTTTCGACAGCGTGGCGCTCTGGCCTAGTTCTTCGCTGGCACCTGGAAGAGCCGCC
->bkpt18_Seq4_pos_351_fuzzy_2_HET_len_120_qual_50_avg_cov_9.73_median_cov_10.00   
+>bkpt16_Seq4_pos_351_fuzzy_2_HET_len_120_qual_50_avg_cov_25.67_median_cov_26.00   
 GTCTTAACCTTAAGACCGTTCATTGATAAAACTTGCTCACGCTCTAGATGGCGTGAAGCGAAACCTAGGAAAAAGTTTTGCAGATAATTAGATTATGCGCGATACTCCGCCGTGTGTTCA
->bkpt20_Seq4_pos_603_fuzzy_3_HOM_len_57_qual_50_avg_cov_22.71_median_cov_23.00   
+>bkpt18_Seq4_pos_603_fuzzy_3_HOM_len_57_qual_50_avg_cov_46.66_median_cov_47.00   
 TGTATTCCTGGGTTGAGTGGCAGGTTTCTCTTAATTCTTCCCTAAGTAGCTCCGAGG
->bkpt22_Seq4_pos_821_fuzzy_0_HOM_len_40_qual_50_avg_cov_24.34_median_cov_24.00   
+>bkpt20_Seq4_pos_821_fuzzy_0_HOM_len_40_qual_50_avg_cov_37.63_median_cov_38.00   
 GGATGGCGGCCGGAGAGCGCTGCAATCGCATGGCTCGGGA


=====================================
test/full_test/gold.insertions.vcf
=====================================
@@ -1,8 +1,8 @@
 ##fileformat=VCFv4.1
-##filedate=Thu May  9 11:36:09 2019
-##source=MindTheGap fill version 2.2.0
-##SAMPLE=file:test-output/full-test.h5
-##REF=file:test-output/full-test
+##filedate=Tue Feb  8 12:29:50 2022
+##source=MindTheGap fill version 2.2.3
+##SAMPLE=file:gold.h5
+##REF=file:gold
 ##INFO=<ID=TYPE,Number=1,Type=String,Description="INS">
 ##INFO=<ID=LEN,Number=1,Type=Integer,Description="variant size">
 ##INFO=<=QUAL,Number=.,Type=Integer,Description="Quality of the insertion">
@@ -12,11 +12,11 @@
 ##INFO=<ID=NPOS,Number=1,Type=Integer,Description="number of alternative positions for the insertion site (= size of repeat (fuzzy) +1)">
 ##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
 #CHROM	POS	ID	REF	ALT	QUAL	FILTER	INFO	FORMAT	G1
-Seq0	123	bkpt2	G	GATCTAAGCTGTGACCTTGTGGCCGAGGCGCTTTTCACGCCTACATTAACTCCTGGGAAGCTCTCTGCTCTAGTTTCAGTGCACATCTCCAGGTGAGCAACCCTGGCAAGCAGCCCCTTCCTGTAGAAATTACTTAGC	.	PASS	TYPE=INS;LEN=137;QUAL=50;NSOL=1;NPOS=1;AVK=8.38;MDK=8.00	GT	0/1
-Seq1	342	bkpt6	T	TATGGTTTATAGAACCCGGGCGTTCATGTCCGTCAGAACGATCTTGGCACGGTAGCCCCTGGTCCAGAGAGCCAAGGTGACTCAGCCCCACGATGGTGGTCTAGAGCGAAATAACCCTCGCCGAGA	.	PASS	TYPE=INS;LEN=125;QUAL=50;NSOL=1;NPOS=1;AVK=9.91;MDK=9.00	GT	0/1
-Seq2	535	bkpt11	G	GTAACGTTCGCTGAACATCGACTCCGGTGACGACATACGATTCAAGAAGAGAGTGACTCTGTAGGATAACATCCCGCAACGCCTAATCCATCCAGCCTGGCACCATGTATAAAGGGCGTCAGGTATGTTAACGAGACTATT	.	PASS	TYPE=INS;LEN=140;QUAL=50;NSOL=1;NPOS=1;AVK=21.63;MDK=22.00	GT	1/1
-Seq2	834	bkpt12	A	ATGCACGCTGCAGGATTGGAACCACAATGTACGCCGATCCAAGCAGTAGTGGTTCATTGTATAAGTATCCTCCCTTGATTGGTCGAATATTAGGCATGCCCCGGGAGCATGTGGGCTCGAGCCACGGAGAGCAACTAATCGCGCATAAAACAAATACCTCATGGTTTTTGTGCGGAAAACCGTTGGGTGGACCATCAGCGGTTGTGAT	.	PASS	TYPE=INS;LEN=207;QUAL=50;NSOL=1;NPOS=2;AVK=16.50;MDK=16.00	GT	1/1
-Seq3	781	bkpt16	G	GTTTGCAGCACTAGCCGTTCCTTGACATCTGCGGCCAACTTGTGCCTGAACCTGGAGTTTCGACAGCGTGGCGCTCTGGCCTAGTTCTTCGCTGGCACCTGGAAGAGCCGCC	.	PASS	TYPE=INS;LEN=111;QUAL=50;NSOL=1;NPOS=1;AVK=20.85;MDK=20.50	GT	1/1
-Seq4	349	bkpt18	G	GCAGTCTTAACCTTAAGACCGTTCATTGATAAAACTTGCTCACGCTCTAGATGGCGTGAAGCGAAACCTAGGAAAAAGTTTTGCAGATAATTAGATTATGCGCGATACTCCGCCGTGTGTT	.	PASS	TYPE=INS;LEN=120;QUAL=50;NSOL=1;NPOS=3;AVK=9.73;MDK=10.00	GT	0/1
-Seq4	600	bkpt20	T	TAGGTGTATTCCTGGGTTGAGTGGCAGGTTTCTCTTAATTCTTCCCTAAGTAGCTCCG	.	PASS	TYPE=INS;LEN=57;QUAL=50;NSOL=1;NPOS=4;AVK=22.71;MDK=23.00	GT	1/1
-Seq4	821	bkpt22	C	CGGATGGCGGCCGGAGAGCGCTGCAATCGCATGGCTCGGGA	.	PASS	TYPE=INS;LEN=40;QUAL=50;NSOL=1;NPOS=1;AVK=24.34;MDK=24.00	GT	1/1
+Seq0	123	bkpt2	G	GATCTAAGCTGTGACCTTGTGGCCGAGGCGCTTTTCACGCCTACATTAACTCCTGGGAAGCTCTCTGCTCTAGTTTCAGTGCACATCTCCAGGTGAGCAACCCTGGCAAGCAGCCCCTTCCTGTAGAAATTACTTAGC	.	PASS	TYPE=INS;LEN=137;QUAL=50;NSOL=1;NPOS=1;AVK=21.59;MDK=21.00	GT	0/1
+Seq1	342	bkpt5	T	TATGGTTTATAGAACCCGGGCGTTCATGTCCGTCAGAACGATCTTGGCACGGTAGCCCCTGGTCCAGAGAGCCAAGGTGACTCAGCCCCACGATGGTGGTCTAGAGCGAAATAACCCTCGCCGAGA	.	PASS	TYPE=INS;LEN=125;QUAL=50;NSOL=1;NPOS=1;AVK=25.17;MDK=24.00	GT	0/1
+Seq2	535	bkpt9	G	GTAACGTTCGCTGAACATCGACTCCGGTGACGACATACGATTCAAGAAGAGAGTGACTCTGTAGGATAACATCCCGCAACGCCTAATCCATCCAGCCTGGCACCATGTATAAAGGGCGTCAGGTATGTTAACGAGACTATT	.	PASS	TYPE=INS;LEN=140;QUAL=50;NSOL=1;NPOS=1;AVK=41.66;MDK=43.00	GT	1/1
+Seq2	834	bkpt10	A	ATGCACGCTGCAGGATTGGAACCACAATGTACGCCGATCCAAGCAGTAGTGGTTCATTGTATAAGTATCCTCCCTTGATTGGTCGAATATTAGGCATGCCCCGGGAGCATGTGGGCTCGAGCCACGGAGAGCAACTAATCGCGCATAAAACAAATACCTCATGGTTTTTGTGCGGAAAACCGTTGGGTGGACCATCAGCGGTTGTGAT	.	PASS	TYPE=INS;LEN=207;QUAL=50;NSOL=1;NPOS=2;AVK=40.42;MDK=42.00	GT	1/1
+Seq3	781	bkpt14	G	GTTTGCAGCACTAGCCGTTCCTTGACATCTGCGGCCAACTTGTGCCTGAACCTGGAGTTTCGACAGCGTGGCGCTCTGGCCTAGTTCTTCGCTGGCACCTGGAAGAGCCGCC	.	PASS	TYPE=INS;LEN=111;QUAL=50;NSOL=1;NPOS=1;AVK=50.54;MDK=53.00	GT	1/1
+Seq4	349	bkpt16	G	GCAGTCTTAACCTTAAGACCGTTCATTGATAAAACTTGCTCACGCTCTAGATGGCGTGAAGCGAAACCTAGGAAAAAGTTTTGCAGATAATTAGATTATGCGCGATACTCCGCCGTGTGTT	.	PASS	TYPE=INS;LEN=120;QUAL=50;NSOL=1;NPOS=3;AVK=25.67;MDK=26.00	GT	0/1
+Seq4	600	bkpt18	T	TAGGTGTATTCCTGGGTTGAGTGGCAGGTTTCTCTTAATTCTTCCCTAAGTAGCTCCG	.	PASS	TYPE=INS;LEN=57;QUAL=50;NSOL=1;NPOS=4;AVK=46.66;MDK=47.00	GT	1/1
+Seq4	821	bkpt20	C	CGGATGGCGGCCGGAGAGCGCTGCAATCGCATGGCTCGGGA	.	PASS	TYPE=INS;LEN=40;QUAL=50;NSOL=1;NPOS=1;AVK=37.63;MDK=38.00	GT	1/1


=====================================
test/full_test/gold.othervariants.vcf
=====================================
@@ -1,25 +1,39 @@
 ##fileformat=VCFv4.1
-##filedate=Thu May  9 11:49:49 2019
-##source=MindTheGap find version 2.2.0
-##SAMPLE=file:../data/reads_r1.fastq,../data/reads_r2.fastq
-##REF=file:../data/reference.fasta
+##filedate=Tue Feb  8 12:29:50 2022
+##source=MindTheGap find version 2.2.3
+##SAMPLE=file:../../data/reads_r1.fastq,../../data/reads_r2.fastq
+##REF=file:../../data/reference.fasta
 ##INFO=<ID=TYPE,Number=1,Type=String,Description="SNP, INS, DEL or .">
 ##INFO=<ID=LEN,Number=1,Type=Integer,Description="variant size">
 ##INFO=<ID=FUZZY,Number=1,Type=Integer,Description="repeat size at the breakpoint, only for INS and DEL">
 ##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
 #CHROM	POS	ID	REF	ALT	QUAL	FILTER	INFO	FORMAT	G1
 Seq0	101	bkpt1	T	C	.	PASS	TYPE=SNP;LEN=1;FUZZY=0	GT	1/1
-Seq0	816	bkpt3	C	A	.	PASS	TYPE=SNP;LEN=1;FUZZY=0	GT	1/1
-Seq1	206	bkpt4	G	C	.	PASS	TYPE=SNP;LEN=1;FUZZY=0	GT	1/1
-Seq1	219	bkpt5	T	A	.	PASS	TYPE=SNP;LEN=1;FUZZY=0	GT	1/1
-Seq1	740	bkpt7	CCTGTTGGGAAGGAATTGCAATACTCTCCGAACCAGCTTAGGGCCCCCCGCCGCCGCAATTCGAGCGTTATGCCCGGAGCATTTGCACGATGCCATTAAACTATATCAA	C	.	PASS	TYPE=DEL;LEN=108;FUZZY=2	GT	1/1
-Seq2	320	bkpt8	T	C	.	PASS	TYPE=SNP;LEN=1;FUZZY=0	GT	1/1
-Seq2	344	bkpt9	C	A	.	PASS	TYPE=SNP;LEN=1;FUZZY=0	GT	1/1
-Seq2	379	bkpt10	G	C	.	PASS	TYPE=SNP;LEN=1;FUZZY=0	GT	1/1
-Seq3	256	bkpt13	A	T	.	PASS	TYPE=SNP;LEN=1;FUZZY=0	GT	1/1
-Seq3	511	bkpt14	C	A	.	PASS	TYPE=SNP;LEN=1;FUZZY=0	GT	1/1
-Seq3	766	bkpt15	G	A	.	PASS	TYPE=SNP;LEN=1;FUZZY=0	GT	1/1
-Seq4	257	bkpt17	C	T	.	PASS	TYPE=SNP;LEN=1;FUZZY=0	GT	1/1
-Seq4	512	bkpt19	A	G	.	PASS	TYPE=SNP;LEN=1;FUZZY=0	GT	1/1
-Seq4	841	bkpt21	C	T	.	PASS	TYPE=SNP;LEN=1;FUZZY=0	GT	1/1
-Seq4	884	bkpt23	CTAGGGACCTAGACGCAACAGTAACCGCCTCGGAGTAAGCCCTGG	C	.	PASS	TYPE=DEL;LEN=44;FUZZY=2	GT	1/1
+Seq0	297	bkpt3	CTAGCTTGAGAGTGCGTATCTCACCGATCCCCTGGCTATGCTCCGCGATTCACTAGTAGTTTCACGCCGACAGAGCGAAACCGTGATAGGTCATCATGCCGGTCTGCAGTCACGT	C	.	PASS	TYPE=DEL;LEN=114;FUZZY=0	GT	1/1
+Seq0	816	bkpt4	C	A	.	PASS	TYPE=SNP;LEN=1;FUZZY=0	GT	1/1
+Seq2	320	bkpt6	T	C	.	PASS	TYPE=SNP;LEN=1;FUZZY=0	GT	1/1
+Seq2	344	bkpt7	C	A	.	PASS	TYPE=SNP;LEN=1;FUZZY=0	GT	1/1
+Seq2	379	bkpt8	G	C	.	PASS	TYPE=SNP;LEN=1;FUZZY=0	GT	1/1
+Seq3	256	bkpt11	A	T	.	PASS	TYPE=SNP;LEN=1;FUZZY=0	GT	1/1
+Seq3	511	bkpt12	C	A	.	PASS	TYPE=SNP;LEN=1;FUZZY=0	GT	1/1
+Seq3	766	bkpt13	G	A	.	PASS	TYPE=SNP;LEN=1;FUZZY=0	GT	1/1
+Seq4	257	bkpt15	C	T	.	PASS	TYPE=SNP;LEN=1;FUZZY=0	GT	1/1
+Seq4	512	bkpt17	A	G	.	PASS	TYPE=SNP;LEN=1;FUZZY=0	GT	1/1
+Seq4	841	bkpt19	C	T	.	PASS	TYPE=SNP;LEN=1;FUZZY=0	GT	1/1
+Seq4	884	bkpt21	CTAGGGACCTAGACGCAACAGTAACCGCCTCGGAGTAAGCCCTGG	C	.	PASS	TYPE=DEL;LEN=44;FUZZY=2	GT	1/1
+Seq5	100	bkpt22	T	TC	.	PASS	TYPE=INS;LEN=1;FUZZY=1	GT	1/1
+Seq5	199	bkpt23	T	TG	.	PASS	TYPE=INS;LEN=1;FUZZY=1	GT	1/1
+Seq5	300	bkpt24	A	AC	.	PASS	TYPE=INS;LEN=1;FUZZY=0	GT	1/1
+Seq5	400	bkpt25	G	GT	.	PASS	TYPE=INS;LEN=1;FUZZY=2	GT	1/1
+Seq5	500	bkpt26	C	CT	.	PASS	TYPE=INS;LEN=1;FUZZY=0	GT	1/1
+Seq5	600	bkpt27	G	GA	.	PASS	TYPE=INS;LEN=1;FUZZY=0	GT	0/1
+Seq5	700	bkpt28	A	AC	.	PASS	TYPE=INS;LEN=1;FUZZY=0	GT	0/1
+Seq5	800	bkpt29	A	AC	.	PASS	TYPE=INS;LEN=1;FUZZY=2	GT	0/1
+Seq5	900	bkpt30	T	TG	.	PASS	TYPE=INS;LEN=1;FUZZY=2	GT	0/1
+Seq6	98	bkpt31	T	TAT	.	PASS	TYPE=INS;LEN=2;FUZZY=3	GT	1/1
+Seq6	200	bkpt32	A	ACG	.	PASS	TYPE=INS;LEN=2;FUZZY=1	GT	1/1
+Seq6	300	bkpt33	A	ATA	.	PASS	TYPE=INS;LEN=2;FUZZY=1	GT	1/1
+Seq6	400	bkpt34	T	TCA	.	PASS	TYPE=INS;LEN=2;FUZZY=0	GT	1/1
+Seq6	600	bkpt35	T	TCA	.	PASS	TYPE=INS;LEN=2;FUZZY=0	GT	0/1
+Seq6	699	bkpt36	C	CGT	.	PASS	TYPE=INS;LEN=2;FUZZY=2	GT	0/1
+Seq6	800	bkpt37	T	TGG	.	PASS	TYPE=INS;LEN=2;FUZZY=0	GT	0/1


=====================================
test/full_test/gold_bed.breakpoints
=====================================
@@ -2,11 +2,11 @@
 CTCCGGATCTCCGTGTTCTTCGGAAGCTTAG
 >bkpt2_Seq0_pos_123_fuzzy_0_HET  right_kmer
 GTCACGCGCGTCATACTACAGTAAGTTACTG
->bkpt3_Seq1_pos_342_fuzzy_0_HET  left_kmer
+>bkpt4_Seq1_pos_342_fuzzy_0_HET  left_kmer
 GCCGCGCAAAGCCGGTCAACAGCGTTAGTAT
->bkpt3_Seq1_pos_342_fuzzy_0_HET  right_kmer
+>bkpt4_Seq1_pos_342_fuzzy_0_HET  right_kmer
 GTTGAAAGTTTACTCAGATCGCTTCTGTCGG
->bkpt4_Seq2_pos_535_fuzzy_0_HOM  left_kmer
+>bkpt5_Seq2_pos_535_fuzzy_0_HOM  left_kmer
 GGCATGCGTAAGTTATCGTGAAACCATGATG
->bkpt4_Seq2_pos_535_fuzzy_0_HOM  right_kmer
+>bkpt5_Seq2_pos_535_fuzzy_0_HOM  right_kmer
 GCCCCTTACTAGACCAAATGTACTGAATGCG


=====================================
test/full_test/gold_bed.othervariants.vcf
=====================================
@@ -1,6 +1,6 @@
 ##fileformat=VCFv4.1
-##filedate=Thu May  9 11:40:18 2019
-##source=MindTheGap find version 2.2.0
+##filedate=Tue Feb  8 12:46:04 2022
+##source=MindTheGap find version 2.2.3
 ##SAMPLE=file:../../data/reads_r1.fastq,../../data/reads_r2.fastq
 ##REF=file:../../data/reference.fasta
 ##INFO=<ID=TYPE,Number=1,Type=String,Description="SNP, INS, DEL or .">
@@ -9,3 +9,4 @@
 ##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
 #CHROM	POS	ID	REF	ALT	QUAL	FILTER	INFO	FORMAT	G1
 Seq0	101	bkpt1	T	C	.	PASS	TYPE=SNP;LEN=1;FUZZY=0	GT	1/1
+Seq0	297	bkpt3	CTAGCTTGAGAGTGCGTATCTCACCGATCCCCTGGCTATGCTCCGCGATTCACTAGTAGTTTCACGCCGACAGAGCGAAACCGTGATAGGTCATCATGCCGGTCTGCAGTCACGT	C	.	PASS	TYPE=DEL;LEN=114;FUZZY=0	GT	1/1


=====================================
test/full_test/gold_fill.output
=====================================
@@ -1,7 +1,6 @@
-nb breakpoints=7
 MindTheGap fill                         
-    version                                  : 1.0.0
-    gatb-core-library                        : 1.2.0
+    version                                  : 2.2.3
+    gatb-core-library                        : 1.4.2
     supported_kmer_sizes                     : 32 64 96 128
 Parameters                              
     Input data                              
@@ -9,18 +8,21 @@ Parameters
         Breakpoints                              : gold.breakpoints
     Graph                                   
         kmer-size                                : 31
-        abundance_min (auto inferred)            : 3 
-        abundance_min (used)                     : 3
-        nb_solid_kmers                           : 5473
-        nb_branching_nodes                       : 34
+        abundance_min (auto inferred)            : 7 
+        abundance_min (used)                     : 7
+        nb_solid_kmers                           : 7419
+        nb_branching_nodes                       : 36
     Assembly options                        
         max_depth                                : 10000
         max_nodes                                : 100
 Results                                 
     Breakpoints                             
-        nb_input                                 : 7
-        nb_filled                                : 7
-            unique_sequence                          : 7
-            multiple_sequence                        : 0
-    Time                                     : 0.0 s
-    Output file                              : gold.insertions.fasta
+        nb_input_breakpoints                     : 8
+        nb_filled_breakpoints                    : 8
+            as_unique_sequence                       : 8
+            as_multiple_sequence                     : 0
+    Time                                     : 1.0 s
+    Output files                            
+        assembled sequence file                  : gold.insertions.fasta
+        insertion variant vcf file               : gold.insertions.vcf
+        assembly statistics file                 : gold.info.txt


=====================================
test/full_test/gold_find.output
=====================================
@@ -1,6 +1,6 @@
 MindTheGap find                         
-    version                                  : 1.0.0
-    gatb-core-library                        : 1.2.0
+    version                                  : 2.2.3
+    gatb-core-library                        : 1.4.2
     supported_kmer_sizes                     : 32 64 96 128
 Parameters                              
     Input data                              
@@ -8,11 +8,11 @@ Parameters
         Reference                                : ../../data/reference.fasta
     Graph                                   
         kmer-size                                : 31
-        abundance_min (auto inferred)            : 3 
-        abundance_min (used)                     : 3
+        abundance_min (auto inferred)            : 7 
+        abundance_min (used)                     : 7
         abundance_max                            : 2147483647
-        nb_solid_kmers                           : 5473
-        nb_branching_nodes                       : 34
+        nb_solid_kmers                           : 7419
+        nb_branching_nodes                       : 36
     Breakpoint detection options            
         max_repeat                               : 5
         hetero_max_occ                           : 1
@@ -25,12 +25,14 @@ Results
         homozygous                               : 5
             clean                                    : 3
             fuzzy                                    : 2
-        heterozygous                             : 2
-            clean                                    : 1
+        heterozygous                             : 3
+            clean                                    : 2
             fuzzy                                    : 1
     Other variants                          
-        deletions                                : 3
-        SNPs                                     : 13
+        deletions                                : 2
+        Homozygous insertions 1-2 bp size        : 9
+        Heterozygous insertions 1-2 bp size      : 7
+        SNPs                                     : 11
     Time                                     : 0.0 s
     Output files                            
         graph_file                               : gold.h5


=====================================
test/full_test/reference.fasta
=====================================
@@ -8,3 +8,7 @@ TGCTGCCGATCGCTACGACGTCCTACCTTACACACAACGGGCCGCGTTCATACCCACGTATGAAGACATGCGGTTATCCG
 TATTGCGCCCTTCAAGAAGCTTCTGCTGACCGTAGGCGTCTCGGCGGTTTGTACTTTGAAAAATTAGCTGCACTACATCCGATGGGTATCCCTCCTCAATCTCAGCAGACCCGGAAAGCGATAGAATCAGCCACGCGGTCGTCCGGGCTAGGGGCCCTGCGCAAGGAAGGTTGGACAGGGCTAGACCCGGAAGCATCGGCTTTTCCTAAATGGTGACGGAGTTATATAGGGTAAGCCTGATAGCGCGGTAGGTGTAATGGCCATCCCCTCGCCTAGCGTGCGCGCAGACAAGTCCAGTCCCGGAGGAGGCATAGGCCTCATTATCATTTCCCTAGAATCGCTCTTGACATCTAGGTTGTACTAGGGACCAGGCGCCCAAAGCGGACGGTTCTCCGTGCTTTCGTGCCGTTTCAGCGTAAGATGCTATTTTTTGGGGAAATGGTCGGCGTGTGCGGGGGAGAACCACGGTACCAACTACGATAAGTCCGTCGTGTAACTTACGTGAAGGTGCTGTGAAGCAGGAATCCGTGCCAAAATGTCCGTGCGATATCCAACTTTCATAGTATTACACGAGAGCCTATGATTTGCCCAGGCGCGACCCGTGAATCGAGGTAATCGCCGACCAGATATTGCGAAACACCACATTACATGACTACTGTCCGCTTGAAGAGTTATATACTTGACAGTCCTGGTTGACGGCACAGCATATCTCCAATGTGTGGTTTAAAGTCTCACGTTCTTCATGCGCGCCGGCCCATGGGAACAGGTATCCTTACTTTCGGTACAAATGAGGCTCCAAAATAGCACGCTTGCAGCAGTCAAGTTGAACGCCTTAAAAGGCACCGCCGCTCGTTCATTGGGATTCCTTGAGAATCGTGACTTGTTACACTATAAGATCATGGATTGGACAAAATAGGCCAACTCCCGCACGCTGTGGCTATTCTTAAGTTGCATAGGTGGGAGTAGCCTTATACTCGATTTCTAAAAAGAGTAGGTGAGC
 >Seq4
 TTCCGGCGCCGCACTAATTGAAGTGGTGAGCTGACCAGTCGTTCAGGATCCGAAGGCGGGGATGGCGCTATAGGAGCCGGCAGGTATGCTTTGCCGCAAAATTTCGGGGTGGTGGAACCGTCTTACCGAAAGTTAGCTACAGCCTGGAATGTGAAATTCCATGACCTGCCCGTCCTGTGTCCACAGGGCGACATTTGCCACGTAGGTAGGGCGACCATTAGAATGCTGCATTATCGGGCGATAAAAAGTTTTATACCCAAGAATCCTACAAAGATGAAAATTTCGAAGAGCTGCACGCAGTTGTAAGTTGCTTTTCTGGGGTAATCGAGATTCTCCACCATAACCTGCGCAATGCATCGTGAAGCTTTACCGCGCCCAAGGGGAGCGTCTCAGTGGGGTTGCCTCCAGGGATATATTGAAAGTTGAAGAAGAAGATCACAGGTTAAGCGGTATGTTAAGTTAGAACTCACGGGGAGCCGCCTTGATTTTGTTCGACATGAACCAGAGACCAAGTGTGTTATGTTCTGGAACCTTAATACGTACGTCGCCAGCACCGAGCCGGCACTCCATCTCTTTTGGGTGCGCAACATTGCTATACTTAGGATCCATTGACATCTGTCAGCCGTCTTTCCAGAACGTTATAAGACTCGTGAGGAAATTATACAAATCGTTGCCATCATCCAAAGCAAAGTACTTCCGCTTAGGAGTGCCTTGAAGAACCGATTATCTCTGACAATGTAATGCCACAGCACCCTCGACAAAGTTCTACATTCGTTCCAGGTCATGATACAGCGCGCTAAATTACCGCTACGAGCCATACCCCGAACATTGAGACCTGGCCAGTAGGTAGGTGTCAAATCGATATCCACACCTGTCGAAGCAGCTAGGGACCTAGACGCAACAGTAACCGCCTCGGAGTAAGCCCTGGTAAAGATCGGTTGCGGCGGGAGTCCTCCATTCAGGCCAAACGTGCAGTGCTCGATGTGCTTCCTATCGCTCT
+>Seq5
+GATGTTTAGAAGTTTCCAGGTCACGCCAATGATTGGCATTTACACACGTGGATCAGCGGACATATCTAACCCTTAGTGTTCTTAAGAGCAACTCACTACTCATTTCCACTAACCCCGCCGGCGGTAATTCCAATCTAGTTGATCAGACTTCCCAGTCAATGAAAGCGACACCGTGCGTCTGTAATACCAACAAGACCCTGCTGTCGTCCCGCAGAGGACGCGGCACCTCCGGATTTTGAGTCCAGTCTGAACGATTTTCGATCACTCACCATGGATCTGGAAAACGGAGTCGAGTACTCAGAGCCAAATTGATGCATTTCCAATGACCCGATGCAGGTGCGACCGATCTTCGCCTATGCTTCCCGCCGTAATTATTGAGTCTGGGTCCCGGCCGCTAACGTTGACTCACGGGGAGGTACCCGTGCGTATTCTTCTCAAAGTGACGCTGGACAGCAGCGCATGTCCGAGCCCCATCGTCCTATCTGGTGTAGAGTCTTACCCTAATTAGAGTGATCGAACCAGTAGGTGTCGCGGTCTTAGGGCTCCCATTGTCCAAGGGAACGTGAACAGATATGAATCTGGGAGAATAGTGCAGCGTTGCCCTTCTGGTCGGTCAGCCCTTGCCTACGGCCCGTATGCGGAGAATGAAGGCGTGAAACATTCTGCTCTTTTAGAAGCAGCGGCTGCACCCGTATAACAATCGCACGATCGTACGTCTCATTTGCCGCGTTGGCGCGCCCGTGGATGATGGACCACGGTATGAACCTCTGCACTTCAAATTTGACGCAATCCTGCACTCACCGCACACAGTTCTAGTCTAACCGTCGCAGTGTCTGCTTTAAGGTAGAGATCGATACTTAGGATATGTTCATGTGTGTTTGTAGCGCTGGACCCTCTTATGGTGTGGTCACTTGTGATGGATCGAGGAACTTAGGCGGTTAACTTGTTTCGACGTCTCACCGACAATATCAGGATTTAGTATCG
+>Seq6
+ACCGAAAATGACAATGTTCACACGCATGCTCGGCGTGGAAAAGAGCCTTTTCTAAGACCGACTCGTTCCGGGCAGCAGGATTATTAGCCAATCAAAATTATCGACCGGTCATCAAGCTGCGATAGTGCAGGCGCATGCCGTCCAATGGGTCCACGGCGGAAGTGCGTTCGTCTACTCTGTCAAATCTTAACATTTTTTGAGGCTAATCCGGCCGGTAGTGTACCGTGAACCAAAGTCCTTCTACGAGCGTATTAGATTGCTCAAAAGATCCGGGAGAATTGACCAGGTCGTATCTTTAAAAACGCTGGTGCGAGCAGCTGCTGTTTTATCAACACCCATTTAGTCCTGTGAAGTTTGCTTAGCAGATACACCTTCCCGCGTGGTATGAGAGGCTGTTCTTTTAAAAACTATGAGGCTCTGGCACCTTCGACGCTAACAAAGTCCCCACGGACCATGATACCCTTACGCAACTCTCTTTGCACGCTAGGGCGAGAGTACTGCCCCTAGACTAGGTACACGCCGGGTAAACTCTCTCGCACACCTTTACGCTCGACTACAGGCTTCTAACCCTTCCGAACGCATATAATTCAAATGGCACTTAGTAACAGACGAATCACGGCTCACAGGCAGAATTCACTGGAGTAAAAGGATTCAGAACAATAGATAGTGTGTTAACTTTACAGTCATCCGTATTATAACGTAGCGAGAGGATTGAGTTCTTGTTAGGAAGGAAGGTCCTATAGACGAGTGCGGTAGCGCACCCGGTCGCCTTGCGTAGTCATGCCCGACGTGTTGATGGTTCCCTTTTAGCCGCCACACAAGGGATCCGAGGGTGAGAGACACATGGCCCTCACCGACGAGACTTACTCAGCCTGCCTCGCTATTGCCCTCTTTTTGATCGTCCCTTTGTGGCTCTCGAGGACTCGTGCAGCGTGTATCTGGGGATTTGTAAGCTTAAGACTACCTTCCATAGGA



View it on GitLab: https://salsa.debian.org/med-team/mindthegap/-/commit/b905744345c06c4c96602510025ab2bf5aefebbc




