[med-svn] [Git][med-team/staden-io-lib][master] 6 commits: routine-update: New upstream version

Étienne Mollier (@emollier) gitlab at salsa.debian.org
Sat Jan 20 19:35:36 GMT 2024



Étienne Mollier pushed to branch master at Debian Med / staden-io-lib


Commits:
e1543c41 by Étienne Mollier at 2024-01-20T18:48:24+01:00
routine-update: New upstream version

- - - - -
7bb8453e by Étienne Mollier at 2024-01-20T18:48:25+01:00
New upstream version 1.15.0
- - - - -
94f9bfb1 by Étienne Mollier at 2024-01-20T18:48:37+01:00
Update upstream source from tag 'upstream/1.15.0'

Update to upstream version '1.15.0'
with Debian dir 12f011155c5c87ef4ee52e8eb5dcbb5b3aca127e
- - - - -
47e97ba2 by Étienne Mollier at 2024-01-20T18:48:37+01:00
routine-update: Standards-Version: 4.6.2

- - - - -
c5991c59 by Étienne Mollier at 2024-01-20T19:01:34+01:00
d/*lintian-overrides: fix mismatched override.

- - - - -
75c3b713 by Étienne Mollier at 2024-01-20T19:03:39+01:00
typo.patch: fix typo caught by lintian.

- - - - -


14 changed files:

- CHANGES
- README.md
- configure.ac
- debian/changelog
- debian/control
- debian/patches/series
- + debian/patches/typo.patch
- debian/staden-io-lib-examples.lintian-overrides
- io_lib/cram_encode.c
- progs/scramble.c
- tests/data/xx#MD.sam
- tests/data/xx#MD2.sam
- tests/data/xx#rg.sam
- tests/data/xx.fa


Changes:

=====================================
CHANGES
=====================================
@@ -1,3 +1,21 @@
+Version 1.15.0 (14th April 2023)
+--------------
+
+Version number bumped to reflect the official status of CRAM 3.1.
+
+Updates:
+
+* Formally accept CRAM 3.1 as an official standard.  Warning removed.
+  For best compatibility CRAM 3.0 is still the default CRAM, but use
+  "-V3.1" to specify the version.
+
+* Updated to latest htscodecs.  This has a significant speed
+  improvement in encoding with fqzcomp (enabled in "-X small" profile).
+
+  Tested on a NovaSeq dataset, encoding from BAM to CRAM was 27% faster.
+  Decoding a CRAM with fqzcomp is also around 6% faster.
+
+
 Version 1.14.15 (6th December 2022)
 ---------------
 


=====================================
README.md
=====================================
@@ -1,5 +1,5 @@
-Io_lib:  Version 1.14.15
-========================
+Io_lib:  Version 1.15.0
+=======================
 
 Io_lib is a library of file reading and writing code to provide a general
 purpose SAM/BAM/CRAM, trace file (and Experiment File) reading
@@ -33,131 +33,30 @@ See the CHANGES for a summary of older updates or git logs for the
 full details.
 
 
-Version 1.14.15 (6th December 2022)
----------------
+Version 1.15.0 (14th April 2023)
+--------------
 
-This is primarily a bug fix release.
+The first release that no longer warns about CRAM 3.1 being draft.
+No changes have been made to the format and it is fully compatible
+with the 1.14.x releases.
 
 
-Version 1.14.14 (17th March 2021)
----------------
+Technology Demo: 4.0
+====================
 
-This is simply a bug fix release.  It also updates to the latest
-htscodecs submodule, now at an official 1.0 release.
+The current official GA4GH CRAM version is 3.1.
 
-Version 1.14.13 (3rd July 2020)
----------------
+The current default CRAM output is 3.0, for maximum compatibility with
+other tools.  Use the -V3.1 option to select CRAM 3.1 if needed.
 
-This release has a mixture of on-going CRAM 4 work (not compatible
-with previous CRAM 4) and some more general quality of life
-improvements for all CRAM versions including speed-ups and better
-multi-threading.
-
-Note both CRAM 3.1 and 4.0 are still to be considered an unofficial
-CRAM extensions.
-
-Updates:
-
-* Scramble can now filter-in or filter-out aux tags during
-  transcoding.  This is done using -d and -D options.  For example:
-
-      scramble -D OQ,BI,BD in.bam out.cram
-
-  removes the GATK added OQ, BI and BD aux tags.
-  Requested by @jhaezebrouck in issue #24.
-
-* The Scramble -X <profile> options are now implemented using a
-  CRAM_OPT_PROFILE option.  This simplifies the scramble code and
-  makes it easier to call from a library.  This also fixes a number of
-  bugs in the order of argument parsing.
-
-* Improved CRAM writing speeds.
-
-  The bam_copy function now only copies the number of used bytes
-  rather than the number of allocated bytes, which can sometimes be
-  substantially smaller.  As this was done in the main thread it may
-  have a significant benefit when multi-threading.
-
-* Added libdeflate support into CRAM too (in addition to the existing
-  support in BAM).  This isn't a huge change to CRAM speeds except at
-  high levels (-8 and -9) which are now slower, but also better
-  compression ratio.  A modest 2-3% speed gain is visible are low and
-  mid levels, and at -1/-2 to -4 the compression ratio is also
-  improved.
-
-* CRAM 3.1 compression level -1 is now 25% faster, but 4% larger.
-  This is achieved by difference choice of compression codecs, most
-  notably disabling the name tokeniser for level 1.  Use level 2 for
-  something comparable to the old behaviour.
-
-* Added an io_lib/version.h to make it easier to detect the version
-  being compiled against using IOLIB_VERSION macros.
-  Requested by German Tischler in issue #25.
-
-* Refactored the cram encoding interface used by biobambam.
-  Implemented by German Tischler in PR#27.
-
-* CRAM 4 now uses E_CONST instead of a uni-value version of
-  E_HUFFMAN.  Also added offset field to VARINT_SIGNED and
-  VARINT_UNSIGNED which helps for data series that have values from -1
-  to MAXINT.
-
-* CRAM 4 container structure has changed so that all values are
-  variable sized integers instead of fixed size.
-
-* Further improvements with CRAM 4's use of signed values.
-  - Ref_seq_id is container and slice headers are now signed.
-  - RI (ref ID) data series and NS (mate ref ID) are also now signed
-    as -1 is a valid value.
-  - Embedded ref id is now 0 for unusued instead of -1.
-
-* Reversed the use of CRAM 4 delta encoding for the B array.  It only
-  helps at the moment for ONT signal data, so it needs more work to
-  make it auto-detect when delta makes sense. (Enabling it globally
-  for CRAM4 B aux tags was accidental.)
-
-* Htscodecs submodule has gained support for big-endian platforms
-  Other big-endian improvements to parts of CRAM4 too.
-
-Bug fixes:
-
-* Fixed CRAM MD tag generatin when using the "b" feature code
-  (NB: unused by known CRAM encoders).
-  Also see https://github.com/samtools/htslib/pull/1086 for more details.
-
-* Fixed CRAM quality string when using "q" feature code (unused by
-  encoders?) and in lossy-quality mode (maybe utilised in old
-  Cramtools).
-  Also see https://github.com/samtools/htslib/pull/1094 for more details.
-
-* Fixed some minor memory leaks.
-
-* "Scramble -X archive -1" enabled lzma, which should only have
-  arrived at level 7 and above. (It compared integer 7 vs ASCII '1'.)
-
-* Removed minor compilation warning in printf debugging.
-
-* Fixed a 7 year old bug in scram_pileup which couldn't cope with
-  soft-clips being followed by hard-clips.
-
-
-Technology Demo: CRAM 3.1 and 4.0
-=================================
-
-The current official GA4GH CRAM version is 3.0.
-
-For purposes of *EVALUATION ONLY* this release of io_lib includes CRAM
-version 3.1, with new compression codecs (but is otherwise identical
-file layout to 3.0), and 4.0 with a few additional format
+For purposes of *EVALUATION ONLY* this release of io_lib also includes
+an experimental CRAM version 4.0.  The format very likely to change
+and should not be used for production data.  CRAM 4.0 includes format
 modifications, such as 64-bit sizes, deduplication of read names,
 orientation changes of quality strings and a revised variable sized
-integer encoding.
+integer encoding.  It can be enabled using scramble -V4.0
 
-They can be turned on using e.g. scramble -V3.1 or scramble -V4.0.
-It is likely CRAM v4.0 will be official significantly later, but we
-plan on v3.1 being a recognised GA4GH standard this year.
-
-By default enabling either of these will also enable the new codecs.
+Enabling CRAM 3.1 or 4.0 will also enable the new codecs.
 Which codecs are used also depends on the profile specified (eg via
 "-X small").  Some of the new codecs are considerably slower,
 especially at decompression, but by default CRAM 3.1 aims to be
@@ -167,79 +66,37 @@ small and archive respectively).
 
 Here are some example file sizes and timings with different codecs and
 levels on 10 million 150bp NovaSeq reads, single threaded.  Decode
-timing is checked using "scram_flagstat -b".  Tests were performed
-on an Intel i5-4570 processor at 3.2GHz.
+timing is checked using "scram_flagstat -b".
+
+Table produced with Io_lib 1.15.0 on a laptop with Intel i7-1185G7
+CPU running Ubuntu 20.04 under Microsoft's WSL2.
 
 |Scramble opts.      |Size(MB) |Enc(s)|Dec(s)|Codecs used                |
 |--------------------|--------:|-----:|-----:|---------------------------|
-|-O bam              |    531.9|  92.3|   7.5|bgzf(zlib)                 |
-|-O bam -1           |    611.4|  26.4|   5.4|bgzf(libdeflate)           |
-|-O bam (default)    |    539.5|  45.0|   4.9|bgzf(libdeflate)           |
-|-O bam -9           |    499.5| 920.2|   4.9|bgzf(libdeflate)           |
-||||||
-|-V2.0 -X fast       |    317.7|  38.8|  11.8|(default, level 1)         |
-|-V2.0 (default)     |    267.6|  47.0|  10.5|(default)                  |
-|-V2.0 -X small      |    218.0| 124.6|  33.1|bzip2                      |
-||||||
-|-V3.0 -X fast       |    264.9|  31.3|  10.8|(default, level 1)         |
-|-V3.0 (default)     |    223.7|  34.7|  10.3|(default)                  |
-|-V3.0 -X small      |    212.3|  88.3|  18.2|bzip2                      |
-|-V3.0 -X archive    |    209.4|  98.7|  18.2|bzip2                      |
-||||||
-|-V3.1 -X fast       |    262.4|  29.1|   9.3|rANS++                     |
-|-V3.1 (default)     |    186.4|  33.7|   8.3|rANS++,tok3                |
-|-V3.1 -X small      |    176.8|  74.0|  35.2|rANS++,tok3,fqz            |
-|-V3.1 -X archive    |    171.9| 127.9|  34.9|rANS++,tok3,fqz,bzip2,arith|
-||||||
-|-V4.0 -X fast       |    251.2|  28.9|   9.6|rANS++                     |
-|-V4.0 (default)     |    182.1|  32.9|   8.2|rANS++,tok3                |
-|-V4.0 -X small      |    170.9|  70.9|  35.0|rANS++,tok3,fqz            |
-|-V4.0 -X archive    |    166.9| 116.4|  34.2|rANS++,tok3,fqz,bzip2,arith|
-
-We also tested on a small human aligned HiSeq run (ERR317482)
-representing older Illumina data with pre-binning era quality values.
-This dataset shows less impressive gains with 4.0 over 3.0 in the
-default profile, but major gains in small profile once fqzcomp quality
-encoding is enabled.
-
-Note for this file, the file sizes are larger meaning less disk
-caching is possible (the test machine wasn't a memory stressed
-desktop).  Threading was also enabled, albeit with just 4 threads,
-which further exacerbates I/O bottlenecks.  The previous test
-demonstrated BAM being faster to read than CRAM, but with large files
-in a more I/O stressed situation this test demonstrates the default
-profile of CRAM is faster to read than BAM, due to the smaller I/O
-footprint.
-
-NB: the table below was produced with 1.14.12.
-
-|Scramble opts.         |Size(MB) |Enc(s)|Dec(s)|Codecs used                     |
-|--------------------   |--------:|-----:|-----:|--------------------------------|
-|-t4 -O bam (default)   |    6526 | 115.4|  44.7|bgzf(libdeflate)                |
+|-O bam (default)    |    518.2|  65.8|   5.7|bgzf(zlib)                 |
+|-O bam -1           |    584.5|  17.4|   3.5|bgzf(libdeflate)           |
+|-O bam (default)    |    524.6|  27.8|   2.9|bgzf(libdeflate)           |
+|-O bam -9           |    486.5| 810.4|   3.0|bgzf(libdeflate)           |
 ||||||
-|-t4 -V2.0 -X fast      |    3674 |  87.4|  31.4|(default, level 1)              |
-|-t4 -V2.0 (default)    |    3435 |  91.4|  30.7|(default)                       |
-|-t4 -V2.0 -X small     |    3373 | 145.5|  47.8|bzip2                           |
-|-t4 -V2.0 -X archive   |    3377 | 166.3|  49.7|bzip2                           |
-|-t4 -V2.0 -X archive -9|    3125 |1900.6|  76.9|bzip2                           |
+|-V2.0 -X fast       |    294.5|  23.1|   7.8|(default, level 1)         |
+|-V2.0 (default)     |    252.3|  32.9|   8.0|(default)                  |
+|-V2.0 -X small      |    208.0|  85.2|  23.5|bzip2                      |
+|-V2.0 -X archive    |    206.0|  88.1|  24.3|bzip2                      |
 ||||||
-|-t4 -V3.0 -X fast      |    3620 |  88.3|  29.3|(default, level 1)              |
-|-t4 -V3.0 (default)    |    3287 |  90.5|  29.5|(default)                       |
-|-t4 -V3.0 -X small     |    3238 | 128.5|  40.3|bzip2                           |
-|-t4 -V3.0 -X archive   |    3220 | 164.9|  50.0|bzip2                           |
-|-t4 -V3.0 -X archive -9|    3115 |1866.6|  75.2|bzip2, lzma                     |
+|-V3.0 -X fast       |    241.1|  19.7|   8.5|(default, level 1)         |
+|-V3.0 (default)     |    208.5|  23.0|   8.8|(default)                  |
+|-V3.0 -X small      |    201.7|  60.0|  14.5|bzip2                      |
+|-V3.0 -X archive    |    199.9|  61.7|  13.6|bzip2                      |
 ||||||
-|-t4 -V3.1 -X fast      |    3611 |  87.9|  29.2|rANS++                          |
-|-t4 -V3.1 (default)    |    3161 |  88.8|  29.7|rANS++,tok3                     |
-|-t4 -V3.1 -X small     |    2249 | 192.2| 146.1|rANS++,tok3,fqz                 |
-|-t4 -V3.1 -X archive   |    2157 | 235.2| 127.5|rANS++,tok3,fqz,bzip2,arith     |
-|-t4 -V3.1 -X archive   |    2145 | 480.3| 128.9|rANS++,tok3,fqz,bzip2,arith,lzma|
+|-V3.1 -X fast       |    237.1|  22.1|   7.9|rANS++                     |
+|-V3.1 (default)     |    175.8|  26.7|   8.9|rANS++,tok3                |
+|-V3.1 -X small      |    166.9|  47.9|  24.6|rANS++,tok3,fqz            |
+|-V3.1 -X archive    |    162.2|  72.5|  20.5|rANS++,tok3,fqz,bzip2,arith|
 ||||||
-|-t4 -V4.0 -X fast      |    3551 |  87.8|  29.5|rANS++                          |
-|-t4 -V4.0 (default)    |    3148 |  88.9|  30.0|rANS++,tok3                     |
-|-t4 -V4.0 -X small     |    2236 | 189.7| 142.6|rANS++,tok3,fqz                 |
-|-t4 -V4.0 -X archive   |    2139 | 226.7| 127.5|rANS++,tok3,fqz,bzip2,arith     |
-|-t4 -V4.0 -X archive -9|    2132 | 453.5| 128.2|rANS++,tok3,fqz,bzip2,arith,lzma|
+|-V4.0 -X fast       |    227.5|  16.6|   6.2|rANS++                     |
+|-V4.0 (default)     |    172.8|  19.7|   6.3|rANS++,tok3                |
+|-V4.0 -X small      |    162.3|  34.8|  20.2|rANS++,tok3,fqz            |
+|-V4.0 -X archive    |    157.9|  82.2|  26.2|rANS++,tok3,fqz,bzip2,arith|
 
 
 Building


=====================================
configure.ac
=====================================
@@ -1,5 +1,5 @@
 dnl Process this file with autoconf to produce a configure script.
-AC_INIT(io_lib, 1.14.15)
+AC_INIT(io_lib, 1.15.0)
 IOLIB_VERSION=$PACKAGE_VERSION
 IOLIB_VERSION_MAJOR=`expr "$PACKAGE_VERSION" : '\([[0-9]]*\)'`
 IOLIB_VERSION_MINOR=`expr "$PACKAGE_VERSION" : '[[0-9]]*\.\([[0-9]]*\)'`
@@ -69,7 +69,7 @@ AX_SUBDIRS_CONFIGURE([htscodecs],[[--disable-shared],[--with-pic]])
 #       libstaden-read.so.1.1.0
 
 VERS_CURRENT=15
-VERS_REVISION=2
+VERS_REVISION=3
 VERS_AGE=1
 AC_SUBST(VERS_CURRENT)
 AC_SUBST(VERS_REVISION)


=====================================
debian/changelog
=====================================
@@ -1,3 +1,11 @@
+staden-io-lib (1.15.0-1) UNRELEASED; urgency=medium
+
+  * Team upload.
+  * New upstream version
+  * Standards-Version: 4.6.2 (routine-update)
+
+ -- Étienne Mollier <emollier@debian.org>  Sat, 20 Jan 2024 18:48:24 +0100
+
 staden-io-lib (1.14.15-1) unstable; urgency=medium
 
   * New upstream version (bugfix release needed for salmon)


=====================================
debian/control
=====================================
@@ -13,7 +13,7 @@ Build-Depends: debhelper-compat (= 13),
                d-shlibs,
                libbz2-dev,
                liblzma-dev
-Standards-Version: 4.6.1
+Standards-Version: 4.6.2
 Vcs-Browser: https://salsa.debian.org/med-team/staden-io-lib
 Vcs-Git: https://salsa.debian.org/med-team/staden-io-lib.git
 Homepage: https://github.com/jkbonfield/io_lib


=====================================
debian/patches/series
=====================================
@@ -2,3 +2,4 @@
 pathmax.patch
 fix_fseeko.patch
 usedebianhtscodecs.patch
+typo.patch


=====================================
debian/patches/typo.patch
=====================================
@@ -0,0 +1,17 @@
+Description: fix typo caught by lintian.
+Author: Étienne Mollier <emollier@debian.org>
+Forwarded: no
+Last-Update: 2024-01-20
+---
+This patch header follows DEP-3: http://dep.debian.net/deps/dep3/
+--- staden-io-lib.orig/man/man3/read_scf.3
++++ staden-io-lib/man/man3/read_scf.3
+@@ -66,7 +66,7 @@
+ .SH RETURN VALUES
+ .LP
+ On successful completion, the \fBread_scf()\fR and \fBfread_scf()\fR functions
+-return a pointer to a \fBScf\fR structure. Othewise these functions return a
++return a pointer to a \fBScf\fR structure. Otherwise these functions return a
+ null pointer.
+ .LP
+ On successful completion, the \fBread_scf_header()\fR function returns 0.


=====================================
debian/staden-io-lib-examples.lintian-overrides
=====================================
@@ -1,2 +1,2 @@
 # The scripts are examples and intentionally put there
-staden-io-lib-examples: script-in-usr-share-doc usr/share/doc/staden-io-lib/test/*
+staden-io-lib-examples: script-in-usr-share-doc [usr/share/doc/staden-io-lib/test/*]


=====================================
io_lib/cram_encode.c
=====================================
@@ -535,6 +535,7 @@ static int cram_encode_slice_read(cram_fd *fd,
     int32_t i32;
     int64_t i64;
     unsigned char uc;
+    int explicit_qual = 0;
 
     //fprintf(stderr, "Encode seq %d, %d/%d FN=%d, %s\n", rec, core->byte, core->bit, cr->nfeature, s->name_ds->str + cr->name);
 
@@ -609,11 +610,6 @@ static int cram_encode_slice_read(cram_fd *fd,
     /* Aux tags */
     r |= h->codecs[DS_TL]->encode(s, h->codecs[DS_TL], (char *)&cr->TL, 1);
 
-    // qual
-    r |= h->codecs[DS_QS]->encode(s, h->codecs[DS_QS],
-				  (char *)BLOCK_DATA(s->qual_blk) + cr->qual,
-				  cr->len);
-
     // features (diffs)
     if (!(cr->flags & BAM_FUNMAP)) {
 	int prev_pos = 0, j;
@@ -686,6 +682,7 @@ static int cram_encode_slice_read(cram_fd *fd,
 		uc  = f->B.qual;
 		r |= h->codecs[DS_QS]->encode(s, h->codecs[DS_QS],
 					      (char *)&uc, 1);
+		explicit_qual++;
 		break;
 
 	    case 'b':
@@ -700,6 +697,7 @@ static int cram_encode_slice_read(cram_fd *fd,
 		uc  = f->Q.qual;
 		r |= h->codecs[DS_QS]->encode(s, h->codecs[DS_QS],
 					      (char *)&uc, 1);
+		explicit_qual++;
 		break;
 
 	    case 'N':
@@ -736,6 +734,11 @@ static int cram_encode_slice_read(cram_fd *fd,
 	    r |= h->codecs[DS_BA]->encode(s, h->codecs[DS_BA], seq, cr->len);
     }
 
+    // qual
+    r |= h->codecs[DS_QS]->encode(s, h->codecs[DS_QS],
+				  (char *)BLOCK_DATA(s->qual_blk) + cr->qual
+				  + explicit_qual, cr->len);
+
     return r ? -1 : 0;
 }
 


=====================================
progs/scramble.c
=====================================
@@ -184,7 +184,7 @@ static int filter_tags(bam_seq_t *s, char *aux_filter, int keep) {
 
 static void usage(FILE *fp) {
     fprintf(fp, "  -=- sCRAMble -=-     version %s\n", IOLIB_VERSION);
-    fprintf(fp, "Author: James Bonfield, Wellcome Trust Sanger Institute. 2013-2022\n\n");
+    fprintf(fp, "Author: James Bonfield, Wellcome Trust Sanger Institute. 2013-2023\n\n");
 
     fprintf(fp, "Usage:    scramble [options] [input_file [output_file]]\n");
 
@@ -504,10 +504,6 @@ int main(int argc, char **argv) {
 	fprintf(stderr, "\nWARNING: this version of CRAM is not a recognised GA4GH standard.\n"
 		"Note this CRAM version is a technology demonstration only.\n"
 		"Future versions of Scramble may not be able to read these files.\n\n");
-    } else if (cram_default_version() > 300) {
-	fprintf(stderr, "\nWARNING: this version of CRAM has yet to be formally signed off.\n"
-		"CRAM 3.1 has multiple implementations that have been cross-validated, but\n"
-		"the specification document has not yet been accepted as an official standard.\n\n");
     }
 
     if (argc - optind > 2) {


=====================================
tests/data/xx#MD.sam
=====================================
@@ -1,7 +1,7 @@
 @SQ	SN:xx	LN:30
 @CO	All MD and NM should match the stored values
 a	0	xx	6	1	10M	*	0	0	AAAAATTTTT	*	co:Z:no fields
-a	0	xx	6	1	10M	*	0	0	AAAAGGTTTT	*
+a	0	xx	6	1	11M	*	0	0	AAAAGRTTTTT	ABCDEFGHIJK
 a	0	xx	6	1	10M	*	0	0	GAAAATTTTG	*
 i	0	xx	6	1	5M1I5M	*	0	0	AAAAAGTTTTT	*
 i	0	xx	6	1	5M3I5M	*	0	0	AAAAAGGGTTTTT	*
@@ -11,12 +11,12 @@ d	0	xx	6	1	5M10D5M	*	0	0	AAAAACCCCC	*
 d	0	xx	6	1	5M10N5M	*	0	0	AAAAACCCCC	*
 sid	0	xx	6	1	1S4M10D5I4M1S	*	0	0	AAAAAGGGGGCCCCC	*
 a	0	xx	6	1	10M	*	0	0	AAAAATTTTT	*	MD:Z:10	NM:i:0	co:Z:correct fields
-a	0	xx	6	1	10M	*	0	0	AAAAGGTTTT	*	MD:Z:4A0T4	NM:i:2
+a	0	xx	6	1	11M	*	0	0	AAAAGRTTTTT	ABCDEFGHIJK	MD:Z:4A0T4Y0	NM:i:3
 a	0	xx	6	1	10M	*	0	0	GAAAATTTTG	*	MD:Z:0A8T0	NM:i:2
 i	0	xx	6	1	5M1I5M	*	0	0	AAAAAGTTTTT	*	MD:Z:10	NM:i:1
 i	0	xx	6	1	5M3I5M	*	0	0	AAAAAGGGTTTTT	*	MD:Z:10	NM:i:3
 i	0	xx	6	1	10M2I	*	0	0	AAAAATTTTTCC	*	MD:Z:10	NM:i:2
 i	0	xx	6	1	10M2P2I	*	0	0	AAAAATTTTTCC	*	MD:Z:10	NM:i:2
-d	0	xx	6	1	5M10D5M	*	0	0	AAAAACCCCC	*	MD:Z:5^TTTTTTTTTT5	NM:i:10
+d	0	xx	6	1	5M10D5M	*	0	0	AAAAACCCCC	*	MD:Z:5^TTTTTYTTTT5	NM:i:10
 d	0	xx	6	1	5M10N5M	*	0	0	AAAAACCCCC	*	MD:Z:10	NM:i:0
-sid	0	xx	6	1	1S4M10D5I4M1S	*	0	0	AAAAAGGGGGCCCCC	*	MD:Z:4^ATTTTTTTTT0T3	NM:i:16
+sid	0	xx	6	1	1S4M10D5I4M1S	*	0	0	AAAAAGGGGGCCCCC	*	MD:Z:4^ATTTTTYTTT0T3	NM:i:16


=====================================
tests/data/xx#MD2.sam
=====================================
@@ -1,7 +1,8 @@
 @SQ	SN:xx	LN:30
 @CO	All MD and/or NM should differ to the stored values
 a	0	xx	6	1	10M	*	0	0	AAAAATTTTT	*	MD:Z:9	NM:i:0	co:Z:MD incorrect fields
-a	0	xx	6	1	10M	*	0	0	AAAAGGTTTT	*	MD:Z:4A0A4	NM:i:2
+a	0	xx	6	1	11M	*	0	0	AAAAGGTTTTT	*	MD:Z:4A0T4Y0	NM:i:2
+a	0	xx	6	1	11M	*	0	0	AAAAGGTTTTT	*	MD:Z:4A0T4N0	NM:i:3
 a	0	xx	6	1	10M	*	0	0	GAAAATTTTG	*	MD:Z:0G8T0	NM:i:2
 i	0	xx	6	1	5M1I5M	*	0	0	AAAAAGTTTTT	*	MD:Z:11	NM:i:1
 i	0	xx	6	1	5M3I5M	*	0	0	AAAAAGGGTTTTT	*	MD:Z:1A1	NM:i:3


=====================================
tests/data/xx#rg.sam
=====================================
@@ -1,5 +1,5 @@
 @HD	VN:1.4	SO:coordinate
-@SQ	SN:xx	LN:30	AS:?	SP:?	UR:?	M5:bbf4de6d8497a119dda6e074521643dc
+@SQ	SN:xx	LN:30	AS:?	SP:?	UR:?	M5:1224b81d8664d77635e1620d7f2c1523
 @RG	ID:x1	SM:x1
 @RG	ID:x2	SM:x2	LB:x	PG:foo:bar	PI:1111
 @PG	ID:emacs	PN:emacs	VN:23.1.1


=====================================
tests/data/xx.fa
=====================================
@@ -1,5 +1,5 @@
 >xx
-AAAAAAAAAATTTTTTTTTTCCCCCCCCCC
+AAAAAAAAAATTTTTYTTTTCCCCCCCCCC
 >yy
 AAAAAAAAAATTTTTTTTTT
 



View it on GitLab: https://salsa.debian.org/med-team/staden-io-lib/-/compare/9581f379e25ae0fe38083034fb2f134f3a4219f5...75c3b71354eb0ac0db1b7654dc7b9cf082d21762


