[med-svn] [htslib] annotated tag 1.4 created (now 8f87475)

Andreas Tille tille at debian.org
Tue Mar 21 18:04:45 UTC 2017


This is an automated email from the git hooks/post-receive script.

tille pushed a change to annotated tag 1.4
in repository htslib.

        at  8f87475   (tag)
   tagging  d2d9c76ade2df2b63b9cf79ae8decda1dfadc042 (commit)
  replaces  1.3.2
 tagged by  jenniferliddle
        on  Mon Mar 13 14:49:35 2017 +0000

- Log -----------------------------------------------------------------
Release 1.4 (13 March 2017)

* Incompatible changes: several functions and data types have been changed
  in this release, and the shared library soversion has been bumped to 2.

  - bam_pileup1_t has an additional field (which holds user data)
  - bam1_core_t has been modified to allow for >64K CIGAR operations
    and (along with bam1_t) so that CIGAR entries are aligned in memory
  - hopen() has vararg arguments for setting URL scheme-dependent options
  - the various tbx_conf_* presets are now const
  - auxiliary fields in bam1_t are now always stored in little-endian byte
    order (previously this depended on whether a BAM, SAM or CRAM file
    was read)
  - index metadata (accessible via hts_idx_get_meta()) is now always
    stored in little-endian byte order (previously this depended on
    whether the index was in tbi or csi format)
  - bam_aux2i() now returns an int64_t value
  - fai_load() will no longer save local copies of remote fasta indexes
  - hts_idx_get_meta() now takes a uint32_t * for l_meta (was int32_t *)

* HTSlib now links against libbz2 and liblzma by default.  To remove these
  dependencies, run configure with options --disable-bz2 and --disable-lzma,
  but note that this may make some CRAM files produced elsewhere unreadable. 

* Added a thread pool interface and changed the bgzf multi-threading
  code to use this pool.  BAM and CRAM decoding is now multi-threaded
  too, using the pool to automatically balance the number of threads
  between decode, encode and any data processing jobs.

* New errmod_cal(), probaln_glocal(), sam_cap_mapq(), and sam_prob_realn()
  functions, previously internal to SAMtools, have been added to HTSlib.

* Files can now be accessed via Google Cloud Storage using gs: URLs, when
  HTSlib is configured to use libcurl for network file access rather than
  the included basic knetfile networking.

* S3 file access now also supports the "host_base" setting in the
  $HOME/.s3cfg configuration file.

* Data URLs ("data:,text") now follow the standard RFC 2397 format and may
  be base64-encoded (when written as "data:;base64,text") or may include
  percent-encoded characters.  HTSlib's previous over-simplified "data:text"
  format is no longer supported -- you will need to add an initial comma.
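
The data URL forms described above can be illustrated with a small
stand-alone decoder.  This is only a sketch under stated assumptions, not
HTSlib's implementation: the sketch_* names are hypothetical, the base64
branch ignores percent-encoding, and the old comma-less "data:text" form
is rejected.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int sketch_hex(int c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Value of one base64 digit, or -1 if c is outside the base64 alphabet. */
static int sketch_b64(int c)
{
    if (c >= 'A' && c <= 'Z') return c - 'A';
    if (c >= 'a' && c <= 'z') return c - 'a' + 26;
    if (c >= '0' && c <= '9') return c - '0' + 52;
    if (c == '+') return 62;
    if (c == '/') return 63;
    return -1;
}

/* Hypothetical helper: decode the payload of a data URL into out
   (NUL-terminated).  Returns the decoded length, or -1 on malformed
   input or when out is too small. */
static long sketch_data_url_decode(const char *url, char *out, size_t out_size)
{
    assert(url != NULL && out != NULL);
    if (strncmp(url, "data:", 5) != 0) return -1;
    const char *comma = strchr(url + 5, ',');
    if (comma == NULL) return -1;      /* old "data:text" form has no comma */

    size_t meta_len = (size_t)(comma - (url + 5));
    int is_base64 = meta_len >= 7 && strncmp(comma - 7, ";base64", 7) == 0;
    const char *p = comma + 1;
    size_t n = 0;

    if (is_base64) {
        unsigned acc = 0;
        int bits = 0;
        for (; *p != '\0' && *p != '='; p++) {   /* stop at padding */
            int v = sketch_b64((unsigned char)*p);
            if (v < 0) return -1;
            acc = (acc << 6) | (unsigned)v;
            bits += 6;
            if (bits >= 8) {                     /* a full byte is available */
                bits -= 8;
                if (n + 1 >= out_size) return -1;
                out[n++] = (char)((acc >> bits) & 0xff);
            }
        }
    } else {
        for (; *p != '\0'; p++) {
            int c = (unsigned char)*p;
            if (c == '%') {                      /* percent-encoded byte */
                int hi = sketch_hex((unsigned char)p[1]);
                int lo = hi >= 0 ? sketch_hex((unsigned char)p[2]) : -1;
                if (lo < 0) return -1;
                c = hi * 16 + lo;
                p += 2;
            }
            if (n + 1 >= out_size) return -1;
            out[n++] = (char)c;
        }
    }
    if (n >= out_size) return -1;
    out[n] = '\0';
    return (long)n;
}
```

Under these assumptions, "data:,hello%20world" and "data:;base64,aGVsbG8="
both decode successfully, while "data:hello" is reported as an error.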

* When plugins are enabled, S3 support is now provided by a separate
  hfile_s3 plugin rather than by hfile_libcurl itself as previously.
  When --enable-libcurl is used, by default both GCS and S3 support
  and plugins will also be built; they can be individually disabled
  via --disable-gcs and --disable-s3.

* The iRODS file access plugin has been moved to a separate repository.
  Configure no longer has a --with-irods option; instead build the plugin
  found at <https://github.com/samtools/htslib-plugins>.

* APIs to portably read and write (possibly unaligned) data in little-endian
  byte order have been added.

* New functions bam_auxB_len(), bam_auxB2i() and bam_auxB2f() have been
  added to make accessing array-type auxiliary data easier.  bam_aux2i()
  can now return the full range of values that can be stored in an integer
  tag (including unsigned 32 bit tags).  bam_aux2f() will return the value
  of integer tags (as a double) as well as floating-point ones.  All of
  the bam_aux2 and bam_auxB2 functions will set errno if the requested
  conversion is not valid.

* New functions fai_load3() and fai_build3() allow fasta indexes to be
  stored in a different location to the indexed fasta file.

* New functions bgzf_index_dump_hfile() and bgzf_index_load_hfile()
  allow bgzf index files (.gzi) to be written to / read from an existing
  hFILE handle.

* hts_idx_push() will now report an error when asked to add a range that
  is beyond the limits that the given index can handle.  This means trying
  to index chromosomes longer than 2^29 bases with a .bai or .tbi index
  will report an error instead of apparently working but creating an invalid
  index entry.
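
The 2^29 limit follows from the shape of the binning index: the largest
addressable coordinate is 2^(min_shift + 3 * n_lvls).  The sketch below
assumes the conventional .bai/.tbi parameters (min_shift = 14, n_lvls = 5);
the function names are hypothetical and the check only mirrors the idea,
not the code inside hts_idx_push().

```c
#include <assert.h>
#include <stdint.h>

/* Largest coordinate (exclusive) that a binning index can address:
   2^(min_shift + 3 * n_lvls). */
static int64_t sketch_index_max_pos(int min_shift, int n_lvls)
{
    assert(min_shift >= 0 && n_lvls >= 0 && min_shift + 3 * n_lvls < 63);
    return (int64_t)1 << (min_shift + 3 * n_lvls);
}

/* Hypothetical check: returns 0 if the half-open range [beg, end) fits
   within the index limits, -1 otherwise. */
static int sketch_check_range(int64_t beg, int64_t end,
                              int min_shift, int n_lvls)
{
    int64_t max_pos = sketch_index_max_pos(min_shift, n_lvls);
    if (beg < 0 || end < beg || end > max_pos) return -1;
    return 0;
}
```

With min_shift = 14 and n_lvls = 5 the limit is 536870912 bases, so a range
ending beyond that position is rejected rather than silently mis-binned.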

* VCF formatting is now approximately 4x faster.  (Whether this is
  noticeable depends on what was creating the VCF.)

* CRAM lossy_names mode now works with TLEN of 0 or TLEN within +/- 1
  of the computed value.  Note in these situations TLEN will be
  generated / fixed during CRAM decode.

* CRAM now supports bzip2 and lzma codecs.  Within htslib these are
  disabled by default, but can be enabled by specifying "use_bzip2" or
  "use_lzma" in an hts_opt_add() call or via the mode string of the
  hts_open_format() function.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1

iEYEABECAAYFAljGsX8ACgkQJ6l3bgmDfuxvxwCfWxvlZp9Z37iueklye7UV67XH
DUcAoI685j2xf2RYrSwC7CmstVmAou+p
=npIH
-----END PGP SIGNATURE-----

Anders Kaplan (3):
      Fixed a few non-portable constructs.
      Added a description of how the thread pool test program works.
      Added missing #include in test_view.c.

Andreas (Kusalananda) Kähäri (1):
      Include <sys/select.h>

Daniel Cooke (1):
      Suppress index date warning when hts_verbose == 0

James Bonfield (95):
      Minor fix to lzma error message.
      Added a safe_ltf8_decode function to go along with the itf8 variant.
      Fixed multi threaded partial decoding. (Eg quitting after N reads)
      Fixed a bug where we weren't setting fd->first_container on V2.x or
      Code changes to synchronise cram_encode.c (mostly) with io_lib.
      Merged in several cram_codecs changes from io_lib.
      Improved error checking, copied over from io_lib.
      Added a BASES_PER_SLICE hts option.
      CRAM encoding now puts auxiliary tags in their own blocks.
      Add prototype for cram_update_curr_slice to avoid warning in cram_io.c.
      Sped up rANS decoders by 6% (O0) to 10% (O1).
      Removed use of CRAM "CORE" block while encoding.
      Promoted cram/thread_pool code to the top-level and use it within
      Tidied up the mass of #ifdefs.
      Further updates to multi-threading.
      Fixed dependencies & .mk defs so Samtools links too (threading change).
      Fixed uninitialised memory (queue shutdown).
      Fixed out by one error in bin calculation (CRAM -> BAM).
      Deleted the broken code in zfopen() when HAVE_POPEN is defined.
      Sam_index_build(2) now returns -4 for failure to save/create the index.
      Fixed CRAM MD/NM generation to follow the same logic as calmd.
      Tidied up pointless while/if duplication.
      Draft of the multi-threaded decoder.
      Committing as it's a working MT decoder now.
      Attempt to use shared pool.
      Added a lossy read-name option (CRAM_OPT_LOSSY_NAME).
      Made the CRAM_OPT_PREFIX option external via the "name_prefix" option.
      Expose cram_get_refs() function to return the opaque refs_t data type.
      Major revamp of thread pool and associated tests.  In particular the
      Bumped the sample size for the tests, and made all tests share a
      White space updates (tabs -> spaces).
      Sorry, updates to the previous commits too; forgetting to add this file!
      Updates to cope with the new thread pool API.
      Removed some debugging and changed the default queue sizes to be smaller.
      Improvements to avoid boom & bust scenarios.
      Culled the duplicate is_compressed assignment.
      Various memory leaks fixed in bgzf multi-threading.
      First apparent working bgzf_seek implementation.
      Added HTS_OPT_CACHE_SIZE option to specify the bgzf cache size.  This
      Made thread job serial numbers 64-bit.
      Improvements to multi-threading encode/decode/transcode.
      Changed the HTS_OPT_THREAD_POOL argument from a t_pool pointer to a
      bgzf_check_EOF now works when input is multi-threaded.
      Htslib now copes with zero length Z and H aux tags.
      Propagate read error in thread breader ack to fp->errcode error in main thread.
      Deleted the now defunct read_eof variable.
      Fixed bgzf_getc and bgzf_getline to work in multi-threaded mode.
      Added support for multi-threaded BAM indexing.
      Fixed CRAM_OPT_THREAD_POOL for cram.
      Merge branch 'develop' of https://github.com/samtools/htslib into threading_pool
      Bug fix to hts_get_bgzfp.
      Improvements for VCF/BCF multi-threading.
      Fixed file-descriptor leak in refs_load_fai().
      Make sure we destroy the thread_pool when created by ourselves.
      Merge branch 'develop' of https://github.com/samtools/htslib into threading_pool
      Bug fix to bgzf_read, which was breaking samtools index (and more).
      PTHREAD_MUTEX_RECURSIVE_NP vs PTHREAD_MUTEX_RECURSIVE.
      Code tidyup.
      test_view now also verified multi-threading.
      Removal of debugging output.
      Migrated the thread pool structures to thread_pool_internal.h.
      Renamed various thread pool structs/functions.
      Culled the DEBUG_TIME code in thread_pool.
      Further tidying up of queue vs process; mostly comments and docs.
      Speed up to probaln_glocal.
      Protect against sequences starting beyond reference end.
      Rebased PR#387 and minor code formatting fixes (trailing white space,
      Added bgzip to check/test dependency.
      Fixed a race condition in the multi-threaded cram encoder.
      Fixed a double free in multi-threaded CRAM and regions.
      Added callback + client data hooks to pileup iterators and pileup struct.
      Factored in the renormalisation to the f[] computation.
      Cosmetic: 0 to NULL
      Replaced BSD license with MIT license for consistency.
      Merge pull request #438 from daviesrob/cram_afl
      Fixed a rare renormalisation bug in the rANS codec.
      Added a kputd for %g specialisation.
      Merge Google Cloud Storage support (PR #446)
      Permit CRAM lossy_names mode to accept TLEN 0 or TLEN +/- 1.
      Fixed lzma memory limit.
      Document the --disable-lzma and --disable-bz2 configure options
      Fixed dead-lock case in seek + multi-threaded decode.
      Merge PR #395 (Add a kputd for %g specialisation).
      Adjusted prototype for kputd to be consistent with other kput functions.
      Fixed bgzf threading dead-lock when trying to read beyond EOF.
      Fixes for dealing with raw gzip streams.
      Further thread pool fixes.
      Merge commit PR #459 (Fix undefined behaviour and improve endian-related behaviour)
      Fixed bgzf_gzip_compress when given uncompressable data.
      Fixed data corruption when switching to threads part way through a stream.
      Remove MacOS X dead-lock in bgzf threading.
      Fix to iterators when the query overlaps zero bins.
      Mention thread pool changes.
      Merged PR#463 (Configure BZ2/LZMA and make htslib.pc more accurate).
      News updates (from historical commits).

Joe Rayner (2):
      Added bgzf_block_write() and rebgzip option.
      Added bgzip test files.

John Marshall (90):
      Add {errmod,kprobaln,bam_md}.[ch] from samtools
      Retain only library parts of realn.c; canonicalise whitespace
      Add errmod declarations to htslib/hts.h and errmod.o to Makefile
      Rename kpa_glocal() to probaln_glocal() and add to htslib/hts.h
      Rename as sam_cap_mapq()/sam_prob_realn() and add to htslib/sam.h
      Remove probaln_par_def/probaln_par_alt constants
      Add errmod / probaln / prob_realn to HTSlib API (PR #343)
      Remove unused variable [minor]
      Fix len parameter type [minor]
      Fix potential infinite loop [minor]
      Replace bcf_hdr_fmt_text(), which can't handle huge headers
      Read BCF header's l_text as unsigned
      Merge fixes for huge BCF headers (PR #373)
      Add curl_kput(), which slurps an URL into a kstring
      knetfile.c: Only emit Range header if needed
      Merge hts_itr_query(HTS_IDX_NOCOOR) fixes (PR #376)
      Reorder Makefile dependencies [minor]
      Use internal plain char isdigit_c()/etc ctype functions
      Write CRAM ref cache via hFILE rather than stdio
      Add doxygen @file documentation
      Use hts_get_bgzfp() when format.compression==bgzf
      Not with tabs, James! [minor]
      Show coordinates in "unsorted positions" error message
      Merge CRAM lossy name compression (PR #326)
      Detect shared library and plugin types during configure
      stringify_argv(): suppress trailing space
      Build DLL and plugins on Cygwin
      Use hFILE rather than stdio when reading indices
      Document required Cygwin (and RPM-style) devel packages
      Add is_cram flag to distinguish dummy hts_itr_t objects
      Add print-config target
      Generate config.h.in with autoheader
      Add configure check for fdatasync()
      Use finer-grained $(INSTALL_LIB) and $(INSTALL_MAN) macros
      hts_itr_query(): discard chunks far beyond the query region
      Avoid linguist mis-classification [minor]
      Allocate BGZF::uncompressed_block/compressed_block together
      Allow plugins to select RTLD_LOCAL or RTLD_GLOBAL
      Use native Doxygen API documentation markup
      [faidx.h] Use native Doxygen API documentation markup
      Remove iRODS plugin, which has moved to samtools/htslib-plugins
      Embed version number directly in hfile_libcurl plugin
      Discard distant chunks based on binning index, not linear index
      [tabix man page] Note coordinate arguments are 1-based inclusive
      Treat regions [-1,n) as [0,n) when indexing
      Rewrite #ifdeffed-out use of now-removed variable [minor]
      Merge version number bump and NEWS file from master
      Bump SOVERSION to 2 and note ABI incompatibility in NEWS
      Merge (ABI-changing!) mpileup callbacks (PR #398)
      Use <inttypes.h> instead of old WIN32-specific code
      Avoid extraneous #includes
      Merge threading pool API (PR #397)
      Don't redefine thread_pool.h typedefs
      Activate auxf#values_java.cram test
      Ensure headers compile by themselves [minor]
      Add fixed/immobile hFILE buffers
      Implement base64-encoded data: URLs
      Add JSON format and very basic recognition
      Merge CRAM updates, sync with io_lib implementation (PR #361)
      Add hopen() varargs; use them for HTTP headers in hfile_libcurl.c
      Add JSON tokeniser / lexer
      Refactor incidental uses of kstream
      Remove htsFile's use of kstream
      Parse GA4GH Retrieval protocol and handle redirects
      Fix hseek() already-read buffer reuse bug
      Fix hFILE write-after-read bug
      Merge JSON-based GA4GH redirection file access protocol (PR #439)
      Also handle uncompressed (raw) BCF for vcf_sweep
      Propagate error return codes from hts_getline()
      Use hFILE to read htsFile::fn_aux FAI file
      Make htsfile work with (e.g. GA4GH) redirects
      Add "httphdr", "httphdr:l", and "va_list" hopen() options
      Constify extern tbx_conf_* preset variables
      Add support for Google Cloud Storage pseudo-URLs
      Add missing entries
      Change bam1_core_t::n_cigar from uint16_t to uint32_t
      Happy New Year
      Alter bam1_t data layout so that CIGAR data is 32-bit aligned
      Fix test/compare_sam.pl -Baux on 32-bit platforms
      Support custom S3 endpoint host_base setting (in .s3cfg)
      Split S3 parts of hfile_libcurl.c into separate hfile_s3.c
      Ensure max_off is -1 when end bin overflows
      Fix whitespace, shorten help string [minor]
      Move S3 support from hfile_libcurl.c to hfile_s3.c
      Merge separate hfile_s3.c code
      Don't FAILONERROR at high verbosity and other minor libcurl changes
      Add `htsfile -cv` raw view mode for unknown file formats
      Add bgzf_compression(); reuse check_header() in bgzf_is_bgzf()
      Add BZ2/LZMA to configure.ac and infrastructure to config.pc.in
      Add -rdynamic/-ldl to htslib.pc.in's static_* variables when needed

Martin O. Pollard (1):
      Create API to check EOF on all htsFile that support EOF block

Nathan T. Weeks (1):
      Define _XOPEN_SOURCE so that PTHREAD_MUTEX_RECURSIVE is defined

Olivier Cinquin (2):
      Provide more informative error message when unknown SAM tag type is encountered.
      Undefine macro after it has served its purpose (no functional change).

Petr Danecek (9):
      New bam_mplp_reset function to allow mplp in multiple regions
      Handle VCF lines with missing `FORMAT=.`
      Bug fix: 0 is a valid return value of bcf_hrec_find_key
      More thorough INFO cleaning to prevent issues like https://github.com/samtools/bcftools/issues/428
      Turn off autodetection when -s,-b,-e,-0,-c,-S, or -p are given
      propagate vcf errors from synced reader
      BGZF skip empty blocks, do not give up reading prematurely
      Reworked synced VCF/BCF reading
      Prevent infinite loop on empty indexes

Rob Davies (39):
      Make hts_itr_query find no-coor reads when last reference is unused.
      Don't assume order of sequence_ids when finding HTS_IDX_NOCOOR location.
      Make kstring detect more errors and work better on 64 bit systems
      Add interfaces to hfile for delimited string input
      Fix error handling in cram_index_load.
      Make cram_decode_estimate_sizes handle missing codecs.
      Ensures rANS uncompressors don't read beyond end of input.
      Prevents out-of-bounds array access on ref_id
      Adds more CRAM decoder checks to prevent overrunning input buffers
      Fixes test for enough data when reading the preservation map.
      Prevents wrap-around bugs in allocations.
      Make hts_expand handle realloc failure a bit better.
      Prevent reads past the end of the VCF header.
      Make bcf_read1_core() return error if ks_resize fails, or on short read.
      Add function bcf_record_check() to validate bcf records
      Stop test_cmd from merging stderr with its output.
      Add hts_endian.h to convert little-endian bytes to/from native integers.
      Fix undefined behaviour and improve endian-related behaviour
      Report missing BGZF EOF blocks
      Merge "Further thread pool fixes" branch (PR #465)
      Allow fai index to be in a different location to the indexed file.
      Add bgzf_index_load_hfile and bgzf_index_dump_hfile
      Add bgzf unit tests
      Add more error checks when building indexes
      Add tabix functional tests
      Remove abort on corrupt aux data, pass errors up instead
      Make sam_format1() fail if it finds an invalid aux type
      Deal with bzip2 pkg-config module not being available everywhere
      Fix endianness, integer type and memory safety issues in index metadata
      Add more libraries to static_LIBS, where required
      Create a Makefile fragment with static linking flags
      Remove explicit -lz -lm link flags; add to LIBS instead
      Merge "Provide more informative error message for unknown tag type (PR#444)"
      Prevent segfault due to VCFs with very large IDX tag values
      Default to check for libcurl; don't fail on no -lcrypto for s3 check
      Add libbz2 and liblzma to default libraries in the Makefile
      Add sections on dependencies and making configure to the INSTALL file.
      Travis updates.
      Fix over-specified location of htslib.pc.tmp

Shane McCarthy (4):
      allele trimming bugfix
      kputd: set kstring len correctly for negative exponential values
      vcfutils: replace exit() with return -1 in bcf_remove_allele_set
      bcf_index_build3: return -4 on index write failure as per sam_index_build3

dlaehnemann (1):
      Extended bcf_get_format*() documentation to emphasize difference between

jenniferliddle (4):
      Added bam_aux_update_str()
      Fixes bug in bam_aux_update_str()
      Fix bugs in bam_aux_update_str()
      Release 1.4: summary

mcshane (1):
      allow bcf_index_build2 to index both bcf and vcf

-----------------------------------------------------------------------

No new revisions were added by this update.

-- 
Alioth's /usr/local/bin/git-commit-notice on /srv/git.debian.org/git/debian-med/htslib.git


