[med-svn] [htslib] annotated tag 1.0 created (now d39962b)
Charles Plessy
plessy at moszumanska.debian.org
Mon Aug 18 12:06:21 UTC 2014
This is an automated email from the git hooks/post-receive script.
plessy pushed a change to annotated tag 1.0
in repository htslib.
at d39962b (tag)
tagging f2af2ad9ee1969bcdeb475f614b79981093b3b47 (commit)
tagged by John Marshall
on Fri Aug 15 11:05:28 2014 +0100
- Log -----------------------------------------------------------------
First HTSlib release, supporting SAM/BAM/CRAM/VCF/BCF
Aaron Quinlan (1):
declare bcf_unpack in the header
Andre Masella (1):
Create a pkg-config description for easy linking
Heng Li (255):
Create trunk copy
initial source code. It is BUGGY!
bgzip is more like gzip in its command-line interface
* tabix-0.0.0-1 (r500) * apparently working
* tabix-0.0.0-2 (r501) * accelerate ti_readline()
* tabix-0.0.0-3 (r502) * support meta lines (not tested) * I am going to make the index file in the BGZF format
* tabix-0.0.0-4 (r503) * index files are bgzf compressed
* tabix-0.0.0-5 (r504) * fixed a critical bug in fetching data (a typo in fact) * support SAM (tested on ex1.sam) and VCF (not tested) * improve the command-line interface
documentation
Release tabix-0.1.0
Release tabix-0.1.1
* tabix-0.1.3 (r543) * fixed another off-by-one bug
removed a line of debugging code
* added the format specification * fixed a typo in bgzip
* tabix-0.1.3-2 (r555) * do not overwrite index file by default * a little code cleanup
* tabix-0.1.3-3 (r556) * fixed a small memory leak in knetfile * fixed a minor bug for remote downloading
Release tabix-0.1.4 (r559)
* Release tabix-0.1.5 (r560) * Improve seeking efficiency. Index file needs to be rebuilt.
If nothing bad happens, this will become 0.1.6
Release tabix-0.1.6
* improved C/Perl APIs * added test for Perl * added a tiny example
Release tabix-0.2.0
Release tabix-0.2.1
Release tabix-0.2.2
fixed a bug in C/Java when n_off == 0
patches by Peter Chines
Release tabix-0.2.3
BED support
update to the latest bgzf.*
fixed two bugs due to recent changes
update version number
Release tabix-0.2.4 (r949)
Release tabix-0.2.5 (r964)
updated BGZF
initial checkin
finish parsing ## lines
header parsing finished (not well tested)
towards parsing VCF
BCFv2 quick reference
code backup
a bit code restructuring
encode up to the INFO field
VCF=>BCF, up to QUAL
bcf->vcf, up to FILTER
bcf->vcf, up to INFO
change long to int32_t
towards parsing genotype fields (unfinished)
a little code cleanup
parse genotype fields; not tested yet
restructure the code a bit
to restructure decoding
restructure decoding
working on printing genotypes; not working yet
fixed one bug, more to fix
working on a toy example!!!
another bug fix
add a very simple CLI for testing
if PASS is not in the hdr, add it
use two memory buffers to fix a bug
clean bill by valgrind on a toy example
move the memory buffer to the header
bugfix: extra byte for string
read seq dict from an external file
finished text-in-text-out
code backup
bcf->vcf without inferred information
do not compute inferred info during vcf->bcf
separate shared and individual annotations
rename variables
added simple cmd interface
separated the main function; added Makefile
add external seq-dict to the header
added git ignore
updated the quick reference
bugfix: -l not working; corner case of Number=.
bugfix: segfault given Number=.
when -l is in use, use binary output
parse Flag type
bugfix: in case of no FORMAT
added CNL to the example
clarified quick reference
more consistent Flag; code clean up
do not keep '\0' in sample strings
cleanup
allow arrays longer than 128
bugfix: missing "," in ALT
use three header hash tables instead of one
rename variables
added line counting
routine to unpack FORMAT
collect site allele count
fixed a memory leak
supposedly faster VCF parsing
minor
can be compiled with g++
bugfix: str in GT not encoded properly
bugfix: incorrect float
move alloca() upwards
code backup; not working!
parse n_sample from record lines
back to work
encode GT
phasing info stick to the allele behind
updated BCFv2 spec
merge REF and ALT
added GT encoding
updated to the latest BGZF
bugfix: premature output in bgzf-mt
g++ errors in bgzf-mt
prepare to add sam parsers
separate file I/O
use hts.{h,c} in vcf
replace vcf_verbose with hts_verbose
BAM I/O, SAM output; not tested
compiling err/warning from g++
create separate directory
rename api to htslib
added sam interface; not working
bugfix: wrong returned value from knet_seek()
code backup; NOT compiling
compiled, but certainly not working
working, up to QUAL
SAM parser works on a toy example
report line number
use hts_getline() to keep line number
bugfix: wrongly parse B arrays
added a SAM example
do not check CIGAR if CIGAR==*
bugfix: uninitialized numbers
bugfix: compute "bin" in bam
code backup
added comments
code backup
code backup
minor changes
code backup
separate indexing from hts.c
bam index interface
index load; bugfix: wrong offset of last chr
code backup
add region parser (not tested)
code backup
indexing apparently works for BAM
bugfix: one extra record from random access
fix C++ compiling errors and warnings
code cleanup
removed sam_hdr_t::has_SQ
added the pileup/mpileup interface
fixed g++ warnings
automatically figure out the index file name
forgot to add Makefile
expose bgzf.h
bamshuf: shuffle BAM
added bam2fq
System may have an <endian.h>; merge to hts.h
fixed gcc warnings
added simplified stdint.h, in case it is missing
bam2fq: output SE reads in the PE mode
output the number of reads
bugfix: bamshuf not working when collision
make BCF APIs similar to BAM APIs
added Description to PASS
detailed vcfview cmd prompt
bam2fq: change the option name
generalize the BAM index
unified index seems to work; optimization needed
merge a small bin to its parent
fixed a couple of warnings
changed magic
bugfix: parent merging not working
reduced the min chunk size
BCF2 index; compiled but not working
BCF index working on a toy example
allow longer arrays
allow missing binary index
wrong n_lvls
bugfix missing string not printed properly
avoid printing "\0" in the header
replace 0x7F800001 with bcf_missing_float
stop if not sorted
code backup; NOT working
a bit code cleanup
code cleanup
bugfix: hts index not working when n_seq is unknown
CLI for tabix
tabix apparently works, at least on toy examples
added an example for tabix
format of the new index
minor change to the index description
fixed a few g++ warnings
added command-line help
added a brief README
more flexible about the '\n' and '\0' in hdr
bugfix: wrong integer packing
change the meaning of l_shared (in line with GATK)
output a subset of samples
allow to drop all individual information
smart sample list
bugfix: missing values are not filled sometimes
new indexing algo/fmt; NOT working yet
BAM indexing works again on a toy example
bugfix: bin 0 is not sorted
a bit code cleanup
bugfix in the new index; containing dbg code
bugfix
using -O2
bugfix: update_loff() run twice
handle unmapped aln in SAM
halve HTS_MIN_MARKER_DIST; updated doc
declare bcf_fmt_sized_array() in the header
unpack bcf1_t; NOT WORKING yet
nothing
Merge pull request #2 from arq5x/master
generating vcf from bcf_dec_t; to remove old code
removed old vcf_format1() code
bugfix: a typo
a little bit code simplification
make bcf_unpack() more flexible
changed function names
added a little doc
migrate sam2break.d to htslib
abreak is working!
better help
vcf<->bcf is pretty mature now
compatible with tabix index
bugfix: vcfview -G failing due to recent changes
API to test if a record is a SNP
added BAM aux functions and bam2bed
bugfix: invalid memory access (fixed by petr)
vcf: add "PASS" before anything else
clarify the BCF2 document
clarify that BCF2 is BGZF'd and little endian
r198: changed the BCF2 magic to BCF\2\1
r199: support retrieving reads w/o coordinates
r200: keep read index in bam1_t
Merge branch 'master' of github.com:samtools/htslib
merge r199 changes; part 1
merge r199 changes; part 2
merge r199 changes; part 3
added bam2bed
r198: tag version
r199: reduce one conflict when merged with master
r200: the previous is not good enough
Exported some routines
Allow to skip errors
Append new header lines
added this missing file
Merge branch 'lite'
bam2fq: skip secondary alignments
Merge branch 'lite'
Merge pull request #9 from kmsquire/fix_secondary_check
compiling error
optionally output OQ instead of QUAL
Merge branch 'lite'
-C to output lines not overlapping BED
James Bonfield (72):
Added CRAM support to htslib and htscmd samview.
Merged with attractivechaos/klib:
Major overhaul of CRAM code (originally imported from Samtools/htlib
Merge branch 'develop' of https://github.com/samtools/htslib into develop
Further removal of various unused portions of Staden io_lib.
Switch from modifying CFLAGS to CPPFLAGS so we can do "make CFLAGS=-g".
Bug fix to handle blank headers in SAM files.
Bug fix to stop sam view from converting 'H' into 'Z' types.
Use strtoul instead of strtol for XX:B:I tags so that integers >2.14
Better handling of unknown references.
Bug fix to khash-ification of sam_hdr_add_lines
A bit of a hack to set the header and references up properly for
Added some synthetic SAM test data and a test perl script to trial
Bug fix to khashification of process_one_read.
Added tests/test_view
Bug fix to stop sam view from converting 'H' into 'Z' types.
Use strtoul instead of strtol for XX:B:I tags so that integers >2.14 billion work.
(io_lib r3438/7)
(io_lib r3439)
(io_lib r3460)
(io_lib r3461 minus arithmetic coder & last_name experimental parts)
(io_lib r3467)
Silenced a "pointer targets in initialization differ in signedness" warning.
(io_lib r3470)
Fixed CRAM to cope with the 0x800 supplementary flag.
Added copyright notices.
Fixed errors in copyright due to pasting the MRC one and forgetting to
Fixed a variety of multi-threading data races, detected through
Merge branch 'develop' of https://github.com/samtools/htslib into develop
Samtools. The test harness worked, but the samtools interface needed
Added prototype for cram_set_header()
Fixed a couple complaints from clang.
Fixed a bunch of code warnings produced by clang's static analyser.
A couple more clang reported errors, which are duplicates of previous
Merge branch 'develop' of https://github.com/samtools/htslib into develop
Robustness improvements following fuzz testing. (See io_lib r3500).
Added decoding checks for the most obvious of sam parsing failures.
Removed old comment from earlier code.
Merge branch 'develop' of https://github.com/samtools/htslib into develop
Further fixing of auxiliary tag decoding.
Added the ability to do samtools index on a CRAM file.
Cope with empty containers internal to a CRAM stream.
Added implementation of cram_external_encode(). It's not yet used
Bug fix to the zlib strategy tuning in multi-threaded mode. It could
Added refs_t->last_id (note distinct from ->last) for use when reading
Add mfsteal to mFILE, and use it in cram_populate_ref.
Added missing return code checks. Fixed 'x' mode of mfreopen.
Added some EOF block creation and checking.
Bug fix to cram_next_slice(). The new while loop checking for
Bug fix to EOF block writing when in multi-threaded mode (not yet
Added a check to EOF in Cram in hts_close(). Note that this isn't the
Merge branch 'develop' of https://github.com/samtools/htslib into develop
Updated the Cram EOF block detection to avoid falsely considering the
Merge branch 'develop' of https://github.com/samtools/htslib into develop
Fixed initialisation of CRAM indexing when the CRAM file contains
Initialise beg and end values in the hts iterator. Without these
Fixed the iterator to be 1-based inclusive coordinates for CRAM.
The default CRAM version is now 2.1. This is now the official release
Fixed EOF check. The usual code path had the correct check, but an
Bug fix to the SAM header padding. It could go negative and then fail
Reduced the number of realloc calls zlib_mem_inflate(), while
Fixed a bug when loading in fasta files consisting of all sequence on
Bug fix to CRAM indexing and index usage.
Fixed two small memory leaks.
Revert one of the previous memory leaks as it breaks multi-threading.
Better fix for container memory leak.
Changed the initial CRAM EOF value to be 1 (expected EOF) and then
Sanitised the setting of fd->empty_container into one place. This also
Removed bogus "Unable to find ref name"* errors when converting a SAM
sam_hdr_write() now calls cram_load_reference if
Set NM/MD creation to be on by default.
Fixed REF_PATH to handle URLs better.
Joel Thibault (1):
Add extern "C" around API to enable C++ linking
John Marshall (183):
Revert to kstring.h with a separate kstring.c
Fix <stdarg.h> include
Merge changes from samtools's kstring
Replace with samtools's ksort
Add klist.h, as used by samtools
Maintain the same API regardless of -D BGZF_MT
Use htslib's bgzf.h, faidx.h, and razf.h
Expose bam_hdr_init() in the public API header
Avoid conflicting with samtools's bam_index()
Merge branch 'for-samtools' into develop
Merge branch 'razf_sync' of https://github.com/mp15/htslib into develop
Add Travis CI control file
Merge branch 'master' into develop
Move format specifications to sam-spec repository
Test with both Clang and GCC
Remaining application code also moving repository
Lift library source to the top-level directory
New non-recursive build infrastructure
Build shared libraries, add "make install" target
Update khash.h from upstream sources
Update kstring.[ch] from upstream sources
Bug fix to handle blank headers in SAM files
Use strtoul() for other unsigned 'B' types too
Use config.h for configuration options
Replace HTS_VERSION with hts_version()
Actually $(NUMERIC_VERSION) is more generic
Fix union-based type-punning buglet
Use static inline, not just inline
Keep the makefile non-recursive
Add faidx (FASTA index) description man page
Alphabetise, fix whitespace [minor]
Add buffered low-level input/output streams
Warn if hFILE function return statuses are ignored
Refactoring [minor]
Add hseek() and htell() test cases
Flush or discard the buffer in hseek()
Maintain our own hFILE offset within the stream
Add hdestroy_buffer() for use in hopen_*() cleaning up
BGZF worker threads do no I/O themselves
Use hFILE underneath BGZF, and add bgzf_hopen()
Add hfile*.[ch] to $(HTSLIB_ALL)
Merge branch 'io' into develop
Fix vcf_sweep.h dependencies
Also run our tests
Fix typo [minor]
Lift mallocing & freeing to the generic hFILE code
Remove unneeded cast
On Windows, call knet_win32_init before using knet
Don't #include bgzf.h from other htslib headers
Change *_itr_next() to take htsFile*, not BGZF*
Replace "void *fp" by a union of the file pointers
Fix -Wc++-compat warning / C++ compilation error
Use hFILE instead of stdio for uncompressed output
Add bgzf_raw_read() and bgzf_raw_write()
Merge branch 'io' into develop
Properly ignore hclose() in bgzf_[d]open functions
Merge branch develop of github.com/jkbonfield/htslib into cram
Move cram/ directory to the top-level
Add makefile dependencies for cram code
Merge the CRAM test harness into current htslib
Build test_view with -pthread
Enable warnings only if -Wno-unused-result works
Merge branch 'cram' into develop
Remove -Wc++-compat for the sake of the CRAM code
Add bam_endpos(const bam1_t *)
Disentangle CRAM header loopiness [minor]
Add SAM and VCF man pages extracted from samtools.1
Add BAM_FSUPPLEMENTARY flag bit
Return bytes >= 0x80 from hgetc() correctly
Merge pull request #27 from jkbonfield/develop
Move cram_seek() from cram_index.c to cram_io.c
Use hFILE underneath cram_fd, and add cram_dopen()
hts_open(fname, "r") detects format by peeking at the input
Merge branch 'io' into develop
Tab, newline, etc are not control characters
Require lowercase [rw] mode letters in bgzf_open() etc
Limit input hFILE buffers to 32K [temporary hack version]
Fix tbx_itr_next() when given a textual (is_kstream) htsFile
test.pl: fail if any tests fail
hts_open(fname, "r") now works in both cases; no need for "rb"
Format detection: non-compressed binaries; is_compressed setting
bcf_sr_open_reader(): Auto-detect input file format
Handle erroneous modes with neither 'r' nor 'w'
Merge CRAM copyright notices and data race fixes from jkbonfield
Merge cram_set_header() from github.com/jkbonfield/htslib
Merge branch develop of github.com/jkbonfield/htslib
Add hts_set_fai_filename(); parse .fai in sam_hdr_read()
Remove now-unused fn_aux hts_open() parameter
Reinstate distinct bgzf_check_EOF() status for seek failure
Add hclose_abruptly() for stack unwinding after errors
Change hts_close() to return int (error indication)
Fix bgzf_write() for large (> 2GB) blocks
Merge cram_index_build() from github.com/jkbonfield/htslib
l_qname is unsigned, so is always >= 0
Ensure r is clearly always initialised [minor]
Use $(LDFLAGS) & $(LDLIBS) when linking shared objects
Use the exact type in malloc(N * sizeof(TYPE*)) etc
Fix pointer arithmetic [minor]
Add HTS_NORETURN and annotate fail() as such
Use the exact type in realloc(N * sizeof(TYPE*))
Merge tabix and samtools bgzip.c and tabix.c
Fix hseek() return value test
Make filename argument a *const* char pointer
Store readrec function pointer within hts_itr_t
Make search key arguments *const* char pointers
Write strcmp() comparisons comprehensibly
Add basic test exercising bcf_itr_queryi and bcf_itr_querys
hts_itr_querys() calls hts_itr_query() via a function parameter
Add sam_itr_*() dealing with BAM and CRAM iterators interchangeably
Also initialise tid in case user code looks at it
Merge pull request #46, CRAM EOF blocks
Merge pull request #60, CRAM EOF block improvements
Merge pull request #62, minor CRAM updates
Check cram_eof() before calling cram_close()
Keep the style of the surrounding code
Add HTS_IDX_NONE and document HTS_IDX_*
Merge #68, fix incorrect length given to MD5 calculation
Add 'a' (append) mode letter; propagate mode letters
Add "data:blahblah" in-memory hFILE backend (read-only)
Use hFILE rather than knet in test_and_fetch()
Merge #70, fetching of remote BAM/VCF indices
Rewrite __skip_tag() as skip_aux()
Add sam aux field tests
Merge sam_format1() and __skip_tag() fixes (PR #73)
Add ks_release() to kstring.h
Merge branch 'bcftools+calling' into develop
Make hfile.h part of the public API
Add MIT/Expat license boilerplate
Define __attribute__ macros in a new (semi-public) header
Propagate error codes for CRAM_OPT_RANGE
Construct HTS_IDX_REST and HTS_IDX_NONE iterators without an index
Merge bam_dup1() addition (PR #80)
Silence -Wstrict-prototypes warnings
Merge bgzip.c and tabix.c/.1 history from github.com/samtools/tabix
Fix bgzip and tabix dependencies and link commands
Exit statuses are non-negative
Merge branch 'bcftools+calling' into develop
Silence -Wstrict-prototypes warning
Check sam_open() and sam_close() return values
Return "" from bam_flag2str(0) rather than NULL
Merge CRAM indexing bug fixes (PR #85)
Add hts_set_threads() file I/O threading API
Fix bgzf_write() and bgzf_flush_try() error handling
Fix bam_write1() error handling
Merge branch 'bcftools+calling' into develop
Fix forgotten s/razip/bgzip/
Remove RAZF source code
Check CIGAR string has fewer than 2^16 operators
Fix "unrecognized CIGAR operator" diagnostic
Return 0/-1 rather than bool; parenthesise C macro
Merge API functions for index statistics (PR #83)
Check index file magic numbers and I/O
Fix endianness problems outwith CRAM code
Add sam_open_mode() API function
Remove val_unused -- use HTS_UNUSED instead
Suppress compression suffix search on default EBI MD5 service
Merge REF_PATH fixes and libcurl removal (PR #111)
Add license file
Add MIT/Expat license boilerplate
Fix copy/pasto
Files under cram/ are BSD-licensed
On open failure, set errno instead of printing message
Don't call freeaddrinfo() when getaddrinfo() fails
Rationalise include guard macro names
Ensure installation directories are world-executable
Add copyright notices and licensing boilerplate
Add klib copyright notices and licensing boilerplate
Omit control files and README.md from release tarballs
Add copyright notices and licensing boilerplate
Add copyright notices and licensing boilerplate
Update copyright notices to reflect historic changes
Add copyright notices and licensing boilerplate
Update copyright years to reflect historic changes
Add copyright notices and licensing boilerplate
Add copyright notices and licensing boilerplate
Add copyright notices and licensing boilerplate
Add copyright notices and licensing boilerplate
Add copyright notices and licensing boilerplate
Merge copyright notice and licensing boilerplate additions
Canonicalise whitespace -- USE -b/-w TO DIFF/BLAME ACROSS THIS COMMIT
Link shared library against libm and libpthread
Add basic INSTALL and README files
Release 1.0: first released HTSlib package
Joshua Randall (1):
Add a comment to deobfuscate bam_cigar_type
Joël Spaltenstein (1):
Fixed comment typo
Karthik Gururaj (1):
Fixed bug in bcf1_sync. The main cause of the bug is as follows:
Kevin Squire (1):
Bugfix: core is a member of bam1_t->core
Martin O. Pollard (4):
Reinstate bam_header_dup as bam_hdr_dup
Fix typo in comment
Add functions for idxstats
Improve hts_idx_get_stat to work for more file types
Martin Pollard (14):
Fix function prototypes for knetfile and bgzf to match system read and write.
Use off_t and lseek rather than the 64 alternatives as this is more portable.
Fix signed return going into unsigned variable
Make nucleotide tables const
Amend .gitignore to ignore produced binaries
Merge remote-tracking branch 'upstream/master'
Add .gitignore for LaTeX to doc directory
Fix handling of legacy tabix indexes with empty dictionary.
Fix crash on legacy files
Use sizeof() and NULL instead of numbers
Import header comments from samtools
Add const to char *'s to ensure they can be called with fixed values without triggering compiler warnings
Port fixes from samtools razf.c/h to htslib.
Migrate documentation from samtools bam.h to sam.h and add a few more items
Mauricio Carneiro (1):
Add duplication function for the bam1_t interface
On behalf of Bob Handsaker (1):
Change copyright notices now that MIT has approved open source distribution.
Petr Danecek (389):
Changed the mode for newly created files to 0666. This allows less strict permissions with umask properly set (e.g. 0002 vs. 0022).
Added the -l option for listing chromosomes
Fix in src/dst file detection and slight change of behaviour
The behaviour changed slightly to mimic gzip. Detect if std descriptors are connected to the terminal.
Complain about not-bgzipped files and check for noncontinuous chromosome blocks
Fix: Exit with an error rather than segfault when index is not present and region is queried
Added -h option to print header lines
Disable "unknown target name or minus interval" warning.
Prevent the common user mistake and check the timestamps of the vcf and index file
Fix: Complain only when VCF is newer, not newer or same mtime
New -r (reheader) option for efficient header replacement.
Querying remote files required -f option on some systems
Added option for printing header lines only
Guess the filetype from the file extension when -p not given
Assume GFF format when the file type cannot be guessed, as in the previous version
Even better, do this only for reading
Fix: allow custom formats
More informative error message about missing or out-of-date tabix index
Experimental implementation of vcfcheck (non-functional)
Fill sequence dictionary from tabix index if contigs not present in VCF header. Synced reader basically working. (No alleles detection yet.)
Functional implementation of synced reader including allele checking
Collect Ts/Tv by AF
Synced reader can collapse sites by variant type; new API call set_variant_types
Invalid read fix in hts_itr_query; Unsafe loop limits fix in vcf_parse1
Removed hacky work around, tabix seems to be fully working now
First plot-vcfcheck skeleton
Merge branch 'master' of github.com:pd3/htslib
Calculate AC,AN on fly if not in INFO; allow calling bcf_unpack() repeatedly without redoing the work; plot graphs using matplotlib; bugfixes
Calculate AC,AN on fly if not in INFO; allow calling bcf_unpack() repeatedly without redoing the work; plot graphs using matplotlib; bugfixes
Use matplotlib and pdflatex to create summary slides
More stats: indel distribution and substitution types
More stats: SNP and indel counts by AC
create_pdf: Switched from slides class to memoir
More stats: frameshifts ratio; bug fixes
Merged github.com:pd3/htslib
Documented the synced reader; More stats for vcfcheck: genotype concordance; Bugfixes
More vcfcheck stats: Ts/Tv and nHets by sample
Another vcfcheck stats: NRD by sample
Fix in per sample stats; Minor cosmetic improvements of plots
Sample list from file or command line; Handle missing GTs in calc_ac and init_iaf; Documented plotting
Changed NRD output
New option --split-by-ID for known vs novel stats
Fix in calc_ac for missing GTs; Number of MNPs and other events in the summary table
Always check INFO/AC,AN for by AF; TsTv by Qual stats
Depth distribution stats
Fixed typo, open with w instead of r
Fixed typo, check i-th ALT instead of 1st in set_variant_types
Synced reader and hence vcf*tools now work also with bcf2
Merge remote branch 'main/master'
Merge remote branch 'mp15/misc_improve'
Cumulative commit with many changes, including:
Bug fix: variant type not checked in the absence of --collapse option, this affected all vcf* commands
Changed order of underscore escaping and line breaking to prevent loss of escape characters
In calc_ac check added for different BCF_BT_INT* types; New trim_alleles method
Fix in variant type recognition: distinguish indels from complex events
Plotting Hets vs Homs instead of plain Hets; Sanity check for FS section header
Add missing contigs to the header permanently
Removed repeated call of bcf_unpack_fmt_core in vcf_format1
New API routine: remove_alleles; get_fmt_ptr
New API routine: remove_alleles; get_fmt_ptr
New API routine: remove_alleles; get_fmt_ptr
Do not attempt to read directories in bcf_sr_set_samples
Use uint64_t for DP stats. On behalf of Chris: plot-vcfcheck can now merge per-chromosome vcfcheck files.
Fixed a typo which rendered -n <cnt> ineffective.
New -C, --complement option for vcfisec. bcf_seqnames now returns chromosomes in correct order
Temporary changes
tmp save
GT merging now functional
Sanity check allow output from both vcfcheck and plot-vcfcheck -m
Allow missing values in lists in VCF output
Most features of vcfmerge functional now.
Merge branch 'master' of github.com:pd3/htslib
merge_format_field: fix in setting missing values
Fixed memset error in vcfmerge
Fix in merge_alleles (expand REF if necessary); Normalize local copy of ALT,REF
plot-vcfcheck: New --rasterize option for fast PDF rendering
plot-vcfcheck: fixed merging of per-sample stats
vcf_parse1: error recovery on missing FILTER/INFO/FORMAT tag definitions
vcfcheck: New stats and graph, allelic r2 by AF
plot-vcfcheck: fixed forgotten .pdf extension string
Merge remote branch 'remotes/mp15master/master' into develop
plot-vcfcheck: removed forgotten debugging comments; merge of allelic r2 column in GCsAF
Fix: do not drop first record with both -B and -h
plot-vcfcheck: -T option to set PDF title; minor improvements in various plots
htscmd: fall back to default name when named non-standardly
vcfcheck: singletons by sample and multiallelic SNPs stats
plot-vcfcheck: new -s switch; several new graphs and table values
vcf.c: strip newline from the last sample in BCFs
plot-vcfcheck: Added missing legend into SNP/indel counts by AF graphs
plot-vcfcheck: Fix in merging per-sample average depth - average, not sum
Merge remote branch 'remotes/pd3master/master' into develop
htslib/vcf.c: Set length of on-fly added chromosomes to max signed int; Removed deprecated header parsing code.
plot-vcfcheck: zoomed y-range for ts/tv by sample graph
htslib/vcf.c: free unused records, fixed a small memory leak
vcfcheck: New -t option
vcfcheck: --debug outputs to stdout rather than stderr and the output is parsable
plot-vcfcheck: Prevent division by zero in FS merging
plot-vcfcheck: Escape all underscores summary.tex
Merge remote branch 'remotes/mp15master/master' into develop
Fix in treatment of haploid Number=G tags. New category for structured header lines such as ALT=<ID=..>. Temporary solution to missing values in vectors: TODO!
Merge branch 'develop' of github.com:samtools/htslib into develop
New tool: vcfquery. (TODO: sample columns)
calc_ac in vcfutils: return when AC present in header but absent in the line
New tools: vcfquery, vcffilter (SOM)
vcffilter: Allow fixed threshold filtering of annotations which are not used in SOM
vcffilter: added check if supplied annotations are unique
vcfquery: print header only when requested
vcfcheck hotfix: Handle properly sample stats, cases where AC,AN are present in the header but not in the lines
Merge branch 'develop' of github.com:samtools/htslib into develop
gtcheck: New tool, initial commit
htslib/vcf.[hc]: Keep ID separately from the alleles block. Renamed synced_bcf_reader
Added faidx into htslib
Merge branch 'develop' of git://github.com/samtools/htslib into develop
vcfquery: Read multiple VCFs from a file.
Merge branch 'develop' of github.com:samtools/htslib into develop
vcffilter: allow filtering of split VCFs with -r
vcfquery: forgotten open call with -l
gtcheck: better output file names with -p dir/ prefix.
gtcheck: Print header in the "-s and no -p" mode.
norm indels: backup commit
Merge remote branch 'remotes/origin/develop' into norm-indels
vcfnorm: New command for normalizing and left-aligning indels.
Merge branch 'develop' of github.com:samtools/htslib into develop
vcfquery fix: throw an error when no VCF given with -l.
vcfgtcheck fix: added the command into main()
Synced VCF/BCF reader and most of the vcf*tools now read from stdin. Speedup in
Merge branch 'master' of github.com:samtools/htslib
Updated test file for vcfcheck
vcfnorm fix: segfault on nasty case of indel; synced_reader fix: read from stdin
vcfnorm: Sanity check for reference sequence mismatches
vcfnorm: Bugfixes, more robust realignment, more sanity checks.
Merge branch 'master' of github.com:samtools/htslib
vcf_format1: move the string initialization outside so that the method can work in append mode (used by vcfquery)
synced_reader: Allow reading of unindexed .vcf and .vcf.gz in more situations
vcfquery: Do not crash when no arguments given
vcf.[hc]: convenience routine for converting BCF_INT types into C ints
vcfgtcheck: new functionality - multi-sample cross-check of genotypes
bcf_set_iarray: fix in incrementing typed pointer
gtcheck: added per-sample depth plot
vcfgtcheck: Slight change in semantics, -g now used only for the known genotypes.
bcf core lib: PASS filter must come first in order to conform to BCF specification
bcf core lib: Fixed a silly bug from the last commit. FILTER=PASS indeed must come first, but only after the ##fileformat string.
core vcf: added API for modifying/deleting/adding INFO fields, currently only VCF output supported; vcfmerge: AN,AC tags now updated correctly.
vcfcheck: Do not crash at non-variant sites. Restore checking of two VCFs.
bgzip now compiles
VCF/BCF core lib: Allow missing values in vectors. Note that this is a major change which, in contrast to BCFv2.1 specification, allows missing values in vectors. For integer types, the values 0x80, 0x8000, 0x80000000 are interpreted as missing values and 0x81, 0x8001, 0x80000001 as end-of-vector indicators. Similarly for floats, the value of 0x7F800001 is interpreted as a missing value and 0x7F800002 as an end-of-vector indicator. This trial BCF version (v2.2) is compatible with t [...]
gtcheck: Removed forgotten debugging line
A few bugfixes from the previous two commits. New -S option in gtcheck and -H made work also in the cross-check mode
vcfquery fix: Allow a mask file with the -v option
vcfcheck: complain if tabix index not available with -t option
vcfcheck: Do not run in streaming mode with the -t option
vcffilter: fixed a typo, output file names now named correctly based on filter type
bcf1_update_info fix: initialize bcf_info_t pointer, do not pass NULL
vcffilter: extended help message
vcffilter: allow arbitrary order in -f and -l filtering options (X>value is same as value<X)
vcffilter: plot annotation distributions
vcfview: Restored the subset functionality
vcfview: Restored the subset functionality (continued)
vcfmerge: do not crash with non-overlapping chromosomes
vcffilter: Plot also cropped annotation distributions to show the range of values actually used in filtering
vcfgtcheck: Fix in INFO/DP parsing, wrong ID was used before
vcfmerge: fix of copy-and-paste error in format field merging, use correct integer ranges
gtcheck: fixed incorrect output, instead of uncertainty, the opposite (confidence) was reported before
vcffilter: check to prevent division by zero in the python plot script
vcfisec: in the sites output mode print also file mask
vcfquery: Enable -p option again
vcffilter: Plot also bad sites in annotation distributions. Updated test files.
vcfmerge, vcfisec: Change in synced reader's API, allow unlimited number of input files. (Previously the limit was sizeof(int)*8)
Merge remote branch 'mp15/develop' into develop
vcffilter: backup commit, bad SOM now working
vcfcheck: Fixed a bug introduced by recent changes in synced reader
vcf core lib: Added sanity check to catch up common problem makers, such as PL declared as Number=. instead of Number=G; Support for output of more than 64 alternate alleles; API to access bcfinfo_t header line information.
vcfmerge: Fixed subtle bugs revealed when merging FMT fields of different lengths; Append version string and command to the output VCF header
vcffilter: First step towards indel filtering - new switch to select from good sites mask
vcffilter: skip the two additional score fields when applying SOM filters
vcfcheck: create additional ts/tv stats which counts 1st alternate allele only
vcfcheck: More readable output, replace "-" by "<STDIN>" when reading from standard input
vcfview: Fix in bcf_subset; force unpacking on VCF output
tbx: Allow indexing of empty files, do not segfault
vcfutils: bcf_gt_type should not be declared as static
vcf.[hc]: API to update FILTER column. BCF output not supported yet in this version
vcfmerge, vcfnorm: Append command and version string to VCF header
vcffilter & friends: Filtering of indels now supported. Changes in vcfcheck, plot-vcfcheck and vcfquery to reflect that. More changes and fixes to come, but this version seems stable and fully functional.
Updated tests
resolved test/check.chk conflict
vcfcheck: Changed output format of Ts/Tv
vcffilter,gtcheck: Added sanity checks; Extended help message; Changed the order of columns on gtcheck output
plot-vcfcheck: Produce the overlap by AF graphs
vcfsubset: New -i,-e options for general filter expressions. In this version, only QUAL and INFO columns are supported.
synced read and vcftools: Changed the -f, --apply-filters switch so that it accepts a list of allowed FILTERs. This is useful for example in cases when PASS has to be distinguished from "."
vcffilter: FILTER column of hard-filtered sites now lists failed filters; Unset the FILTER of skipped sites only with -u switch, not by default.
vcfcheck: Fixed a bug in initialization of frame-shift calculation; More detailed FS output
vcfnorm: Removed two assertions to allow realignment of MNPs
vcffilter: Change in neighbourhood function in SOM training, seems more robust with respect to random seed
vcf core lib, vcfsubset, vcfmerge: First step towards binary BCF output. Functional but probably buggy.
Resolved vcfmerge.c merge conflict
vcfsubset: Make --apply-filters consistent with other tools
Merge remote branch 'remotes/sm15-mod/develop' into develop
vcfcheck: re-enabled stats by QUAL
Merge branch 'develop' of github.com:samtools/htslib into develop
Deleted files which were moved to the new samtools/bcftools.git repository
New API for BGZF indexing w.r.t uncompressed data. Extended BGZF to read uncompressed files. Moved from RAZF to BGZF in FAI indexing.
Added BGZF indexing and reindexing capability to bgzip. Fixed -b seeking.
bgzf.c: Check return status of bgzf_read_block in bgzf_useek to catch gzipped (as opposed to bgzipped) files.
tview and bgzip: Check that the files are bgzipped, not gzipped.
[hts_open,bgzf,vcf] New flags for uncompressed input/output to distinguish between compressed/uncompressed BCF/VCF. Updates and fixes in BCF update functions. Removed old tests (now in samtools/bcftools repo) and added new.
Resolved merge conflict in Makefile, merged.
vcf.h: Reverted back to bcf_float_is_* macros as the inlined unions break vcfmerge code, found a solution with the original version.
vcf.h: New bcf_get_info_* and bcf_get_format_* calls
Merge branch 'develop' of github.com:samtools/htslib into develop
hts_open,bgzf_open: Changed semantics of the "u" flag. Originally "u" was equivalent to "0", which on output yielded uncompressed data in the zlib format. Newly "u" results in plain uncompressed output.
Merge branch 'develop' of github.com:samtools/htslib into develop
vcf: Make sure GT always comes first in bcf1_update_format; Do not lose last newline in vcf_hdr_read; Updated tests.
vcf: Changed semantics of bcf_get_[info|format]*() ndst parameter: interpret as number of elements in the array, not the size in bytes. Allow overlapping memory blocks in bcf1_update_alleles
synced_reader,vcf: Big rewrite of synced reader to allow more flexible handling of regions and targets in future.
This commit fixes issues in hts_open and bgzf with uncompressed BCF input, BGZF now detects uncompressed streams on the fly. Fixes in FILTER and FMT part of bcf1_sync.
kfunc: Moved the kt_fisher_exact test from the original bcftools into htslib
bgzf: Fixed typo in block_address calculation in bgzf_seek
synced_vcf_reader: new bcf_sr_get_line macro and fix in vcf/bcf index iterator initialization
vcf: Minor bug fixes and API touches
kseq,hts: Enable seek() on kstreams and hts_useek(),hts_utell() on htsFiles. New module which allows sweeping BCF/VCF files both fwds and bwds
bgzip: Do not overwrite input files when -b or -s are given
vcf: prevent segfault when sample columns do not exist but FORMAT column does
Merge branch 'develop' of github.com:samtools/htslib into develop
Merge branch 'develop' of github.com:samtools/htslib into develop
synced_reader: Expanded docs - example of usage
vcf: Support for string values in FORMAT; handling missing BCF_BT_CHAR values in bcf_fmt_array()
VCF header parsing: Handle pathological cases where less-than and greater-than signs are used in place of double quotes
vcf: Non-critical vcf_parse1() errors must not go unnoticed by vcf_write1(). Fixed offending \0\n added by bcf_hdr_subset()
Fixes in targets/regions VCF synced reading, now regions work when streamed. Removed redundant get_fmt API calls
Minor docs comment added
Merge branch 'develop' of github.com:samtools/htslib into develop
bgzf: Added ad-hoc checks to all hfile calls to keep compiler happy
bcf_index_build: Check return status of helper calls
Merge branch 'develop' of github.com:samtools/htslib into develop
synced_reader: Added support for alleles in targets files to select best matching line out of possibly duplicate VCF records.
vcf: Added bcf_get_variant_type[s]() call to avoid the need for explicit call of bcf_set_variant_types(); The mpileup's X allele is not a SNP; Get rid of htfile warnings
Proper initialization in hts_readlines
bcf_index_build: Check if the BCF is compressed
Fixed a rather surprising bug in CSI index access via hts_itr_next, before it was not possible to obtain BCF record at chr:pos-pos. The fix may have unexpected consequences in other situations and file formats, beware!
vcf.c: Proper freeing of memory taken by dirty info/fmt tags when update routine called multiple times
New feature in bcf_synced_reader: negate the sample selection in bcf_sr_set_samples by exclamation mark
vcf.h: Add out of range check into bcf_idinfo_exists() as it always has to be checked elsewhere.
vcf.h: New API call: bcf_get_genotypes()
vcf.c: Call bcf_unpack() on user's behalf in all bcf1_get/update routines
synced_reader: Fixed an off by one error in bcf_sr_next_line() to properly match target alleles
vcf: Tidied up vcf.h, made API more consistent and extended documentation
vcf module: Made the naming more consistent to avoid confusion about which of the vcf/bcf_* function variant to use.
Merge branch 'bcftools+calling' into develop
Merge branch 'develop' of github.com:samtools/htslib into develop
hfile: Removed accidentally committed local change in blksize
Get rid of %ld compilation warnings
vcf: New bcf_hdr_printf() call
Merge branch 'bcftools+calling' into develop
vcf: Print a comprehensible error message if unsupported version of BCF is encountered
The file type is no longer necessary when opening synced reader
Merge branch 'develop' into bcftools+calling
Reverted a fix of "chr:pos-pos" index queries, the original version now works.
Compile cleanly without warnings with gcc 4.6.3
Fixed return status checks of fread calls
Check return status of IO calls to prevent gcc 4.6.3 warnings
Merge branch 'develop' into bcftools+calling
vcf: Support for regions in the form chr|chr:pos|chr:from-to|chr:from-
tbx: Sanity check if the file type meets the expectation, detect non-numeric fields
hts_idx: check return status of bgzf_read
bcf reader: Fix in chr:pos type of regions
synced reader: Support for BED regions/target files
synced reader: regions and targets now support VCF; Fix for match_alleles by reverting 0-terminating chr modification
synced reader: Give a hint when tabix was used with wrong column indexes for regions/targets
pileup: Detect non-recoverable errors and exit with proper status
vcf: new APIs bcf_alleles2gt and bcf_gt2alleles; bcf_synced_reader: recognise compressed BED files
Resolved merge conflicts
Merge branch 'bcftools+calling' into develop
Merge branch 'develop' into bcftools+calling
bcf_remove_alleles: Assume diploid genotypes if the ploidy cannot be determined, better to proceed than crash.
vcf: Update rlen on BCF output, this is important in BCF indexing, rlen is used as it is. Todo: also END should be checked with symbolic alleles
Merge branch 'bcftools+calling' into develop
Merge branch 'develop' into bcftools+calling
sam: New bam_str2flag and bam_flag2str functions for a more user-friendly setting of bitmasks. Removed the default pileup filters (except for BAM_FUNMAP) and moved the filtering to BAM-reading callbacks
Merge branch 'bcftools+calling' into develop
Merge branch 'develop' into bcftools+calling
Merge branch 'bcftools+calling' into develop
faidx: Detect compressed BGZF fasta files and create .gzi index without asking to allow proper functioning of samtools faidx fasta.fa.gz
synced_reader: Improved regions/targets to be faster with large number of sequences; More user-friendly regions_overlap() function
vcf: bcf_update_format now sets the number of samples in VCF record which previously was responsibility of the user; Fix in bcf_itr_querys macro, non-existent bcf_name2id renamed to bcf_hdr_name2id
vcf: Support for IDX BCF tag
vcf: Output . for missing INFO, FMT and samples
vcf.c: Fixed a renaming artifact, bcf_clear1 should be bcf_clear
vcf: qual now set to missing on bcf_clear; Avoid warnings about type-punned pointers
Output VCFv4.2 header
Merge branch 'develop' into bcftools+calling
Updated test
vcf: Make sure fileformat is added only once
vcf: New bcf_get_info_string() wrapper
vcf: New API bcf_get_format_string()
Allow overriding default columns indexes in bcf_sr_regions_init
vcf: new function bcf_remove_filter()
Merge branch 'develop' into bcftools+calling
Merge branch 'bcftools+calling' into develop
vcf: fix in bcf_get_genotypes, sanity check too strict for GT
vcf: Tip to workaround missing contig tags
hts indexing: Allow arbitrary chromosomal order.
vcf: New bcf_hdr_id2hrec() and bcf_hrec_format() functions
vcf: New vcf_write_line() function
synced_bcf_reader: New bcf_sr_seek() function
vcf: New bcf_translate() and bcf_hdr_combine() API
Initial version of bgzip and tabix with htslib.
idx: Support for "." queries to retrieve all records
vcf: New bcf_has_filter() API function
vcf: new experimental bcf_hdr_set_samples() for subsetting
Strict matching of strings in bam_str2flag
Merge branch 'bcftools+calling' into develop
bcf_hdr_set_samples() recognise lists vs file names
synced_reader: Better handling of target alleles
bcf_sr_set_samples to use the new --sample convention
vcf: New bcf_get_info_flag() API
vcf: Abort on broken VCFs with multiple/trailing spaces/tabs
vcf: Don't let unpack_*_core forget the to-be-freed pointer
hts_expand0: type ptr properly, it can be of different type
To avoid confusion, make bcf_get_*_int and bcf_update_*_int symmetric
hts_idx_load: Print warning when index is older than data
Make synced reader aware of bcf_hdr_set_sample(), expanded docs
synced_reader: New bcf_sr_region_done macro
bcf_translate: Catch unchecked errors
More sensible Quals of matching bases in overlaps
Ignore operation-not-supported errors in fd_flush
vcf: Allow omitted trailing FORMAT fields
synced_reader: Init position correctly
synced_reader: Init position correctly
Merge branch 'develop' into bcftools+calling
Merge branch 'develop' into bcftools+calling
hts_idx: fixed indexing of NO_COOR blocks
This resolves #49
vcf: Propagate error in bcf_hdr_append
Fixed off-by-one in bcf_get_format_values
Propagate error from bcf_hdr_printf
hts_readlist modified to explicitly indicate file name vs list
Support for =X cigars in mpileup overlaps.
faidx: Set error status if unknown seq is queried
bcf_hdr_subset: To allow sanity checks, mark missing samples
bcf_sr_seek to seek to start with seq=NULL
bcf_remove_alleles: Fix indexing range for Number=A alleles
Fixed BCF output after bcf1_sync_alleles, simplified
bcf_remove_alleles to work with BCF output
Added bgzip and tabix
.gitignore tabix and bgzip
bcf_subset: Reset n_fmt fields when all samples are dropped
synced_reader: Regions operate on two as well as three columns again
bcf1_sync: Update rlen for new records as well
bcf_write: sanity check the number of samples, resolves #55
vcf_parse: Detect missing INFO values in vectors
Merge branch 'develop' into bcftools+calling
For bcftools to work with CSI indexes up to 2^31-1
New bcf_hdr_[sg]et_version API; bcf_hdr_subset preserves VCF version
bgzf: Support for reading gzip-compressed files.
Minor comment change
bgzf: Support for FCOMMENT gzip header
bgzf: Support the remaining optional GZIP header fields
bcf_sr: New missed_reg_handler and bcf_sr_regions_flush() function
vcf: New bcf_dup() function
idx: Don't create broken .tbi or .csi when file unsorted (VCF/BCF/BAM)
synced_reader: Removed unused code
vcf: Change of bcf_hdr_add_sample() usage.
bcf_write: Error message instead of assert + propagate the error up
vcf_parse: Detect incorrect number of columns
vcf: Fix BCF syncing when only partially unpacked
vcf: Fixes in bcf_translate
bgzf: Do not end prematurely on non-critical Z_BUF_ERROR zlib error
vcf: Fix bcf_hdr_id2hrec, col_type parameter is needed as well
bcf_trim_als: Propagate error instead of assert
bcf_update_alleles: Reference length with INFO/END
Fix in index initialization
Improved bug fix of bcf1_sync() from the commit 2504243:
synced_reader: Consider non-variant sites as matching in comparisons
vcf: Detect broken structured header lines
synced_reader: Logical complements of target regions
bgzf: Translate windows line endings
bcf_hdr_combine: Check for tags of conflicting lengths
faidx: Exit with an error if run on gzip-ed files
vcf: Support for mpileup's symbolic <X> allele
Expose bcf_hdr_parse
vcf: Expose bcf_hdr_sync() and hrec_add_idx()
vcf: bcf_hdr_get_hrec() interface made more general
mpileup: Support for cigar operator "P"
faidx: new faidx_has_seq() call
faidx: Fixed copy-and-paste documentation error
Bug fix in bcf_trim_alleles()
mpileup: Support for cigar operation N
bgzf_seek: Assign correct block_address (on behalf of Chris Smowton)
Rob Davies (2):
Fixes incorrect MD5 calculation in cram_write_SAM_hdr
Fix sam_format1 failure on lines ending in integer/float tags and issue #63
Shane McCarthy (12):
do not round per-sample average depths to integer values
correct typos
deal with Number=A and Number=G INFO and FORMAT fields when trimming alleles
add new tool vcfsubset
add tests for vcfsubset
bcf_gt_type can now, optionally, give the index of the 2nd alt allele for GT_HET_AA genotypes
add support for Number=R fields when trimming alleles
Merge branch 'develop' of git://github.com/samtools/htslib into feature/number_R
add support for Number=R fields when trimming alleles
fix remote fetching of regions for bam and vcf
make sure we have found values for AC _and_ AN before using
hts_file_type: correct return of FT_VCF_GZ
jsimpson (1):
constify input parameter to faidx_fetch_seq
mp15 (1):
Merge pull request #1 from samtools/master
pd3 (7):
Merge pull request #7 from mp15/master
Merge pull request #8 from sm15/develop
Merge pull request #33 from sm15/develop
Merge pull request #79 from spalte/comment_typo
Merge pull request #82 from sm15/feature/bcf_calc_ac_bugfix
Merge pull request #108 from MauricioCarneiro/develop
Merge pull request #113 from mcshane/feature/hts_file_type_fix
-----------------------------------------------------------------------
No new revisions were added by this update.
--
Alioth's /usr/local/bin/git-commit-notice on /srv/git.debian.org/git/debian-med/htslib.git