[med-svn] [libzstd] 01/03: Imported Upstream version 0.8.0

Kevin Murray daube-guest at moszumanska.debian.org
Fri Aug 12 02:46:34 UTC 2016


This is an automated email from the git hooks/post-receive script.

daube-guest pushed a commit to branch master
in repository libzstd.

commit 80d57579ee83282d4d61bfd6fd42895d3ff4227c
Author: Kevin Murray <spam at kdmurray.id.au>
Date:   Fri Aug 12 12:09:06 2016 +1000

    Imported Upstream version 0.8.0
---
 .coverity.yml                                      |    5 +
 .gitignore                                         |    7 +
 Makefile                                           |   27 +-
 NEWS                                               |   37 +
 README.md                                          |   17 +-
 appveyor.yml                                       |    5 +-
 examples/.gitignore                                |   10 +
 examples/Makefile                                  |   59 +
 examples/README.md                                 |   18 +
 examples/dictionary_compression.c                  |  163 +
 examples/dictionary_decompression.c                |  136 +
 examples/simple_compression.c                      |  142 +
 examples/simple_decompression.c                    |  119 +
 images/Cspeed4.png                                 |  Bin 47294 -> 35361 bytes
 lib/.gitignore                                     |    2 +
 lib/Makefile                                       |   18 +-
 lib/README.md                                      |   72 +-
 lib/common/entropy_common.c                        |   67 +-
 lib/common/error_public.h                          |    6 +-
 lib/common/huf.h                                   |    2 +-
 lib/common/mem.h                                   |   26 +-
 lib/common/zbuff.h                                 |   29 +-
 lib/common/zstd_internal.h                         |   74 +-
 lib/compress/fse_compress.c                        |    2 +-
 lib/compress/huf_compress.c                        |   79 +-
 lib/compress/zbuff_compress.c                      |   59 +-
 lib/compress/zstd_compress.c                       | 1297 ++++---
 lib/compress/zstd_opt.h                            |  109 +-
 lib/decompress/zbuff_decompress.c                  |   51 +-
 lib/decompress/zstd_decompress.c                   |  674 ++--
 lib/dictBuilder/zdict.c                            |  286 +-
 lib/dictBuilder/zdict.h                            |   79 +-
 lib/legacy/zstd_legacy.h                           |   61 +-
 lib/legacy/zstd_v02.c                              |    2 +-
 lib/legacy/zstd_v04.c                              |   61 +-
 lib/legacy/zstd_v05.c                              |   28 +-
 lib/legacy/zstd_v06.c                              |    4 +-
 lib/legacy/zstd_v06.h                              |    2 +-
 lib/legacy/{zstd_v06.c => zstd_v07.c}              | 4053 +++++++++++---------
 lib/legacy/{zstd_v06.h => zstd_v07.h}              |  161 +-
 lib/{common => }/zstd.h                            |  274 +-
 programs/.gitignore                                |    2 +
 programs/Makefile                                  |   61 +-
 programs/bench.c                                   |   46 +-
 programs/datagen.c                                 |   30 +-
 programs/datagencli.c                              |    8 +-
 programs/dibio.c                                   |   70 +-
 programs/fileio.c                                  |  145 +-
 programs/fileio.h                                  |    1 -
 programs/fullbench.c                               |   30 +-
 programs/fuzzer.c                                  |  120 +-
 programs/legacy/fileio_legacy.c                    |   83 +
 programs/paramgrill.c                              |  360 +-
 programs/playTests.sh                              |   21 +-
 programs/util.h                                    |    7 +
 programs/zbufftest.c                               |   21 +-
 programs/zstd.1                                    |   30 +-
 programs/zstdcli.c                                 |  315 +-
 projects/README.md                                 |    3 +-
 projects/VS2008/fullbench/fullbench.vcproj         |   10 +-
 projects/VS2008/fuzzer/fuzzer.vcproj               |   10 +-
 projects/VS2008/zstd/zstd.vcproj                   |   18 +-
 projects/VS2008/zstdlib/zstdlib.vcproj             |   10 +-
 projects/VS2010/datagen/datagen.vcxproj.filters    |   26 -
 projects/VS2010/fullbench/fullbench.vcxproj        |   12 +-
 .../VS2010/fullbench/fullbench.vcxproj.filters     |   86 -
 projects/VS2010/fuzzer/fuzzer.vcxproj              |   12 +-
 projects/VS2010/fuzzer/fuzzer.vcxproj.filters      |   92 -
 projects/VS2010/zstd/zstd.vcxproj                  |   16 +-
 projects/VS2010/zstd/zstd.vcxproj.filters          |  158 -
 projects/VS2010/zstdlib/zstdlib.vcxproj            |   12 +-
 projects/VS2010/zstdlib/zstdlib.vcxproj.filters    |   95 -
 projects/cmake/lib/CMakeLists.txt                  |   14 +-
 projects/cmake/programs/CMakeLists.txt             |    2 +-
 tests/.gitignore                                   |    4 +
 tests/test-zstd-speed.py                           |  163 +-
 zlibWrapper/Makefile                               |    4 +-
 zstd.rb                                            |   18 +
 zstd_compression_format.md                         | 1149 ++++++
 79 files changed, 6826 insertions(+), 4761 deletions(-)
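
Note on the new examples/ directory: the example programs added in this import name their output by appending `.zst` to the input file name, allocating `strlen(filename) + 5` bytes for the result. Below is a minimal standalone sketch of that one step, using only the C standard library. The name `make_zst_name` is hypothetical and does not appear in the imported sources; returning NULL on allocation failure is also an assumption of this sketch, since the imported helpers call exit() instead.

```c
#include <assert.h>  /* assert */
#include <stdlib.h>  /* malloc, free */
#include <string.h>  /* strlen, memcpy */

/* Hypothetical helper, not part of the imported sources.
 * Returns a newly allocated "<filename>.zst", or NULL if malloc fails.
 * The caller owns the result and must free() it. */
static char* make_zst_name(const char* filename)
{
    static const char suffix[] = ".zst";   /* sizeof(suffix) == 5, counts the '\0' */
    size_t inLen;
    char* out;

    assert(filename != NULL);
    inLen = strlen(filename);
    out = (char*)malloc(inLen + sizeof(suffix));
    if (out == NULL) return NULL;

    memcpy(out, filename, inLen);                 /* original name, no terminator */
    memcpy(out + inLen, suffix, sizeof(suffix));  /* ".zst" plus the terminator */
    return out;
}
```

With this sketch, `make_zst_name("tmp")` produces `tmp.zst`, the same name that the `test` target in examples/Makefile passes to `simple_decompression`.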

diff --git a/.coverity.yml b/.coverity.yml
new file mode 100644
index 0000000..907f096
--- /dev/null
+++ b/.coverity.yml
@@ -0,0 +1,5 @@
+configurationVersion: 1
+
+filters:
+    # third-party embedded
+    - filePath: lib/dictBuilder/divsufsort.c
diff --git a/.gitignore b/.gitignore
index 1816524..0c45815 100644
--- a/.gitignore
+++ b/.gitignore
@@ -37,3 +37,10 @@ _zstdbench/
 
 # CMake
 projects/cmake/
+
+# Test artefacts
+tmp*
+dictionary
+
+# tmp files
+*.swp
diff --git a/Makefile b/Makefile
index 4284528..d8e740b 100644
--- a/Makefile
+++ b/Makefile
@@ -41,15 +41,17 @@ else
 VOID = /dev/null
 endif
 
-.PHONY: default all zlibwrapper zstdprogram clean install uninstall travis-install test clangtest gpptest armtest usan asan uasan
+.PHONY: default all zlibwrapper zstd clean install uninstall travis-install test clangtest gpptest armtest usan asan uasan
 
-default: zstdprogram
+default: zstd
 
 all:
 	$(MAKE) -C $(ZSTDDIR) $@
 	$(MAKE) -C $(PRGDIR) $@
+	@rm -f lib/decompress/*.o
+	$(MAKE) -C $(PRGDIR) all32
 
-zstdprogram:
+zstd:
 	$(MAKE) -C $(PRGDIR)
 	cp $(PRGDIR)/zstd .
 
@@ -68,10 +70,10 @@ clean:
 	@echo Cleaning completed
 
 
-#------------------------------------------------------------------------
-#make install is validated only for Linux, OSX, kFreeBSD and Hurd targets
-#------------------------------------------------------------------------
-ifneq (,$(filter $(shell uname),Linux Darwin GNU/kFreeBSD GNU))
+#----------------------------------------------------------------------------------
+#make install is validated only for Linux, OSX, kFreeBSD, Hurd and some BSD targets
+#----------------------------------------------------------------------------------
+ifneq (,$(filter $(shell uname),Linux Darwin GNU/kFreeBSD GNU FreeBSD DragonFly))
 HOST_OS = POSIX
 install:
 	$(MAKE) -C $(ZSTDDIR) $@
@@ -85,7 +87,7 @@ travis-install:
 	$(MAKE) install PREFIX=~/install_test_dir
 
 gpptest: clean
-	$(MAKE) all CC=g++ CFLAGS="-O3 -Wall -Wextra -Wundef -Wshadow -Wcast-align -Werror"
+	$(MAKE) -C programs all CC=g++ CFLAGS="-O3 -Wall -Wextra -Wundef -Wshadow -Wcast-align -Werror"
 
 gcc5test: clean
 	gcc-5 -v
@@ -105,11 +107,11 @@ armtest: clean
 
 ppctest: clean
 	$(MAKE) -C $(PRGDIR) datagen   # use native, faster
-	$(MAKE) -C $(PRGDIR) test CC=powerpc-linux-gnu-gcc ZSTDRTTEST= MOREFLAGS="-Werror -static"
+	$(MAKE) -C $(PRGDIR) test CC=powerpc-linux-gnu-gcc ZSTDRTTEST= MOREFLAGS="-Werror -Wno-attributes -static"
 
 ppc64test: clean
 	$(MAKE) -C $(PRGDIR) datagen   # use native, faster
-	$(MAKE) -C $(PRGDIR) test CC=powerpc-linux-gnu-gcc ZSTDRTTEST= MOREFLAGS="-m64 -Werror -static"
+	$(MAKE) -C $(PRGDIR) test CC=powerpc-linux-gnu-gcc ZSTDRTTEST= MOREFLAGS="-m64 -static"
 
 usan: clean
 	$(MAKE) test CC=clang MOREFLAGS="-g -fsanitize=undefined"
@@ -168,6 +170,9 @@ bmix32test: clean
 
 bmi32test: clean
 	CFLAGS="-O3 -mbmi -m32 -Werror" $(MAKE) -C $(PRGDIR) test
+
+staticAnalyze: clean
+	CPPFLAGS=-g scan-build --status-bugs -v $(MAKE) all
 endif
 
 
@@ -187,7 +192,7 @@ gcc5install:
 
 gcc6install:
 	sudo add-apt-repository -y ppa:ubuntu-toolchain-r/test
-	sudo apt-get update -y -qq 
+	sudo apt-get update -y -qq
 	sudo apt-get install -y -qq gcc-6-multilib
 
 arminstall: clean
diff --git a/NEWS b/NEWS
index a980e80..56c46fe 100644
--- a/NEWS
+++ b/NEWS
@@ -1,3 +1,40 @@
+v0.8.0
+Improved : better speed on clang and gcc -O2, thanks to Eric Biggers
+New : Build on FreeBSD and DragonFly, thanks to JrMarino
+Changed : modified API : ZSTD_compressEnd()
+Fixed : legacy mode with ZSTD_HEAPMODE=0, by Christopher Bergqvist
+Fixed : premature end of frame when zero-sized raw block, reported by Eric Biggers
+Fixed : large dictionaries (> 384 KB), reported by Ilona Papava
+Fixed : checksum correctly checked in single-pass mode
+Fixed : combined --test and --rm, reported by Andreas M. Nilsson
+Modified : minor compression level adaptations
+Updated : compression format specification to v0.2.0
+changed : zstd.h moved to /lib directory
+
+v0.7.4
+Added : homebrew for Mac, by Daniel Cade
+Added : more examples
+Fixed : segfault when using small dictionaries, reported by Felix Handte
+Modified : default compression level for CLI is now 3
+Updated : specification, to v0.1.1
+
+v0.7.3
+New : compression format specification
+New : `--` separator, stating that all following arguments are file names. Suggested by Chip Turner.
+New : `ZSTD_getDecompressedSize()`
+New : OpenBSD target, by Juan Francisco Cantero Hurtado
+New : `examples` directory
+fixed : dictBuilder using HC levels, reported by Bartosz Taudul
+fixed : legacy support from ZSTD_decompress_usingDDict(), reported by Felix Handte
+fixed : multi-blocks decoding with intermediate uncompressed blocks, reported by Greg Slazinski
+modified : removed "mem.h" and "error_public.h" dependencies from "zstd.h" (experimental section)
+modified : legacy functions no longer need magic number
+
+v0.7.2
+fixed : ZSTD_decompressBlock() using multiple consecutive blocks. Reported by Greg Slazinski.
+fixed : potential segfault on very large files (many gigabytes). Reported by Chip Turner.
+fixed : CLI displays system error message when destination file cannot be created (#231). Reported by Chip Turner.
+
 v0.7.1
 fixed : ZBUFF_compressEnd() called multiple times with too small `dst` buffer, reported by Christophe Chevalier
 fixed : dictBuilder fails if first sample is too small, reported by Руслан Ковалёв
diff --git a/README.md b/README.md
index 7b58e5e..f8353ec 100644
--- a/README.md
+++ b/README.md
@@ -1,13 +1,16 @@
- **Zstd**, short for Zstandard, is a fast lossless compression algorithm, targeting real-time compression scenarios at zlib-level and better compression ratios.
+ **Zstd**, short for Zstandard, is a fast lossless compression algorithm,
+ targeting real-time compression scenarios at zlib-level and better compression ratios.
 
-It is provided as a BSD-license package, hosted on Github.
+It is provided as an open-source BSD-licensed **C** library.
+For other programming languages,
+you can consult a list of known ports on [Zstandard homepage](http://www.zstd.net/#other-languages).
 
 |Branch      |Status   |
 |------------|---------|
 |master      | [![Build Status](https://travis-ci.org/Cyan4973/zstd.svg?branch=master)](https://travis-ci.org/Cyan4973/zstd) |
 |dev         | [![Build Status](https://travis-ci.org/Cyan4973/zstd.svg?branch=dev)](https://travis-ci.org/Cyan4973/zstd) |
 
-As a reference, several fast compression algorithms were tested and compared on a Core i7-3930K CPU @ 4.5GHz, using [lzbench], an open-source in-memory benchmark by @inikep compiled with gcc 5.2.1, with the [Silesia compression corpus].
+As a reference, several fast compression algorithms were tested and compared on a Core i7-3930K CPU @ 4.5GHz, using [lzbench], an open-source in-memory benchmark by @inikep compiled with gcc 5.4.0, with the [Silesia compression corpus].
 
 [lzbench]: https://github.com/inikep/lzbench
 [Silesia compression corpus]: http://sun.aei.polsl.pl/~sdeor/index.php?page=silesia
@@ -16,9 +19,9 @@ As a reference, several fast compression algorithms were tested and compared on
 |Name             | Ratio | C.speed | D.speed |
 |-----------------|-------|--------:|--------:|
 |                 |       |   MB/s  |  MB/s   |
-|**zstd 0.7.0 -1**|**2.877**|**325**| **930** |
+|**zstd 0.8.0 -1**|**2.877**|**330**| **930** |
 | [zlib] 1.2.8 -1 | 2.730 |    95   |   360   |
-| brotli -0       | 2.708 |   220   |   430   |
+| brotli 0.4 -0   | 2.708 |   320   |   375   |
 | QuickLZ 1.5     | 2.237 |   510   |   605   |
 | LZO 2.09        | 2.106 |   610   |   870   |
 | [LZ4] r131      | 2.101 |   620   |  3100   |
@@ -74,8 +77,8 @@ Hence, deploying one dictionary per type of data will provide the greater benefi
 
 ### Status
 
-Zstd compression format has reached "Final status". It means it is planned to become the official stable zstd format and be tagged `v1.0`. The reason it's not yet tagged `v1.0` is that it currently performs its "validation period", making sure the format holds all its promises and nothing was missed.
-Zstd library also offers legacy decoder support. Any data compressed by any version >= `v0.1` (hence including current one) remains decodable now and in the future.
+Zstd compression format has reached "Final status". It means it is planned to become the official stable zstd format tagged `v1.0`. The reason it's not yet tagged `v1.0` is that it currently performs its "validation period", making sure the format holds all its promises and nothing was missed.
+Zstd library also offers legacy decoder support. Any data compressed by any version >= `v0.1` is decodable now and in the future.
 The library has been validated using strong [fuzzer tests](https://en.wikipedia.org/wiki/Fuzz_testing), including both [internal tools](programs/fuzzer.c) and [external ones](http://lcamtuf.coredump.cx/afl). It's able to withstand hazard situations, including invalid inputs.
 As a consequence, Zstandard is considered safe for, and is currently used in, production environments.
 
diff --git a/appveyor.yml b/appveyor.yml
index 10da235..4f93812 100644
--- a/appveyor.yml
+++ b/appveyor.yml
@@ -27,7 +27,8 @@ install:
       SET "CLANG_PARAMS=-C programs zstd fullbench fuzzer zbufftest paramgrill datagen CC=clang MOREFLAGS="--target=x86_64-w64-mingw32 -Werror -Wconversion -Wno-sign-conversion"" &&
       SET "PATH_MINGW32=c:\MinGW\bin;c:\MinGW\usr\bin" &&
       SET "PATH_MINGW64=c:\msys64\mingw64\bin;c:\msys64\usr\bin" &&
-      COPY C:\MinGW\bin\mingw32-make.exe C:\MinGW\bin\make.exe
+      COPY C:\MinGW\bin\mingw32-make.exe C:\MinGW\bin\make.exe &&
+      COPY C:\MinGW\bin\gcc.exe C:\MinGW\bin\cc.exe
     ) else (
       IF [%PLATFORM%]==[x64] (SET ADDITIONALPARAM=/p:LibraryPath="C:\Program Files\Microsoft SDKs\Windows\v7.1\lib\x64;c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\lib\amd64;C:\Program Files (x86)\Microsoft Visual Studio 10.0\;C:\Program Files (x86)\Microsoft Visual Studio 10.0\lib\amd64;")
     )
@@ -50,6 +51,8 @@ build_script:
       ECHO *** &&
       ECHO *** Building %PLATFORM% &&
       ECHO *** &&
+      make -v &&
+      cc -v &&
       ECHO make %MAKE_PARAMS% &&
       make %MAKE_PARAMS% &&
       make clean
diff --git a/examples/.gitignore b/examples/.gitignore
new file mode 100644
index 0000000..9d241db
--- /dev/null
+++ b/examples/.gitignore
@@ -0,0 +1,10 @@
+#build
+simple_compression
+simple_decompression
+dictionary_compression
+dictionary_decompression
+
+#test artefact
+tmp*
+test*
+*.zst
diff --git a/examples/Makefile b/examples/Makefile
new file mode 100644
index 0000000..5e3f0e1
--- /dev/null
+++ b/examples/Makefile
@@ -0,0 +1,59 @@
+# ##########################################################################
+# ZSTD educational examples - Makefile
+# Copyright (C) Yann Collet 2016
+#
+# GPL v2 License
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License along
+# with this program; if not, write to the Free Software Foundation, Inc.,
+# 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
+#
+# You can contact the author at :
+#  - zstd homepage : http://www.zstd.net/
+# ##########################################################################
+
+# This Makefile presumes libzstd is installed, using `sudo make install`
+
+LDFLAGS+= -lzstd
+
+.PHONY: default all clean test
+
+default: all
+
+all: simple_compression simple_decompression \
+	dictionary_compression dictionary_decompression
+
+simple_compression : simple_compression.c
+	$(CC) $(CPPFLAGS) $(CFLAGS) $^ $(LDFLAGS) -o $@
+
+simple_decompression : simple_decompression.c
+	$(CC) $(CPPFLAGS) $(CFLAGS) $^ $(LDFLAGS) -o $@
+
+dictionary_compression : dictionary_compression.c
+	$(CC) $(CPPFLAGS) $(CFLAGS) $^ $(LDFLAGS) -o $@
+
+dictionary_decompression : dictionary_decompression.c
+	$(CC) $(CPPFLAGS) $(CFLAGS) $^ $(LDFLAGS) -o $@
+
+clean:
+	@rm -f core *.o tmp* result* *.zst \
+        simple_compression simple_decompression \
+		dictionary_compression dictionary_decompression
+	@echo Cleaning completed
+
+test: all
+	cp README.md tmp
+	./simple_compression tmp
+	@echo starting simple_decompression
+	./simple_decompression tmp.zst
+	@echo tests completed
diff --git a/examples/README.md b/examples/README.md
new file mode 100644
index 0000000..2f46038
--- /dev/null
+++ b/examples/README.md
@@ -0,0 +1,18 @@
+Zstandard library : usage examples
+==================================
+
+- [Simple compression](simple_compression.c)
+  Compress a single file.
+  Introduces usage of : `ZSTD_compress()`
+
+- [Simple decompression](simple_decompression.c)
+  Decompress a single file compressed by zstd.
+  Introduces usage of : `ZSTD_decompress()`
+
+- [Dictionary compression](dictionary_compression.c)
+  Compress multiple files using the same dictionary.
+  Introduces usage of : `ZSTD_createCDict()` and `ZSTD_compress_usingCDict()`
+
+- [Dictionary decompression](dictionary_decompression.c)
+  Decompress multiple files using the same dictionary.
+  Introduces usage of : `ZSTD_createDDict()` and `ZSTD_decompress_usingDDict()`
diff --git a/examples/dictionary_compression.c b/examples/dictionary_compression.c
new file mode 100644
index 0000000..c4dc1b9
--- /dev/null
+++ b/examples/dictionary_compression.c
@@ -0,0 +1,163 @@
+/*
+  Dictionary compression
+  Educational program using zstd library
+  Copyright (C) Yann Collet 2016
+
+  GPL v2 License
+
+  This program is free software; you can redistribute it and/or modify
+  it under the terms of the GNU General Public License as published by
+  the Free Software Foundation; either version 2 of the License, or
+  (at your option) any later version.
+
+  This program is distributed in the hope that it will be useful,
+  but WITHOUT ANY WARRANTY; without even the implied warranty of
+  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+  GNU General Public License for more details.
+
+  You should have received a copy of the GNU General Public License along
+  with this program; if not, write to the Free Software Foundation, Inc.,
+  51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
+
+  You can contact the author at :
+  - zstd homepage : http://www.zstd.net/
+*/
+
+#include <stdlib.h>    // malloc, exit
+#include <stdio.h>     // printf
+#include <string.h>    // strerror
+#include <errno.h>     // errno
+#include <sys/stat.h>  // stat
+#include <zstd.h>      // presumes zstd library is installed
+
+
+static off_t fsize_X(const char *filename)
+{
+    struct stat st;
+    if (stat(filename, &st) == 0) return st.st_size;
+    /* error */
+    perror(filename);
+    exit(1);
+}
+
+static FILE* fopen_X(const char *filename, const char *instruction)
+{
+    FILE* const inFile = fopen(filename, instruction);
+    if (inFile) return inFile;
+    /* error */
+    perror(filename);
+    exit(2);
+}
+
+static void* malloc_X(size_t size)
+{
+    void* const buff = malloc(size);
+    if (buff) return buff;
+    /* error */
+    perror(NULL);
+    exit(3);
+}
+
+static void* loadFile_X(const char* fileName, size_t* size)
+{
+    off_t const buffSize = fsize_X(fileName);
+    FILE* const inFile = fopen_X(fileName, "rb");
+    void* const buffer = malloc_X(buffSize);
+    size_t const readSize = fread(buffer, 1, buffSize, inFile);
+    if (readSize != (size_t)buffSize) {
+        fprintf(stderr, "fread: %s : %s \n", fileName, strerror(errno));
+        exit(4);
+    }
+    fclose(inFile);
+    *size = buffSize;
+    return buffer;
+}
+
+static void saveFile_X(const char* fileName, const void* buff, size_t buffSize)
+{
+    FILE* const oFile = fopen_X(fileName, "wb");
+    size_t const wSize = fwrite(buff, 1, buffSize, oFile);
+    if (wSize != (size_t)buffSize) {
+        fprintf(stderr, "fwrite: %s : %s \n", fileName, strerror(errno));
+        exit(5);
+    }
+    if (fclose(oFile)) {
+        perror(fileName);
+        exit(6);
+    }
+}
+
+/* createDict() :
+   `dictFileName` is supposed to have been created using `zstd --train` */
+static const ZSTD_CDict* createDict(const char* dictFileName)
+{
+    size_t dictSize;
+    printf("loading dictionary %s \n", dictFileName);
+    void* const dictBuffer = loadFile_X(dictFileName, &dictSize);
+    const ZSTD_CDict* const cdict = ZSTD_createCDict(dictBuffer, dictSize, 3);
+    free(dictBuffer);
+    return cdict;
+}
+
+
+static void compress(const char* fname, const char* oname, const ZSTD_CDict* cdict)
+{
+    size_t fSize;
+    void* const fBuff = loadFile_X(fname, &fSize);
+    size_t const cBuffSize = ZSTD_compressBound(fSize);
+    void* const cBuff = malloc_X(cBuffSize);
+
+    ZSTD_CCtx* const cctx = ZSTD_createCCtx();
+    size_t const cSize = ZSTD_compress_usingCDict(cctx, cBuff, cBuffSize, fBuff, fSize, cdict);
+    if (ZSTD_isError(cSize)) {
+        fprintf(stderr, "error compressing %s : %s \n", fname, ZSTD_getErrorName(cSize));
+        exit(7);
+    }
+
+    saveFile_X(oname, cBuff, cSize);
+
+    /* success */
+    printf("%25s : %6u -> %7u - %s \n", fname, (unsigned)fSize, (unsigned)cSize, oname);
+
+    ZSTD_freeCCtx(cctx);
+    free(fBuff);
+    free(cBuff);
+}
+
+
+static char* createOutFilename(const char* filename)
+{
+    size_t const inL = strlen(filename);
+    size_t const outL = inL + 5;
+    void* outSpace = malloc_X(outL);
+    memset(outSpace, 0, outL);
+    strcat(outSpace, filename);
+    strcat(outSpace, ".zst");
+    return (char*)outSpace;
+}
+
+int main(int argc, const char** argv)
+{
+    const char* const exeName = argv[0];
+
+    if (argc<3) {
+        fprintf(stderr, "wrong arguments\n");
+        fprintf(stderr, "usage:\n");
+        fprintf(stderr, "%s [FILES] dictionary\n", exeName);
+        return 1;
+    }
+
+    /* load dictionary only once */
+    const char* const dictName = argv[argc-1];
+    const ZSTD_CDict* const dictPtr = createDict(dictName);
+
+    int u;
+    for (u=1; u<argc-1; u++) {
+        const char* inFilename = argv[u];
+        char* const outFilename = createOutFilename(inFilename);
+        compress(inFilename, outFilename, dictPtr);
+        free(outFilename);
+    }
+
+    printf("All %d files compressed. \n", argc-2);
+}
diff --git a/examples/dictionary_decompression.c b/examples/dictionary_decompression.c
new file mode 100644
index 0000000..8c5034b
--- /dev/null
+++ b/examples/dictionary_decompression.c
@@ -0,0 +1,136 @@
+/*
+  Dictionary decompression
+  Educational program using zstd library
+  Copyright (C) Yann Collet 2016
+
+  GPL v2 License
+
+  This program is free software; you can redistribute it and/or modify
+  it under the terms of the GNU General Public License as published by
+  the Free Software Foundation; either version 2 of the License, or
+  (at your option) any later version.
+
+  This program is distributed in the hope that it will be useful,
+  but WITHOUT ANY WARRANTY; without even the implied warranty of
+  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+  GNU General Public License for more details.
+
+  You should have received a copy of the GNU General Public License along
+  with this program; if not, write to the Free Software Foundation, Inc.,
+  51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
+
+  You can contact the author at :
+  - zstd homepage : http://www.zstd.net/
+*/
+
+#include <stdlib.h>    // malloc, exit
+#include <stdio.h>     // printf
+#include <string.h>    // strerror
+#include <errno.h>     // errno
+#include <sys/stat.h>  // stat
+#include <zstd.h>      // presumes zstd library is installed
+
+
+static off_t fsize_X(const char *filename)
+{
+    struct stat st;
+    if (stat(filename, &st) == 0) return st.st_size;
+    /* error */
+    printf("stat: %s : %s \n", filename, strerror(errno));
+    exit(1);
+}
+
+static FILE* fopen_X(const char *filename, const char *instruction)
+{
+    FILE* const inFile = fopen(filename, instruction);
+    if (inFile) return inFile;
+    /* error */
+    printf("fopen: %s : %s \n", filename, strerror(errno));
+    exit(2);
+}
+
+static void* malloc_X(size_t size)
+{
+    void* const buff = malloc(size);
+    if (buff) return buff;
+    /* error */
+    printf("malloc: %s \n", strerror(errno));
+    exit(3);
+}
+
+static void* loadFile_X(const char* fileName, size_t* size)
+{
+    off_t const buffSize = fsize_X(fileName);
+    FILE* const inFile = fopen_X(fileName, "rb");
+    void* const buffer = malloc_X(buffSize);
+    size_t const readSize = fread(buffer, 1, buffSize, inFile);
+    if (readSize != (size_t)buffSize) {
+        printf("fread: %s : %s \n", fileName, strerror(errno));
+        exit(4);
+    }
+    fclose(inFile);
+    *size = buffSize;
+    return buffer;
+}
+
+/* createDict() :
+   `dictFileName` is supposed to have been created using `zstd --train` */
+static const ZSTD_DDict* createDict(const char* dictFileName)
+{
+    size_t dictSize;
+    printf("loading dictionary %s \n", dictFileName);
+    void* const dictBuffer = loadFile_X(dictFileName, &dictSize);
+    const ZSTD_DDict* const ddict = ZSTD_createDDict(dictBuffer, dictSize);
+    free(dictBuffer);
+    return ddict;
+}
+
+
+static void decompress(const char* fname, const ZSTD_DDict* ddict)
+{
+    size_t cSize;
+    void* const cBuff = loadFile_X(fname, &cSize);
+    unsigned long long const rSize = ZSTD_getDecompressedSize(cBuff, cSize);
+    if (rSize==0) {
+        printf("%s : original size unknown \n", fname);
+        exit(5);
+    }
+    void* const rBuff = malloc_X(rSize);
+
+    ZSTD_DCtx* const dctx = ZSTD_createDCtx();
+    size_t const dSize = ZSTD_decompress_usingDDict(dctx, rBuff, rSize, cBuff, cSize, ddict);
+
+    if (dSize != rSize) {
+        printf("error decoding %s : %s \n", fname, ZSTD_getErrorName(dSize));
+        exit(7);
+    }
+
+    /* success */
+    printf("%25s : %6u -> %7u \n", fname, (unsigned)cSize, (unsigned)rSize);
+
+    ZSTD_freeDCtx(dctx);
+    free(rBuff);
+    free(cBuff);
+}
+
+
+int main(int argc, const char** argv)
+{
+    const char* const exeName = argv[0];
+
+    if (argc<3) {
+        printf("wrong arguments\n");
+        printf("usage:\n");
+        printf("%s [FILES] dictionary\n", exeName);
+        return 1;
+    }
+
+    /* load dictionary only once */
+    const char* const dictName = argv[argc-1];
+    const ZSTD_DDict* const dictPtr = createDict(dictName);
+
+    int u;
+    for (u=1; u<argc-1; u++) decompress(argv[u], dictPtr);
+
+    printf("All %d files correctly decoded (in memory) \n", argc-2);
+}
diff --git a/examples/simple_compression.c b/examples/simple_compression.c
new file mode 100644
index 0000000..adff81e
--- /dev/null
+++ b/examples/simple_compression.c
@@ -0,0 +1,142 @@
+/*
+  Simple compression
+  Educational program using zstd library
+  Copyright (C) Yann Collet 2016
+
+  GPL v2 License
+
+  This program is free software; you can redistribute it and/or modify
+  it under the terms of the GNU General Public License as published by
+  the Free Software Foundation; either version 2 of the License, or
+  (at your option) any later version.
+
+  This program is distributed in the hope that it will be useful,
+  but WITHOUT ANY WARRANTY; without even the implied warranty of
+  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+  GNU General Public License for more details.
+
+  You should have received a copy of the GNU General Public License along
+  with this program; if not, write to the Free Software Foundation, Inc.,
+  51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
+
+  You can contact the author at :
+  - zstd homepage : http://www.zstd.net/
+*/
+
+#include <stdlib.h>    // malloc, exit
+#include <stdio.h>     // fprintf, perror
+#include <string.h>    // strerror
+#include <errno.h>     // errno
+#include <sys/stat.h>  // stat
+#include <zstd.h>      // presumes zstd library is installed
+
+
+static off_t fsize_orDie(const char *filename)
+{
+    struct stat st;
+    if (stat(filename, &st) == 0) return st.st_size;
+    /* error */
+    perror(filename);
+    exit(1);
+}
+
+static FILE* fopen_orDie(const char *filename, const char *instruction)
+{
+    FILE* const inFile = fopen(filename, instruction);
+    if (inFile) return inFile;
+    /* error */
+    perror(filename);
+    exit(2);
+}
+
+static void* malloc_orDie(size_t size)
+{
+    void* const buff = malloc(size);
+    if (buff) return buff;
+    /* error */
+    perror(NULL);
+    exit(3);
+}
+
+static void* loadFile_orDie(const char* fileName, size_t* size)
+{
+    off_t const buffSize = fsize_orDie(fileName);
+    FILE* const inFile = fopen_orDie(fileName, "rb");
+    void* const buffer = malloc_orDie(buffSize);
+    size_t const readSize = fread(buffer, 1, buffSize, inFile);
+    if (readSize != (size_t)buffSize) {
+        fprintf(stderr, "fread: %s : %s \n", fileName, strerror(errno));
+        exit(4);
+    }
+    fclose(inFile);
+    *size = buffSize;
+    return buffer;
+}
+
+
+static void saveFile_orDie(const char* fileName, const void* buff, size_t buffSize)
+{
+    FILE* const oFile = fopen_orDie(fileName, "wb");
+    size_t const wSize = fwrite(buff, 1, buffSize, oFile);
+    if (wSize != (size_t)buffSize) {
+        fprintf(stderr, "fwrite: %s : %s \n", fileName, strerror(errno));
+        exit(5);
+    }
+    if (fclose(oFile)) {
+        perror(fileName);
+        exit(6);
+    }
+}
+
+
+static void compress_orDie(const char* fname, const char* oname)
+{
+    size_t fSize;
+    void* const fBuff = loadFile_orDie(fname, &fSize);
+    size_t const cBuffSize = ZSTD_compressBound(fSize);
+    void* const cBuff = malloc_orDie(cBuffSize);
+
+    size_t const cSize = ZSTD_compress(cBuff, cBuffSize, fBuff, fSize, 1);
+    if (ZSTD_isError(cSize)) {
+        fprintf(stderr, "error compressing %s : %s \n", fname, ZSTD_getErrorName(cSize));
+        exit(7);
+    }
+
+    saveFile_orDie(oname, cBuff, cSize);
+
+    /* success */
+    printf("%25s : %6u -> %7u - %s \n", fname, (unsigned)fSize, (unsigned)cSize, oname);
+
+    free(fBuff);
+    free(cBuff);
+}
+
+
+static const char* createOutFilename_orDie(const char* filename)
+{
+    size_t const inL = strlen(filename);
+    size_t const outL = inL + 5;
+    void* outSpace = malloc_orDie(outL);
+    memset(outSpace, 0, outL);
+    strcat(outSpace, filename);
+    strcat(outSpace, ".zst");
+    return (const char*)outSpace;
+}
+
+int main(int argc, const char** argv)
+{
+    const char* const exeName = argv[0];
+    const char* const inFilename = argv[1];
+
+    if (argc!=2) {
+        printf("wrong arguments\n");
+        printf("usage:\n");
+        printf("%s FILE\n", exeName);
+        return 1;
+    }
+
+    const char* const outFilename = createOutFilename_orDie(inFilename);
+    compress_orDie(inFilename, outFilename);
+
+    return 0;
+}
diff --git a/examples/simple_decompression.c b/examples/simple_decompression.c
new file mode 100644
index 0000000..b907afa
--- /dev/null
+++ b/examples/simple_decompression.c
@@ -0,0 +1,119 @@
+/*
+  Simple decompression
+  Educational program using zstd library
+  Copyright (C) Yann Collet 2016
+
+  GPL v2 License
+
+  This program is free software; you can redistribute it and/or modify
+  it under the terms of the GNU General Public License as published by
+  the Free Software Foundation; either version 2 of the License, or
+  (at your option) any later version.
+
+  This program is distributed in the hope that it will be useful,
+  but WITHOUT ANY WARRANTY; without even the implied warranty of
+  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+  GNU General Public License for more details.
+
+  You should have received a copy of the GNU General Public License along
+  with this program; if not, write to the Free Software Foundation, Inc.,
+  51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
+
+  You can contact the author at :
+  - zstd homepage : http://www.zstd.net/
+*/
+
+#include <stdlib.h>    // malloc, exit
+#include <stdio.h>     // printf
+#include <string.h>    // strerror
+#include <errno.h>     // errno
+#include <sys/stat.h>  // stat
+#include <zstd.h>      // presumes zstd library is installed
+
+
+static off_t fsize_X(const char *filename)
+{
+    struct stat st;
+    if (stat(filename, &st) == 0) return st.st_size;
+    /* error */
+    printf("stat: %s : %s \n", filename, strerror(errno));
+    exit(1);
+}
+
+static FILE* fopen_X(const char *filename, const char *instruction)
+{
+    FILE* const inFile = fopen(filename, instruction);
+    if (inFile) return inFile;
+    /* error */
+    printf("fopen: %s : %s \n", filename, strerror(errno));
+    exit(2);
+}
+
+static void* malloc_X(size_t size)
+{
+    void* const buff = malloc(size);
+    if (buff) return buff;
+    /* error */
+    printf("malloc: %s \n", strerror(errno));
+    exit(3);
+}
+
+static void* loadFile_X(const char* fileName, size_t* size)
+{
+    off_t const buffSize = fsize_X(fileName);
+    FILE* const inFile = fopen_X(fileName, "rb");
+    void* const buffer = malloc_X(buffSize);
+    size_t const readSize = fread(buffer, 1, buffSize, inFile);
+    if (readSize != (size_t)buffSize) {
+        printf("fread: %s : %s \n", fileName, strerror(errno));
+        exit(4);
+    }
+    fclose(inFile);
+    *size = buffSize;
+    return buffer;
+}
+
+
+static void decompress(const char* fname)
+{
+    size_t cSize;
+    void* const cBuff = loadFile_X(fname, &cSize);
+    unsigned long long const rSize = ZSTD_getDecompressedSize(cBuff, cSize);
+    if (rSize==0) {
+        printf("%s : original size unknown \n", fname);
+        exit(5);
+    }
+    void* const rBuff = malloc_X(rSize);
+
+    size_t const dSize = ZSTD_decompress(rBuff, rSize, cBuff, cSize);
+
+    if (dSize != rSize) {
+        printf("error decoding %s : %s \n", fname, ZSTD_getErrorName(dSize));
+        exit(7);
+    }
+
+    /* success */
+    printf("%25s : %6u -> %7u \n", fname, (unsigned)cSize, (unsigned)rSize);
+
+    free(rBuff);
+    free(cBuff);
+}
+
+
+int main(int argc, const char** argv)
+{
+    const char* const exeName = argv[0];
+
+    if (argc!=2) {
+        printf("wrong arguments\n");
+        printf("usage:\n");
+        printf("%s FILE\n", exeName);
+        return 1;
+    }
+
+    decompress(argv[1]);
+
+    printf("%s correctly decoded (in memory). \n", argv[1]);
+
+    return 0;
+}
diff --git a/images/Cspeed4.png b/images/Cspeed4.png
index d5219d7..f0ca0ff 100644
Binary files a/images/Cspeed4.png and b/images/Cspeed4.png differ
diff --git a/lib/.gitignore b/lib/.gitignore
new file mode 100644
index 0000000..b43a854
--- /dev/null
+++ b/lib/.gitignore
@@ -0,0 +1,2 @@
+# make install artefact
+libzstd.pc
diff --git a/lib/Makefile b/lib/Makefile
index 76731ab..1b4cb37 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -31,9 +31,9 @@
 # ################################################################
 
 # Version numbers
-LIBVER_MAJOR_SCRIPT:=`sed -n '/define ZSTD_VERSION_MAJOR/s/.*[[:blank:]]\([0-9][0-9]*\).*/\1/p' < ./common/zstd.h`
-LIBVER_MINOR_SCRIPT:=`sed -n '/define ZSTD_VERSION_MINOR/s/.*[[:blank:]]\([0-9][0-9]*\).*/\1/p' < ./common/zstd.h`
-LIBVER_PATCH_SCRIPT:=`sed -n '/define ZSTD_VERSION_RELEASE/s/.*[[:blank:]]\([0-9][0-9]*\).*/\1/p' < ./common/zstd.h`
+LIBVER_MAJOR_SCRIPT:=`sed -n '/define ZSTD_VERSION_MAJOR/s/.*[[:blank:]]\([0-9][0-9]*\).*/\1/p' < ./zstd.h`
+LIBVER_MINOR_SCRIPT:=`sed -n '/define ZSTD_VERSION_MINOR/s/.*[[:blank:]]\([0-9][0-9]*\).*/\1/p' < ./zstd.h`
+LIBVER_PATCH_SCRIPT:=`sed -n '/define ZSTD_VERSION_RELEASE/s/.*[[:blank:]]\([0-9][0-9]*\).*/\1/p' < ./zstd.h`
 LIBVER_SCRIPT:= $(LIBVER_MAJOR_SCRIPT).$(LIBVER_MINOR_SCRIPT).$(LIBVER_PATCH_SCRIPT)
 LIBVER_MAJOR := $(shell echo $(LIBVER_MAJOR_SCRIPT))
 LIBVER_MINOR := $(shell echo $(LIBVER_MINOR_SCRIPT))
@@ -46,9 +46,10 @@ PREFIX ?= /usr/local
 LIBDIR ?= $(PREFIX)/lib
 INCLUDEDIR=$(PREFIX)/include
 
-CPPFLAGS= -I./common -DXXH_NAMESPACE=ZSTD_
+CPPFLAGS= -I. -I./common -DXXH_NAMESPACE=ZSTD_
 CFLAGS ?= -O3
-CFLAGS += -Wall -Wextra -Wcast-qual -Wcast-align -Wshadow -Wstrict-aliasing=1 -Wswitch-enum -Wdeclaration-after-statement -Wstrict-prototypes -Wundef
+CFLAGS += -Wall -Wextra -Wcast-qual -Wcast-align -Wshadow -Wstrict-aliasing=1 \
+          -Wswitch-enum -Wdeclaration-after-statement -Wstrict-prototypes -Wundef
 FLAGS   = $(CPPFLAGS) $(CFLAGS) $(LDFLAGS) $(MOREFLAGS)
 
 
@@ -95,11 +96,12 @@ libzstd: $(ZSTD_FILES)
 
 clean:
 	@rm -f core *.o *.a *.gcda *.$(SHARED_EXT) *.$(SHARED_EXT).* libzstd.pc
+	@rm -f decompress/*.o
 	@echo Cleaning library completed
 
 #------------------------------------------------------------------------
-#make install is validated only for Linux, OSX, kFreeBSD and Hurd targets
-ifneq (,$(filter $(shell uname),Linux Darwin GNU/kFreeBSD GNU))
+#make install is validated only for Linux, OSX, kFreeBSD, Hurd and some BSD targets
+ifneq (,$(filter $(shell uname),Linux Darwin GNU/kFreeBSD GNU FreeBSD DragonFly))
 
 libzstd.pc:
 libzstd.pc: libzstd.pc.in
@@ -117,7 +119,7 @@ install: libzstd libzstd.pc
 	@cp -a libzstd.$(SHARED_EXT) $(DESTDIR)$(LIBDIR)
 	@cp -a libzstd.pc $(DESTDIR)$(LIBDIR)/pkgconfig/
 	@install -m 644 libzstd.a $(DESTDIR)$(LIBDIR)/libzstd.a
-	@install -m 644 common/zstd.h $(DESTDIR)$(INCLUDEDIR)/zstd.h
+	@install -m 644 zstd.h $(DESTDIR)$(INCLUDEDIR)/zstd.h
 	@install -m 644 common/zbuff.h $(DESTDIR)$(INCLUDEDIR)/zbuff.h
 	@install -m 644 dictBuilder/zdict.h $(DESTDIR)$(INCLUDEDIR)/zdict.h
 	@echo zstd static and shared library installed
diff --git a/lib/README.md b/lib/README.md
index 45e8e6f..a3087f0 100644
--- a/lib/README.md
+++ b/lib/README.md
@@ -1,63 +1,57 @@
 zstd - library files
 ================================
 
-The __lib__ directory contains several files, but depending on target use case, some of them may not be necessary.
+The __lib__ directory contains several directories.
+Depending on target use case, it's enough to include only files from relevant directories.
 
-#### Minimal library files
 
-To build the zstd library the following files are required:
+#### API
 
-- [common/bitstream.h](common/bitstream.h)
-- [common/error_private.h](common/error_private.h)
-- [common/error_public.h](common/error_public.h)
-- common/fse.h
-- common/fse_decompress.c
-- common/huf.h
-- [common/mem.h](common/mem.h)
-- [common/zstd.h]
-- common/zstd_internal.h
-- compress/fse_compress.c
-- compress/huf_compress.c
-- compress/zstd_compress.c
-- compress/zstd_opt.h
-- decompress/huf_decompress.c
-- decompress/zstd_decompress.c
+Zstandard's stable API is exposed within [zstd.h](zstd.h),
+at the root of `lib` directory.
 
-Stable API is exposed in [common/zstd.h].
-Advanced and experimental API can be enabled by defining `ZSTD_STATIC_LINKING_ONLY`.
-Never use them with a dynamic library, as their definition may change in future versions.
 
-[common/zstd.h]: common/zstd.h
+#### Advanced API
 
+Some additional API may be useful if you're looking into advanced features :
+- common/error_public.h : transforms `size_t` function results into an `enum`,
+                          for precise error handling.
+- ZSTD_STATIC_LINKING_ONLY : if you define this macro _before_ including `zstd.h`,
+                          it will give access to advanced and experimental API.
+                          These APIs shall ___never be used with a dynamic library___ !
+                          They are not "stable", their definition may change in the future.
+                          Only static linking is allowed.
 
-#### Separate compressor and decompressor
 
-To build a separate zstd compressor all files from `common/` and `compressor/` directories are required.
-In a similar way to build a separate zstd decompressor all files from `common/` and `decompressor/` directories are needed.
+#### Modular build
 
+Directory `common/` is required in all circumstances.
+You can choose to support compression only by adding just the files from the `compress/` directory.
+In a similar way, you can build a decompressor-only library with the `decompress/` directory.
 
-#### Buffered streaming
+Other optional functionalities provided are :
 
-This complementary API makes streaming integration easier.
-It is used by `zstd` command line utility, and [7zip plugin](http://mcmilk.de/projects/7-Zip-ZStd) :
+- `dictBuilder/`  : source files to create dictionaries.
+                    The API can be consulted in `dictBuilder/zdict.h`.
+                    This module also depends on `common/` and `compress/` .
 
-- common/zbuff.h
-- compress/zbuff_compress.c
-- decompress/zbuff_decompress.c
+- `legacy/` : source code to decompress previous versions of zstd, starting from `v0.1`.
+              This module also depends on `common/` and `decompress/` .
+              Note that it's required to compile the library with `ZSTD_LEGACY_SUPPORT = 1` .
+              The main API can be consulted in `legacy/zstd_legacy.h`.
+              Advanced API from each version can be found in its relevant header file.
+              For example, advanced API for version `v0.4` is in `zstd_v04.h` .
 
-#### Dictionary builder
 
-To create dictionaries from training sets :
+#### Streaming API
+
+Streaming is currently provided by `common/zbuff.h`.
 
-- dictBuilder/divsufsort.c
-- dictBuilder/divsufsort.h
-- dictBuilder/zdict.c
-- dictBuilder/zdict.h
 
 #### Miscellaneous
 
 The other files are not source code. There are :
 
  - LICENSE : contains the BSD license text
- - Makefile : script to compile or install zstd library (static or dynamic)
- - libzstd.pc.in : for pkg-config (make install)
+ - Makefile : script to compile or install zstd library (static and dynamic)
+ - libzstd.pc.in : for pkg-config (`make install`)
diff --git a/lib/common/entropy_common.c b/lib/common/entropy_common.c
index b42acb4..acd9669 100644
--- a/lib/common/entropy_common.c
+++ b/lib/common/entropy_common.c
@@ -38,10 +38,9 @@
 #include "mem.h"
 #include "error_private.h"       /* ERR_*, ERROR */
 #define FSE_STATIC_LINKING_ONLY  /* FSE_MIN_TABLELOG */
-#include "fse.h"   /* FSE_isError, FSE_getErrorName */
+#include "fse.h"
 #define HUF_STATIC_LINKING_ONLY  /* HUF_TABLELOG_ABSOLUTEMAX */
-#include "huf.h"   /* HUF_isError, HUF_getErrorName */
-
+#include "huf.h"
 
 
 /*-****************************************
@@ -63,7 +62,7 @@ const char* HUF_getErrorName(size_t code) { return ERR_getErrorName(code); }
 /*-**************************************************************
 *  FSE NCount encoding-decoding
 ****************************************************************/
-static short FSE_abs(short a) { return a<0 ? -a : a; }
+static short FSE_abs(short a) { return (short)(a<0 ? -a : a); }
 
 size_t FSE_readNCount (short* normalizedCounter, unsigned* maxSVPtr, unsigned* tableLogPtr,
                  const void* headerBuffer, size_t hbSize)
@@ -90,22 +89,22 @@ size_t FSE_readNCount (short* normalizedCounter, unsigned* maxSVPtr, unsigned* t
     threshold = 1<<nbBits;
     nbBits++;
 
-    while ((remaining>1) && (charnum<=*maxSVPtr)) {
+    while ((remaining>1) & (charnum<=*maxSVPtr)) {
         if (previous0) {
             unsigned n0 = charnum;
             while ((bitStream & 0xFFFF) == 0xFFFF) {
-                n0+=24;
+                n0 += 24;
                 if (ip < iend-5) {
-                    ip+=2;
+                    ip += 2;
                     bitStream = MEM_readLE32(ip) >> bitCount;
                 } else {
                     bitStream >>= 16;
-                    bitCount+=16;
+                    bitCount   += 16;
             }   }
             while ((bitStream & 3) == 3) {
-                n0+=3;
-                bitStream>>=2;
-                bitCount+=2;
+                n0 += 3;
+                bitStream >>= 2;
+                bitCount += 2;
             }
             n0 += bitStream & 3;
             bitCount += 2;
@@ -115,10 +114,9 @@ size_t FSE_readNCount (short* normalizedCounter, unsigned* maxSVPtr, unsigned* t
                 ip += bitCount>>3;
                 bitCount &= 7;
                 bitStream = MEM_readLE32(ip) >> bitCount;
-            }
-            else
+            } else {
                 bitStream >>= 2;
-        }
+        }   }
         {   short const max = (short)((2*threshold-1)-remaining);
             short count;
 
@@ -148,12 +146,12 @@ size_t FSE_readNCount (short* normalizedCounter, unsigned* maxSVPtr, unsigned* t
                 ip = iend - 4;
             }
             bitStream = MEM_readLE32(ip) >> (bitCount & 31);
-    }   }   /* while ((remaining>1) && (charnum<=*maxSVPtr)) */
-    if (remaining != 1) return ERROR(GENERIC);
+    }   }   /* while ((remaining>1) & (charnum<=*maxSVPtr)) */
+    if (remaining != 1) return ERROR(corruption_detected);
+    if (bitCount > 32) return ERROR(corruption_detected);
     *maxSVPtr = charnum-1;
 
     ip += (bitCount+7)>>3;
-    if ((size_t)(ip-istart) > hbSize) return ERROR(srcSize_wrong);
     return ip-istart;
 }
 
@@ -162,7 +160,7 @@ size_t FSE_readNCount (short* normalizedCounter, unsigned* maxSVPtr, unsigned* t
     Read compact Huffman tree, saved by HUF_writeCTable().
     `huffWeight` is destination buffer.
     @return : size read from `src` , or an error Code .
-    Note : Needed by HUF_readCTable() and HUF_readDTableXn() .
+    Note : Needed by HUF_readCTable() and HUF_readDTableX?() .
 */
 size_t HUF_readStats(BYTE* huffWeight, size_t hwSize, U32* rankStats,
                      U32* nbSymbolsPtr, U32* tableLogPtr,
@@ -173,26 +171,19 @@ size_t HUF_readStats(BYTE* huffWeight, size_t hwSize, U32* rankStats,
     size_t iSize = ip[0];
     size_t oSize;
 
-    //memset(huffWeight, 0, hwSize);   /* is not necessary, even though some analyzer complain ... */
-
-    if (iSize >= 128)  { /* special header */
-        if (iSize >= (242)) {  /* RLE */
-            static U32 l[14] = { 1, 2, 3, 4, 7, 8, 15, 16, 31, 32, 63, 64, 127, 128 };
-            oSize = l[iSize-242];
-            memset(huffWeight, 1, hwSize);
-            iSize = 0;
-        }
-        else {   /* Incompressible */
-            oSize = iSize - 127;
-            iSize = ((oSize+1)/2);
-            if (iSize+1 > srcSize) return ERROR(srcSize_wrong);
-            if (oSize >= hwSize) return ERROR(corruption_detected);
-            ip += 1;
-            {   U32 n;
-                for (n=0; n<oSize; n+=2) {
-                    huffWeight[n]   = ip[n/2] >> 4;
-                    huffWeight[n+1] = ip[n/2] & 15;
-    }   }   }   }
+    /* memset(huffWeight, 0, hwSize);   *//* is not necessary, even though some analyzer complain ... */
+
+    if (iSize >= 128) {  /* special header */
+        oSize = iSize - 127;
+        iSize = ((oSize+1)/2);
+        if (iSize+1 > srcSize) return ERROR(srcSize_wrong);
+        if (oSize >= hwSize) return ERROR(corruption_detected);
+        ip += 1;
+        {   U32 n;
+            for (n=0; n<oSize; n+=2) {
+                huffWeight[n]   = ip[n/2] >> 4;
+                huffWeight[n+1] = ip[n/2] & 15;
+    }   }   }
     else  {   /* header compressed with FSE (normal case) */
         if (iSize+1 > srcSize) return ERROR(srcSize_wrong);
         oSize = FSE_decompress(huffWeight, hwSize-1, ip+1, iSize);   /* max (hwSize-1) values decoded, as last one is implied */
diff --git a/lib/common/error_public.h b/lib/common/error_public.h
index e8cfcc9..29050b3 100644
--- a/lib/common/error_public.h
+++ b/lib/common/error_public.h
@@ -63,7 +63,11 @@ typedef enum {
   ZSTD_error_maxCode
 } ZSTD_ErrorCode;
 
-/* note : compare with size_t function results using ZSTD_getError() */
+/*! ZSTD_getErrorCode() :
+    convert a `size_t` function result into a `ZSTD_ErrorCode` enum type,
+    which can be used to compare directly with enum list published into "error_public.h" */
+ZSTD_ErrorCode ZSTD_getErrorCode(size_t functionResult);
+const char* ZSTD_getErrorString(ZSTD_ErrorCode code);
 
 
 #if defined (__cplusplus)
diff --git a/lib/common/huf.h b/lib/common/huf.h
index 3b837f1..29bab4b 100644
--- a/lib/common/huf.h
+++ b/lib/common/huf.h
@@ -100,7 +100,7 @@ size_t HUF_compress2 (void* dst, size_t dstSize, const void* src, size_t srcSize
 /* *** Constants *** */
 #define HUF_TABLELOG_ABSOLUTEMAX  16   /* absolute limit of HUF_MAX_TABLELOG. Beyond that value, code does not work */
 #define HUF_TABLELOG_MAX  12           /* max configured tableLog (for static allocation); can be modified up to HUF_ABSOLUTEMAX_TABLELOG */
-#define HUF_TABLELOG_DEFAULT  HUF_TABLELOG_MAX   /* tableLog by default, when not specified */
+#define HUF_TABLELOG_DEFAULT  11       /* tableLog by default, when not specified */
 #define HUF_SYMBOLVALUE_MAX 255
 #if (HUF_TABLELOG_MAX > HUF_TABLELOG_ABSOLUTEMAX)
 #  error "HUF_TABLELOG_MAX is too large !"
diff --git a/lib/common/mem.h b/lib/common/mem.h
index 9156bfd..fc7b103 100644
--- a/lib/common/mem.h
+++ b/lib/common/mem.h
@@ -44,19 +44,17 @@ extern "C" {
 ******************************************/
 #include <stddef.h>     /* size_t, ptrdiff_t */
 #include <string.h>     /* memcpy */
-#if defined(_MSC_VER)   /* Visual Studio */
-#   include <stdlib.h>  /* _byteswap_ulong */
-#endif
 
 
 /*-****************************************
 *  Compiler specifics
 ******************************************/
-#if defined(_MSC_VER)
-#   include <intrin.h>   /* _byteswap_ */
+#if defined(_MSC_VER)   /* Visual Studio */
+#   include <stdlib.h>  /* _byteswap_ulong */
+#   include <intrin.h>  /* _byteswap_* */
 #endif
 #if defined(__GNUC__)
-#  define MEM_STATIC static __attribute__((unused))
+#  define MEM_STATIC static __inline __attribute__((unused))
 #elif defined (__cplusplus) || (defined (__STDC_VERSION__) && (__STDC_VERSION__ >= 199901L) /* C99 */)
 #  define MEM_STATIC static inline
 #elif defined(_MSC_VER)
@@ -65,6 +63,10 @@ extern "C" {
 #  define MEM_STATIC static  /* this version may generate warnings for unused static functions; disable the relevant warning */
 #endif
 
+/* code only tested on 32 and 64 bits systems */
+#define MEM_STATIC_ASSERT(c)   { enum { XXH_static_assert = 1/(int)(!!(c)) }; }
+MEM_STATIC void MEM_check(void) { MEM_STATIC_ASSERT((sizeof(size_t)==4) || (sizeof(size_t)==8)); }
+
 
 /*-**************************************************************
 *  Basic Types
@@ -256,6 +258,17 @@ MEM_STATIC void MEM_writeLE16(void* memPtr, U16 val)
     }
 }
 
+MEM_STATIC U32 MEM_readLE24(const void* memPtr)
+{
+    return MEM_readLE16(memPtr) + (((const BYTE*)memPtr)[2] << 16);
+}
+
+MEM_STATIC void MEM_writeLE24(void* memPtr, U32 val)
+{
+    MEM_writeLE16(memPtr, (U16)val);
+    ((BYTE*)memPtr)[2] = (BYTE)(val>>16);
+}
+
 MEM_STATIC U32 MEM_readLE32(const void* memPtr)
 {
     if (MEM_isLittleEndian())
@@ -374,4 +387,3 @@ MEM_STATIC U32 MEM_readMINMATCH(const void* memPtr, U32 length)
 #endif
 
 #endif /* MEM_H_MODULE */
-
diff --git a/lib/common/zbuff.h b/lib/common/zbuff.h
index e449f6d..269dc22 100644
--- a/lib/common/zbuff.h
+++ b/lib/common/zbuff.h
@@ -44,10 +44,8 @@ extern "C" {
 /* ***************************************************************
 *  Compiler specifics
 *****************************************************************/
-/*!
-*  ZSTD_DLL_EXPORT :
-*  Enable exporting of functions when building a Windows DLL
-*/
+/* ZSTD_DLL_EXPORT :
+*  Enable exporting of functions when building a Windows DLL */
 #if defined(_WIN32) && defined(ZSTD_DLL_EXPORT) && (ZSTD_DLL_EXPORT==1)
 #  define ZSTDLIB_API __declspec(dllexport)
 #else
@@ -58,6 +56,12 @@ extern "C" {
 /* *************************************
 *  Streaming functions
 ***************************************/
+/* This is the easier "buffered" streaming API,
+*  using an internal buffer to lift all restrictions on user-provided buffers
+*  which can be any size, any place, for both input and output.
+*  ZBUFF and ZSTD are 100% interoperable,
+*  frames created by one can be decoded by the other one */
+
 typedef struct ZBUFF_CCtx_s ZBUFF_CCtx;
 ZSTDLIB_API ZBUFF_CCtx* ZBUFF_createCCtx(void);
 ZSTDLIB_API size_t      ZBUFF_freeCCtx(ZBUFF_CCtx* cctx);
@@ -103,8 +107,8 @@ ZSTDLIB_API size_t ZBUFF_compressEnd(ZBUFF_CCtx* cctx, void* dst, size_t* dstCap
 *  @return : nb of bytes still present into internal buffer (0 if it's empty)
 *            or an error code, which can be tested using ZBUFF_isError().
 *
-*  Hint : recommended buffer sizes (not compulsory) : ZBUFF_recommendedCInSize / ZBUFF_recommendedCOutSize
-*  input : ZBUFF_recommendedCInSize==128 KB block size is the internal unit, it improves latency to use this value (skipped buffering).
+*  Hint : _recommended buffer_ sizes (not compulsory) : ZBUFF_recommendedCInSize() / ZBUFF_recommendedCOutSize()
+*  input : ZBUFF_recommendedCInSize==128 KB block size is the internal unit, use this value to reduce intermediate stages (better latency)
 *  output : ZBUFF_recommendedCOutSize==ZSTD_compressBound(128 KB) + 3 + 3 : ensures it's always possible to write/flush/end a full block. Skip some buffering.
 *  By using both, it ensures that input will be entirely consumed, and output will always contain the result, reducing intermediate buffering.
 * **************************************************/
@@ -135,8 +139,9 @@ ZSTDLIB_API size_t ZBUFF_decompressContinue(ZBUFF_DCtx* dctx,
 *  The function will report how many bytes were read or written by modifying *srcSizePtr and *dstCapacityPtr.
 *  Note that it may not consume the entire input, in which case it's up to the caller to present remaining input again.
 *  The content of `dst` will be overwritten (up to *dstCapacityPtr) at each function call, so save its content if it matters, or change `dst`.
-*  @return : a hint to preferred nb of bytes to use as input for next function call (it's only a hint, to help latency),
-*            or 0 when a frame is completely decoded,
+*  @return : 0 when a frame is completely decoded and fully flushed,
+*            1 when there is still some data left within internal buffer to flush,
+*            >1 when more data is expected, with value being a suggested next input size (it's just a hint, which helps latency),
 *            or an error code, which can be tested using ZBUFF_isError().
 *
 *  Hint : recommended buffer sizes (not compulsory) : ZBUFF_recommendedDInSize() and ZBUFF_recommendedDOutSize()
@@ -170,11 +175,11 @@ ZSTDLIB_API size_t ZBUFF_recommendedDOutSize(void);
  * ==================================================================================== */
 
 /*--- Dependency ---*/
-#define ZSTD_STATIC_LINKING_ONLY   /* ZSTD_parameters */
+#define ZSTD_STATIC_LINKING_ONLY   /* ZSTD_parameters, ZSTD_customMem */
 #include "zstd.h"
 
 
-/*--- External memory ---*/
+/*--- Custom memory allocator ---*/
 /*! ZBUFF_createCCtx_advanced() :
  *  Create a ZBUFF compression context using external alloc and free functions */
 ZSTDLIB_API ZBUFF_CCtx* ZBUFF_createCCtx_advanced(ZSTD_customMem customMem);
@@ -184,10 +189,10 @@ ZSTDLIB_API ZBUFF_CCtx* ZBUFF_createCCtx_advanced(ZSTD_customMem customMem);
 ZSTDLIB_API ZBUFF_DCtx* ZBUFF_createDCtx_advanced(ZSTD_customMem customMem);
 
 
-/*--- Advanced Streaming function ---*/
+/*--- Advanced Streaming Initialization ---*/
 ZSTDLIB_API size_t ZBUFF_compressInit_advanced(ZBUFF_CCtx* zbc,
                                                const void* dict, size_t dictSize,
-                                               ZSTD_parameters params, U64 pledgedSrcSize);
+                                               ZSTD_parameters params, unsigned long long pledgedSrcSize);
 
 #endif /* ZBUFF_STATIC_LINKING_ONLY */
 
diff --git a/lib/common/zstd_internal.h b/lib/common/zstd_internal.h
index 0909955..0a1935a 100644
--- a/lib/common/zstd_internal.h
+++ b/lib/common/zstd_internal.h
@@ -51,9 +51,10 @@
 /*-*************************************
 *  Common constants
 ***************************************/
-#define ZSTD_OPT_DEBUG 0     // 3 = compression stats;  5 = check encoded sequences;  9 = full logs
-#include <stdio.h>
+#define ZSTD_OPT_DEBUG 0     /* 3 = compression stats;  5 = check encoded sequences;  9 = full logs */
 #if defined(ZSTD_OPT_DEBUG) && ZSTD_OPT_DEBUG>=9
+    #include <stdio.h>
+    #include <stdlib.h>
     #define ZSTD_LOG_PARSER(...) printf(__VA_ARGS__)
     #define ZSTD_LOG_ENCODE(...) printf(__VA_ARGS__)
     #define ZSTD_LOG_BLOCK(...) printf(__VA_ARGS__)
@@ -64,10 +65,10 @@
 #endif
 
 #define ZSTD_OPT_NUM    (1<<12)
-#define ZSTD_DICT_MAGIC  0xEC30A437   /* v0.7 */
+#define ZSTD_DICT_MAGIC  0xEC30A437   /* v0.7+ */
 
-#define ZSTD_REP_NUM    3
-#define ZSTD_REP_INIT   ZSTD_REP_NUM
+#define ZSTD_REP_NUM    3                 /* number of repcodes */
+#define ZSTD_REP_CHECK  (ZSTD_REP_NUM-0)  /* number of repcodes to check by the optimal parser */
 #define ZSTD_REP_MOVE   (ZSTD_REP_NUM-1)
 static const U32 repStartValue[ZSTD_REP_NUM] = { 1, 4, 8 };
 
@@ -88,13 +89,13 @@ static const size_t ZSTD_did_fieldSize[4] = { 0, 1, 2, 4 };
 
 #define ZSTD_BLOCKHEADERSIZE 3   /* C standard doesn't allow `static const` variable to be init using another `static const` variable */
 static const size_t ZSTD_blockHeaderSize = ZSTD_BLOCKHEADERSIZE;
-typedef enum { bt_compressed, bt_raw, bt_rle, bt_end } blockType_t;
+typedef enum { bt_raw, bt_rle, bt_compressed, bt_reserved } blockType_e;
 
 #define MIN_SEQUENCES_SIZE 1 /* nbSeq==0 */
 #define MIN_CBLOCK_SIZE (1 /*litCSize*/ + 1 /* RLE or RAW */ + MIN_SEQUENCES_SIZE /* nbSeq==0 */)   /* for a non-null block */
 
 #define HufLog 12
-typedef enum { lbt_huffman, lbt_repeat, lbt_raw, lbt_rle } litBlockType_t;
+typedef enum { set_basic, set_rle, set_compressed, set_repeat } symbolEncodingType_e;
 
 #define LONGNBSEQ 0x7F00
 
@@ -111,11 +112,6 @@ typedef enum { lbt_huffman, lbt_repeat, lbt_raw, lbt_rle } litBlockType_t;
 #define LLFSELog    9
 #define OffFSELog   8
 
-#define FSE_ENCODING_RAW     0
-#define FSE_ENCODING_RLE     1
-#define FSE_ENCODING_STATIC  2
-#define FSE_ENCODING_DYNAMIC 3
-
 static const U32 LL_bits[MaxLL+1] = { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
                                       1, 1, 1, 1, 2, 2, 3, 3, 4, 6, 7, 8, 9,10,11,12,
                                      13,14,15,16 };
@@ -174,7 +170,7 @@ typedef struct {
     U32 off;
     U32 mlen;
     U32 litlen;
-    U32 rep[ZSTD_REP_INIT];
+    U32 rep[ZSTD_REP_NUM];
 } ZSTD_optimal_t;
 
 #if ZSTD_OPT_DEBUG == 3
@@ -187,19 +183,22 @@ typedef struct {
     MEM_STATIC void ZSTD_statsUpdatePrices(ZSTD_stats_t* stats, size_t litLength, const BYTE* literals, size_t offset, size_t matchLength) { (void)stats; (void)litLength; (void)literals; (void)offset; (void)matchLength; }
 #endif   /* #if ZSTD_OPT_DEBUG == 3 */
 
+
+typedef struct seqDef_s {
+    U32 offset;
+    U16 litLength;
+    U16 matchLength;
+} seqDef;
+
+
 typedef struct {
-    void* buffer;
-    U32*  offsetStart;
-    U32*  offset;
-    BYTE* offCodeStart;
+    seqDef* sequencesStart;
+    seqDef* sequences;
     BYTE* litStart;
     BYTE* lit;
-    U16*  litLengthStart;
-    U16*  litLength;
-    BYTE* llCodeStart;
-    U16*  matchLengthStart;
-    U16*  matchLength;
-    BYTE* mlCodeStart;
+    BYTE* llCode;
+    BYTE* mlCode;
+    BYTE* ofCode;
     U32   longLengthID;   /* 0 == no longLength; 1 == Lit.longLength; 2 == Match.longLength; */
     U32   longLengthPos;
     /* opt */
@@ -227,12 +226,37 @@ typedef struct {
 } seqStore_t;
 
 const seqStore_t* ZSTD_getSeqStore(const ZSTD_CCtx* ctx);
-void ZSTD_seqToCodes(const seqStore_t* seqStorePtr, size_t const nbSeq);
+void ZSTD_seqToCodes(const seqStore_t* seqStorePtr);
 int ZSTD_isSkipFrame(ZSTD_DCtx* dctx);
 
 /* custom memory allocation functions */
 void* ZSTD_defaultAllocFunction(void* opaque, size_t size);
 void ZSTD_defaultFreeFunction(void* opaque, void* address);
-static ZSTD_customMem const defaultCustomMem = { ZSTD_defaultAllocFunction, ZSTD_defaultFreeFunction, NULL };
+static const ZSTD_customMem defaultCustomMem = { ZSTD_defaultAllocFunction, ZSTD_defaultFreeFunction, NULL };
+
+/*======  common function  ======*/
+
+MEM_STATIC U32 ZSTD_highbit32(U32 val)
+{
+#   if defined(_MSC_VER)   /* Visual */
+    unsigned long r=0;
+    _BitScanReverse(&r, val);
+    return (unsigned)r;
+#   elif defined(__GNUC__) && (__GNUC__ >= 3)   /* GCC Intrinsic */
+    return 31 - __builtin_clz(val);
+#   else   /* Software version */
+    static const int DeBruijnClz[32] = { 0, 9, 1, 10, 13, 21, 2, 29, 11, 14, 16, 18, 22, 25, 3, 30, 8, 12, 20, 28, 15, 17, 24, 7, 19, 27, 23, 6, 26, 5, 4, 31 };
+    U32 v = val;
+    int r;
+    v |= v >> 1;
+    v |= v >> 2;
+    v |= v >> 4;
+    v |= v >> 8;
+    v |= v >> 16;
+    r = DeBruijnClz[(U32)(v * 0x07C4ACDDU) >> 27];
+    return r;
+#   endif
+}
+
 
 #endif   /* ZSTD_CCOMMON_H_MODULE */
diff --git a/lib/compress/fse_compress.c b/lib/compress/fse_compress.c
index 192d550..386b2c0 100644
--- a/lib/compress/fse_compress.c
+++ b/lib/compress/fse_compress.c
@@ -190,7 +190,7 @@ size_t FSE_NCountWriteBound(unsigned maxSymbolValue, unsigned tableLog)
     return maxSymbolValue ? maxHeaderSize : FSE_NCOUNTBOUND;  /* maxSymbolValue==0 ? use default */
 }
 
-static short FSE_abs(short a) { return a<0 ? -a : a; }
+static short FSE_abs(short a) { return (short)(a<0 ? -a : a); }
 
 static size_t FSE_writeNCount_generic (void* header, size_t headerBufferSize,
                                        const short* normalizedCounter, unsigned maxSymbolValue, unsigned tableLog,
diff --git a/lib/compress/huf_compress.c b/lib/compress/huf_compress.c
index 3533bb6..86a53c2 100644
--- a/lib/compress/huf_compress.c
+++ b/lib/compress/huf_compress.c
@@ -105,66 +105,37 @@ size_t HUF_writeCTable (void* dst, size_t maxDstSize,
                         const HUF_CElt* CTable, U32 maxSymbolValue, U32 huffLog)
 {
     BYTE bitsToWeight[HUF_TABLELOG_MAX + 1];
-    BYTE huffWeight[HUF_SYMBOLVALUE_MAX + 1];
-    U32 n;
+    BYTE huffWeight[HUF_SYMBOLVALUE_MAX];
     BYTE* op = (BYTE*)dst;
-    size_t size;
+    U32 n;
 
      /* check conditions */
-    if (maxSymbolValue > HUF_SYMBOLVALUE_MAX + 1)
-        return ERROR(GENERIC);
+    if (maxSymbolValue > HUF_SYMBOLVALUE_MAX) return ERROR(GENERIC);
 
     /* convert to weight */
     bitsToWeight[0] = 0;
-    for (n=1; n<=huffLog; n++)
+    for (n=1; n<huffLog+1; n++)
         bitsToWeight[n] = (BYTE)(huffLog + 1 - n);
     for (n=0; n<maxSymbolValue; n++)
         huffWeight[n] = bitsToWeight[CTable[n].nbBits];
 
-    size = FSE_compress(op+1, maxDstSize-1, huffWeight, maxSymbolValue);   /* don't need last symbol stat : implied */
-    if (HUF_isError(size)) return size;
-    if (size >= 128) return ERROR(GENERIC);   /* should never happen, since maxSymbolValue <= 255 */
-    if ((size <= 1) || (size >= maxSymbolValue/2)) {
-        if (size==1) {  /* RLE */
-            /* only possible case : series of 1 (because there are at least 2) */
-            /* can only be 2^n or (2^n-1), otherwise not an huffman tree */
-            BYTE code;
-            switch(maxSymbolValue)
-            {
-            case 1: code = 0; break;
-            case 2: code = 1; break;
-            case 3: code = 2; break;
-            case 4: code = 3; break;
-            case 7: code = 4; break;
-            case 8: code = 5; break;
-            case 15: code = 6; break;
-            case 16: code = 7; break;
-            case 31: code = 8; break;
-            case 32: code = 9; break;
-            case 63: code = 10; break;
-            case 64: code = 11; break;
-            case 127: code = 12; break;
-            case 128: code = 13; break;
-            default : return ERROR(corruption_detected);
-            }
-            op[0] = (BYTE)(255-13 + code);
-            return 1;
-        }
-         /* Not compressible */
-        if (maxSymbolValue > (241-128)) return ERROR(GENERIC);   /* not implemented (not possible with current format) */
-        if (((maxSymbolValue+1)/2) + 1 > maxDstSize) return ERROR(dstSize_tooSmall);   /* not enough space within dst buffer */
-        op[0] = (BYTE)(128 /*special case*/ + 0 /* Not Compressible */ + (maxSymbolValue-1));
-        huffWeight[maxSymbolValue] = 0;   /* to be sure it doesn't cause issue in final combination */
-        for (n=0; n<maxSymbolValue; n+=2)
-            op[(n/2)+1] = (BYTE)((huffWeight[n] << 4) + huffWeight[n+1]);
-        return ((maxSymbolValue+1)/2) + 1;
-    }
+    {   size_t const size = FSE_compress(op+1, maxDstSize-1, huffWeight, maxSymbolValue);
+        if (FSE_isError(size)) return size;
+        if ((size>1) & (size < maxSymbolValue/2)) {   /* FSE compressed */
+            op[0] = (BYTE)size;
+            return size+1;
+    }   }
 
-    /* normal header case */
-    op[0] = (BYTE)size;
-    return size+1;
-}
+    /* raw values */
+    if (maxSymbolValue > (256-128)) return ERROR(GENERIC);   /* should not happen */
+    if (((maxSymbolValue+1)/2) + 1 > maxDstSize) return ERROR(dstSize_tooSmall);   /* not enough space within dst buffer */
+    op[0] = (BYTE)(128 /*special case*/ + (maxSymbolValue-1));
+    huffWeight[maxSymbolValue] = 0;   /* to be sure it doesn't cause issue in final combination */
+    for (n=0; n<maxSymbolValue; n+=2)
+        op[(n/2)+1] = (BYTE)((huffWeight[n] << 4) + huffWeight[n+1]);
+    return ((maxSymbolValue+1)/2) + 1;
 
+}
 
 
 size_t HUF_readCTable (HUF_CElt* CTable, U32 maxSymbolValue, const void* src, size_t srcSize)
@@ -174,7 +145,7 @@ size_t HUF_readCTable (HUF_CElt* CTable, U32 maxSymbolValue, const void* src, si
     U32 tableLog = 0;
     size_t readSize;
     U32 nbSymbols = 0;
-    //memset(huffWeight, 0, sizeof(huffWeight));   /* is not necessary, even though some analyzer complain ... */
+    /*memset(huffWeight, 0, sizeof(huffWeight));*/   /* is not necessary, even though some analyzer complain ... */
 
     /* get symbol weights */
     readSize = HUF_readStats(huffWeight, HUF_SYMBOLVALUE_MAX+1, rankVal, &nbSymbols, &tableLog, src, srcSize);
@@ -193,10 +164,10 @@ size_t HUF_readCTable (HUF_CElt* CTable, U32 maxSymbolValue, const void* src, si
     }   }
 
     /* fill nbBits */
-    { U32 n; for (n=0; n<nbSymbols; n++) {
-        const U32 w = huffWeight[n];
-        CTable[n].nbBits = (BYTE)(tableLog + 1 - w);
-    }}
+    {   U32 n; for (n=0; n<nbSymbols; n++) {
+            const U32 w = huffWeight[n];
+            CTable[n].nbBits = (BYTE)(tableLog + 1 - w);
+    }   }
 
     /* fill val */
     {   U16 nbPerRank[HUF_TABLELOG_MAX+1] = {0};
@@ -239,7 +210,7 @@ static U32 HUF_setMaxHeight(nodeElt* huffNode, U32 lastNonNull, U32 maxNbBits)
 
         /* repay normalized cost */
         {   U32 const noSymbol = 0xF0F0F0F0;
-            U32 rankLast[HUF_TABLELOG_MAX+1];
+            U32 rankLast[HUF_TABLELOG_MAX+2];
             int pos;
 
             /* Get pos of last (smallest) symbol per rank */
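Note on the "raw values" path of HUF_writeCTable above: when FSE compression of the weights does not pay off, the weights are stored two per byte behind a one-byte header of 128 + (maxSymbolValue-1). The sketch below illustrates that layout under some assumptions: the name pack_raw_weights is hypothetical, every weight is assumed to fit in 4 bits, and failures return 0 rather than the library's error codes. The zero-count check is an addition of this sketch, not something the diff shows.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: pack nbWeights 4-bit weights, two per byte, behind a
 * one-byte header, mirroring the raw-values layout in the diff.
 * Returns the number of bytes written, or 0 on invalid input or
 * insufficient capacity (a simplification of the library's error codes). */
static size_t pack_raw_weights(uint8_t* dst, size_t dstCapacity,
                               const uint8_t* weights, unsigned nbWeights)
{
    size_t const total = ((size_t)nbWeights + 1) / 2 + 1;   /* header + nibble pairs */
    unsigned n;
    if (nbWeights == 0 || nbWeights > 128) return 0;   /* header must stay within one byte */
    if (total > dstCapacity) return 0;
    dst[0] = (uint8_t)(128 + (nbWeights - 1));
    for (n = 0; n < nbWeights; n += 2) {
        /* an odd count pads the final low nibble with 0 */
        uint8_t const lo = (n + 1 < nbWeights) ? weights[n + 1] : 0;
        dst[n / 2 + 1] = (uint8_t)((weights[n] << 4) + lo);
    }
    return total;
}
```

The diff obtains the same padding by writing a 0 into huffWeight[maxSymbolValue] before the loop; the sketch pads inside the loop instead, so it never writes to the input array.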
diff --git a/lib/compress/zbuff_compress.c b/lib/compress/zbuff_compress.c
index 6ed5e52..5d92918 100644
--- a/lib/compress/zbuff_compress.c
+++ b/lib/compress/zbuff_compress.c
@@ -46,7 +46,7 @@
 static size_t const ZBUFF_endFrameSize = ZSTD_BLOCKHEADERSIZE;
 
 
-/*_**************************************************
+/*-***********************************************************
 *  Streaming compression
 *
 *  A ZBUFF_CCtx object is required to track streaming operation.
@@ -77,7 +77,7 @@ static size_t const ZBUFF_endFrameSize = ZSTD_BLOCKHEADERSIZE;
 *  Hint : recommended buffer sizes (not compulsory)
 *  input : ZSTD_BLOCKSIZE_MAX (128 KB), internal unit size, it improves latency to use this value.
 *  output : ZSTD_compressBound(ZSTD_BLOCKSIZE_MAX) + ZSTD_blockHeaderSize + ZBUFF_endFrameSize : ensures it's always possible to write/flush/end a full block at best speed.
-* **************************************************/
+* ***********************************************************/
 
 typedef enum { ZBUFFcs_init, ZBUFFcs_load, ZBUFFcs_flush, ZBUFFcs_final } ZBUFF_cStage;
 
@@ -95,6 +95,8 @@ struct ZBUFF_CCtx_s {
     size_t outBuffContentSize;
     size_t outBuffFlushedSize;
     ZBUFF_cStage stage;
+    U32    checksum;
+    U32    frameEnded;
     ZSTD_customMem customMem;
 };   /* typedef'd tp ZBUFF_CCtx within "zstd_buffered.h" */
 
@@ -133,11 +135,11 @@ size_t ZBUFF_freeCCtx(ZBUFF_CCtx* zbc)
 }
 
 
-/* *** Initialization *** */
+/* ======   Initialization   ====== */
 
 size_t ZBUFF_compressInit_advanced(ZBUFF_CCtx* zbc,
                                    const void* dict, size_t dictSize,
-                                   ZSTD_parameters params, U64 pledgedSrcSize)
+                                   ZSTD_parameters params, unsigned long long pledgedSrcSize)
 {
     /* allocate buffers */
     {   size_t const neededInBuffSize = (size_t)1 << params.cParams.windowLog;
@@ -147,7 +149,7 @@ size_t ZBUFF_compressInit_advanced(ZBUFF_CCtx* zbc,
             zbc->inBuff = (char*)zbc->customMem.customAlloc(zbc->customMem.opaque, neededInBuffSize);
             if (zbc->inBuff == NULL) return ERROR(memory_allocation);
         }
-        zbc->blockSize = MIN(ZSTD_BLOCKSIZE_MAX, neededInBuffSize);
+        zbc->blockSize = MIN(ZSTD_BLOCKSIZE_ABSOLUTEMAX, neededInBuffSize);
     }
     if (zbc->outBuffSize < ZSTD_compressBound(zbc->blockSize)+1) {
         zbc->outBuffSize = ZSTD_compressBound(zbc->blockSize)+1;
@@ -164,15 +166,15 @@ size_t ZBUFF_compressInit_advanced(ZBUFF_CCtx* zbc,
     zbc->inBuffTarget = zbc->blockSize;
     zbc->outBuffContentSize = zbc->outBuffFlushedSize = 0;
     zbc->stage = ZBUFFcs_load;
+    zbc->checksum = params.fParams.checksumFlag > 0;
+    zbc->frameEnded = 0;
     return 0;   /* ready to go */
 }
 
 
 size_t ZBUFF_compressInitDictionary(ZBUFF_CCtx* zbc, const void* dict, size_t dictSize, int compressionLevel)
 {
-    ZSTD_parameters params;
-    memset(&params, 0, sizeof(params));
-    params.cParams = ZSTD_getCParams(compressionLevel, 0, dictSize);
+    ZSTD_parameters const params = ZSTD_getParams(compressionLevel, 0, dictSize);
     return ZBUFF_compressInit_advanced(zbc, dict, dictSize, params, 0);
 }
 
@@ -191,14 +193,16 @@ MEM_STATIC size_t ZBUFF_limitCopy(void* dst, size_t dstCapacity, const void* src
 }
 
 
-/* *** Compression *** */
+/* ======   Compression   ====== */
+
+typedef enum { zbf_gather, zbf_flush, zbf_end } ZBUFF_flush_e;
 
 static size_t ZBUFF_compressContinue_generic(ZBUFF_CCtx* zbc,
                               void* dst, size_t* dstCapacityPtr,
                         const void* src, size_t* srcSizePtr,
-                              int flush)
+                              ZBUFF_flush_e const flush)
 {
-    U32 notDone = 1;
+    U32 someMoreWork = 1;
     const char* const istart = (const char*)src;
     const char* const iend = istart + *srcSizePtr;
     const char* ip = istart;
@@ -206,7 +210,7 @@ static size_t ZBUFF_compressContinue_generic(ZBUFF_CCtx* zbc,
     char* const oend = ostart + *dstCapacityPtr;
     char* op = ostart;
 
-    while (notDone) {
+    while (someMoreWork) {
         switch(zbc->stage)
         {
         case ZBUFFcs_init: return ERROR(init_missing);   /* call ZBUFF_compressInit() first ! */
@@ -218,7 +222,7 @@ static size_t ZBUFF_compressContinue_generic(ZBUFF_CCtx* zbc,
                 zbc->inBuffPos += loaded;
                 ip += loaded;
                 if ( (zbc->inBuffPos==zbc->inToCompress) || (!flush && (toLoad != loaded)) ) {
-                    notDone = 0; break;  /* not enough input to get a full block : stop there, wait for more */
+                    someMoreWork = 0; break;  /* not enough input to get a full block : stop there, wait for more */
             }   }
             /* compress current block (note : this stage cannot be stopped in the middle) */
             {   void* cDst;
@@ -229,8 +233,11 @@ static size_t ZBUFF_compressContinue_generic(ZBUFF_CCtx* zbc,
                     cDst = op;   /* compress directly into output buffer (avoid flush stage) */
                 else
                     cDst = zbc->outBuff, oSize = zbc->outBuffSize;
-                cSize = ZSTD_compressContinue(zbc->zc, cDst, oSize, zbc->inBuff + zbc->inToCompress, iSize);
+                cSize = (flush == zbf_end) ?
+                        ZSTD_compressEnd(zbc->zc, cDst, oSize, zbc->inBuff + zbc->inToCompress, iSize) :
+                        ZSTD_compressContinue(zbc->zc, cDst, oSize, zbc->inBuff + zbc->inToCompress, iSize);
                 if (ZSTD_isError(cSize)) return cSize;
+                if (flush == zbf_end) zbc->frameEnded = 1;
                 /* prepare next block */
                 zbc->inBuffTarget = zbc->inBuffPos + zbc->blockSize;
                 if (zbc->inBuffTarget > zbc->inBuffSize)
@@ -247,14 +254,14 @@ static size_t ZBUFF_compressContinue_generic(ZBUFF_CCtx* zbc,
                 size_t const flushed = ZBUFF_limitCopy(op, oend-op, zbc->outBuff + zbc->outBuffFlushedSize, toFlush);
                 op += flushed;
                 zbc->outBuffFlushedSize += flushed;
-                if (toFlush!=flushed) { notDone = 0; break; } /* dst too small to store flushed data : stop there */
+                if (toFlush!=flushed) { someMoreWork = 0; break; } /* dst too small to store flushed data : stop there */
                 zbc->outBuffContentSize = zbc->outBuffFlushedSize = 0;
                 zbc->stage = ZBUFFcs_load;
                 break;
             }
 
         case ZBUFFcs_final:
-            notDone = 0;   /* do nothing */
+            someMoreWork = 0;   /* do nothing */
             break;
 
         default:
@@ -264,6 +271,7 @@ static size_t ZBUFF_compressContinue_generic(ZBUFF_CCtx* zbc,
 
     *srcSizePtr = ip - istart;
     *dstCapacityPtr = op - ostart;
+    if (zbc->frameEnded) return 0;
     {   size_t hintInSize = zbc->inBuffTarget - zbc->inBuffPos;
         if (hintInSize==0) hintInSize = zbc->blockSize;
         return hintInSize;
@@ -274,17 +282,17 @@ size_t ZBUFF_compressContinue(ZBUFF_CCtx* zbc,
                               void* dst, size_t* dstCapacityPtr,
                         const void* src, size_t* srcSizePtr)
 {
-    return ZBUFF_compressContinue_generic(zbc, dst, dstCapacityPtr, src, srcSizePtr, 0);
+    return ZBUFF_compressContinue_generic(zbc, dst, dstCapacityPtr, src, srcSizePtr, zbf_gather);
 }
 
 
 
-/* *** Finalize *** */
+/* ======   Finalize   ====== */
 
 size_t ZBUFF_compressFlush(ZBUFF_CCtx* zbc, void* dst, size_t* dstCapacityPtr)
 {
     size_t srcSize = 0;
-    ZBUFF_compressContinue_generic(zbc, dst, dstCapacityPtr, &srcSize, &srcSize, 1);  /* use a valid src address instead of NULL */
+    ZBUFF_compressContinue_generic(zbc, dst, dstCapacityPtr, &srcSize, &srcSize, zbf_flush);  /* use a valid src address instead of NULL */
     return zbc->outBuffContentSize - zbc->outBuffFlushedSize;
 }
 
@@ -298,15 +306,18 @@ size_t ZBUFF_compressEnd(ZBUFF_CCtx* zbc, void* dst, size_t* dstCapacityPtr)
     if (zbc->stage != ZBUFFcs_final) {
         /* flush whatever remains */
         size_t outSize = *dstCapacityPtr;
-        size_t const remainingToFlush = ZBUFF_compressFlush(zbc, dst, &outSize);
+        size_t srcSize = 0;
+        size_t const notEnded = ZBUFF_compressContinue_generic(zbc, dst, &outSize, &srcSize, &srcSize, zbf_end);  /* use a valid address instead of NULL */
+        size_t const remainingToFlush = zbc->outBuffContentSize - zbc->outBuffFlushedSize;
         op += outSize;
         if (remainingToFlush) {
             *dstCapacityPtr = op-ostart;
-            return remainingToFlush + ZBUFF_endFrameSize;
+            return remainingToFlush + ZBUFF_endFrameSize + (zbc->checksum * 4);
         }
         /* create epilogue */
         zbc->stage = ZBUFFcs_final;
-        zbc->outBuffContentSize = ZSTD_compressEnd(zbc->zc, zbc->outBuff, zbc->outBuffSize); /* epilogue into outBuff */
+        zbc->outBuffContentSize = !notEnded ? 0 :
+            ZSTD_compressEnd(zbc->zc, zbc->outBuff, zbc->outBuffSize, NULL, 0);  /* write epilogue into outBuff */
     }
 
     /* flush epilogue */
@@ -325,5 +336,5 @@ size_t ZBUFF_compressEnd(ZBUFF_CCtx* zbc, void* dst, size_t* dstCapacityPtr)
 /* *************************************
 *  Tool functions
 ***************************************/
-size_t ZBUFF_recommendedCInSize(void)  { return ZSTD_BLOCKSIZE_MAX; }
-size_t ZBUFF_recommendedCOutSize(void) { return ZSTD_compressBound(ZSTD_BLOCKSIZE_MAX) + ZSTD_blockHeaderSize + ZBUFF_endFrameSize; }
+size_t ZBUFF_recommendedCInSize(void)  { return ZSTD_BLOCKSIZE_ABSOLUTEMAX; }
+size_t ZBUFF_recommendedCOutSize(void) { return ZSTD_compressBound(ZSTD_BLOCKSIZE_ABSOLUTEMAX) + ZSTD_blockHeaderSize + ZBUFF_endFrameSize; }
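Note on the early return in ZBUFF_compressEnd above: when buffered output could not be fully flushed, the function now returns a hint that also counts the optional checksum, namely remainingToFlush + ZBUFF_endFrameSize + (checksum * 4). The sketch below models only that branch. The name end_hint is hypothetical, and the 3-byte end-of-frame size is an assumption (ZBUFF_endFrameSize is defined as ZSTD_BLOCKHEADERSIZE, whose value is not shown in this diff).

```c
#include <stddef.h>

/* Assumed sizes: a 3-byte end-of-frame marker and a 4-byte checksum. */
enum { END_FRAME_SIZE = 3, CHECKSUM_SIZE = 4 };

/* Hypothetical model of the hint returned while output is still pending.
 * Precondition: remainingToFlush > 0 (the only case this branch covers). */
static size_t end_hint(size_t remainingToFlush, unsigned checksumFlag)
{
    return remainingToFlush + END_FRAME_SIZE
         + (checksumFlag ? (size_t)CHECKSUM_SIZE : 0);
}
```

A caller would typically treat a non-zero return as "call ZBUFF_compressEnd again with more output space".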
diff --git a/lib/compress/zstd_compress.c b/lib/compress/zstd_compress.c
index 42cf648..56c6360 100644
--- a/lib/compress/zstd_compress.c
+++ b/lib/compress/zstd_compress.c
@@ -66,6 +66,8 @@
 *  Constants
 ***************************************/
 static const U32 g_searchStrength = 8;   /* control skip over incompressible data */
+#define HASH_READ_SIZE 8
+typedef enum { ZSTDcs_created=0, ZSTDcs_init, ZSTDcs_ongoing, ZSTDcs_ending } ZSTD_compressionStage_e;
 
 
 /*-*************************************
@@ -73,37 +75,14 @@ static const U32 g_searchStrength = 8;   /* control skip over incompressible dat
 ***************************************/
 size_t ZSTD_compressBound(size_t srcSize) { return FSE_compressBound(srcSize) + 12; }
 
-static U32 ZSTD_highbit32(U32 val)
-{
-#   if defined(_MSC_VER)   /* Visual */
-    unsigned long r=0;
-    _BitScanReverse(&r, val);
-    return (unsigned)r;
-#   elif defined(__GNUC__) && (__GNUC__ >= 3)   /* GCC Intrinsic */
-    return 31 - __builtin_clz(val);
-#   else   /* Software version */
-    static const int DeBruijnClz[32] = { 0, 9, 1, 10, 13, 21, 2, 29, 11, 14, 16, 18, 22, 25, 3, 30, 8, 12, 20, 28, 15, 17, 24, 7, 19, 27, 23, 6, 26, 5, 4, 31 };
-    U32 v = val;
-    int r;
-    v |= v >> 1;
-    v |= v >> 2;
-    v |= v >> 4;
-    v |= v >> 8;
-    v |= v >> 16;
-    r = DeBruijnClz[(U32)(v * 0x07C4ACDDU) >> 27];
-    return r;
-#   endif
-}
 
 /*-*************************************
 *  Sequence storage
 ***************************************/
 static void ZSTD_resetSeqStore(seqStore_t* ssPtr)
 {
-    ssPtr->offset = ssPtr->offsetStart;
     ssPtr->lit = ssPtr->litStart;
-    ssPtr->litLength = ssPtr->litLengthStart;
-    ssPtr->matchLength = ssPtr->matchLengthStart;
+    ssPtr->sequences = ssPtr->sequencesStart;
     ssPtr->longLengthID = 0;
 }
 
@@ -122,7 +101,7 @@ struct ZSTD_CCtx_s
     U32   nextToUpdate3;    /* index from which to continue dictionary update */
     U32   hashLog3;         /* dispatch table : larger == faster, more memory */
     U32   loadedDictEnd;
-    U32   stage;            /* 0: created; 1: init,dictLoad; 2:started */
+    ZSTD_compressionStage_e stage;
     U32   rep[ZSTD_REP_NUM];
     U32   savedRep[ZSTD_REP_NUM];
     U32   dictID;
@@ -140,9 +119,9 @@ struct ZSTD_CCtx_s
     U32* chainTable;
     HUF_CElt* hufTable;
     U32 flagStaticTables;
-    FSE_CTable offcodeCTable   [FSE_CTABLE_SIZE_U32(OffFSELog, MaxOff)];
-    FSE_CTable matchlengthCTable [FSE_CTABLE_SIZE_U32(MLFSELog, MaxML)];
-    FSE_CTable litlengthCTable   [FSE_CTABLE_SIZE_U32(LLFSELog, MaxLL)];
+    FSE_CTable offcodeCTable  [FSE_CTABLE_SIZE_U32(OffFSELog, MaxOff)];
+    FSE_CTable matchlengthCTable[FSE_CTABLE_SIZE_U32(MLFSELog, MaxML)];
+    FSE_CTable litlengthCTable  [FSE_CTABLE_SIZE_U32(LLFSELog, MaxLL)];
 };
 
 ZSTD_CCtx* ZSTD_createCCtx(void)
@@ -152,7 +131,7 @@ ZSTD_CCtx* ZSTD_createCCtx(void)
 
 ZSTD_CCtx* ZSTD_createCCtx_advanced(ZSTD_customMem customMem)
 {
-    ZSTD_CCtx* ctx;
+    ZSTD_CCtx* cctx;
 
     if (!customMem.customAlloc && !customMem.customFree)
         customMem = defaultCustomMem;
@@ -160,11 +139,11 @@ ZSTD_CCtx* ZSTD_createCCtx_advanced(ZSTD_customMem customMem)
     if (!customMem.customAlloc || !customMem.customFree)
         return NULL;
 
-    ctx = (ZSTD_CCtx*) customMem.customAlloc(customMem.opaque, sizeof(ZSTD_CCtx));
-    if (!ctx) return NULL;
-    memset(ctx, 0, sizeof(ZSTD_CCtx));
-    memcpy(&ctx->customMem, &customMem, sizeof(ZSTD_customMem));
-    return ctx;
+    cctx = (ZSTD_CCtx*) customMem.customAlloc(customMem.opaque, sizeof(ZSTD_CCtx));
+    if (!cctx) return NULL;
+    memset(cctx, 0, sizeof(ZSTD_CCtx));
+    memcpy(&(cctx->customMem), &customMem, sizeof(ZSTD_customMem));
+    return cctx;
 }
 
 size_t ZSTD_freeCCtx(ZSTD_CCtx* cctx)
@@ -175,6 +154,11 @@ size_t ZSTD_freeCCtx(ZSTD_CCtx* cctx)
     return 0;   /* reserved as a potential error code in the future */
 }
 
+size_t ZSTD_sizeofCCtx(const ZSTD_CCtx* cctx)
+{
+    return sizeof(*cctx) + cctx->workSpaceSize;
+}
+
 const seqStore_t* ZSTD_getSeqStore(const ZSTD_CCtx* ctx)   /* hidden interface */
 {
     return &(ctx->seqStore);
@@ -221,7 +205,7 @@ size_t ZSTD_checkCParams_advanced(ZSTD_compressionParameters cParams, U64 srcSiz
     Both `srcSize` and `dictSize` are optional (use 0 if unknown),
     but if both are 0, no optimization can be done.
     Note : cPar is considered validated at this stage. Use ZSTD_checkParams() to ensure that. */
-ZSTD_compressionParameters ZSTD_adjustCParams(ZSTD_compressionParameters cPar, U64 srcSize, size_t dictSize)
+ZSTD_compressionParameters ZSTD_adjustCParams(ZSTD_compressionParameters cPar, unsigned long long srcSize, size_t dictSize)
 {
     if (srcSize+dictSize == 0) return cPar;   /* no size information available : no adjustment */
 
@@ -244,35 +228,43 @@ ZSTD_compressionParameters ZSTD_adjustCParams(ZSTD_compressionParameters cPar, U
 }
 
 
-size_t ZSTD_sizeofCCtx(ZSTD_compressionParameters cParams)   /* hidden interface, for paramagrill */
+size_t ZSTD_estimateCCtxSize(ZSTD_compressionParameters cParams)
 {
-    ZSTD_CCtx* const zc = ZSTD_createCCtx();
-    ZSTD_parameters params;
-    memset(&params, 0, sizeof(params));
-    params.cParams = cParams;
-    params.fParams.contentSizeFlag = 1;
-    ZSTD_compressBegin_advanced(zc, NULL, 0, params, 0);
-    { size_t const ccsize = sizeof(*zc) + zc->workSpaceSize;
-      ZSTD_freeCCtx(zc);
-      return ccsize; }
+    size_t const blockSize = MIN(ZSTD_BLOCKSIZE_ABSOLUTEMAX, (size_t)1 << cParams.windowLog);
+    U32    const divider = (cParams.searchLength==3) ? 3 : 4;
+    size_t const maxNbSeq = blockSize / divider;
+    size_t const tokenSpace = blockSize + 11*maxNbSeq;
+
+    size_t const chainSize = (cParams.strategy == ZSTD_fast) ? 0 : (1 << cParams.chainLog);
+    size_t const hSize = ((size_t)1) << cParams.hashLog;
+    U32    const hashLog3 = (cParams.searchLength>3) ? 0 : MIN(ZSTD_HASHLOG3_MAX, cParams.windowLog);
+    size_t const h3Size = ((size_t)1) << hashLog3;
+    size_t const tableSpace = (chainSize + hSize + h3Size) * sizeof(U32);
+
+    size_t const optSpace = ((MaxML+1) + (MaxLL+1) + (MaxOff+1) + (1<<Litbits))*sizeof(U32)
+                          + (ZSTD_OPT_NUM+1)*(sizeof(ZSTD_match_t) + sizeof(ZSTD_optimal_t));
+    size_t const neededSpace = tableSpace + (256*sizeof(U32)) /* huffTable */ + tokenSpace
+                             + ((cParams.strategy == ZSTD_btopt) ? optSpace : 0);
+
+    return sizeof(ZSTD_CCtx) + neededSpace;
 }
 
 /*! ZSTD_resetCCtx_advanced() :
     note : 'params' is expected to be validated */
 static size_t ZSTD_resetCCtx_advanced (ZSTD_CCtx* zc,
-                                       ZSTD_parameters params, U64 frameContentSize, U32 reset)
+                                       ZSTD_parameters params, U64 frameContentSize,
+                                       U32 reset)
 {   /* note : params considered validated here */
-    const size_t blockSize = MIN(ZSTD_BLOCKSIZE_MAX, (size_t)1 << params.cParams.windowLog);
-    const U32    divider = (params.cParams.searchLength==3) ? 3 : 4;
-    const size_t maxNbSeq = blockSize / divider;
-    const size_t tokenSpace = blockSize + 11*maxNbSeq;
-    const size_t chainSize = (params.cParams.strategy == ZSTD_fast) ? 0 : (1 << params.cParams.chainLog);
-    const size_t hSize = ((size_t)1) << params.cParams.hashLog;
-    const U32 hashLog3 = (params.cParams.searchLength>3) ? 0 :
-                        ( (!frameContentSize || frameContentSize >= 8192) ? ZSTD_HASHLOG3_MAX :
-                          ((frameContentSize >= 2048) ? ZSTD_HASHLOG3_MIN + 1 : ZSTD_HASHLOG3_MIN) );
-    const size_t h3Size = ((size_t)1) << hashLog3;
-    const size_t tableSpace = (chainSize + hSize + h3Size) * sizeof(U32);
+    size_t const blockSize = MIN(ZSTD_BLOCKSIZE_ABSOLUTEMAX, (size_t)1 << params.cParams.windowLog);
+    U32    const divider = (params.cParams.searchLength==3) ? 3 : 4;
+    size_t const maxNbSeq = blockSize / divider;
+    size_t const tokenSpace = blockSize + 11*maxNbSeq;
+    size_t const chainSize = (params.cParams.strategy == ZSTD_fast) ? 0 : (1 << params.cParams.chainLog);
+    size_t const hSize = ((size_t)1) << params.cParams.hashLog;
+    U32    const hashLog3 = (params.cParams.searchLength>3) ? 0 : MIN(ZSTD_HASHLOG3_MAX, params.cParams.windowLog);
+    size_t const h3Size = ((size_t)1) << hashLog3;
+    size_t const tableSpace = (chainSize + hSize + h3Size) * sizeof(U32);
+    void* ptr;
 
     /* Check if workSpace is large enough, alloc a new one if needed */
     {   size_t const optSpace = ((MaxML+1) + (MaxLL+1) + (MaxOff+1) + (1<<Litbits))*sizeof(U32)
@@ -292,10 +284,10 @@ static size_t ZSTD_resetCCtx_advanced (ZSTD_CCtx* zc,
     zc->hashTable = (U32*)(zc->workSpace);
     zc->chainTable = zc->hashTable + hSize;
     zc->hashTable3 = zc->chainTable + chainSize;
-    zc->seqStore.buffer = zc->hashTable3 + h3Size;
-    zc->hufTable = (HUF_CElt*)zc->seqStore.buffer;
+    ptr = zc->hashTable3 + h3Size;
+    zc->hufTable = (HUF_CElt*)ptr;
     zc->flagStaticTables = 0;
-    zc->seqStore.buffer = ((U32*)(zc->seqStore.buffer)) + 256;  /* note : HUF_CElt* is incomplete type, size is simulated using U32 */
+    ptr = ((U32*)ptr) + 256;  /* note : HUF_CElt* is incomplete type, size is simulated using U32 */
 
     zc->nextToUpdate = 1;
     zc->nextSrc = NULL;
@@ -309,27 +301,25 @@ static size_t ZSTD_resetCCtx_advanced (ZSTD_CCtx* zc,
     { int i; for (i=0; i<ZSTD_REP_NUM; i++) zc->rep[i] = repStartValue[i]; }
 
     if (params.cParams.strategy == ZSTD_btopt) {
-        zc->seqStore.litFreq = (U32*)(zc->seqStore.buffer);
+        zc->seqStore.litFreq = (U32*)ptr;
         zc->seqStore.litLengthFreq = zc->seqStore.litFreq + (1<<Litbits);
         zc->seqStore.matchLengthFreq = zc->seqStore.litLengthFreq + (MaxLL+1);
         zc->seqStore.offCodeFreq = zc->seqStore.matchLengthFreq + (MaxML+1);
-        zc->seqStore.buffer = zc->seqStore.offCodeFreq + (MaxOff+1);
-        zc->seqStore.matchTable = (ZSTD_match_t*)zc->seqStore.buffer;
-        zc->seqStore.buffer = zc->seqStore.matchTable + ZSTD_OPT_NUM+1;
-        zc->seqStore.priceTable = (ZSTD_optimal_t*)zc->seqStore.buffer;
-        zc->seqStore.buffer = zc->seqStore.priceTable + ZSTD_OPT_NUM+1;
+        ptr = zc->seqStore.offCodeFreq + (MaxOff+1);
+        zc->seqStore.matchTable = (ZSTD_match_t*)ptr;
+        ptr = zc->seqStore.matchTable + ZSTD_OPT_NUM+1;
+        zc->seqStore.priceTable = (ZSTD_optimal_t*)ptr;
+        ptr = zc->seqStore.priceTable + ZSTD_OPT_NUM+1;
         zc->seqStore.litLengthSum = 0;
     }
-    zc->seqStore.offsetStart = (U32*)(zc->seqStore.buffer);
-    zc->seqStore.buffer = zc->seqStore.offsetStart + maxNbSeq;
-    zc->seqStore.litLengthStart = (U16*)zc->seqStore.buffer;
-    zc->seqStore.matchLengthStart = zc->seqStore.litLengthStart + maxNbSeq;
-    zc->seqStore.llCodeStart = (BYTE*) (zc->seqStore.matchLengthStart + maxNbSeq);
-    zc->seqStore.mlCodeStart = zc->seqStore.llCodeStart + maxNbSeq;
-    zc->seqStore.offCodeStart = zc->seqStore.mlCodeStart + maxNbSeq;
-    zc->seqStore.litStart = zc->seqStore.offCodeStart + maxNbSeq;
-
-    zc->stage = 1;
+    zc->seqStore.sequencesStart = (seqDef*)ptr;
+    ptr = zc->seqStore.sequencesStart + maxNbSeq;
+    zc->seqStore.llCode = (BYTE*) ptr;
+    zc->seqStore.mlCode = zc->seqStore.llCode + maxNbSeq;
+    zc->seqStore.ofCode = zc->seqStore.mlCode + maxNbSeq;
+    zc->seqStore.litStart = zc->seqStore.ofCode + maxNbSeq;
+
+    zc->stage = ZSTDcs_init;
     zc->dictID = 0;
     zc->loadedDictEnd = 0;
 
@@ -339,21 +329,21 @@ static size_t ZSTD_resetCCtx_advanced (ZSTD_CCtx* zc,
 
 /*! ZSTD_copyCCtx() :
 *   Duplicate an existing context `srcCCtx` into another one `dstCCtx`.
-*   Only works during stage 1 (i.e. after creation, but before first call to ZSTD_compressContinue()).
+*   Only works during stage ZSTDcs_init (i.e. after creation, but before first call to ZSTD_compressContinue()).
 *   @return : 0, or an error code */
 size_t ZSTD_copyCCtx(ZSTD_CCtx* dstCCtx, const ZSTD_CCtx* srcCCtx)
 {
-    if (srcCCtx->stage!=1) return ERROR(stage_wrong);
+    if (srcCCtx->stage!=ZSTDcs_init) return ERROR(stage_wrong);
 
     memcpy(&dstCCtx->customMem, &srcCCtx->customMem, sizeof(ZSTD_customMem));
     ZSTD_resetCCtx_advanced(dstCCtx, srcCCtx->params, srcCCtx->frameContentSize, 0);
     dstCCtx->params.fParams.contentSizeFlag = 0;   /* content size different from the one set during srcCCtx init */
 
     /* copy tables */
-    {   const size_t chainSize = (srcCCtx->params.cParams.strategy == ZSTD_fast) ? 0 : (1 << srcCCtx->params.cParams.chainLog);
-        const size_t hSize = ((size_t)1) << srcCCtx->params.cParams.hashLog;
-        const size_t h3Size = (size_t)1 << srcCCtx->hashLog3;
-        const size_t tableSpace = (chainSize + hSize + h3Size) * sizeof(U32);
+    {   size_t const chainSize = (srcCCtx->params.cParams.strategy == ZSTD_fast) ? 0 : (1 << srcCCtx->params.cParams.chainLog);
+        size_t const hSize = ((size_t)1) << srcCCtx->params.cParams.hashLog;
+        size_t const h3Size = (size_t)1 << srcCCtx->hashLog3;
+        size_t const tableSpace = (chainSize + hSize + h3Size) * sizeof(U32);
         memcpy(dstCCtx->workSpace, srcCCtx->workSpace, tableSpace);
     }
 
@@ -396,13 +386,13 @@ static void ZSTD_reduceTable (U32* const table, U32 const size, U32 const reduce
 *   rescale all indexes to avoid future overflow (indexes are U32) */
 static void ZSTD_reduceIndex (ZSTD_CCtx* zc, const U32 reducerValue)
 {
-    { const U32 hSize = 1 << zc->params.cParams.hashLog;
+    { U32 const hSize = 1 << zc->params.cParams.hashLog;
       ZSTD_reduceTable(zc->hashTable, hSize, reducerValue); }
 
-    { const U32 chainSize = (zc->params.cParams.strategy == ZSTD_fast) ? 0 : (1 << zc->params.cParams.chainLog);
+    { U32 const chainSize = (zc->params.cParams.strategy == ZSTD_fast) ? 0 : (1 << zc->params.cParams.chainLog);
       ZSTD_reduceTable(zc->chainTable, chainSize, reducerValue); }
 
-    { const U32 h3Size = (zc->hashLog3) ? 1 << zc->hashLog3 : 0;
+    { U32 const h3Size = (zc->hashLog3) ? 1 << zc->hashLog3 : 0;
       ZSTD_reduceTable(zc->hashTable3, h3Size, reducerValue); }
 }
 
@@ -411,162 +401,13 @@ static void ZSTD_reduceIndex (ZSTD_CCtx* zc, const U32 reducerValue)
 *  Block entropic compression
 *********************************************************/
 
-/* Frame format description
-   Frame Header -  [ Block Header - Block ] - Frame End
-   1) Frame Header
-      - 4 bytes : Magic Number : ZSTD_MAGICNUMBER (defined within zstd_static.h)
-      - 1 byte  : Frame Header Descriptor
-      - 1-13 bytes : Optional fields
-   2) Block Header
-      - 3 bytes, starting with a 2-bits descriptor
-                 Uncompressed, Compressed, Frame End, unused
-   3) Block
-      See Block Format Description
-   4) Frame End
-      - 3 bytes, compatible with Block Header
-*/
-
-
-/* Frame descriptor
-
-    // old
-   1 byte - Alloc :
-   bit 0-3 : windowLog - ZSTD_WINDOWLOG_ABSOLUTEMIN   (see zstd_internal.h)
-   bit 4   : reserved for windowLog (must be zero)
-   bit 5   : reserved (must be zero)
-   bit 6-7 : Frame content size : unknown, 1 byte, 2 bytes, 8 bytes
-
-   1 byte - checker :
-   bit 0-1 : dictID (0, 1, 2 or 4 bytes)
-   bit 2-7 : reserved (must be zero)
-
-
-    // new
-   1 byte - FrameHeaderDescription :
-   bit 0-1 : dictID (0, 1, 2 or 4 bytes)
-   bit 2-4 : reserved (must be zero)
-   bit 5   : SkippedWindowLog (if 1, WindowLog byte is not present)
-   bit 6-7 : FrameContentFieldsize (0, 2, 4, or 8)
-             if (SkippedWindowLog && !FrameContentFieldsize) FrameContentFieldsize=1;
-
-   Optional : WindowLog (0 or 1 byte)
-   bit 0-2 : octal Fractional (1/8th)
-   bit 3-7 : Power of 2, with 0 = 1 KB (up to 2 TB)
-
-   Optional : dictID (0, 1, 2 or 4 bytes)
-   Automatic adaptation
-   0 : no dictID
-   1 : 1 - 255
-   2 : 256 - 65535
-   4 : all other values
-
-   Optional : content size (0, 1, 2, 4 or 8 bytes)
-   0 : unknown
-   1 : 0-255 bytes
-   2 : 256 - 65535+256
-   8 : up to 16 exa
-*/
-
-
-/* Block format description
-
-   Block = Literal Section - Sequences Section
-   Prerequisite : size of (compressed) block, maximum size of regenerated data
-
-   1) Literal Section
-
-   1.1) Header : 1-5 bytes
-        flags: 2 bits
-            00 compressed by Huff0
-            01 unused
-            10 is Raw (uncompressed)
-            11 is Rle
-            Note : using 01 => Huff0 with precomputed table ?
-            Note : delta map ? => compressed ?
-
-   1.1.1) Huff0-compressed literal block : 3-5 bytes
-            srcSize < 1 KB => 3 bytes (2-2-10-10) => single stream
-            srcSize < 1 KB => 3 bytes (2-2-10-10)
-            srcSize < 16KB => 4 bytes (2-2-14-14)
-            else           => 5 bytes (2-2-18-18)
-            big endian convention
-
-   1.1.2) Raw (uncompressed) literal block header : 1-3 bytes
-        size :  5 bits: (IS_RAW<<6) + (0<<4) + size
-               12 bits: (IS_RAW<<6) + (2<<4) + (size>>8)
-                        size&255
-               20 bits: (IS_RAW<<6) + (3<<4) + (size>>16)
-                        size>>8&255
-                        size&255
-
-   1.1.3) Rle (repeated single byte) literal block header : 1-3 bytes
-        size :  5 bits: (IS_RLE<<6) + (0<<4) + size
-               12 bits: (IS_RLE<<6) + (2<<4) + (size>>8)
-                        size&255
-               20 bits: (IS_RLE<<6) + (3<<4) + (size>>16)
-                        size>>8&255
-                        size&255
-
-   1.1.4) Huff0-compressed literal block, using precomputed CTables : 3-5 bytes
-            srcSize < 1 KB => 3 bytes (2-2-10-10) => single stream
-            srcSize < 1 KB => 3 bytes (2-2-10-10)
-            srcSize < 16KB => 4 bytes (2-2-14-14)
-            else           => 5 bytes (2-2-18-18)
-            big endian convention
-
-        1- CTable available (stored into workspace ?)
-        2- Small input (fast heuristic ? Full comparison ? depend on clevel ?)
-
-
-   1.2) Literal block content
-
-   1.2.1) Huff0 block, using sizes from header
-        See Huff0 format
-
-   1.2.2) Huff0 block, using prepared table
-
-   1.2.3) Raw content
-
-   1.2.4) single byte
-
-
-   2) Sequences section
-
-      - Nb Sequences : 2 bytes, little endian
-      - Control Token : 1 byte (see below)
-      - Dumps Length : 1 or 2 bytes (depending on control token)
-      - Dumps : as stated by dumps length
-      - Literal Lengths FSE table (as needed depending on encoding method)
-      - Offset Codes FSE table (as needed depending on encoding method)
-      - Match Lengths FSE table (as needed depending on encoding method)
-
-    2.1) Control Token
-      8 bits, divided as :
-      0-1 : dumpsLength
-      2-3 : MatchLength, FSE encoding method
-      4-5 : Offset Codes, FSE encoding method
-      6-7 : Literal Lengths, FSE encoding method
-
-      FSE encoding method :
-      FSE_ENCODING_RAW : uncompressed; no header
-      FSE_ENCODING_RLE : single repeated value; header 1 byte
-      FSE_ENCODING_STATIC : use prepared table; no header
-      FSE_ENCODING_DYNAMIC : read NCount
-*/
+/* See zstd_compression_format.md for detailed format description */
 
 size_t ZSTD_noCompressBlock (void* dst, size_t dstCapacity, const void* src, size_t srcSize)
 {
-    BYTE* const ostart = (BYTE* const)dst;
-
     if (srcSize + ZSTD_blockHeaderSize > dstCapacity) return ERROR(dstSize_tooSmall);
-    memcpy(ostart + ZSTD_blockHeaderSize, src, srcSize);
-
-    /* Build header */
-    ostart[0]  = (BYTE)(srcSize>>16);
-    ostart[1]  = (BYTE)(srcSize>>8);
-    ostart[2]  = (BYTE) srcSize;
-    ostart[0] += (BYTE)(bt_raw<<6);   /* is a raw (uncompressed) block */
-
+    memcpy((BYTE*)dst + ZSTD_blockHeaderSize, src, srcSize);
+    MEM_writeLE24(dst, (U32)(srcSize << 2) + (U32)bt_raw);
     return ZSTD_blockHeaderSize+srcSize;
 }
 
@@ -574,24 +415,21 @@ size_t ZSTD_noCompressBlock (void* dst, size_t dstCapacity, const void* src, siz
 static size_t ZSTD_noCompressLiterals (void* dst, size_t dstCapacity, const void* src, size_t srcSize)
 {
     BYTE* const ostart = (BYTE* const)dst;
-    U32 const flSize = 1 + (srcSize>31) + (srcSize>4095);
+    U32   const flSize = 1 + (srcSize>31) + (srcSize>4095);
 
     if (srcSize + flSize > dstCapacity) return ERROR(dstSize_tooSmall);
 
     switch(flSize)
     {
         case 1: /* 2 - 1 - 5 */
-            ostart[0] = (BYTE)((lbt_raw<<6) + (0<<5) + srcSize);
+            ostart[0] = (BYTE)((U32)set_basic + (srcSize<<3));
             break;
         case 2: /* 2 - 2 - 12 */
-            ostart[0] = (BYTE)((lbt_raw<<6) + (2<<4) + (srcSize >> 8));
-            ostart[1] = (BYTE)srcSize;
+            MEM_writeLE16(ostart, (U16)((U32)set_basic + (1<<2) + (srcSize<<4)));
             break;
         default:   /*note : should not be necessary : flSize is within {1,2,3} */
         case 3: /* 2 - 2 - 20 */
-            ostart[0] = (BYTE)((lbt_raw<<6) + (3<<4) + (srcSize >> 16));
-            ostart[1] = (BYTE)(srcSize>>8);
-            ostart[2] = (BYTE)srcSize;
+            MEM_writeLE32(ostart, (U32)((U32)set_basic + (3<<2) + (srcSize<<4)));
             break;
     }
 
@@ -602,24 +440,21 @@ static size_t ZSTD_noCompressLiterals (void* dst, size_t dstCapacity, const void
 static size_t ZSTD_compressRleLiteralsBlock (void* dst, size_t dstCapacity, const void* src, size_t srcSize)
 {
     BYTE* const ostart = (BYTE* const)dst;
-    U32 const flSize = 1 + (srcSize>31) + (srcSize>4095);
+    U32   const flSize = 1 + (srcSize>31) + (srcSize>4095);
 
-    (void)dstCapacity;  /* dstCapacity guaranteed to be >=4, hence large enough */
+    (void)dstCapacity;  /* dstCapacity already guaranteed to be >=4, hence large enough */
 
     switch(flSize)
     {
         case 1: /* 2 - 1 - 5 */
-            ostart[0] = (BYTE)((lbt_rle<<6) + (0<<5) + srcSize);
+            ostart[0] = (BYTE)((U32)set_rle + (srcSize<<3));
             break;
         case 2: /* 2 - 2 - 12 */
-            ostart[0] = (BYTE)((lbt_rle<<6) + (2<<4) + (srcSize >> 8));
-            ostart[1] = (BYTE)srcSize;
+            MEM_writeLE16(ostart, (U16)((U32)set_rle + (1<<2) + (srcSize<<4)));
             break;
         default:   /*note : should not be necessary : flSize is necessarily within {1,2,3} */
         case 3: /* 2 - 2 - 20 */
-            ostart[0] = (BYTE)((lbt_rle<<6) + (3<<4) + (srcSize >> 16));
-            ostart[1] = (BYTE)(srcSize>>8);
-            ostart[2] = (BYTE)srcSize;
+            MEM_writeLE32(ostart, (U32)((U32)set_rle + (3<<2) + (srcSize<<4)));
             break;
     }
 
@@ -636,9 +471,9 @@ static size_t ZSTD_compressLiterals (ZSTD_CCtx* zc,
 {
     size_t const minGain = ZSTD_minGain(srcSize);
     size_t const lhSize = 3 + (srcSize >= 1 KB) + (srcSize >= 16 KB);
-    BYTE* const ostart = (BYTE*)dst;
+    BYTE*  const ostart = (BYTE*)dst;
     U32 singleStream = srcSize < 256;
-    litBlockType_t hType = lbt_huffman;
+    symbolEncodingType_e hType = set_compressed;
     size_t cLitSize;
 
 
@@ -650,15 +485,15 @@ static size_t ZSTD_compressLiterals (ZSTD_CCtx* zc,
 
     if (dstCapacity < lhSize+1) return ERROR(dstSize_tooSmall);   /* not enough space for compression */
     if (zc->flagStaticTables && (lhSize==3)) {
-        hType = lbt_repeat;
+        hType = set_repeat;
         singleStream = 1;
         cLitSize = HUF_compress1X_usingCTable(ostart+lhSize, dstCapacity-lhSize, src, srcSize, zc->hufTable);
     } else {
-        cLitSize = singleStream ? HUF_compress1X(ostart+lhSize, dstCapacity-lhSize, src, srcSize, 255, 12)
-                                : HUF_compress2 (ostart+lhSize, dstCapacity-lhSize, src, srcSize, 255, 12);
+        cLitSize = singleStream ? HUF_compress1X(ostart+lhSize, dstCapacity-lhSize, src, srcSize, 255, 11)
+                                : HUF_compress2 (ostart+lhSize, dstCapacity-lhSize, src, srcSize, 255, 11);
     }
 
-    if ((cLitSize==0) || (cLitSize >= srcSize - minGain))
+    if ((cLitSize==0) | (cLitSize >= srcSize - minGain))
         return ZSTD_noCompressLiterals(dst, dstCapacity, src, srcSize);
     if (cLitSize==1)
         return ZSTD_compressRleLiteralsBlock(dst, dstCapacity, src, srcSize);
@@ -667,79 +502,66 @@ static size_t ZSTD_compressLiterals (ZSTD_CCtx* zc,
     switch(lhSize)
     {
     case 3: /* 2 - 2 - 10 - 10 */
-        ostart[0] = (BYTE)((srcSize>>6) + (singleStream << 4) + (hType<<6));
-        ostart[1] = (BYTE)((srcSize<<2) + (cLitSize>>8));
-        ostart[2] = (BYTE)(cLitSize);
-        break;
+        {   U32 const lhc = hType + ((!singleStream) << 2) + ((U32)srcSize<<4) + ((U32)cLitSize<<14);
+            MEM_writeLE24(ostart, lhc);
+            break;
+        }
     case 4: /* 2 - 2 - 14 - 14 */
-        ostart[0] = (BYTE)((srcSize>>10) + (2<<4) +  (hType<<6));
-        ostart[1] = (BYTE)(srcSize>> 2);
-        ostart[2] = (BYTE)((srcSize<<6) + (cLitSize>>8));
-        ostart[3] = (BYTE)(cLitSize);
-        break;
+        {   U32 const lhc = hType + (2 << 2) + ((U32)srcSize<<4) + ((U32)cLitSize<<18);
+            MEM_writeLE32(ostart, lhc);
+            break;
+        }
     default:   /* should not be necessary, lhSize is only {3,4,5} */
     case 5: /* 2 - 2 - 18 - 18 */
-        ostart[0] = (BYTE)((srcSize>>14) + (3<<4) +  (hType<<6));
-        ostart[1] = (BYTE)(srcSize>>6);
-        ostart[2] = (BYTE)((srcSize<<2) + (cLitSize>>16));
-        ostart[3] = (BYTE)(cLitSize>>8);
-        ostart[4] = (BYTE)(cLitSize);
-        break;
+        {   U32 const lhc = hType + (3 << 2) + ((U32)srcSize<<4) + ((U32)cLitSize<<22);
+            MEM_writeLE32(ostart, lhc);
+            ostart[4] = (BYTE)(cLitSize >> 10);
+            break;
+        }
     }
     return lhSize+cLitSize;
 }
 
-
-void ZSTD_seqToCodes(const seqStore_t* seqStorePtr, size_t const nbSeq)
-{
-    /* LL codes */
-    {   static const BYTE LL_Code[64] = {  0,  1,  2,  3,  4,  5,  6,  7,
-                                           8,  9, 10, 11, 12, 13, 14, 15,
-                                          16, 16, 17, 17, 18, 18, 19, 19,
-                                          20, 20, 20, 20, 21, 21, 21, 21,
-                                          22, 22, 22, 22, 22, 22, 22, 22,
-                                          23, 23, 23, 23, 23, 23, 23, 23,
-                                          24, 24, 24, 24, 24, 24, 24, 24,
-                                          24, 24, 24, 24, 24, 24, 24, 24 };
-        const BYTE LL_deltaCode = 19;
-        const U16* const llTable = seqStorePtr->litLengthStart;
-        BYTE* const llCodeTable = seqStorePtr->llCodeStart;
-        size_t u;
-        for (u=0; u<nbSeq; u++) {
-            U32 const  ll = llTable[u];
-            llCodeTable[u] = (ll>63) ? (BYTE)ZSTD_highbit32(ll) + LL_deltaCode : LL_Code[ll];
-        }
-        if (seqStorePtr->longLengthID==1)
-            llCodeTable[seqStorePtr->longLengthPos] = MaxLL;
-    }
-
-    /* Offset codes */
-    {   const U32* const offsetTable = seqStorePtr->offsetStart;
-        BYTE* const ofCodeTable = seqStorePtr->offCodeStart;
-        size_t u;
-        for (u=0; u<nbSeq; u++) ofCodeTable[u] = (BYTE)ZSTD_highbit32(offsetTable[u]);
-    }
-
-    /* ML codes */
-    {   static const BYTE ML_Code[128] = { 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15,
-                                          16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31,
-                                          32, 32, 33, 33, 34, 34, 35, 35, 36, 36, 36, 36, 37, 37, 37, 37,
-                                          38, 38, 38, 38, 38, 38, 38, 38, 39, 39, 39, 39, 39, 39, 39, 39,
-                                          40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40,
-                                          41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41,
-                                          42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42,
-                                          42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42 };
-        const BYTE ML_deltaCode = 36;
-        const U16* const mlTable = seqStorePtr->matchLengthStart;
-        BYTE* const mlCodeTable = seqStorePtr->mlCodeStart;
-        size_t u;
-        for (u=0; u<nbSeq; u++) {
-            U32 const ml = mlTable[u];
-            mlCodeTable[u] = (ml>127) ? (BYTE)ZSTD_highbit32(ml) + ML_deltaCode : ML_Code[ml];
-        }
-        if (seqStorePtr->longLengthID==2)
-            mlCodeTable[seqStorePtr->longLengthPos] = MaxML;
+static const BYTE LL_Code[64] = {  0,  1,  2,  3,  4,  5,  6,  7,
+                                   8,  9, 10, 11, 12, 13, 14, 15,
+                                  16, 16, 17, 17, 18, 18, 19, 19,
+                                  20, 20, 20, 20, 21, 21, 21, 21,
+                                  22, 22, 22, 22, 22, 22, 22, 22,
+                                  23, 23, 23, 23, 23, 23, 23, 23,
+                                  24, 24, 24, 24, 24, 24, 24, 24,
+                                  24, 24, 24, 24, 24, 24, 24, 24 };
+
+static const BYTE ML_Code[128] = { 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15,
+                                  16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31,
+                                  32, 32, 33, 33, 34, 34, 35, 35, 36, 36, 36, 36, 37, 37, 37, 37,
+                                  38, 38, 38, 38, 38, 38, 38, 38, 39, 39, 39, 39, 39, 39, 39, 39,
+                                  40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40,
+                                  41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41,
+                                  42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42,
+                                  42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42 };
+
+
+void ZSTD_seqToCodes(const seqStore_t* seqStorePtr)
+{
+    BYTE const LL_deltaCode = 19;
+    BYTE const ML_deltaCode = 36;
+    const seqDef* const sequences = seqStorePtr->sequencesStart;
+    BYTE* const llCodeTable = seqStorePtr->llCode;
+    BYTE* const ofCodeTable = seqStorePtr->ofCode;
+    BYTE* const mlCodeTable = seqStorePtr->mlCode;
+    U32 const nbSeq = (U32)(seqStorePtr->sequences - seqStorePtr->sequencesStart);
+    U32 u;
+    for (u=0; u<nbSeq; u++) {
+        U32 const llv = sequences[u].litLength;
+        U32 const mlv = sequences[u].matchLength;
+        llCodeTable[u] = (llv> 63) ? (BYTE)ZSTD_highbit32(llv) + LL_deltaCode : LL_Code[llv];
+        ofCodeTable[u] = (BYTE)ZSTD_highbit32(sequences[u].offset);
+        mlCodeTable[u] = (mlv>127) ? (BYTE)ZSTD_highbit32(mlv) + ML_deltaCode : ML_Code[mlv];
     }
+    if (seqStorePtr->longLengthID==1)
+        llCodeTable[seqStorePtr->longLengthPos] = MaxLL;
+    if (seqStorePtr->longLengthID==2)
+        mlCodeTable[seqStorePtr->longLengthPos] = MaxML;
 }
 
 
@@ -754,17 +576,14 @@ size_t ZSTD_compressSequences(ZSTD_CCtx* zc,
     FSE_CTable* CTable_OffsetBits = zc->offcodeCTable;
     FSE_CTable* CTable_MatchLength = zc->matchlengthCTable;
     U32 LLtype, Offtype, MLtype;   /* compressed, raw or rle */
-    U16*  const llTable = seqStorePtr->litLengthStart;
-    U16*  const mlTable = seqStorePtr->matchLengthStart;
-    const U32*  const offsetTable = seqStorePtr->offsetStart;
-    const U32*  const offsetTableEnd = seqStorePtr->offset;
-    BYTE* const ofCodeTable = seqStorePtr->offCodeStart;
-    BYTE* const llCodeTable = seqStorePtr->llCodeStart;
-    BYTE* const mlCodeTable = seqStorePtr->mlCodeStart;
+    const seqDef* const sequences = seqStorePtr->sequencesStart;
+    const BYTE* const ofCodeTable = seqStorePtr->ofCode;
+    const BYTE* const llCodeTable = seqStorePtr->llCode;
+    const BYTE* const mlCodeTable = seqStorePtr->mlCode;
     BYTE* const ostart = (BYTE*)dst;
     BYTE* const oend = ostart + dstCapacity;
     BYTE* op = ostart;
-    size_t const nbSeq = offsetTableEnd - offsetTable;
+    size_t const nbSeq = seqStorePtr->sequences - seqStorePtr->sequencesStart;
     BYTE* seqHead;
 
     /* Compress literals */
@@ -789,7 +608,7 @@ size_t ZSTD_compressSequences(ZSTD_CCtx* zc,
 #define MAX_SEQ_FOR_STATIC_FSE  1000
 
     /* convert length/distances into codes */
-    ZSTD_seqToCodes(seqStorePtr, nbSeq);
+    ZSTD_seqToCodes(seqStorePtr);
 
     /* CTable for Literal Lengths */
     {   U32 max = MaxLL;
@@ -797,12 +616,12 @@ size_t ZSTD_compressSequences(ZSTD_CCtx* zc,
         if ((mostFrequent == nbSeq) && (nbSeq > 2)) {
             *op++ = llCodeTable[0];
             FSE_buildCTable_rle(CTable_LitLength, (BYTE)max);
-            LLtype = FSE_ENCODING_RLE;
+            LLtype = set_rle;
         } else if ((zc->flagStaticTables) && (nbSeq < MAX_SEQ_FOR_STATIC_FSE)) {
-            LLtype = FSE_ENCODING_STATIC;
+            LLtype = set_repeat;
         } else if ((nbSeq < MIN_SEQ_FOR_DYNAMIC_FSE) || (mostFrequent < (nbSeq >> (LL_defaultNormLog-1)))) {
             FSE_buildCTable(CTable_LitLength, LL_defaultNorm, MaxLL, LL_defaultNormLog);
-            LLtype = FSE_ENCODING_RAW;
+            LLtype = set_basic;
         } else {
             size_t nbSeq_1 = nbSeq;
             const U32 tableLog = FSE_optimalTableLog(LLFSELog, nbSeq, max);
@@ -812,7 +631,7 @@ size_t ZSTD_compressSequences(ZSTD_CCtx* zc,
               if (FSE_isError(NCountSize)) return ERROR(GENERIC);
               op += NCountSize; }
             FSE_buildCTable(CTable_LitLength, norm, max, tableLog);
-            LLtype = FSE_ENCODING_DYNAMIC;
+            LLtype = set_compressed;
     }   }
 
     /* CTable for Offsets */
@@ -821,12 +640,12 @@ size_t ZSTD_compressSequences(ZSTD_CCtx* zc,
         if ((mostFrequent == nbSeq) && (nbSeq > 2)) {
             *op++ = ofCodeTable[0];
             FSE_buildCTable_rle(CTable_OffsetBits, (BYTE)max);
-            Offtype = FSE_ENCODING_RLE;
+            Offtype = set_rle;
         } else if ((zc->flagStaticTables) && (nbSeq < MAX_SEQ_FOR_STATIC_FSE)) {
-            Offtype = FSE_ENCODING_STATIC;
+            Offtype = set_repeat;
         } else if ((nbSeq < MIN_SEQ_FOR_DYNAMIC_FSE) || (mostFrequent < (nbSeq >> (OF_defaultNormLog-1)))) {
             FSE_buildCTable(CTable_OffsetBits, OF_defaultNorm, MaxOff, OF_defaultNormLog);
-            Offtype = FSE_ENCODING_RAW;
+            Offtype = set_basic;
         } else {
             size_t nbSeq_1 = nbSeq;
             const U32 tableLog = FSE_optimalTableLog(OffFSELog, nbSeq, max);
@@ -836,7 +655,7 @@ size_t ZSTD_compressSequences(ZSTD_CCtx* zc,
               if (FSE_isError(NCountSize)) return ERROR(GENERIC);
               op += NCountSize; }
             FSE_buildCTable(CTable_OffsetBits, norm, max, tableLog);
-            Offtype = FSE_ENCODING_DYNAMIC;
+            Offtype = set_compressed;
     }   }
 
     /* CTable for MatchLengths */
@@ -845,12 +664,12 @@ size_t ZSTD_compressSequences(ZSTD_CCtx* zc,
         if ((mostFrequent == nbSeq) && (nbSeq > 2)) {
             *op++ = *mlCodeTable;
             FSE_buildCTable_rle(CTable_MatchLength, (BYTE)max);
-            MLtype = FSE_ENCODING_RLE;
+            MLtype = set_rle;
         } else if ((zc->flagStaticTables) && (nbSeq < MAX_SEQ_FOR_STATIC_FSE)) {
-            MLtype = FSE_ENCODING_STATIC;
+            MLtype = set_repeat;
         } else if ((nbSeq < MIN_SEQ_FOR_DYNAMIC_FSE) || (mostFrequent < (nbSeq >> (ML_defaultNormLog-1)))) {
             FSE_buildCTable(CTable_MatchLength, ML_defaultNorm, MaxML, ML_defaultNormLog);
-            MLtype = FSE_ENCODING_RAW;
+            MLtype = set_basic;
         } else {
             size_t nbSeq_1 = nbSeq;
             const U32 tableLog = FSE_optimalTableLog(MLFSELog, nbSeq, max);
@@ -860,7 +679,7 @@ size_t ZSTD_compressSequences(ZSTD_CCtx* zc,
               if (FSE_isError(NCountSize)) return ERROR(GENERIC);
               op += NCountSize; }
             FSE_buildCTable(CTable_MatchLength, norm, max, tableLog);
-            MLtype = FSE_ENCODING_DYNAMIC;
+            MLtype = set_compressed;
     }   }
 
     *seqHead = (BYTE)((LLtype<<6) + (Offtype<<4) + (MLtype<<2));
@@ -879,21 +698,21 @@ size_t ZSTD_compressSequences(ZSTD_CCtx* zc,
         FSE_initCState2(&stateMatchLength, CTable_MatchLength, mlCodeTable[nbSeq-1]);
         FSE_initCState2(&stateOffsetBits,  CTable_OffsetBits,  ofCodeTable[nbSeq-1]);
         FSE_initCState2(&stateLitLength,   CTable_LitLength,   llCodeTable[nbSeq-1]);
-        BIT_addBits(&blockStream, llTable[nbSeq-1], LL_bits[llCodeTable[nbSeq-1]]);
+        BIT_addBits(&blockStream, sequences[nbSeq-1].litLength, LL_bits[llCodeTable[nbSeq-1]]);
         if (MEM_32bits()) BIT_flushBits(&blockStream);
-        BIT_addBits(&blockStream, mlTable[nbSeq-1], ML_bits[mlCodeTable[nbSeq-1]]);
+        BIT_addBits(&blockStream, sequences[nbSeq-1].matchLength, ML_bits[mlCodeTable[nbSeq-1]]);
         if (MEM_32bits()) BIT_flushBits(&blockStream);
-        BIT_addBits(&blockStream, offsetTable[nbSeq-1], ofCodeTable[nbSeq-1]);
+        BIT_addBits(&blockStream, sequences[nbSeq-1].offset, ofCodeTable[nbSeq-1]);
         BIT_flushBits(&blockStream);
 
         {   size_t n;
             for (n=nbSeq-2 ; n<nbSeq ; n--) {      /* intentional underflow */
-                const BYTE ofCode = ofCodeTable[n];
-                const BYTE mlCode = mlCodeTable[n];
-                const BYTE llCode = llCodeTable[n];
-                const U32  llBits = LL_bits[llCode];
-                const U32  mlBits = ML_bits[mlCode];
-                const U32  ofBits = ofCode;                                     /* 32b*/  /* 64b*/
+                BYTE const llCode = llCodeTable[n];
+                BYTE const ofCode = ofCodeTable[n];
+                BYTE const mlCode = mlCodeTable[n];
+                U32  const llBits = LL_bits[llCode];
+                U32  const ofBits = ofCode;                                     /* 32b*/  /* 64b*/
+                U32  const mlBits = ML_bits[mlCode];
                                                                                 /* (7)*/  /* (7)*/
                 FSE_encodeSymbol(&blockStream, &stateOffsetBits, ofCode);       /* 15 */  /* 15 */
                 FSE_encodeSymbol(&blockStream, &stateMatchLength, mlCode);      /* 24 */  /* 24 */
@@ -901,11 +720,11 @@ size_t ZSTD_compressSequences(ZSTD_CCtx* zc,
                 FSE_encodeSymbol(&blockStream, &stateLitLength, llCode);        /* 16 */  /* 33 */
                 if (MEM_32bits() || (ofBits+mlBits+llBits >= 64-7-(LLFSELog+MLFSELog+OffFSELog)))
                     BIT_flushBits(&blockStream);                                /* (7)*/
-                BIT_addBits(&blockStream, llTable[n], llBits);
+                BIT_addBits(&blockStream, sequences[n].litLength, llBits);
                 if (MEM_32bits() && ((llBits+mlBits)>24)) BIT_flushBits(&blockStream);
-                BIT_addBits(&blockStream, mlTable[n], mlBits);
+                BIT_addBits(&blockStream, sequences[n].matchLength, mlBits);
                 if (MEM_32bits()) BIT_flushBits(&blockStream);                  /* (7)*/
-                BIT_addBits(&blockStream, offsetTable[n], ofBits);              /* 31 */
+                BIT_addBits(&blockStream, sequences[n].offset, ofBits);         /* 31 */
                 BIT_flushBits(&blockStream);                                    /* (7)*/
         }   }
 
@@ -936,7 +755,7 @@ _check_compressibility:
     `offsetCode` : distance to match, or 0 == repCode.
     `matchCode` : matchLength - MINMATCH
 */
-MEM_STATIC void ZSTD_storeSeq(seqStore_t* seqStorePtr, size_t litLength, const void* literals, size_t offsetCode, size_t matchCode)
+MEM_STATIC void ZSTD_storeSeq(seqStore_t* seqStorePtr, size_t litLength, const void* literals, U32 offsetCode, size_t matchCode)
 {
 #if 0  /* for debug */
     static const BYTE* g_start = NULL;
@@ -953,15 +772,17 @@ MEM_STATIC void ZSTD_storeSeq(seqStore_t* seqStorePtr, size_t litLength, const v
     seqStorePtr->lit += litLength;
 
     /* literal Length */
-    if (litLength>0xFFFF) { seqStorePtr->longLengthID = 1; seqStorePtr->longLengthPos = (U32)(seqStorePtr->litLength - seqStorePtr->litLengthStart); }
-    *seqStorePtr->litLength++ = (U16)litLength;
+    if (litLength>0xFFFF) { seqStorePtr->longLengthID = 1; seqStorePtr->longLengthPos = (U32)(seqStorePtr->sequences - seqStorePtr->sequencesStart); }
+    seqStorePtr->sequences[0].litLength = (U16)litLength;
 
     /* match offset */
-    *(seqStorePtr->offset++) = (U32)offsetCode + 1;
+    seqStorePtr->sequences[0].offset = offsetCode + 1;
 
     /* match Length */
-    if (matchCode>0xFFFF) { seqStorePtr->longLengthID = 2; seqStorePtr->longLengthPos = (U32)(seqStorePtr->matchLength - seqStorePtr->matchLengthStart); }
-    *seqStorePtr->matchLength++ = (U16)matchCode;
+    if (matchCode>0xFFFF) { seqStorePtr->longLengthID = 2; seqStorePtr->longLengthPos = (U32)(seqStorePtr->sequences - seqStorePtr->sequencesStart); }
+    seqStorePtr->sequences[0].matchLength = (U16)matchCode;
+
+    seqStorePtr->sequences++;
 }
 
 
@@ -1051,10 +872,9 @@ static size_t ZSTD_count(const BYTE* pIn, const BYTE* pMatch, const BYTE* const
 static size_t ZSTD_count_2segments(const BYTE* ip, const BYTE* match, const BYTE* iEnd, const BYTE* mEnd, const BYTE* iStart)
 {
     const BYTE* const vEnd = MIN( ip + (mEnd - match), iEnd);
-    size_t matchLength = ZSTD_count(ip, match, vEnd);
-    if (match + matchLength == mEnd)
-        matchLength += ZSTD_count(ip+matchLength, iStart, iEnd);
-    return matchLength;
+    size_t const matchLength = ZSTD_count(ip, match, vEnd);
+    if (match + matchLength != mEnd) return matchLength;
+    return matchLength + ZSTD_count(ip+matchLength, iStart, iEnd);
 }
 
 
@@ -1063,7 +883,7 @@ static size_t ZSTD_count_2segments(const BYTE* ip, const BYTE* match, const BYTE
 ***************************************/
 static const U32 prime3bytes = 506832829U;
 static U32    ZSTD_hash3(U32 u, U32 h) { return ((u << (32-24)) * prime3bytes)  >> (32-h) ; }
-static size_t ZSTD_hash3Ptr(const void* ptr, U32 h) { return ZSTD_hash3(MEM_readLE32(ptr), h); }
+MEM_STATIC size_t ZSTD_hash3Ptr(const void* ptr, U32 h) { return ZSTD_hash3(MEM_readLE32(ptr), h); }   /* only in zstd_opt.h */
 
 static const U32 prime4bytes = 2654435761U;
 static U32    ZSTD_hash4(U32 u, U32 h) { return (u * prime4bytes) >> (32-h) ; }
@@ -1081,6 +901,10 @@ static const U64 prime7bytes = 58295818150454627ULL;
 static size_t ZSTD_hash7(U64 u, U32 h) { return (size_t)(((u  << (64-56)) * prime7bytes) >> (64-h)) ; }
 static size_t ZSTD_hash7Ptr(const void* p, U32 h) { return ZSTD_hash7(MEM_readLE64(p), h); }
 
+static const U64 prime8bytes = 0xCF1BBCDCB7A56463ULL;
+static size_t ZSTD_hash8(U64 u, U32 h) { return (size_t)(((u) * prime8bytes) >> (64-h)) ; }
+static size_t ZSTD_hash8Ptr(const void* p, U32 h) { return ZSTD_hash8(MEM_readLE64(p), h); }
+
 static size_t ZSTD_hashPtr(const void* p, U32 hBits, U32 mls)
 {
     switch(mls)
@@ -1090,6 +914,7 @@ static size_t ZSTD_hashPtr(const void* p, U32 hBits, U32 mls)
     case 5: return ZSTD_hash5Ptr(p, hBits);
     case 6: return ZSTD_hash6Ptr(p, hBits);
     case 7: return ZSTD_hash7Ptr(p, hBits);
+    case 8: return ZSTD_hash8Ptr(p, hBits);
     }
 }
 
@@ -1100,10 +925,10 @@ static size_t ZSTD_hashPtr(const void* p, U32 hBits, U32 mls)
 static void ZSTD_fillHashTable (ZSTD_CCtx* zc, const void* end, const U32 mls)
 {
     U32* const hashTable = zc->hashTable;
-    const U32 hBits = zc->params.cParams.hashLog;
+    U32  const hBits = zc->params.cParams.hashLog;
     const BYTE* const base = zc->base;
     const BYTE* ip = base + zc->nextToUpdate;
-    const BYTE* const iend = ((const BYTE*)end) - 8;
+    const BYTE* const iend = ((const BYTE*)end) - HASH_READ_SIZE;
     const size_t fastHashFillStep = 3;
 
     while(ip <= iend) {
@@ -1119,23 +944,24 @@ void ZSTD_compressBlock_fast_generic(ZSTD_CCtx* cctx,
                                  const U32 mls)
 {
     U32* const hashTable = cctx->hashTable;
-    const U32 hBits = cctx->params.cParams.hashLog;
+    U32  const hBits = cctx->params.cParams.hashLog;
     seqStore_t* seqStorePtr = &(cctx->seqStore);
     const BYTE* const base = cctx->base;
     const BYTE* const istart = (const BYTE*)src;
     const BYTE* ip = istart;
     const BYTE* anchor = istart;
-    const U32 lowestIndex = cctx->dictLimit;
+    const U32   lowestIndex = cctx->dictLimit;
     const BYTE* const lowest = base + lowestIndex;
     const BYTE* const iend = istart + srcSize;
-    const BYTE* const ilimit = iend - 8;
-    size_t offset_1=cctx->rep[0], offset_2=cctx->rep[1];
+    const BYTE* const ilimit = iend - HASH_READ_SIZE;
+    U32 offset_1=cctx->rep[0], offset_2=cctx->rep[1];
+    U32 offsetSaved = 0;
 
     /* init */
     ip += (ip==lowest);
     {   U32 const maxRep = (U32)(ip-lowest);
-        if (offset_1 > maxRep) offset_1 = 0;
-        if (offset_2 > maxRep) offset_2 = 0;
+        if (offset_2 > maxRep) offsetSaved = offset_2, offset_2 = 0;
+        if (offset_1 > maxRep) offsetSaved = offset_1, offset_1 = 0;
     }
 
     /* Main Search Loop */
@@ -1148,17 +974,17 @@ void ZSTD_compressBlock_fast_generic(ZSTD_CCtx* cctx,
         hashTable[h] = current;   /* update hash table */
 
         if ((offset_1 > 0) & (MEM_read32(ip+1-offset_1) == MEM_read32(ip+1))) { /* note : by construction, offset_1 <= current */
-            mLength = ZSTD_count(ip+1+EQUAL_READ32, ip+1+EQUAL_READ32-offset_1, iend) + EQUAL_READ32;
+            mLength = ZSTD_count(ip+1+4, ip+1+4-offset_1, iend) + 4;
             ip++;
             ZSTD_storeSeq(seqStorePtr, ip-anchor, anchor, 0, mLength-MINMATCH);
         } else {
-            size_t offset;
+            U32 offset;
             if ( (matchIndex <= lowestIndex) || (MEM_read32(match) != MEM_read32(ip)) ) {
                 ip += ((ip-anchor) >> g_searchStrength) + 1;
                 continue;
             }
-            mLength = ZSTD_count(ip+EQUAL_READ32, match+EQUAL_READ32, iend) + EQUAL_READ32;
-            offset = ip-match;
+            mLength = ZSTD_count(ip+4, match+4, iend) + 4;
+            offset = (U32)(ip-match);
             while (((ip>anchor) & (match>lowest)) && (ip[-1] == match[-1])) { ip--; match--; mLength++; } /* catch up */
             offset_2 = offset_1;
             offset_1 = offset;
@@ -1179,8 +1005,8 @@ void ZSTD_compressBlock_fast_generic(ZSTD_CCtx* cctx,
                  && ( (offset_2>0)
                  & (MEM_read32(ip) == MEM_read32(ip - offset_2)) )) {
                 /* store sequence */
-                size_t const rLength = ZSTD_count(ip+EQUAL_READ32, ip+EQUAL_READ32-offset_2, iend) + EQUAL_READ32;
-                { size_t const tmpOff = offset_2; offset_2 = offset_1; offset_1 = tmpOff; } /* swap offset_2 <=> offset_1 */
+                size_t const rLength = ZSTD_count(ip+4, ip+4-offset_2, iend) + 4;
+                { U32 const tmpOff = offset_2; offset_2 = offset_1; offset_1 = tmpOff; }  /* swap offset_2 <=> offset_1 */
                 hashTable[ZSTD_hashPtr(ip, hBits, mls)] = (U32)(ip-base);
                 ZSTD_storeSeq(seqStorePtr, 0, anchor, 0, rLength-MINMATCH);
                 ip += rLength;
@@ -1189,8 +1015,8 @@ void ZSTD_compressBlock_fast_generic(ZSTD_CCtx* cctx,
     }   }   }
 
     /* save reps for next block */
-    cctx->savedRep[0] = offset_1 ? (U32)offset_1 : (U32)(iend - base) + 1;
-    cctx->savedRep[1] = offset_2 ? (U32)offset_2 : (U32)(iend - base) + 1;
+    cctx->savedRep[0] = offset_1 ? offset_1 : offsetSaved;
+    cctx->savedRep[1] = offset_2 ? offset_2 : offsetSaved;
 
     /* Last Literals */
     {   size_t const lastLLSize = iend - anchor;
@@ -1317,7 +1143,7 @@ static void ZSTD_compressBlock_fast_extDict_generic(ZSTD_CCtx* ctx,
 static void ZSTD_compressBlock_fast_extDict(ZSTD_CCtx* ctx,
                          const void* src, size_t srcSize)
 {
-    const U32 mls = ctx->params.cParams.searchLength;
+    U32 const mls = ctx->params.cParams.searchLength;
     switch(mls)
     {
     default:
@@ -1334,6 +1160,283 @@ static void ZSTD_compressBlock_fast_extDict(ZSTD_CCtx* ctx,
 
 
 /*-*************************************
+*  Double Fast
+***************************************/
+static void ZSTD_fillDoubleHashTable (ZSTD_CCtx* cctx, const void* end, const U32 mls)
+{
+    U32* const hashLarge = cctx->hashTable;
+    U32  const hBitsL = cctx->params.cParams.hashLog;
+    U32* const hashSmall = cctx->chainTable;
+    U32  const hBitsS = cctx->params.cParams.chainLog;
+    const BYTE* const base = cctx->base;
+    const BYTE* ip = base + cctx->nextToUpdate;
+    const BYTE* const iend = ((const BYTE*)end) - HASH_READ_SIZE;
+    const size_t fastHashFillStep = 3;
+
+    while(ip <= iend) {
+        hashSmall[ZSTD_hashPtr(ip, hBitsS, mls)] = (U32)(ip - base);
+        hashLarge[ZSTD_hashPtr(ip, hBitsL, 8)] = (U32)(ip - base);
+        ip += fastHashFillStep;
+    }
+}
+
+
+FORCE_INLINE
+void ZSTD_compressBlock_doubleFast_generic(ZSTD_CCtx* cctx,
+                                 const void* src, size_t srcSize,
+                                 const U32 mls)
+{
+    U32* const hashLong = cctx->hashTable;
+    const U32 hBitsL = cctx->params.cParams.hashLog;
+    U32* const hashSmall = cctx->chainTable;
+    const U32 hBitsS = cctx->params.cParams.chainLog;
+    seqStore_t* seqStorePtr = &(cctx->seqStore);
+    const BYTE* const base = cctx->base;
+    const BYTE* const istart = (const BYTE*)src;
+    const BYTE* ip = istart;
+    const BYTE* anchor = istart;
+    const U32 lowestIndex = cctx->dictLimit;
+    const BYTE* const lowest = base + lowestIndex;
+    const BYTE* const iend = istart + srcSize;
+    const BYTE* const ilimit = iend - HASH_READ_SIZE;
+    U32 offset_1=cctx->rep[0], offset_2=cctx->rep[1];
+    U32 offsetSaved = 0;
+
+    /* init */
+    ip += (ip==lowest);
+    {   U32 const maxRep = (U32)(ip-lowest);
+        if (offset_2 > maxRep) offsetSaved = offset_2, offset_2 = 0;
+        if (offset_1 > maxRep) offsetSaved = offset_1, offset_1 = 0;
+    }
+
+    /* Main Search Loop */
+    while (ip < ilimit) {   /* < instead of <=, because repcode check at (ip+1) */
+        size_t mLength;
+        size_t const h2 = ZSTD_hashPtr(ip, hBitsL, 8);
+        size_t const h = ZSTD_hashPtr(ip, hBitsS, mls);
+        U32 const current = (U32)(ip-base);
+        U32 const matchIndexL = hashLong[h2];
+        U32 const matchIndexS = hashSmall[h];
+        const BYTE* matchLong = base + matchIndexL;
+        const BYTE* match = base + matchIndexS;
+        hashLong[h2] = hashSmall[h] = current;   /* update hash tables */
+
+        if ((offset_1 > 0) & (MEM_read32(ip+1-offset_1) == MEM_read32(ip+1))) { /* note : by construction, offset_1 <= current */
+            mLength = ZSTD_count(ip+1+4, ip+1+4-offset_1, iend) + 4;
+            ip++;
+            ZSTD_storeSeq(seqStorePtr, ip-anchor, anchor, 0, mLength-MINMATCH);
+        } else {
+            U32 offset;
+            if ( (matchIndexL > lowestIndex) && (MEM_read64(matchLong) == MEM_read64(ip)) ) {
+                mLength = ZSTD_count(ip+8, matchLong+8, iend) + 8;
+                offset = (U32)(ip-matchLong);
+                while (((ip>anchor) & (matchLong>lowest)) && (ip[-1] == matchLong[-1])) { ip--; matchLong--; mLength++; } /* catch up */
+            } else if ( (matchIndexS > lowestIndex) && (MEM_read32(match) == MEM_read32(ip)) ) {
+                mLength = ZSTD_count(ip+4, match+4, iend) + 4;
+                offset = (U32)(ip-match);
+                while (((ip>anchor) & (match>lowest)) && (ip[-1] == match[-1])) { ip--; match--; mLength++; } /* catch up */
+            } else {
+                ip += ((ip-anchor) >> g_searchStrength) + 1;
+                continue;
+            }
+
+            offset_2 = offset_1;
+            offset_1 = offset;
+
+            ZSTD_storeSeq(seqStorePtr, ip-anchor, anchor, offset + ZSTD_REP_MOVE, mLength-MINMATCH);
+        }
+
+        /* match found */
+        ip += mLength;
+        anchor = ip;
+
+        if (ip <= ilimit) {
+            /* Fill Table */
+            hashLong[ZSTD_hashPtr(base+current+2, hBitsL, 8)] =
+                hashSmall[ZSTD_hashPtr(base+current+2, hBitsS, mls)] = current+2;  /* here because current+2 could be > iend-8 */
+            hashLong[ZSTD_hashPtr(ip-2, hBitsL, 8)] =
+                hashSmall[ZSTD_hashPtr(ip-2, hBitsS, mls)] = (U32)(ip-2-base);
+
+            /* check immediate repcode */
+            while ( (ip <= ilimit)
+                 && ( (offset_2>0)
+                 & (MEM_read32(ip) == MEM_read32(ip - offset_2)) )) {
+                /* store sequence */
+                size_t const rLength = ZSTD_count(ip+4, ip+4-offset_2, iend) + 4;
+                { U32 const tmpOff = offset_2; offset_2 = offset_1; offset_1 = tmpOff; } /* swap offset_2 <=> offset_1 */
+                hashSmall[ZSTD_hashPtr(ip, hBitsS, mls)] = (U32)(ip-base);
+                hashLong[ZSTD_hashPtr(ip, hBitsL, 8)] = (U32)(ip-base);
+                ZSTD_storeSeq(seqStorePtr, 0, anchor, 0, rLength-MINMATCH);
+                ip += rLength;
+                anchor = ip;
+                continue;   /* faster when present ... (?) */
+    }   }   }
+
+    /* save reps for next block */
+    cctx->savedRep[0] = offset_1 ? offset_1 : offsetSaved;
+    cctx->savedRep[1] = offset_2 ? offset_2 : offsetSaved;
+
+    /* Last Literals */
+    {   size_t const lastLLSize = iend - anchor;
+        memcpy(seqStorePtr->lit, anchor, lastLLSize);
+        seqStorePtr->lit += lastLLSize;
+    }
+}
+
+
+static void ZSTD_compressBlock_doubleFast(ZSTD_CCtx* ctx, const void* src, size_t srcSize)
+{
+    const U32 mls = ctx->params.cParams.searchLength;
+    switch(mls)
+    {
+    default:
+    case 4 :
+        ZSTD_compressBlock_doubleFast_generic(ctx, src, srcSize, 4); return;
+    case 5 :
+        ZSTD_compressBlock_doubleFast_generic(ctx, src, srcSize, 5); return;
+    case 6 :
+        ZSTD_compressBlock_doubleFast_generic(ctx, src, srcSize, 6); return;
+    case 7 :
+        ZSTD_compressBlock_doubleFast_generic(ctx, src, srcSize, 7); return;
+    }
+}
+
+
+static void ZSTD_compressBlock_doubleFast_extDict_generic(ZSTD_CCtx* ctx,
+                                 const void* src, size_t srcSize,
+                                 const U32 mls)
+{
+    U32* const hashLong = ctx->hashTable;
+    U32  const hBitsL = ctx->params.cParams.hashLog;
+    U32* const hashSmall = ctx->chainTable;
+    U32  const hBitsS = ctx->params.cParams.chainLog;
+    seqStore_t* seqStorePtr = &(ctx->seqStore);
+    const BYTE* const base = ctx->base;
+    const BYTE* const dictBase = ctx->dictBase;
+    const BYTE* const istart = (const BYTE*)src;
+    const BYTE* ip = istart;
+    const BYTE* anchor = istart;
+    const U32   lowestIndex = ctx->lowLimit;
+    const BYTE* const dictStart = dictBase + lowestIndex;
+    const U32   dictLimit = ctx->dictLimit;
+    const BYTE* const lowPrefixPtr = base + dictLimit;
+    const BYTE* const dictEnd = dictBase + dictLimit;
+    const BYTE* const iend = istart + srcSize;
+    const BYTE* const ilimit = iend - 8;
+    U32 offset_1=ctx->rep[0], offset_2=ctx->rep[1];
+
+    /* Search Loop */
+    while (ip < ilimit) {  /* < instead of <=, because (ip+1) */
+        const size_t hSmall = ZSTD_hashPtr(ip, hBitsS, mls);
+        const U32 matchIndex = hashSmall[hSmall];
+        const BYTE* matchBase = matchIndex < dictLimit ? dictBase : base;
+        const BYTE* match = matchBase + matchIndex;
+
+        const size_t hLong = ZSTD_hashPtr(ip, hBitsL, 8);
+        const U32 matchLongIndex = hashLong[hLong];
+        const BYTE* matchLongBase = matchLongIndex < dictLimit ? dictBase : base;
+        const BYTE* matchLong = matchLongBase + matchLongIndex;
+
+        const U32 current = (U32)(ip-base);
+        const U32 repIndex = current + 1 - offset_1;   /* offset_1 expected <= current +1 */
+        const BYTE* repBase = repIndex < dictLimit ? dictBase : base;
+        const BYTE* repMatch = repBase + repIndex;
+        size_t mLength;
+        hashSmall[hSmall] = hashLong[hLong] = current;   /* update hash table */
+
+        if ( (((U32)((dictLimit-1) - repIndex) >= 3) /* intentional underflow */ & (repIndex > lowestIndex))
+           && (MEM_read32(repMatch) == MEM_read32(ip+1)) ) {
+            const BYTE* repMatchEnd = repIndex < dictLimit ? dictEnd : iend;
+            mLength = ZSTD_count_2segments(ip+1+4, repMatch+4, iend, repMatchEnd, lowPrefixPtr) + 4;
+            ip++;
+            ZSTD_storeSeq(seqStorePtr, ip-anchor, anchor, 0, mLength-MINMATCH);
+        } else {
+            if ((matchLongIndex > lowestIndex) && (MEM_read64(matchLong) == MEM_read64(ip))) {
+                const BYTE* matchEnd = matchLongIndex < dictLimit ? dictEnd : iend;
+                const BYTE* lowMatchPtr = matchLongIndex < dictLimit ? dictStart : lowPrefixPtr;
+                U32 offset;
+                mLength = ZSTD_count_2segments(ip+8, matchLong+8, iend, matchEnd, lowPrefixPtr) + 8;
+                offset = current - matchLongIndex;
+                while (((ip>anchor) & (matchLong>lowMatchPtr)) && (ip[-1] == matchLong[-1])) { ip--; matchLong--; mLength++; }   /* catch up */
+                offset_2 = offset_1;
+                offset_1 = offset;
+                ZSTD_storeSeq(seqStorePtr, ip-anchor, anchor, offset + ZSTD_REP_MOVE, mLength-MINMATCH);
+            } else if ((matchIndex > lowestIndex) && (MEM_read32(match) == MEM_read32(ip))) {
+                const BYTE* matchEnd = matchIndex < dictLimit ? dictEnd : iend;
+                const BYTE* lowMatchPtr = matchIndex < dictLimit ? dictStart : lowPrefixPtr;
+                U32 offset;
+                mLength = ZSTD_count_2segments(ip+4, match+4, iend, matchEnd, lowPrefixPtr) + 4;
+                while (((ip>anchor) & (match>lowMatchPtr)) && (ip[-1] == match[-1])) { ip--; match--; mLength++; }   /* catch up */
+                offset = current - matchIndex;
+                offset_2 = offset_1;
+                offset_1 = offset;
+                ZSTD_storeSeq(seqStorePtr, ip-anchor, anchor, offset + ZSTD_REP_MOVE, mLength-MINMATCH);
+            } else {
+                ip += ((ip-anchor) >> g_searchStrength) + 1;
+                continue;
+        }   }
+
+        /* found a match : store it */
+        ip += mLength;
+        anchor = ip;
+
+        if (ip <= ilimit) {
+            /* Fill Table */
+            hashSmall[ZSTD_hashPtr(base+current+2, hBitsS, mls)] = current+2;
+            hashLong[ZSTD_hashPtr(base+current+2, hBitsL, 8)] = current+2;
+            hashSmall[ZSTD_hashPtr(ip-2, hBitsS, mls)] = (U32)(ip-2-base);
+            hashLong[ZSTD_hashPtr(ip-2, hBitsL, 8)] = (U32)(ip-2-base);
+            /* check immediate repcode */
+            while (ip <= ilimit) {
+                U32 const current2 = (U32)(ip-base);
+                U32 const repIndex2 = current2 - offset_2;
+                const BYTE* repMatch2 = repIndex2 < dictLimit ? dictBase + repIndex2 : base + repIndex2;
+                if ( (((U32)((dictLimit-1) - repIndex2) >= 3) & (repIndex2 > lowestIndex))  /* intentional overflow */
+                   && (MEM_read32(repMatch2) == MEM_read32(ip)) ) {
+                    const BYTE* const repEnd2 = repIndex2 < dictLimit ? dictEnd : iend;
+                    size_t const repLength2 = ZSTD_count_2segments(ip+EQUAL_READ32, repMatch2+EQUAL_READ32, iend, repEnd2, lowPrefixPtr) + EQUAL_READ32;
+                    U32 tmpOffset = offset_2; offset_2 = offset_1; offset_1 = tmpOffset;   /* swap offset_2 <=> offset_1 */
+                    ZSTD_storeSeq(seqStorePtr, 0, anchor, 0, repLength2-MINMATCH);
+                    hashSmall[ZSTD_hashPtr(ip, hBitsS, mls)] = current2;
+                    hashLong[ZSTD_hashPtr(ip, hBitsL, 8)] = current2;
+                    ip += repLength2;
+                    anchor = ip;
+                    continue;
+                }
+                break;
+    }   }   }
+
+    /* save reps for next block */
+    ctx->savedRep[0] = offset_1; ctx->savedRep[1] = offset_2;
+
+    /* Last Literals */
+    {   size_t const lastLLSize = iend - anchor;
+        memcpy(seqStorePtr->lit, anchor, lastLLSize);
+        seqStorePtr->lit += lastLLSize;
+    }
+}
+
+
+static void ZSTD_compressBlock_doubleFast_extDict(ZSTD_CCtx* ctx,
+                         const void* src, size_t srcSize)
+{
+    U32 const mls = ctx->params.cParams.searchLength;
+    switch(mls)
+    {
+    default:
+    case 4 :
+        ZSTD_compressBlock_doubleFast_extDict_generic(ctx, src, srcSize, 4); return;
+    case 5 :
+        ZSTD_compressBlock_doubleFast_extDict_generic(ctx, src, srcSize, 5); return;
+    case 6 :
+        ZSTD_compressBlock_doubleFast_extDict_generic(ctx, src, srcSize, 6); return;
+    case 7 :
+        ZSTD_compressBlock_doubleFast_extDict_generic(ctx, src, srcSize, 7); return;
+    }
+}
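
Reviewer note: the "Double Fast" functions above probe two tables at each position. A table keyed on 8 bytes is tried first, and a table keyed on `mls` bytes is the fallback. The sketch below is a simplified standalone illustration of that lookup order only. All `ex_` names, the fixed table size, and the multiplicative hashes are assumptions for illustration (they are not `ZSTD_hashPtr`), and index 0 is treated as "empty" instead of comparing against `lowestIndex`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define EX_HASH_LOG 10   /* hypothetical table size: 2^10 entries each */

static uint32_t ex_read32(const uint8_t* p) { uint32_t v; memcpy(&v, p, 4); return v; }
static uint64_t ex_read64(const uint8_t* p) { uint64_t v; memcpy(&v, p, 8); return v; }

/* Hypothetical hashes, stand-ins for ZSTD_hashPtr(ip, hBits, 4) and (ip, hBits, 8). */
static uint32_t ex_hash4(const uint8_t* p)
{ return (ex_read32(p) * 2654435761U) >> (32 - EX_HASH_LOG); }
static uint32_t ex_hash8(const uint8_t* p)
{ return (uint32_t)((ex_read64(p) * 0x9E3779B97F4A7C15ULL) >> (64 - EX_HASH_LOG)); }

typedef struct {
    uint32_t longTable[1 << EX_HASH_LOG];    /* keyed on 8 bytes */
    uint32_t smallTable[1 << EX_HASH_LOG];   /* keyed on 4 bytes */
} ex_tables;

/* Probes both tables at `pos`, then records `pos` in both.
 * Returns 8 if the long candidate matches on 8 bytes, 4 if only the
 * short candidate matches on 4 bytes, 0 otherwise.
 * The caller must guarantee that 8 bytes are readable at base+pos. */
static int ex_findMatch(ex_tables* t, const uint8_t* base, uint32_t pos, uint32_t* matchPos)
{
    uint32_t const hL = ex_hash8(base + pos);
    uint32_t const hS = ex_hash4(base + pos);
    uint32_t const candL = t->longTable[hL];
    uint32_t const candS = t->smallTable[hS];
    t->longTable[hL] = t->smallTable[hS] = pos;   /* update both tables */

    if (candL > 0 && ex_read64(base + candL) == ex_read64(base + pos)) {
        *matchPos = candL;
        return 8;
    }
    if (candS > 0 && ex_read32(base + candS) == ex_read32(base + pos)) {
        *matchPos = candS;
        return 4;
    }
    return 0;
}
```

Trying the 8-byte candidate first favours longer matches, and the short table still catches 4-byte repeats that the long table misses.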
+
+
+/*-*************************************
 *  Binary Tree search
 ***************************************/
 /** ZSTD_insertBt1() : add one or multiple positions to tree.
@@ -1342,13 +1445,13 @@ static void ZSTD_compressBlock_fast_extDict(ZSTD_CCtx* ctx,
 static U32 ZSTD_insertBt1(ZSTD_CCtx* zc, const BYTE* const ip, const U32 mls, const BYTE* const iend, U32 nbCompares,
                           U32 extDict)
 {
-    U32* const hashTable = zc->hashTable;
-    const U32 hashLog = zc->params.cParams.hashLog;
-    const size_t h  = ZSTD_hashPtr(ip, hashLog, mls);
-    U32* const bt   = zc->chainTable;
-    const U32 btLog = zc->params.cParams.chainLog - 1;
-    const U32 btMask= (1 << btLog) - 1;
-    U32 matchIndex  = hashTable[h];
+    U32*   const hashTable = zc->hashTable;
+    U32    const hashLog = zc->params.cParams.hashLog;
+    size_t const h  = ZSTD_hashPtr(ip, hashLog, mls);
+    U32*   const bt = zc->chainTable;
+    U32    const btLog  = zc->params.cParams.chainLog - 1;
+    U32    const btMask = (1 << btLog) - 1;
+    U32 matchIndex = hashTable[h];
     size_t commonLengthSmaller=0, commonLengthLarger=0;
     const BYTE* const base = zc->base;
     const BYTE* const dictBase = zc->dictBase;
@@ -1361,20 +1464,22 @@ static U32 ZSTD_insertBt1(ZSTD_CCtx* zc, const BYTE* const ip, const U32 mls, co
     U32* smallerPtr = bt + 2*(current&btMask);
     U32* largerPtr  = smallerPtr + 1;
     U32 dummy32;   /* to be nullified at the end */
-    const U32 windowLow = zc->lowLimit;
+    U32 const windowLow = zc->lowLimit;
     U32 matchEndIdx = current+8;
     size_t bestLength = 8;
+#ifdef ZSTD_C_PREDICT
     U32 predictedSmall = *(bt + 2*((current-1)&btMask) + 0);
     U32 predictedLarge = *(bt + 2*((current-1)&btMask) + 1);
     predictedSmall += (predictedSmall>0);
     predictedLarge += (predictedLarge>0);
+#endif /* ZSTD_C_PREDICT */
 
     hashTable[h] = current;   /* Update Hash Table */
 
     while (nbCompares-- && (matchIndex > windowLow)) {
         U32* nextPtr = bt + 2*(matchIndex & btMask);
         size_t matchLength = MIN(commonLengthSmaller, commonLengthLarger);   /* guaranteed minimum nb of common bytes */
-#if 0   /* note : can create issues when hlog small <= 11 */
+#ifdef ZSTD_C_PREDICT   /* note : can create issues when hlog small <= 11 */
         const U32* predictPtr = bt + 2*((matchIndex-1) & btMask);   /* written this way, as bt is a roll buffer */
         if (matchIndex == predictedSmall) {
             /* no need to check length, result known */
@@ -1444,12 +1549,12 @@ static size_t ZSTD_insertBtAndFindBestMatch (
                         U32 nbCompares, const U32 mls,
                         U32 extDict)
 {
-    U32* const hashTable = zc->hashTable;
-    const U32 hashLog = zc->params.cParams.hashLog;
-    const size_t h  = ZSTD_hashPtr(ip, hashLog, mls);
-    U32* const bt   = zc->chainTable;
-    const U32 btLog = zc->params.cParams.chainLog - 1;
-    const U32 btMask= (1 << btLog) - 1;
+    U32*   const hashTable = zc->hashTable;
+    U32    const hashLog = zc->params.cParams.hashLog;
+    size_t const h  = ZSTD_hashPtr(ip, hashLog, mls);
+    U32*   const bt = zc->chainTable;
+    U32    const btLog  = zc->params.cParams.chainLog - 1;
+    U32    const btMask = (1 << btLog) - 1;
     U32 matchIndex  = hashTable[h];
     size_t commonLengthSmaller=0, commonLengthLarger=0;
     const BYTE* const base = zc->base;
@@ -1595,13 +1700,11 @@ static size_t ZSTD_BtFindBestMatch_selectMLS_extDict (
 
 
 
-/* ***********************
+/* *********************************
 *  Hash Chain
-*************************/
-
+***********************************/
 #define NEXT_IN_CHAIN(d, mask)   chainTable[(d) & mask]
 
-
 /* Update chains up to ip (excluded)
    Assumption : always within prefix (ie. not within extDict) */
 FORCE_INLINE
@@ -1731,17 +1834,15 @@ void ZSTD_compressBlock_lazy_generic(ZSTD_CCtx* ctx,
                         size_t* offsetPtr,
                         U32 maxNbAttempts, U32 matchLengthSearch);
     searchMax_f const searchMax = searchMethod ? ZSTD_BtFindBestMatch_selectMLS : ZSTD_HcFindBestMatch_selectMLS;
-    U32 rep[ZSTD_REP_INIT];
+    U32 offset_1 = ctx->rep[0], offset_2 = ctx->rep[1], savedOffset=0;
 
     /* init */
     ip += (ip==base);
     ctx->nextToUpdate3 = ctx->nextToUpdate;
-    {   U32 i;
-        U32 const maxRep = (U32)(ip-base);
-        for (i=0; i<ZSTD_REP_INIT; i++) {
-            rep[i]=ctx->rep[i];
-            if (rep[i]>maxRep) rep[i]=0;
-    }   }
+    {   U32 const maxRep = (U32)(ip-base);
+        if (offset_2 > maxRep) savedOffset = offset_2, offset_2 = 0;
+        if (offset_1 > maxRep) savedOffset = offset_1, offset_1 = 0;
+    }
 
     /* Match Loop */
     while (ip < ilimit) {
@@ -1750,9 +1851,9 @@ void ZSTD_compressBlock_lazy_generic(ZSTD_CCtx* ctx,
         const BYTE* start=ip+1;
 
         /* check repCode */
-        if ((rep[0]>0) & (MEM_read32(ip+1) == MEM_read32(ip+1 - rep[0]))) {
+        if ((offset_1>0) & (MEM_read32(ip+1) == MEM_read32(ip+1 - offset_1))) {
             /* repcode : we take it */
-            matchLength = ZSTD_count(ip+1+EQUAL_READ32, ip+1+EQUAL_READ32-rep[0], iend) + EQUAL_READ32;
+            matchLength = ZSTD_count(ip+1+EQUAL_READ32, ip+1+EQUAL_READ32-offset_1, iend) + EQUAL_READ32;
             if (depth==0) goto _storeSequence;
         }
 
@@ -1772,8 +1873,8 @@ void ZSTD_compressBlock_lazy_generic(ZSTD_CCtx* ctx,
         if (depth>=1)
         while (ip<ilimit) {
             ip ++;
-            if ((offset) && ((rep[0]>0) & (MEM_read32(ip) == MEM_read32(ip - rep[0])))) {
-                size_t const mlRep = ZSTD_count(ip+EQUAL_READ32, ip+EQUAL_READ32-rep[0], iend) + EQUAL_READ32;
+            if ((offset) && ((offset_1>0) & (MEM_read32(ip) == MEM_read32(ip - offset_1)))) {
+                size_t const mlRep = ZSTD_count(ip+EQUAL_READ32, ip+EQUAL_READ32-offset_1, iend) + EQUAL_READ32;
                 int const gain2 = (int)(mlRep * 3);
                 int const gain1 = (int)(matchLength*3 - ZSTD_highbit32((U32)offset+1) + 1);
                 if ((mlRep >= EQUAL_READ32) && (gain2 > gain1))
@@ -1791,8 +1892,8 @@ void ZSTD_compressBlock_lazy_generic(ZSTD_CCtx* ctx,
             /* let's find an even better one */
             if ((depth==2) && (ip<ilimit)) {
                 ip ++;
-                if ((offset) && ((rep[0]>0) & (MEM_read32(ip) == MEM_read32(ip - rep[0])))) {
-                    size_t const ml2 = ZSTD_count(ip+EQUAL_READ32, ip+EQUAL_READ32-rep[0], iend) + EQUAL_READ32;
+                if ((offset) && ((offset_1>0) & (MEM_read32(ip) == MEM_read32(ip - offset_1)))) {
+                    size_t const ml2 = ZSTD_count(ip+EQUAL_READ32, ip+EQUAL_READ32-offset_1, iend) + EQUAL_READ32;
                     int const gain2 = (int)(ml2 * 4);
                     int const gain1 = (int)(matchLength*4 - ZSTD_highbit32((U32)offset+1) + 1);
                     if ((ml2 >= EQUAL_READ32) && (gain2 > gain1))
@@ -1813,23 +1914,23 @@ void ZSTD_compressBlock_lazy_generic(ZSTD_CCtx* ctx,
         if (offset) {
             while ((start>anchor) && (start>base+offset-ZSTD_REP_MOVE) && (start[-1] == start[-1-offset+ZSTD_REP_MOVE]))   /* only search for offset within prefix */
                 { start--; matchLength++; }
-            rep[1] = rep[0]; rep[0] = (U32)(offset - ZSTD_REP_MOVE);
+            offset_2 = offset_1; offset_1 = (U32)(offset - ZSTD_REP_MOVE);
         }
 
         /* store sequence */
 _storeSequence:
         {   size_t const litLength = start - anchor;
-            ZSTD_storeSeq(seqStorePtr, litLength, anchor, offset, matchLength-MINMATCH);
+            ZSTD_storeSeq(seqStorePtr, litLength, anchor, (U32)offset, matchLength-MINMATCH);
             anchor = ip = start + matchLength;
         }
 
         /* check immediate repcode */
         while ( (ip <= ilimit)
-             && ((rep[1]>0)
-             & (MEM_read32(ip) == MEM_read32(ip - rep[1])) )) {
+             && ((offset_2>0)
+             & (MEM_read32(ip) == MEM_read32(ip - offset_2)) )) {
             /* store sequence */
-            matchLength = ZSTD_count(ip+EQUAL_READ32, ip+EQUAL_READ32-rep[1], iend) + EQUAL_READ32;
-            offset = rep[1]; rep[1] = rep[0]; rep[0] = (U32)offset; /* swap repcodes */
+            matchLength = ZSTD_count(ip+EQUAL_READ32, ip+EQUAL_READ32-offset_2, iend) + EQUAL_READ32;
+            offset = offset_2; offset_2 = offset_1; offset_1 = (U32)offset; /* swap repcodes */
             ZSTD_storeSeq(seqStorePtr, 0, anchor, 0, matchLength-MINMATCH);
             ip += matchLength;
             anchor = ip;
@@ -1837,11 +1938,8 @@ _storeSequence:
     }   }
 
     /* Save reps for next block */
-    {   int i;
-        for (i=0; i<ZSTD_REP_NUM; i++) {
-            if (!rep[i]) rep[i] = (U32)(iend - ctx->base) + 1;   /* in case some zero are left */
-            ctx->savedRep[i] = rep[i];
-    }   }
+    ctx->savedRep[0] = offset_1 ? offset_1 : savedOffset;
+    ctx->savedRep[1] = offset_2 ? offset_2 : savedOffset;
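
Reviewer note: the repcode handling in this hunk replaces the `rep[]` array with two scalars plus one saved value. A repeat offset that reaches before the start of the block's history is zeroed for the duration of the block and restored when the reps are saved. The sketch below isolates that logic under the assumption that it can be read independently of the match loop. The `ex_` names and the struct are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint32_t rep1, rep2, saved; } ex_reps;

/* Mirrors the "init" step: offsets larger than maxRep are disabled (set to 0)
 * and remembered in `saved`. If both are too large, rep1 wins the slot
 * because it is checked last, as in the diff. */
static ex_reps ex_initReps(uint32_t rep1, uint32_t rep2, uint32_t maxRep)
{
    ex_reps r = { rep1, rep2, 0 };
    if (r.rep2 > maxRep) { r.saved = r.rep2; r.rep2 = 0; }
    if (r.rep1 > maxRep) { r.saved = r.rep1; r.rep1 = 0; }
    return r;
}

/* Mirrors "save reps for next block": a zeroed offset falls back to `saved`. */
static void ex_saveReps(const ex_reps* r, uint32_t out[2])
{
    out[0] = r->rep1 ? r->rep1 : r->saved;
    out[1] = r->rep2 ? r->rep2 : r->saved;
}
```

When both incoming offsets are out of range, both saved reps end up equal to the original `rep1`, because only one saved slot exists.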
 
     /* Last Literals */
     {   size_t const lastLLSize = iend - anchor;
@@ -1900,10 +1998,9 @@ void ZSTD_compressBlock_lazy_extDict_generic(ZSTD_CCtx* ctx,
                         U32 maxNbAttempts, U32 matchLengthSearch);
     searchMax_f searchMax = searchMethod ? ZSTD_BtFindBestMatch_selectMLS_extDict : ZSTD_HcFindBestMatch_extDict_selectMLS;
 
-    /* init */
-    U32 rep[ZSTD_REP_INIT];
-    { U32 i; for (i=0; i<ZSTD_REP_INIT; i++) rep[i]=ctx->rep[i]; }
+    U32 offset_1 = ctx->rep[0], offset_2 = ctx->rep[1];
 
+    /* init */
     ctx->nextToUpdate3 = ctx->nextToUpdate;
     ip += (ip == prefixStart);
 
@@ -1915,7 +2012,7 @@ void ZSTD_compressBlock_lazy_extDict_generic(ZSTD_CCtx* ctx,
         U32 current = (U32)(ip-base);
 
         /* check repCode */
-        {   const U32 repIndex = (U32)(current+1 - rep[0]);
+        {   const U32 repIndex = (U32)(current+1 - offset_1);
             const BYTE* const repBase = repIndex < dictLimit ? dictBase : base;
             const BYTE* const repMatch = repBase + repIndex;
             if (((U32)((dictLimit-1) - repIndex) >= 3) & (repIndex > lowestIndex))   /* intentional overflow */
@@ -1945,7 +2042,7 @@ void ZSTD_compressBlock_lazy_extDict_generic(ZSTD_CCtx* ctx,
             current++;
             /* check repCode */
             if (offset) {
-                const U32 repIndex = (U32)(current - rep[0]);
+                const U32 repIndex = (U32)(current - offset_1);
                 const BYTE* const repBase = repIndex < dictLimit ? dictBase : base;
                 const BYTE* const repMatch = repBase + repIndex;
                 if (((U32)((dictLimit-1) - repIndex) >= 3) & (repIndex > lowestIndex))  /* intentional overflow */
@@ -1975,7 +2072,7 @@ void ZSTD_compressBlock_lazy_extDict_generic(ZSTD_CCtx* ctx,
                 current++;
                 /* check repCode */
                 if (offset) {
-                    const U32 repIndex = (U32)(current - rep[0]);
+                    const U32 repIndex = (U32)(current - offset_1);
                     const BYTE* const repBase = repIndex < dictLimit ? dictBase : base;
                     const BYTE* const repMatch = repBase + repIndex;
                     if (((U32)((dictLimit-1) - repIndex) >= 3) & (repIndex > lowestIndex))  /* intentional overflow */
@@ -2007,19 +2104,19 @@ void ZSTD_compressBlock_lazy_extDict_generic(ZSTD_CCtx* ctx,
             const BYTE* match = (matchIndex < dictLimit) ? dictBase + matchIndex : base + matchIndex;
             const BYTE* const mStart = (matchIndex < dictLimit) ? dictStart : prefixStart;
             while ((start>anchor) && (match>mStart) && (start[-1] == match[-1])) { start--; match--; matchLength++; }  /* catch up */
-            rep[1] = rep[0]; rep[0] = (U32)(offset - ZSTD_REP_MOVE);
+            offset_2 = offset_1; offset_1 = (U32)(offset - ZSTD_REP_MOVE);
         }
 
         /* store sequence */
 _storeSequence:
         {   size_t const litLength = start - anchor;
-            ZSTD_storeSeq(seqStorePtr, litLength, anchor, offset, matchLength-MINMATCH);
+            ZSTD_storeSeq(seqStorePtr, litLength, anchor, (U32)offset, matchLength-MINMATCH);
             anchor = ip = start + matchLength;
         }
 
         /* check immediate repcode */
         while (ip <= ilimit) {
-            const U32 repIndex = (U32)((ip-base) - rep[1]);
+            const U32 repIndex = (U32)((ip-base) - offset_2);
             const BYTE* const repBase = repIndex < dictLimit ? dictBase : base;
             const BYTE* const repMatch = repBase + repIndex;
             if (((U32)((dictLimit-1) - repIndex) >= 3) & (repIndex > lowestIndex))  /* intentional overflow */
@@ -2027,7 +2124,7 @@ _storeSequence:
                 /* repcode detected we should take it */
                 const BYTE* const repEnd = repIndex < dictLimit ? dictEnd : iend;
                 matchLength = ZSTD_count_2segments(ip+EQUAL_READ32, repMatch+EQUAL_READ32, iend, repEnd, prefixStart) + EQUAL_READ32;
-                offset = rep[1]; rep[1] = rep[0]; rep[0] = (U32)offset;   /* swap offset history */
+                offset = offset_2; offset_2 = offset_1; offset_1 = (U32)offset;   /* swap offset history */
                 ZSTD_storeSeq(seqStorePtr, 0, anchor, 0, matchLength-MINMATCH);
                 ip += matchLength;
                 anchor = ip;
@@ -2037,7 +2134,7 @@ _storeSequence:
     }   }
 
     /* Save reps for next block */
-    ctx->savedRep[0] = rep[0]; ctx->savedRep[1] = rep[1]; ctx->savedRep[2] = rep[2];
+    ctx->savedRep[0] = offset_1; ctx->savedRep[1] = offset_2;
 
     /* Last Literals */
     {   size_t const lastLLSize = iend - anchor;
@@ -2068,18 +2165,27 @@ static void ZSTD_compressBlock_btlazy2_extDict(ZSTD_CCtx* ctx, const void* src,
 }
 
 
-
 /* The optimal parser */
 #include "zstd_opt.h"
 
 static void ZSTD_compressBlock_btopt(ZSTD_CCtx* ctx, const void* src, size_t srcSize)
 {
+#ifdef ZSTD_OPT_H_91842398743
     ZSTD_compressBlock_opt_generic(ctx, src, srcSize);
+#else
+    (void)ctx; (void)src; (void)srcSize;
+    return;
+#endif
 }
 
 static void ZSTD_compressBlock_btopt_extDict(ZSTD_CCtx* ctx, const void* src, size_t srcSize)
 {
+#ifdef ZSTD_OPT_H_91842398743
     ZSTD_compressBlock_opt_extDict_generic(ctx, src, srcSize);
+#else
+    (void)ctx; (void)src; (void)srcSize;
+    return;
+#endif
 }
 
 
@@ -2087,9 +2193,9 @@ typedef void (*ZSTD_blockCompressor) (ZSTD_CCtx* ctx, const void* src, size_t sr
 
 static ZSTD_blockCompressor ZSTD_selectBlockCompressor(ZSTD_strategy strat, int extDict)
 {
-    static const ZSTD_blockCompressor blockCompressor[2][6] = {
-        { ZSTD_compressBlock_fast, ZSTD_compressBlock_greedy, ZSTD_compressBlock_lazy, ZSTD_compressBlock_lazy2, ZSTD_compressBlock_btlazy2, ZSTD_compressBlock_btopt },
-        { ZSTD_compressBlock_fast_extDict, ZSTD_compressBlock_greedy_extDict, ZSTD_compressBlock_lazy_extDict,ZSTD_compressBlock_lazy2_extDict, ZSTD_compressBlock_btlazy2_extDict, ZSTD_compressBlock_btopt_extDict }
+    static const ZSTD_blockCompressor blockCompressor[2][7] = {
+        { ZSTD_compressBlock_fast, ZSTD_compressBlock_doubleFast, ZSTD_compressBlock_greedy, ZSTD_compressBlock_lazy, ZSTD_compressBlock_lazy2, ZSTD_compressBlock_btlazy2, ZSTD_compressBlock_btopt },
+        { ZSTD_compressBlock_fast_extDict, ZSTD_compressBlock_doubleFast_extDict, ZSTD_compressBlock_greedy_extDict, ZSTD_compressBlock_lazy_extDict,ZSTD_compressBlock_lazy2_extDict, ZSTD_compressBlock_btlazy2_extDict, ZSTD_compressBlock_btopt_extDict }
     };
 
     return blockCompressor[extDict][(U32)strat];
@@ -2106,25 +2212,32 @@ static size_t ZSTD_compressBlock_internal(ZSTD_CCtx* zc, void* dst, size_t dstCa
 }
 
 
-
-
+/*! ZSTD_compress_generic() :
+*   Compress a chunk of data into one or multiple blocks.
+*   All blocks will be terminated, all input will be consumed.
+*   Function will issue an error if there is not enough `dstCapacity` to hold the compressed content.
+*   Frame is supposed already started (header already produced)
+*   @return : compressed size, or an error code
+*/
 static size_t ZSTD_compress_generic (ZSTD_CCtx* cctx,
                                      void* dst, size_t dstCapacity,
-                               const void* src, size_t srcSize)
+                               const void* src, size_t srcSize,
+                                     U32 lastFrameChunk)
 {
     size_t blockSize = cctx->blockSize;
     size_t remaining = srcSize;
     const BYTE* ip = (const BYTE*)src;
     BYTE* const ostart = (BYTE*)dst;
     BYTE* op = ostart;
-    const U32 maxDist = 1 << cctx->params.cParams.windowLog;
+    U32 const maxDist = 1 << cctx->params.cParams.windowLog;
     ZSTD_stats_t* stats = &cctx->seqStore.stats;
-    ZSTD_statsInit(stats);
+    ZSTD_statsInit(stats);   /* debug only */
 
     if (cctx->params.fParams.checksumFlag)
         XXH64_update(&cctx->xxhState, src, srcSize);
 
     while (remaining) {
+        U32 const lastBlock = lastFrameChunk & (blockSize >= remaining);
         size_t cSize;
         ZSTD_statsResetFreqs(stats);   /* debug only */
 
@@ -2142,14 +2255,15 @@ static size_t ZSTD_compress_generic (ZSTD_CCtx* cctx,
         if (ZSTD_isError(cSize)) return cSize;
 
         if (cSize == 0) {  /* block is not compressible */
-            cSize = ZSTD_noCompressBlock(op, dstCapacity, ip, blockSize);
-            if (ZSTD_isError(cSize)) return cSize;
+            U32 const cBlockHeader24 = lastBlock + (((U32)bt_raw)<<1) + (U32)(blockSize << 3);
+            if (blockSize + ZSTD_blockHeaderSize > dstCapacity) return ERROR(dstSize_tooSmall);
+            MEM_writeLE32(op, cBlockHeader24);   /* no pb, 4th byte will be overwritten */
+            memcpy(op + ZSTD_blockHeaderSize, ip, blockSize);
+            cSize = ZSTD_blockHeaderSize+blockSize;
         } else {
-            op[0] = (BYTE)(cSize>>16);
-            op[1] = (BYTE)(cSize>>8);
-            op[2] = (BYTE)cSize;
-            op[0] += (BYTE)(bt_compressed << 6); /* is a compressed block */
-            cSize += 3;
+            U32 const cBlockHeader24 = lastBlock + (((U32)bt_compressed)<<1) + (U32)(cSize << 3);
+            MEM_writeLE24(op, cBlockHeader24);
+            cSize += ZSTD_blockHeaderSize;
         }
 
         remaining -= blockSize;
@@ -2158,7 +2272,8 @@ static size_t ZSTD_compress_generic (ZSTD_CCtx* cctx,
         op += cSize;
     }
 
-    ZSTD_statsPrint(stats, cctx->params.cParams.searchLength);
+    if (lastFrameChunk && (op>ostart)) cctx->stage = ZSTDcs_ending;
+    ZSTD_statsPrint(stats, cctx->params.cParams.searchLength);   /* debug only */
     return op-ostart;
 }
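
Reviewer note: the new block header packs three fields into 24 bits, written little-endian: the last-block flag in bit 0, the block type in bits 1-2, and the size from bit 3 upward. The sketch below reproduces that arithmetic in isolation. The `ex_` helpers are hypothetical stand-ins for `MEM_writeLE24` and friends, and the block type is passed as a plain number because the values of `bt_raw` and `bt_compressed` are not shown in this hunk.

```c
#include <assert.h>
#include <stdint.h>

/* Packs lastBlock (1 bit), blockType (2 bits) and size (21 bits),
 * following `lastBlock + (type<<1) + (size<<3)` from the diff. */
static uint32_t ex_packBlockHeader(uint32_t lastBlock, uint32_t blockType, uint32_t size)
{
    assert(lastBlock <= 1 && blockType <= 3 && size < (1U << 21));
    return lastBlock + (blockType << 1) + (size << 3);
}

/* Writes the low 24 bits of v in little-endian byte order. */
static void ex_writeLE24(uint8_t* dst, uint32_t v)
{
    dst[0] = (uint8_t)v;
    dst[1] = (uint8_t)(v >> 8);
    dst[2] = (uint8_t)(v >> 16);
}

/* Reads a 24-bit little-endian value back. */
static uint32_t ex_readLE24(const uint8_t* src)
{
    return (uint32_t)src[0] | ((uint32_t)src[1] << 8) | ((uint32_t)src[2] << 16);
}
```

A decoder recovers the fields with `h & 1`, `(h >> 1) & 3` and `h >> 3`.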
 
@@ -2166,34 +2281,34 @@ static size_t ZSTD_compress_generic (ZSTD_CCtx* cctx,
 static size_t ZSTD_writeFrameHeader(void* dst, size_t dstCapacity,
                                     ZSTD_parameters params, U64 pledgedSrcSize, U32 dictID)
 {   BYTE* const op = (BYTE*)dst;
-    U32 const dictIDSizeCode = (dictID>0) + (dictID>=256) + (dictID>=65536);   /* 0-3 */
-    U32 const checksumFlag = params.fParams.checksumFlag>0;
-    U32 const windowSize = 1U << params.cParams.windowLog;
-    U32 const directModeFlag = params.fParams.contentSizeFlag && (windowSize > (pledgedSrcSize-1));
-    BYTE const windowLogByte = (BYTE)((params.cParams.windowLog - ZSTD_WINDOWLOG_ABSOLUTEMIN) << 3);
-    U32 const fcsCode = params.fParams.contentSizeFlag ?
+    U32   const dictIDSizeCode = (dictID>0) + (dictID>=256) + (dictID>=65536);   /* 0-3 */
+    U32   const checksumFlag = params.fParams.checksumFlag>0;
+    U32   const windowSize = 1U << params.cParams.windowLog;
+    U32   const singleSegment = params.fParams.contentSizeFlag && (windowSize > (pledgedSrcSize-1));
+    BYTE  const windowLogByte = (BYTE)((params.cParams.windowLog - ZSTD_WINDOWLOG_ABSOLUTEMIN) << 3);
+    U32   const fcsCode = params.fParams.contentSizeFlag ?
                      (pledgedSrcSize>=256) + (pledgedSrcSize>=65536+256) + (pledgedSrcSize>=0xFFFFFFFFU) :   /* 0-3 */
                       0;
-    BYTE const frameHeaderDecriptionByte = (BYTE)(dictIDSizeCode + (checksumFlag<<2) + (directModeFlag<<5) + (fcsCode<<6) );
+    BYTE  const frameHeaderDecriptionByte = (BYTE)(dictIDSizeCode + (checksumFlag<<2) + (singleSegment<<5) + (fcsCode<<6) );
     size_t pos;
 
     if (dstCapacity < ZSTD_frameHeaderSize_max) return ERROR(dstSize_tooSmall);
 
     MEM_writeLE32(dst, ZSTD_MAGICNUMBER);
     op[4] = frameHeaderDecriptionByte; pos=5;
-    if (!directModeFlag) op[pos++] = windowLogByte;
+    if (!singleSegment) op[pos++] = windowLogByte;
     switch(dictIDSizeCode)
     {
         default:   /* impossible */
         case 0 : break;
         case 1 : op[pos] = (BYTE)(dictID); pos++; break;
-        case 2 : MEM_writeLE16(op+pos, (U16)(dictID)); pos+=2; break;
+        case 2 : MEM_writeLE16(op+pos, (U16)dictID); pos+=2; break;
         case 3 : MEM_writeLE32(op+pos, dictID); pos+=4; break;
     }
     switch(fcsCode)
     {
         default:   /* impossible */
-        case 0 : if (directModeFlag) op[pos++] = (BYTE)(pledgedSrcSize); break;
+        case 0 : if (singleSegment) op[pos++] = (BYTE)(pledgedSrcSize); break;
         case 1 : MEM_writeLE16(op+pos, (U16)(pledgedSrcSize-256)); pos+=2; break;
         case 2 : MEM_writeLE32(op+pos, (U32)(pledgedSrcSize)); pos+=4; break;
         case 3 : MEM_writeLE64(op+pos, (U64)(pledgedSrcSize)); pos+=8; break;
@@ -2205,30 +2320,31 @@ static size_t ZSTD_writeFrameHeader(void* dst, size_t dstCapacity,
 static size_t ZSTD_compressContinue_internal (ZSTD_CCtx* zc,
                               void* dst, size_t dstCapacity,
                         const void* src, size_t srcSize,
-                               U32 frame)
+                               U32 frame, U32 lastFrameChunk)
 {
     const BYTE* const ip = (const BYTE*) src;
     size_t fhSize = 0;
 
-    if (zc->stage==0) return ERROR(stage_wrong);
-    if (frame && (zc->stage==1)) {   /* copy saved header */
+    if (zc->stage==ZSTDcs_created) return ERROR(stage_wrong);   /* missing init (ZSTD_compressBegin) */
+
+    if (frame && (zc->stage==ZSTDcs_init)) {
         fhSize = ZSTD_writeFrameHeader(dst, dstCapacity, zc->params, zc->frameContentSize, zc->dictID);
         if (ZSTD_isError(fhSize)) return fhSize;
         dstCapacity -= fhSize;
         dst = (char*)dst + fhSize;
-        zc->stage = 2;
+        zc->stage = ZSTDcs_ongoing;
     }
 
     /* Check if blocks follow each other */
     if (src != zc->nextSrc) {
         /* not contiguous */
-        size_t const delta = zc->nextSrc - ip;
+        ptrdiff_t const delta = zc->nextSrc - ip;
         zc->lowLimit = zc->dictLimit;
         zc->dictLimit = (U32)(zc->nextSrc - zc->base);
         zc->dictBase = zc->base;
         zc->base -= delta;
         zc->nextToUpdate = zc->dictLimit;
-        if (zc->dictLimit - zc->lowLimit < 8) zc->lowLimit = zc->dictLimit;   /* too small extDict */
+        if (zc->dictLimit - zc->lowLimit < HASH_READ_SIZE) zc->lowLimit = zc->dictLimit;   /* too small extDict */
     }
 
     /* preemptive overflow correction */
@@ -2254,7 +2370,7 @@ static size_t ZSTD_compressContinue_internal (ZSTD_CCtx* zc,
 
     zc->nextSrc = ip + srcSize;
     {   size_t const cSize = frame ?
-                             ZSTD_compress_generic (zc, dst, dstCapacity, src, srcSize) :
+                             ZSTD_compress_generic (zc, dst, dstCapacity, src, srcSize, lastFrameChunk) :
                              ZSTD_compressBlock_internal (zc, dst, dstCapacity, src, srcSize);
         if (ZSTD_isError(cSize)) return cSize;
         return cSize + fhSize;
@@ -2262,19 +2378,25 @@ static size_t ZSTD_compressContinue_internal (ZSTD_CCtx* zc,
 }
 
 
-size_t ZSTD_compressContinue (ZSTD_CCtx* zc,
+size_t ZSTD_compressContinue (ZSTD_CCtx* cctx,
                               void* dst, size_t dstCapacity,
                         const void* src, size_t srcSize)
 {
-    return ZSTD_compressContinue_internal(zc, dst, dstCapacity, src, srcSize, 1);
+    return ZSTD_compressContinue_internal(cctx, dst, dstCapacity, src, srcSize, 1, 0);
 }
 
 
-size_t ZSTD_compressBlock(ZSTD_CCtx* zc, void* dst, size_t dstCapacity, const void* src, size_t srcSize)
+size_t ZSTD_getBlockSizeMax(ZSTD_CCtx* cctx)
 {
-    if (srcSize > ZSTD_BLOCKSIZE_MAX) return ERROR(srcSize_wrong);
-    ZSTD_LOG_BLOCK("%p: ZSTD_compressBlock searchLength=%d\n", zc->base, zc->params.cParams.searchLength);
-    return ZSTD_compressContinue_internal(zc, dst, dstCapacity, src, srcSize, 0);
+    return MIN (ZSTD_BLOCKSIZE_ABSOLUTEMAX, 1 << cctx->params.cParams.windowLog);
+}
+
+size_t ZSTD_compressBlock(ZSTD_CCtx* cctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize)
+{
+    size_t const blockSizeMax = ZSTD_getBlockSizeMax(cctx);
+    if (srcSize > blockSizeMax) return ERROR(srcSize_wrong);
+    ZSTD_LOG_BLOCK("%p: ZSTD_compressBlock searchLength=%d\n", cctx->base, cctx->params.cParams.searchLength);
+    return ZSTD_compressContinue_internal(cctx, dst, dstCapacity, src, srcSize, 0, 0);
 }
 
 
@@ -2292,7 +2414,7 @@ static size_t ZSTD_loadDictionaryContent(ZSTD_CCtx* zc, const void* src, size_t
     zc->loadedDictEnd = (U32)(iend - zc->base);
 
     zc->nextSrc = iend;
-    if (srcSize <= 8) return 0;
+    if (srcSize <= HASH_READ_SIZE) return 0;
 
     switch(zc->params.cParams.strategy)
     {
@@ -2300,15 +2422,19 @@ static size_t ZSTD_loadDictionaryContent(ZSTD_CCtx* zc, const void* src, size_t
         ZSTD_fillHashTable (zc, iend, zc->params.cParams.searchLength);
         break;
 
+    case ZSTD_dfast:
+        ZSTD_fillDoubleHashTable (zc, iend, zc->params.cParams.searchLength);
+        break;
+
     case ZSTD_greedy:
     case ZSTD_lazy:
     case ZSTD_lazy2:
-        ZSTD_insertAndFindFirstIndex (zc, iend-8, zc->params.cParams.searchLength);
+        ZSTD_insertAndFindFirstIndex (zc, iend-HASH_READ_SIZE, zc->params.cParams.searchLength);
         break;
 
     case ZSTD_btlazy2:
     case ZSTD_btopt:
-        ZSTD_updateTree(zc, iend-8, iend, 1 << zc->params.cParams.searchLog, zc->params.cParams.searchLength);
+        ZSTD_updateTree(zc, iend-HASH_READ_SIZE, iend, 1 << zc->params.cParams.searchLog, zc->params.cParams.searchLength);
         break;
 
     default:
@@ -2323,8 +2449,8 @@ static size_t ZSTD_loadDictionaryContent(ZSTD_CCtx* zc, const void* src, size_t
 /* Dictionary format :
      Magic == ZSTD_DICT_MAGIC (4 bytes)
      HUF_writeCTable(256)
-     FSE_writeNCount(ml)
      FSE_writeNCount(off)
+     FSE_writeNCount(ml)
      FSE_writeNCount(ll)
      RepOffsets
      Dictionary content
@@ -2414,7 +2540,7 @@ static size_t ZSTD_compressBegin_internal(ZSTD_CCtx* zc,
 *   @return : 0, or an error code */
 size_t ZSTD_compressBegin_advanced(ZSTD_CCtx* cctx,
                              const void* dict, size_t dictSize,
-                                   ZSTD_parameters params, U64 pledgedSrcSize)
+                                   ZSTD_parameters params, unsigned long long pledgedSrcSize)
 {
     /* compression parameters verification and optimization */
     { size_t const errorCode = ZSTD_checkCParams_advanced(params.cParams, pledgedSrcSize);
@@ -2426,9 +2552,7 @@ size_t ZSTD_compressBegin_advanced(ZSTD_CCtx* cctx,
 
 size_t ZSTD_compressBegin_usingDict(ZSTD_CCtx* cctx, const void* dict, size_t dictSize, int compressionLevel)
 {
-    ZSTD_parameters params;
-    memset(&params, 0, sizeof(params));
-    params.cParams = ZSTD_getCParams(compressionLevel, 0, dictSize);
+    ZSTD_parameters const params = ZSTD_getParams(compressionLevel, 0, dictSize);
     ZSTD_LOG_BLOCK("%p: ZSTD_compressBegin_usingDict compressionLevel=%d\n", cctx->base, compressionLevel);
     return ZSTD_compressBegin_internal(cctx, dict, dictSize, params, 0);
 }
@@ -2441,38 +2565,57 @@ size_t ZSTD_compressBegin(ZSTD_CCtx* zc, int compressionLevel)
 }
 
 
-/*! ZSTD_compressEnd() :
-*   Write frame epilogue.
+/*! ZSTD_writeEpilogue() :
+*   Ends a frame.
 *   @return : nb of bytes written into dst (or an error code) */
-size_t ZSTD_compressEnd(ZSTD_CCtx* cctx, void* dst, size_t dstCapacity)
+static size_t ZSTD_writeEpilogue(ZSTD_CCtx* cctx, void* dst, size_t dstCapacity)
 {
-    BYTE* op = (BYTE*)dst;
+    BYTE* const ostart = (BYTE*)dst;
+    BYTE* op = ostart;
     size_t fhSize = 0;
 
-    /* not even init ! */
-    if (cctx->stage==0) return ERROR(stage_wrong);
+    if (cctx->stage == ZSTDcs_created) return ERROR(stage_wrong);  /*< not even init ! */
 
     /* special case : empty frame */
-    if (cctx->stage==1) {
+    if (cctx->stage == ZSTDcs_init) {
         fhSize = ZSTD_writeFrameHeader(dst, dstCapacity, cctx->params, 0, 0);
         if (ZSTD_isError(fhSize)) return fhSize;
         dstCapacity -= fhSize;
         op += fhSize;
-        cctx->stage = 2;
+        cctx->stage = ZSTDcs_ongoing;
+    }
+
+    if (cctx->stage != ZSTDcs_ending) {
+        /* write one last empty block, make it the "last" block */
+        U32 const cBlockHeader24 = 1 /* last block */ + (((U32)bt_raw)<<1) + 0;
+        if (dstCapacity<4) return ERROR(dstSize_tooSmall);
+        MEM_writeLE32(op, cBlockHeader24);
+        op += ZSTD_blockHeaderSize;
+        dstCapacity -= ZSTD_blockHeaderSize;
     }
 
-    /* frame epilogue */
-    if (dstCapacity < 3) return ERROR(dstSize_tooSmall);
-    {   U32 const checksum = cctx->params.fParams.checksumFlag ?
-                             (U32)((XXH64_digest(&cctx->xxhState) >> 11) & ((1<<22)-1)) :
-                             0;
-        op[0] = (BYTE)((bt_end<<6) + (checksum>>16));
-        op[1] = (BYTE)(checksum>>8);
-        op[2] = (BYTE)checksum;
+    if (cctx->params.fParams.checksumFlag) {
+        U32 const checksum = (U32) XXH64_digest(&cctx->xxhState);
+        if (dstCapacity<4) return ERROR(dstSize_tooSmall);
+        MEM_writeLE32(op, checksum);
+        op += 4;
     }
 
-    cctx->stage = 0;  /* return to "created but not init" status */
-    return 3+fhSize;
+    cctx->stage = ZSTDcs_created;  /* return to "created but no init" status */
+    return op-ostart;
+}
+
+
+size_t ZSTD_compressEnd (ZSTD_CCtx* cctx,
+                         void* dst, size_t dstCapacity,
+                   const void* src, size_t srcSize)
+{
+    size_t endResult;
+    size_t const cSize = ZSTD_compressContinue_internal(cctx, dst, dstCapacity, src, srcSize, 1, 1);
+    if (ZSTD_isError(cSize)) return cSize;
+    endResult = ZSTD_writeEpilogue(cctx, (char*)dst + cSize, dstCapacity-cSize);
+    if (ZSTD_isError(endResult)) return endResult;
+    return cSize + endResult;
 }
 
 
@@ -2485,44 +2628,23 @@ static size_t ZSTD_compress_usingPreparedCCtx(ZSTD_CCtx* cctx, const ZSTD_CCtx*
                                        void* dst, size_t dstCapacity,
                                  const void* src, size_t srcSize)
 {
-    {   size_t const errorCode = ZSTD_copyCCtx(cctx, preparedCCtx);
-        if (ZSTD_isError(errorCode)) return errorCode;
-    }
-    {   size_t const cSize = ZSTD_compressContinue(cctx, dst, dstCapacity, src, srcSize);
-        if (ZSTD_isError(cSize)) return cSize;
+    size_t const errorCode = ZSTD_copyCCtx(cctx, preparedCCtx);
+    if (ZSTD_isError(errorCode)) return errorCode;
 
-        {   size_t const endSize = ZSTD_compressEnd(cctx, (char*)dst+cSize, dstCapacity-cSize);
-            if (ZSTD_isError(endSize)) return endSize;
-            return cSize + endSize;
-    }   }
+    return ZSTD_compressEnd(cctx, dst, dstCapacity, src, srcSize);
 }
 
 
-static size_t ZSTD_compress_internal (ZSTD_CCtx* ctx,
+static size_t ZSTD_compress_internal (ZSTD_CCtx* cctx,
                                void* dst, size_t dstCapacity,
                          const void* src, size_t srcSize,
                          const void* dict,size_t dictSize,
                                ZSTD_parameters params)
 {
-    BYTE* const ostart = (BYTE*)dst;
-    BYTE* op = ostart;
-
-    /* Init */
-    { size_t const errorCode = ZSTD_compressBegin_internal(ctx, dict, dictSize, params, srcSize);
-      if(ZSTD_isError(errorCode)) return errorCode; }
-
-    /* body (compression) */
-    { size_t const oSize = ZSTD_compressContinue (ctx, op,  dstCapacity, src, srcSize);
-      if(ZSTD_isError(oSize)) return oSize;
-      op += oSize;
-      dstCapacity -= oSize; }
+    size_t const errorCode = ZSTD_compressBegin_internal(cctx, dict, dictSize, params, srcSize);
+    if(ZSTD_isError(errorCode)) return errorCode;
 
-    /* Close frame */
-    { size_t const oSize = ZSTD_compressEnd(ctx, op, dstCapacity);
-      if(ZSTD_isError(oSize)) return oSize;
-      op += oSize; }
-
-    return (op - ostart);
+    return ZSTD_compressEnd(cctx, dst,  dstCapacity, src, srcSize);
 }
 
 size_t ZSTD_compress_advanced (ZSTD_CCtx* ctx,
@@ -2538,11 +2660,9 @@ size_t ZSTD_compress_advanced (ZSTD_CCtx* ctx,
 
 size_t ZSTD_compress_usingDict(ZSTD_CCtx* ctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize, const void* dict, size_t dictSize, int compressionLevel)
 {
-    ZSTD_parameters params;
-    memset(&params, 0, sizeof(params));
-    ZSTD_LOG_BLOCK("%p: ZSTD_compress_usingDict srcSize=%d dictSize=%d compressionLevel=%d\n", ctx->base, (int)srcSize, (int)dictSize, compressionLevel);
-    params.cParams =  ZSTD_getCParams(compressionLevel, srcSize, dictSize);
+    ZSTD_parameters params = ZSTD_getParams(compressionLevel, srcSize, dictSize);
     params.fParams.contentSizeFlag = 1;
+    ZSTD_LOG_BLOCK("%p: ZSTD_compress_usingDict srcSize=%d dictSize=%d compressionLevel=%d\n", ctx->base, (int)srcSize, (int)dictSize, compressionLevel);
     return ZSTD_compress_internal(ctx, dst, dstCapacity, src, srcSize, dict, dictSize, params);
 }
 
@@ -2577,7 +2697,7 @@ ZSTD_CDict* ZSTD_createCDict_advanced(const void* dict, size_t dictSize, ZSTD_pa
     if (!customMem.customAlloc && !customMem.customFree)
         customMem = defaultCustomMem;
 
-    if (!customMem.customAlloc || !customMem.customFree)
+    if (!customMem.customAlloc || !customMem.customFree)  /* can't have 1/2 custom alloc/free as NULL */
         return NULL;
 
     {   ZSTD_CDict* const cdict = (ZSTD_CDict*) customMem.customAlloc(customMem.opaque, sizeof(*cdict));
@@ -2643,29 +2763,29 @@ ZSTDLIB_API size_t ZSTD_compress_usingCDict(ZSTD_CCtx* cctx,
 
 #define ZSTD_DEFAULT_CLEVEL 1
 #define ZSTD_MAX_CLEVEL     22
-unsigned ZSTD_maxCLevel(void) { return ZSTD_MAX_CLEVEL; }
+int ZSTD_maxCLevel(void) { return ZSTD_MAX_CLEVEL; }
 
 static const ZSTD_compressionParameters ZSTD_defaultCParameters[4][ZSTD_MAX_CLEVEL+1] = {
 {   /* "default" */
     /* W,  C,  H,  S,  L, TL, strat */
-    {  0,  0,  0,  0,  0,  0, ZSTD_fast    },  /* level  0 - never used */
-    { 19, 13, 14,  1,  7,  4, ZSTD_fast    },  /* level  1 */
-    { 19, 15, 16,  1,  6,  4, ZSTD_fast    },  /* level  2 */
-    { 20, 18, 20,  1,  6,  4, ZSTD_fast    },  /* level  3 */
-    { 20, 13, 17,  2,  5,  4, ZSTD_greedy  },  /* level  4.*/
-    { 20, 15, 18,  3,  5,  4, ZSTD_greedy  },  /* level  5 */
-    { 21, 16, 19,  2,  5,  4, ZSTD_lazy    },  /* level  6 */
-    { 21, 17, 20,  3,  5,  4, ZSTD_lazy    },  /* level  7 */
-    { 21, 18, 20,  3,  5,  4, ZSTD_lazy2   },  /* level  8.*/
-    { 21, 20, 20,  3,  5,  4, ZSTD_lazy2   },  /* level  9 */
-    { 21, 19, 21,  4,  5,  4, ZSTD_lazy2   },  /* level 10 */
-    { 22, 20, 22,  4,  5,  4, ZSTD_lazy2   },  /* level 11 */
-    { 22, 20, 22,  5,  5,  4, ZSTD_lazy2   },  /* level 12 */
-    { 22, 21, 22,  5,  5,  4, ZSTD_lazy2   },  /* level 13 */
-    { 22, 21, 22,  6,  5,  4, ZSTD_lazy2   },  /* level 14 */
-    { 22, 21, 21,  5,  5,  4, ZSTD_btlazy2 },  /* level 15 */
-    { 23, 22, 22,  5,  5,  4, ZSTD_btlazy2 },  /* level 16 */
-    { 23, 23, 22,  5,  5,  4, ZSTD_btlazy2 },  /* level 17.*/
+    { 18, 12, 12,  1,  7, 16, ZSTD_fast    },  /* level  0 - not used */
+    { 19, 13, 14,  1,  7, 16, ZSTD_fast    },  /* level  1 */
+    { 19, 15, 16,  1,  6, 16, ZSTD_fast    },  /* level  2 */
+    { 20, 16, 18,  1,  5, 16, ZSTD_dfast   },  /* level  3 */
+    { 20, 13, 17,  2,  5, 16, ZSTD_greedy  },  /* level  4.*/
+    { 20, 15, 18,  3,  5, 16, ZSTD_greedy  },  /* level  5 */
+    { 21, 16, 19,  2,  5, 16, ZSTD_lazy    },  /* level  6 */
+    { 21, 17, 20,  3,  5, 16, ZSTD_lazy    },  /* level  7 */
+    { 21, 18, 20,  3,  5, 16, ZSTD_lazy2   },  /* level  8.*/
+    { 21, 20, 20,  3,  5, 16, ZSTD_lazy2   },  /* level  9 */
+    { 21, 19, 21,  4,  5, 16, ZSTD_lazy2   },  /* level 10 */
+    { 22, 20, 22,  4,  5, 16, ZSTD_lazy2   },  /* level 11 */
+    { 22, 20, 22,  5,  5, 16, ZSTD_lazy2   },  /* level 12 */
+    { 22, 21, 22,  5,  5, 16, ZSTD_lazy2   },  /* level 13 */
+    { 22, 21, 22,  6,  5, 16, ZSTD_lazy2   },  /* level 14 */
+    { 22, 21, 21,  5,  5, 16, ZSTD_btlazy2 },  /* level 15 */
+    { 23, 22, 22,  5,  5, 16, ZSTD_btlazy2 },  /* level 16 */
+    { 23, 23, 22,  5,  5, 16, ZSTD_btlazy2 },  /* level 17.*/
     { 23, 23, 22,  6,  5, 24, ZSTD_btopt   },  /* level 18.*/
     { 23, 23, 22,  6,  3, 48, ZSTD_btopt   },  /* level 19.*/
     { 25, 26, 23,  7,  3, 64, ZSTD_btopt   },  /* level 20.*/
@@ -2674,7 +2794,7 @@ static const ZSTD_compressionParameters ZSTD_defaultCParameters[4][ZSTD_MAX_CLEV
 },
 {   /* for srcSize <= 256 KB */
     /* W,  C,  H,  S,  L,  T, strat */
-    {  0,  0,  0,  0,  0,  0, ZSTD_fast    },  /* level  0 */
+    { 18, 12, 12,  1,  7,  4, ZSTD_fast    },  /* level  0 - not used */
     { 18, 13, 14,  1,  6,  4, ZSTD_fast    },  /* level  1 */
     { 18, 15, 17,  1,  5,  4, ZSTD_fast    },  /* level  2 */
     { 18, 13, 15,  1,  5,  4, ZSTD_greedy  },  /* level  3.*/
@@ -2700,20 +2820,20 @@ static const ZSTD_compressionParameters ZSTD_defaultCParameters[4][ZSTD_MAX_CLEV
 },
 {   /* for srcSize <= 128 KB */
     /* W,  C,  H,  S,  L,  T, strat */
-    {  0,  0,  0,  0,  0,  0, ZSTD_fast    },  /* level  0 - never used */
-    { 17, 12, 13,  1,  6,  4, ZSTD_fast    },  /* level  1 */
-    { 17, 13, 16,  1,  5,  4, ZSTD_fast    },  /* level  2 */
-    { 17, 13, 14,  2,  5,  4, ZSTD_greedy  },  /* level  3 */
-    { 17, 13, 15,  3,  4,  4, ZSTD_greedy  },  /* level  4 */
-    { 17, 15, 17,  4,  4,  4, ZSTD_greedy  },  /* level  5 */
-    { 17, 16, 17,  3,  4,  4, ZSTD_lazy    },  /* level  6 */
-    { 17, 15, 17,  4,  4,  4, ZSTD_lazy2   },  /* level  7 */
-    { 17, 17, 17,  4,  4,  4, ZSTD_lazy2   },  /* level  8 */
-    { 17, 17, 17,  5,  4,  4, ZSTD_lazy2   },  /* level  9 */
-    { 17, 17, 17,  6,  4,  4, ZSTD_lazy2   },  /* level 10 */
-    { 17, 17, 17,  7,  4,  4, ZSTD_lazy2   },  /* level 11 */
-    { 17, 17, 17,  8,  4,  4, ZSTD_lazy2   },  /* level 12 */
-    { 17, 18, 17,  6,  4,  4, ZSTD_btlazy2 },  /* level 13.*/
+    { 17, 12, 12,  1,  7,  8, ZSTD_fast    },  /* level  0 - not used */
+    { 17, 12, 13,  1,  6,  8, ZSTD_fast    },  /* level  1 */
+    { 17, 13, 16,  1,  5,  8, ZSTD_fast    },  /* level  2 */
+    { 17, 16, 16,  2,  5,  8, ZSTD_dfast   },  /* level  3 */
+    { 17, 13, 15,  3,  4,  8, ZSTD_greedy  },  /* level  4 */
+    { 17, 15, 17,  4,  4,  8, ZSTD_greedy  },  /* level  5 */
+    { 17, 16, 17,  3,  4,  8, ZSTD_lazy    },  /* level  6 */
+    { 17, 15, 17,  4,  4,  8, ZSTD_lazy2   },  /* level  7 */
+    { 17, 17, 17,  4,  4,  8, ZSTD_lazy2   },  /* level  8 */
+    { 17, 17, 17,  5,  4,  8, ZSTD_lazy2   },  /* level  9 */
+    { 17, 17, 17,  6,  4,  8, ZSTD_lazy2   },  /* level 10 */
+    { 17, 17, 17,  7,  4,  8, ZSTD_lazy2   },  /* level 11 */
+    { 17, 17, 17,  8,  4,  8, ZSTD_lazy2   },  /* level 12 */
+    { 17, 18, 17,  6,  4,  8, ZSTD_btlazy2 },  /* level 13.*/
     { 17, 17, 17,  7,  3,  8, ZSTD_btopt   },  /* level 14.*/
     { 17, 17, 17,  7,  3, 16, ZSTD_btopt   },  /* level 15.*/
     { 17, 18, 17,  7,  3, 32, ZSTD_btopt   },  /* level 16.*/
@@ -2722,20 +2842,20 @@ static const ZSTD_compressionParameters ZSTD_defaultCParameters[4][ZSTD_MAX_CLEV
     { 17, 18, 17,  8,  3,256, ZSTD_btopt   },  /* level 19.*/
     { 17, 18, 17,  9,  3,256, ZSTD_btopt   },  /* level 20.*/
     { 17, 18, 17, 10,  3,256, ZSTD_btopt   },  /* level 21.*/
-    { 17, 18, 17, 11,  3,256, ZSTD_btopt   },  /* level 22.*/
+    { 17, 18, 17, 11,  3,512, ZSTD_btopt   },  /* level 22.*/
 },
 {   /* for srcSize <= 16 KB */
     /* W,  C,  H,  S,  L,  T, strat */
-    {  0,  0,  0,  0,  0,  0, ZSTD_fast    },  /* level  0 -- never used */
-    { 14, 14, 14,  1,  4,  4, ZSTD_fast    },  /* level  1 */
-    { 14, 14, 15,  1,  4,  4, ZSTD_fast    },  /* level  2 */
-    { 14, 14, 14,  4,  4,  4, ZSTD_greedy  },  /* level  3.*/
-    { 14, 14, 14,  3,  4,  4, ZSTD_lazy    },  /* level  4.*/
-    { 14, 14, 14,  4,  4,  4, ZSTD_lazy2   },  /* level  5 */
-    { 14, 14, 14,  5,  4,  4, ZSTD_lazy2   },  /* level  6 */
-    { 14, 14, 14,  6,  4,  4, ZSTD_lazy2   },  /* level  7.*/
-    { 14, 14, 14,  7,  4,  4, ZSTD_lazy2   },  /* level  8.*/
-    { 14, 15, 14,  6,  4,  4, ZSTD_btlazy2 },  /* level  9.*/
+    { 14, 12, 12,  1,  7,  6, ZSTD_fast    },  /* level  0 - not used */
+    { 14, 14, 14,  1,  6,  6, ZSTD_fast    },  /* level  1 */
+    { 14, 14, 14,  1,  4,  6, ZSTD_fast    },  /* level  2 */
+    { 14, 14, 14,  1,  4,  6, ZSTD_dfast   },  /* level  3.*/
+    { 14, 14, 14,  4,  4,  6, ZSTD_greedy  },  /* level  4.*/
+    { 14, 14, 14,  3,  4,  6, ZSTD_lazy    },  /* level  5.*/
+    { 14, 14, 14,  4,  4,  6, ZSTD_lazy2   },  /* level  6 */
+    { 14, 14, 14,  5,  4,  6, ZSTD_lazy2   },  /* level  7 */
+    { 14, 14, 14,  6,  4,  6, ZSTD_lazy2   },  /* level  8.*/
+    { 14, 15, 14,  6,  4,  6, ZSTD_btlazy2 },  /* level  9.*/
     { 14, 15, 14,  3,  3,  6, ZSTD_btopt   },  /* level 10.*/
     { 14, 15, 14,  6,  3,  8, ZSTD_btopt   },  /* level 11.*/
     { 14, 15, 14,  6,  3, 16, ZSTD_btopt   },  /* level 12.*/
@@ -2755,7 +2875,7 @@ static const ZSTD_compressionParameters ZSTD_defaultCParameters[4][ZSTD_MAX_CLEV
 /*! ZSTD_getCParams() :
 *   @return ZSTD_compressionParameters structure for a selected compression level, `srcSize` and `dictSize`.
 *   Size values are optional, provide 0 if not known or unused */
-ZSTD_compressionParameters ZSTD_getCParams(int compressionLevel, U64 srcSize, size_t dictSize)
+ZSTD_compressionParameters ZSTD_getCParams(int compressionLevel, unsigned long long srcSize, size_t dictSize)
 {
     ZSTD_compressionParameters cp;
     size_t const addedSize = srcSize ? 0 : 500;
@@ -2772,3 +2892,14 @@ ZSTD_compressionParameters ZSTD_getCParams(int compressionLevel, U64 srcSize, si
     cp = ZSTD_adjustCParams(cp, srcSize, dictSize);
     return cp;
 }
+
+/*! ZSTD_getParams() :
+*   same as ZSTD_getCParams(), but @return a `ZSTD_parameters` object (instead of `ZSTD_compressionParameters`).
+*   All fields of `ZSTD_frameParameters` are set to default (0) */
+ZSTD_parameters ZSTD_getParams(int compressionLevel, unsigned long long srcSize, size_t dictSize) {
+    ZSTD_parameters params;
+    ZSTD_compressionParameters const cParams = ZSTD_getCParams(compressionLevel, srcSize, dictSize);
+    memset(&params, 0, sizeof(params));
+    params.cParams = cParams;
+    return params;
+}
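
Reviewer note on the ZSTD_writeFrameHeader() hunk above (not part of the upstream patch): the descriptor byte packs four fields, and the renamed singleSegment flag decides whether the window byte is written at all. The sketch below repeats that arithmetic in isolation so it can be checked by hand. The helper name frame_header_descriptor() and its flattened argument list are hypothetical; zstd does not export such a function, and only the bit layout is taken from the diff.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-alone copy of the descriptor-byte arithmetic shown in
 * the ZSTD_writeFrameHeader() hunk. Not a zstd API. */
static uint8_t frame_header_descriptor(uint32_t dictID, int checksumFlag,
                                       int contentSizeFlag, unsigned windowLog,
                                       uint64_t pledgedSrcSize)
{
    assert(windowLog < 32);   /* keeps the shift below well defined */

    {   /* bits 0-1: number of dictID bytes, coded 0-3 (0, 1, 2 or 4 bytes) */
        uint32_t const dictIDSizeCode = (dictID>0) + (dictID>=256) + (dictID>=65536);
        /* bit 2: a checksum follows the last block */
        uint32_t const checksum = checksumFlag > 0;
        uint32_t const windowSize = 1U << windowLog;
        /* bit 5: whole content fits in one window, so no window byte is written.
         * pledgedSrcSize == 0 wraps to UINT64_MAX and leaves the flag clear. */
        uint32_t const singleSegment = contentSizeFlag && (windowSize > (pledgedSrcSize-1));
        /* bits 6-7: width code of the frame content size field, 0-3 */
        uint32_t const fcsCode = contentSizeFlag ?
            (pledgedSrcSize>=256) + (pledgedSrcSize>=65536+256) + (pledgedSrcSize>=0xFFFFFFFFU) :
            0;
        return (uint8_t)(dictIDSizeCode + (checksum<<2) + (singleSegment<<5) + (fcsCode<<6));
    }
}
```

For example, a 1000-byte input with contentSizeFlag set, no dictionary, no checksum and windowLog 19 gives singleSegment = 1 and fcsCode = 1, so the byte is 0x60 and the size is stored as pledgedSrcSize-256 in two bytes.
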
diff --git a/lib/compress/zstd_opt.h b/lib/compress/zstd_opt.h
index 97b1623..3a1e9e1 100644
--- a/lib/compress/zstd_opt.h
+++ b/lib/compress/zstd_opt.h
@@ -34,6 +34,10 @@
 /* Note : this file is intended to be included within zstd_compress.c */
 
 
+#ifndef ZSTD_OPT_H_91842398743
+#define ZSTD_OPT_H_91842398743
+
+
 #define ZSTD_FREQ_DIV   5
 
 /*-*************************************
@@ -110,7 +114,7 @@ FORCE_INLINE U32 ZSTD_getLiteralPrice(seqStore_t* ssPtr, U32 litLength, const BY
 
     /* literals */
     if (ssPtr->cachedLiterals == literals) {
-        U32 additional = litLength - ssPtr->cachedLitLength;
+        U32 const additional = litLength - ssPtr->cachedLitLength;
         const BYTE* literals2 = ssPtr->cachedLiterals + ssPtr->cachedLitLength;
         price = ssPtr->cachedPrice + additional * ssPtr->log2litSum;
         for (u=0; u < additional; u++)
@@ -130,15 +134,7 @@ FORCE_INLINE U32 ZSTD_getLiteralPrice(seqStore_t* ssPtr, U32 litLength, const BY
     }
 
     /* literal Length */
-    {   static const BYTE LL_Code[64] = {  0,  1,  2,  3,  4,  5,  6,  7,
-                                           8,  9, 10, 11, 12, 13, 14, 15,
-                                          16, 16, 17, 17, 18, 18, 19, 19,
-                                          20, 20, 20, 20, 21, 21, 21, 21,
-                                          22, 22, 22, 22, 22, 22, 22, 22,
-                                          23, 23, 23, 23, 23, 23, 23, 23,
-                                          24, 24, 24, 24, 24, 24, 24, 24,
-                                          24, 24, 24, 24, 24, 24, 24, 24 };
-        const BYTE LL_deltaCode = 19;
+    {   const BYTE LL_deltaCode = 19;
         const BYTE llCode = (litLength>63) ? (BYTE)ZSTD_highbit32(litLength) + LL_deltaCode : LL_Code[litLength];
         price += LL_bits[llCode] + ssPtr->log2litLengthSum - ZSTD_highbit32(ssPtr->litLengthFreq[llCode]+1);
     }
@@ -150,19 +146,11 @@ FORCE_INLINE U32 ZSTD_getLiteralPrice(seqStore_t* ssPtr, U32 litLength, const BY
 FORCE_INLINE U32 ZSTD_getPrice(seqStore_t* seqStorePtr, U32 litLength, const BYTE* literals, U32 offset, U32 matchLength)
 {
     /* offset */
-    BYTE offCode = (BYTE)ZSTD_highbit32(offset+1);
+    BYTE const offCode = (BYTE)ZSTD_highbit32(offset+1);
     U32 price = offCode + seqStorePtr->log2offCodeSum - ZSTD_highbit32(seqStorePtr->offCodeFreq[offCode]+1);
 
     /* match Length */
-    {   static const BYTE ML_Code[128] = { 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15,
-                                          16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31,
-                                          32, 32, 33, 33, 34, 34, 35, 35, 36, 36, 36, 36, 37, 37, 37, 37,
-                                          38, 38, 38, 38, 38, 38, 38, 38, 39, 39, 39, 39, 39, 39, 39, 39,
-                                          40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40,
-                                          41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41,
-                                          42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42,
-                                          42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42 };
-        const BYTE ML_deltaCode = 36;
+    {   const BYTE ML_deltaCode = 36;
         const BYTE mlCode = (matchLength>127) ? (BYTE)ZSTD_highbit32(matchLength) + ML_deltaCode : ML_Code[matchLength];
         price += ML_bits[mlCode] + seqStorePtr->log2matchLengthSum - ZSTD_highbit32(seqStorePtr->matchLengthFreq[mlCode]+1);
     }
@@ -181,36 +169,20 @@ MEM_STATIC void ZSTD_updatePrice(seqStore_t* seqStorePtr, U32 litLength, const B
         seqStorePtr->litFreq[literals[u]]++;
 
     /* literal Length */
-    {   static const BYTE LL_Code[64] = {  0,  1,  2,  3,  4,  5,  6,  7,
-                                           8,  9, 10, 11, 12, 13, 14, 15,
-                                          16, 16, 17, 17, 18, 18, 19, 19,
-                                          20, 20, 20, 20, 21, 21, 21, 21,
-                                          22, 22, 22, 22, 22, 22, 22, 22,
-                                          23, 23, 23, 23, 23, 23, 23, 23,
-                                          24, 24, 24, 24, 24, 24, 24, 24,
-                                          24, 24, 24, 24, 24, 24, 24, 24 };
-        const BYTE LL_deltaCode = 19;
+    {   const BYTE LL_deltaCode = 19;
         const BYTE llCode = (litLength>63) ? (BYTE)ZSTD_highbit32(litLength) + LL_deltaCode : LL_Code[litLength];
         seqStorePtr->litLengthFreq[llCode]++;
         seqStorePtr->litLengthSum++;
     }
 
     /* match offset */
-	{   BYTE offCode = (BYTE)ZSTD_highbit32(offset+1);
+	{   BYTE const offCode = (BYTE)ZSTD_highbit32(offset+1);
 		seqStorePtr->offCodeSum++;
 		seqStorePtr->offCodeFreq[offCode]++;
 	}
 
     /* match Length */
-    {   static const BYTE ML_Code[128] = { 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15,
-                                          16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31,
-                                          32, 32, 33, 33, 34, 34, 35, 35, 36, 36, 36, 36, 37, 37, 37, 37,
-                                          38, 38, 38, 38, 38, 38, 38, 38, 39, 39, 39, 39, 39, 39, 39, 39,
-                                          40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40,
-                                          41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41,
-                                          42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42,
-                                          42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42 };
-        const BYTE ML_deltaCode = 36;
+    {   const BYTE ML_deltaCode = 36;
         const BYTE mlCode = (matchLength>127) ? (BYTE)ZSTD_highbit32(matchLength) + ML_deltaCode : ML_Code[matchLength];
         seqStorePtr->matchLengthFreq[mlCode]++;
         seqStorePtr->matchLengthSum++;
@@ -232,7 +204,6 @@ MEM_STATIC void ZSTD_updatePrice(seqStore_t* seqStorePtr, U32 litLength, const B
 
 
 
-
 /* Update hashTable3 up to ip (excluded)
    Assumption : always within prefix (ie. not within extDict) */
 FORCE_INLINE
@@ -461,13 +432,14 @@ void ZSTD_compressBlock_opt_generic(ZSTD_CCtx* ctx,
     ZSTD_optimal_t* opt = seqStorePtr->priceTable;
     ZSTD_match_t* matches = seqStorePtr->matchTable;
     const BYTE* inr;
-    U32 offset, rep[ZSTD_REP_INIT];
+    U32 offset, rep[ZSTD_REP_NUM];
 
     /* init */
     ctx->nextToUpdate3 = ctx->nextToUpdate;
     ZSTD_rescaleFreqs(seqStorePtr);
     ip += (ip==prefixStart);
-    { U32 i; for (i=0; i<ZSTD_REP_INIT; i++) rep[i]=ctx->rep[i]; }
+    { U32 i; for (i=0; i<ZSTD_REP_NUM; i++) rep[i]=ctx->rep[i]; }
+    inr = ip;
 
     ZSTD_LOG_BLOCK("%d: COMPBLOCK_OPT_GENERIC srcSz=%d maxSrch=%d mls=%d sufLen=%d\n", (int)(ip-base), (int)srcSize, maxSearches, mls, sufficient_len);
 
@@ -481,7 +453,7 @@ void ZSTD_compressBlock_opt_generic(ZSTD_CCtx* ctx,
 
         /* check repCode */
         {   U32 i;
-            for (i=0; i<ZSTD_REP_NUM; i++) {
+            for (i=(ip == anchor); i<ZSTD_REP_CHECK; i++) {
                 if ((rep[i]<(U32)(ip-prefixStart))
                     && (MEM_readMINMATCH(ip, minMatch) == MEM_readMINMATCH(ip - rep[i], minMatch))) {
                     mlen = (U32)ZSTD_count(ip+minMatch, ip+minMatch-rep[i], iend) + minMatch;
@@ -490,7 +462,7 @@ void ZSTD_compressBlock_opt_generic(ZSTD_CCtx* ctx,
                         best_mlen = mlen; best_off = i; cur = 0; last_pos = 1;
                         goto _storeSequence;
                     }
-                    best_off = (i<=1 && ip == anchor) ? 1-i : i;
+                    best_off = i - (ip == anchor);
                     do {
                         price = ZSTD_getPrice(seqStorePtr, litlen, anchor, best_off, mlen - MINMATCH);
                         if (mlen > last_pos || price < opt[mlen].price)
@@ -528,7 +500,7 @@ void ZSTD_compressBlock_opt_generic(ZSTD_CCtx* ctx,
         if (last_pos < minMatch) { ip++; continue; }
 
         /* initialize opt[0] */
-        { U32 i ; for (i=0; i<ZSTD_REP_INIT; i++) opt[0].rep[i] = rep[i]; }
+        { U32 i ; for (i=0; i<ZSTD_REP_NUM; i++) opt[0].rep[i] = rep[i]; }
         opt[0].mlen = 1;
         opt[0].litlen = litlen;
 
@@ -572,19 +544,21 @@ void ZSTD_compressBlock_opt_generic(ZSTD_CCtx* ctx,
 
            best_mlen = minMatch;
            {   U32 i;
-               for (i=0; i<ZSTD_REP_NUM; i++) {
+               for (i=(opt[cur].mlen != 1); i<ZSTD_REP_CHECK; i++) {  /* check rep */
                    if ((opt[cur].rep[i]<(U32)(inr-prefixStart))
-                       && (MEM_readMINMATCH(inr, minMatch) == MEM_readMINMATCH(inr - opt[cur].rep[i], minMatch))) {  /* check rep */
+                       && (MEM_readMINMATCH(inr, minMatch) == MEM_readMINMATCH(inr - opt[cur].rep[i], minMatch))) {
                        mlen = (U32)ZSTD_count(inr+minMatch, inr+minMatch - opt[cur].rep[i], iend) + minMatch;
                        ZSTD_LOG_PARSER("%d: Found REP %d/%d mlen=%d off=%d rep=%d opt[%d].off=%d\n", (int)(inr-base), i, ZSTD_REP_NUM, mlen, i, opt[cur].rep[i], cur, opt[cur].off);
 
                        if (mlen > sufficient_len || cur + mlen >= ZSTD_OPT_NUM) {
-                            ZSTD_LOG_PARSER("%d: REP sufficient_len=%d best_mlen=%d best_off=%d last_pos=%d\n", (int)(inr-base), sufficient_len, best_mlen, best_off, last_pos);
                             best_mlen = mlen; best_off = i; last_pos = cur + 1;
+                            ZSTD_LOG_PARSER("%d: REP sufficient_len=%d best_mlen=%d best_off=%d last_pos=%d\n", (int)(inr-base), sufficient_len, best_mlen, best_off, last_pos);
                             goto _storeSequence;
                        }
 
-                       best_off = (i<=1 && opt[cur].mlen != 1) ? 1-i : i;
+                       //best_off = ((i<=1) & (opt[cur].mlen != 1)) ? 1-i : i;
+                       best_off = i - (opt[cur].mlen != 1);
+
                        if (opt[cur].mlen == 1) {
                             litlen = opt[cur].litlen;
                             if (cur > litlen) {
@@ -689,7 +663,8 @@ _storeSequence:   /* cur, last_pos, best_mlen, best_off have to be set */
                     rep[1] = rep[0];
                     rep[0] = best_off;
                 }
-                if (litLength == 0 && offset<=1) offset = 1-offset;
+                if ((litLength == 0) & (offset==0)) offset = rep[1];  /* protection, but should never happen */
+                if ((litLength == 0) & (offset<=2)) offset--;
             }
 
             ZSTD_LOG_ENCODE("%d/%d: ENCODE literals=%d mlen=%d off=%d rep[0]=%d rep[1]=%d\n", (int)(ip-base), (int)(iend-base), (int)(litLength), (int)mlen, (int)(offset), (int)rep[0], (int)rep[1]);
@@ -752,12 +727,13 @@ void ZSTD_compressBlock_opt_extDict_generic(ZSTD_CCtx* ctx,
     const BYTE* inr;
 
     /* init */
-    U32 offset, rep[ZSTD_REP_INIT];
-    { U32 i; for (i=0; i<ZSTD_REP_INIT; i++) rep[i]=ctx->rep[i]; }
+    U32 offset, rep[ZSTD_REP_NUM];
+    { U32 i; for (i=0; i<ZSTD_REP_NUM; i++) rep[i]=ctx->rep[i]; }
 
     ctx->nextToUpdate3 = ctx->nextToUpdate;
     ZSTD_rescaleFreqs(seqStorePtr);
     ip += (ip==prefixStart);
+    inr = ip;
 
     ZSTD_LOG_BLOCK("%d: COMPBLOCK_OPT_EXTDICT srcSz=%d maxSrch=%d mls=%d sufLen=%d\n", (int)(ip-base), (int)srcSize, maxSearches, mls, sufficient_len);
 
@@ -773,11 +749,12 @@ void ZSTD_compressBlock_opt_extDict_generic(ZSTD_CCtx* ctx,
 
         /* check repCode */
         {   U32 i;
-            for (i=0; i<ZSTD_REP_NUM; i++) {
+            for (i = (ip==anchor); i<ZSTD_REP_CHECK; i++) {
                 const U32 repIndex = (U32)(current - rep[i]);
                 const BYTE* const repBase = repIndex < dictLimit ? dictBase : base;
                 const BYTE* const repMatch = repBase + repIndex;
-                if ( (((U32)((dictLimit-1) - repIndex) >= 3) & (repIndex>lowestIndex))  /* intentional overflow */
+                if ( (rep[i] <= current)
+                   && (((U32)((dictLimit-1) - repIndex) >= 3) & (repIndex>lowestIndex))  /* intentional overflow */
                    && (MEM_readMINMATCH(ip, minMatch) == MEM_readMINMATCH(repMatch, minMatch)) ) {
                     /* repcode detected we should take it */
                     const BYTE* const repEnd = repIndex < dictLimit ? dictEnd : iend;
@@ -789,7 +766,7 @@ void ZSTD_compressBlock_opt_extDict_generic(ZSTD_CCtx* ctx,
                         goto _storeSequence;
                     }
 
-                    best_off = (i<=1 && ip == anchor) ? 1-i : i;
+                    best_off = i - (ip==anchor);
                     litlen = opt[0].litlen;
                     do {
                         price = ZSTD_getPrice(seqStorePtr, litlen, anchor, best_off, mlen - MINMATCH);
@@ -804,7 +781,7 @@ void ZSTD_compressBlock_opt_extDict_generic(ZSTD_CCtx* ctx,
         ZSTD_LOG_PARSER("%d: match_num=%d last_pos=%d\n", (int)(ip-base), match_num, last_pos);
         if (!last_pos && !match_num) { ip++; continue; }
 
-        { U32 i; for (i=0; i<ZSTD_REP_INIT; i++) opt[0].rep[i] = rep[i]; }
+        { U32 i; for (i=0; i<ZSTD_REP_NUM; i++) opt[0].rep[i] = rep[i]; }
         opt[0].mlen = 1;
 
         if (match_num && (matches[match_num-1].len > sufficient_len || matches[match_num-1].len >= ZSTD_OPT_NUM)) {
@@ -875,11 +852,12 @@ void ZSTD_compressBlock_opt_extDict_generic(ZSTD_CCtx* ctx,
             best_mlen = 0;
 
             {   U32 i;
-                for (i=0; i<ZSTD_REP_NUM; i++) {
+                for (i = (opt[cur].mlen != 1); i<ZSTD_REP_CHECK; i++) {
                     const U32 repIndex = (U32)(current+cur - opt[cur].rep[i]);
                     const BYTE* const repBase = repIndex < dictLimit ? dictBase : base;
                     const BYTE* const repMatch = repBase + repIndex;
-                    if ( (((U32)((dictLimit-1) - repIndex) >= 3) & (repIndex>lowestIndex))  /* intentional overflow */
+                    if ( (opt[cur].rep[i] <= current+cur)
+                      && (((U32)((dictLimit-1) - repIndex) >= 3) & (repIndex>lowestIndex))  /* intentional overflow */
                       && (MEM_readMINMATCH(inr, minMatch) == MEM_readMINMATCH(repMatch, minMatch)) ) {
                         /* repcode detected */
                         const BYTE* const repEnd = repIndex < dictLimit ? dictEnd : iend;
@@ -887,12 +865,12 @@ void ZSTD_compressBlock_opt_extDict_generic(ZSTD_CCtx* ctx,
                         ZSTD_LOG_PARSER("%d: Found REP %d/%d mlen=%d off=%d rep=%d opt[%d].off=%d\n", (int)(inr-base), i, ZSTD_REP_NUM, mlen, i, opt[cur].rep[i], cur, opt[cur].off);
 
                         if (mlen > sufficient_len || cur + mlen >= ZSTD_OPT_NUM) {
-                            ZSTD_LOG_PARSER("%d: REP sufficient_len=%d best_mlen=%d best_off=%d last_pos=%d\n", (int)(inr-base), sufficient_len, best_mlen, best_off, last_pos);
                             best_mlen = mlen; best_off = i; last_pos = cur + 1;
+                            ZSTD_LOG_PARSER("%d: REP sufficient_len=%d best_mlen=%d best_off=%d last_pos=%d\n", (int)(inr-base), sufficient_len, best_mlen, best_off, last_pos);
                             goto _storeSequence;
                         }
 
-                        best_off = (i<=1 && opt[cur].mlen != 1) ? 1-i : i;
+                        best_off = i - (opt[cur].mlen != 1);
                         if (opt[cur].mlen == 1) {
                             litlen = opt[cur].litlen;
                             if (cur > litlen) {
@@ -998,8 +976,9 @@ _storeSequence:   /* cur, last_pos, best_mlen, best_off have to be set */
                     if (offset != 1) rep[2] = rep[1];
                     rep[1] = rep[0];
                     rep[0] = best_off;
-                 }
-                 if (litLength == 0 && offset<=1) offset = 1-offset;
+                }
+                if ((litLength==0) & (offset==0)) offset = rep[1];  /* protection, but should never happen */
+                if ((litLength==0) & (offset<=2)) offset --;
             }
 
             ZSTD_LOG_ENCODE("%d/%d: ENCODE literals=%d mlen=%d off=%d rep[0]=%d rep[1]=%d\n", (int)(ip-base), (int)(iend-base), (int)(litLength), (int)mlen, (int)(offset), (int)rep[0], (int)rep[1]);
@@ -1013,7 +992,7 @@ _storeSequence:   /* cur, last_pos, best_mlen, best_off have to be set */
                     ml2 = ZSTD_count_2segments(ip, match, iend, dictEnd, prefixStart);
                     ZSTD_LOG_PARSER("%d: ZSTD_count_2segments=%d offset=%d dictBase=%p dictEnd=%p prefixStart=%p ip=%p match=%p\n", (int)current, (int)ml2, (int)best_off, dictBase, dictEnd, prefixStart, ip, match);
                 }
-                else ml2 = (U32)ZSTD_count(ip, ip-offset, iend);
+                else ml2 = (U32)ZSTD_count(ip, ip-best_off, iend);
             }
             else ml2 = (U32)ZSTD_count(ip, ip-rep[0], iend);
             if ((offset >= 8) && (ml2 < mlen || ml2 < minMatch)) {
@@ -1030,7 +1009,7 @@ _storeSequence:   /* cur, last_pos, best_mlen, best_off have to be set */
     }    }   /* for (cur=0; cur < last_pos; ) */
 
     /* Save reps for next block */
-    ctx->savedRep[0] = rep[0]; ctx->savedRep[1] = rep[1]; ctx->savedRep[2] = rep[2];
+    { int i; for (i=0; i<ZSTD_REP_NUM; i++) ctx->savedRep[i] = rep[i]; }
 
     /* Last Literals */
     {   size_t lastLLSize = iend - anchor;
@@ -1039,3 +1018,5 @@ _storeSequence:   /* cur, last_pos, best_mlen, best_off have to be set */
         seqStorePtr->lit += lastLLSize;
     }
 }
+
+#endif  /* ZSTD_OPT_H_91842398743 */
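
Note on the repcode changes in the zstd_opt.h hunks above: the expression `(i<=1 && cond) ? 1-i : i` is replaced by `i - cond`, and the search loops now start at `i = cond` instead of `i = 0`. The sketch below contrasts the two mappings as standalone functions. It is only an illustration under stated assumptions: the names `old_rep_code` and `new_rep_code` and the parameter `shifted` are hypothetical and do not appear in zstd. `shifted` stands for the condition tested in the diff (`ip==anchor` or `opt[cur].mlen != 1`).

```c
#include <stddef.h>

/* Hypothetical helper: the mapping removed by this commit.
 * When `shifted` is set, candidates 0 and 1 are swapped;
 * all other candidates keep their index. */
static unsigned old_rep_code(unsigned i, unsigned shifted)
{
    return (i <= 1 && shifted) ? 1 - i : i;
}

/* Hypothetical helper: the mapping added by this commit.
 * When `shifted` is set, every candidate index moves down by one.
 * The loops in the diff start at i = shifted, so i >= shifted
 * always holds and the subtraction cannot wrap around. */
static unsigned new_rep_code(unsigned i, unsigned shifted)
{
    return i - shifted;
}
```

With `shifted == 0` both functions return `i` unchanged. With `shifted == 1` the old mapping turns 0, 1, 2 into 1, 0, 2, while the new mapping turns 1, 2, 3 into 0, 1, 2.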
diff --git a/lib/decompress/zbuff_decompress.c b/lib/decompress/zbuff_decompress.c
index b6e1806..908120f 100644
--- a/lib/decompress/zbuff_decompress.c
+++ b/lib/decompress/zbuff_decompress.c
@@ -158,9 +158,9 @@ size_t ZBUFF_decompressContinue(ZBUFF_DCtx* zbd,
     char* const ostart = (char*)dst;
     char* const oend = ostart + *dstCapacityPtr;
     char* op = ostart;
-    U32 notDone = 1;
+    U32 someMoreWork = 1;
 
-    while (notDone) {
+    while (someMoreWork) {
         switch(zbd->stage)
         {
         case ZBUFFds_init :
@@ -168,12 +168,12 @@ size_t ZBUFF_decompressContinue(ZBUFF_DCtx* zbd,
 
         case ZBUFFds_loadHeader :
             {   size_t const hSize = ZSTD_getFrameParams(&(zbd->fParams), zbd->headerBuffer, zbd->lhSize);
-                if (hSize != 0) {
+                if (ZSTD_isError(hSize)) return hSize;
+                if (hSize != 0) {   /* need more input */
                     size_t const toLoad = hSize - zbd->lhSize;   /* if hSize!=0, hSize > zbd->lhSize */
-                    if (ZSTD_isError(hSize)) return hSize;
                     if (toLoad > (size_t)(iend-ip)) {   /* not enough input to load full header */
                         memcpy(zbd->headerBuffer + zbd->lhSize, ip, iend-ip);
-                        zbd->lhSize += iend-ip; ip = iend; notDone = 0;
+                        zbd->lhSize += iend-ip;
                         *dstCapacityPtr = 0;
                         return (hSize - zbd->lhSize) + ZSTD_blockHeaderSize;   /* remaining header bytes + next block header */
                     }
@@ -184,7 +184,7 @@ size_t ZBUFF_decompressContinue(ZBUFF_DCtx* zbd,
             /* Consume header */
             {   size_t const h1Size = ZSTD_nextSrcSizeToDecompress(zbd->zd);  /* == ZSTD_frameHeaderSize_min */
                 size_t const h1Result = ZSTD_decompressContinue(zbd->zd, NULL, 0, zbd->headerBuffer, h1Size);
-                if (ZSTD_isError(h1Result)) return h1Result;
+                if (ZSTD_isError(h1Result)) return h1Result;   /* should not happen : already checked */
                 if (h1Size < zbd->lhSize) {   /* long header */
                     size_t const h2Size = ZSTD_nextSrcSizeToDecompress(zbd->zd);
                     size_t const h2Result = ZSTD_decompressContinue(zbd->zd, NULL, 0, zbd->headerBuffer+h1Size, h2Size);
@@ -194,7 +194,8 @@ size_t ZBUFF_decompressContinue(ZBUFF_DCtx* zbd,
             zbd->fParams.windowSize = MAX(zbd->fParams.windowSize, 1U << ZSTD_WINDOWLOG_ABSOLUTEMIN);
 
             /* Frame header instruct buffer sizes */
-            {   size_t const blockSize = MIN(zbd->fParams.windowSize, ZSTD_BLOCKSIZE_MAX);
+            {   size_t const blockSize = MIN(zbd->fParams.windowSize, ZSTD_BLOCKSIZE_ABSOLUTEMAX);
+                size_t const neededOutSize = zbd->fParams.windowSize + blockSize;
                 zbd->blockSize = blockSize;
                 if (zbd->inBuffSize < blockSize) {
                     zbd->customMem.customFree(zbd->customMem.opaque, zbd->inBuff);
@@ -202,20 +203,20 @@ size_t ZBUFF_decompressContinue(ZBUFF_DCtx* zbd,
                     zbd->inBuff = (char*)zbd->customMem.customAlloc(zbd->customMem.opaque, blockSize);
                     if (zbd->inBuff == NULL) return ERROR(memory_allocation);
                 }
-                {   size_t const neededOutSize = zbd->fParams.windowSize + blockSize;
-                    if (zbd->outBuffSize < neededOutSize) {
-                        zbd->customMem.customFree(zbd->customMem.opaque, zbd->outBuff);
-                        zbd->outBuffSize = neededOutSize;
-                        zbd->outBuff = (char*)zbd->customMem.customAlloc(zbd->customMem.opaque, neededOutSize);
-                        if (zbd->outBuff == NULL) return ERROR(memory_allocation);
-            }   }   }
+                if (zbd->outBuffSize < neededOutSize) {
+                    zbd->customMem.customFree(zbd->customMem.opaque, zbd->outBuff);
+                    zbd->outBuffSize = neededOutSize;
+                    zbd->outBuff = (char*)zbd->customMem.customAlloc(zbd->customMem.opaque, neededOutSize);
+                    if (zbd->outBuff == NULL) return ERROR(memory_allocation);
+            }   }
             zbd->stage = ZBUFFds_read;
+            /* pass-through */
 
         case ZBUFFds_read:
             {   size_t const neededInSize = ZSTD_nextSrcSizeToDecompress(zbd->zd);
                 if (neededInSize==0) {  /* end of frame */
                     zbd->stage = ZBUFFds_init;
-                    notDone = 0;
+                    someMoreWork = 0;
                     break;
                 }
                 if ((size_t)(iend-ip) >= neededInSize) {  /* decode directly from src */
@@ -230,8 +231,9 @@ size_t ZBUFF_decompressContinue(ZBUFF_DCtx* zbd,
                     zbd->stage = ZBUFFds_flush;
                     break;
                 }
-                if (ip==iend) { notDone = 0; break; }   /* no more input */
+                if (ip==iend) { someMoreWork = 0; break; }   /* no more input */
                 zbd->stage = ZBUFFds_load;
+                /* pass-through */
             }
 
         case ZBUFFds_load:
@@ -242,7 +244,7 @@ size_t ZBUFF_decompressContinue(ZBUFF_DCtx* zbd,
                 loadedSize = ZBUFF_limitCopy(zbd->inBuff + zbd->inPos, toLoad, ip, iend-ip);
                 ip += loadedSize;
                 zbd->inPos += loadedSize;
-                if (loadedSize < toLoad) { notDone = 0; break; }   /* not enough input, wait for more */
+                if (loadedSize < toLoad) { someMoreWork = 0; break; }   /* not enough input, wait for more */
 
                 /* decode loaded input */
                 {  const int isSkipFrame = ZSTD_isSkipFrame(zbd->zd);
@@ -254,7 +256,7 @@ size_t ZBUFF_decompressContinue(ZBUFF_DCtx* zbd,
                     if (!decodedSize && !isSkipFrame) { zbd->stage = ZBUFFds_read; break; }   /* this was just a header */
                     zbd->outEnd = zbd->outStart +  decodedSize;
                     zbd->stage = ZBUFFds_flush;
-                    // break; /* ZBUFFds_flush follows */
+                    /* pass-through */
             }   }
 
         case ZBUFFds_flush:
@@ -262,14 +264,14 @@ size_t ZBUFF_decompressContinue(ZBUFF_DCtx* zbd,
                 size_t const flushedSize = ZBUFF_limitCopy(op, oend-op, zbd->outBuff + zbd->outStart, toFlushSize);
                 op += flushedSize;
                 zbd->outStart += flushedSize;
-                if (flushedSize == toFlushSize) {
+                if (flushedSize == toFlushSize) {  /* flush completed */
                     zbd->stage = ZBUFFds_read;
                     if (zbd->outStart + zbd->blockSize > zbd->outBuffSize)
                         zbd->outStart = zbd->outEnd = 0;
                     break;
                 }
                 /* cannot flush everything */
-                notDone = 0;
+                someMoreWork = 0;
                 break;
             }
         default: return ERROR(GENERIC);   /* impossible */
@@ -279,16 +281,17 @@ size_t ZBUFF_decompressContinue(ZBUFF_DCtx* zbd,
     *srcSizePtr = ip-istart;
     *dstCapacityPtr = op-ostart;
     {   size_t nextSrcSizeHint = ZSTD_nextSrcSizeToDecompress(zbd->zd);
-//        if (nextSrcSizeHint > ZSTD_blockHeaderSize) nextSrcSizeHint+= ZSTD_blockHeaderSize;   /* get following block header too */
+        if (!nextSrcSizeHint) return (zbd->outEnd != zbd->outStart);   /* return 0 only if fully flushed too */
+        nextSrcSizeHint += ZSTD_blockHeaderSize * (ZSTD_nextInputType(zbd->zd) == ZSTDnit_block);
+        if (zbd->inPos > nextSrcSizeHint) return ERROR(GENERIC);   /* should never happen */
         nextSrcSizeHint -= zbd->inPos;   /* already loaded*/
         return nextSrcSizeHint;
     }
 }
 
 
-
 /* *************************************
 *  Tool functions
 ***************************************/
-size_t ZBUFF_recommendedDInSize(void)  { return ZSTD_BLOCKSIZE_MAX + ZSTD_blockHeaderSize /* block header size*/ ; }
-size_t ZBUFF_recommendedDOutSize(void) { return ZSTD_BLOCKSIZE_MAX; }
+size_t ZBUFF_recommendedDInSize(void)  { return ZSTD_BLOCKSIZE_ABSOLUTEMAX + ZSTD_blockHeaderSize /* block header size*/ ; }
+size_t ZBUFF_recommendedDOutSize(void) { return ZSTD_BLOCKSIZE_ABSOLUTEMAX; }
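
Note on the new return value of ZBUFF_decompressContinue() in the zbuff_decompress.c hunks above: the hint now returns 0 only when the frame is finished and the output buffer is fully flushed, and it adds a block header size when the next input is a block. The sketch below restates that arithmetic as a standalone function. It is a sketch under assumptions, not the library code: `sketch_next_hint`, `SKETCH_BLOCK_HEADER_SIZE` and `SKETCH_ERROR` are hypothetical names standing in for the context fields, `ZSTD_blockHeaderSize` and `ERROR(GENERIC)`. The value 3 is assumed from the 3-byte block header read in the same commit.

```c
#include <stddef.h>

#define SKETCH_BLOCK_HEADER_SIZE 3          /* assumption: stands in for ZSTD_blockHeaderSize */
#define SKETCH_ERROR ((size_t)-1)           /* assumption: stands in for ERROR(GENERIC) */

/* Hypothetical restatement of the hint computed at the end of
 * ZBUFF_decompressContinue() after this commit.
 *   nextSrcSize : what ZSTD_nextSrcSizeToDecompress() reported
 *   nextIsBlock : nonzero if the next input type is a block
 *   inPos       : bytes already loaded into the input buffer
 *   outStart, outEnd : bounds of the data still waiting to be flushed */
static size_t sketch_next_hint(size_t nextSrcSize, int nextIsBlock,
                               size_t inPos, size_t outStart, size_t outEnd)
{
    /* End of frame: report 0 only if nothing is left to flush. */
    if (nextSrcSize == 0) return (size_t)(outEnd != outStart);

    /* Ask for the following block header as well. */
    if (nextIsBlock) nextSrcSize += SKETCH_BLOCK_HEADER_SIZE;

    /* Guard against underflow before subtracting what is already loaded. */
    if (inPos > nextSrcSize) return SKETCH_ERROR;
    return nextSrcSize - inPos;
}
```

For example, `sketch_next_hint(100, 1, 0, 0, 0)` yields 103, and `sketch_next_hint(0, 0, 0, 2, 5)` yields 1 because three bytes are still pending in the output buffer.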
diff --git a/lib/decompress/zstd_decompress.c b/lib/decompress/zstd_decompress.c
index 37aa403..958d636 100644
--- a/lib/decompress/zstd_decompress.c
+++ b/lib/decompress/zstd_decompress.c
@@ -105,6 +105,7 @@ static void ZSTD_copy4(void* dst, const void* src) { memcpy(dst, src, 4); }
 ***************************************************************/
 typedef enum { ZSTDds_getFrameHeaderSize, ZSTDds_decodeFrameHeader,
                ZSTDds_decodeBlockHeader, ZSTDds_decompressBlock,
+               ZSTDds_decompressLastBlock, ZSTDds_checkChecksum,
                ZSTDds_decodeSkippableHeader, ZSTDds_skipFrame } ZSTD_dStage;
 
 struct ZSTD_DCtx_s
@@ -118,9 +119,9 @@ struct ZSTD_DCtx_s
     const void* vBase;
     const void* dictEnd;
     size_t expected;
-    U32 rep[3];
+    U32 rep[ZSTD_REP_NUM];
     ZSTD_frameParams fParams;
-    blockType_t bType;   /* used in ZSTD_decompressContinue(), to transfer blockType between header decoding and block decoding stages */
+    blockType_e bType;   /* used in ZSTD_decompressContinue(), to transfer blockType between header decoding and block decoding stages */
     ZSTD_dStage stage;
     U32 litEntropy;
     U32 fseEntropy;
@@ -131,11 +132,14 @@ struct ZSTD_DCtx_s
     ZSTD_customMem customMem;
     size_t litBufSize;
     size_t litSize;
-    BYTE litBuffer[ZSTD_BLOCKSIZE_MAX + WILDCOPY_OVERLENGTH];
+    size_t rleSize;
+    BYTE litBuffer[ZSTD_BLOCKSIZE_ABSOLUTEMAX + WILDCOPY_OVERLENGTH];
     BYTE headerBuffer[ZSTD_FRAMEHEADERSIZE_MAX];
 };  /* typedef'd to ZSTD_DCtx within "zstd_static.h" */
 
-size_t ZSTD_sizeofDCtx (void) { return sizeof(ZSTD_DCtx); }   /* non published interface */
+size_t ZSTD_sizeofDCtx (const ZSTD_DCtx* dctx) { return sizeof(*dctx); }
+
+size_t ZSTD_estimateDCtxSize(void) { return sizeof(ZSTD_DCtx); }
 
 size_t ZSTD_decompressBegin(ZSTD_DCtx* dctx)
 {
@@ -184,7 +188,7 @@ size_t ZSTD_freeDCtx(ZSTD_DCtx* dctx)
 void ZSTD_copyDCtx(ZSTD_DCtx* dstDCtx, const ZSTD_DCtx* srcDCtx)
 {
     memcpy(dstDCtx, srcDCtx,
-           sizeof(ZSTD_DCtx) - (ZSTD_BLOCKSIZE_MAX+WILDCOPY_OVERLENGTH + ZSTD_frameHeaderSize_max));  /* no need to copy workspace */
+           sizeof(ZSTD_DCtx) - (ZSTD_BLOCKSIZE_ABSOLUTEMAX+WILDCOPY_OVERLENGTH + ZSTD_frameHeaderSize_max));  /* no need to copy workspace */
 }
 
 
@@ -192,129 +196,7 @@ void ZSTD_copyDCtx(ZSTD_DCtx* dstDCtx, const ZSTD_DCtx* srcDCtx)
 *   Decompression section
 ***************************************************************/
 
-/* Frame format description
-   Frame Header -  [ Block Header - Block ] - Frame End
-   1) Frame Header
-      - 4 bytes - Magic Number : ZSTD_MAGICNUMBER (defined within zstd_static.h)
-      - 1 byte  - Frame Descriptor
-   2) Block Header
-      - 3 bytes, starting with a 2-bits descriptor
-                 Uncompressed, Compressed, Frame End, unused
-   3) Block
-      See Block Format Description
-   4) Frame End
-      - 3 bytes, compatible with Block Header
-*/
-
-
-/* Frame descriptor
-
-    // old
-   1 byte - Alloc :
-   bit 0-3 : windowLog - ZSTD_WINDOWLOG_ABSOLUTEMIN   (see zstd_internal.h)
-   bit 4   : reserved for windowLog (must be zero)
-   bit 5   : reserved (must be zero)
-   bit 6-7 : Frame content size : unknown, 1 byte, 2 bytes, 8 bytes
-
-   1 byte - checker :
-   bit 0-1 : dictID (0, 1, 2 or 4 bytes)
-   bit 2-7 : reserved (must be zero)
-
-    // new
-   1 byte - FrameHeaderDescription :
-   bit 0-1 : dictID (0, 1, 2 or 4 bytes)
-   bit 2   : checksumFlag
-   bit 3   : reserved (must be zero)
-   bit 4   : reserved (unused, can be any value)
-   bit 5   : Single Segment (if 1, WindowLog byte is not present)
-   bit 6-7 : FrameContentFieldSize (0, 2, 4, or 8)
-             if (SkippedWindowLog && !FrameContentFieldsize) FrameContentFieldsize=1;
-
-   Optional : WindowLog (0 or 1 byte)
-   bit 0-2 : octal Fractional (1/8th)
-   bit 3-7 : Power of 2, with 0 = 1 KB (up to 2 TB)
-
-   Optional : dictID (0, 1, 2 or 4 bytes)
-   Automatic adaptation
-   0 : no dictID
-   1 : 1 - 255
-   2 : 256 - 65535
-   4 : all other values
-
-   Optional : content size (0, 1, 2, 4 or 8 bytes)
-   0 : unknown          (fcfs==0 and swl==0)
-   1 : 0-255 bytes      (fcfs==0 and swl==1)
-   2 : 256 - 65535+256  (fcfs==1)
-   4 : 0 - 4GB-1        (fcfs==2)
-   8 : 0 - 16EB-1       (fcfs==3)
-*/
-
-
-/* Compressed Block, format description
-
-   Block = Literal Section - Sequences Section
-   Prerequisite : size of (compressed) block, maximum size of regenerated data
-
-   1) Literal Section
-
-   1.1) Header : 1-5 bytes
-        flags: 2 bits
-            00 compressed by Huff0
-            01 unused
-            10 is Raw (uncompressed)
-            11 is Rle
-            Note : using 01 => Huff0 with precomputed table ?
-            Note : delta map ? => compressed ?
-
-   1.1.1) Huff0-compressed literal block : 3-5 bytes
-            srcSize < 1 KB => 3 bytes (2-2-10-10) => single stream
-            srcSize < 1 KB => 3 bytes (2-2-10-10)
-            srcSize < 16KB => 4 bytes (2-2-14-14)
-            else           => 5 bytes (2-2-18-18)
-            big endian convention
-
-   1.1.2) Raw (uncompressed) literal block header : 1-3 bytes
-        size :  5 bits: (IS_RAW<<6) + (0<<4) + size
-               12 bits: (IS_RAW<<6) + (2<<4) + (size>>8)
-                        size&255
-               20 bits: (IS_RAW<<6) + (3<<4) + (size>>16)
-                        size>>8&255
-                        size&255
-
-   1.1.3) Rle (repeated single byte) literal block header : 1-3 bytes
-        size :  5 bits: (IS_RLE<<6) + (0<<4) + size
-               12 bits: (IS_RLE<<6) + (2<<4) + (size>>8)
-                        size&255
-               20 bits: (IS_RLE<<6) + (3<<4) + (size>>16)
-                        size>>8&255
-                        size&255
-
-   1.1.4) Huff0-compressed literal block, using precomputed CTables : 3-5 bytes
-            srcSize < 1 KB => 3 bytes (2-2-10-10) => single stream
-            srcSize < 1 KB => 3 bytes (2-2-10-10)
-            srcSize < 16KB => 4 bytes (2-2-14-14)
-            else           => 5 bytes (2-2-18-18)
-            big endian convention
-
-        1- CTable available (stored into workspace ?)
-        2- Small input (fast heuristic ? Full comparison ? depend on clevel ?)
-
-
-   1.2) Literal block content
-
-   1.2.1) Huff0 block, using sizes from header
-        See Huff0 format
-
-   1.2.2) Huff0 block, using prepared table
-
-   1.2.3) Raw content
-
-   1.2.4) single byte
-
-
-   2) Sequences section
-      TO DO
-*/
+/* See compression format details in : zstd_compression_format.md */
 
 /** ZSTD_frameHeaderSize() :
 *   srcSize must be >= ZSTD_frameHeaderSize_min.
@@ -324,10 +206,10 @@ static size_t ZSTD_frameHeaderSize(const void* src, size_t srcSize)
     if (srcSize < ZSTD_frameHeaderSize_min) return ERROR(srcSize_wrong);
     {   BYTE const fhd = ((const BYTE*)src)[4];
         U32 const dictID= fhd & 3;
-        U32 const directMode = (fhd >> 5) & 1;
+        U32 const singleSegment = (fhd >> 5) & 1;
         U32 const fcsId = fhd >> 6;
-        return ZSTD_frameHeaderSize_min + !directMode + ZSTD_did_fieldSize[dictID] + ZSTD_fcs_fieldSize[fcsId]
-                + (directMode && !ZSTD_fcs_fieldSize[fcsId]);
+        return ZSTD_frameHeaderSize_min + !singleSegment + ZSTD_did_fieldSize[dictID] + ZSTD_fcs_fieldSize[fcsId]
+                + (singleSegment && !ZSTD_fcs_fieldSize[fcsId]);
     }
 }
 
@@ -361,14 +243,14 @@ size_t ZSTD_getFrameParams(ZSTD_frameParams* fparamsPtr, const void* src, size_t
         size_t pos = 5;
         U32 const dictIDSizeCode = fhdByte&3;
         U32 const checksumFlag = (fhdByte>>2)&1;
-        U32 const directMode = (fhdByte>>5)&1;
+        U32 const singleSegment = (fhdByte>>5)&1;
         U32 const fcsID = fhdByte>>6;
         U32 const windowSizeMax = 1U << ZSTD_WINDOWLOG_MAX;
         U32 windowSize = 0;
         U32 dictID = 0;
         U64 frameContentSize = 0;
         if ((fhdByte & 0x08) != 0) return ERROR(frameParameter_unsupported);   /* reserved bits, which must be zero */
-        if (!directMode) {
+        if (!singleSegment) {
             BYTE const wlByte = ip[pos++];
             U32 const windowLog = (wlByte >> 3) + ZSTD_WINDOWLOG_ABSOLUTEMIN;
             if (windowLog > ZSTD_WINDOWLOG_MAX) return ERROR(frameParameter_unsupported);
@@ -387,7 +269,7 @@ size_t ZSTD_getFrameParams(ZSTD_frameParams* fparamsPtr, const void* src, size_t
         switch(fcsID)
         {
             default:   /* impossible */
-            case 0 : if (directMode) frameContentSize = ip[pos]; break;
+            case 0 : if (singleSegment) frameContentSize = ip[pos]; break;
             case 1 : frameContentSize = MEM_readLE16(ip+pos)+256; break;
             case 2 : frameContentSize = MEM_readLE32(ip+pos); break;
             case 3 : frameContentSize = MEM_readLE64(ip+pos); break;
@@ -403,6 +285,26 @@ size_t ZSTD_getFrameParams(ZSTD_frameParams* fparamsPtr, const void* src, size_t
 }
 
 
+/** ZSTD_getDecompressedSize() :
+*   compatible with legacy mode
+*   @return : decompressed size if known, 0 otherwise
+              note : 0 can mean any of the following :
+                   - decompressed size is not present within frame header
+                   - frame header unknown / not supported
+                   - frame header not complete (`srcSize` too small) */
+unsigned long long ZSTD_getDecompressedSize(const void* src, size_t srcSize)
+{
+#if defined(ZSTD_LEGACY_SUPPORT) && (ZSTD_LEGACY_SUPPORT==1)
+    if (ZSTD_isLegacy(src, srcSize)) return ZSTD_getDecompressedSize_legacy(src, srcSize);
+#endif
+    {   ZSTD_frameParams fparams;
+        size_t const frResult = ZSTD_getFrameParams(&fparams, src, srcSize);
+        if (frResult!=0) return 0;
+        return fparams.frameContentSize;
+    }
+}
+
+
 /** ZSTD_decodeFrameHeader() :
 *   `srcSize` must be the size provided by ZSTD_frameHeaderSize().
 *   @return : 0 if success, or an error code, which can be tested using ZSTD_isError() */
@@ -417,7 +319,8 @@ static size_t ZSTD_decodeFrameHeader(ZSTD_DCtx* dctx, const void* src, size_t sr
 
 typedef struct
 {
-    blockType_t blockType;
+    blockType_e blockType;
+    U32 lastBlock;
     U32 origSize;
 } blockProperties_t;
 
@@ -425,18 +328,16 @@ typedef struct
 *   Provides the size of compressed block from block header `src` */
 size_t ZSTD_getcBlockSize(const void* src, size_t srcSize, blockProperties_t* bpPtr)
 {
-    const BYTE* const in = (const BYTE* const)src;
-    U32 cSize;
-
     if (srcSize < ZSTD_blockHeaderSize) return ERROR(srcSize_wrong);
-
-    bpPtr->blockType = (blockType_t)((*in) >> 6);
-    cSize = in[2] + (in[1]<<8) + ((in[0] & 7)<<16);
-    bpPtr->origSize = (bpPtr->blockType == bt_rle) ? cSize : 0;
-
-    if (bpPtr->blockType == bt_end) return 0;
-    if (bpPtr->blockType == bt_rle) return 1;
-    return cSize;
+    {   U32 const cBlockHeader = MEM_readLE24(src);
+        U32 const cSize = cBlockHeader >> 3;
+        bpPtr->lastBlock = cBlockHeader & 1;
+        bpPtr->blockType = (blockType_e)((cBlockHeader >> 1) & 3);
+        bpPtr->origSize = cSize;   /* only useful for RLE */
+        if (bpPtr->blockType == bt_rle) return 1;
+        if (bpPtr->blockType == bt_reserved) return ERROR(corruption_detected);
+        return cSize;
+    }
 }
 
 
@@ -448,138 +349,143 @@ static size_t ZSTD_copyRawBlock(void* dst, size_t dstCapacity, const void* src,
 }
 
 
+static size_t ZSTD_setRleBlock(void* dst, size_t dstCapacity, const void* src, size_t srcSize, size_t regenSize)
+{
+    if (srcSize != 1) return ERROR(srcSize_wrong);
+    if (regenSize > dstCapacity) return ERROR(dstSize_tooSmall);
+    memset(dst, *(const BYTE*)src, regenSize);
+    return regenSize;
+}
+
 /*! ZSTD_decodeLiteralsBlock() :
     @return : nb of bytes read from src (< srcSize ) */
 size_t ZSTD_decodeLiteralsBlock(ZSTD_DCtx* dctx,
                           const void* src, size_t srcSize)   /* note : srcSize < BLOCKSIZE */
 {
-    const BYTE* const istart = (const BYTE*) src;
-    litBlockType_t lbt;
-
     if (srcSize < MIN_CBLOCK_SIZE) return ERROR(corruption_detected);
-    lbt = (litBlockType_t)(istart[0]>> 6);
 
-    switch(lbt)
-    {
-    case lbt_huffman:
-        {   size_t litSize, litCSize, singleStream=0;
-            U32 lhSize = ((istart[0]) >> 4) & 3;
-            if (srcSize < 5) return ERROR(corruption_detected);   /* srcSize >= MIN_CBLOCK_SIZE == 3; here we need up to 5 for lhSize, + cSize (+nbSeq) */
-            switch(lhSize)
-            {
-            case 0: case 1: default:   /* note : default is impossible, since lhSize into [0..3] */
-                /* 2 - 2 - 10 - 10 */
-                lhSize=3;
-                singleStream = istart[0] & 16;
-                litSize  = ((istart[0] & 15) << 6) + (istart[1] >> 2);
-                litCSize = ((istart[1] &  3) << 8) + istart[2];
-                break;
-            case 2:
-                /* 2 - 2 - 14 - 14 */
-                lhSize=4;
-                litSize  = ((istart[0] & 15) << 10) + (istart[1] << 2) + (istart[2] >> 6);
-                litCSize = ((istart[2] & 63) <<  8) + istart[3];
-                break;
-            case 3:
-                /* 2 - 2 - 18 - 18 */
-                lhSize=5;
-                litSize  = ((istart[0] & 15) << 14) + (istart[1] << 6) + (istart[2] >> 2);
-                litCSize = ((istart[2] &  3) << 16) + (istart[3] << 8) + istart[4];
-                break;
-            }
-            if (litSize > ZSTD_BLOCKSIZE_MAX) return ERROR(corruption_detected);
-            if (litCSize + lhSize > srcSize) return ERROR(corruption_detected);
+    {   const BYTE* const istart = (const BYTE*) src;
+        symbolEncodingType_e const litEncType = (symbolEncodingType_e)(istart[0] & 3);
 
-            if (HUF_isError(singleStream ?
-                            HUF_decompress1X2_DCtx(dctx->hufTable, dctx->litBuffer, litSize, istart+lhSize, litCSize) :
-                            HUF_decompress4X_hufOnly (dctx->hufTable, dctx->litBuffer, litSize, istart+lhSize, litCSize) ))
-                return ERROR(corruption_detected);
+        switch(litEncType)
+        {
+        case set_repeat:
+            if (dctx->litEntropy==0) return ERROR(dictionary_corrupted);
+            /* fall-through */
+        case set_compressed:
+            if (srcSize < 5) return ERROR(corruption_detected);   /* srcSize >= MIN_CBLOCK_SIZE == 3; here we need up to 5 for case 3 */
+            {   size_t lhSize, litSize, litCSize;
+                U32 singleStream=0;
+                U32 const lhlCode = (istart[0] >> 2) & 3;
+                U32 const lhc = MEM_readLE32(istart);
+                switch(lhlCode)
+                {
+                case 0: case 1: default:   /* note : default is impossible, since lhlCode into [0..3] */
+                    /* 2 - 2 - 10 - 10 */
+                    {   singleStream = !lhlCode;
+                        lhSize = 3;
+                        litSize  = (lhc >> 4) & 0x3FF;
+                        litCSize = (lhc >> 14) & 0x3FF;
+                        break;
+                    }
+                case 2:
+                    /* 2 - 2 - 14 - 14 */
+                    {   lhSize = 4;
+                        litSize  = (lhc >> 4) & 0x3FFF;
+                        litCSize = lhc >> 18;
+                        break;
+                    }
+                case 3:
+                    /* 2 - 2 - 18 - 18 */
+                    {   lhSize = 5;
+                        litSize  = (lhc >> 4) & 0x3FFFF;
+                        litCSize = (lhc >> 22) + (istart[4] << 10);
+                        break;
+                    }
+                }
+                if (litSize > ZSTD_BLOCKSIZE_ABSOLUTEMAX) return ERROR(corruption_detected);
+                if (litCSize + lhSize > srcSize) return ERROR(corruption_detected);
+
+                if (HUF_isError((litEncType==set_repeat) ?
+                                    ( singleStream ?
+                                        HUF_decompress1X_usingDTable(dctx->litBuffer, litSize, istart+lhSize, litCSize, dctx->hufTable) :
+                                        HUF_decompress4X_usingDTable(dctx->litBuffer, litSize, istart+lhSize, litCSize, dctx->hufTable) ) :
+                                    ( singleStream ?
+                                        HUF_decompress1X2_DCtx(dctx->hufTable, dctx->litBuffer, litSize, istart+lhSize, litCSize) :
+                                        HUF_decompress4X_hufOnly (dctx->hufTable, dctx->litBuffer, litSize, istart+lhSize, litCSize)) ))
+                    return ERROR(corruption_detected);
 
-            dctx->litPtr = dctx->litBuffer;
-            dctx->litBufSize = ZSTD_BLOCKSIZE_MAX+8;
-            dctx->litSize = litSize;
-            dctx->litEntropy = 1;
-            return litCSize + lhSize;
-        }
-    case lbt_repeat:
-        {   size_t litSize, litCSize;
-            U32 lhSize = ((istart[0]) >> 4) & 3;
-            if (lhSize != 1)  /* only case supported for now : small litSize, single stream */
-                return ERROR(corruption_detected);
-            if (dctx->litEntropy==0)
-                return ERROR(dictionary_corrupted);
+                dctx->litPtr = dctx->litBuffer;
+                dctx->litBufSize = ZSTD_BLOCKSIZE_ABSOLUTEMAX+WILDCOPY_OVERLENGTH;
+                dctx->litSize = litSize;
+                dctx->litEntropy = 1;
+                return litCSize + lhSize;
+            }
 
-            /* 2 - 2 - 10 - 10 */
-            lhSize=3;
-            litSize  = ((istart[0] & 15) << 6) + (istart[1] >> 2);
-            litCSize = ((istart[1] &  3) << 8) + istart[2];
-            if (litCSize + lhSize > srcSize) return ERROR(corruption_detected);
+        case set_basic:
+            {   size_t litSize, lhSize;
+                U32 const lhlCode = ((istart[0]) >> 2) & 3;
+                switch(lhlCode)
+                {
+                case 0: case 2: default:   /* note : default is impossible, since lhlCode into [0..3] */
+                    lhSize = 1;
+                    litSize = istart[0] >> 3;
+                    break;
+                case 1:
+                    lhSize = 2;
+                    litSize = MEM_readLE16(istart) >> 4;
+                    break;
+                case 3:
+                    lhSize = 3;
+                    litSize = MEM_readLE24(istart) >> 4;
+                    break;
+                }
 
-            {   size_t const errorCode = HUF_decompress1X4_usingDTable(dctx->litBuffer, litSize, istart+lhSize, litCSize, dctx->hufTable);
-                if (HUF_isError(errorCode)) return ERROR(corruption_detected);
-            }
-            dctx->litPtr = dctx->litBuffer;
-            dctx->litBufSize = ZSTD_BLOCKSIZE_MAX+WILDCOPY_OVERLENGTH;
-            dctx->litSize = litSize;
-            return litCSize + lhSize;
-        }
-    case lbt_raw:
-        {   size_t litSize;
-            U32 lhSize = ((istart[0]) >> 4) & 3;
-            switch(lhSize)
-            {
-            case 0: case 1: default:   /* note : default is impossible, since lhSize into [0..3] */
-                lhSize=1;
-                litSize = istart[0] & 31;
-                break;
-            case 2:
-                litSize = ((istart[0] & 15) << 8) + istart[1];
-                break;
-            case 3:
-                litSize = ((istart[0] & 15) << 16) + (istart[1] << 8) + istart[2];
-                break;
+                if (lhSize+litSize+WILDCOPY_OVERLENGTH > srcSize) {  /* risk reading beyond src buffer with wildcopy */
+                    if (litSize+lhSize > srcSize) return ERROR(corruption_detected);
+                    memcpy(dctx->litBuffer, istart+lhSize, litSize);
+                    dctx->litPtr = dctx->litBuffer;
+                    dctx->litBufSize = ZSTD_BLOCKSIZE_ABSOLUTEMAX+8;
+                    dctx->litSize = litSize;
+                    return lhSize+litSize;
+                }
+                /* direct reference into compressed stream */
+                dctx->litPtr = istart+lhSize;
+                dctx->litBufSize = srcSize-lhSize;
+                dctx->litSize = litSize;
+                return lhSize+litSize;
             }
 
-            if (lhSize+litSize+WILDCOPY_OVERLENGTH > srcSize) {  /* risk reading beyond src buffer with wildcopy */
-                if (litSize+lhSize > srcSize) return ERROR(corruption_detected);
-                memcpy(dctx->litBuffer, istart+lhSize, litSize);
+        case set_rle:
+            {   U32 const lhlCode = ((istart[0]) >> 2) & 3;
+                size_t litSize, lhSize;
+                switch(lhlCode)
+                {
+                case 0: case 2: default:   /* note : default is impossible, since lhlCode into [0..3] */
+                    lhSize = 1;
+                    litSize = istart[0] >> 3;
+                    break;
+                case 1:
+                    lhSize = 2;
+                    litSize = MEM_readLE16(istart) >> 4;
+                    break;
+                case 3:
+                    lhSize = 3;
+                    litSize = MEM_readLE24(istart) >> 4;
+                    if (srcSize<4) return ERROR(corruption_detected);   /* srcSize >= MIN_CBLOCK_SIZE == 3; here we need lhSize+1 = 4 */
+                    break;
+                }
+                if (litSize > ZSTD_BLOCKSIZE_ABSOLUTEMAX) return ERROR(corruption_detected);
+                memset(dctx->litBuffer, istart[lhSize], litSize);
                 dctx->litPtr = dctx->litBuffer;
-                dctx->litBufSize = ZSTD_BLOCKSIZE_MAX+8;
+                dctx->litBufSize = ZSTD_BLOCKSIZE_ABSOLUTEMAX+WILDCOPY_OVERLENGTH;
                 dctx->litSize = litSize;
-                return lhSize+litSize;
-            }
-            /* direct reference into compressed stream */
-            dctx->litPtr = istart+lhSize;
-            dctx->litBufSize = srcSize-lhSize;
-            dctx->litSize = litSize;
-            return lhSize+litSize;
-        }
-    case lbt_rle:
-        {   size_t litSize;
-            U32 lhSize = ((istart[0]) >> 4) & 3;
-            switch(lhSize)
-            {
-            case 0: case 1: default:   /* note : default is impossible, since lhSize into [0..3] */
-                lhSize = 1;
-                litSize = istart[0] & 31;
-                break;
-            case 2:
-                litSize = ((istart[0] & 15) << 8) + istart[1];
-                break;
-            case 3:
-                litSize = ((istart[0] & 15) << 16) + (istart[1] << 8) + istart[2];
-                if (srcSize<4) return ERROR(corruption_detected);   /* srcSize >= MIN_CBLOCK_SIZE == 3; here we need lhSize+1 = 4 */
-                break;
+                return lhSize+1;
             }
-            if (litSize > ZSTD_BLOCKSIZE_MAX) return ERROR(corruption_detected);
-            memset(dctx->litBuffer, istart[lhSize], litSize);
-            dctx->litPtr = dctx->litBuffer;
-            dctx->litBufSize = ZSTD_BLOCKSIZE_MAX+WILDCOPY_OVERLENGTH;
-            dctx->litSize = litSize;
-            return lhSize+1;
+        default:
+            return ERROR(corruption_detected);   /* impossible */
         }
-    default:
-        return ERROR(corruption_detected);   /* impossible */
+
     }
 }
 
@@ -588,25 +494,25 @@ size_t ZSTD_decodeLiteralsBlock(ZSTD_DCtx* dctx,
     @return : nb bytes read from src,
               or an error code if it fails, testable with ZSTD_isError()
 */
-FORCE_INLINE size_t ZSTD_buildSeqTable(FSE_DTable* DTable, U32 type, U32 max, U32 maxLog,
+FORCE_INLINE size_t ZSTD_buildSeqTable(FSE_DTable* DTable, symbolEncodingType_e type, U32 max, U32 maxLog,
                                  const void* src, size_t srcSize,
                                  const S16* defaultNorm, U32 defaultLog, U32 flagRepeatTable)
 {
     switch(type)
     {
-    case FSE_ENCODING_RLE :
+    case set_rle :
         if (!srcSize) return ERROR(srcSize_wrong);
         if ( (*(const BYTE*)src) > max) return ERROR(corruption_detected);
         FSE_buildDTable_rle(DTable, *(const BYTE*)src);   /* if *src > max, data is corrupted */
         return 1;
-    case FSE_ENCODING_RAW :
+    case set_basic :
         FSE_buildDTable(DTable, defaultNorm, max, defaultLog);
         return 0;
-    case FSE_ENCODING_STATIC:
+    case set_repeat:
         if (!flagRepeatTable) return ERROR(corruption_detected);
         return 0;
     default :   /* impossible */
-    case FSE_ENCODING_DYNAMIC :
+    case set_compressed :
         {   U32 tableLog;
             S16 norm[MaxSeq+1];
             size_t const headerSize = FSE_readNCount(norm, &max, &tableLog, src, srcSize);
@@ -642,26 +548,24 @@ size_t ZSTD_decodeSeqHeaders(int* nbSeqPtr,
     }
 
     /* FSE table descriptors */
-    {   U32 const LLtype  = *ip >> 6;
-        U32 const Offtype = (*ip >> 4) & 3;
-        U32 const MLtype  = (*ip >> 2) & 3;
+    if (ip+4 > iend) return ERROR(srcSize_wrong); /* minimum possible size */
+    {   symbolEncodingType_e const LLtype = (symbolEncodingType_e)(*ip >> 6);
+        symbolEncodingType_e const OFtype = (symbolEncodingType_e)((*ip >> 4) & 3);
+        symbolEncodingType_e const MLtype = (symbolEncodingType_e)((*ip >> 2) & 3);
         ip++;
 
-        /* check */
-        if (ip > iend-3) return ERROR(srcSize_wrong); /* min : all 3 are "raw", hence no header, but at least xxLog bits per type */
-
         /* Build DTables */
-        {   size_t const bhSize = ZSTD_buildSeqTable(DTableLL, LLtype, MaxLL, LLFSELog, ip, iend-ip, LL_defaultNorm, LL_defaultNormLog, flagRepeatTable);
-            if (ZSTD_isError(bhSize)) return ERROR(corruption_detected);
-            ip += bhSize;
+        {   size_t const llhSize = ZSTD_buildSeqTable(DTableLL, LLtype, MaxLL, LLFSELog, ip, iend-ip, LL_defaultNorm, LL_defaultNormLog, flagRepeatTable);
+            if (ZSTD_isError(llhSize)) return ERROR(corruption_detected);
+            ip += llhSize;
         }
-        {   size_t const bhSize = ZSTD_buildSeqTable(DTableOffb, Offtype, MaxOff, OffFSELog, ip, iend-ip, OF_defaultNorm, OF_defaultNormLog, flagRepeatTable);
-            if (ZSTD_isError(bhSize)) return ERROR(corruption_detected);
-            ip += bhSize;
+        {   size_t const ofhSize = ZSTD_buildSeqTable(DTableOffb, OFtype, MaxOff, OffFSELog, ip, iend-ip, OF_defaultNorm, OF_defaultNormLog, flagRepeatTable);
+            if (ZSTD_isError(ofhSize)) return ERROR(corruption_detected);
+            ip += ofhSize;
         }
-        {   size_t const bhSize = ZSTD_buildSeqTable(DTableML, MLtype, MaxML, MLFSELog, ip, iend-ip, ML_defaultNorm, ML_defaultNormLog, flagRepeatTable);
-            if (ZSTD_isError(bhSize)) return ERROR(corruption_detected);
-            ip += bhSize;
+        {   size_t const mlhSize = ZSTD_buildSeqTable(DTableML, MLtype, MaxML, MLFSELog, ip, iend-ip, ML_defaultNorm, ML_defaultNormLog, flagRepeatTable);
+            if (ZSTD_isError(mlhSize)) return ERROR(corruption_detected);
+            ip += mlhSize;
     }   }
 
     return ip-istart;
@@ -679,7 +583,7 @@ typedef struct {
     FSE_DState_t stateLL;
     FSE_DState_t stateOffb;
     FSE_DState_t stateML;
-    size_t prevOffset[ZSTD_REP_INIT];
+    size_t prevOffset[ZSTD_REP_NUM];
 } seqState_t;
 
 
@@ -702,42 +606,37 @@ static seq_t ZSTD_decodeSequence(seqState_t* seqState)
                             0x2000, 0x4000, 0x8000, 0x10000 };
 
     static const U32 ML_base[MaxML+1] = {
-                             0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10,   11,    12,    13,    14,    15,
-                            16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26,   27,    28,    29,    30,    31,
-                            32, 34, 36, 38, 40, 44, 48, 56, 64, 80, 96, 0x80, 0x100, 0x200, 0x400, 0x800,
-                            0x1000, 0x2000, 0x4000, 0x8000, 0x10000 };
+                             3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13,   14,    15,    16,    17,    18,
+                            19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29,   30,    31,    32,    33,    34,
+                            35, 37, 39, 41, 43, 47, 51, 59, 67, 83, 99, 0x83, 0x103, 0x203, 0x403, 0x803,
+                            0x1003, 0x2003, 0x4003, 0x8003, 0x10003 };
 
     static const U32 OF_base[MaxOff+1] = {
-                 0,        1,       3,       7,     0xF,     0x1F,     0x3F,     0x7F,
-                 0xFF,   0x1FF,   0x3FF,   0x7FF,   0xFFF,   0x1FFF,   0x3FFF,   0x7FFF,
-                 0xFFFF, 0x1FFFF, 0x3FFFF, 0x7FFFF, 0xFFFFF, 0x1FFFFF, 0x3FFFFF, 0x7FFFFF,
-                 0xFFFFFF, 0x1FFFFFF, 0x3FFFFFF, /*fake*/ 1, 1 };
+                 0,        1,       1,       5,     0xD,     0x1D,     0x3D,     0x7D,
+                 0xFD,   0x1FD,   0x3FD,   0x7FD,   0xFFD,   0x1FFD,   0x3FFD,   0x7FFD,
+                 0xFFFD, 0x1FFFD, 0x3FFFD, 0x7FFFD, 0xFFFFD, 0x1FFFFD, 0x3FFFFD, 0x7FFFFD,
+                 0xFFFFFD, 0x1FFFFFD, 0x3FFFFFD, 0x7FFFFFD, 0xFFFFFFD };
 
     /* sequence */
     {   size_t offset;
         if (!ofCode)
             offset = 0;
         else {
-            offset = OF_base[ofCode] + BIT_readBits(&(seqState->DStream), ofBits);   /* <=  26 bits */
+            offset = OF_base[ofCode] + BIT_readBits(&(seqState->DStream), ofBits);   /* <=  (ZSTD_WINDOWLOG_MAX-1) bits */
             if (MEM_32bits()) BIT_reloadDStream(&(seqState->DStream));
         }
 
-        if (offset < ZSTD_REP_NUM) {
-            if (llCode == 0 && offset <= 1) offset = 1-offset;
-
-            if (offset != 0) {
-                size_t temp = seqState->prevOffset[offset];
-                if (offset != 1) {
-                    seqState->prevOffset[2] = seqState->prevOffset[1];
-                }
+        if (ofCode <= 1) {
+            offset += (llCode==0);
+            if (offset) {
+                size_t const temp = (offset==3) ? seqState->prevOffset[0] - 1 : seqState->prevOffset[offset];
+                if (offset != 1) seqState->prevOffset[2] = seqState->prevOffset[1];
                 seqState->prevOffset[1] = seqState->prevOffset[0];
                 seqState->prevOffset[0] = offset = temp;
-
             } else {
                 offset = seqState->prevOffset[0];
             }
         } else {
-            offset -= ZSTD_REP_MOVE;
             seqState->prevOffset[2] = seqState->prevOffset[1];
             seqState->prevOffset[1] = seqState->prevOffset[0];
             seqState->prevOffset[0] = offset;
@@ -745,11 +644,11 @@ static seq_t ZSTD_decodeSequence(seqState_t* seqState)
         seq.offset = offset;
     }
 
-    seq.matchLength = ML_base[mlCode] + MINMATCH + ((mlCode>31) ? BIT_readBits(&(seqState->DStream), mlBits) : 0);   /* <=  16 bits */
+    seq.matchLength = ML_base[mlCode] + ((mlCode>31) ? BIT_readBits(&(seqState->DStream), mlBits) : 0);   /* <=  16 bits */
     if (MEM_32bits() && (mlBits+llBits>24)) BIT_reloadDStream(&(seqState->DStream));
 
     seq.litLength = LL_base[llCode] + ((llCode>15) ? BIT_readBits(&(seqState->DStream), llBits) : 0);   /* <=  16 bits */
-    if (MEM_32bits() |
+    if (MEM_32bits() ||
        (totalBits > 64 - 7 - (LLFSELog+MLFSELog+OffFSELog)) ) BIT_reloadDStream(&(seqState->DStream));
 
     /* ANS state update */
@@ -771,7 +670,7 @@ size_t ZSTD_execSequence(BYTE* op,
     BYTE* const oLitEnd = op + sequence.litLength;
     size_t const sequenceLength = sequence.litLength + sequence.matchLength;
     BYTE* const oMatchEnd = op + sequenceLength;   /* risk : address space overflow (32-bits) */
-    BYTE* const oend_w = oend-WILDCOPY_OVERLENGTH;
+    BYTE* const oend_w = oend - WILDCOPY_OVERLENGTH;
     const BYTE* const iLitEnd = *litPtr + sequence.litLength;
     const BYTE* match = oLitEnd - sequence.offset;
 
@@ -864,7 +763,7 @@ static size_t ZSTD_decompressSequences(
     if (nbSeq) {
         seqState_t seqState;
         dctx->fseEntropy = 1;
-        { U32 i; for (i=0; i<ZSTD_REP_INIT; i++) seqState.prevOffset[i] = dctx->rep[i]; }
+        { U32 i; for (i=0; i<ZSTD_REP_NUM; i++) seqState.prevOffset[i] = dctx->rep[i]; }
         { size_t const errorCode = BIT_initDStream(&(seqState.DStream), ip, iend-ip);
           if (ERR_isError(errorCode)) return ERROR(corruption_detected); }
         FSE_initDState(&(seqState.stateLL), &(seqState.DStream), DTableLL);
@@ -882,12 +781,11 @@ static size_t ZSTD_decompressSequences(
         /* check if reached exact end */
         if (nbSeq) return ERROR(corruption_detected);
         /* save reps for next block */
-        { U32 i; for (i=0; i<ZSTD_REP_INIT; i++) dctx->rep[i] = (U32)(seqState.prevOffset[i]); }
+        { U32 i; for (i=0; i<ZSTD_REP_NUM; i++) dctx->rep[i] = (U32)(seqState.prevOffset[i]); }
     }
 
     /* last literal segment */
     {   size_t const lastLLSize = litEnd - litPtr;
-        //if (litPtr > litEnd) return ERROR(corruption_detected);   /* too many literals already used */
         if (lastLLSize > (size_t)(oend-op)) return ERROR(dstSize_tooSmall);
         memcpy(op, litPtr, lastLLSize);
         op += lastLLSize;
@@ -914,7 +812,7 @@ static size_t ZSTD_decompressBlock_internal(ZSTD_DCtx* dctx,
 {   /* blockType == blockCompressed */
     const BYTE* ip = (const BYTE*)src;
 
-    if (srcSize >= ZSTD_BLOCKSIZE_MAX) return ERROR(srcSize_wrong);
+    if (srcSize >= ZSTD_BLOCKSIZE_ABSOLUTEMAX) return ERROR(srcSize_wrong);
 
     /* Decode literals sub-block */
     {   size_t const litCSize = ZSTD_decodeLiteralsBlock(dctx, src, srcSize);
@@ -930,12 +828,25 @@ size_t ZSTD_decompressBlock(ZSTD_DCtx* dctx,
                             void* dst, size_t dstCapacity,
                       const void* src, size_t srcSize)
 {
+    size_t dSize;
     ZSTD_checkContinuity(dctx, dst);
-    return ZSTD_decompressBlock_internal(dctx, dst, dstCapacity, src, srcSize);
+    dSize = ZSTD_decompressBlock_internal(dctx, dst, dstCapacity, src, srcSize);
+    dctx->previousDstEnd = (char*)dst + dSize;
+    return dSize;
 }
 
 
-size_t ZSTD_generateNxByte(void* dst, size_t dstCapacity, BYTE byte, size_t length)
+/** ZSTD_insertBlock() :
+    insert `src` block into `dctx` history. Useful to track uncompressed blocks. */
+ZSTDLIB_API size_t ZSTD_insertBlock(ZSTD_DCtx* dctx, const void* blockStart, size_t blockSize)
+{
+    ZSTD_checkContinuity(dctx, blockStart);
+    dctx->previousDstEnd = (const char*)blockStart + blockSize;
+    return blockSize;
+}
+
+
+size_t ZSTD_generateNxBytes(void* dst, size_t dstCapacity, BYTE byte, size_t length)
 {
     if (length > dstCapacity) return ERROR(dstSize_tooSmall);
     memset(dst, byte, length);
@@ -950,7 +861,6 @@ static size_t ZSTD_decompressFrame(ZSTD_DCtx* dctx,
                                  const void* src, size_t srcSize)
 {
     const BYTE* ip = (const BYTE*)src;
-    const BYTE* const iend = ip + srcSize;
     BYTE* const ostart = (BYTE* const)dst;
     BYTE* const oend = ostart + dstCapacity;
     BYTE* op = ostart;
@@ -961,9 +871,11 @@ static size_t ZSTD_decompressFrame(ZSTD_DCtx* dctx,
 
     /* Frame Header */
     {   size_t const frameHeaderSize = ZSTD_frameHeaderSize(src, ZSTD_frameHeaderSize_min);
+        size_t result;
         if (ZSTD_isError(frameHeaderSize)) return frameHeaderSize;
         if (srcSize < frameHeaderSize+ZSTD_blockHeaderSize) return ERROR(srcSize_wrong);
-        if (ZSTD_decodeFrameHeader(dctx, src, frameHeaderSize)) return ERROR(corruption_detected);
+        result = ZSTD_decodeFrameHeader(dctx, src, frameHeaderSize);
+        if (ZSTD_isError(result)) return result;
         ip += frameHeaderSize; remainingSize -= frameHeaderSize;
     }
 
@@ -971,7 +883,7 @@ static size_t ZSTD_decompressFrame(ZSTD_DCtx* dctx,
     while (1) {
         size_t decodedSize;
         blockProperties_t blockProperties;
-        size_t const cBlockSize = ZSTD_getcBlockSize(ip, iend-ip, &blockProperties);
+        size_t const cBlockSize = ZSTD_getcBlockSize(ip, remainingSize, &blockProperties);
         if (ZSTD_isError(cBlockSize)) return cBlockSize;
 
         ip += ZSTD_blockHeaderSize;
@@ -987,25 +899,31 @@ static size_t ZSTD_decompressFrame(ZSTD_DCtx* dctx,
             decodedSize = ZSTD_copyRawBlock(op, oend-op, ip, cBlockSize);
             break;
         case bt_rle :
-            decodedSize = ZSTD_generateNxByte(op, oend-op, *ip, blockProperties.origSize);
-            break;
-        case bt_end :
-            /* end of frame */
-            if (remainingSize) return ERROR(srcSize_wrong);
-            decodedSize = 0;
+            decodedSize = ZSTD_generateNxBytes(op, oend-op, *ip, blockProperties.origSize);
             break;
+        case bt_reserved :
         default:
-            return ERROR(GENERIC);   /* impossible */
+            return ERROR(corruption_detected);
         }
-        if (cBlockSize == 0) break;   /* bt_end */
 
         if (ZSTD_isError(decodedSize)) return decodedSize;
         if (dctx->fParams.checksumFlag) XXH64_update(&dctx->xxhState, op, decodedSize);
         op += decodedSize;
         ip += cBlockSize;
         remainingSize -= cBlockSize;
+        if (blockProperties.lastBlock) break;
     }
 
+    if (dctx->fParams.checksumFlag) {   /* Frame content checksum verification */
+        U32 const checkCalc = (U32)XXH64_digest(&dctx->xxhState);
+        U32 checkRead;
+        if (remainingSize<4) return ERROR(checksum_wrong);
+        checkRead = MEM_readLE32(ip);
+        if (checkRead != checkCalc) return ERROR(checksum_wrong);
+        remainingSize -= 4;
+    }
+
+    if (remainingSize) return ERROR(srcSize_wrong);
     return op-ostart;
 }
 
@@ -1031,10 +949,7 @@ size_t ZSTD_decompress_usingDict(ZSTD_DCtx* dctx,
                                  const void* dict, size_t dictSize)
 {
 #if defined(ZSTD_LEGACY_SUPPORT) && (ZSTD_LEGACY_SUPPORT==1)
-    {   U32 const magicNumber = MEM_readLE32(src);
-        if (ZSTD_isLegacy(magicNumber))
-            return ZSTD_decompressLegacy(dst, dstCapacity, src, srcSize, dict, dictSize, magicNumber);
-    }
+    if (ZSTD_isLegacy(src, srcSize)) return ZSTD_decompressLegacy(dst, dstCapacity, src, srcSize, dict, dictSize);
 #endif
     ZSTD_decompressBegin_usingDict(dctx, dict, dictSize);
     ZSTD_checkContinuity(dctx, dst);
@@ -1064,19 +979,34 @@ size_t ZSTD_decompress(void* dst, size_t dstCapacity, const void* src, size_t sr
 }
 
 
-/*_******************************
-*  Streaming Decompression API
-********************************/
-size_t ZSTD_nextSrcSizeToDecompress(ZSTD_DCtx* dctx)
-{
-    return dctx->expected;
-}
+/*-**********************************
+*   Streaming Decompression API
+************************************/
+size_t ZSTD_nextSrcSizeToDecompress(ZSTD_DCtx* dctx) { return dctx->expected; }
 
-int ZSTD_isSkipFrame(ZSTD_DCtx* dctx)
-{
-    return dctx->stage == ZSTDds_skipFrame;
+ZSTD_nextInputType_e ZSTD_nextInputType(ZSTD_DCtx* dctx) {
+    switch(dctx->stage)
+    {
+    default:   /* should not happen */
+    case ZSTDds_getFrameHeaderSize:
+    case ZSTDds_decodeFrameHeader:
+        return ZSTDnit_frameHeader;
+    case ZSTDds_decodeBlockHeader:
+        return ZSTDnit_blockHeader;
+    case ZSTDds_decompressBlock:
+        return ZSTDnit_block;
+    case ZSTDds_decompressLastBlock:
+        return ZSTDnit_lastBlock;
+    case ZSTDds_checkChecksum:
+        return ZSTDnit_checksum;
+    case ZSTDds_decodeSkippableHeader:
+    case ZSTDds_skipFrame:
+        return ZSTDnit_skippableFrame;
+    }
 }
 
+int ZSTD_isSkipFrame(ZSTD_DCtx* dctx) { return dctx->stage == ZSTDds_skipFrame; }   /* for zbuff */
+
 /** ZSTD_decompressContinue() :
 *   @return : nb of bytes generated into `dst` (necessarily <= `dstCapacity)
 *             or an error code, which can be tested using ZSTD_isError() */
@@ -1119,23 +1049,29 @@ size_t ZSTD_decompressContinue(ZSTD_DCtx* dctx, void* dst, size_t dstCapacity, c
         {   blockProperties_t bp;
             size_t const cBlockSize = ZSTD_getcBlockSize(src, ZSTD_blockHeaderSize, &bp);
             if (ZSTD_isError(cBlockSize)) return cBlockSize;
-            if (bp.blockType == bt_end) {
+            dctx->expected = cBlockSize;
+            dctx->bType = bp.blockType;
+            dctx->rleSize = bp.origSize;
+            if (cBlockSize) {
+                dctx->stage = bp.lastBlock ? ZSTDds_decompressLastBlock : ZSTDds_decompressBlock;
+                return 0;
+            }
+            /* empty block */
+            if (bp.lastBlock) {
                 if (dctx->fParams.checksumFlag) {
-                    U64 const h64 = XXH64_digest(&dctx->xxhState);
-                    U32 const h32 = (U32)(h64>>11) & ((1<<22)-1);
-                    const BYTE* const ip = (const BYTE*)src;
-                    U32 const check32 = ip[2] + (ip[1] << 8) + ((ip[0] & 0x3F) << 16);
-                    if (check32 != h32) return ERROR(checksum_wrong);
+                    dctx->expected = 4;
+                    dctx->stage = ZSTDds_checkChecksum;
+                } else {
+                    dctx->expected = 0; /* end of frame */
+                    dctx->stage = ZSTDds_getFrameHeaderSize;
                 }
-                dctx->expected = 0;
-                dctx->stage = ZSTDds_getFrameHeaderSize;
             } else {
-                dctx->expected = cBlockSize;
-                dctx->bType = bp.blockType;
-                dctx->stage = ZSTDds_decompressBlock;
+                dctx->expected = 3;  /* go directly to next header */
+                dctx->stage = ZSTDds_decodeBlockHeader;
             }
             return 0;
         }
+    case ZSTDds_decompressLastBlock:
     case ZSTDds_decompressBlock:
         {   size_t rSize;
             switch(dctx->bType)
@@ -1147,21 +1083,38 @@ size_t ZSTD_decompressContinue(ZSTD_DCtx* dctx, void* dst, size_t dstCapacity, c
                 rSize = ZSTD_copyRawBlock(dst, dstCapacity, src, srcSize);
                 break;
             case bt_rle :
-                return ERROR(GENERIC);   /* not yet handled */
-                break;
-            case bt_end :   /* should never happen (filtered at phase 1) */
-                rSize = 0;
+                rSize = ZSTD_setRleBlock(dst, dstCapacity, src, srcSize, dctx->rleSize);
                 break;
+            case bt_reserved :   /* should never happen */
             default:
-                return ERROR(GENERIC);   /* impossible */
+                return ERROR(corruption_detected);
             }
-            dctx->stage = ZSTDds_decodeBlockHeader;
-            dctx->expected = ZSTD_blockHeaderSize;
-            dctx->previousDstEnd = (char*)dst + rSize;
             if (ZSTD_isError(rSize)) return rSize;
             if (dctx->fParams.checksumFlag) XXH64_update(&dctx->xxhState, dst, rSize);
+
+            if (dctx->stage == ZSTDds_decompressLastBlock) {   /* end of frame */
+                if (dctx->fParams.checksumFlag) {  /* another round for frame checksum */
+                    dctx->expected = 4;
+                    dctx->stage = ZSTDds_checkChecksum;
+                } else {
+                    dctx->expected = 0;   /* ends here */
+                    dctx->stage = ZSTDds_getFrameHeaderSize;
+                }
+            } else {
+                dctx->stage = ZSTDds_decodeBlockHeader;
+                dctx->expected = ZSTD_blockHeaderSize;
+                dctx->previousDstEnd = (char*)dst + rSize;
+            }
             return rSize;
         }
+    case ZSTDds_checkChecksum:
+        {   U32 const h32 = (U32)XXH64_digest(&dctx->xxhState);
+            U32 const check32 = MEM_readLE32(src);   /* srcSize == 4, guaranteed by dctx->expected */
+            if (check32 != h32) return ERROR(checksum_wrong);
+            dctx->expected = 0;
+            dctx->stage = ZSTDds_getFrameHeaderSize;
+            return 0;
+        }
     case ZSTDds_decodeSkippableHeader:
         {   memcpy(dctx->headerBuffer + ZSTD_frameHeaderSize_min, src, dctx->expected);
             dctx->expected = MEM_readLE32(dctx->headerBuffer + 4);
@@ -1273,8 +1226,8 @@ size_t ZSTD_decompressBegin_usingDict(ZSTD_DCtx* dctx, const void* dict, size_t
 
 
 struct ZSTD_DDict_s {
-    void* dictContent;
-    size_t dictContentSize;
+    void* dict;
+    size_t dictSize;
     ZSTD_DCtx* refContext;
 };  /* typedef'd tp ZSTD_CDict within zstd.h */
 
@@ -1306,8 +1259,8 @@ ZSTD_DDict* ZSTD_createDDict_advanced(const void* dict, size_t dictSize, ZSTD_cu
                 return NULL;
         }   }
 
-        ddict->dictContent = dictContent;
-        ddict->dictContentSize = dictSize;
+        ddict->dict = dictContent;
+        ddict->dictSize = dictSize;
         ddict->refContext = dctx;
         return ddict;
     }
@@ -1327,7 +1280,7 @@ size_t ZSTD_freeDDict(ZSTD_DDict* ddict)
     ZSTD_freeFunction const cFree = ddict->refContext->customMem.customFree;
     void* const opaque = ddict->refContext->customMem.opaque;
     ZSTD_freeDCtx(ddict->refContext);
-    cFree(opaque, ddict->dictContent);
+    cFree(opaque, ddict->dict);
     cFree(opaque, ddict);
     return 0;
 }
@@ -1340,6 +1293,9 @@ ZSTDLIB_API size_t ZSTD_decompress_usingDDict(ZSTD_DCtx* dctx,
                                      const void* src, size_t srcSize,
                                      const ZSTD_DDict* ddict)
 {
+#if defined(ZSTD_LEGACY_SUPPORT) && (ZSTD_LEGACY_SUPPORT==1)
+    if (ZSTD_isLegacy(src, srcSize)) return ZSTD_decompressLegacy(dst, dstCapacity, src, srcSize, ddict->dict, ddict->dictSize);
+#endif
     return ZSTD_decompress_usingPreparedDCtx(dctx, ddict->refContext,
                                            dst, dstCapacity,
                                            src, srcSize);
diff --git a/lib/dictBuilder/zdict.c b/lib/dictBuilder/zdict.c
index 0315094..6c2277b 100644
--- a/lib/dictBuilder/zdict.c
+++ b/lib/dictBuilder/zdict.c
@@ -32,13 +32,14 @@
 */
 
 /*-**************************************
-*  Compiler Options
+*  Tuning parameters
 ****************************************/
-/* Disable some Visual warning messages */
-#ifdef _MSC_VER
-#  pragma warning(disable : 4127)                /* disable: C4127: conditional expression is constant */
-#endif
+#define ZDICT_MAX_SAMPLES_SIZE (2000U << 20)
+
 
+/*-**************************************
+*  Compiler Options
+****************************************/
 /* Unix Large Files support (>4GB) */
 #define _FILE_OFFSET_BITS 64
 #if (defined(__sun__) && (!defined(__LP64__)))   /* Sun Solaris 32-bits requires specific definitions */
@@ -58,13 +59,15 @@
 
 #include "mem.h"           /* read */
 #include "error_private.h"
-#include "fse.h"
+#include "fse.h"           /* FSE_normalizeCount, FSE_writeNCount */
 #define HUF_STATIC_LINKING_ONLY
 #include "huf.h"
 #include "zstd_internal.h" /* includes zstd.h */
 #include "xxhash.h"
 #include "divsufsort.h"
-#define ZDICT_STATIC_LINKING_ONLY
+#ifndef ZDICT_STATIC_LINKING_ONLY
+#  define ZDICT_STATIC_LINKING_ONLY
+#endif
 #include "zdict.h"
 
 
@@ -82,7 +85,7 @@
 #define PRIME2   2246822519U
 
 #define MINRATIO 4
-static const U32 g_compressionLevel_default = 5;
+static const int g_compressionLevel_default = 5;
 static const U32 g_selectivity_default = 9;
 static const size_t g_provision_entropySize = 200;
 static const size_t g_min_fast_dictContent = 192;
@@ -91,17 +94,19 @@ static const size_t g_min_fast_dictContent = 192;
 /*-*************************************
 *  Console display
 ***************************************/
-#define DISPLAY(...)         fprintf(stderr, __VA_ARGS__)
+#define DISPLAY(...)         { fprintf(stderr, __VA_ARGS__); fflush( stderr ); }
 #define DISPLAYLEVEL(l, ...) if (g_displayLevel>=l) { DISPLAY(__VA_ARGS__); }
 static unsigned g_displayLevel = 0;   /* 0 : no display;   1: errors;   2: default;  4: full information */
 
 #define DISPLAYUPDATE(l, ...) if (g_displayLevel>=l) { \
-            if (ZDICT_GetMilliSpan(g_time) > refreshRate)  \
+            if (ZDICT_clockSpan(g_time) > refreshRate)  \
             { g_time = clock(); DISPLAY(__VA_ARGS__); \
             if (g_displayLevel>=4) fflush(stdout); } }
-static const unsigned refreshRate = 300;
+static const clock_t refreshRate = CLOCKS_PER_SEC * 3 / 10;
 static clock_t g_time = 0;
 
+static clock_t ZDICT_clockSpan(clock_t nPrevious) { return clock() - nPrevious; }
+
 static void ZDICT_printHex(U32 dlevel, const void* ptr, size_t length)
 {
     const BYTE* const b = (const BYTE*)ptr;
@@ -117,13 +122,6 @@ static void ZDICT_printHex(U32 dlevel, const void* ptr, size_t length)
 /*-********************************************************
 *  Helper functions
 **********************************************************/
-static unsigned ZDICT_GetMilliSpan(clock_t nPrevious)
-{
-    clock_t nCurrent = clock();
-    unsigned nSpan = (unsigned)(((nCurrent - nPrevious) * 1000) / CLOCKS_PER_SEC);
-    return nSpan;
-}
-
 unsigned ZDICT_isError(size_t errorCode) { return ERR_isError(errorCode); }
 
 const char* ZDICT_getErrorName(size_t errorCode) { return ERR_getErrorName(errorCode); }
@@ -286,7 +284,7 @@ static dictItem ZDICT_analyzePos(
         U32 refinedEnd = end;
 
         DISPLAYLEVEL(4, "\n");
-        DISPLAYLEVEL(4, "found %3u matches of length >= %u at pos %7u  ", (U32)(end-start), MINMATCHLENGTH, (U32)pos);
+        DISPLAYLEVEL(4, "found %3u matches of length >= %i at pos %7u  ", (U32)(end-start), MINMATCHLENGTH, (U32)pos);
         DISPLAYLEVEL(4, "\n");
 
         for (searchLength = MINMATCHLENGTH ; ; searchLength++) {
@@ -489,17 +487,15 @@ static U32 ZDICT_dictSize(const dictItem* dictList)
 
 
 static size_t ZDICT_trainBuffer(dictItem* dictList, U32 dictListSize,
-                            const void* const buffer, const size_t bufferSize,   /* buffer must end with noisy guard band */
+                            const void* const buffer, size_t bufferSize,   /* buffer must end with noisy guard band */
                             const size_t* fileSizes, unsigned nbFiles,
-                            U32 shiftRatio, unsigned maxDictSize)
+                            U32 minRatio)
 {
     int* const suffix0 = (int*)malloc((bufferSize+2)*sizeof(*suffix0));
     int* const suffix = suffix0+1;
     U32* reverseSuffix = (U32*)malloc((bufferSize)*sizeof(*reverseSuffix));
     BYTE* doneMarks = (BYTE*)malloc((bufferSize+16)*sizeof(*doneMarks));   /* +16 for overflow security */
     U32* filePos = (U32*)malloc(nbFiles * sizeof(*filePos));
-    U32 minRatio = nbFiles >> shiftRatio;
-    int divSuftSortResult;
     size_t result = 0;
 
     /* init */
@@ -511,15 +507,19 @@ static size_t ZDICT_trainBuffer(dictItem* dictList, U32 dictListSize,
     if (minRatio < MINRATIO) minRatio = MINRATIO;
     memset(doneMarks, 0, bufferSize+16);
 
+    /* limit sample set size (divsufsort limitation)*/
+    if (bufferSize > ZDICT_MAX_SAMPLES_SIZE) DISPLAYLEVEL(3, "sample set too large : reduced to %u MB ...\n", (U32)(ZDICT_MAX_SAMPLES_SIZE>>20));
+    while (bufferSize > ZDICT_MAX_SAMPLES_SIZE) bufferSize -= fileSizes[--nbFiles];
+
     /* sort */
     DISPLAYLEVEL(2, "sorting %u files of total size %u MB ...\n", nbFiles, (U32)(bufferSize>>20));
-    divSuftSortResult = divsufsort((const unsigned char*)buffer, suffix, (int)bufferSize, 0);
-    if (divSuftSortResult != 0) { result = ERROR(GENERIC); goto _cleanup; }
+    {   int const divSuftSortResult = divsufsort((const unsigned char*)buffer, suffix, (int)bufferSize, 0);
+        if (divSuftSortResult != 0) { result = ERROR(GENERIC); goto _cleanup; }
+    }
     suffix[bufferSize] = (int)bufferSize;   /* leads into noise */
     suffix0[0] = (int)bufferSize;           /* leads into noise */
-    {
-        /* build reverse suffix sort */
-        size_t pos;
+    /* build reverse suffix sort */
+    {   size_t pos;
         for (pos=0; pos < bufferSize; pos++)
             reverseSuffix[suffix[pos]] = (U32)pos;
         /* build file pos */
@@ -541,16 +541,6 @@ static size_t ZDICT_trainBuffer(dictItem* dictList, U32 dictListSize,
             DISPLAYUPDATE(2, "\r%4.2f %% \r", (double)cursor / bufferSize * 100);
     }   }
 
-    /* limit dictionary size */
-    {   U32 const max = dictList->pos;   /* convention : nb of useful elts within dictList */
-        U32 currentSize = 0;
-        U32 n; for (n=1; n<max; n++) {
-            currentSize += dictList[n].length;
-            if (currentSize > maxDictSize) break;
-        }
-        dictList->pos = n;
-    }
-
 _cleanup:
     free(suffix0);
     free(reverseSuffix);
@@ -575,20 +565,23 @@ typedef struct
 {
     ZSTD_CCtx* ref;
     ZSTD_CCtx* zc;
-    void* workPlace;   /* must be ZSTD_BLOCKSIZE_MAX allocated */
+    void* workPlace;   /* must be ZSTD_BLOCKSIZE_ABSOLUTEMAX allocated */
 } EStats_ress_t;
 
 #define MAXREPOFFSET 1024
 
-static void ZDICT_countEStats(EStats_ress_t esr,
+static void ZDICT_countEStats(EStats_ress_t esr, ZSTD_parameters params,
                             U32* countLit, U32* offsetcodeCount, U32* matchlengthCount, U32* litlengthCount, U32* repOffsets,
                             const void* src, size_t srcSize)
 {
+    size_t const blockSizeMax = MIN (ZSTD_BLOCKSIZE_ABSOLUTEMAX, 1 << params.cParams.windowLog);
     size_t cSize;
 
-    if (srcSize > ZSTD_BLOCKSIZE_MAX) srcSize = ZSTD_BLOCKSIZE_MAX;   /* protection vs large samples */
-    ZSTD_copyCCtx(esr.zc, esr.ref);
-    cSize = ZSTD_compressBlock(esr.zc, esr.workPlace, ZSTD_BLOCKSIZE_MAX, src, srcSize);
+    if (srcSize > blockSizeMax) srcSize = blockSizeMax;   /* protection vs large samples */
+	{	size_t const errorCode = ZSTD_copyCCtx(esr.zc, esr.ref);
+		if (ZSTD_isError(errorCode)) { DISPLAYLEVEL(1, "warning : ZSTD_copyCCtx failed \n"); return; }
+	}
+    cSize = ZSTD_compressBlock(esr.zc, esr.workPlace, ZSTD_BLOCKSIZE_ABSOLUTEMAX, src, srcSize);
     if (ZSTD_isError(cSize)) { DISPLAYLEVEL(1, "warning : could not compress sample size %u \n", (U32)srcSize); return; }
 
     if (cSize) {  /* if == 0; block is not compressible */
@@ -601,28 +594,28 @@ static void ZDICT_countEStats(EStats_ress_t esr,
         }
 
         /* seqStats */
-        {   size_t const nbSeq = (size_t)(seqStorePtr->offset - seqStorePtr->offsetStart);
-            ZSTD_seqToCodes(seqStorePtr, nbSeq);
+        {   U32 const nbSeq = (U32)(seqStorePtr->sequences - seqStorePtr->sequencesStart);
+            ZSTD_seqToCodes(seqStorePtr);
 
-            {   const BYTE* codePtr = seqStorePtr->offCodeStart;
-                size_t u;
+            {   const BYTE* codePtr = seqStorePtr->ofCode;
+                U32 u;
                 for (u=0; u<nbSeq; u++) offsetcodeCount[codePtr[u]]++;
             }
 
-            {   const BYTE* codePtr = seqStorePtr->mlCodeStart;
-                size_t u;
+            {   const BYTE* codePtr = seqStorePtr->mlCode;
+                U32 u;
                 for (u=0; u<nbSeq; u++) matchlengthCount[codePtr[u]]++;
             }
 
-            {   const BYTE* codePtr = seqStorePtr->llCodeStart;
-                size_t u;
+            {   const BYTE* codePtr = seqStorePtr->llCode;
+                U32 u;
                 for (u=0; u<nbSeq; u++) litlengthCount[codePtr[u]]++;
         }   }
 
         /* rep offsets */
-        {   const U32* const offsetPtr = seqStorePtr->offsetStart;
-            U32 offset1 = offsetPtr[0] - 3;
-            U32 offset2 = offsetPtr[1] - 3;
+        {   const seqDef* const seq = seqStorePtr->sequences;
+            U32 offset1 = seq[0].offset - 3;
+            U32 offset2 = seq[1].offset - 3;
             if (offset1 >= MAXREPOFFSET) offset1 = 0;
             if (offset2 >= MAXREPOFFSET) offset2 = 0;
             repOffsets[offset1] += 3;
@@ -667,7 +660,7 @@ static void ZDICT_insertSortCount(offsetCount_t table[ZSTD_REP_NUM+1], U32 val,
 }
 
 
-#define OFFCODE_MAX 18  /* only applicable to first block */
+#define OFFCODE_MAX 30  /* only applicable to first block */
 static size_t ZDICT_analyzeEntropy(void*  dstBuffer, size_t maxDstSize,
                                  unsigned compressionLevel,
                            const void*  srcBuffer, const size_t* fileSizes, unsigned nbFiles,
@@ -677,6 +670,7 @@ static size_t ZDICT_analyzeEntropy(void*  dstBuffer, size_t maxDstSize,
     HUF_CREATE_STATIC_CTABLE(hufTable, 255);
     U32 offcodeCount[OFFCODE_MAX+1];
     short offcodeNCount[OFFCODE_MAX+1];
+    U32 offcodeMax = ZSTD_highbit32((U32)(dictBufferSize + 128 KB));
     U32 matchLengthCount[MaxML+1];
     short matchLengthNCount[MaxML+1];
     U32 litLengthCount[MaxLL+1];
@@ -685,7 +679,7 @@ static size_t ZDICT_analyzeEntropy(void*  dstBuffer, size_t maxDstSize,
     offsetCount_t bestRepOffset[ZSTD_REP_NUM+1];
     EStats_ress_t esr;
     ZSTD_parameters params;
-    U32 u, huffLog = 12, Offlog = OffFSELog, mlLog = MLFSELog, llLog = LLFSELog, total;
+    U32 u, huffLog = 11, Offlog = OffFSELog, mlLog = MLFSELog, llLog = LLFSELog, total;
     size_t pos = 0, errorCode;
     size_t eSize = 0;
     size_t const totalSrcSize = ZDICT_totalSampleSize(fileSizes, nbFiles);
@@ -693,29 +687,33 @@ static size_t ZDICT_analyzeEntropy(void*  dstBuffer, size_t maxDstSize,
     BYTE* dstPtr = (BYTE*)dstBuffer;
 
     /* init */
+    if (offcodeMax>OFFCODE_MAX) { eSize = ERROR(dictionary_wrong); goto _cleanup; }   /* too large dictionary */
     for (u=0; u<256; u++) countLit[u]=1;   /* any character must be described */
-    for (u=0; u<=OFFCODE_MAX; u++) offcodeCount[u]=1;
+    for (u=0; u<=offcodeMax; u++) offcodeCount[u]=1;
     for (u=0; u<=MaxML; u++) matchLengthCount[u]=1;
     for (u=0; u<=MaxLL; u++) litLengthCount[u]=1;
     repOffset[1] = repOffset[4] = repOffset[8] = 1;
     memset(bestRepOffset, 0, sizeof(bestRepOffset));
     esr.ref = ZSTD_createCCtx();
     esr.zc = ZSTD_createCCtx();
-    esr.workPlace = malloc(ZSTD_BLOCKSIZE_MAX);
+    esr.workPlace = malloc(ZSTD_BLOCKSIZE_ABSOLUTEMAX);
     if (!esr.ref || !esr.zc || !esr.workPlace) {
             eSize = ERROR(memory_allocation);
             DISPLAYLEVEL(1, "Not enough memory");
             goto _cleanup;
     }
     if (compressionLevel==0) compressionLevel=g_compressionLevel_default;
-    params.cParams = ZSTD_getCParams(compressionLevel, averageSampleSize, dictBufferSize);
-    params.cParams.strategy = ZSTD_greedy;
-    params.fParams.contentSizeFlag = 0;
-    ZSTD_compressBegin_advanced(esr.ref, dictBuffer, dictBufferSize, params, 0);
+    params = ZSTD_getParams(compressionLevel, averageSampleSize, dictBufferSize);
+	{	size_t const beginResult = ZSTD_compressBegin_advanced(esr.ref, dictBuffer, dictBufferSize, params, 0);
+		if (ZSTD_isError(beginResult)) {
+			eSize = ERROR(GENERIC);
+			DISPLAYLEVEL(1, "error : ZSTD_compressBegin_advanced failed ");
+			goto _cleanup;
+	}	}
 
     /* collect stats on all files */
     for (u=0; u<nbFiles; u++) {
-        ZDICT_countEStats(esr,
+        ZDICT_countEStats(esr, params,
                         countLit, offcodeCount, matchLengthCount, litLengthCount, repOffset,
            (const char*)srcBuffer + pos, fileSizes[u]);
         pos += fileSizes[u];
@@ -737,8 +735,8 @@ static size_t ZDICT_analyzeEntropy(void*  dstBuffer, size_t maxDstSize,
     }
     /* note : the result of this phase should be used to better appreciate the impact on statistics */
 
-    total=0; for (u=0; u<=OFFCODE_MAX; u++) total+=offcodeCount[u];
-    errorCode = FSE_normalizeCount(offcodeNCount, Offlog, offcodeCount, total, OFFCODE_MAX);
+    total=0; for (u=0; u<=offcodeMax; u++) total+=offcodeCount[u];
+    errorCode = FSE_normalizeCount(offcodeNCount, Offlog, offcodeCount, total, offcodeMax);
     if (FSE_isError(errorCode)) {
         eSize = ERROR(GENERIC);
         DISPLAYLEVEL(1, "FSE_normalizeCount error with offcodeCount");
@@ -838,56 +836,18 @@ _cleanup:
 }
 
 
-#define DIB_FASTSEGMENTSIZE 64
-/*! ZDICT_fastSampling()  (based on an idea proposed by Giuseppe Ottaviano) :
-    Fill `dictBuffer` with stripes of size DIB_FASTSEGMENTSIZE from `samplesBuffer`,
-    up to `dictSize`.
-    Filling starts from the end of `dictBuffer`, down to maximum possible.
-    if `dictSize` is not a multiply of DIB_FASTSEGMENTSIZE, some bytes at beginning of `dictBuffer` won't be used.
-    @return : amount of data written into `dictBuffer`,
-              or an error code
-*/
-static size_t ZDICT_fastSampling(void* dictBuffer, size_t dictSize,
-                         const void* samplesBuffer, size_t samplesSize)
-{
-    char* dstPtr = (char*)dictBuffer + dictSize;
-    const char* srcPtr = (const char*)samplesBuffer;
-    size_t const nbSegments = dictSize / DIB_FASTSEGMENTSIZE;
-    size_t segNb, interSize;
-
-    if (nbSegments <= 2) return ERROR(srcSize_wrong);
-    if (samplesSize < dictSize) return ERROR(srcSize_wrong);
-
-    /* first and last segments are part of dictionary, in case they contain interesting header/footer */
-    dstPtr -= DIB_FASTSEGMENTSIZE;
-    memcpy(dstPtr, srcPtr, DIB_FASTSEGMENTSIZE);
-    dstPtr -= DIB_FASTSEGMENTSIZE;
-    memcpy(dstPtr, srcPtr+samplesSize-DIB_FASTSEGMENTSIZE, DIB_FASTSEGMENTSIZE);
-
-    /* regularly copy a segment */
-    interSize = (samplesSize - nbSegments*DIB_FASTSEGMENTSIZE) / (nbSegments-1);
-    srcPtr += DIB_FASTSEGMENTSIZE;
-    for (segNb=2; segNb < nbSegments; segNb++) {
-        srcPtr += interSize;
-        dstPtr -= DIB_FASTSEGMENTSIZE;
-        memcpy(dstPtr, srcPtr, DIB_FASTSEGMENTSIZE);
-        srcPtr += DIB_FASTSEGMENTSIZE;
-    }
-
-    return nbSegments * DIB_FASTSEGMENTSIZE;
-}
-
 size_t ZDICT_addEntropyTablesFromBuffer_advanced(void* dictBuffer, size_t dictContentSize, size_t dictBufferCapacity,
                                                  const void* samplesBuffer, const size_t* samplesSizes, unsigned nbSamples,
                                                  ZDICT_params_t params)
 {
     size_t hSize;
-    unsigned const compressionLevel = (params.compressionLevel == 0) ? g_compressionLevel_default : params.compressionLevel;
+    int const compressionLevel = (params.compressionLevel <= 0) ? g_compressionLevel_default : params.compressionLevel;
 
     /* dictionary header */
     MEM_writeLE32(dictBuffer, ZSTD_DICT_MAGIC);
     {   U64 const randomID = XXH64((char*)dictBuffer + dictBufferCapacity - dictContentSize, dictContentSize, 0);
-        U32 const dictID = params.dictID ? params.dictID : (U32)(randomID>>11);
+        U32 const compliantID = (randomID % ((1U<<31)-32768)) + 32768;
+        U32 const dictID = params.dictID ? params.dictID : compliantID;
         MEM_writeLE32((char*)dictBuffer+4, dictID);
     }
     hSize = 8;
@@ -905,60 +865,88 @@ size_t ZDICT_addEntropyTablesFromBuffer_advanced(void* dictBuffer, size_t dictCo
     return MIN(dictBufferCapacity, hSize+dictContentSize);
 }
 
-#define DIB_MINSAMPLESSIZE (DIB_FASTSEGMENTSIZE*3)
+
+#define DIB_MINSAMPLESSIZE 512
 /*! ZDICT_trainFromBuffer_unsafe() :
-*   `samplesBuffer` must be followed by noisy guard band.
-*   @return : size of dictionary.
+*   Warning : `samplesBuffer` must be followed by noisy guard band.
+*   @return : size of dictionary, or an error code which can be tested with ZDICT_isError()
 */
 size_t ZDICT_trainFromBuffer_unsafe(
                             void* dictBuffer, size_t maxDictSize,
                             const void* samplesBuffer, const size_t* samplesSizes, unsigned nbSamples,
                             ZDICT_params_t params)
 {
-    U32 const dictListSize = MAX( MAX(DICTLISTSIZE, nbSamples), (U32)(maxDictSize/16));
+    U32 const dictListSize = MAX(MAX(DICTLISTSIZE, nbSamples), (U32)(maxDictSize/16));
     dictItem* const dictList = (dictItem*)malloc(dictListSize * sizeof(*dictList));
-    unsigned selectivity = params.selectivityLevel;
+    unsigned const selectivity = params.selectivityLevel == 0 ? g_selectivity_default : params.selectivityLevel;
+    unsigned const minRep = (selectivity > 30) ? MINRATIO : nbSamples >> selectivity;
     size_t const targetDictSize = maxDictSize;
-    size_t sBuffSize;
+    size_t const samplesBuffSize = ZDICT_totalSampleSize(samplesSizes, nbSamples);
     size_t dictSize = 0;
 
     /* checks */
-    if (maxDictSize <= g_provision_entropySize + g_min_fast_dictContent) return ERROR(dstSize_tooSmall);
     if (!dictList) return ERROR(memory_allocation);
+    if (maxDictSize <= g_provision_entropySize + g_min_fast_dictContent) { free(dictList); return ERROR(dstSize_tooSmall); }
+    if (samplesBuffSize < DIB_MINSAMPLESSIZE) { free(dictList); return 0; }   /* not enough source to create dictionary */
 
     /* init */
-    { unsigned u; for (u=0, sBuffSize=0; u<nbSamples; u++) sBuffSize += samplesSizes[u]; }
-    if (sBuffSize < DIB_MINSAMPLESSIZE) return 0;   /* not enough source to create dictionary */
     ZDICT_initDictItem(dictList);
     g_displayLevel = params.notificationLevel;
-    if (selectivity==0) selectivity = g_selectivity_default;
 
     /* build dictionary */
-    if (selectivity>1) {  /* selectivity == 1 => fast mode */
-        ZDICT_trainBuffer(dictList, dictListSize,
-                        samplesBuffer, sBuffSize,
-                        samplesSizes, nbSamples,
-                        selectivity, (U32)targetDictSize);
-
-        /* display best matches */
-        if (g_displayLevel>= 3) {
-            U32 const nb = 25;
-            U32 const dictContentSize = ZDICT_dictSize(dictList);
-            U32 u;
-            DISPLAYLEVEL(3, "\n %u segments found, of total size %u \n", dictList[0].pos, dictContentSize);
-            DISPLAYLEVEL(3, "list %u best segments \n", nb);
-            for (u=1; u<=nb; u++) {
-                U32 p = dictList[u].pos;
-                U32 l = dictList[u].length;
-                U32 d = MIN(40, l);
-                DISPLAYLEVEL(3, "%3u:%3u bytes at pos %8u, savings %7u bytes |",
-                             u, l, p, dictList[u].savings);
-                ZDICT_printHex(3, (const char*)samplesBuffer+p, d);
-                DISPLAYLEVEL(3, "| \n");
-    }   }   }
+    ZDICT_trainBuffer(dictList, dictListSize,
+                    samplesBuffer, samplesBuffSize,
+                    samplesSizes, nbSamples,
+                    minRep);
+
+    /* display best matches */
+    if (g_displayLevel>= 3) {
+        U32 const nb = 25;
+        U32 const dictContentSize = ZDICT_dictSize(dictList);
+        U32 u;
+        DISPLAYLEVEL(3, "\n %u segments found, of total size %u \n", dictList[0].pos, dictContentSize);
+        DISPLAYLEVEL(3, "list %u best segments \n", nb);
+        for (u=1; u<=nb; u++) {
+            U32 pos = dictList[u].pos;
+            U32 length = dictList[u].length;
+            U32 printedLength = MIN(40, length);
+            DISPLAYLEVEL(3, "%3u:%3u bytes at pos %8u, savings %7u bytes |",
+                         u, length, pos, dictList[u].savings);
+            ZDICT_printHex(3, (const char*)samplesBuffer+pos, printedLength);
+            DISPLAYLEVEL(3, "| \n");
+    }   }
+
 
     /* create dictionary */
     {   U32 dictContentSize = ZDICT_dictSize(dictList);
+        if (dictContentSize < targetDictSize/2) {
+            DISPLAYLEVEL(2, "!  warning : created dictionary significantly smaller than requested (%u < %u) \n", dictContentSize, (U32)maxDictSize);
+            if (minRep > MINRATIO) {
+                DISPLAYLEVEL(2, "!  consider increasing selectivity to produce larger dictionary (-s%u) \n", selectivity+1);
+                DISPLAYLEVEL(2, "!  note : larger dictionaries are not necessarily better, test its efficiency on samples \n");
+            }
+            if (samplesBuffSize < 10 * targetDictSize)
+                DISPLAYLEVEL(2, "!  consider increasing the number of samples (total size : %u MB)\n", (U32)(samplesBuffSize>>20));
+        }
+
+        if ((dictContentSize > targetDictSize*2) && (nbSamples > 2*MINRATIO) && (selectivity>1)) {
+            U32 proposedSelectivity = selectivity-1;
+            while ((nbSamples >> proposedSelectivity) <= MINRATIO) { proposedSelectivity--; }
+            DISPLAYLEVEL(2, "!  note : calculated dictionary significantly larger than requested (%u > %u) \n", dictContentSize, (U32)maxDictSize);
+            DISPLAYLEVEL(2, "!  you may consider decreasing selectivity to produce denser dictionary (-s%u) \n", proposedSelectivity);
+            DISPLAYLEVEL(2, "!  but test its efficiency on samples \n");
+        }
+
+        /* limit dictionary size */
+        {   U32 const max = dictList->pos;   /* convention : nb of useful elts within dictList */
+            U32 currentSize = 0;
+            U32 n; for (n=1; n<max; n++) {
+                currentSize += dictList[n].length;
+                if (currentSize > targetDictSize) { currentSize -= dictList[n].length; break; }
+            }
+            dictList->pos = n;
+            dictContentSize = currentSize;
+        }
 
         /* build dict content */
         {   U32 u;
@@ -966,18 +954,10 @@ size_t ZDICT_trainFromBuffer_unsafe(
             for (u=1; u<dictList->pos; u++) {
                 U32 l = dictList[u].length;
                 ptr -= l;
-                if (ptr<(BYTE*)dictBuffer) return ERROR(GENERIC);   /* should not happen */
+                if (ptr<(BYTE*)dictBuffer) { free(dictList); return ERROR(GENERIC); }   /* should not happen */
                 memcpy(ptr, (const char*)samplesBuffer+dictList[u].pos, l);
         }   }
 
-        /* fast mode dict content */
-        if (selectivity==1) {  /* note could also be used to complete a dictionary, but not necessarily better */
-            DISPLAYLEVEL(3, "\r%70s\r", "");   /* clean display line */
-            DISPLAYLEVEL(3, "Adding %u KB with fast sampling \n", (U32)(targetDictSize>>10));
-            dictContentSize = (U32)ZDICT_fastSampling(dictBuffer, targetDictSize,
-                                                      samplesBuffer, sBuffSize);
-        }
-
         dictSize = ZDICT_addEntropyTablesFromBuffer_advanced(dictBuffer, dictContentSize, maxDictSize,
                                                              samplesBuffer, samplesSizes, nbSamples,
                                                              params);
@@ -995,23 +975,23 @@ size_t ZDICT_trainFromBuffer_advanced(void* dictBuffer, size_t dictBufferCapacit
                                       const void* samplesBuffer, const size_t* samplesSizes, unsigned nbSamples,
                                       ZDICT_params_t params)
 {
+    size_t result;
     void* newBuff;
-    size_t sBuffSize;
+    size_t const sBuffSize = ZDICT_totalSampleSize(samplesSizes, nbSamples);
+    if (sBuffSize < DIB_MINSAMPLESSIZE) return 0;   /* not enough content => no dictionary */
 
-    { unsigned u; for (u=0, sBuffSize=0; u<nbSamples; u++) sBuffSize += samplesSizes[u]; }
-    if (sBuffSize==0) return 0;   /* empty content => no dictionary */
     newBuff = malloc(sBuffSize + NOISELENGTH);
     if (!newBuff) return ERROR(memory_allocation);
 
     memcpy(newBuff, samplesBuffer, sBuffSize);
     ZDICT_fillNoise((char*)newBuff + sBuffSize, NOISELENGTH);   /* guard band, for end of buffer condition */
 
-    { size_t const result = ZDICT_trainFromBuffer_unsafe(
+    result = ZDICT_trainFromBuffer_unsafe(
                                         dictBuffer, dictBufferCapacity,
                                         newBuff, samplesSizes, nbSamples,
                                         params);
-      free(newBuff);
-      return result; }
+    free(newBuff);
+    return result;
 }
 
 
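Reviewer note on the zdict.c changes above: two of the new pieces of arithmetic are easy to check in isolation. The sketch below is a standalone illustration, not code from the patch. `example_compliantDictID` mirrors the `compliantID` expression, which maps a 64-bit hash into the range [32768, 2^31 - 1]. `example_limitSegments` mirrors the relocated "limit dictionary size" loop, with the simplifying assumption that segment lengths sit in a plain array rather than in `dictList`, whose first element is a header. Both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Maps a 64-bit hash onto [32768, 2^31 - 1], as the `compliantID`
 * expression in the patch does for the XXH64 result. */
static uint32_t example_compliantDictID(uint64_t hash)
{
    uint32_t const span = (1U << 31) - 32768;   /* number of admissible IDs */
    uint32_t const id = (uint32_t)(hash % span) + 32768;
    assert(id >= 32768 && id < (1U << 31));
    return id;
}

/* Keeps leading segments while their cumulated length stays <= targetSize.
 * Returns the number of segments kept and stores their total in *keptSize.
 * The patch adds a length and then subtracts it back on overshoot;
 * testing before adding gives the same result. */
static size_t example_limitSegments(const uint32_t* lengths, size_t nbSegments,
                                    uint32_t targetSize, uint32_t* keptSize)
{
    uint32_t current = 0;
    size_t n;
    assert(lengths != NULL && keptSize != NULL);
    for (n = 0; n < nbSegments; n++) {
        if (lengths[n] > targetSize - current) break;   /* next segment would overshoot */
        current += lengths[n];
    }
    *keptSize = current;
    return n;
}
```

With lengths {100, 200, 300} and a target of 350, the second helper keeps two segments totalling 300 bytes, which is the value the patched code then passes on as `dictContentSize`.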
diff --git a/lib/dictBuilder/zdict.h b/lib/dictBuilder/zdict.h
index 39acdf8..d61b592 100644
--- a/lib/dictBuilder/zdict.h
+++ b/lib/dictBuilder/zdict.h
@@ -38,43 +38,28 @@
 extern "C" {
 #endif
 
-/*-*************************************
-*  Public functions
-***************************************/
 /*! ZDICT_trainFromBuffer() :
-    Train a dictionary from a memory buffer `samplesBuffer`,
-    where `nbSamples` samples have been stored concatenated.
-    Each sample size is provided into an orderly table `samplesSizes`.
-    Resulting dictionary will be saved into `dictBuffer`.
+    Train a dictionary from an array of samples.
+    Samples must be stored concatenated in a single flat buffer `samplesBuffer`,
+    supplied with an array of sizes `samplesSizes`, providing the size of each sample, in order.
+    The resulting dictionary will be saved into `dictBuffer`.
     @return : size of dictionary stored into `dictBuffer` (<= `dictBufferCapacity`)
-              or an error code, which can be tested by ZDICT_isError().
+              or an error code, which can be tested with ZDICT_isError().
+    Tips : In general, a reasonable dictionary has a size of ~ 100 KB.
+           It's obviously possible to target smaller or larger ones, just by specifying different `dictBufferCapacity`.
+           In general, it's recommended to provide a few thousands samples, but this can vary a lot.
+           It's recommended that total size of all samples be about ~x100 times the target size of dictionary.
 */
 size_t ZDICT_trainFromBuffer(void* dictBuffer, size_t dictBufferCapacity,
-                             const void* samplesBuffer, const size_t* samplesSizes, unsigned nbSamples);
-
-/*! ZDICT_addEntropyTablesFromBuffer() :
-
-    Given a content-only dictionary (built for example from common strings in
-    the input), add entropy tables computed from the memory buffer
-    `samplesBuffer`, where `nbSamples` samples have been stored concatenated.
-    Each sample size is provided into an orderly table `samplesSizes`.
-
-    The input dictionary is the last `dictContentSize` bytes of `dictBuffer`. The
-    resulting dictionary with added entropy tables will written back to
-    `dictBuffer`.
-    @return : size of dictionary stored into `dictBuffer` (<= `dictBufferCapacity`).
-*/
-size_t ZDICT_addEntropyTablesFromBuffer(void* dictBuffer, size_t dictContentSize, size_t dictBufferCapacity,
-                                        const void* samplesBuffer, const size_t* samplesSizes, unsigned nbSamples);
+                       const void* samplesBuffer, const size_t* samplesSizes, unsigned nbSamples);
 
 
-/*-*************************************
-*  Helper functions
-***************************************/
+/*======   Helper functions   ======*/
 unsigned ZDICT_isError(size_t errorCode);
 const char* ZDICT_getErrorName(size_t errorCode);
 
 
+
 #ifdef ZDICT_STATIC_LINKING_ONLY
 
 /* ====================================================================================
@@ -84,32 +69,44 @@ const char* ZDICT_getErrorName(size_t errorCode);
  * Use them only in association with static linking.
  * ==================================================================================== */
 
-
-/*-*************************************
-*  Public type
-***************************************/
 typedef struct {
-    unsigned selectivityLevel;   /* 0 means default; larger => bigger selection => larger dictionary */
-    unsigned compressionLevel;   /* 0 means default; target a specific zstd compression level */
+    unsigned selectivityLevel;   /* 0 means default; larger => select more => larger dictionary */
+    int      compressionLevel;   /* 0 means default; target a specific zstd compression level */
     unsigned notificationLevel;  /* Write to stderr; 0 = none (default); 1 = errors; 2 = progression; 3 = details; 4 = debug; */
     unsigned dictID;             /* 0 means auto mode (32-bits random value); other : force dictID value */
     unsigned reserved[2];        /* space for future parameters */
 } ZDICT_params_t;
 
 
-/*-*************************************
-*  Public functions
-***************************************/
 /*! ZDICT_trainFromBuffer_advanced() :
     Same as ZDICT_trainFromBuffer() with control over more parameters.
     `parameters` is optional and can be provided with values set to 0 to mean "default".
-    @return : size of dictionary stored into `dictBuffer` (<= `dictBufferSize`)
+    @return : size of dictionary stored into `dictBuffer` (<= `dictBufferSize`),
               or an error code, which can be tested by ZDICT_isError().
-    note : ZDICT_trainFromBuffer_advanced() will send notifications into stderr if instructed to, using ZDICT_setNotificationLevel()
+    note : ZDICT_trainFromBuffer_advanced() will send notifications into stderr if instructed to, using notificationLevel>0.
 */
 size_t ZDICT_trainFromBuffer_advanced(void* dictBuffer, size_t dictBufferCapacity,
-                             const void* samplesBuffer, const size_t* samplesSizes, unsigned nbSamples,
-                             ZDICT_params_t parameters);
+                                const void* samplesBuffer, const size_t* samplesSizes, unsigned nbSamples,
+                                ZDICT_params_t parameters);
+
+
+/*! ZDICT_addEntropyTablesFromBuffer() :
+
+    Given a content-only dictionary (built using any 3rd party algorithm),
+    add entropy tables computed from an array of samples.
+    Samples must be stored concatenated in a flat buffer `samplesBuffer`,
+    supplied with an array of sizes `samplesSizes`, providing the size of each sample in order.
+
+    The input dictionary content must be stored *at the end* of `dictBuffer`.
+    Its size is `dictContentSize`.
+    The resulting dictionary with added entropy tables will be *written back to `dictBuffer`*,
+    starting from its beginning.
+    @return : size of dictionary stored into `dictBuffer` (<= `dictBufferCapacity`).
+*/
+size_t ZDICT_addEntropyTablesFromBuffer(void* dictBuffer, size_t dictContentSize, size_t dictBufferCapacity,
+                                        const void* samplesBuffer, const size_t* samplesSizes, unsigned nbSamples);
+
+
 
 #endif   /* ZDICT_STATIC_LINKING_ONLY */
 
@@ -117,4 +114,4 @@ size_t ZDICT_trainFromBuffer_advanced(void* dictBuffer, size_t dictBufferCapacit
 }
 #endif
 
-#endif
+#endif   /* DICTBUILDER_H_001 */
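Reviewer note on the zdict.h documentation above: the comment for ZDICT_trainFromBuffer() requires samples to be concatenated in one flat buffer, with a parallel array giving each sample's size in order. The sketch below shows one way a caller might build that layout. The helper `example_flattenSamples` is hypothetical and not part of zstd, and the block does not call the library so that it compiles on its own.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Concatenates nbSamples byte strings into one malloc'ed flat buffer.
 * Returns the buffer (caller frees) or NULL on allocation failure;
 * *totalSize receives the sum of all sample sizes. */
static void* example_flattenSamples(const void* const* samples,
                                    const size_t* samplesSizes,
                                    unsigned nbSamples, size_t* totalSize)
{
    size_t total = 0;
    size_t pos = 0;
    unsigned u;
    char* flat;

    assert(totalSize != NULL);
    for (u = 0; u < nbSamples; u++) total += samplesSizes[u];
    *totalSize = total;

    flat = (char*)malloc(total ? total : 1);   /* avoid malloc(0) */
    if (flat == NULL) return NULL;

    for (u = 0; u < nbSamples; u++) {          /* samples are stored back to back */
        memcpy(flat + pos, samples[u], samplesSizes[u]);
        pos += samplesSizes[u];
    }
    return flat;
}

/* With zdict.h available, the flat buffer would then be passed as:
 *     size_t const dictSize = ZDICT_trainFromBuffer(dictBuffer, dictBufferCapacity,
 *                                                   flat, samplesSizes, nbSamples);
 *     if (ZDICT_isError(dictSize)) { ... ZDICT_getErrorName(dictSize) ... }
 */
```

Per the updated zdict.c in this commit, a total sample size below DIB_MINSAMPLESSIZE (512 bytes) makes the trainer return 0 rather than an error, so callers likely want to treat a zero result as "no dictionary produced".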
diff --git a/lib/legacy/zstd_legacy.h b/lib/legacy/zstd_legacy.h
index 22921be..6c2a101 100644
--- a/lib/legacy/zstd_legacy.h
+++ b/lib/legacy/zstd_legacy.h
@@ -48,14 +48,18 @@ extern "C" {
 #include "zstd_v04.h"
 #include "zstd_v05.h"
 #include "zstd_v06.h"
+#include "zstd_v07.h"
 
 
 /** ZSTD_isLegacy() :
     @return : > 0 if supported by legacy decoder. 0 otherwise.
               return value is the version.
 */
-MEM_STATIC unsigned ZSTD_isLegacy (U32 magicNumberLE)
+MEM_STATIC unsigned ZSTD_isLegacy(const void* src, size_t srcSize)
 {
+    U32 magicNumberLE;
+    if (srcSize<4) return 0;
+    magicNumberLE = MEM_readLE32(src);
     switch(magicNumberLE)
     {
         case ZSTDv01_magicNumberLE:return 1;
@@ -64,28 +68,57 @@ MEM_STATIC unsigned ZSTD_isLegacy (U32 magicNumberLE)
         case ZSTDv04_magicNumber : return 4;
         case ZSTDv05_MAGICNUMBER : return 5;
         case ZSTDv06_MAGICNUMBER : return 6;
+        case ZSTDv07_MAGICNUMBER : return 7;
         default : return 0;
     }
 }
 
 
+MEM_STATIC unsigned long long ZSTD_getDecompressedSize_legacy(const void* src, size_t srcSize)
+{
+    if (srcSize < 4) return 0;
+
+    {   U32 const version = ZSTD_isLegacy(src, srcSize);
+        if (version < 5) return 0;  /* no decompressed size in frame header, or not a legacy format */
+        if (version==5) {
+            ZSTDv05_parameters fParams;
+            size_t const frResult = ZSTDv05_getFrameParams(&fParams, src, srcSize);
+            if (frResult != 0) return 0;
+            return fParams.srcSize;
+        }
+        if (version==6) {
+            ZSTDv06_frameParams fParams;
+            size_t const frResult = ZSTDv06_getFrameParams(&fParams, src, srcSize);
+            if (frResult != 0) return 0;
+            return fParams.frameContentSize;
+        }
+        if (version==7) {
+            ZSTDv07_frameParams fParams;
+            size_t const frResult = ZSTDv07_getFrameParams(&fParams, src, srcSize);
+            if (frResult != 0) return 0;
+            return fParams.frameContentSize;
+        }
+        return 0;   /* should not be possible */
+    }
+}
+
 MEM_STATIC size_t ZSTD_decompressLegacy(
                      void* dst, size_t dstCapacity,
                const void* src, size_t compressedSize,
-               const void* dict,size_t dictSize,
-                     U32 magicNumberLE)
+               const void* dict,size_t dictSize)
 {
-    switch(magicNumberLE)
+    U32 const version = ZSTD_isLegacy(src, compressedSize);
+    switch(version)
     {
-        case ZSTDv01_magicNumberLE :
+        case 1 :
             return ZSTDv01_decompress(dst, dstCapacity, src, compressedSize);
-        case ZSTDv02_magicNumber :
+        case 2 :
             return ZSTDv02_decompress(dst, dstCapacity, src, compressedSize);
-        case ZSTDv03_magicNumber :
+        case 3 :
             return ZSTDv03_decompress(dst, dstCapacity, src, compressedSize);
-        case ZSTDv04_magicNumber :
+        case 4 :
             return ZSTDv04_decompress(dst, dstCapacity, src, compressedSize);
-        case ZSTDv05_MAGICNUMBER :
+        case 5 :
             {   size_t result;
                 ZSTDv05_DCtx* const zd = ZSTDv05_createDCtx();
                 if (zd==NULL) return ERROR(memory_allocation);
@@ -93,7 +126,7 @@ MEM_STATIC size_t ZSTD_decompressLegacy(
                 ZSTDv05_freeDCtx(zd);
                 return result;
             }
-        case ZSTDv06_MAGICNUMBER :
+        case 6 :
             {   size_t result;
                 ZSTDv06_DCtx* const zd = ZSTDv06_createDCtx();
                 if (zd==NULL) return ERROR(memory_allocation);
@@ -101,6 +134,14 @@ MEM_STATIC size_t ZSTD_decompressLegacy(
                 ZSTDv06_freeDCtx(zd);
                 return result;
             }
+        case 7 :
+            {   size_t result;
+                ZSTDv07_DCtx* const zd = ZSTDv07_createDCtx();
+                if (zd==NULL) return ERROR(memory_allocation);
+                result = ZSTDv07_decompress_usingDict(zd, dst, dstCapacity, src, compressedSize, dict, dictSize);
+                ZSTDv07_freeDCtx(zd);
+                return result;
+            }
         default :
             return ERROR(prefix_unknown);
     }
diff --git a/lib/legacy/zstd_v02.c b/lib/legacy/zstd_v02.c
index 8950111..2d4cfa5 100644
--- a/lib/legacy/zstd_v02.c
+++ b/lib/legacy/zstd_v02.c
@@ -1350,7 +1350,7 @@ static unsigned FSE_isError(size_t code) { return ERR_isError(code); }
 ****************************************************************/
 static short FSE_abs(short a)
 {
-    return a<0 ? -a : a;
+    return (short)(a<0 ? -a : a);
 }
 
 static size_t FSE_readNCount (short* normalizedCounter, unsigned* maxSVPtr, unsigned* tableLogPtr,
diff --git a/lib/legacy/zstd_v04.c b/lib/legacy/zstd_v04.c
index 3546904..66a47e7 100644
--- a/lib/legacy/zstd_v04.c
+++ b/lib/legacy/zstd_v04.c
@@ -3620,36 +3620,26 @@ static size_t ZSTD_decompressContinue(ZSTD_DCtx* ctx, void* dst, size_t maxDstSi
     switch (ctx->stage)
     {
     case ZSTDds_getFrameHeaderSize :
-        {
-            /* get frame header size */
-            if (srcSize != ZSTD_frameHeaderSize_min) return ERROR(srcSize_wrong);   /* impossible */
-            ctx->headerSize = ZSTD_decodeFrameHeader_Part1(ctx, src, ZSTD_frameHeaderSize_min);
-            if (ZSTD_isError(ctx->headerSize)) return ctx->headerSize;
-            memcpy(ctx->headerBuffer, src, ZSTD_frameHeaderSize_min);
-            if (ctx->headerSize > ZSTD_frameHeaderSize_min)
-            {
-                ctx->expected = ctx->headerSize - ZSTD_frameHeaderSize_min;
-                ctx->stage = ZSTDds_decodeFrameHeader;
-                return 0;
-            }
-            ctx->expected = 0;   /* not necessary to copy more */
-        }
+        /* get frame header size */
+        if (srcSize != ZSTD_frameHeaderSize_min) return ERROR(srcSize_wrong);   /* impossible */
+        ctx->headerSize = ZSTD_decodeFrameHeader_Part1(ctx, src, ZSTD_frameHeaderSize_min);
+        if (ZSTD_isError(ctx->headerSize)) return ctx->headerSize;
+        memcpy(ctx->headerBuffer, src, ZSTD_frameHeaderSize_min);
+        if (ctx->headerSize > ZSTD_frameHeaderSize_min) return ERROR(GENERIC);   /* impossible */
+        ctx->expected = 0;   /* not necessary to copy more */
+        /* fallthrough */
     case ZSTDds_decodeFrameHeader:
-        {
-            /* get frame header */
-            size_t result;
-            memcpy(ctx->headerBuffer + ZSTD_frameHeaderSize_min, src, ctx->expected);
-            result = ZSTD_decodeFrameHeader_Part2(ctx, ctx->headerBuffer, ctx->headerSize);
+        /* get frame header */
+        {   size_t const result = ZSTD_decodeFrameHeader_Part2(ctx, ctx->headerBuffer, ctx->headerSize);
             if (ZSTD_isError(result)) return result;
             ctx->expected = ZSTD_blockHeaderSize;
             ctx->stage = ZSTDds_decodeBlockHeader;
             return 0;
         }
     case ZSTDds_decodeBlockHeader:
-        {
-            /* Decode block header */
-            blockProperties_t bp;
-            size_t blockSize = ZSTD_getcBlockSize(src, ZSTD_blockHeaderSize, &bp);
+        /* Decode block header */
+        {   blockProperties_t bp;
+            size_t const blockSize = ZSTD_getcBlockSize(src, ZSTD_blockHeaderSize, &bp);
             if (ZSTD_isError(blockSize)) return blockSize;
             if (bp.blockType == bt_end)
             {
@@ -3864,11 +3854,9 @@ static size_t ZBUFF_decompressContinue(ZBUFF_DCtx* zbc, void* dst, size_t* maxDs
 
         case ZBUFFds_readHeader :
             /* read header from src */
-            {
-                size_t headerSize = ZSTD_getFrameParams(&(zbc->params), src, *srcSizePtr);
+            {   size_t const headerSize = ZSTD_getFrameParams(&(zbc->params), src, *srcSizePtr);
                 if (ZSTD_isError(headerSize)) return headerSize;
-                if (headerSize)
-                {
+                if (headerSize) {
                     /* not enough input to decode header : tell how many bytes would be necessary */
                     memcpy(zbc->headerBuffer+zbc->hPos, src, *srcSizePtr);
                     zbc->hPos += *srcSizePtr;
@@ -3882,8 +3870,7 @@ static size_t ZBUFF_decompressContinue(ZBUFF_DCtx* zbc, void* dst, size_t* maxDs
 
         case ZBUFFds_loadHeader:
             /* complete header from src */
-            {
-                size_t headerSize = ZBUFF_limitCopy(
+            {   size_t headerSize = ZBUFF_limitCopy(
                     zbc->headerBuffer + zbc->hPos, ZSTD_frameHeaderSize_max - zbc->hPos,
                     src, *srcSizePtr);
                 zbc->hPos += headerSize;
@@ -3895,12 +3882,12 @@ static size_t ZBUFF_decompressContinue(ZBUFF_DCtx* zbc, void* dst, size_t* maxDs
                     *maxDstSizePtr = 0;
                     return headerSize - zbc->hPos;
             }   }
+            /* intentional fallthrough */
 
         case ZBUFFds_decodeHeader:
                 /* apply header to create / resize buffers */
-                {
-                    size_t neededOutSize = (size_t)1 << zbc->params.windowLog;
-                    size_t neededInSize = BLOCKSIZE;   /* a block is never > BLOCKSIZE */
+                {   size_t const neededOutSize = (size_t)1 << zbc->params.windowLog;
+                    size_t const neededInSize = BLOCKSIZE;   /* a block is never > BLOCKSIZE */
                     if (zbc->inBuffSize < neededInSize) {
                         free(zbc->inBuff);
                         zbc->inBuffSize = neededInSize;
@@ -4037,7 +4024,7 @@ size_t ZSTDv04_decompress(void* dst, size_t maxDstSize, const void* src, size_t
     return regenSize;
 #else
     ZSTD_DCtx dctx;
-    return ZSTD_decompressDCtx(&dctx, dst, maxDstSize, src, srcSize);
+    return ZSTDv04_decompressDCtx(&dctx, dst, maxDstSize, src, srcSize);
 #endif
 }
 
@@ -4067,3 +4054,11 @@ size_t ZBUFFv04_decompressContinue(ZBUFFv04_DCtx* dctx, void* dst, size_t* maxDs
 {
     return ZBUFF_decompressContinue(dctx, dst, maxDstSizePtr, src, srcSizePtr);
 }
+
+ZSTD_DCtx* ZSTDv04_createDCtx(void) { return ZSTD_createDCtx(); }
+size_t ZSTDv04_freeDCtx(ZSTD_DCtx* dctx) { return ZSTD_freeDCtx(dctx); }
+
+size_t ZSTDv04_getFrameParams(ZSTD_parameters* params, const void* src, size_t srcSize)
+{
+    return ZSTD_getFrameParams(params, src, srcSize);
+}
diff --git a/lib/legacy/zstd_v05.c b/lib/legacy/zstd_v05.c
index 9c57d18..f3c720f 100644
--- a/lib/legacy/zstd_v05.c
+++ b/lib/legacy/zstd_v05.c
@@ -3872,25 +3872,17 @@ size_t ZSTDv05_decompressContinue(ZSTDv05_DCtx* dctx, void* dst, size_t maxDstSi
     switch (dctx->stage)
     {
     case ZSTDv05ds_getFrameHeaderSize :
-        {
-            /* get frame header size */
-            if (srcSize != ZSTDv05_frameHeaderSize_min) return ERROR(srcSize_wrong);   /* impossible */
-            dctx->headerSize = ZSTDv05_decodeFrameHeader_Part1(dctx, src, ZSTDv05_frameHeaderSize_min);
-            if (ZSTDv05_isError(dctx->headerSize)) return dctx->headerSize;
-            memcpy(dctx->headerBuffer, src, ZSTDv05_frameHeaderSize_min);
-            if (dctx->headerSize > ZSTDv05_frameHeaderSize_min) {
-                dctx->expected = dctx->headerSize - ZSTDv05_frameHeaderSize_min;
-                dctx->stage = ZSTDv05ds_decodeFrameHeader;
-                return 0;
-            }
-            dctx->expected = 0;   /* not necessary to copy more */
-        }
+        /* get frame header size */
+        if (srcSize != ZSTDv05_frameHeaderSize_min) return ERROR(srcSize_wrong);   /* impossible */
+        dctx->headerSize = ZSTDv05_decodeFrameHeader_Part1(dctx, src, ZSTDv05_frameHeaderSize_min);
+        if (ZSTDv05_isError(dctx->headerSize)) return dctx->headerSize;
+        memcpy(dctx->headerBuffer, src, ZSTDv05_frameHeaderSize_min);
+        if (dctx->headerSize > ZSTDv05_frameHeaderSize_min) return ERROR(GENERIC); /* should never happen */
+        dctx->expected = 0;   /* not necessary to copy more */
+        /* fallthrough */
     case ZSTDv05ds_decodeFrameHeader:
-        {
-            /* get frame header */
-            size_t result;
-            memcpy(dctx->headerBuffer + ZSTDv05_frameHeaderSize_min, src, dctx->expected);
-            result = ZSTDv05_decodeFrameHeader_Part2(dctx, dctx->headerBuffer, dctx->headerSize);
+        /* get frame header */
+        {   size_t const result = ZSTDv05_decodeFrameHeader_Part2(dctx, dctx->headerBuffer, dctx->headerSize);
             if (ZSTDv05_isError(result)) return result;
             dctx->expected = ZSTDv05_blockHeaderSize;
             dctx->stage = ZSTDv05ds_decodeBlockHeader;
diff --git a/lib/legacy/zstd_v06.c b/lib/legacy/zstd_v06.c
index 2640c86..ce6967e 100644
--- a/lib/legacy/zstd_v06.c
+++ b/lib/legacy/zstd_v06.c
@@ -36,7 +36,7 @@
 #include "zstd_v06.h"
 #include <stddef.h>    /* size_t, ptrdiff_t */
 #include <string.h>    /* memcpy */
-#include <stdlib.h>    /* malloc, free, qsort */ 
+#include <stdlib.h>    /* malloc, free, qsort */
 
 
 
@@ -535,8 +535,6 @@ ZSTDLIB_API size_t ZSTDv06_decompress_usingPreparedDCtx(
 
 
 
-struct ZSTDv06_frameParams_s { U64 frameContentSize; U32 windowLog; };
-
 #define ZSTDv06_FRAMEHEADERSIZE_MAX 13    /* for static allocation */
 static const size_t ZSTDv06_frameHeaderSize_min = 5;
 static const size_t ZSTDv06_frameHeaderSize_max = ZSTDv06_FRAMEHEADERSIZE_MAX;
diff --git a/lib/legacy/zstd_v06.h b/lib/legacy/zstd_v06.h
index 55619be..177f148 100644
--- a/lib/legacy/zstd_v06.h
+++ b/lib/legacy/zstd_v06.h
@@ -107,7 +107,7 @@ ZSTDLIB_API size_t ZSTDv06_decompress_usingDict(ZSTDv06_DCtx* dctx,
 /*-************************
 *  Advanced Streaming API
 ***************************/
-
+struct ZSTDv06_frameParams_s { unsigned long long frameContentSize; unsigned windowLog; };
 typedef struct ZSTDv06_frameParams_s ZSTDv06_frameParams;
 
 ZSTDLIB_API size_t ZSTDv06_getFrameParams(ZSTDv06_frameParams* fparamsPtr, const void* src, size_t srcSize);   /**< doesn't consume input */
diff --git a/lib/legacy/zstd_v06.c b/lib/legacy/zstd_v07.c
similarity index 57%
copy from lib/legacy/zstd_v06.c
copy to lib/legacy/zstd_v07.c
index 2640c86..d95fd43 100644
--- a/lib/legacy/zstd_v06.c
+++ b/lib/legacy/zstd_v07.c
@@ -1,6 +1,6 @@
 /* ******************************************************************
-   zstd_v06.c
-   Decompression module for ZSTD v0.6 legacy format
+   zstd_v07.c
+   Decompression module for ZSTD v0.7 legacy format
    Copyright (C) 2016, Yann Collet.
 
    BSD 2-Clause License (http://www.opensource.org/licenses/bsd-license.php)
@@ -33,11 +33,169 @@
 ****************************************************************** */
 
 /*- Dependencies -*/
-#include "zstd_v06.h"
-#include <stddef.h>    /* size_t, ptrdiff_t */
-#include <string.h>    /* memcpy */
-#include <stdlib.h>    /* malloc, free, qsort */ 
+#include <stddef.h>     /* size_t, ptrdiff_t */
+#include <string.h>     /* memcpy */
+#include <stdlib.h>     /* malloc, free, qsort */
 
+#define XXH_STATIC_LINKING_ONLY   /* XXH64_state_t */
+#include "xxhash.h"      /* XXH64_* */
+#include "zstd_v07.h"
+
+#define FSEv07_STATIC_LINKING_ONLY  /* FSEv07_MIN_TABLELOG */
+#define HUFv07_STATIC_LINKING_ONLY  /* HUFv07_TABLELOG_ABSOLUTEMAX */
+#define ZSTDv07_STATIC_LINKING_ONLY
+
+
+#ifdef ZSTDv07_STATIC_LINKING_ONLY
+
+/* ====================================================================================
+ * The definitions in this section are considered experimental.
+ * They should never be used with a dynamic library, as they may change in the future.
+ * They are provided for advanced usages.
+ * Use them only in association with static linking.
+ * ==================================================================================== */
+
+/*--- Constants ---*/
+#define ZSTDv07_MAGIC_SKIPPABLE_START  0x184D2A50U
+
+#define ZSTDv07_WINDOWLOG_MAX_32  25
+#define ZSTDv07_WINDOWLOG_MAX_64  27
+#define ZSTDv07_WINDOWLOG_MAX    ((U32)(MEM_32bits() ? ZSTDv07_WINDOWLOG_MAX_32 : ZSTDv07_WINDOWLOG_MAX_64))
+#define ZSTDv07_WINDOWLOG_MIN     18
+#define ZSTDv07_CHAINLOG_MAX     (ZSTDv07_WINDOWLOG_MAX+1)
+#define ZSTDv07_CHAINLOG_MIN       4
+#define ZSTDv07_HASHLOG_MAX       ZSTDv07_WINDOWLOG_MAX
+#define ZSTDv07_HASHLOG_MIN       12
+#define ZSTDv07_HASHLOG3_MAX      17
+#define ZSTDv07_SEARCHLOG_MAX    (ZSTDv07_WINDOWLOG_MAX-1)
+#define ZSTDv07_SEARCHLOG_MIN      1
+#define ZSTDv07_SEARCHLENGTH_MAX   7
+#define ZSTDv07_SEARCHLENGTH_MIN   3
+#define ZSTDv07_TARGETLENGTH_MIN   4
+#define ZSTDv07_TARGETLENGTH_MAX 999
+
+#define ZSTDv07_FRAMEHEADERSIZE_MAX 18    /* for static allocation */
+static const size_t ZSTDv07_frameHeaderSize_min = 5;
+static const size_t ZSTDv07_frameHeaderSize_max = ZSTDv07_FRAMEHEADERSIZE_MAX;
+static const size_t ZSTDv07_skippableHeaderSize = 8;  /* magic number + skippable frame length */
+
+
+/* custom memory allocation functions */
+typedef void* (*ZSTDv07_allocFunction) (void* opaque, size_t size);
+typedef void  (*ZSTDv07_freeFunction) (void* opaque, void* address);
+typedef struct { ZSTDv07_allocFunction customAlloc; ZSTDv07_freeFunction customFree; void* opaque; } ZSTDv07_customMem;
+
+
+/*--- Advanced Decompression functions ---*/
+
+/*! ZSTDv07_estimateDCtxSize() :
+ *  Gives the potential amount of memory allocated to create a ZSTDv07_DCtx */
+ZSTDLIB_API size_t ZSTDv07_estimateDCtxSize(void);
+
+/*! ZSTDv07_createDCtx_advanced() :
+ *  Create a ZSTD decompression context using external alloc and free functions */
+ZSTDLIB_API ZSTDv07_DCtx* ZSTDv07_createDCtx_advanced(ZSTDv07_customMem customMem);
+
+/*! ZSTDv07_sizeofDCtx() :
+ *  Gives the amount of memory used by a given ZSTDv07_DCtx */
+ZSTDLIB_API size_t ZSTDv07_sizeofDCtx(const ZSTDv07_DCtx* dctx);
+
+
+/* ******************************************************************
+*  Buffer-less streaming functions (synchronous mode)
+********************************************************************/
+
+ZSTDLIB_API size_t ZSTDv07_decompressBegin(ZSTDv07_DCtx* dctx);
+ZSTDLIB_API size_t ZSTDv07_decompressBegin_usingDict(ZSTDv07_DCtx* dctx, const void* dict, size_t dictSize);
+ZSTDLIB_API void   ZSTDv07_copyDCtx(ZSTDv07_DCtx* dctx, const ZSTDv07_DCtx* preparedDCtx);
+
+ZSTDLIB_API size_t ZSTDv07_nextSrcSizeToDecompress(ZSTDv07_DCtx* dctx);
+ZSTDLIB_API size_t ZSTDv07_decompressContinue(ZSTDv07_DCtx* dctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize);
+
+/*
+  Buffer-less streaming decompression (synchronous mode)
+
+  A ZSTDv07_DCtx object is required to track streaming operations.
+  Use ZSTDv07_createDCtx() / ZSTDv07_freeDCtx() to manage it.
+  A ZSTDv07_DCtx object can be re-used multiple times.
+
+  First optional operation is to retrieve frame parameters, using ZSTDv07_getFrameParams(), which doesn't consume the input.
+  It can provide the minimum size of rolling buffer required to properly decompress data (`windowSize`),
+  and optionally the final size of uncompressed content.
+  (Note : content size is an optional info that may not be present. 0 means : content size unknown)
+  Frame parameters are extracted from the beginning of compressed frame.
+  The amount of data to read is variable, from ZSTDv07_frameHeaderSize_min to ZSTDv07_frameHeaderSize_max (so if `srcSize` >= ZSTDv07_frameHeaderSize_max, it will always work)
+  If `srcSize` is too small for operation to succeed, function will return the minimum size it requires to produce a result.
+  Result : 0 when successful, it means the ZSTDv07_frameParams structure has been filled.
+          >0 : means there is not enough data in `src`. Provides the size expected to successfully decode the header.
+           errorCode, which can be tested using ZSTDv07_isError()
+
+  Start decompression, with ZSTDv07_decompressBegin() or ZSTDv07_decompressBegin_usingDict().
+  Alternatively, you can copy a prepared context, using ZSTDv07_copyDCtx().
+
+  Then use ZSTDv07_nextSrcSizeToDecompress() and ZSTDv07_decompressContinue() alternately.
+  ZSTDv07_nextSrcSizeToDecompress() tells how many bytes to provide as 'srcSize' to ZSTDv07_decompressContinue().
+  ZSTDv07_decompressContinue() requires this exact amount of bytes, or it will fail.
+
+  @result of ZSTDv07_decompressContinue() is the number of bytes regenerated within 'dst' (necessarily <= dstCapacity).
+  It can be zero, which is not an error; it just means ZSTDv07_decompressContinue() has decoded some header.
+
+  ZSTDv07_decompressContinue() needs previous data blocks during decompression, up to `windowSize`.
+  They should preferably be located contiguously, prior to current block.
+  Alternatively, a round buffer of sufficient size is also possible. Sufficient size is determined by frame parameters.
+  ZSTDv07_decompressContinue() is very sensitive to contiguity,
+  if 2 blocks don't follow each other, make sure that either the compressor breaks contiguity at the same place,
+    or that previous contiguous segment is large enough to properly handle maximum back-reference.
+
+  A frame is fully decoded when ZSTDv07_nextSrcSizeToDecompress() returns zero.
+  Context can then be reset to start a new decompression.
+
+
+  == Special case : skippable frames ==
+
+  Skippable frames allow the integration of user-defined data into a flow of concatenated frames.
+  Skippable frames will be ignored (skipped) by a decompressor. The format of a skippable frame is as follows:
+  a) Skippable frame ID - 4 Bytes, Little endian format, any value from 0x184D2A50 to 0x184D2A5F
+  b) Frame Size - 4 Bytes, Little endian format, unsigned 32-bits
+  c) Frame Content - any content (User Data) of length equal to Frame Size
+  For skippable frames ZSTDv07_decompressContinue() always returns 0.
+  For skippable frames ZSTDv07_getFrameParams() returns fparamsPtr->windowLog==0, which means that the frame is skippable.
+  It also returns Frame Size as fparamsPtr->frameContentSize.
+*/
+
+
+/* **************************************
+*  Block functions
+****************************************/
+/*! Block functions produce and decode raw zstd blocks, without frame metadata.
+    Frame metadata cost is typically ~18 bytes, which can be non-negligible for very small blocks (< 100 bytes).
+    The user is responsible for keeping the information required to regenerate data, such as compressed and content sizes.
+
+    A few rules to respect :
+    - Compressing and decompressing require a context structure
+      + Use ZSTDv07_createCCtx() and ZSTDv07_createDCtx()
+    - It is necessary to init context before starting
+      + compression : ZSTDv07_compressBegin()
+      + decompression : ZSTDv07_decompressBegin()
+      + variants _usingDict() are also allowed
+      + copyCCtx() and copyDCtx() work too
+    - Block size is limited, it must be <= ZSTDv07_getBlockSizeMax()
+      + If you need to compress more, cut data into multiple blocks
+      + Consider using the regular ZSTDv07_compress() instead, as frame metadata costs become negligible when source size is large.
+    - When a block is considered not compressible enough, ZSTDv07_compressBlock() result will be zero.
+      In which case, nothing is produced into `dst`.
+      + User must test for such outcome and deal directly with uncompressed data
+      + ZSTDv07_decompressBlock() doesn't accept uncompressed data as input !!!
+      + In case of multiple successive blocks, decoder must be informed of uncompressed block existence to follow proper history.
+        Use ZSTDv07_insertBlock() in such a case.
+*/
+
+#define ZSTDv07_BLOCKSIZE_ABSOLUTEMAX (128 * 1024)   /* define, for static allocation */
+ZSTDLIB_API size_t ZSTDv07_decompressBlock(ZSTDv07_DCtx* dctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize);
+ZSTDLIB_API size_t ZSTDv07_insertBlock(ZSTDv07_DCtx* dctx, const void* blockStart, size_t blockSize);  /**< insert block into `dctx` history. Useful for uncompressed blocks */
+
+
+#endif   /* ZSTDv07_STATIC_LINKING_ONLY */
 
 
 /* ******************************************************************
@@ -81,10 +239,13 @@
 extern "C" {
 #endif
 
-
 /*-****************************************
 *  Compiler specifics
 ******************************************/
+#if defined(_MSC_VER)   /* Visual Studio */
+#   include <stdlib.h>  /* _byteswap_ulong */
+#   include <intrin.h>  /* _byteswap_* */
+#endif
 #if defined(__GNUC__)
 #  define MEM_STATIC static __attribute__((unused))
 #elif defined (__cplusplus) || (defined (__STDC_VERSION__) && (__STDC_VERSION__ >= 199901L) /* C99 */)
@@ -95,6 +256,10 @@ extern "C" {
 #  define MEM_STATIC static  /* this version may generate warnings for unused static functions; disable the relevant warning */
 #endif
 
+/* code only tested on 32 and 64 bits systems */
+#define MEM_STATIC_ASSERT(c)   { enum { XXH_static_assert = 1/(int)(!!(c)) }; }
+MEM_STATIC void MEM_check(void) { MEM_STATIC_ASSERT((sizeof(size_t)==4) || (sizeof(size_t)==8)); }
+
 
 /*-**************************************************************
 *  Basic Types
@@ -449,175 +614,39 @@ extern "C" {
 *  error codes list
 ******************************************/
 typedef enum {
-  ZSTDv06_error_no_error,
-  ZSTDv06_error_GENERIC,
-  ZSTDv06_error_prefix_unknown,
-  ZSTDv06_error_frameParameter_unsupported,
-  ZSTDv06_error_frameParameter_unsupportedBy32bits,
-  ZSTDv06_error_compressionParameter_unsupported,
-  ZSTDv06_error_init_missing,
-  ZSTDv06_error_memory_allocation,
-  ZSTDv06_error_stage_wrong,
-  ZSTDv06_error_dstSize_tooSmall,
-  ZSTDv06_error_srcSize_wrong,
-  ZSTDv06_error_corruption_detected,
-  ZSTDv06_error_tableLog_tooLarge,
-  ZSTDv06_error_maxSymbolValue_tooLarge,
-  ZSTDv06_error_maxSymbolValue_tooSmall,
-  ZSTDv06_error_dictionary_corrupted,
-  ZSTDv06_error_maxCode
-} ZSTDv06_ErrorCode;
-
-/* note : compare with size_t function results using ZSTDv06_getError() */
-
-
-#if defined (__cplusplus)
-}
-#endif
-
-#endif /* ERROR_PUBLIC_H_MODULE */
-/*
-    zstd - standard compression library
-    Header File for static linking only
-    Copyright (C) 2014-2016, Yann Collet.
-
-    BSD 2-Clause License (http://www.opensource.org/licenses/bsd-license.php)
-
-    Redistribution and use in source and binary forms, with or without
-    modification, are permitted provided that the following conditions are
-    met:
-    * Redistributions of source code must retain the above copyright
-    notice, this list of conditions and the following disclaimer.
-    * Redistributions in binary form must reproduce the above
-    copyright notice, this list of conditions and the following disclaimer
-    in the documentation and/or other materials provided with the
-    distribution.
-    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
-    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
-    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
-    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
-    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
-    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
-    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
-    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
-    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
-    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
-    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
-
-    You can contact the author at :
-    - zstd homepage : http://www.zstd.net
-*/
-#ifndef ZSTDv06_STATIC_H
-#define ZSTDv06_STATIC_H
-
-/* The prototypes defined within this file are considered experimental.
- * They should not be used in the context DLL as they may change in the future.
- * Prefer static linking if you need them, to control breaking version changes issues.
- */
-
-#if defined (__cplusplus)
-extern "C" {
-#endif
-
-
-
-/*- Advanced Decompression functions -*/
-
-/*! ZSTDv06_decompress_usingPreparedDCtx() :
-*   Same as ZSTDv06_decompress_usingDict, but using a reference context `preparedDCtx`, where dictionary has been loaded.
-*   It avoids reloading the dictionary each time.
-*   `preparedDCtx` must have been properly initialized using ZSTDv06_decompressBegin_usingDict().
-*   Requires 2 contexts : 1 for reference (preparedDCtx), which will not be modified, and 1 to run the decompression operation (dctx) */
-ZSTDLIB_API size_t ZSTDv06_decompress_usingPreparedDCtx(
-                                           ZSTDv06_DCtx* dctx, const ZSTDv06_DCtx* preparedDCtx,
-                                           void* dst, size_t dstCapacity,
-                                     const void* src, size_t srcSize);
-
-
-
-struct ZSTDv06_frameParams_s { U64 frameContentSize; U32 windowLog; };
-
-#define ZSTDv06_FRAMEHEADERSIZE_MAX 13    /* for static allocation */
-static const size_t ZSTDv06_frameHeaderSize_min = 5;
-static const size_t ZSTDv06_frameHeaderSize_max = ZSTDv06_FRAMEHEADERSIZE_MAX;
-
-ZSTDLIB_API size_t ZSTDv06_decompressBegin(ZSTDv06_DCtx* dctx);
-
-/*
-  Streaming decompression, direct mode (bufferless)
-
-  A ZSTDv06_DCtx object is required to track streaming operations.
-  Use ZSTDv06_createDCtx() / ZSTDv06_freeDCtx() to manage it.
-  A ZSTDv06_DCtx object can be re-used multiple times.
-
-  First optional operation is to retrieve frame parameters, using ZSTDv06_getFrameParams(), which doesn't consume the input.
-  It can provide the minimum size of rolling buffer required to properly decompress data,
-  and optionally the final size of uncompressed content.
-  (Note : content size is an optional info that may not be present. 0 means : content size unknown)
-  Frame parameters are extracted from the beginning of compressed frame.
-  The amount of data to read is variable, from ZSTDv06_frameHeaderSize_min to ZSTDv06_frameHeaderSize_max (so if `srcSize` >= ZSTDv06_frameHeaderSize_max, it will always work)
-  If `srcSize` is too small for operation to succeed, function will return the minimum size it requires to produce a result.
-  Result : 0 when successful, it means the ZSTDv06_frameParams structure has been filled.
-          >0 : means there is not enough data into `src`. Provides the expected size to successfully decode header.
-           errorCode, which can be tested using ZSTDv06_isError()
-
-  Start decompression, with ZSTDv06_decompressBegin() or ZSTDv06_decompressBegin_usingDict().
-  Alternatively, you can copy a prepared context, using ZSTDv06_copyDCtx().
-
-  Then use ZSTDv06_nextSrcSizeToDecompress() and ZSTDv06_decompressContinue() alternatively.
-  ZSTDv06_nextSrcSizeToDecompress() tells how much bytes to provide as 'srcSize' to ZSTDv06_decompressContinue().
-  ZSTDv06_decompressContinue() requires this exact amount of bytes, or it will fail.
-  ZSTDv06_decompressContinue() needs previous data blocks during decompression, up to (1 << windowlog).
-  They should preferably be located contiguously, prior to current block. Alternatively, a round buffer is also possible.
-
-  @result of ZSTDv06_decompressContinue() is the number of bytes regenerated within 'dst' (necessarily <= dstCapacity)
-  It can be zero, which is not an error; it just means ZSTDv06_decompressContinue() has decoded some header.
-
-  A frame is fully decoded when ZSTDv06_nextSrcSizeToDecompress() returns zero.
-  Context can then be reset to start a new decompression.
-*/
-
-
-/* **************************************
-*  Block functions
-****************************************/
-/*! Block functions produce and decode raw zstd blocks, without frame metadata.
-    User will have to take in charge required information to regenerate data, such as compressed and content sizes.
-
-    A few rules to respect :
-    - Uncompressed block size must be <= ZSTDv06_BLOCKSIZE_MAX (128 KB)
-    - Compressing or decompressing requires a context structure
-      + Use ZSTDv06_createCCtx() and ZSTDv06_createDCtx()
-    - It is necessary to init context before starting
-      + compression : ZSTDv06_compressBegin()
-      + decompression : ZSTDv06_decompressBegin()
-      + variants _usingDict() are also allowed
-      + copyCCtx() and copyDCtx() work too
-    - When a block is considered not compressible enough, ZSTDv06_compressBlock() result will be zero.
-      In which case, nothing is produced into `dst`.
-      + User must test for such outcome and deal directly with uncompressed data
-      + ZSTDv06_decompressBlock() doesn't accept uncompressed data as input !!
-*/
-
-#define ZSTDv06_BLOCKSIZE_MAX (128 * 1024)   /* define, for static allocation */
-ZSTDLIB_API size_t ZSTDv06_decompressBlock(ZSTDv06_DCtx* dctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize);
-
-
-/*-*************************************
-*  Error management
-***************************************/
-/*! ZSTDv06_getErrorCode() :
-    convert a `size_t` function result into a `ZSTDv06_ErrorCode` enum type,
+  ZSTDv07_error_no_error,
+  ZSTDv07_error_GENERIC,
+  ZSTDv07_error_prefix_unknown,
+  ZSTDv07_error_frameParameter_unsupported,
+  ZSTDv07_error_frameParameter_unsupportedBy32bits,
+  ZSTDv07_error_compressionParameter_unsupported,
+  ZSTDv07_error_init_missing,
+  ZSTDv07_error_memory_allocation,
+  ZSTDv07_error_stage_wrong,
+  ZSTDv07_error_dstSize_tooSmall,
+  ZSTDv07_error_srcSize_wrong,
+  ZSTDv07_error_corruption_detected,
+  ZSTDv07_error_checksum_wrong,
+  ZSTDv07_error_tableLog_tooLarge,
+  ZSTDv07_error_maxSymbolValue_tooLarge,
+  ZSTDv07_error_maxSymbolValue_tooSmall,
+  ZSTDv07_error_dictionary_corrupted,
+  ZSTDv07_error_dictionary_wrong,
+  ZSTDv07_error_maxCode
+} ZSTDv07_ErrorCode;
+
+/*! ZSTDv07_getErrorCode() :
+    convert a `size_t` function result into a `ZSTDv07_ErrorCode` enum type,
     which can be used to compare directly with enum list published into "error_public.h" */
-ZSTDLIB_API ZSTDv06_ErrorCode ZSTDv06_getErrorCode(size_t functionResult);
-ZSTDLIB_API const char* ZSTDv06_getErrorString(ZSTDv06_ErrorCode code);
+ZSTDv07_ErrorCode ZSTDv07_getErrorCode(size_t functionResult);
+const char* ZSTDv07_getErrorString(ZSTDv07_ErrorCode code);
 
 
 #if defined (__cplusplus)
 }
 #endif
 
-#endif  /* ZSTDv06_STATIC_H */
+#endif /* ERROR_PUBLIC_H_MODULE */
 /* ******************************************************************
    Error codes and messages
    Copyright (C) 2013-2016, Yann Collet
@@ -660,6 +689,7 @@ extern "C" {
 #endif
 
 
+
 /* ****************************************
 *  Compiler-specific
 ******************************************/
@@ -677,8 +707,8 @@ extern "C" {
 /*-****************************************
 *  Customization (error_public.h)
 ******************************************/
-typedef ZSTDv06_ErrorCode ERR_enum;
-#define PREFIX(name) ZSTDv06_error_##name
+typedef ZSTDv07_ErrorCode ERR_enum;
+#define PREFIX(name) ZSTDv07_error_##name
 
 
 /*-****************************************
@@ -715,10 +745,12 @@ ERR_STATIC const char* ERR_getErrorString(ERR_enum code)
     case PREFIX(dstSize_tooSmall): return "Destination buffer is too small";
     case PREFIX(srcSize_wrong): return "Src size incorrect";
     case PREFIX(corruption_detected): return "Corrupted block detected";
+    case PREFIX(checksum_wrong): return "Restored data doesn't match checksum";
     case PREFIX(tableLog_tooLarge): return "tableLog requires too much memory : unsupported";
     case PREFIX(maxSymbolValue_tooLarge): return "Unsupported max Symbol Value : too large";
     case PREFIX(maxSymbolValue_tooSmall): return "Specified maxSymbolValue is too small";
     case PREFIX(dictionary_corrupted): return "Dictionary is corrupted";
+    case PREFIX(dictionary_wrong): return "Dictionary mismatch";
     case PREFIX(maxCode):
     default: return notErrorCode;
     }
@@ -734,496 +766,100 @@ ERR_STATIC const char* ERR_getErrorName(size_t code)
 #endif
 
 #endif /* ERROR_H_MODULE */
-/*
-    zstd_internal - common functions to include
-    Header File for include
-    Copyright (C) 2014-2016, Yann Collet.
-
-    BSD 2-Clause License (http://www.opensource.org/licenses/bsd-license.php)
+/* ******************************************************************
+   bitstream
+   Part of FSE library
+   header file (to include)
+   Copyright (C) 2013-2016, Yann Collet.
 
-    Redistribution and use in source and binary forms, with or without
-    modification, are permitted provided that the following conditions are
-    met:
-    * Redistributions of source code must retain the above copyright
-    notice, this list of conditions and the following disclaimer.
-    * Redistributions in binary form must reproduce the above
-    copyright notice, this list of conditions and the following disclaimer
-    in the documentation and/or other materials provided with the
-    distribution.
-    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
-    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
-    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
-    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
-    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
-    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
-    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
-    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
-    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
-    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
-    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+   BSD 2-Clause License (http://www.opensource.org/licenses/bsd-license.php)
 
-    You can contact the author at :
-    - zstd homepage : https://www.zstd.net
-*/
-#ifndef ZSTDv06_CCOMMON_H_MODULE
-#define ZSTDv06_CCOMMON_H_MODULE
+   Redistribution and use in source and binary forms, with or without
+   modification, are permitted provided that the following conditions are
+   met:
 
+       * Redistributions of source code must retain the above copyright
+   notice, this list of conditions and the following disclaimer.
+       * Redistributions in binary form must reproduce the above
+   copyright notice, this list of conditions and the following disclaimer
+   in the documentation and/or other materials provided with the
+   distribution.
 
-/*-*************************************
-*  Common macros
-***************************************/
-#define MIN(a,b) ((a)<(b) ? (a) : (b))
-#define MAX(a,b) ((a)>(b) ? (a) : (b))
+   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 
+   You can contact the author at :
+   - Source repository : https://github.com/Cyan4973/FiniteStateEntropy
+****************************************************************** */
+#ifndef BITSTREAM_H_MODULE
+#define BITSTREAM_H_MODULE
 
-/*-*************************************
-*  Common constants
-***************************************/
-#define ZSTDv06_OPT_DEBUG 0     // 3 = compression stats;  5 = check encoded sequences;  9 = full logs
-#include <stdio.h>
-#if defined(ZSTDv06_OPT_DEBUG) && ZSTDv06_OPT_DEBUG>=9
-    #define ZSTDv06_LOG_PARSER(...) printf(__VA_ARGS__)
-    #define ZSTDv06_LOG_ENCODE(...) printf(__VA_ARGS__)
-    #define ZSTDv06_LOG_BLOCK(...) printf(__VA_ARGS__)
-#else
-    #define ZSTDv06_LOG_PARSER(...)
-    #define ZSTDv06_LOG_ENCODE(...)
-    #define ZSTDv06_LOG_BLOCK(...)
+#if defined (__cplusplus)
+extern "C" {
 #endif
 
-#define ZSTDv06_OPT_NUM    (1<<12)
-#define ZSTDv06_DICT_MAGIC  0xEC30A436
 
-#define ZSTDv06_REP_NUM    3
-#define ZSTDv06_REP_INIT   ZSTDv06_REP_NUM
-#define ZSTDv06_REP_MOVE   (ZSTDv06_REP_NUM-1)
+/*
+*  This API consists of small unitary functions, which must be inlined for best performance.
+*  Since link-time-optimization is not available for all compilers,
+*  these functions are defined into a .h to be included.
+*/
 
-#define KB *(1 <<10)
-#define MB *(1 <<20)
-#define GB *(1U<<30)
 
-#define BIT7 128
-#define BIT6  64
-#define BIT5  32
-#define BIT4  16
-#define BIT1   2
-#define BIT0   1
+/*=========================================
+*  Target specific
+=========================================*/
+#if defined(__BMI__) && defined(__GNUC__)
+#  include <immintrin.h>   /* support for bextr (experimental) */
+#endif
 
-#define ZSTDv06_WINDOWLOG_ABSOLUTEMIN 12
-static const size_t ZSTDv06_fcs_fieldSize[4] = { 0, 1, 2, 8 };
+/*-********************************************
+*  bitStream decoding API (read backward)
+**********************************************/
+typedef struct
+{
+    size_t   bitContainer;
+    unsigned bitsConsumed;
+    const char* ptr;
+    const char* start;
+} BITv07_DStream_t;
 
-#define ZSTDv06_BLOCKHEADERSIZE 3   /* because C standard does not allow a static const value to be defined using another static const value .... :( */
-static const size_t ZSTDv06_blockHeaderSize = ZSTDv06_BLOCKHEADERSIZE;
-typedef enum { bt_compressed, bt_raw, bt_rle, bt_end } blockType_t;
+typedef enum { BITv07_DStream_unfinished = 0,
+               BITv07_DStream_endOfBuffer = 1,
+               BITv07_DStream_completed = 2,
+               BITv07_DStream_overflow = 3 } BITv07_DStream_status;  /* result of BITv07_reloadDStream() */
+               /* 1,2,4,8 would be better for bitmap combinations, but slows down performance a bit ... :( */
 
-#define MIN_SEQUENCES_SIZE 1 /* nbSeq==0 */
-#define MIN_CBLOCK_SIZE (1 /*litCSize*/ + 1 /* RLE or RAW */ + MIN_SEQUENCES_SIZE /* nbSeq==0 */)   /* for a non-null block */
+MEM_STATIC size_t   BITv07_initDStream(BITv07_DStream_t* bitD, const void* srcBuffer, size_t srcSize);
+MEM_STATIC size_t   BITv07_readBits(BITv07_DStream_t* bitD, unsigned nbBits);
+MEM_STATIC BITv07_DStream_status BITv07_reloadDStream(BITv07_DStream_t* bitD);
+MEM_STATIC unsigned BITv07_endOfDStream(const BITv07_DStream_t* bitD);
 
-#define HufLog 12
-
-#define IS_HUF 0
-#define IS_PCH 1
-#define IS_RAW 2
-#define IS_RLE 3
-
-#define LONGNBSEQ 0x7F00
-
-#define MINMATCH 3
-#define EQUAL_READ32 4
-#define REPCODE_STARTVALUE 1
-
-#define Litbits  8
-#define MaxLit ((1<<Litbits) - 1)
-#define MaxML  52
-#define MaxLL  35
-#define MaxOff 28
-#define MaxSeq MAX(MaxLL, MaxML)   /* Assumption : MaxOff < MaxLL,MaxML */
-#define MLFSELog    9
-#define LLFSELog    9
-#define OffFSELog   8
-
-#define FSEv06_ENCODING_RAW     0
-#define FSEv06_ENCODING_RLE     1
-#define FSEv06_ENCODING_STATIC  2
-#define FSEv06_ENCODING_DYNAMIC 3
-
-static const U32 LL_bits[MaxLL+1] = { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
-                                      1, 1, 1, 1, 2, 2, 3, 3, 4, 6, 7, 8, 9,10,11,12,
-                                     13,14,15,16 };
-static const S16 LL_defaultNorm[MaxLL+1] = { 4, 3, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1,
-                                             2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 2, 1, 1, 1, 1, 1,
-                                            -1,-1,-1,-1 };
-static const U32 LL_defaultNormLog = 6;
-
-static const U32 ML_bits[MaxML+1] = { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
-                                      0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
-                                      1, 1, 1, 1, 2, 2, 3, 3, 4, 4, 5, 7, 8, 9,10,11,
-                                     12,13,14,15,16 };
-static const S16 ML_defaultNorm[MaxML+1] = { 1, 4, 3, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1,
-                                             1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
-                                             1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,-1,-1,
-                                            -1,-1,-1,-1,-1 };
-static const U32 ML_defaultNormLog = 6;
-
-static const S16 OF_defaultNorm[MaxOff+1] = { 1, 1, 1, 1, 1, 1, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1,
-                                              1, 1, 1, 1, 1, 1, 1, 1,-1,-1,-1,-1,-1 };
-static const U32 OF_defaultNormLog = 5;
-
-
-/*-*******************************************
-*  Shared functions to include for inlining
-*********************************************/
-static void ZSTDv06_copy8(void* dst, const void* src) { memcpy(dst, src, 8); }
-#define COPY8(d,s) { ZSTDv06_copy8(d,s); d+=8; s+=8; }
-
-/*! ZSTDv06_wildcopy() :
-*   custom version of memcpy(), can copy up to 7 bytes too many (8 bytes if length==0) */
-#define WILDCOPY_OVERLENGTH 8
-MEM_STATIC void ZSTDv06_wildcopy(void* dst, const void* src, size_t length)
-{
-    const BYTE* ip = (const BYTE*)src;
-    BYTE* op = (BYTE*)dst;
-    BYTE* const oend = op + length;
-    do
-        COPY8(op, ip)
-    while (op < oend);
-}
-
-MEM_STATIC unsigned ZSTDv06_highbit(U32 val)
-{
-#   if defined(_MSC_VER)   /* Visual */
-    unsigned long r=0;
-    _BitScanReverse(&r, val);
-    return (unsigned)r;
-#   elif defined(__GNUC__) && (__GNUC__ >= 3)   /* GCC Intrinsic */
-    return 31 - __builtin_clz(val);
-#   else   /* Software version */
-    static const int DeBruijnClz[32] = { 0, 9, 1, 10, 13, 21, 2, 29, 11, 14, 16, 18, 22, 25, 3, 30, 8, 12, 20, 28, 15, 17, 24, 7, 19, 27, 23, 6, 26, 5, 4, 31 };
-    U32 v = val;
-    int r;
-    v |= v >> 1;
-    v |= v >> 2;
-    v |= v >> 4;
-    v |= v >> 8;
-    v |= v >> 16;
-    r = DeBruijnClz[(U32)(v * 0x07C4ACDDU) >> 27];
-    return r;
-#   endif
-}
-
-
-/*-*******************************************
-*  Private interfaces
-*********************************************/
-typedef struct {
-    U32 off;
-    U32 len;
-} ZSTDv06_match_t;
-
-typedef struct {
-    U32 price;
-    U32 off;
-    U32 mlen;
-    U32 litlen;
-    U32 rep[ZSTDv06_REP_INIT];
-} ZSTDv06_optimal_t;
-
-#if ZSTDv06_OPT_DEBUG == 3
-    #include ".debug/zstd_stats.h"
-#else
-    typedef struct { U32  unused; } ZSTDv06_stats_t;
-    MEM_STATIC void ZSTDv06_statsPrint(ZSTDv06_stats_t* stats, U32 searchLength) { (void)stats; (void)searchLength; }
-    MEM_STATIC void ZSTDv06_statsInit(ZSTDv06_stats_t* stats) { (void)stats; }
-    MEM_STATIC void ZSTDv06_statsResetFreqs(ZSTDv06_stats_t* stats) { (void)stats; }
-    MEM_STATIC void ZSTDv06_statsUpdatePrices(ZSTDv06_stats_t* stats, size_t litLength, const BYTE* literals, size_t offset, size_t matchLength) { (void)stats; (void)litLength; (void)literals; (void)offset; (void)matchLength; }
-#endif
-
-typedef struct {
-    void* buffer;
-    U32*  offsetStart;
-    U32*  offset;
-    BYTE* offCodeStart;
-    BYTE* litStart;
-    BYTE* lit;
-    U16*  litLengthStart;
-    U16*  litLength;
-    BYTE* llCodeStart;
-    U16*  matchLengthStart;
-    U16*  matchLength;
-    BYTE* mlCodeStart;
-    U32   longLengthID;   /* 0 == no longLength; 1 == Lit.longLength; 2 == Match.longLength; */
-    U32   longLengthPos;
-    /* opt */
-    ZSTDv06_optimal_t* priceTable;
-    ZSTDv06_match_t* matchTable;
-    U32* matchLengthFreq;
-    U32* litLengthFreq;
-    U32* litFreq;
-    U32* offCodeFreq;
-    U32  matchLengthSum;
-    U32  matchSum;
-    U32  litLengthSum;
-    U32  litSum;
-    U32  offCodeSum;
-    U32  log2matchLengthSum;
-    U32  log2matchSum;
-    U32  log2litLengthSum;
-    U32  log2litSum;
-    U32  log2offCodeSum;
-    U32  factor;
-    U32  cachedPrice;
-    U32  cachedLitLength;
-    const BYTE* cachedLiterals;
-    ZSTDv06_stats_t stats;
-} seqStore_t;
-
-void ZSTDv06_seqToCodes(const seqStore_t* seqStorePtr, size_t const nbSeq);
-
-
-#endif   /* ZSTDv06_CCOMMON_H_MODULE */
-/* ******************************************************************
-   FSE : Finite State Entropy codec
-   Public Prototypes declaration
-   Copyright (C) 2013-2016, Yann Collet.
-
-   BSD 2-Clause License (http://www.opensource.org/licenses/bsd-license.php)
-
-   Redistribution and use in source and binary forms, with or without
-   modification, are permitted provided that the following conditions are
-   met:
-
-       * Redistributions of source code must retain the above copyright
-   notice, this list of conditions and the following disclaimer.
-       * Redistributions in binary form must reproduce the above
-   copyright notice, this list of conditions and the following disclaimer
-   in the documentation and/or other materials provided with the
-   distribution.
-
-   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
-   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
-   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
-   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
-   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
-   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
-   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
-   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
-   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
-   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
-   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
-
-   You can contact the author at :
-   - Source repository : https://github.com/Cyan4973/FiniteStateEntropy
-****************************************************************** */
-#ifndef FSEv06_H
-#define FSEv06_H
-
-#if defined (__cplusplus)
-extern "C" {
-#endif
-
-
-
-/*-****************************************
-*  FSE simple functions
-******************************************/
-/*! FSEv06_decompress():
-    Decompress FSE data from buffer 'cSrc', of size 'cSrcSize',
-    into already allocated destination buffer 'dst', of size 'dstCapacity'.
-    @return : size of regenerated data (<= maxDstSize),
-              or an error code, which can be tested using FSEv06_isError() .
-
-    ** Important ** : FSEv06_decompress() does not decompress non-compressible nor RLE data !!!
-    Why ? : making this distinction requires a header.
-    Header management is intentionally delegated to the user layer, which can better manage special cases.
-*/
-size_t FSEv06_decompress(void* dst,  size_t dstCapacity,
-                const void* cSrc, size_t cSrcSize);
-
-
-/*-*****************************************
-*  Tool functions
-******************************************/
-size_t FSEv06_compressBound(size_t size);       /* maximum compressed size */
-
-/* Error Management */
-unsigned    FSEv06_isError(size_t code);        /* tells if a return value is an error code */
-const char* FSEv06_getErrorName(size_t code);   /* provides error code string (useful for debugging) */
-
-
-
-/*-*****************************************
-*  FSE detailed API
-******************************************/
-/*!
-
-FSEv06_decompress() does the following:
-1. read normalized counters with readNCount()
-2. build decoding table 'DTable' from normalized counters
-3. decode the data stream using decoding table 'DTable'
-
-The following API allows targeting specific sub-functions for advanced tasks.
-For example, it's possible to compress several blocks using the same 'CTable',
-or to save and provide normalized distribution using external method.
-*/
-
-
-/* *** DECOMPRESSION *** */
-
-/*! FSEv06_readNCount():
-    Read compactly saved 'normalizedCounter' from 'rBuffer'.
-    @return : size read from 'rBuffer',
-              or an errorCode, which can be tested using FSEv06_isError().
-              maxSymbolValuePtr[0] and tableLogPtr[0] will also be updated with their respective values */
-size_t FSEv06_readNCount (short* normalizedCounter, unsigned* maxSymbolValuePtr, unsigned* tableLogPtr, const void* rBuffer, size_t rBuffSize);
-
-/*! Constructor and Destructor of FSEv06_DTable.
-    Note that its size depends on 'tableLog' */
-typedef unsigned FSEv06_DTable;   /* don't allocate that. It's just a way to be more restrictive than void* */
-FSEv06_DTable* FSEv06_createDTable(unsigned tableLog);
-void        FSEv06_freeDTable(FSEv06_DTable* dt);
-
-/*! FSEv06_buildDTable():
-    Builds 'dt', which must be already allocated, using FSEv06_createDTable().
-    return : 0, or an errorCode, which can be tested using FSEv06_isError() */
-size_t FSEv06_buildDTable (FSEv06_DTable* dt, const short* normalizedCounter, unsigned maxSymbolValue, unsigned tableLog);
-
-/*! FSEv06_decompress_usingDTable():
-    Decompress compressed source `cSrc` of size `cSrcSize` using `dt`
-    into `dst` which must be already allocated.
-    @return : size of regenerated data (necessarily <= `dstCapacity`),
-              or an errorCode, which can be tested using FSEv06_isError() */
-size_t FSEv06_decompress_usingDTable(void* dst, size_t dstCapacity, const void* cSrc, size_t cSrcSize, const FSEv06_DTable* dt);
-
-/*!
-Tutorial :
-----------
-(Note : these functions only decompress FSE-compressed blocks.
- If block is uncompressed, use memcpy() instead
- If block is a single repeated byte, use memset() instead )
-
-The first step is to obtain the normalized frequencies of symbols.
-This can be performed by FSEv06_readNCount() if it was saved using FSEv06_writeNCount().
-'normalizedCounter' must be already allocated, and have at least 'maxSymbolValuePtr[0]+1' cells of signed short.
-In practice, that means it's necessary to know 'maxSymbolValue' beforehand,
-or size the table to handle worst case situations (typically 256).
-FSEv06_readNCount() will provide 'tableLog' and 'maxSymbolValue'.
-The result of FSEv06_readNCount() is the number of bytes read from 'rBuffer'.
-Note that 'rBufferSize' must be at least 4 bytes, even if useful information is less than that.
-If there is an error, the function will return an error code, which can be tested using FSEv06_isError().
-
-The next step is to build the decompression tables 'FSEv06_DTable' from 'normalizedCounter'.
-This is performed by the function FSEv06_buildDTable().
-The space required by 'FSEv06_DTable' must be already allocated using FSEv06_createDTable().
-If there is an error, the function will return an error code, which can be tested using FSEv06_isError().
-
-`FSEv06_DTable` can then be used to decompress `cSrc`, with FSEv06_decompress_usingDTable().
-`cSrcSize` must be strictly correct, otherwise decompression will fail.
-FSEv06_decompress_usingDTable() result will tell how many bytes were regenerated (<=`dstCapacity`).
-If there is an error, the function will return an error code, which can be tested using FSEv06_isError(). (ex: dst buffer too small)
-*/
-
-
-#if defined (__cplusplus)
-}
-#endif
-
-#endif  /* FSEv06_H */
-/* ******************************************************************
-   bitstream
-   Part of FSE library
-   header file (to include)
-   Copyright (C) 2013-2016, Yann Collet.
-
-   BSD 2-Clause License (http://www.opensource.org/licenses/bsd-license.php)
-
-   Redistribution and use in source and binary forms, with or without
-   modification, are permitted provided that the following conditions are
-   met:
-
-       * Redistributions of source code must retain the above copyright
-   notice, this list of conditions and the following disclaimer.
-       * Redistributions in binary form must reproduce the above
-   copyright notice, this list of conditions and the following disclaimer
-   in the documentation and/or other materials provided with the
-   distribution.
 
-   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
-   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
-   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
-   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
-   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
-   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
-   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
-   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
-   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
-   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
-   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
-
-   You can contact the author at :
-   - Source repository : https://github.com/Cyan4973/FiniteStateEntropy
-****************************************************************** */
-#ifndef BITSTREAM_H_MODULE
-#define BITSTREAM_H_MODULE
-
-#if defined (__cplusplus)
-extern "C" {
-#endif
-
-
-/*
-*  This API consists of small unitary functions, which must be inlined for best performance.
-*  Since link-time-optimization is not available for all compilers,
-*  these functions are defined into a .h to be included.
-*/
-
-
-/*=========================================
-*  Target specific
-=========================================*/
-#if defined(__BMI__) && defined(__GNUC__)
-#  include <immintrin.h>   /* support for bextr (experimental) */
-#endif
-
-
-
-/*-********************************************
-*  bitStream decoding API (read backward)
-**********************************************/
-typedef struct
-{
-    size_t   bitContainer;
-    unsigned bitsConsumed;
-    const char* ptr;
-    const char* start;
-} BITv06_DStream_t;
-
-typedef enum { BITv06_DStream_unfinished = 0,
-               BITv06_DStream_endOfBuffer = 1,
-               BITv06_DStream_completed = 2,
-               BITv06_DStream_overflow = 3 } BITv06_DStream_status;  /* result of BITv06_reloadDStream() */
-               /* 1,2,4,8 would be better for bitmap combinations, but slows down performance a bit ... :( */
-
-MEM_STATIC size_t   BITv06_initDStream(BITv06_DStream_t* bitD, const void* srcBuffer, size_t srcSize);
-MEM_STATIC size_t   BITv06_readBits(BITv06_DStream_t* bitD, unsigned nbBits);
-MEM_STATIC BITv06_DStream_status BITv06_reloadDStream(BITv06_DStream_t* bitD);
-MEM_STATIC unsigned BITv06_endOfDStream(const BITv06_DStream_t* bitD);
-
-
-/* Start by invoking BITv06_initDStream().
+/* Start by invoking BITv07_initDStream().
 *  A chunk of the bitStream is then stored into a local register.
 *  Local register size is 64-bits on 64-bits systems, 32-bits on 32-bits systems (size_t).
 *  You can then retrieve bitFields stored into the local register, **in reverse order**.
-*  Local register is explicitly reloaded from memory by the BITv06_reloadDStream() method.
-*  A reload guarantee a minimum of ((8*sizeof(bitD->bitContainer))-7) bits when its result is BITv06_DStream_unfinished.
+*  Local register is explicitly reloaded from memory by the BITv07_reloadDStream() method.
+*  A reload guarantees a minimum of ((8*sizeof(bitD->bitContainer))-7) bits when its result is BITv07_DStream_unfinished.
 *  Otherwise, it can be less than that, so proceed accordingly.
-*  Checking if DStream has reached its end can be performed with BITv06_endOfDStream().
+*  Checking if DStream has reached its end can be performed with BITv07_endOfDStream().
 */
 
 
 /*-****************************************
 *  unsafe API
 ******************************************/
-MEM_STATIC size_t BITv06_readBitsFast(BITv06_DStream_t* bitD, unsigned nbBits);
+MEM_STATIC size_t BITv07_readBitsFast(BITv07_DStream_t* bitD, unsigned nbBits);
 /* faster, but works only if nbBits >= 1 */
 
 
@@ -1231,7 +867,7 @@ MEM_STATIC size_t BITv06_readBitsFast(BITv06_DStream_t* bitD, unsigned nbBits);
 /*-**************************************************************
 *  Internal functions
 ****************************************************************/
-MEM_STATIC unsigned BITv06_highbit32 (register U32 val)
+MEM_STATIC unsigned BITv07_highbit32 (register U32 val)
 {
 #   if defined(_MSC_VER)   /* Visual */
     unsigned long r=0;
@@ -1242,32 +878,29 @@ MEM_STATIC unsigned BITv06_highbit32 (register U32 val)
 #   else   /* Software version */
     static const unsigned DeBruijnClz[32] = { 0, 9, 1, 10, 13, 21, 2, 29, 11, 14, 16, 18, 22, 25, 3, 30, 8, 12, 20, 28, 15, 17, 24, 7, 19, 27, 23, 6, 26, 5, 4, 31 };
     U32 v = val;
-    unsigned r;
     v |= v >> 1;
     v |= v >> 2;
     v |= v >> 4;
     v |= v >> 8;
     v |= v >> 16;
-    r = DeBruijnClz[ (U32) (v * 0x07C4ACDDU) >> 27];
-    return r;
+    return DeBruijnClz[ (U32) (v * 0x07C4ACDDU) >> 27];
 #   endif
 }
 
 /*=====    Local Constants   =====*/
-static const unsigned BITv06_mask[] = { 0, 1, 3, 7, 0xF, 0x1F, 0x3F, 0x7F, 0xFF, 0x1FF, 0x3FF, 0x7FF, 0xFFF, 0x1FFF, 0x3FFF, 0x7FFF, 0xFFFF, 0x1FFFF, 0x3FFFF, 0x7FFFF, 0xFFFFF, 0x1FFFFF, 0x3FFFFF, 0x7FFFFF,  0xFFFFFF, 0x1FFFFFF, 0x3FFFFFF };   /* up to 26 bits */
-
+static const unsigned BITv07_mask[] = { 0, 1, 3, 7, 0xF, 0x1F, 0x3F, 0x7F, 0xFF, 0x1FF, 0x3FF, 0x7FF, 0xFFF, 0x1FFF, 0x3FFF, 0x7FFF, 0xFFFF, 0x1FFFF, 0x3FFFF, 0x7FFFF, 0xFFFFF, 0x1FFFFF, 0x3FFFFF, 0x7FFFFF,  0xFFFFFF, 0x1FFFFFF, 0x3FFFFFF };   /* up to 26 bits */
 
 
 /*-********************************************************
 * bitStream decoding
 **********************************************************/
-/*! BITv06_initDStream() :
-*   Initialize a BITv06_DStream_t.
-*   `bitD` : a pointer to an already allocated BITv06_DStream_t structure.
+/*! BITv07_initDStream() :
+*   Initialize a BITv07_DStream_t.
+*   `bitD` : a pointer to an already allocated BITv07_DStream_t structure.
 *   `srcSize` must be the *exact* size of the bitStream, in bytes.
 *   @return : size of stream (== srcSize) or an errorCode if a problem is detected
 */
-MEM_STATIC size_t BITv06_initDStream(BITv06_DStream_t* bitD, const void* srcBuffer, size_t srcSize)
+MEM_STATIC size_t BITv07_initDStream(BITv07_DStream_t* bitD, const void* srcBuffer, size_t srcSize)
 {
     if (srcSize < 1) { memset(bitD, 0, sizeof(*bitD)); return ERROR(srcSize_wrong); }
 
@@ -1276,8 +909,8 @@ MEM_STATIC size_t BITv06_initDStream(BITv06_DStream_t* bitD, const void* srcBuff
         bitD->ptr   = (const char*)srcBuffer + srcSize - sizeof(bitD->bitContainer);
         bitD->bitContainer = MEM_readLEST(bitD->ptr);
         { BYTE const lastByte = ((const BYTE*)srcBuffer)[srcSize-1];
-          if (lastByte == 0) return ERROR(GENERIC);   /* endMark not present */
-          bitD->bitsConsumed = 8 - BITv06_highbit32(lastByte); }
+          bitD->bitsConsumed = lastByte ? 8 - BITv07_highbit32(lastByte) : 0;
+          if (lastByte == 0) return ERROR(GENERIC); /* endMark not present */ }
     } else {
         bitD->start = (const char*)srcBuffer;
         bitD->ptr   = bitD->start;
@@ -1293,20 +926,20 @@ MEM_STATIC size_t BITv06_initDStream(BITv06_DStream_t* bitD, const void* srcBuff
             default:;
         }
         { BYTE const lastByte = ((const BYTE*)srcBuffer)[srcSize-1];
-          if (lastByte == 0) return ERROR(GENERIC);   /* endMark not present */
-          bitD->bitsConsumed = 8 - BITv06_highbit32(lastByte); }
+          bitD->bitsConsumed = lastByte ? 8 - BITv07_highbit32(lastByte) : 0;
+          if (lastByte == 0) return ERROR(GENERIC); /* endMark not present */ }
         bitD->bitsConsumed += (U32)(sizeof(bitD->bitContainer) - srcSize)*8;
     }
 
     return srcSize;
 }
 
-MEM_STATIC size_t BITv06_getUpperBits(size_t bitContainer, U32 const start)
+MEM_STATIC size_t BITv07_getUpperBits(size_t bitContainer, U32 const start)
 {
     return bitContainer >> start;
 }
 
-MEM_STATIC size_t BITv06_getMiddleBits(size_t bitContainer, U32 const start, U32 const nbBits)
+MEM_STATIC size_t BITv07_getMiddleBits(size_t bitContainer, U32 const start, U32 const nbBits)
 {
 #if defined(__BMI__) && defined(__GNUC__)   /* experimental */
 #  if defined(__x86_64__)
@@ -1316,91 +949,91 @@ MEM_STATIC size_t BITv06_getMiddleBits(size_t bitContainer, U32 const start, U32
 #  endif
         return _bextr_u32(bitContainer, start, nbBits);
 #else
-    return (bitContainer >> start) & BITv06_mask[nbBits];
+    return (bitContainer >> start) & BITv07_mask[nbBits];
 #endif
 }
 
-MEM_STATIC size_t BITv06_getLowerBits(size_t bitContainer, U32 const nbBits)
+MEM_STATIC size_t BITv07_getLowerBits(size_t bitContainer, U32 const nbBits)
 {
-    return bitContainer & BITv06_mask[nbBits];
+    return bitContainer & BITv07_mask[nbBits];
 }
 
-/*! BITv06_lookBits() :
+/*! BITv07_lookBits() :
  *  Provides next n bits from local register.
  *  local register is not modified.
  *  On 32-bits, maxNbBits==24.
  *  On 64-bits, maxNbBits==56.
  *  @return : value extracted
  */
- MEM_STATIC size_t BITv06_lookBits(const BITv06_DStream_t* bitD, U32 nbBits)
+ MEM_STATIC size_t BITv07_lookBits(const BITv07_DStream_t* bitD, U32 nbBits)
 {
 #if defined(__BMI__) && defined(__GNUC__)   /* experimental; fails if bitD->bitsConsumed + nbBits > sizeof(bitD->bitContainer)*8 */
-    return BITv06_getMiddleBits(bitD->bitContainer, (sizeof(bitD->bitContainer)*8) - bitD->bitsConsumed - nbBits, nbBits);
+    return BITv07_getMiddleBits(bitD->bitContainer, (sizeof(bitD->bitContainer)*8) - bitD->bitsConsumed - nbBits, nbBits);
 #else
     U32 const bitMask = sizeof(bitD->bitContainer)*8 - 1;
     return ((bitD->bitContainer << (bitD->bitsConsumed & bitMask)) >> 1) >> ((bitMask-nbBits) & bitMask);
 #endif
 }
 
-/*! BITv06_lookBitsFast() :
+/*! BITv07_lookBitsFast() :
 *   unsafe version; only works only if nbBits >= 1 */
-MEM_STATIC size_t BITv06_lookBitsFast(const BITv06_DStream_t* bitD, U32 nbBits)
+MEM_STATIC size_t BITv07_lookBitsFast(const BITv07_DStream_t* bitD, U32 nbBits)
 {
     U32 const bitMask = sizeof(bitD->bitContainer)*8 - 1;
     return (bitD->bitContainer << (bitD->bitsConsumed & bitMask)) >> (((bitMask+1)-nbBits) & bitMask);
 }
 
-MEM_STATIC void BITv06_skipBits(BITv06_DStream_t* bitD, U32 nbBits)
+MEM_STATIC void BITv07_skipBits(BITv07_DStream_t* bitD, U32 nbBits)
 {
     bitD->bitsConsumed += nbBits;
 }
 
-/*! BITv06_readBits() :
+/*! BITv07_readBits() :
  *  Read (consume) next n bits from local register and update.
  *  Pay attention to not read more than nbBits contained into local register.
  *  @return : extracted value.
  */
-MEM_STATIC size_t BITv06_readBits(BITv06_DStream_t* bitD, U32 nbBits)
+MEM_STATIC size_t BITv07_readBits(BITv07_DStream_t* bitD, U32 nbBits)
 {
-    size_t const value = BITv06_lookBits(bitD, nbBits);
-    BITv06_skipBits(bitD, nbBits);
+    size_t const value = BITv07_lookBits(bitD, nbBits);
+    BITv07_skipBits(bitD, nbBits);
     return value;
 }
 
-/*! BITv06_readBitsFast() :
+/*! BITv07_readBitsFast() :
 *   unsafe version; only works if nbBits >= 1 */
-MEM_STATIC size_t BITv06_readBitsFast(BITv06_DStream_t* bitD, U32 nbBits)
+MEM_STATIC size_t BITv07_readBitsFast(BITv07_DStream_t* bitD, U32 nbBits)
 {
-    size_t const value = BITv06_lookBitsFast(bitD, nbBits);
-    BITv06_skipBits(bitD, nbBits);
+    size_t const value = BITv07_lookBitsFast(bitD, nbBits);
+    BITv07_skipBits(bitD, nbBits);
     return value;
 }
 
-/*! BITv06_reloadDStream() :
-*   Refill `BITv06_DStream_t` from src buffer previously defined (see BITv06_initDStream() ).
+/*! BITv07_reloadDStream() :
+*   Refill `BITv07_DStream_t` from src buffer previously defined (see BITv07_initDStream() ).
 *   This function is safe, it guarantees it will not read beyond src buffer.
-*   @return : status of `BITv06_DStream_t` internal register.
+*   @return : status of `BITv07_DStream_t` internal register.
               if status == unfinished, internal register is filled with >= (sizeof(bitD->bitContainer)*8 - 7) bits */
-MEM_STATIC BITv06_DStream_status BITv06_reloadDStream(BITv06_DStream_t* bitD)
+MEM_STATIC BITv07_DStream_status BITv07_reloadDStream(BITv07_DStream_t* bitD)
 {
-	if (bitD->bitsConsumed > (sizeof(bitD->bitContainer)*8))  /* should never happen */
-		return BITv06_DStream_overflow;
+	if (bitD->bitsConsumed > (sizeof(bitD->bitContainer)*8))  /* should not happen => corruption detected */
+		return BITv07_DStream_overflow;
 
     if (bitD->ptr >= bitD->start + sizeof(bitD->bitContainer)) {
         bitD->ptr -= bitD->bitsConsumed >> 3;
         bitD->bitsConsumed &= 7;
         bitD->bitContainer = MEM_readLEST(bitD->ptr);
-        return BITv06_DStream_unfinished;
+        return BITv07_DStream_unfinished;
     }
     if (bitD->ptr == bitD->start) {
-        if (bitD->bitsConsumed < sizeof(bitD->bitContainer)*8) return BITv06_DStream_endOfBuffer;
-        return BITv06_DStream_completed;
+        if (bitD->bitsConsumed < sizeof(bitD->bitContainer)*8) return BITv07_DStream_endOfBuffer;
+        return BITv07_DStream_completed;
     }
     {   U32 nbBytes = bitD->bitsConsumed >> 3;
-        BITv06_DStream_status result = BITv06_DStream_unfinished;
+        BITv07_DStream_status result = BITv07_DStream_unfinished;
         if (bitD->ptr - nbBytes < bitD->start) {
             nbBytes = (U32)(bitD->ptr - bitD->start);  /* ptr > start */
-            result = BITv06_DStream_endOfBuffer;
+            result = BITv07_DStream_endOfBuffer;
         }
         bitD->ptr -= nbBytes;
         bitD->bitsConsumed -= nbBytes*8;
@@ -1409,10 +1042,10 @@ MEM_STATIC BITv06_DStream_status BITv06_reloadDStream(BITv06_DStream_t* bitD)
     }
 }
 
-/*! BITv06_endOfDStream() :
+/*! BITv07_endOfDStream() :
 *   @return Tells if DStream has exactly reached its end (all bits consumed).
 */
-MEM_STATIC unsigned BITv06_endOfDStream(const BITv06_DStream_t* DStream)
+MEM_STATIC unsigned BITv07_endOfDStream(const BITv07_DStream_t* DStream)
 {
     return ((DStream->ptr == DStream->start) && (DStream->bitsConsumed == sizeof(DStream->bitContainer)*8));
 }
@@ -1423,9 +1056,9 @@ MEM_STATIC unsigned BITv06_endOfDStream(const BITv06_DStream_t* DStream)
 
 #endif /* BITSTREAM_H_MODULE */
 /* ******************************************************************
-   FSE : Finite State Entropy coder
-   header file for static linking (only)
-   Copyright (C) 2013-2015, Yann Collet
+   FSE : Finite State Entropy codec
+   Public Prototypes declaration
+   Copyright (C) 2013-2016, Yann Collet.
 
    BSD 2-Clause License (http://www.opensource.org/licenses/bsd-license.php)
 
@@ -1454,39 +1087,139 @@ MEM_STATIC unsigned BITv06_endOfDStream(const BITv06_DStream_t* DStream)
 
    You can contact the author at :
    - Source repository : https://github.com/Cyan4973/FiniteStateEntropy
-   - Public forum : https://groups.google.com/forum/#!forum/lz4c
 ****************************************************************** */
-#ifndef FSEv06_STATIC_H
-#define FSEv06_STATIC_H
+#ifndef FSEv07_H
+#define FSEv07_H
 
 #if defined (__cplusplus)
 extern "C" {
 #endif
 
 
+
+/*-****************************************
+*  FSE simple functions
+******************************************/
+
+/*! FSEv07_decompress():
+    Decompress FSE data from buffer 'cSrc', of size 'cSrcSize',
+    into already allocated destination buffer 'dst', of size 'dstCapacity'.
+    @return : size of regenerated data (<= dstCapacity),
+              or an error code, which can be tested using FSEv07_isError() .
+
+    ** Important ** : FSEv07_decompress() does not decompress non-compressible nor RLE data !!!
+    Why ? : making this distinction requires a header.
+    Header management is intentionally delegated to the user layer, which can better manage special cases.
+*/
+size_t FSEv07_decompress(void* dst,  size_t dstCapacity,
+                const void* cSrc, size_t cSrcSize);
+
+
+/* Error Management */
+unsigned    FSEv07_isError(size_t code);        /* tells if a return value is an error code */
+const char* FSEv07_getErrorName(size_t code);   /* provides error code string (useful for debugging) */
+
+
+/*-*****************************************
+*  FSE detailed API
+******************************************/
+/*!
+FSEv07_decompress() does the following:
+1. read normalized counters with readNCount()
+2. build decoding table 'DTable' from normalized counters
+3. decode the data stream using decoding table 'DTable'
+
+The following API allows targeting specific sub-functions for advanced tasks.
+For example, it's possible to compress several blocks using the same 'CTable',
+or to save and provide normalized distribution using external method.
+*/
+
+
+/* *** DECOMPRESSION *** */
+
+/*! FSEv07_readNCount():
+    Read compactly saved 'normalizedCounter' from 'rBuffer'.
+    @return : size read from 'rBuffer',
+              or an errorCode, which can be tested using FSEv07_isError().
+              maxSymbolValuePtr[0] and tableLogPtr[0] will also be updated with their respective values */
+size_t FSEv07_readNCount (short* normalizedCounter, unsigned* maxSymbolValuePtr, unsigned* tableLogPtr, const void* rBuffer, size_t rBuffSize);
+
+/*! Constructor and Destructor of FSEv07_DTable.
+    Note that its size depends on 'tableLog' */
+typedef unsigned FSEv07_DTable;   /* don't allocate that. It's just a way to be more restrictive than void* */
+FSEv07_DTable* FSEv07_createDTable(unsigned tableLog);
+void        FSEv07_freeDTable(FSEv07_DTable* dt);
+
+/*! FSEv07_buildDTable():
+    Builds 'dt', which must be already allocated, using FSEv07_createDTable().
+    return : 0, or an errorCode, which can be tested using FSEv07_isError() */
+size_t FSEv07_buildDTable (FSEv07_DTable* dt, const short* normalizedCounter, unsigned maxSymbolValue, unsigned tableLog);
+
+/*! FSEv07_decompress_usingDTable():
+    Decompress compressed source `cSrc` of size `cSrcSize` using `dt`
+    into `dst` which must be already allocated.
+    @return : size of regenerated data (necessarily <= `dstCapacity`),
+              or an errorCode, which can be tested using FSEv07_isError() */
+size_t FSEv07_decompress_usingDTable(void* dst, size_t dstCapacity, const void* cSrc, size_t cSrcSize, const FSEv07_DTable* dt);
+
+/*!
+Tutorial :
+----------
+(Note : these functions only decompress FSE-compressed blocks.
+ If block is uncompressed, use memcpy() instead
+ If block is a single repeated byte, use memset() instead )
+
+The first step is to obtain the normalized frequencies of symbols.
+This can be performed by FSEv07_readNCount() if it was saved using FSEv07_writeNCount().
+'normalizedCounter' must be already allocated, and have at least 'maxSymbolValuePtr[0]+1' cells of signed short.
+In practice, that means it's necessary to know 'maxSymbolValue' beforehand,
+or size the table to handle worst case situations (typically 256).
+FSEv07_readNCount() will provide 'tableLog' and 'maxSymbolValue'.
+The result of FSEv07_readNCount() is the number of bytes read from 'rBuffer'.
+Note that 'rBuffSize' must be at least 4 bytes, even if useful information is less than that.
+If there is an error, the function will return an error code, which can be tested using FSEv07_isError().
+
+The next step is to build the decompression tables 'FSEv07_DTable' from 'normalizedCounter'.
+This is performed by the function FSEv07_buildDTable().
+The space required by 'FSEv07_DTable' must be already allocated using FSEv07_createDTable().
+If there is an error, the function will return an error code, which can be tested using FSEv07_isError().
+
+`FSEv07_DTable` can then be used to decompress `cSrc`, with FSEv07_decompress_usingDTable().
+`cSrcSize` must be strictly correct, otherwise decompression will fail.
+FSEv07_decompress_usingDTable() result will tell how many bytes were regenerated (<=`dstCapacity`).
+If there is an error, the function will return an error code, which can be tested using FSEv07_isError(). (ex: dst buffer too small)
+*/
+
+
+#ifdef FSEv07_STATIC_LINKING_ONLY
+
+
 /* *****************************************
 *  Static allocation
 *******************************************/
 /* FSE buffer bounds */
-#define FSEv06_NCOUNTBOUND 512
-#define FSEv06_BLOCKBOUND(size) (size + (size>>7))
-#define FSEv06_COMPRESSBOUND(size) (FSEv06_NCOUNTBOUND + FSEv06_BLOCKBOUND(size))   /* Macro version, useful for static allocation */
+#define FSEv07_NCOUNTBOUND 512
+#define FSEv07_BLOCKBOUND(size) (size + (size>>7))
 
 /* It is possible to statically allocate FSE CTable/DTable as a table of unsigned using below macros */
-#define FSEv06_DTABLE_SIZE_U32(maxTableLog)                   (1 + (1<<maxTableLog))
+#define FSEv07_DTABLE_SIZE_U32(maxTableLog)                   (1 + (1<<maxTableLog))
 
 
 /* *****************************************
 *  FSE advanced API
 *******************************************/
-size_t FSEv06_countFast(unsigned* count, unsigned* maxSymbolValuePtr, const void* src, size_t srcSize);
-/* same as FSEv06_count(), but blindly trusts that all byte values within src are <= *maxSymbolValuePtr  */
+size_t FSEv07_countFast(unsigned* count, unsigned* maxSymbolValuePtr, const void* src, size_t srcSize);
+/**< same as FSEv07_count(), but blindly trusts that all byte values within src are <= *maxSymbolValuePtr  */
+
+unsigned FSEv07_optimalTableLog_internal(unsigned maxTableLog, size_t srcSize, unsigned maxSymbolValue, unsigned minus);
+/**< same as FSEv07_optimalTableLog(), which uses `minus==2` */
+
+size_t FSEv07_buildDTable_raw (FSEv07_DTable* dt, unsigned nbBits);
+/**< build a fake FSEv07_DTable, designed to read an uncompressed bitstream where each symbol uses nbBits */
 
-size_t FSEv06_buildDTable_raw (FSEv06_DTable* dt, unsigned nbBits);
-/* build a fake FSEv06_DTable, designed to read an uncompressed bitstream where each symbol uses nbBits */
+size_t FSEv07_buildDTable_rle (FSEv07_DTable* dt, unsigned char symbolValue);
+/**< build a fake FSEv07_DTable, designed to always generate the same symbolValue */
 
-size_t FSEv06_buildDTable_rle (FSEv06_DTable* dt, unsigned char symbolValue);
-/* build a fake FSEv06_DTable, designed to always generate the same symbolValue */
 
 
 /* *****************************************
@@ -1496,197 +1229,369 @@ typedef struct
 {
     size_t      state;
     const void* table;   /* precise table may vary, depending on U16 */
-} FSEv06_DState_t;
+} FSEv07_DState_t;
 
 
-static void     FSEv06_initDState(FSEv06_DState_t* DStatePtr, BITv06_DStream_t* bitD, const FSEv06_DTable* dt);
+static void     FSEv07_initDState(FSEv07_DState_t* DStatePtr, BITv07_DStream_t* bitD, const FSEv07_DTable* dt);
 
-static unsigned char FSEv06_decodeSymbol(FSEv06_DState_t* DStatePtr, BITv06_DStream_t* bitD);
+static unsigned char FSEv07_decodeSymbol(FSEv07_DState_t* DStatePtr, BITv07_DStream_t* bitD);
 
-static unsigned FSEv06_endOfDState(const FSEv06_DState_t* DStatePtr);
+static unsigned FSEv07_endOfDState(const FSEv07_DState_t* DStatePtr);
 
-/*!
-Let's now decompose FSEv06_decompress_usingDTable() into its unitary components.
+/**<
+Let's now decompose FSEv07_decompress_usingDTable() into its unitary components.
 You will decode FSE-encoded symbols from the bitStream,
 and also any other bitFields you put in, **in reverse order**.
 
 You will need a few variables to track your bitStream. They are :
 
-BITv06_DStream_t DStream;    // Stream context
-FSEv06_DState_t  DState;     // State context. Multiple ones are possible
-FSEv06_DTable*   DTablePtr;  // Decoding table, provided by FSEv06_buildDTable()
+BITv07_DStream_t DStream;    // Stream context
+FSEv07_DState_t  DState;     // State context. Multiple ones are possible
+FSEv07_DTable*   DTablePtr;  // Decoding table, provided by FSEv07_buildDTable()
 
 The first thing to do is to init the bitStream.
-    errorCode = BITv06_initDStream(&DStream, srcBuffer, srcSize);
+    errorCode = BITv07_initDStream(&DStream, srcBuffer, srcSize);
 
 You should then retrieve your initial state(s)
 (in reverse flushing order if you have several ones) :
-    errorCode = FSEv06_initDState(&DState, &DStream, DTablePtr);
+    FSEv07_initDState(&DState, &DStream, DTablePtr);   /* returns void */
 
 You can then decode your data, symbol after symbol.
-For information the maximum number of bits read by FSEv06_decodeSymbol() is 'tableLog'.
+For information the maximum number of bits read by FSEv07_decodeSymbol() is 'tableLog'.
 Keep in mind that symbols are decoded in reverse order, like a LIFO stack (last in, first out).
-    unsigned char symbol = FSEv06_decodeSymbol(&DState, &DStream);
+    unsigned char symbol = FSEv07_decodeSymbol(&DState, &DStream);
 
 You can retrieve any bitfield you may have stored into the bitStream (in reverse order)
 Note : maximum allowed nbBits is 25, for 32-bits compatibility
-    size_t bitField = BITv06_readBits(&DStream, nbBits);
+    size_t bitField = BITv07_readBits(&DStream, nbBits);
 
 All the above operations only read from the local register (whose size depends on size_t).
 Refueling the register from memory is manually performed by the reload method.
-    endSignal = FSEv06_reloadDStream(&DStream);
+    endSignal = BITv07_reloadDStream(&DStream);
 
-BITv06_reloadDStream() result tells if there is still some more data to read from DStream.
-BITv06_DStream_unfinished : there is still some data left into the DStream.
-BITv06_DStream_endOfBuffer : Dstream reached end of buffer. Its container may no longer be completely filled.
-BITv06_DStream_completed : Dstream reached its exact end, corresponding in general to decompression completed.
-BITv06_DStream_tooFar : Dstream went too far. Decompression result is corrupted.
+BITv07_reloadDStream() result tells if there is still some more data to read from DStream.
+BITv07_DStream_unfinished : there is still some data left into the DStream.
+BITv07_DStream_endOfBuffer : Dstream reached end of buffer. Its container may no longer be completely filled.
+BITv07_DStream_completed : Dstream reached its exact end, corresponding in general to decompression completed.
+BITv07_DStream_overflow : Dstream went too far. Decompression result is corrupted.
 
-When reaching end of buffer (BITv06_DStream_endOfBuffer), progress slowly, notably if you decode multiple symbols per loop,
+When reaching end of buffer (BITv07_DStream_endOfBuffer), progress slowly, notably if you decode multiple symbols per loop,
 to properly detect the exact end of stream.
 After each decoded symbol, check if DStream is fully consumed using this simple test :
-    BITv06_reloadDStream(&DStream) >= BITv06_DStream_completed
+    BITv07_reloadDStream(&DStream) >= BITv07_DStream_completed
 
 When it's done, verify decompression is fully completed, by checking both DStream and the relevant states.
 Checking if DStream has reached its end is performed by :
-    BITv06_endOfDStream(&DStream);
+    BITv07_endOfDStream(&DStream);
 Check also the states. There might be some symbols left there, if some high probability ones (>50%) are possible.
-    FSEv06_endOfDState(&DState);
+    FSEv07_endOfDState(&DState);
 */
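The decode loop described in the tutorial comment above can be modeled with a toy table. This is a sketch under stated assumptions, not the library's code: the table is built by hand for tableLog == 2 with normalized counts A:3, B:1, the names `ToyDecodeCell`, `ToyStream`, `toy_read_bits` and `toy_decode` are hypothetical, and the toy reader consumes bits forward from the top of a 32-bit word, whereas the real bitstream is read in reverse and refilled with BITv07_reloadDStream().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same field layout as the decode cell used by the FSE decoder. */
typedef struct {
    unsigned short newState;
    unsigned char  symbol;
    unsigned char  nbBits;
} ToyDecodeCell;

/* Hand-built table for tableLog == 2, normalized counts A:3, B:1. */
static const ToyDecodeCell toyTable[4] = {
    { 2, 'A', 1 },
    { 0, 'A', 0 },
    { 1, 'A', 0 },
    { 0, 'B', 2 },
};

/* Simplified bit source: bits are taken from the most significant end. */
typedef struct { uint32_t container; unsigned consumed; } ToyStream;

static unsigned toy_read_bits(ToyStream* s, unsigned nbBits)
{
    unsigned value;
    if (nbBits == 0) return 0;              /* nothing to read, state moves for free */
    assert(s->consumed + nbBits <= 32);     /* no refill in this toy model */
    value = (unsigned)((uint32_t)(s->container << s->consumed) >> (32 - nbBits));
    s->consumed += nbBits;
    return value;
}

/* Decodes `count` symbols into `out` (which needs count+1 bytes) and returns `out`. */
static char* toy_decode(char* out, size_t count, uint32_t bits)
{
    ToyStream s = { bits, 0 };
    unsigned state = toy_read_bits(&s, 2);  /* initial state: tableLog bits */
    size_t i;
    for (i = 0; i < count; i++) {
        ToyDecodeCell const cell = toyTable[state];
        out[i] = (char)cell.symbol;                               /* emit symbol of current state */
        state = cell.newState + toy_read_bits(&s, cell.nbBits);   /* move to next state */
    }
    out[count] = '\0';
    return out;
}
```

Each step mirrors FSEv07_decodeSymbol(): look up the cell for the current state, emit its symbol, then add the next `nbBits` bits to `newState`. Frequent symbols sit in cells with a small `nbBits`, which is where the compression comes from.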
 
 
 /* *****************************************
 *  FSE unsafe API
 *******************************************/
-static unsigned char FSEv06_decodeSymbolFast(FSEv06_DState_t* DStatePtr, BITv06_DStream_t* bitD);
+static unsigned char FSEv07_decodeSymbolFast(FSEv07_DState_t* DStatePtr, BITv07_DStream_t* bitD);
 /* faster, but works only if nbBits is always >= 1 (otherwise, result will be corrupted) */
 
 
-/* *****************************************
-*  Implementation of inlined functions
-*******************************************/
-
-
 /*<=====    Decompression    =====>*/
 
 typedef struct {
     U16 tableLog;
     U16 fastMode;
-} FSEv06_DTableHeader;   /* sizeof U32 */
+} FSEv07_DTableHeader;   /* sizeof U32 */
 
 typedef struct
 {
     unsigned short newState;
     unsigned char  symbol;
     unsigned char  nbBits;
-} FSEv06_decode_t;   /* size == U32 */
+} FSEv07_decode_t;   /* size == U32 */
 
-MEM_STATIC void FSEv06_initDState(FSEv06_DState_t* DStatePtr, BITv06_DStream_t* bitD, const FSEv06_DTable* dt)
+MEM_STATIC void FSEv07_initDState(FSEv07_DState_t* DStatePtr, BITv07_DStream_t* bitD, const FSEv07_DTable* dt)
 {
     const void* ptr = dt;
-    const FSEv06_DTableHeader* const DTableH = (const FSEv06_DTableHeader*)ptr;
-    DStatePtr->state = BITv06_readBits(bitD, DTableH->tableLog);
-    BITv06_reloadDStream(bitD);
+    const FSEv07_DTableHeader* const DTableH = (const FSEv07_DTableHeader*)ptr;
+    DStatePtr->state = BITv07_readBits(bitD, DTableH->tableLog);
+    BITv07_reloadDStream(bitD);
     DStatePtr->table = dt + 1;
 }
 
-MEM_STATIC BYTE FSEv06_peekSymbol(const FSEv06_DState_t* DStatePtr)
+MEM_STATIC BYTE FSEv07_peekSymbol(const FSEv07_DState_t* DStatePtr)
 {
-    FSEv06_decode_t const DInfo = ((const FSEv06_decode_t*)(DStatePtr->table))[DStatePtr->state];
+    FSEv07_decode_t const DInfo = ((const FSEv07_decode_t*)(DStatePtr->table))[DStatePtr->state];
     return DInfo.symbol;
 }
 
-MEM_STATIC void FSEv06_updateState(FSEv06_DState_t* DStatePtr, BITv06_DStream_t* bitD)
+MEM_STATIC void FSEv07_updateState(FSEv07_DState_t* DStatePtr, BITv07_DStream_t* bitD)
+{
+    FSEv07_decode_t const DInfo = ((const FSEv07_decode_t*)(DStatePtr->table))[DStatePtr->state];
+    U32 const nbBits = DInfo.nbBits;
+    size_t const lowBits = BITv07_readBits(bitD, nbBits);
+    DStatePtr->state = DInfo.newState + lowBits;
+}
+
+MEM_STATIC BYTE FSEv07_decodeSymbol(FSEv07_DState_t* DStatePtr, BITv07_DStream_t* bitD)
+{
+    FSEv07_decode_t const DInfo = ((const FSEv07_decode_t*)(DStatePtr->table))[DStatePtr->state];
+    U32 const nbBits = DInfo.nbBits;
+    BYTE const symbol = DInfo.symbol;
+    size_t const lowBits = BITv07_readBits(bitD, nbBits);
+
+    DStatePtr->state = DInfo.newState + lowBits;
+    return symbol;
+}
+
+/*! FSEv07_decodeSymbolFast() :
+    unsafe, only works if no symbol has a probability > 50% */
+MEM_STATIC BYTE FSEv07_decodeSymbolFast(FSEv07_DState_t* DStatePtr, BITv07_DStream_t* bitD)
 {
-    FSEv06_decode_t const DInfo = ((const FSEv06_decode_t*)(DStatePtr->table))[DStatePtr->state];
+    FSEv07_decode_t const DInfo = ((const FSEv07_decode_t*)(DStatePtr->table))[DStatePtr->state];
     U32 const nbBits = DInfo.nbBits;
-    size_t const lowBits = BITv06_readBits(bitD, nbBits);
+    BYTE const symbol = DInfo.symbol;
+    size_t const lowBits = BITv07_readBitsFast(bitD, nbBits);
+
     DStatePtr->state = DInfo.newState + lowBits;
+    return symbol;
+}
+
+MEM_STATIC unsigned FSEv07_endOfDState(const FSEv07_DState_t* DStatePtr)
+{
+    return DStatePtr->state == 0;
+}
+
+
+
+#ifndef FSEv07_COMMONDEFS_ONLY
+
+/* **************************************************************
+*  Tuning parameters
+****************************************************************/
+/*!MEMORY_USAGE :
+*  Memory usage formula : N->2^N Bytes (examples : 10 -> 1KB; 12 -> 4KB ; 16 -> 64KB; 20 -> 1MB; etc.)
+*  Increasing memory usage improves compression ratio
+*  Reduced memory usage can improve speed, due to cache effect
+*  Recommended max value is 14, for 16KB, which nicely fits into Intel x86 L1 cache */
+#define FSEv07_MAX_MEMORY_USAGE 14
+#define FSEv07_DEFAULT_MEMORY_USAGE 13
+
+/*!FSEv07_MAX_SYMBOL_VALUE :
+*  Maximum symbol value authorized.
+*  Required for proper stack allocation */
+#define FSEv07_MAX_SYMBOL_VALUE 255
+
+
+/* **************************************************************
+*  template functions type & suffix
+****************************************************************/
+#define FSEv07_FUNCTION_TYPE BYTE
+#define FSEv07_FUNCTION_EXTENSION
+#define FSEv07_DECODE_TYPE FSEv07_decode_t
+
+
+#endif   /* !FSEv07_COMMONDEFS_ONLY */
+
+
+/* ***************************************************************
+*  Constants
+*****************************************************************/
+#define FSEv07_MAX_TABLELOG  (FSEv07_MAX_MEMORY_USAGE-2)
+#define FSEv07_MAX_TABLESIZE (1U<<FSEv07_MAX_TABLELOG)
+#define FSEv07_MAXTABLESIZE_MASK (FSEv07_MAX_TABLESIZE-1)
+#define FSEv07_DEFAULT_TABLELOG (FSEv07_DEFAULT_MEMORY_USAGE-2)
+#define FSEv07_MIN_TABLELOG 5
+
+#define FSEv07_TABLELOG_ABSOLUTE_MAX 15
+#if FSEv07_MAX_TABLELOG > FSEv07_TABLELOG_ABSOLUTE_MAX
+#  error "FSEv07_MAX_TABLELOG > FSEv07_TABLELOG_ABSOLUTE_MAX is not supported"
+#endif
+
+#define FSEv07_TABLESTEP(tableSize) ((tableSize>>1) + (tableSize>>3) + 3)
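The constants above follow from simple arithmetic, sketched here with local copies of the macros (the TOY_ prefix and both helper functions are hypothetical, added only so the block compiles on its own). A memory budget of 2^14 bytes divided by 4-byte decode cells gives 2^12 cells, hence the "-2". The sketch also checks a property of the step formula: for table sizes of 32 and above the step is odd, so repeatedly adding it modulo the table size visits every cell once.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local copies of the macros from the diff, renamed so the sketch is stand-alone. */
#define TOY_MAX_MEMORY_USAGE 14
#define TOY_MAX_TABLELOG  (TOY_MAX_MEMORY_USAGE-2)
#define TOY_MAX_TABLESIZE (1U<<TOY_MAX_TABLELOG)
#define TOY_DTABLE_SIZE_U32(maxTableLog) (1 + (1<<(maxTableLog)))
#define TOY_TABLESTEP(tableSize) (((tableSize)>>1) + ((tableSize)>>3) + 3)

/* Bytes needed for a statically allocated DTable: one header cell plus
 * 2^tableLog decode cells, each the size of a 32-bit word. */
static size_t toy_dtable_bytes(unsigned tableLog)
{
    assert(tableLog <= TOY_MAX_TABLELOG);
    return (size_t)TOY_DTABLE_SIZE_U32(tableLog) * sizeof(uint32_t);
}

/* Returns 1 if stepping by TOY_TABLESTEP visits every cell exactly once
 * and lands back on cell 0 after tableSize steps. */
static int toy_step_covers_table(unsigned tableLog)
{
    static unsigned char visited[TOY_MAX_TABLESIZE];
    unsigned const tableSize = 1U << tableLog;
    unsigned const step = TOY_TABLESTEP(tableSize);
    unsigned pos = 0, i;
    assert(tableLog >= 5 && tableLog <= TOY_MAX_TABLELOG);
    memset(visited, 0, sizeof(visited));
    for (i = 0; i < tableSize; i++) {
        if (visited[pos]) return 0;         /* a cell was hit twice */
        visited[pos] = 1;
        pos = (pos + step) & (tableSize - 1);
    }
    return pos == 0;
}
```

For the maximum tableLog of 12 this gives 4096 cells, a step of 2563, and 16388 bytes for the table including its header cell.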
+
+
+#endif /* FSEv07_STATIC_LINKING_ONLY */
+
+
+#if defined (__cplusplus)
 }
+#endif
+
+#endif  /* FSEv07_H */
+/* ******************************************************************
+   Huffman coder, part of New Generation Entropy library
+   header file
+   Copyright (C) 2013-2016, Yann Collet.
+
+   BSD 2-Clause License (http://www.opensource.org/licenses/bsd-license.php)
+
+   Redistribution and use in source and binary forms, with or without
+   modification, are permitted provided that the following conditions are
+   met:
+
+       * Redistributions of source code must retain the above copyright
+   notice, this list of conditions and the following disclaimer.
+       * Redistributions in binary form must reproduce the above
+   copyright notice, this list of conditions and the following disclaimer
+   in the documentation and/or other materials provided with the
+   distribution.
+
+   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+   You can contact the author at :
+   - Source repository : https://github.com/Cyan4973/FiniteStateEntropy
+****************************************************************** */
+#ifndef HUFv07_H_298734234
+#define HUFv07_H_298734234
+
+#if defined (__cplusplus)
+extern "C" {
+#endif
+
+
+
+/* *** simple functions *** */
+/**
+HUFv07_decompress() :
+    Decompress HUF data from buffer 'cSrc', of size 'cSrcSize',
+    into already allocated buffer 'dst', of minimum size 'dstSize'.
+    `dstSize` : **must** be the ***exact*** size of original (uncompressed) data.
+    Note : in contrast with FSE, HUFv07_decompress can regenerate
+           RLE (cSrcSize==1) and uncompressed (cSrcSize==dstSize) data,
+           because it knows size to regenerate.
+    @return : size of regenerated data (== dstSize),
+              or an error code, which can be tested using HUFv07_isError()
+*/
+size_t HUFv07_decompress(void* dst,  size_t dstSize,
+                const void* cSrc, size_t cSrcSize);
+
+
+/* ****************************************
+*  Tool functions
+******************************************/
+#define HUFv07_BLOCKSIZE_MAX (128 * 1024)
+
+/* Error Management */
+unsigned    HUFv07_isError(size_t code);        /**< tells if a return value is an error code */
+const char* HUFv07_getErrorName(size_t code);   /**< provides error code string (useful for debugging) */
+
+
+/* *** Advanced function *** */
+
+
+#ifdef HUFv07_STATIC_LINKING_ONLY
+
+
+/* *** Constants *** */
+#define HUFv07_TABLELOG_ABSOLUTEMAX  16   /* absolute limit of HUFv07_TABLELOG_MAX. Beyond that value, code does not work */
+#define HUFv07_TABLELOG_MAX  12           /* max configured tableLog (for static allocation); can be modified up to HUFv07_TABLELOG_ABSOLUTEMAX */
+#define HUFv07_TABLELOG_DEFAULT  11       /* tableLog by default, when not specified */
+#define HUFv07_SYMBOLVALUE_MAX 255
+#if (HUFv07_TABLELOG_MAX > HUFv07_TABLELOG_ABSOLUTEMAX)
+#  error "HUFv07_TABLELOG_MAX is too large !"
+#endif
+
+
+/* ****************************************
+*  Static allocation
+******************************************/
+/* HUF buffer bounds */
+#define HUFv07_BLOCKBOUND(size) (size + (size>>8) + 8)   /* only true if incompressible pre-filtered with fast heuristic */
+
+/* static allocation of HUF's DTable */
+typedef U32 HUFv07_DTable;
+#define HUFv07_DTABLE_SIZE(maxTableLog)   (1 + (1<<(maxTableLog)))
+#define HUFv07_CREATE_STATIC_DTABLEX2(DTable, maxTableLog) \
+        HUFv07_DTable DTable[HUFv07_DTABLE_SIZE((maxTableLog)-1)] = { ((U32)((maxTableLog)-1)*0x1000001) }
+#define HUFv07_CREATE_STATIC_DTABLEX4(DTable, maxTableLog) \
+        HUFv07_DTable DTable[HUFv07_DTABLE_SIZE(maxTableLog)] = { ((U32)(maxTableLog)*0x1000001) }
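The static-allocation macros above pack two pieces of arithmetic: the array length and the value of the first cell. A small sketch of both, using hypothetical TOY_/toy_ names so it compiles on its own. Multiplying by 0x1000001 copies maxTableLog into both the lowest and the highest byte of the first 32-bit cell; presumably this lets the decoder read it from either end regardless of byte order, but that reading is an assumption not stated in this diff.

```c
#include <assert.h>
#include <stdint.h>

/* Local copy of the size macro from the diff, renamed for a stand-alone build. */
#define TOY_HUF_DTABLE_SIZE(maxTableLog) (1 + (1 << (maxTableLog)))

/* Value stored in the first cell by the CREATE_STATIC_DTABLE macros. */
static uint32_t toy_huf_header(unsigned maxTableLog)
{
    assert(maxTableLog <= 16);   /* absolute tableLog limit stated above */
    return (uint32_t)maxTableLog * 0x1000001u;
}

/* Helpers to inspect the two bytes that carry maxTableLog. */
static unsigned toy_low_byte(uint32_t header)  { return header & 0xFFu; }
static unsigned toy_high_byte(uint32_t header) { return header >> 24; }
```

With maxTableLog 12, the X4 macro declares 4097 cells and a header of 0x0C00000C; the X2 macro uses maxTableLog-1, giving 2049 cells and a header of 0x0B00000B.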
 
-MEM_STATIC BYTE FSEv06_decodeSymbol(FSEv06_DState_t* DStatePtr, BITv06_DStream_t* bitD)
-{
-    FSEv06_decode_t const DInfo = ((const FSEv06_decode_t*)(DStatePtr->table))[DStatePtr->state];
-    U32 const nbBits = DInfo.nbBits;
-    BYTE const symbol = DInfo.symbol;
-    size_t const lowBits = BITv06_readBits(bitD, nbBits);
 
-    DStatePtr->state = DInfo.newState + lowBits;
-    return symbol;
-}
+/* ****************************************
+*  Advanced decompression functions
+******************************************/
+size_t HUFv07_decompress4X2 (void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize);   /**< single-symbol decoder */
+size_t HUFv07_decompress4X4 (void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize);   /**< double-symbols decoder */
 
-/*! FSEv06_decodeSymbolFast() :
-    unsafe, only works if no symbol has a probability > 50% */
-MEM_STATIC BYTE FSEv06_decodeSymbolFast(FSEv06_DState_t* DStatePtr, BITv06_DStream_t* bitD)
-{
-    FSEv06_decode_t const DInfo = ((const FSEv06_decode_t*)(DStatePtr->table))[DStatePtr->state];
-    U32 const nbBits = DInfo.nbBits;
-    BYTE const symbol = DInfo.symbol;
-    size_t const lowBits = BITv06_readBitsFast(bitD, nbBits);
+size_t HUFv07_decompress4X_DCtx (HUFv07_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize);   /**< decodes RLE and uncompressed */
+size_t HUFv07_decompress4X_hufOnly(HUFv07_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize); /**< considers RLE and uncompressed as errors */
+size_t HUFv07_decompress4X2_DCtx(HUFv07_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize);   /**< single-symbol decoder */
+size_t HUFv07_decompress4X4_DCtx(HUFv07_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize);   /**< double-symbols decoder */
 
-    DStatePtr->state = DInfo.newState + lowBits;
-    return symbol;
-}
+size_t HUFv07_decompress1X_DCtx (HUFv07_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize);
+size_t HUFv07_decompress1X2_DCtx(HUFv07_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize);   /**< single-symbol decoder */
+size_t HUFv07_decompress1X4_DCtx(HUFv07_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize);   /**< double-symbols decoder */
 
-MEM_STATIC unsigned FSEv06_endOfDState(const FSEv06_DState_t* DStatePtr)
-{
-    return DStatePtr->state == 0;
-}
 
+/* ****************************************
+*  HUF detailed API
+******************************************/
+/*!
+The following API allows targeting specific sub-functions for advanced tasks.
+For example, it's possible to compress several blocks using the same 'CTable',
+or to save and regenerate 'CTable' using external methods.
+*/
+/* FSEv07_count() : find it within "fse.h" */
 
+/*! HUFv07_readStats() :
+    Read compact Huffman tree, saved by HUFv07_writeCTable().
+    `huffWeight` is destination buffer.
+    @return : size read from `src` , or an error Code .
+    Note : Needed by HUFv07_readCTable() and HUFv07_readDTableXn() . */
+size_t HUFv07_readStats(BYTE* huffWeight, size_t hwSize, U32* rankStats,
+                     U32* nbSymbolsPtr, U32* tableLogPtr,
+                     const void* src, size_t srcSize);
 
-#ifndef FSEv06_COMMONDEFS_ONLY
 
-/* **************************************************************
-*  Tuning parameters
-****************************************************************/
-/*!MEMORY_USAGE :
-*  Memory usage formula : N->2^N Bytes (examples : 10 -> 1KB; 12 -> 4KB ; 16 -> 64KB; 20 -> 1MB; etc.)
-*  Increasing memory usage improves compression ratio
-*  Reduced memory usage can improve speed, due to cache effect
-*  Recommended max value is 14, for 16KB, which nicely fits into Intel x86 L1 cache */
-#define FSEv06_MAX_MEMORY_USAGE 14
-#define FSEv06_DEFAULT_MEMORY_USAGE 13
+/*
+HUFv07_decompress() does the following:
+1. select the decompression algorithm (X2, X4) based on pre-computed heuristics
+2. build Huffman table from save, using HUFv07_readDTableXn()
+3. decode 1 or 4 segments in parallel using HUFv07_decompressSXn_usingDTable
+*/
 
-/*!FSEv06_MAX_SYMBOL_VALUE :
-*  Maximum symbol value authorized.
-*  Required for proper stack allocation */
-#define FSEv06_MAX_SYMBOL_VALUE 255
+/** HUFv07_selectDecoder() :
+*   Tells which decoder is likely to decode faster,
+*   based on a set of pre-determined metrics.
+*   @return : 0==HUFv07_decompress4X2, 1==HUFv07_decompress4X4 .
+*   Assumption : 0 < cSrcSize < dstSize <= 128 KB */
+U32 HUFv07_selectDecoder (size_t dstSize, size_t cSrcSize);
 
+size_t HUFv07_readDTableX2 (HUFv07_DTable* DTable, const void* src, size_t srcSize);
+size_t HUFv07_readDTableX4 (HUFv07_DTable* DTable, const void* src, size_t srcSize);
 
-/* **************************************************************
-*  template functions type & suffix
-****************************************************************/
-#define FSEv06_FUNCTION_TYPE BYTE
-#define FSEv06_FUNCTION_EXTENSION
-#define FSEv06_DECODE_TYPE FSEv06_decode_t
+size_t HUFv07_decompress4X_usingDTable(void* dst, size_t maxDstSize, const void* cSrc, size_t cSrcSize, const HUFv07_DTable* DTable);
+size_t HUFv07_decompress4X2_usingDTable(void* dst, size_t maxDstSize, const void* cSrc, size_t cSrcSize, const HUFv07_DTable* DTable);
+size_t HUFv07_decompress4X4_usingDTable(void* dst, size_t maxDstSize, const void* cSrc, size_t cSrcSize, const HUFv07_DTable* DTable);
 
 
-#endif   /* !FSEv06_COMMONDEFS_ONLY */
+/* single stream variants */
+size_t HUFv07_decompress1X2 (void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize);   /* single-symbol decoder */
+size_t HUFv07_decompress1X4 (void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize);   /* double-symbol decoder */
 
+size_t HUFv07_decompress1X_usingDTable(void* dst, size_t maxDstSize, const void* cSrc, size_t cSrcSize, const HUFv07_DTable* DTable);
+size_t HUFv07_decompress1X2_usingDTable(void* dst, size_t maxDstSize, const void* cSrc, size_t cSrcSize, const HUFv07_DTable* DTable);
+size_t HUFv07_decompress1X4_usingDTable(void* dst, size_t maxDstSize, const void* cSrc, size_t cSrcSize, const HUFv07_DTable* DTable);
 
-/* ***************************************************************
-*  Constants
-*****************************************************************/
-#define FSEv06_MAX_TABLELOG  (FSEv06_MAX_MEMORY_USAGE-2)
-#define FSEv06_MAX_TABLESIZE (1U<<FSEv06_MAX_TABLELOG)
-#define FSEv06_MAXTABLESIZE_MASK (FSEv06_MAX_TABLESIZE-1)
-#define FSEv06_DEFAULT_TABLELOG (FSEv06_DEFAULT_MEMORY_USAGE-2)
-#define FSEv06_MIN_TABLELOG 5
-
-#define FSEv06_TABLELOG_ABSOLUTE_MAX 15
-#if FSEv06_MAX_TABLELOG > FSEv06_TABLELOG_ABSOLUTE_MAX
-#error "FSEv06_MAX_TABLELOG > FSEv06_TABLELOG_ABSOLUTE_MAX is not supported"
-#endif
 
-#define FSEv06_TABLESTEP(tableSize) ((tableSize>>1) + (tableSize>>3) + 3)
+#endif /* HUFv07_STATIC_LINKING_ONLY */
 
 
 #if defined (__cplusplus)
 }
 #endif
 
-#endif  /* FSEv06_STATIC_H */
+#endif   /* HUFv07_H_298734234 */
 /*
    Common functions of New Generation Entropy library
    Copyright (C) 2016, Yann Collet.
@@ -1722,28 +1627,29 @@ MEM_STATIC unsigned FSEv06_endOfDState(const FSEv06_DState_t* DStatePtr)
 *************************************************************************** */
 
 
+
 /*-****************************************
 *  FSE Error Management
 ******************************************/
-unsigned FSEv06_isError(size_t code) { return ERR_isError(code); }
+unsigned FSEv07_isError(size_t code) { return ERR_isError(code); }
 
-const char* FSEv06_getErrorName(size_t code) { return ERR_getErrorName(code); }
+const char* FSEv07_getErrorName(size_t code) { return ERR_getErrorName(code); }
 
 
 /* **************************************************************
 *  HUF Error Management
 ****************************************************************/
-unsigned HUFv06_isError(size_t code) { return ERR_isError(code); }
+unsigned HUFv07_isError(size_t code) { return ERR_isError(code); }
 
-const char* HUFv06_getErrorName(size_t code) { return ERR_getErrorName(code); }
+const char* HUFv07_getErrorName(size_t code) { return ERR_getErrorName(code); }
 
 
 /*-**************************************************************
 *  FSE NCount encoding-decoding
 ****************************************************************/
-static short FSEv06_abs(short a) { return a<0 ? -a : a; }
+static short FSEv07_abs(short a) { return (short)(a<0 ? -a : a); }
 
-size_t FSEv06_readNCount (short* normalizedCounter, unsigned* maxSVPtr, unsigned* tableLogPtr,
+size_t FSEv07_readNCount (short* normalizedCounter, unsigned* maxSVPtr, unsigned* tableLogPtr,
                  const void* headerBuffer, size_t hbSize)
 {
     const BYTE* const istart = (const BYTE*) headerBuffer;
@@ -1759,8 +1665,8 @@ size_t FSEv06_readNCount (short* normalizedCounter, unsigned* maxSVPtr, unsigned
 
     if (hbSize < 4) return ERROR(srcSize_wrong);
     bitStream = MEM_readLE32(ip);
-    nbBits = (bitStream & 0xF) + FSEv06_MIN_TABLELOG;   /* extract tableLog */
-    if (nbBits > FSEv06_TABLELOG_ABSOLUTE_MAX) return ERROR(tableLog_tooLarge);
+    nbBits = (bitStream & 0xF) + FSEv07_MIN_TABLELOG;   /* extract tableLog */
+    if (nbBits > FSEv07_TABLELOG_ABSOLUTE_MAX) return ERROR(tableLog_tooLarge);
     bitStream >>= 4;
     bitCount = 4;
     *tableLogPtr = nbBits;
@@ -1810,7 +1716,7 @@ size_t FSEv06_readNCount (short* normalizedCounter, unsigned* maxSVPtr, unsigned
             }
 
             count--;   /* extra accuracy */
-            remaining -= FSEv06_abs(count);
+            remaining -= FSEv07_abs(count);
             normalizedCounter[charnum++] = count;
             previous0 = !count;
             while (remaining < threshold) {
@@ -1834,6 +1740,79 @@ size_t FSEv06_readNCount (short* normalizedCounter, unsigned* maxSVPtr, unsigned
     if ((size_t)(ip-istart) > hbSize) return ERROR(srcSize_wrong);
     return ip-istart;
 }
+
+
+/*! HUFv07_readStats() :
+    Read compact Huffman tree, saved by HUFv07_writeCTable().
+    `huffWeight` is destination buffer.
+    @return : size read from `src`, or an error code.
+    Note : Needed by HUFv07_readCTable() and HUFv07_readDTableXn() .
+*/
+size_t HUFv07_readStats(BYTE* huffWeight, size_t hwSize, U32* rankStats,
+                     U32* nbSymbolsPtr, U32* tableLogPtr,
+                     const void* src, size_t srcSize)
+{
+    U32 weightTotal;
+    const BYTE* ip = (const BYTE*) src;
+    size_t iSize = ip[0];
+    size_t oSize;
+
+    //memset(huffWeight, 0, hwSize);   /* is not necessary, even though some analyzers complain ... */
+
+    if (iSize >= 128)  { /* special header */
+        if (iSize >= (242)) {  /* RLE */
+            static U32 l[14] = { 1, 2, 3, 4, 7, 8, 15, 16, 31, 32, 63, 64, 127, 128 };
+            oSize = l[iSize-242];
+            memset(huffWeight, 1, hwSize);
+            iSize = 0;
+        }
+        else {   /* Incompressible */
+            oSize = iSize - 127;
+            iSize = ((oSize+1)/2);
+            if (iSize+1 > srcSize) return ERROR(srcSize_wrong);
+            if (oSize >= hwSize) return ERROR(corruption_detected);
+            ip += 1;
+            {   U32 n;
+                for (n=0; n<oSize; n+=2) {
+                    huffWeight[n]   = ip[n/2] >> 4;
+                    huffWeight[n+1] = ip[n/2] & 15;
+    }   }   }   }
+    else  {   /* header compressed with FSE (normal case) */
+        if (iSize+1 > srcSize) return ERROR(srcSize_wrong);
+        oSize = FSEv07_decompress(huffWeight, hwSize-1, ip+1, iSize);   /* max (hwSize-1) values decoded, as last one is implied */
+        if (FSEv07_isError(oSize)) return oSize;
+    }
+
+    /* collect weight stats */
+    memset(rankStats, 0, (HUFv07_TABLELOG_ABSOLUTEMAX + 1) * sizeof(U32));
+    weightTotal = 0;
+    {   U32 n; for (n=0; n<oSize; n++) {
+            if (huffWeight[n] >= HUFv07_TABLELOG_ABSOLUTEMAX) return ERROR(corruption_detected);
+            rankStats[huffWeight[n]]++;
+            weightTotal += (1 << huffWeight[n]) >> 1;
+    }   }
+
+    /* get last non-null symbol weight (implied, total must be 2^n) */
+    {   U32 const tableLog = BITv07_highbit32(weightTotal) + 1;
+        if (tableLog > HUFv07_TABLELOG_ABSOLUTEMAX) return ERROR(corruption_detected);
+        *tableLogPtr = tableLog;
+        /* determine last weight */
+        {   U32 const total = 1 << tableLog;
+            U32 const rest = total - weightTotal;
+            U32 const verif = 1 << BITv07_highbit32(rest);
+            U32 const lastWeight = BITv07_highbit32(rest) + 1;
+            if (verif != rest) return ERROR(corruption_detected);    /* last value must be a clean power of 2 */
+            huffWeight[oSize] = (BYTE)lastWeight;
+            rankStats[lastWeight]++;
+    }   }
+
+    /* check tree construction validity */
+    if ((rankStats[1] < 2) || (rankStats[1] & 1)) return ERROR(corruption_detected);   /* by construction : at least 2 elements of rank 1, and their count must be even */
+
+    /* results */
+    *nbSymbolsPtr = (U32)(oSize+1);
+    return iSize+1;
+}
 /* ******************************************************************
    FSE : Finite State Entropy decoder
    Copyright (C) 2013-2015, Yann Collet.
@@ -1887,17 +1866,19 @@ size_t FSEv06_readNCount (short* normalizedCounter, unsigned* maxSVPtr, unsigned
 #endif
 
 
+
+
 /* **************************************************************
 *  Error Management
 ****************************************************************/
-#define FSEv06_isError ERR_isError
-#define FSEv06_STATIC_ASSERT(c) { enum { FSEv06_static_assert = 1/(int)(!!(c)) }; }   /* use only *after* variable declarations */
+#define FSEv07_isError ERR_isError
+#define FSEv07_STATIC_ASSERT(c) { enum { FSEv07_static_assert = 1/(int)(!!(c)) }; }   /* use only *after* variable declarations */
 
 
 /* **************************************************************
 *  Complex types
 ****************************************************************/
-typedef U32 DTable_max_t[FSEv06_DTABLE_SIZE_U32(FSEv06_MAX_TABLELOG)];
+typedef U32 DTable_max_t[FSEv07_DTABLE_SIZE_U32(FSEv07_MAX_TABLELOG)];
 
 
 /* **************************************************************
@@ -1910,54 +1891,54 @@ typedef U32 DTable_max_t[FSEv06_DTABLE_SIZE_U32(FSEv06_MAX_TABLELOG)];
 */
 
 /* safety checks */
-#ifndef FSEv06_FUNCTION_EXTENSION
-#  error "FSEv06_FUNCTION_EXTENSION must be defined"
+#ifndef FSEv07_FUNCTION_EXTENSION
+#  error "FSEv07_FUNCTION_EXTENSION must be defined"
 #endif
-#ifndef FSEv06_FUNCTION_TYPE
-#  error "FSEv06_FUNCTION_TYPE must be defined"
+#ifndef FSEv07_FUNCTION_TYPE
+#  error "FSEv07_FUNCTION_TYPE must be defined"
 #endif
 
 /* Function names */
-#define FSEv06_CAT(X,Y) X##Y
-#define FSEv06_FUNCTION_NAME(X,Y) FSEv06_CAT(X,Y)
-#define FSEv06_TYPE_NAME(X,Y) FSEv06_CAT(X,Y)
+#define FSEv07_CAT(X,Y) X##Y
+#define FSEv07_FUNCTION_NAME(X,Y) FSEv07_CAT(X,Y)
+#define FSEv07_TYPE_NAME(X,Y) FSEv07_CAT(X,Y)
 
 
 /* Function templates */
-FSEv06_DTable* FSEv06_createDTable (unsigned tableLog)
+FSEv07_DTable* FSEv07_createDTable (unsigned tableLog)
 {
-    if (tableLog > FSEv06_TABLELOG_ABSOLUTE_MAX) tableLog = FSEv06_TABLELOG_ABSOLUTE_MAX;
-    return (FSEv06_DTable*)malloc( FSEv06_DTABLE_SIZE_U32(tableLog) * sizeof (U32) );
+    if (tableLog > FSEv07_TABLELOG_ABSOLUTE_MAX) tableLog = FSEv07_TABLELOG_ABSOLUTE_MAX;
+    return (FSEv07_DTable*)malloc( FSEv07_DTABLE_SIZE_U32(tableLog) * sizeof (U32) );
 }
 
-void FSEv06_freeDTable (FSEv06_DTable* dt)
+void FSEv07_freeDTable (FSEv07_DTable* dt)
 {
     free(dt);
 }
 
-size_t FSEv06_buildDTable(FSEv06_DTable* dt, const short* normalizedCounter, unsigned maxSymbolValue, unsigned tableLog)
+size_t FSEv07_buildDTable(FSEv07_DTable* dt, const short* normalizedCounter, unsigned maxSymbolValue, unsigned tableLog)
 {
     void* const tdPtr = dt+1;   /* because *dt is unsigned, 32-bits aligned on 32-bits */
-    FSEv06_DECODE_TYPE* const tableDecode = (FSEv06_DECODE_TYPE*) (tdPtr);
-    U16 symbolNext[FSEv06_MAX_SYMBOL_VALUE+1];
+    FSEv07_DECODE_TYPE* const tableDecode = (FSEv07_DECODE_TYPE*) (tdPtr);
+    U16 symbolNext[FSEv07_MAX_SYMBOL_VALUE+1];
 
     U32 const maxSV1 = maxSymbolValue + 1;
     U32 const tableSize = 1 << tableLog;
     U32 highThreshold = tableSize-1;
 
     /* Sanity Checks */
-    if (maxSymbolValue > FSEv06_MAX_SYMBOL_VALUE) return ERROR(maxSymbolValue_tooLarge);
-    if (tableLog > FSEv06_MAX_TABLELOG) return ERROR(tableLog_tooLarge);
+    if (maxSymbolValue > FSEv07_MAX_SYMBOL_VALUE) return ERROR(maxSymbolValue_tooLarge);
+    if (tableLog > FSEv07_MAX_TABLELOG) return ERROR(tableLog_tooLarge);
 
     /* Init, lay down lowprob symbols */
-    {   FSEv06_DTableHeader DTableH;
+    {   FSEv07_DTableHeader DTableH;
         DTableH.tableLog = (U16)tableLog;
         DTableH.fastMode = 1;
         {   S16 const largeLimit= (S16)(1 << (tableLog-1));
             U32 s;
             for (s=0; s<maxSV1; s++) {
                 if (normalizedCounter[s]==-1) {
-                    tableDecode[highThreshold--].symbol = (FSEv06_FUNCTION_TYPE)s;
+                    tableDecode[highThreshold--].symbol = (FSEv07_FUNCTION_TYPE)s;
                     symbolNext[s] = 1;
                 } else {
                     if (normalizedCounter[s] >= largeLimit) DTableH.fastMode=0;
@@ -1968,12 +1949,12 @@ size_t FSEv06_buildDTable(FSEv06_DTable* dt, const short* normalizedCounter, uns
 
     /* Spread symbols */
     {   U32 const tableMask = tableSize-1;
-        U32 const step = FSEv06_TABLESTEP(tableSize);
+        U32 const step = FSEv07_TABLESTEP(tableSize);
         U32 s, position = 0;
         for (s=0; s<maxSV1; s++) {
             int i;
             for (i=0; i<normalizedCounter[s]; i++) {
-                tableDecode[position].symbol = (FSEv06_FUNCTION_TYPE)s;
+                tableDecode[position].symbol = (FSEv07_FUNCTION_TYPE)s;
                 position = (position + step) & tableMask;
                 while (position > highThreshold) position = (position + step) & tableMask;   /* lowprob area */
         }   }
@@ -1984,9 +1965,9 @@ size_t FSEv06_buildDTable(FSEv06_DTable* dt, const short* normalizedCounter, uns
     /* Build Decoding table */
     {   U32 u;
         for (u=0; u<tableSize; u++) {
-            FSEv06_FUNCTION_TYPE const symbol = (FSEv06_FUNCTION_TYPE)(tableDecode[u].symbol);
+            FSEv07_FUNCTION_TYPE const symbol = (FSEv07_FUNCTION_TYPE)(tableDecode[u].symbol);
             U16 nextState = symbolNext[symbol]++;
-            tableDecode[u].nbBits = (BYTE) (tableLog - BITv06_highbit32 ((U32)nextState) );
+            tableDecode[u].nbBits = (BYTE) (tableLog - BITv07_highbit32 ((U32)nextState) );
             tableDecode[u].newState = (U16) ( (nextState << tableDecode[u].nbBits) - tableSize);
     }   }
 
@@ -1995,17 +1976,17 @@ size_t FSEv06_buildDTable(FSEv06_DTable* dt, const short* normalizedCounter, uns
 
 
 
-#ifndef FSEv06_COMMONDEFS_ONLY
+#ifndef FSEv07_COMMONDEFS_ONLY
 
 /*-*******************************************************
 *  Decompression (Byte symbols)
 *********************************************************/
-size_t FSEv06_buildDTable_rle (FSEv06_DTable* dt, BYTE symbolValue)
+size_t FSEv07_buildDTable_rle (FSEv07_DTable* dt, BYTE symbolValue)
 {
     void* ptr = dt;
-    FSEv06_DTableHeader* const DTableH = (FSEv06_DTableHeader*)ptr;
+    FSEv07_DTableHeader* const DTableH = (FSEv07_DTableHeader*)ptr;
     void* dPtr = dt + 1;
-    FSEv06_decode_t* const cell = (FSEv06_decode_t*)dPtr;
+    FSEv07_decode_t* const cell = (FSEv07_decode_t*)dPtr;
 
     DTableH->tableLog = 0;
     DTableH->fastMode = 0;
@@ -2018,12 +1999,12 @@ size_t FSEv06_buildDTable_rle (FSEv06_DTable* dt, BYTE symbolValue)
 }
 
 
-size_t FSEv06_buildDTable_raw (FSEv06_DTable* dt, unsigned nbBits)
+size_t FSEv07_buildDTable_raw (FSEv07_DTable* dt, unsigned nbBits)
 {
     void* ptr = dt;
-    FSEv06_DTableHeader* const DTableH = (FSEv06_DTableHeader*)ptr;
+    FSEv07_DTableHeader* const DTableH = (FSEv07_DTableHeader*)ptr;
     void* dPtr = dt + 1;
-    FSEv06_decode_t* const dinfo = (FSEv06_decode_t*)dPtr;
+    FSEv07_decode_t* const dinfo = (FSEv07_decode_t*)dPtr;
     const unsigned tableSize = 1 << nbBits;
     const unsigned tableMask = tableSize - 1;
     const unsigned maxSV1 = tableMask+1;
@@ -2044,366 +2025,117 @@ size_t FSEv06_buildDTable_raw (FSEv06_DTable* dt, unsigned nbBits)
     return 0;
 }
 
-FORCE_INLINE size_t FSEv06_decompress_usingDTable_generic(
+FORCE_INLINE size_t FSEv07_decompress_usingDTable_generic(
           void* dst, size_t maxDstSize,
     const void* cSrc, size_t cSrcSize,
-    const FSEv06_DTable* dt, const unsigned fast)
+    const FSEv07_DTable* dt, const unsigned fast)
 {
     BYTE* const ostart = (BYTE*) dst;
     BYTE* op = ostart;
     BYTE* const omax = op + maxDstSize;
     BYTE* const olimit = omax-3;
 
-    BITv06_DStream_t bitD;
-    FSEv06_DState_t state1;
-    FSEv06_DState_t state2;
-
-    /* Init */
-    { size_t const errorCode = BITv06_initDStream(&bitD, cSrc, cSrcSize);   /* replaced last arg by maxCompressed Size */
-      if (FSEv06_isError(errorCode)) return errorCode; }
-
-    FSEv06_initDState(&state1, &bitD, dt);
-    FSEv06_initDState(&state2, &bitD, dt);
-
-#define FSEv06_GETSYMBOL(statePtr) fast ? FSEv06_decodeSymbolFast(statePtr, &bitD) : FSEv06_decodeSymbol(statePtr, &bitD)
-
-    /* 4 symbols per loop */
-    for ( ; (BITv06_reloadDStream(&bitD)==BITv06_DStream_unfinished) && (op<olimit) ; op+=4) {
-        op[0] = FSEv06_GETSYMBOL(&state1);
-
-        if (FSEv06_MAX_TABLELOG*2+7 > sizeof(bitD.bitContainer)*8)    /* This test must be static */
-            BITv06_reloadDStream(&bitD);
-
-        op[1] = FSEv06_GETSYMBOL(&state2);
-
-        if (FSEv06_MAX_TABLELOG*4+7 > sizeof(bitD.bitContainer)*8)    /* This test must be static */
-            { if (BITv06_reloadDStream(&bitD) > BITv06_DStream_unfinished) { op+=2; break; } }
-
-        op[2] = FSEv06_GETSYMBOL(&state1);
-
-        if (FSEv06_MAX_TABLELOG*2+7 > sizeof(bitD.bitContainer)*8)    /* This test must be static */
-            BITv06_reloadDStream(&bitD);
-
-        op[3] = FSEv06_GETSYMBOL(&state2);
-    }
-
-    /* tail */
-    /* note : BITv06_reloadDStream(&bitD) >= FSEv06_DStream_partiallyFilled; Ends at exactly BITv06_DStream_completed */
-    while (1) {
-        if (op>(omax-2)) return ERROR(dstSize_tooSmall);
-
-        *op++ = FSEv06_GETSYMBOL(&state1);
-
-        if (BITv06_reloadDStream(&bitD)==BITv06_DStream_overflow) {
-            *op++ = FSEv06_GETSYMBOL(&state2);
-            break;
-        }
-
-        if (op>(omax-2)) return ERROR(dstSize_tooSmall);
-
-        *op++ = FSEv06_GETSYMBOL(&state2);
-
-        if (BITv06_reloadDStream(&bitD)==BITv06_DStream_overflow) {
-            *op++ = FSEv06_GETSYMBOL(&state1);
-            break;
-    }   }
-
-    return op-ostart;
-}
-
-
-size_t FSEv06_decompress_usingDTable(void* dst, size_t originalSize,
-                            const void* cSrc, size_t cSrcSize,
-                            const FSEv06_DTable* dt)
-{
-    const void* ptr = dt;
-    const FSEv06_DTableHeader* DTableH = (const FSEv06_DTableHeader*)ptr;
-    const U32 fastMode = DTableH->fastMode;
-
-    /* select fast mode (static) */
-    if (fastMode) return FSEv06_decompress_usingDTable_generic(dst, originalSize, cSrc, cSrcSize, dt, 1);
-    return FSEv06_decompress_usingDTable_generic(dst, originalSize, cSrc, cSrcSize, dt, 0);
-}
-
-
-size_t FSEv06_decompress(void* dst, size_t maxDstSize, const void* cSrc, size_t cSrcSize)
-{
-    const BYTE* const istart = (const BYTE*)cSrc;
-    const BYTE* ip = istart;
-    short counting[FSEv06_MAX_SYMBOL_VALUE+1];
-    DTable_max_t dt;   /* Static analyzer seems unable to understand this table will be properly initialized later */
-    unsigned tableLog;
-    unsigned maxSymbolValue = FSEv06_MAX_SYMBOL_VALUE;
-
-    if (cSrcSize<2) return ERROR(srcSize_wrong);   /* too small input size */
-
-    /* normal FSE decoding mode */
-    {   size_t const NCountLength = FSEv06_readNCount (counting, &maxSymbolValue, &tableLog, istart, cSrcSize);
-        if (FSEv06_isError(NCountLength)) return NCountLength;
-        if (NCountLength >= cSrcSize) return ERROR(srcSize_wrong);   /* too small input size */
-        ip += NCountLength;
-        cSrcSize -= NCountLength;
-    }
-
-    { size_t const errorCode = FSEv06_buildDTable (dt, counting, maxSymbolValue, tableLog);
-      if (FSEv06_isError(errorCode)) return errorCode; }
-
-    return FSEv06_decompress_usingDTable (dst, maxDstSize, ip, cSrcSize, dt);   /* always return, even if it is an error code */
-}
-
-
-
-#endif   /* FSEv06_COMMONDEFS_ONLY */
-/* ******************************************************************
-   Huffman coder, part of New Generation Entropy library
-   header file
-   Copyright (C) 2013-2016, Yann Collet.
-
-   BSD 2-Clause License (http://www.opensource.org/licenses/bsd-license.php)
-
-   Redistribution and use in source and binary forms, with or without
-   modification, are permitted provided that the following conditions are
-   met:
-
-       * Redistributions of source code must retain the above copyright
-   notice, this list of conditions and the following disclaimer.
-       * Redistributions in binary form must reproduce the above
-   copyright notice, this list of conditions and the following disclaimer
-   in the documentation and/or other materials provided with the
-   distribution.
-
-   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
-   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
-   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
-   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
-   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
-   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
-   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
-   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
-   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
-   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
-   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
-
-   You can contact the author at :
-   - Source repository : https://github.com/Cyan4973/FiniteStateEntropy
-****************************************************************** */
-#ifndef HUFv06_H
-#define HUFv06_H
-
-#if defined (__cplusplus)
-extern "C" {
-#endif
-
-
-/* ****************************************
-*  HUF simple functions
-******************************************/
-size_t HUFv06_decompress(void* dst,  size_t dstSize,
-                const void* cSrc, size_t cSrcSize);
-/*
-HUFv06_decompress() :
-    Decompress HUF data from buffer 'cSrc', of size 'cSrcSize',
-    into already allocated destination buffer 'dst', of size 'dstSize'.
-    `dstSize` : must be the **exact** size of original (uncompressed) data.
-    Note : in contrast with FSE, HUFv06_decompress can regenerate
-           RLE (cSrcSize==1) and uncompressed (cSrcSize==dstSize) data,
-           because it knows size to regenerate.
-    @return : size of regenerated data (== dstSize)
-              or an error code, which can be tested using HUFv06_isError()
-*/
-
-
-/* ****************************************
-*  Tool functions
-******************************************/
-size_t HUFv06_compressBound(size_t size);       /**< maximum compressed size */
-
-
-#if defined (__cplusplus)
-}
-#endif
-
-#endif   /* HUFv06_H */
-/* ******************************************************************
-   Huffman codec, part of New Generation Entropy library
-   header file, for static linking only
-   Copyright (C) 2013-2016, Yann Collet
-
-   BSD 2-Clause License (http://www.opensource.org/licenses/bsd-license.php)
-
-   Redistribution and use in source and binary forms, with or without
-   modification, are permitted provided that the following conditions are
-   met:
-
-       * Redistributions of source code must retain the above copyright
-   notice, this list of conditions and the following disclaimer.
-       * Redistributions in binary form must reproduce the above
-   copyright notice, this list of conditions and the following disclaimer
-   in the documentation and/or other materials provided with the
-   distribution.
-
-   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
-   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
-   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
-   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
-   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
-   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
-   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
-   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
-   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
-   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
-   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
-
-   You can contact the author at :
-   - Source repository : https://github.com/Cyan4973/FiniteStateEntropy
-****************************************************************** */
-#ifndef HUFv06_STATIC_H
-#define HUFv06_STATIC_H
+    BITv07_DStream_t bitD;
+    FSEv07_DState_t state1;
+    FSEv07_DState_t state2;
 
-#if defined (__cplusplus)
-extern "C" {
-#endif
+    /* Init */
+    { size_t const errorCode = BITv07_initDStream(&bitD, cSrc, cSrcSize);   /* replaced last arg by maxCompressed Size */
+      if (FSEv07_isError(errorCode)) return errorCode; }
 
+    FSEv07_initDState(&state1, &bitD, dt);
+    FSEv07_initDState(&state2, &bitD, dt);
 
-/* ****************************************
-*  Static allocation
-******************************************/
-/* HUF buffer bounds */
-#define HUFv06_CTABLEBOUND 129
-#define HUFv06_BLOCKBOUND(size) (size + (size>>8) + 8)   /* only true if incompressible pre-filtered with fast heuristic */
-#define HUFv06_COMPRESSBOUND(size) (HUFv06_CTABLEBOUND + HUFv06_BLOCKBOUND(size))   /* Macro version, useful for static allocation */
+#define FSEv07_GETSYMBOL(statePtr) fast ? FSEv07_decodeSymbolFast(statePtr, &bitD) : FSEv07_decodeSymbol(statePtr, &bitD)
 
-/* static allocation of HUF's DTable */
-#define HUFv06_DTABLE_SIZE(maxTableLog)   (1 + (1<<maxTableLog))
-#define HUFv06_CREATE_STATIC_DTABLEX2(DTable, maxTableLog) \
-        unsigned short DTable[HUFv06_DTABLE_SIZE(maxTableLog)] = { maxTableLog }
-#define HUFv06_CREATE_STATIC_DTABLEX4(DTable, maxTableLog) \
-        unsigned int DTable[HUFv06_DTABLE_SIZE(maxTableLog)] = { maxTableLog }
-#define HUFv06_CREATE_STATIC_DTABLEX6(DTable, maxTableLog) \
-        unsigned int DTable[HUFv06_DTABLE_SIZE(maxTableLog) * 3 / 2] = { maxTableLog }
+    /* 4 symbols per loop */
+    for ( ; (BITv07_reloadDStream(&bitD)==BITv07_DStream_unfinished) && (op<olimit) ; op+=4) {
+        op[0] = FSEv07_GETSYMBOL(&state1);
 
+        if (FSEv07_MAX_TABLELOG*2+7 > sizeof(bitD.bitContainer)*8)    /* This test must be static */
+            BITv07_reloadDStream(&bitD);
 
-/* ****************************************
-*  Advanced decompression functions
-******************************************/
-size_t HUFv06_decompress4X2 (void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize);   /* single-symbol decoder */
-size_t HUFv06_decompress4X4 (void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize);   /* double-symbols decoder */
+        op[1] = FSEv07_GETSYMBOL(&state2);
 
+        if (FSEv07_MAX_TABLELOG*4+7 > sizeof(bitD.bitContainer)*8)    /* This test must be static */
+            { if (BITv07_reloadDStream(&bitD) > BITv07_DStream_unfinished) { op+=2; break; } }
 
+        op[2] = FSEv07_GETSYMBOL(&state1);
 
-/*!
-HUFv06_decompress() does the following:
-1. select the decompression algorithm (X2, X4, X6) based on pre-computed heuristics
-2. build Huffman table from save, using HUFv06_readDTableXn()
-3. decode 1 or 4 segments in parallel using HUFv06_decompressSXn_usingDTable
-*/
-size_t HUFv06_readDTableX2 (unsigned short* DTable, const void* src, size_t srcSize);
-size_t HUFv06_readDTableX4 (unsigned* DTable, const void* src, size_t srcSize);
+        if (FSEv07_MAX_TABLELOG*2+7 > sizeof(bitD.bitContainer)*8)    /* This test must be static */
+            BITv07_reloadDStream(&bitD);
 
-size_t HUFv06_decompress4X2_usingDTable(void* dst, size_t maxDstSize, const void* cSrc, size_t cSrcSize, const unsigned short* DTable);
-size_t HUFv06_decompress4X4_usingDTable(void* dst, size_t maxDstSize, const void* cSrc, size_t cSrcSize, const unsigned* DTable);
+        op[3] = FSEv07_GETSYMBOL(&state2);
+    }
 
+    /* tail */
+    /* note : BITv07_reloadDStream(&bitD) >= FSEv07_DStream_partiallyFilled; Ends at exactly BITv07_DStream_completed */
+    while (1) {
+        if (op>(omax-2)) return ERROR(dstSize_tooSmall);
 
-/* single stream variants */
-size_t HUFv06_decompress1X2 (void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize);   /* single-symbol decoder */
-size_t HUFv06_decompress1X4 (void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize);   /* double-symbol decoder */
+        *op++ = FSEv07_GETSYMBOL(&state1);
 
-size_t HUFv06_decompress1X2_usingDTable(void* dst, size_t maxDstSize, const void* cSrc, size_t cSrcSize, const unsigned short* DTable);
-size_t HUFv06_decompress1X4_usingDTable(void* dst, size_t maxDstSize, const void* cSrc, size_t cSrcSize, const unsigned* DTable);
+        if (BITv07_reloadDStream(&bitD)==BITv07_DStream_overflow) {
+            *op++ = FSEv07_GETSYMBOL(&state2);
+            break;
+        }
 
+        if (op>(omax-2)) return ERROR(dstSize_tooSmall);
 
+        *op++ = FSEv07_GETSYMBOL(&state2);
 
-/* **************************************************************
-*  Constants
-****************************************************************/
-#define HUFv06_ABSOLUTEMAX_TABLELOG  16   /* absolute limit of HUFv06_MAX_TABLELOG. Beyond that value, code does not work */
-#define HUFv06_MAX_TABLELOG  12           /* max configured tableLog (for static allocation); can be modified up to HUFv06_ABSOLUTEMAX_TABLELOG */
-#define HUFv06_DEFAULT_TABLELOG  HUFv06_MAX_TABLELOG   /* tableLog by default, when not specified */
-#define HUFv06_MAX_SYMBOL_VALUE 255
-#if (HUFv06_MAX_TABLELOG > HUFv06_ABSOLUTEMAX_TABLELOG)
-#  error "HUFv06_MAX_TABLELOG is too large !"
-#endif
+        if (BITv07_reloadDStream(&bitD)==BITv07_DStream_overflow) {
+            *op++ = FSEv07_GETSYMBOL(&state1);
+            break;
+    }   }
 
+    return op-ostart;
+}
 
 
-/*! HUFv06_readStats() :
-    Read compact Huffman tree, saved by HUFv06_writeCTable().
-    `huffWeight` is destination buffer.
-    @return : size read from `src`
-*/
-MEM_STATIC size_t HUFv06_readStats(BYTE* huffWeight, size_t hwSize, U32* rankStats,
-                            U32* nbSymbolsPtr, U32* tableLogPtr,
-                            const void* src, size_t srcSize)
+size_t FSEv07_decompress_usingDTable(void* dst, size_t originalSize,
+                            const void* cSrc, size_t cSrcSize,
+                            const FSEv07_DTable* dt)
 {
-    U32 weightTotal;
-    const BYTE* ip = (const BYTE*) src;
-    size_t iSize = ip[0];
-    size_t oSize;
+    const void* ptr = dt;
+    const FSEv07_DTableHeader* DTableH = (const FSEv07_DTableHeader*)ptr;
+    const U32 fastMode = DTableH->fastMode;
 
-    //memset(huffWeight, 0, hwSize);   /* is not necessary, even though some analyzer complain ... */
+    /* select fast mode (static) */
+    if (fastMode) return FSEv07_decompress_usingDTable_generic(dst, originalSize, cSrc, cSrcSize, dt, 1);
+    return FSEv07_decompress_usingDTable_generic(dst, originalSize, cSrc, cSrcSize, dt, 0);
+}
 
-    if (iSize >= 128)  { /* special header */
-        if (iSize >= (242)) {  /* RLE */
-            static U32 l[14] = { 1, 2, 3, 4, 7, 8, 15, 16, 31, 32, 63, 64, 127, 128 };
-            oSize = l[iSize-242];
-            memset(huffWeight, 1, hwSize);
-            iSize = 0;
-        }
-        else {   /* Incompressible */
-            oSize = iSize - 127;
-            iSize = ((oSize+1)/2);
-            if (iSize+1 > srcSize) return ERROR(srcSize_wrong);
-            if (oSize >= hwSize) return ERROR(corruption_detected);
-            ip += 1;
-            {   U32 n;
-                for (n=0; n<oSize; n+=2) {
-                    huffWeight[n]   = ip[n/2] >> 4;
-                    huffWeight[n+1] = ip[n/2] & 15;
-    }   }   }   }
-    else  {   /* header compressed with FSE (normal case) */
-        if (iSize+1 > srcSize) return ERROR(srcSize_wrong);
-        oSize = FSEv06_decompress(huffWeight, hwSize-1, ip+1, iSize);   /* max (hwSize-1) values decoded, as last one is implied */
-        if (FSEv06_isError(oSize)) return oSize;
-    }
 
-    /* collect weight stats */
-    memset(rankStats, 0, (HUFv06_ABSOLUTEMAX_TABLELOG + 1) * sizeof(U32));
-    weightTotal = 0;
-    {   U32 n; for (n=0; n<oSize; n++) {
-            if (huffWeight[n] >= HUFv06_ABSOLUTEMAX_TABLELOG) return ERROR(corruption_detected);
-            rankStats[huffWeight[n]]++;
-            weightTotal += (1 << huffWeight[n]) >> 1;
-    }   }
+size_t FSEv07_decompress(void* dst, size_t maxDstSize, const void* cSrc, size_t cSrcSize)
+{
+    const BYTE* const istart = (const BYTE*)cSrc;
+    const BYTE* ip = istart;
+    short counting[FSEv07_MAX_SYMBOL_VALUE+1];
+    DTable_max_t dt;   /* Static analyzer seems unable to understand this table will be properly initialized later */
+    unsigned tableLog;
+    unsigned maxSymbolValue = FSEv07_MAX_SYMBOL_VALUE;
 
-    /* get last non-null symbol weight (implied, total must be 2^n) */
-    {   U32 const tableLog = BITv06_highbit32(weightTotal) + 1;
-        if (tableLog > HUFv06_ABSOLUTEMAX_TABLELOG) return ERROR(corruption_detected);
-        *tableLogPtr = tableLog;
-        /* determine last weight */
-        {   U32 const total = 1 << tableLog;
-            U32 const rest = total - weightTotal;
-            U32 const verif = 1 << BITv06_highbit32(rest);
-            U32 const lastWeight = BITv06_highbit32(rest) + 1;
-            if (verif != rest) return ERROR(corruption_detected);    /* last value must be a clean power of 2 */
-            huffWeight[oSize] = (BYTE)lastWeight;
-            rankStats[lastWeight]++;
-    }   }
+    if (cSrcSize<2) return ERROR(srcSize_wrong);   /* too small input size */
 
-    /* check tree construction validity */
-    if ((rankStats[1] < 2) || (rankStats[1] & 1)) return ERROR(corruption_detected);   /* by construction : at least 2 elts of rank 1, must be even */
+    /* normal FSE decoding mode */
+    {   size_t const NCountLength = FSEv07_readNCount (counting, &maxSymbolValue, &tableLog, istart, cSrcSize);
+        if (FSEv07_isError(NCountLength)) return NCountLength;
+        if (NCountLength >= cSrcSize) return ERROR(srcSize_wrong);   /* too small input size */
+        ip += NCountLength;
+        cSrcSize -= NCountLength;
+    }
 
-    /* results */
-    *nbSymbolsPtr = (U32)(oSize+1);
-    return iSize+1;
+    { size_t const errorCode = FSEv07_buildDTable (dt, counting, maxSymbolValue, tableLog);
+      if (FSEv07_isError(errorCode)) return errorCode; }
+
+    return FSEv07_decompress_usingDTable (dst, maxDstSize, ip, cSrcSize, dt);   /* always return, even if it is an error code */
 }
 
 
 
-#if defined (__cplusplus)
-}
-#endif
+#endif   /* FSEv07_COMMONDEFS_ONLY */
 
-#endif /* HUFv06_STATIC_H */
 /* ******************************************************************
    Huffman decoder, part of New Generation Entropy library
    Copyright (C) 2013-2016, Yann Collet.
@@ -2466,155 +2198,177 @@ MEM_STATIC size_t HUFv06_readStats(BYTE* huffWeight, size_t hwSize, U32* rankSta
 /* **************************************************************
 *  Error Management
 ****************************************************************/
-#define HUFv06_STATIC_ASSERT(c) { enum { HUFv06_static_assert = 1/(int)(!!(c)) }; }   /* use only *after* variable declarations */
-
+#define HUFv07_STATIC_ASSERT(c) { enum { HUFv07_static_assert = 1/(int)(!!(c)) }; }   /* use only *after* variable declarations */
 
 
-/* *******************************************************
-*  HUF : Huffman block decompression
-*********************************************************/
-typedef struct { BYTE byte; BYTE nbBits; } HUFv06_DEltX2;   /* single-symbol decoding */
-
-typedef struct { U16 sequence; BYTE nbBits; BYTE length; } HUFv06_DEltX4;  /* double-symbols decoding */
+/*-***************************/
+/*  generic DTableDesc       */
+/*-***************************/
 
-typedef struct { BYTE symbol; BYTE weight; } sortedSymbol_t;
+typedef struct { BYTE maxTableLog; BYTE tableType; BYTE tableLog; BYTE reserved; } DTableDesc;
 
+static DTableDesc HUFv07_getDTableDesc(const HUFv07_DTable* table)
+{
+    DTableDesc dtd;
+    memcpy(&dtd, table, sizeof(dtd));
+    return dtd;
+}
 
 
 /*-***************************/
 /*  single-symbol decoding   */
 /*-***************************/
 
-size_t HUFv06_readDTableX2 (U16* DTable, const void* src, size_t srcSize)
+typedef struct { BYTE byte; BYTE nbBits; } HUFv07_DEltX2;   /* single-symbol decoding */
+
+size_t HUFv07_readDTableX2 (HUFv07_DTable* DTable, const void* src, size_t srcSize)
 {
-    BYTE huffWeight[HUFv06_MAX_SYMBOL_VALUE + 1];
-    U32 rankVal[HUFv06_ABSOLUTEMAX_TABLELOG + 1];   /* large enough for values from 0 to 16 */
+    BYTE huffWeight[HUFv07_SYMBOLVALUE_MAX + 1];
+    U32 rankVal[HUFv07_TABLELOG_ABSOLUTEMAX + 1];   /* large enough for values from 0 to 16 */
     U32 tableLog = 0;
-    size_t iSize;
     U32 nbSymbols = 0;
-    U32 n;
-    U32 nextRankStart;
+    size_t iSize;
     void* const dtPtr = DTable + 1;
-    HUFv06_DEltX2* const dt = (HUFv06_DEltX2*)dtPtr;
+    HUFv07_DEltX2* const dt = (HUFv07_DEltX2*)dtPtr;
 
-    HUFv06_STATIC_ASSERT(sizeof(HUFv06_DEltX2) == sizeof(U16));   /* if compilation fails here, assertion is false */
+    HUFv07_STATIC_ASSERT(sizeof(DTableDesc) == sizeof(HUFv07_DTable));
     //memset(huffWeight, 0, sizeof(huffWeight));   /* is not necessary, even though some analyzer complain ... */
 
-    iSize = HUFv06_readStats(huffWeight, HUFv06_MAX_SYMBOL_VALUE + 1, rankVal, &nbSymbols, &tableLog, src, srcSize);
-    if (HUFv06_isError(iSize)) return iSize;
+    iSize = HUFv07_readStats(huffWeight, HUFv07_SYMBOLVALUE_MAX + 1, rankVal, &nbSymbols, &tableLog, src, srcSize);
+    if (HUFv07_isError(iSize)) return iSize;
 
-    /* check result */
-    if (tableLog > DTable[0]) return ERROR(tableLog_tooLarge);   /* DTable is too small */
-    DTable[0] = (U16)tableLog;   /* maybe should separate sizeof allocated DTable, from used size of DTable, in case of re-use */
+    /* Table header */
+    {   DTableDesc dtd = HUFv07_getDTableDesc(DTable);
+        if (tableLog > (U32)(dtd.maxTableLog+1)) return ERROR(tableLog_tooLarge);   /* DTable too small, huffman tree cannot fit in */
+        dtd.tableType = 0;
+        dtd.tableLog = (BYTE)tableLog;
+        memcpy(DTable, &dtd, sizeof(dtd));
+    }
 
     /* Prepare ranks */
-    nextRankStart = 0;
-    for (n=1; n<tableLog+1; n++) {
-        U32 current = nextRankStart;
-        nextRankStart += (rankVal[n] << (n-1));
-        rankVal[n] = current;
-    }
+    {   U32 n, nextRankStart = 0;
+        for (n=1; n<tableLog+1; n++) {
+            U32 current = nextRankStart;
+            nextRankStart += (rankVal[n] << (n-1));
+            rankVal[n] = current;
+    }   }
 
     /* fill DTable */
-    for (n=0; n<nbSymbols; n++) {
-        const U32 w = huffWeight[n];
-        const U32 length = (1 << w) >> 1;
-        U32 i;
-        HUFv06_DEltX2 D;
-        D.byte = (BYTE)n; D.nbBits = (BYTE)(tableLog + 1 - w);
-        for (i = rankVal[w]; i < rankVal[w] + length; i++)
-            dt[i] = D;
-        rankVal[w] += length;
-    }
+    {   U32 n;
+        for (n=0; n<nbSymbols; n++) {
+            U32 const w = huffWeight[n];
+            U32 const length = (1 << w) >> 1;
+            U32 i;
+            HUFv07_DEltX2 D;
+            D.byte = (BYTE)n; D.nbBits = (BYTE)(tableLog + 1 - w);
+            for (i = rankVal[w]; i < rankVal[w] + length; i++)
+                dt[i] = D;
+            rankVal[w] += length;
+    }   }
 
     return iSize;
 }
 
 
-static BYTE HUFv06_decodeSymbolX2(BITv06_DStream_t* Dstream, const HUFv06_DEltX2* dt, const U32 dtLog)
+static BYTE HUFv07_decodeSymbolX2(BITv07_DStream_t* Dstream, const HUFv07_DEltX2* dt, const U32 dtLog)
 {
-    const size_t val = BITv06_lookBitsFast(Dstream, dtLog); /* note : dtLog >= 1 */
-    const BYTE c = dt[val].byte;
-    BITv06_skipBits(Dstream, dt[val].nbBits);
+    size_t const val = BITv07_lookBitsFast(Dstream, dtLog); /* note : dtLog >= 1 */
+    BYTE const c = dt[val].byte;
+    BITv07_skipBits(Dstream, dt[val].nbBits);
     return c;
 }
 
-#define HUFv06_DECODE_SYMBOLX2_0(ptr, DStreamPtr) \
-    *ptr++ = HUFv06_decodeSymbolX2(DStreamPtr, dt, dtLog)
+#define HUFv07_DECODE_SYMBOLX2_0(ptr, DStreamPtr) \
+    *ptr++ = HUFv07_decodeSymbolX2(DStreamPtr, dt, dtLog)
 
-#define HUFv06_DECODE_SYMBOLX2_1(ptr, DStreamPtr) \
-    if (MEM_64bits() || (HUFv06_MAX_TABLELOG<=12)) \
-        HUFv06_DECODE_SYMBOLX2_0(ptr, DStreamPtr)
+#define HUFv07_DECODE_SYMBOLX2_1(ptr, DStreamPtr) \
+    if (MEM_64bits() || (HUFv07_TABLELOG_MAX<=12)) \
+        HUFv07_DECODE_SYMBOLX2_0(ptr, DStreamPtr)
 
-#define HUFv06_DECODE_SYMBOLX2_2(ptr, DStreamPtr) \
+#define HUFv07_DECODE_SYMBOLX2_2(ptr, DStreamPtr) \
     if (MEM_64bits()) \
-        HUFv06_DECODE_SYMBOLX2_0(ptr, DStreamPtr)
+        HUFv07_DECODE_SYMBOLX2_0(ptr, DStreamPtr)
 
-static inline size_t HUFv06_decodeStreamX2(BYTE* p, BITv06_DStream_t* const bitDPtr, BYTE* const pEnd, const HUFv06_DEltX2* const dt, const U32 dtLog)
+static inline size_t HUFv07_decodeStreamX2(BYTE* p, BITv07_DStream_t* const bitDPtr, BYTE* const pEnd, const HUFv07_DEltX2* const dt, const U32 dtLog)
 {
     BYTE* const pStart = p;
 
     /* up to 4 symbols at a time */
-    while ((BITv06_reloadDStream(bitDPtr) == BITv06_DStream_unfinished) && (p <= pEnd-4)) {
-        HUFv06_DECODE_SYMBOLX2_2(p, bitDPtr);
-        HUFv06_DECODE_SYMBOLX2_1(p, bitDPtr);
-        HUFv06_DECODE_SYMBOLX2_2(p, bitDPtr);
-        HUFv06_DECODE_SYMBOLX2_0(p, bitDPtr);
+    while ((BITv07_reloadDStream(bitDPtr) == BITv07_DStream_unfinished) && (p <= pEnd-4)) {
+        HUFv07_DECODE_SYMBOLX2_2(p, bitDPtr);
+        HUFv07_DECODE_SYMBOLX2_1(p, bitDPtr);
+        HUFv07_DECODE_SYMBOLX2_2(p, bitDPtr);
+        HUFv07_DECODE_SYMBOLX2_0(p, bitDPtr);
     }
 
     /* closer to the end */
-    while ((BITv06_reloadDStream(bitDPtr) == BITv06_DStream_unfinished) && (p < pEnd))
-        HUFv06_DECODE_SYMBOLX2_0(p, bitDPtr);
+    while ((BITv07_reloadDStream(bitDPtr) == BITv07_DStream_unfinished) && (p < pEnd))
+        HUFv07_DECODE_SYMBOLX2_0(p, bitDPtr);
 
     /* no more data to retrieve from bitstream, hence no need to reload */
     while (p < pEnd)
-        HUFv06_DECODE_SYMBOLX2_0(p, bitDPtr);
+        HUFv07_DECODE_SYMBOLX2_0(p, bitDPtr);
 
     return pEnd-pStart;
 }
 
-size_t HUFv06_decompress1X2_usingDTable(
+static size_t HUFv07_decompress1X2_usingDTable_internal(
           void* dst,  size_t dstSize,
     const void* cSrc, size_t cSrcSize,
-    const U16* DTable)
+    const HUFv07_DTable* DTable)
 {
     BYTE* op = (BYTE*)dst;
     BYTE* const oend = op + dstSize;
-    const U32 dtLog = DTable[0];
-    const void* dtPtr = DTable;
-    const HUFv06_DEltX2* const dt = ((const HUFv06_DEltX2*)dtPtr)+1;
-    BITv06_DStream_t bitD;
+    const void* dtPtr = DTable + 1;
+    const HUFv07_DEltX2* const dt = (const HUFv07_DEltX2*)dtPtr;
+    BITv07_DStream_t bitD;
+    DTableDesc const dtd = HUFv07_getDTableDesc(DTable);
+    U32 const dtLog = dtd.tableLog;
 
-    { size_t const errorCode = BITv06_initDStream(&bitD, cSrc, cSrcSize);
-      if (HUFv06_isError(errorCode)) return errorCode; }
+    { size_t const errorCode = BITv07_initDStream(&bitD, cSrc, cSrcSize);
+      if (HUFv07_isError(errorCode)) return errorCode; }
 
-    HUFv06_decodeStreamX2(op, &bitD, oend, dt, dtLog);
+    HUFv07_decodeStreamX2(op, &bitD, oend, dt, dtLog);
 
     /* check */
-    if (!BITv06_endOfDStream(&bitD)) return ERROR(corruption_detected);
+    if (!BITv07_endOfDStream(&bitD)) return ERROR(corruption_detected);
 
     return dstSize;
 }
 
-size_t HUFv06_decompress1X2 (void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize)
+size_t HUFv07_decompress1X2_usingDTable(
+          void* dst,  size_t dstSize,
+    const void* cSrc, size_t cSrcSize,
+    const HUFv07_DTable* DTable)
+{
+    DTableDesc dtd = HUFv07_getDTableDesc(DTable);
+    if (dtd.tableType != 0) return ERROR(GENERIC);
+    return HUFv07_decompress1X2_usingDTable_internal(dst, dstSize, cSrc, cSrcSize, DTable);
+}
+
+size_t HUFv07_decompress1X2_DCtx (HUFv07_DTable* DCtx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize)
 {
-    HUFv06_CREATE_STATIC_DTABLEX2(DTable, HUFv06_MAX_TABLELOG);
     const BYTE* ip = (const BYTE*) cSrc;
 
-    size_t const errorCode = HUFv06_readDTableX2 (DTable, cSrc, cSrcSize);
-    if (HUFv06_isError(errorCode)) return errorCode;
-    if (errorCode >= cSrcSize) return ERROR(srcSize_wrong);
-    ip += errorCode;
-    cSrcSize -= errorCode;
+    size_t const hSize = HUFv07_readDTableX2 (DCtx, cSrc, cSrcSize);
+    if (HUFv07_isError(hSize)) return hSize;
+    if (hSize >= cSrcSize) return ERROR(srcSize_wrong);
+    ip += hSize; cSrcSize -= hSize;
+
+    return HUFv07_decompress1X2_usingDTable_internal (dst, dstSize, ip, cSrcSize, DCtx);
+}
 
-    return HUFv06_decompress1X2_usingDTable (dst, dstSize, ip, cSrcSize, DTable);
+size_t HUFv07_decompress1X2 (void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize)
+{
+    HUFv07_CREATE_STATIC_DTABLEX2(DTable, HUFv07_TABLELOG_MAX);
+    return HUFv07_decompress1X2_DCtx (DTable, dst, dstSize, cSrc, cSrcSize);
 }
 
 
-size_t HUFv06_decompress4X2_usingDTable(
+static size_t HUFv07_decompress4X2_usingDTable_internal(
           void* dst,  size_t dstSize,
     const void* cSrc, size_t cSrcSize,
-    const U16* DTable)
+    const HUFv07_DTable* DTable)
 {
     /* Check */
     if (cSrcSize < 10) return ERROR(corruption_detected);  /* strict minimum : jump table + 1 byte per stream */
@@ -2622,20 +2376,18 @@ size_t HUFv06_decompress4X2_usingDTable(
     {   const BYTE* const istart = (const BYTE*) cSrc;
         BYTE* const ostart = (BYTE*) dst;
         BYTE* const oend = ostart + dstSize;
-        const void* const dtPtr = DTable;
-        const HUFv06_DEltX2* const dt = ((const HUFv06_DEltX2*)dtPtr) +1;
-        const U32 dtLog = DTable[0];
-        size_t errorCode;
+        const void* const dtPtr = DTable + 1;
+        const HUFv07_DEltX2* const dt = (const HUFv07_DEltX2*)dtPtr;
 
         /* Init */
-        BITv06_DStream_t bitD1;
-        BITv06_DStream_t bitD2;
-        BITv06_DStream_t bitD3;
-        BITv06_DStream_t bitD4;
-        const size_t length1 = MEM_readLE16(istart);
-        const size_t length2 = MEM_readLE16(istart+2);
-        const size_t length3 = MEM_readLE16(istart+4);
-        size_t length4;
+        BITv07_DStream_t bitD1;
+        BITv07_DStream_t bitD2;
+        BITv07_DStream_t bitD3;
+        BITv07_DStream_t bitD4;
+        size_t const length1 = MEM_readLE16(istart);
+        size_t const length2 = MEM_readLE16(istart+2);
+        size_t const length3 = MEM_readLE16(istart+4);
+        size_t const length4 = cSrcSize - (length1 + length2 + length3 + 6);
         const BYTE* const istart1 = istart + 6;  /* jumpTable */
         const BYTE* const istart2 = istart1 + length1;
         const BYTE* const istart3 = istart2 + length2;
@@ -2649,38 +2401,39 @@ size_t HUFv06_decompress4X2_usingDTable(
         BYTE* op3 = opStart3;
         BYTE* op4 = opStart4;
         U32 endSignal;
+        DTableDesc const dtd = HUFv07_getDTableDesc(DTable);
+        U32 const dtLog = dtd.tableLog;
 
-        length4 = cSrcSize - (length1 + length2 + length3 + 6);
         if (length4 > cSrcSize) return ERROR(corruption_detected);   /* overflow */
-        errorCode = BITv06_initDStream(&bitD1, istart1, length1);
-        if (HUFv06_isError(errorCode)) return errorCode;
-        errorCode = BITv06_initDStream(&bitD2, istart2, length2);
-        if (HUFv06_isError(errorCode)) return errorCode;
-        errorCode = BITv06_initDStream(&bitD3, istart3, length3);
-        if (HUFv06_isError(errorCode)) return errorCode;
-        errorCode = BITv06_initDStream(&bitD4, istart4, length4);
-        if (HUFv06_isError(errorCode)) return errorCode;
+        { size_t const errorCode = BITv07_initDStream(&bitD1, istart1, length1);
+          if (HUFv07_isError(errorCode)) return errorCode; }
+        { size_t const errorCode = BITv07_initDStream(&bitD2, istart2, length2);
+          if (HUFv07_isError(errorCode)) return errorCode; }
+        { size_t const errorCode = BITv07_initDStream(&bitD3, istart3, length3);
+          if (HUFv07_isError(errorCode)) return errorCode; }
+        { size_t const errorCode = BITv07_initDStream(&bitD4, istart4, length4);
+          if (HUFv07_isError(errorCode)) return errorCode; }
 
         /* 16-32 symbols per loop (4-8 symbols per stream) */
-        endSignal = BITv06_reloadDStream(&bitD1) | BITv06_reloadDStream(&bitD2) | BITv06_reloadDStream(&bitD3) | BITv06_reloadDStream(&bitD4);
-        for ( ; (endSignal==BITv06_DStream_unfinished) && (op4<(oend-7)) ; ) {
-            HUFv06_DECODE_SYMBOLX2_2(op1, &bitD1);
-            HUFv06_DECODE_SYMBOLX2_2(op2, &bitD2);
-            HUFv06_DECODE_SYMBOLX2_2(op3, &bitD3);
-            HUFv06_DECODE_SYMBOLX2_2(op4, &bitD4);
-            HUFv06_DECODE_SYMBOLX2_1(op1, &bitD1);
-            HUFv06_DECODE_SYMBOLX2_1(op2, &bitD2);
-            HUFv06_DECODE_SYMBOLX2_1(op3, &bitD3);
-            HUFv06_DECODE_SYMBOLX2_1(op4, &bitD4);
-            HUFv06_DECODE_SYMBOLX2_2(op1, &bitD1);
-            HUFv06_DECODE_SYMBOLX2_2(op2, &bitD2);
-            HUFv06_DECODE_SYMBOLX2_2(op3, &bitD3);
-            HUFv06_DECODE_SYMBOLX2_2(op4, &bitD4);
-            HUFv06_DECODE_SYMBOLX2_0(op1, &bitD1);
-            HUFv06_DECODE_SYMBOLX2_0(op2, &bitD2);
-            HUFv06_DECODE_SYMBOLX2_0(op3, &bitD3);
-            HUFv06_DECODE_SYMBOLX2_0(op4, &bitD4);
-            endSignal = BITv06_reloadDStream(&bitD1) | BITv06_reloadDStream(&bitD2) | BITv06_reloadDStream(&bitD3) | BITv06_reloadDStream(&bitD4);
+        endSignal = BITv07_reloadDStream(&bitD1) | BITv07_reloadDStream(&bitD2) | BITv07_reloadDStream(&bitD3) | BITv07_reloadDStream(&bitD4);
+        for ( ; (endSignal==BITv07_DStream_unfinished) && (op4<(oend-7)) ; ) {
+            HUFv07_DECODE_SYMBOLX2_2(op1, &bitD1);
+            HUFv07_DECODE_SYMBOLX2_2(op2, &bitD2);
+            HUFv07_DECODE_SYMBOLX2_2(op3, &bitD3);
+            HUFv07_DECODE_SYMBOLX2_2(op4, &bitD4);
+            HUFv07_DECODE_SYMBOLX2_1(op1, &bitD1);
+            HUFv07_DECODE_SYMBOLX2_1(op2, &bitD2);
+            HUFv07_DECODE_SYMBOLX2_1(op3, &bitD3);
+            HUFv07_DECODE_SYMBOLX2_1(op4, &bitD4);
+            HUFv07_DECODE_SYMBOLX2_2(op1, &bitD1);
+            HUFv07_DECODE_SYMBOLX2_2(op2, &bitD2);
+            HUFv07_DECODE_SYMBOLX2_2(op3, &bitD3);
+            HUFv07_DECODE_SYMBOLX2_2(op4, &bitD4);
+            HUFv07_DECODE_SYMBOLX2_0(op1, &bitD1);
+            HUFv07_DECODE_SYMBOLX2_0(op2, &bitD2);
+            HUFv07_DECODE_SYMBOLX2_0(op3, &bitD3);
+            HUFv07_DECODE_SYMBOLX2_0(op4, &bitD4);
+            endSignal = BITv07_reloadDStream(&bitD1) | BITv07_reloadDStream(&bitD2) | BITv07_reloadDStream(&bitD3) | BITv07_reloadDStream(&bitD4);
         }
 
         /* check corruption */
@@ -2690,13 +2443,13 @@ size_t HUFv06_decompress4X2_usingDTable(
         /* note : op4 supposed already verified within main loop */
 
         /* finish bitStreams one by one */
-        HUFv06_decodeStreamX2(op1, &bitD1, opStart2, dt, dtLog);
-        HUFv06_decodeStreamX2(op2, &bitD2, opStart3, dt, dtLog);
-        HUFv06_decodeStreamX2(op3, &bitD3, opStart4, dt, dtLog);
-        HUFv06_decodeStreamX2(op4, &bitD4, oend,     dt, dtLog);
+        HUFv07_decodeStreamX2(op1, &bitD1, opStart2, dt, dtLog);
+        HUFv07_decodeStreamX2(op2, &bitD2, opStart3, dt, dtLog);
+        HUFv07_decodeStreamX2(op3, &bitD3, opStart4, dt, dtLog);
+        HUFv07_decodeStreamX2(op4, &bitD4, oend,     dt, dtLog);
 
         /* check */
-        endSignal = BITv06_endOfDStream(&bitD1) & BITv06_endOfDStream(&bitD2) & BITv06_endOfDStream(&bitD3) & BITv06_endOfDStream(&bitD4);
+        endSignal = BITv07_endOfDStream(&bitD1) & BITv07_endOfDStream(&bitD2) & BITv07_endOfDStream(&bitD3) & BITv07_endOfDStream(&bitD4);
         if (!endSignal) return ERROR(corruption_detected);
 
         /* decoded size */
@@ -2705,32 +2458,50 @@ size_t HUFv06_decompress4X2_usingDTable(
 }
 
 
-size_t HUFv06_decompress4X2 (void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize)
+size_t HUFv07_decompress4X2_usingDTable(
+          void* dst,  size_t dstSize,
+    const void* cSrc, size_t cSrcSize,
+    const HUFv07_DTable* DTable)
+{
+    DTableDesc dtd = HUFv07_getDTableDesc(DTable);
+    if (dtd.tableType != 0) return ERROR(GENERIC);
+    return HUFv07_decompress4X2_usingDTable_internal(dst, dstSize, cSrc, cSrcSize, DTable);
+}
+
+
+size_t HUFv07_decompress4X2_DCtx (HUFv07_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize)
 {
-    HUFv06_CREATE_STATIC_DTABLEX2(DTable, HUFv06_MAX_TABLELOG);
     const BYTE* ip = (const BYTE*) cSrc;
 
-    size_t const errorCode = HUFv06_readDTableX2 (DTable, cSrc, cSrcSize);
-    if (HUFv06_isError(errorCode)) return errorCode;
-    if (errorCode >= cSrcSize) return ERROR(srcSize_wrong);
-    ip += errorCode;
-    cSrcSize -= errorCode;
+    size_t const hSize = HUFv07_readDTableX2 (dctx, cSrc, cSrcSize);
+    if (HUFv07_isError(hSize)) return hSize;
+    if (hSize >= cSrcSize) return ERROR(srcSize_wrong);
+    ip += hSize; cSrcSize -= hSize;
+
+    return HUFv07_decompress4X2_usingDTable_internal (dst, dstSize, ip, cSrcSize, dctx);
+}
 
-    return HUFv06_decompress4X2_usingDTable (dst, dstSize, ip, cSrcSize, DTable);
+size_t HUFv07_decompress4X2 (void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize)
+{
+    HUFv07_CREATE_STATIC_DTABLEX2(DTable, HUFv07_TABLELOG_MAX);
+    return HUFv07_decompress4X2_DCtx(DTable, dst, dstSize, cSrc, cSrcSize);
 }
 
 
 /* *************************/
 /* double-symbols decoding */
 /* *************************/
+typedef struct { U16 sequence; BYTE nbBits; BYTE length; } HUFv07_DEltX4;  /* double-symbols decoding */
 
-static void HUFv06_fillDTableX4Level2(HUFv06_DEltX4* DTable, U32 sizeLog, const U32 consumed,
+typedef struct { BYTE symbol; BYTE weight; } sortedSymbol_t;
+
+static void HUFv07_fillDTableX4Level2(HUFv07_DEltX4* DTable, U32 sizeLog, const U32 consumed,
                            const U32* rankValOrigin, const int minWeight,
                            const sortedSymbol_t* sortedSymbols, const U32 sortedListSize,
                            U32 nbBitsBaseline, U16 baseSeq)
 {
-    HUFv06_DEltX4 DElt;
-    U32 rankVal[HUFv06_ABSOLUTEMAX_TABLELOG + 1];
+    HUFv07_DEltX4 DElt;
+    U32 rankVal[HUFv07_TABLELOG_ABSOLUTEMAX + 1];
 
     /* get pre-calculated rankVal */
     memcpy(rankVal, rankValOrigin, sizeof(rankVal));
@@ -2764,14 +2535,14 @@ static void HUFv06_fillDTableX4Level2(HUFv06_DEltX4* DTable, U32 sizeLog, const
     }}
 }
 
-typedef U32 rankVal_t[HUFv06_ABSOLUTEMAX_TABLELOG][HUFv06_ABSOLUTEMAX_TABLELOG + 1];
+typedef U32 rankVal_t[HUFv07_TABLELOG_ABSOLUTEMAX][HUFv07_TABLELOG_ABSOLUTEMAX + 1];
 
-static void HUFv06_fillDTableX4(HUFv06_DEltX4* DTable, const U32 targetLog,
+static void HUFv07_fillDTableX4(HUFv07_DEltX4* DTable, const U32 targetLog,
                            const sortedSymbol_t* sortedList, const U32 sortedListSize,
                            const U32* rankStart, rankVal_t rankValOrigin, const U32 maxWeight,
                            const U32 nbBitsBaseline)
 {
-    U32 rankVal[HUFv06_ABSOLUTEMAX_TABLELOG + 1];
+    U32 rankVal[HUFv07_TABLELOG_ABSOLUTEMAX + 1];
     const int scaleLog = nbBitsBaseline - targetLog;   /* note : targetLog >= srcLog, hence scaleLog <= 1 */
     const U32 minBits  = nbBitsBaseline - maxWeight;
     U32 s;
@@ -2791,12 +2562,12 @@ static void HUFv06_fillDTableX4(HUFv06_DEltX4* DTable, const U32 targetLog,
             int minWeight = nbBits + scaleLog;
             if (minWeight < 1) minWeight = 1;
             sortedRank = rankStart[minWeight];
-            HUFv06_fillDTableX4Level2(DTable+start, targetLog-nbBits, nbBits,
+            HUFv07_fillDTableX4Level2(DTable+start, targetLog-nbBits, nbBits,
                            rankValOrigin[nbBits], minWeight,
                            sortedList+sortedRank, sortedListSize-sortedRank,
                            nbBitsBaseline, symbol);
         } else {
-            HUFv06_DEltX4 DElt;
+            HUFv07_DEltX4 DElt;
             MEM_writeLE16(&(DElt.sequence), symbol);
             DElt.nbBits = (BYTE)(nbBits);
             DElt.length = 1;
@@ -2808,29 +2579,30 @@ static void HUFv06_fillDTableX4(HUFv06_DEltX4* DTable, const U32 targetLog,
     }
 }
 
-size_t HUFv06_readDTableX4 (U32* DTable, const void* src, size_t srcSize)
+size_t HUFv07_readDTableX4 (HUFv07_DTable* DTable, const void* src, size_t srcSize)
 {
-    BYTE weightList[HUFv06_MAX_SYMBOL_VALUE + 1];
-    sortedSymbol_t sortedSymbol[HUFv06_MAX_SYMBOL_VALUE + 1];
-    U32 rankStats[HUFv06_ABSOLUTEMAX_TABLELOG + 1] = { 0 };
-    U32 rankStart0[HUFv06_ABSOLUTEMAX_TABLELOG + 2] = { 0 };
+    BYTE weightList[HUFv07_SYMBOLVALUE_MAX + 1];
+    sortedSymbol_t sortedSymbol[HUFv07_SYMBOLVALUE_MAX + 1];
+    U32 rankStats[HUFv07_TABLELOG_ABSOLUTEMAX + 1] = { 0 };
+    U32 rankStart0[HUFv07_TABLELOG_ABSOLUTEMAX + 2] = { 0 };
     U32* const rankStart = rankStart0+1;
     rankVal_t rankVal;
     U32 tableLog, maxW, sizeOfSort, nbSymbols;
-    const U32 memLog = DTable[0];
+    DTableDesc dtd = HUFv07_getDTableDesc(DTable);
+    U32 const maxTableLog = dtd.maxTableLog;
     size_t iSize;
-    void* dtPtr = DTable;
-    HUFv06_DEltX4* const dt = ((HUFv06_DEltX4*)dtPtr) + 1;
+    void* dtPtr = DTable+1;   /* force compiler to avoid strict-aliasing */
+    HUFv07_DEltX4* const dt = (HUFv07_DEltX4*)dtPtr;
 
-    HUFv06_STATIC_ASSERT(sizeof(HUFv06_DEltX4) == sizeof(U32));   /* if compilation fails here, assertion is false */
-    if (memLog > HUFv06_ABSOLUTEMAX_TABLELOG) return ERROR(tableLog_tooLarge);
+    HUFv07_STATIC_ASSERT(sizeof(HUFv07_DEltX4) == sizeof(HUFv07_DTable));   /* if compilation fails here, assertion is false */
+    if (maxTableLog > HUFv07_TABLELOG_ABSOLUTEMAX) return ERROR(tableLog_tooLarge);
     //memset(weightList, 0, sizeof(weightList));   /* is not necessary, even though some analyzer complain ... */
 
-    iSize = HUFv06_readStats(weightList, HUFv06_MAX_SYMBOL_VALUE + 1, rankStats, &nbSymbols, &tableLog, src, srcSize);
-    if (HUFv06_isError(iSize)) return iSize;
+    iSize = HUFv07_readStats(weightList, HUFv07_SYMBOLVALUE_MAX + 1, rankStats, &nbSymbols, &tableLog, src, srcSize);
+    if (HUFv07_isError(iSize)) return iSize;
 
     /* check result */
-    if (tableLog > memLog) return ERROR(tableLog_tooLarge);   /* DTable can't fit code depth */
+    if (tableLog > maxTableLog) return ERROR(tableLog_tooLarge);   /* DTable can't fit code depth */
 
     /* find maxWeight */
     for (maxW = tableLog; rankStats[maxW]==0; maxW--) {}  /* necessarily finds a solution before 0 */
@@ -2859,7 +2631,7 @@ size_t HUFv06_readDTableX4 (U32* DTable, const void* src, size_t srcSize)
 
     /* Build rankVal */
     {   U32* const rankVal0 = rankVal[0];
-        {   int const rescale = (memLog-tableLog) - 1;   /* tableLog <= memLog */
+        {   int const rescale = (maxTableLog-tableLog) - 1;   /* tableLog <= maxTableLog */
             U32 nextRankVal = 0;
             U32 w;
             for (w=1; w<maxW+1; w++) {
@@ -2869,38 +2641,41 @@ size_t HUFv06_readDTableX4 (U32* DTable, const void* src, size_t srcSize)
         }   }
         {   U32 const minBits = tableLog+1 - maxW;
             U32 consumed;
-            for (consumed = minBits; consumed < memLog - minBits + 1; consumed++) {
+            for (consumed = minBits; consumed < maxTableLog - minBits + 1; consumed++) {
                 U32* const rankValPtr = rankVal[consumed];
                 U32 w;
                 for (w = 1; w < maxW+1; w++) {
                     rankValPtr[w] = rankVal0[w] >> consumed;
     }   }   }   }
 
-    HUFv06_fillDTableX4(dt, memLog,
+    HUFv07_fillDTableX4(dt, maxTableLog,
                    sortedSymbol, sizeOfSort,
                    rankStart0, rankVal, maxW,
                    tableLog+1);
 
+    dtd.tableLog = (BYTE)maxTableLog;
+    dtd.tableType = 1;
+    memcpy(DTable, &dtd, sizeof(dtd));
     return iSize;
 }
 
 
-static U32 HUFv06_decodeSymbolX4(void* op, BITv06_DStream_t* DStream, const HUFv06_DEltX4* dt, const U32 dtLog)
+static U32 HUFv07_decodeSymbolX4(void* op, BITv07_DStream_t* DStream, const HUFv07_DEltX4* dt, const U32 dtLog)
 {
-    const size_t val = BITv06_lookBitsFast(DStream, dtLog);   /* note : dtLog >= 1 */
+    const size_t val = BITv07_lookBitsFast(DStream, dtLog);   /* note : dtLog >= 1 */
     memcpy(op, dt+val, 2);
-    BITv06_skipBits(DStream, dt[val].nbBits);
+    BITv07_skipBits(DStream, dt[val].nbBits);
     return dt[val].length;
 }
 
-static U32 HUFv06_decodeLastSymbolX4(void* op, BITv06_DStream_t* DStream, const HUFv06_DEltX4* dt, const U32 dtLog)
+static U32 HUFv07_decodeLastSymbolX4(void* op, BITv07_DStream_t* DStream, const HUFv07_DEltX4* dt, const U32 dtLog)
 {
-    const size_t val = BITv06_lookBitsFast(DStream, dtLog);   /* note : dtLog >= 1 */
+    const size_t val = BITv07_lookBitsFast(DStream, dtLog);   /* note : dtLog >= 1 */
     memcpy(op, dt+val, 1);
-    if (dt[val].length==1) BITv06_skipBits(DStream, dt[val].nbBits);
+    if (dt[val].length==1) BITv07_skipBits(DStream, dt[val].nbBits);
     else {
         if (DStream->bitsConsumed < (sizeof(DStream->bitContainer)*8)) {
-            BITv06_skipBits(DStream, dt[val].nbBits);
+            BITv07_skipBits(DStream, dt[val].nbBits);
             if (DStream->bitsConsumed > (sizeof(DStream->bitContainer)*8))
                 DStream->bitsConsumed = (sizeof(DStream->bitContainer)*8);   /* ugly hack; works only because it's the last symbol. Note : can't easily extract nbBits from just this symbol */
     }   }
@@ -2908,114 +2683,126 @@ static U32 HUFv06_decodeLastSymbolX4(void* op, BITv06_DStream_t* DStream, const
 }
 
 
-#define HUFv06_DECODE_SYMBOLX4_0(ptr, DStreamPtr) \
-    ptr += HUFv06_decodeSymbolX4(ptr, DStreamPtr, dt, dtLog)
+#define HUFv07_DECODE_SYMBOLX4_0(ptr, DStreamPtr) \
+    ptr += HUFv07_decodeSymbolX4(ptr, DStreamPtr, dt, dtLog)
 
-#define HUFv06_DECODE_SYMBOLX4_1(ptr, DStreamPtr) \
-    if (MEM_64bits() || (HUFv06_MAX_TABLELOG<=12)) \
-        ptr += HUFv06_decodeSymbolX4(ptr, DStreamPtr, dt, dtLog)
+#define HUFv07_DECODE_SYMBOLX4_1(ptr, DStreamPtr) \
+    if (MEM_64bits() || (HUFv07_TABLELOG_MAX<=12)) \
+        ptr += HUFv07_decodeSymbolX4(ptr, DStreamPtr, dt, dtLog)
 
-#define HUFv06_DECODE_SYMBOLX4_2(ptr, DStreamPtr) \
+#define HUFv07_DECODE_SYMBOLX4_2(ptr, DStreamPtr) \
     if (MEM_64bits()) \
-        ptr += HUFv06_decodeSymbolX4(ptr, DStreamPtr, dt, dtLog)
+        ptr += HUFv07_decodeSymbolX4(ptr, DStreamPtr, dt, dtLog)
 
-static inline size_t HUFv06_decodeStreamX4(BYTE* p, BITv06_DStream_t* bitDPtr, BYTE* const pEnd, const HUFv06_DEltX4* const dt, const U32 dtLog)
+static inline size_t HUFv07_decodeStreamX4(BYTE* p, BITv07_DStream_t* bitDPtr, BYTE* const pEnd, const HUFv07_DEltX4* const dt, const U32 dtLog)
 {
     BYTE* const pStart = p;
 
     /* up to 8 symbols at a time */
-    while ((BITv06_reloadDStream(bitDPtr) == BITv06_DStream_unfinished) && (p < pEnd-7)) {
-        HUFv06_DECODE_SYMBOLX4_2(p, bitDPtr);
-        HUFv06_DECODE_SYMBOLX4_1(p, bitDPtr);
-        HUFv06_DECODE_SYMBOLX4_2(p, bitDPtr);
-        HUFv06_DECODE_SYMBOLX4_0(p, bitDPtr);
+    while ((BITv07_reloadDStream(bitDPtr) == BITv07_DStream_unfinished) && (p < pEnd-7)) {
+        HUFv07_DECODE_SYMBOLX4_2(p, bitDPtr);
+        HUFv07_DECODE_SYMBOLX4_1(p, bitDPtr);
+        HUFv07_DECODE_SYMBOLX4_2(p, bitDPtr);
+        HUFv07_DECODE_SYMBOLX4_0(p, bitDPtr);
     }
 
-    /* closer to the end */
-    while ((BITv06_reloadDStream(bitDPtr) == BITv06_DStream_unfinished) && (p <= pEnd-2))
-        HUFv06_DECODE_SYMBOLX4_0(p, bitDPtr);
+    /* closer to end : up to 2 symbols at a time */
+    while ((BITv07_reloadDStream(bitDPtr) == BITv07_DStream_unfinished) && (p <= pEnd-2))
+        HUFv07_DECODE_SYMBOLX4_0(p, bitDPtr);
 
     while (p <= pEnd-2)
-        HUFv06_DECODE_SYMBOLX4_0(p, bitDPtr);   /* no need to reload : reached the end of DStream */
+        HUFv07_DECODE_SYMBOLX4_0(p, bitDPtr);   /* no need to reload : reached the end of DStream */
 
     if (p < pEnd)
-        p += HUFv06_decodeLastSymbolX4(p, bitDPtr, dt, dtLog);
+        p += HUFv07_decodeLastSymbolX4(p, bitDPtr, dt, dtLog);
 
     return p-pStart;
 }
 
 
-size_t HUFv06_decompress1X4_usingDTable(
+static size_t HUFv07_decompress1X4_usingDTable_internal(
           void* dst,  size_t dstSize,
     const void* cSrc, size_t cSrcSize,
-    const U32* DTable)
+    const HUFv07_DTable* DTable)
 {
-    const BYTE* const istart = (const BYTE*) cSrc;
-    BYTE* const ostart = (BYTE*) dst;
-    BYTE* const oend = ostart + dstSize;
-
-    const U32 dtLog = DTable[0];
-    const void* const dtPtr = DTable;
-    const HUFv06_DEltX4* const dt = ((const HUFv06_DEltX4*)dtPtr) +1;
+    BITv07_DStream_t bitD;
 
     /* Init */
-    BITv06_DStream_t bitD;
-    { size_t const errorCode = BITv06_initDStream(&bitD, istart, cSrcSize);
-      if (HUFv06_isError(errorCode)) return errorCode; }
+    {   size_t const errorCode = BITv07_initDStream(&bitD, cSrc, cSrcSize);
+        if (HUFv07_isError(errorCode)) return errorCode;
+    }
 
     /* decode */
-    HUFv06_decodeStreamX4(ostart, &bitD, oend, dt, dtLog);
+    {   BYTE* const ostart = (BYTE*) dst;
+        BYTE* const oend = ostart + dstSize;
+        const void* const dtPtr = DTable+1;   /* force compiler to not use strict-aliasing */
+        const HUFv07_DEltX4* const dt = (const HUFv07_DEltX4*)dtPtr;
+        DTableDesc const dtd = HUFv07_getDTableDesc(DTable);
+        HUFv07_decodeStreamX4(ostart, &bitD, oend, dt, dtd.tableLog);
+    }
 
     /* check */
-    if (!BITv06_endOfDStream(&bitD)) return ERROR(corruption_detected);
+    if (!BITv07_endOfDStream(&bitD)) return ERROR(corruption_detected);
 
     /* decoded size */
     return dstSize;
 }
 
-size_t HUFv06_decompress1X4 (void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize)
+size_t HUFv07_decompress1X4_usingDTable(
+          void* dst,  size_t dstSize,
+    const void* cSrc, size_t cSrcSize,
+    const HUFv07_DTable* DTable)
+{
+    DTableDesc dtd = HUFv07_getDTableDesc(DTable);
+    if (dtd.tableType != 1) return ERROR(GENERIC);
+    return HUFv07_decompress1X4_usingDTable_internal(dst, dstSize, cSrc, cSrcSize, DTable);
+}
+
+size_t HUFv07_decompress1X4_DCtx (HUFv07_DTable* DCtx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize)
 {
-    HUFv06_CREATE_STATIC_DTABLEX4(DTable, HUFv06_MAX_TABLELOG);
     const BYTE* ip = (const BYTE*) cSrc;
 
-    size_t const hSize = HUFv06_readDTableX4 (DTable, cSrc, cSrcSize);
-    if (HUFv06_isError(hSize)) return hSize;
+    size_t const hSize = HUFv07_readDTableX4 (DCtx, cSrc, cSrcSize);
+    if (HUFv07_isError(hSize)) return hSize;
     if (hSize >= cSrcSize) return ERROR(srcSize_wrong);
-    ip += hSize;
-    cSrcSize -= hSize;
+    ip += hSize; cSrcSize -= hSize;
+
+    return HUFv07_decompress1X4_usingDTable_internal (dst, dstSize, ip, cSrcSize, DCtx);
+}
 
-    return HUFv06_decompress1X4_usingDTable (dst, dstSize, ip, cSrcSize, DTable);
+size_t HUFv07_decompress1X4 (void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize)
+{
+    HUFv07_CREATE_STATIC_DTABLEX4(DTable, HUFv07_TABLELOG_MAX);
+    return HUFv07_decompress1X4_DCtx(DTable, dst, dstSize, cSrc, cSrcSize);
 }
 
-size_t HUFv06_decompress4X4_usingDTable(
+static size_t HUFv07_decompress4X4_usingDTable_internal(
           void* dst,  size_t dstSize,
     const void* cSrc, size_t cSrcSize,
-    const U32* DTable)
+    const HUFv07_DTable* DTable)
 {
     if (cSrcSize < 10) return ERROR(corruption_detected);   /* strict minimum : jump table + 1 byte per stream */
 
     {   const BYTE* const istart = (const BYTE*) cSrc;
         BYTE* const ostart = (BYTE*) dst;
         BYTE* const oend = ostart + dstSize;
-        const void* const dtPtr = DTable;
-        const HUFv06_DEltX4* const dt = ((const HUFv06_DEltX4*)dtPtr) +1;
-        const U32 dtLog = DTable[0];
-        size_t errorCode;
+        const void* const dtPtr = DTable+1;
+        const HUFv07_DEltX4* const dt = (const HUFv07_DEltX4*)dtPtr;
 
         /* Init */
-        BITv06_DStream_t bitD1;
-        BITv06_DStream_t bitD2;
-        BITv06_DStream_t bitD3;
-        BITv06_DStream_t bitD4;
-        const size_t length1 = MEM_readLE16(istart);
-        const size_t length2 = MEM_readLE16(istart+2);
-        const size_t length3 = MEM_readLE16(istart+4);
-        size_t length4;
+        BITv07_DStream_t bitD1;
+        BITv07_DStream_t bitD2;
+        BITv07_DStream_t bitD3;
+        BITv07_DStream_t bitD4;
+        size_t const length1 = MEM_readLE16(istart);
+        size_t const length2 = MEM_readLE16(istart+2);
+        size_t const length3 = MEM_readLE16(istart+4);
+        size_t const length4 = cSrcSize - (length1 + length2 + length3 + 6);
         const BYTE* const istart1 = istart + 6;  /* jumpTable */
         const BYTE* const istart2 = istart1 + length1;
         const BYTE* const istart3 = istart2 + length2;
         const BYTE* const istart4 = istart3 + length3;
-        const size_t segmentSize = (dstSize+3) / 4;
+        size_t const segmentSize = (dstSize+3) / 4;
         BYTE* const opStart2 = ostart + segmentSize;
         BYTE* const opStart3 = opStart2 + segmentSize;
         BYTE* const opStart4 = opStart3 + segmentSize;
@@ -3024,39 +2811,40 @@ size_t HUFv06_decompress4X4_usingDTable(
         BYTE* op3 = opStart3;
         BYTE* op4 = opStart4;
         U32 endSignal;
+        DTableDesc const dtd = HUFv07_getDTableDesc(DTable);
+        U32 const dtLog = dtd.tableLog;
 
-        length4 = cSrcSize - (length1 + length2 + length3 + 6);
         if (length4 > cSrcSize) return ERROR(corruption_detected);   /* overflow */
-        errorCode = BITv06_initDStream(&bitD1, istart1, length1);
-        if (HUFv06_isError(errorCode)) return errorCode;
-        errorCode = BITv06_initDStream(&bitD2, istart2, length2);
-        if (HUFv06_isError(errorCode)) return errorCode;
-        errorCode = BITv06_initDStream(&bitD3, istart3, length3);
-        if (HUFv06_isError(errorCode)) return errorCode;
-        errorCode = BITv06_initDStream(&bitD4, istart4, length4);
-        if (HUFv06_isError(errorCode)) return errorCode;
+        { size_t const errorCode = BITv07_initDStream(&bitD1, istart1, length1);
+          if (HUFv07_isError(errorCode)) return errorCode; }
+        { size_t const errorCode = BITv07_initDStream(&bitD2, istart2, length2);
+          if (HUFv07_isError(errorCode)) return errorCode; }
+        { size_t const errorCode = BITv07_initDStream(&bitD3, istart3, length3);
+          if (HUFv07_isError(errorCode)) return errorCode; }
+        { size_t const errorCode = BITv07_initDStream(&bitD4, istart4, length4);
+          if (HUFv07_isError(errorCode)) return errorCode; }
 
         /* 16-32 symbols per loop (4-8 symbols per stream) */
-        endSignal = BITv06_reloadDStream(&bitD1) | BITv06_reloadDStream(&bitD2) | BITv06_reloadDStream(&bitD3) | BITv06_reloadDStream(&bitD4);
-        for ( ; (endSignal==BITv06_DStream_unfinished) && (op4<(oend-7)) ; ) {
-            HUFv06_DECODE_SYMBOLX4_2(op1, &bitD1);
-            HUFv06_DECODE_SYMBOLX4_2(op2, &bitD2);
-            HUFv06_DECODE_SYMBOLX4_2(op3, &bitD3);
-            HUFv06_DECODE_SYMBOLX4_2(op4, &bitD4);
-            HUFv06_DECODE_SYMBOLX4_1(op1, &bitD1);
-            HUFv06_DECODE_SYMBOLX4_1(op2, &bitD2);
-            HUFv06_DECODE_SYMBOLX4_1(op3, &bitD3);
-            HUFv06_DECODE_SYMBOLX4_1(op4, &bitD4);
-            HUFv06_DECODE_SYMBOLX4_2(op1, &bitD1);
-            HUFv06_DECODE_SYMBOLX4_2(op2, &bitD2);
-            HUFv06_DECODE_SYMBOLX4_2(op3, &bitD3);
-            HUFv06_DECODE_SYMBOLX4_2(op4, &bitD4);
-            HUFv06_DECODE_SYMBOLX4_0(op1, &bitD1);
-            HUFv06_DECODE_SYMBOLX4_0(op2, &bitD2);
-            HUFv06_DECODE_SYMBOLX4_0(op3, &bitD3);
-            HUFv06_DECODE_SYMBOLX4_0(op4, &bitD4);
-
-            endSignal = BITv06_reloadDStream(&bitD1) | BITv06_reloadDStream(&bitD2) | BITv06_reloadDStream(&bitD3) | BITv06_reloadDStream(&bitD4);
+        endSignal = BITv07_reloadDStream(&bitD1) | BITv07_reloadDStream(&bitD2) | BITv07_reloadDStream(&bitD3) | BITv07_reloadDStream(&bitD4);
+        for ( ; (endSignal==BITv07_DStream_unfinished) && (op4<(oend-7)) ; ) {
+            HUFv07_DECODE_SYMBOLX4_2(op1, &bitD1);
+            HUFv07_DECODE_SYMBOLX4_2(op2, &bitD2);
+            HUFv07_DECODE_SYMBOLX4_2(op3, &bitD3);
+            HUFv07_DECODE_SYMBOLX4_2(op4, &bitD4);
+            HUFv07_DECODE_SYMBOLX4_1(op1, &bitD1);
+            HUFv07_DECODE_SYMBOLX4_1(op2, &bitD2);
+            HUFv07_DECODE_SYMBOLX4_1(op3, &bitD3);
+            HUFv07_DECODE_SYMBOLX4_1(op4, &bitD4);
+            HUFv07_DECODE_SYMBOLX4_2(op1, &bitD1);
+            HUFv07_DECODE_SYMBOLX4_2(op2, &bitD2);
+            HUFv07_DECODE_SYMBOLX4_2(op3, &bitD3);
+            HUFv07_DECODE_SYMBOLX4_2(op4, &bitD4);
+            HUFv07_DECODE_SYMBOLX4_0(op1, &bitD1);
+            HUFv07_DECODE_SYMBOLX4_0(op2, &bitD2);
+            HUFv07_DECODE_SYMBOLX4_0(op3, &bitD3);
+            HUFv07_DECODE_SYMBOLX4_0(op4, &bitD4);
+
+            endSignal = BITv07_reloadDStream(&bitD1) | BITv07_reloadDStream(&bitD2) | BITv07_reloadDStream(&bitD3) | BITv07_reloadDStream(&bitD4);
         }
 
         /* check corruption */
@@ -3066,14 +2854,14 @@ size_t HUFv06_decompress4X4_usingDTable(
         /* note : op4 supposed already verified within main loop */
 
         /* finish bitStreams one by one */
-        HUFv06_decodeStreamX4(op1, &bitD1, opStart2, dt, dtLog);
-        HUFv06_decodeStreamX4(op2, &bitD2, opStart3, dt, dtLog);
-        HUFv06_decodeStreamX4(op3, &bitD3, opStart4, dt, dtLog);
-        HUFv06_decodeStreamX4(op4, &bitD4, oend,     dt, dtLog);
+        HUFv07_decodeStreamX4(op1, &bitD1, opStart2, dt, dtLog);
+        HUFv07_decodeStreamX4(op2, &bitD2, opStart3, dt, dtLog);
+        HUFv07_decodeStreamX4(op3, &bitD3, opStart4, dt, dtLog);
+        HUFv07_decodeStreamX4(op4, &bitD4, oend,     dt, dtLog);
 
         /* check */
-        endSignal = BITv06_endOfDStream(&bitD1) & BITv06_endOfDStream(&bitD2) & BITv06_endOfDStream(&bitD3) & BITv06_endOfDStream(&bitD4);
-        if (!endSignal) return ERROR(corruption_detected);
+        { U32 const endCheck = BITv07_endOfDStream(&bitD1) & BITv07_endOfDStream(&bitD2) & BITv07_endOfDStream(&bitD3) & BITv07_endOfDStream(&bitD4);
+          if (!endCheck) return ERROR(corruption_detected); }
 
         /* decoded size */
         return dstSize;
@@ -3081,27 +2869,59 @@ size_t HUFv06_decompress4X4_usingDTable(
 }
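
Note on the 4-stream layout used above: the compressed input starts with a 6-byte jump table holding three little-endian 16-bit lengths, and the fourth stream length is whatever remains. The following is only a minimal sketch of that split, not the library's code; `StreamSizes`, `read_le16` and `split_four_streams` are hypothetical names introduced for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type (not part of zstd): sizes of the four streams. */
typedef struct { size_t len[4]; } StreamSizes;

/* Read a 16-bit little-endian value, independent of host endianness. */
static size_t read_le16(const uint8_t* p)
{
    return (size_t)p[0] | ((size_t)p[1] << 8);
}

/* Split a 4-stream payload into its four stream lengths.
 * Returns 0 on success, -1 when the input is too small or the three
 * stored lengths do not fit inside srcSize. */
static int split_four_streams(const uint8_t* src, size_t srcSize, StreamSizes* out)
{
    size_t used;
    if (srcSize < 10) return -1;      /* 6-byte jump table + 1 byte per stream */
    out->len[0] = read_le16(src);
    out->len[1] = read_le16(src + 2);
    out->len[2] = read_le16(src + 4);
    used = out->len[0] + out->len[1] + out->len[2] + 6;
    /* The diff computes length4 first and then tests `length4 > cSrcSize`
     * to catch unsigned wrap-around; checking before subtracting is equivalent. */
    if (used > srcSize) return -1;
    out->len[3] = srcSize - used;     /* fourth length is implicit */
    return 0;
}
```

In this sketch a zero-length fourth stream is accepted by the split; in the diff, rejecting it is left to the later bitstream initialisation.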
 
 
-size_t HUFv06_decompress4X4 (void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize)
+size_t HUFv07_decompress4X4_usingDTable(
+          void* dst,  size_t dstSize,
+    const void* cSrc, size_t cSrcSize,
+    const HUFv07_DTable* DTable)
+{
+    DTableDesc dtd = HUFv07_getDTableDesc(DTable);
+    if (dtd.tableType != 1) return ERROR(GENERIC);
+    return HUFv07_decompress4X4_usingDTable_internal(dst, dstSize, cSrc, cSrcSize, DTable);
+}
+
+
+size_t HUFv07_decompress4X4_DCtx (HUFv07_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize)
 {
-    HUFv06_CREATE_STATIC_DTABLEX4(DTable, HUFv06_MAX_TABLELOG);
     const BYTE* ip = (const BYTE*) cSrc;
 
-    size_t hSize = HUFv06_readDTableX4 (DTable, cSrc, cSrcSize);
-    if (HUFv06_isError(hSize)) return hSize;
+    size_t hSize = HUFv07_readDTableX4 (dctx, cSrc, cSrcSize);
+    if (HUFv07_isError(hSize)) return hSize;
     if (hSize >= cSrcSize) return ERROR(srcSize_wrong);
-    ip += hSize;
-    cSrcSize -= hSize;
+    ip += hSize; cSrcSize -= hSize;
 
-    return HUFv06_decompress4X4_usingDTable (dst, dstSize, ip, cSrcSize, DTable);
+    return HUFv07_decompress4X4_usingDTable_internal(dst, dstSize, ip, cSrcSize, dctx);
 }
 
-
+size_t HUFv07_decompress4X4 (void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize)
+{
+    HUFv07_CREATE_STATIC_DTABLEX4(DTable, HUFv07_TABLELOG_MAX);
+    return HUFv07_decompress4X4_DCtx(DTable, dst, dstSize, cSrc, cSrcSize);
+}
 
 
 /* ********************************/
 /* Generic decompression selector */
 /* ********************************/
 
+size_t HUFv07_decompress1X_usingDTable(void* dst, size_t maxDstSize,
+                                    const void* cSrc, size_t cSrcSize,
+                                    const HUFv07_DTable* DTable)
+{
+    DTableDesc const dtd = HUFv07_getDTableDesc(DTable);
+    return dtd.tableType ? HUFv07_decompress1X4_usingDTable_internal(dst, maxDstSize, cSrc, cSrcSize, DTable) :
+                           HUFv07_decompress1X2_usingDTable_internal(dst, maxDstSize, cSrc, cSrcSize, DTable);
+}
+
+size_t HUFv07_decompress4X_usingDTable(void* dst, size_t maxDstSize,
+                                    const void* cSrc, size_t cSrcSize,
+                                    const HUFv07_DTable* DTable)
+{
+    DTableDesc const dtd = HUFv07_getDTableDesc(DTable);
+    return dtd.tableType ? HUFv07_decompress4X4_usingDTable_internal(dst, maxDstSize, cSrc, cSrcSize, DTable) :
+                           HUFv07_decompress4X2_usingDTable_internal(dst, maxDstSize, cSrc, cSrcSize, DTable);
+}
+
+
 typedef struct { U32 tableTime; U32 decode256Time; } algo_time_t;
 static const algo_time_t algoTime[16 /* Quantization */][3 /* single, double, quad */] =
 {
@@ -3124,12 +2944,29 @@ static const algo_time_t algoTime[16 /* Quantization */][3 /* single, double, qu
     {{ 722,128}, {1891,145}, {1936,146}},   /* Q ==15 : 93-99% */
 };
 
+/** HUFv07_selectDecoder() :
+*   Tells which decoder is likely to decode faster,
+*   based on a set of pre-determined metrics.
+*   @return : 0==HUFv07_decompress4X2, 1==HUFv07_decompress4X4 .
+*   Assumption : 0 < cSrcSize < dstSize <= 128 KB */
+U32 HUFv07_selectDecoder (size_t dstSize, size_t cSrcSize)
+{
+    /* decoder timing evaluation */
+    U32 const Q = (U32)(cSrcSize * 16 / dstSize);   /* Q < 16 since dstSize > cSrcSize */
+    U32 const D256 = (U32)(dstSize >> 8);
+    U32 const DTime0 = algoTime[Q][0].tableTime + (algoTime[Q][0].decode256Time * D256);
+    U32 DTime1 = algoTime[Q][1].tableTime + (algoTime[Q][1].decode256Time * D256);
+    DTime1 += DTime1 >> 3;  /* +12.5% handicap : gives the advantage to the algorithm using less memory, to limit cache eviction */
+
+    return DTime1 < DTime0;
+}
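
Note on the selection heuristic above: each decoder is modelled as a fixed table-build cost plus a cost per 256 decoded bytes, and the double-symbol estimate is increased by one eighth before the comparison. The sketch below restates that arithmetic outside the library, assuming the caller passes the two cost entries directly instead of indexing `algoTime`; `algo_time`, `quantize_ratio` and `pick_decoder` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* One cost-model entry: fixed table-build cost plus cost per 256 decoded bytes. */
typedef struct { uint32_t tableTime; uint32_t decode256Time; } algo_time;

/* Quantize the compression ratio into 16 buckets.
 * The result is < 16 only when cSrcSize < dstSize, as the diff assumes. */
static unsigned quantize_ratio(size_t cSrcSize, size_t dstSize)
{
    return (unsigned)(cSrcSize * 16 / dstSize);
}

/* Returns 0 to pick the single-symbol decoder, 1 for the double-symbol one. */
static unsigned pick_decoder(algo_time single, algo_time dbl, size_t dstSize)
{
    uint32_t const d256 = (uint32_t)(dstSize >> 8);
    uint32_t const t0 = single.tableTime + single.decode256Time * d256;
    uint32_t t1 = dbl.tableTime + dbl.decode256Time * d256;
    t1 += t1 >> 3;   /* 12.5% handicap for the larger decoding table */
    return t1 < t0;
}
```

With the Q==15 row visible in the diff ({722,128} against {1891,145}) and a 64 KB output, the estimates are 33490 and 43887, so the single-symbol decoder is chosen.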
+
+
 typedef size_t (*decompressionAlgo)(void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize);
 
-size_t HUFv06_decompress (void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize)
+size_t HUFv07_decompress (void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize)
 {
-    static const decompressionAlgo decompress[3] = { HUFv06_decompress4X2, HUFv06_decompress4X4, NULL };
-    U32 Dtime[3];   /* decompression time estimation */
+    static const decompressionAlgo decompress[2] = { HUFv07_decompress4X2, HUFv07_decompress4X4 };
 
     /* validation checks */
     if (dstSize == 0) return ERROR(dstSize_tooSmall);
@@ -3137,24 +2974,52 @@ size_t HUFv06_decompress (void* dst, size_t dstSize, const void* cSrc, size_t cS
     if (cSrcSize == dstSize) { memcpy(dst, cSrc, dstSize); return dstSize; }   /* not compressed */
     if (cSrcSize == 1) { memset(dst, *(const BYTE*)cSrc, dstSize); return dstSize; }   /* RLE */
 
-    /* decoder timing evaluation */
-    {   U32 const Q = (U32)(cSrcSize * 16 / dstSize);   /* Q < 16 since dstSize > cSrcSize */
-        U32 const D256 = (U32)(dstSize >> 8);
-        U32 n; for (n=0; n<3; n++)
-            Dtime[n] = algoTime[Q][n].tableTime + (algoTime[Q][n].decode256Time * D256);
+    {   U32 const algoNb = HUFv07_selectDecoder(dstSize, cSrcSize);
+        return decompress[algoNb](dst, dstSize, cSrc, cSrcSize);
     }
 
-    Dtime[1] += Dtime[1] >> 4; Dtime[2] += Dtime[2] >> 3; /* advantage to algorithms using less memory, for cache eviction */
+    //return HUFv07_decompress4X2(dst, dstSize, cSrc, cSrcSize);   /* multi-streams single-symbol decoding */
+    //return HUFv07_decompress4X4(dst, dstSize, cSrc, cSrcSize);   /* multi-streams double-symbols decoding */
+}
 
-    {   U32 algoNb = 0;
-        if (Dtime[1] < Dtime[0]) algoNb = 1;
-        // if (Dtime[2] < Dtime[algoNb]) algoNb = 2;   /* current speed of HUFv06_decompress4X6 is not good */
-        return decompress[algoNb](dst, dstSize, cSrc, cSrcSize);
+size_t HUFv07_decompress4X_DCtx (HUFv07_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize)
+{
+    /* validation checks */
+    if (dstSize == 0) return ERROR(dstSize_tooSmall);
+    if (cSrcSize > dstSize) return ERROR(corruption_detected);   /* invalid */
+    if (cSrcSize == dstSize) { memcpy(dst, cSrc, dstSize); return dstSize; }   /* not compressed */
+    if (cSrcSize == 1) { memset(dst, *(const BYTE*)cSrc, dstSize); return dstSize; }   /* RLE */
+
+    {   U32 const algoNb = HUFv07_selectDecoder(dstSize, cSrcSize);
+        return algoNb ? HUFv07_decompress4X4_DCtx(dctx, dst, dstSize, cSrc, cSrcSize) :
+                        HUFv07_decompress4X2_DCtx(dctx, dst, dstSize, cSrc, cSrcSize) ;
+    }
+}
+
+size_t HUFv07_decompress4X_hufOnly (HUFv07_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize)
+{
+    /* validation checks */
+    if (dstSize == 0) return ERROR(dstSize_tooSmall);
+    if ((cSrcSize >= dstSize) || (cSrcSize <= 1)) return ERROR(corruption_detected);   /* invalid */
+
+    {   U32 const algoNb = HUFv07_selectDecoder(dstSize, cSrcSize);
+        return algoNb ? HUFv07_decompress4X4_DCtx(dctx, dst, dstSize, cSrc, cSrcSize) :
+                        HUFv07_decompress4X2_DCtx(dctx, dst, dstSize, cSrc, cSrcSize) ;
     }
+}
+
+size_t HUFv07_decompress1X_DCtx (HUFv07_DTable* dctx, void* dst, size_t dstSize, const void* cSrc, size_t cSrcSize)
+{
+    /* validation checks */
+    if (dstSize == 0) return ERROR(dstSize_tooSmall);
+    if (cSrcSize > dstSize) return ERROR(corruption_detected);   /* invalid */
+    if (cSrcSize == dstSize) { memcpy(dst, cSrc, dstSize); return dstSize; }   /* not compressed */
+    if (cSrcSize == 1) { memset(dst, *(const BYTE*)cSrc, dstSize); return dstSize; }   /* RLE */
 
-    //return HUFv06_decompress4X2(dst, dstSize, cSrc, cSrcSize);   /* multi-streams single-symbol decoding */
-    //return HUFv06_decompress4X4(dst, dstSize, cSrc, cSrcSize);   /* multi-streams double-symbols decoding */
-    //return HUFv06_decompress4X6(dst, dstSize, cSrc, cSrcSize);   /* multi-streams quad-symbols decoding */
+    {   U32 const algoNb = HUFv07_selectDecoder(dstSize, cSrcSize);
+        return algoNb ? HUFv07_decompress1X4_DCtx(dctx, dst, dstSize, cSrc, cSrcSize) :
+                        HUFv07_decompress1X2_DCtx(dctx, dst, dstSize, cSrc, cSrcSize) ;
+    }
 }
 /*
     Common functions of Zstd compression library
@@ -3184,40 +3049,278 @@ size_t HUFv06_decompress (void* dst, size_t dstSize, const void* cSrc, size_t cS
     OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 
     You can contact the author at :
-    - zstd homepage : http://www.zstd.net/
+    - zstd homepage : http://www.zstd.net/
+*/
+
+
+
+/*-****************************************
+*  ZSTD Error Management
+******************************************/
+/*! ZSTDv07_isError() :
+*   tells if a return value is an error code */
+unsigned ZSTDv07_isError(size_t code) { return ERR_isError(code); }
+
+/*! ZSTDv07_getErrorName() :
+*   provides error code string from function result (useful for debugging) */
+const char* ZSTDv07_getErrorName(size_t code) { return ERR_getErrorName(code); }
+
+/*! ZSTDv07_getErrorCode() :
+*   converts a `size_t` function result into a proper ZSTDv07_ErrorCode enum */
+ZSTDv07_ErrorCode ZSTDv07_getErrorCode(size_t code) { return ERR_getErrorCode(code); }
+
+/*! ZSTDv07_getErrorString() :
+*   provides error code string from enum */
+const char* ZSTDv07_getErrorString(ZSTDv07_ErrorCode code) { return ERR_getErrorName(code); }
+
+
+/* **************************************************************
+*  ZBUFF Error Management
+****************************************************************/
+unsigned ZBUFFv07_isError(size_t errorCode) { return ERR_isError(errorCode); }
+
+const char* ZBUFFv07_getErrorName(size_t errorCode) { return ERR_getErrorName(errorCode); }
+
+
+
+void* ZSTDv07_defaultAllocFunction(void* opaque, size_t size)
+{
+    void* address = malloc(size);
+    (void)opaque;
+    /* printf("alloc %p, %d opaque=%p \n", address, (int)size, opaque); */
+    return address;
+}
+
+void ZSTDv07_defaultFreeFunction(void* opaque, void* address)
+{
+    (void)opaque;
+    /* if (address) printf("free %p opaque=%p \n", address, opaque); */
+    free(address);
+}
+/*
+    zstd_internal - common functions to include
+    Header File for include
+    Copyright (C) 2014-2016, Yann Collet.
+
+    BSD 2-Clause License (http://www.opensource.org/licenses/bsd-license.php)
+
+    Redistribution and use in source and binary forms, with or without
+    modification, are permitted provided that the following conditions are
+    met:
+    * Redistributions of source code must retain the above copyright
+    notice, this list of conditions and the following disclaimer.
+    * Redistributions in binary form must reproduce the above
+    copyright notice, this list of conditions and the following disclaimer
+    in the documentation and/or other materials provided with the
+    distribution.
+    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+    You can contact the author at :
+    - zstd homepage : https://www.zstd.net
 */
+#ifndef ZSTDv07_CCOMMON_H_MODULE
+#define ZSTDv07_CCOMMON_H_MODULE
 
 
-/*-****************************************
-*  Version
-******************************************/
+/*-*************************************
+*  Common macros
+***************************************/
+#define MIN(a,b) ((a)<(b) ? (a) : (b))
+#define MAX(a,b) ((a)>(b) ? (a) : (b))
 
-/*-****************************************
-*  ZSTD Error Management
-******************************************/
-/*! ZSTDv06_isError() :
-*   tells if a return value is an error code */
-unsigned ZSTDv06_isError(size_t code) { return ERR_isError(code); }
 
-/*! ZSTDv06_getErrorName() :
-*   provides error code string from function result (useful for debugging) */
-const char* ZSTDv06_getErrorName(size_t code) { return ERR_getErrorName(code); }
+/*-*************************************
+*  Common constants
+***************************************/
+#define ZSTDv07_OPT_DEBUG 0     /* 3 = compression stats;  5 = check encoded sequences;  9 = full logs */
+#include <stdio.h>
+#if defined(ZSTDv07_OPT_DEBUG) && ZSTDv07_OPT_DEBUG>=9
+    #define ZSTDv07_LOG_PARSER(...) printf(__VA_ARGS__)
+    #define ZSTDv07_LOG_ENCODE(...) printf(__VA_ARGS__)
+    #define ZSTDv07_LOG_BLOCK(...) printf(__VA_ARGS__)
+#else
+    #define ZSTDv07_LOG_PARSER(...)
+    #define ZSTDv07_LOG_ENCODE(...)
+    #define ZSTDv07_LOG_BLOCK(...)
+#endif
 
-/*! ZSTDv06_getError() :
-*   convert a `size_t` function result into a proper ZSTDv06_errorCode enum */
-ZSTDv06_ErrorCode ZSTDv06_getErrorCode(size_t code) { return ERR_getErrorCode(code); }
+#define ZSTDv07_OPT_NUM    (1<<12)
+#define ZSTDv07_DICT_MAGIC  0xEC30A437   /* v0.7 */
 
-/*! ZSTDv06_getErrorString() :
-*   provides error code string from enum */
-const char* ZSTDv06_getErrorString(ZSTDv06_ErrorCode code) { return ERR_getErrorName(code); }
+#define ZSTDv07_REP_NUM    3
+#define ZSTDv07_REP_INIT   ZSTDv07_REP_NUM
+#define ZSTDv07_REP_MOVE   (ZSTDv07_REP_NUM-1)
+static const U32 repStartValue[ZSTDv07_REP_NUM] = { 1, 4, 8 };
 
+#define KB *(1 <<10)
+#define MB *(1 <<20)
+#define GB *(1U<<30)
 
-/* **************************************************************
-*  ZBUFF Error Management
-****************************************************************/
-unsigned ZBUFFv06_isError(size_t errorCode) { return ERR_isError(errorCode); }
+#define BIT7 128
+#define BIT6  64
+#define BIT5  32
+#define BIT4  16
+#define BIT1   2
+#define BIT0   1
+
+#define ZSTDv07_WINDOWLOG_ABSOLUTEMIN 10
+static const size_t ZSTDv07_fcs_fieldSize[4] = { 0, 2, 4, 8 };
+static const size_t ZSTDv07_did_fieldSize[4] = { 0, 1, 2, 4 };
+
+#define ZSTDv07_BLOCKHEADERSIZE 3   /* C standard doesn't allow `static const` variable to be init using another `static const` variable */
+static const size_t ZSTDv07_blockHeaderSize = ZSTDv07_BLOCKHEADERSIZE;
+typedef enum { bt_compressed, bt_raw, bt_rle, bt_end } blockType_t;
+
+#define MIN_SEQUENCES_SIZE 1 /* nbSeq==0 */
+#define MIN_CBLOCK_SIZE (1 /*litCSize*/ + 1 /* RLE or RAW */ + MIN_SEQUENCES_SIZE /* nbSeq==0 */)   /* for a non-null block */
+
+#define HufLog 12
+typedef enum { lbt_huffman, lbt_repeat, lbt_raw, lbt_rle } litBlockType_t;
+
+#define LONGNBSEQ 0x7F00
+
+#define MINMATCH 3
+#define EQUAL_READ32 4
+
+#define Litbits  8
+#define MaxLit ((1<<Litbits) - 1)
+#define MaxML  52
+#define MaxLL  35
+#define MaxOff 28
+#define MaxSeq MAX(MaxLL, MaxML)   /* Assumption : MaxOff < MaxLL,MaxML */
+#define MLFSELog    9
+#define LLFSELog    9
+#define OffFSELog   8
+
+#define FSEv07_ENCODING_RAW     0
+#define FSEv07_ENCODING_RLE     1
+#define FSEv07_ENCODING_STATIC  2
+#define FSEv07_ENCODING_DYNAMIC 3
+
+static const U32 LL_bits[MaxLL+1] = { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+                                      1, 1, 1, 1, 2, 2, 3, 3, 4, 6, 7, 8, 9,10,11,12,
+                                     13,14,15,16 };
+static const S16 LL_defaultNorm[MaxLL+1] = { 4, 3, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1,
+                                             2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 2, 1, 1, 1, 1, 1,
+                                            -1,-1,-1,-1 };
+static const U32 LL_defaultNormLog = 6;
+
+static const U32 ML_bits[MaxML+1] = { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+                                      0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+                                      1, 1, 1, 1, 2, 2, 3, 3, 4, 4, 5, 7, 8, 9,10,11,
+                                     12,13,14,15,16 };
+static const S16 ML_defaultNorm[MaxML+1] = { 1, 4, 3, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1,
+                                             1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
+                                             1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,-1,-1,
+                                            -1,-1,-1,-1,-1 };
+static const U32 ML_defaultNormLog = 6;
+
+static const S16 OF_defaultNorm[MaxOff+1] = { 1, 1, 1, 1, 1, 1, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1,
+                                              1, 1, 1, 1, 1, 1, 1, 1,-1,-1,-1,-1,-1 };
+static const U32 OF_defaultNormLog = 5;
+
+
+/*-*******************************************
+*  Shared functions to include for inlining
+*********************************************/
+static void ZSTDv07_copy8(void* dst, const void* src) { memcpy(dst, src, 8); }
+#define COPY8(d,s) { ZSTDv07_copy8(d,s); d+=8; s+=8; }
+
+/*! ZSTDv07_wildcopy() :
+*   custom version of memcpy(), can copy up to 7 bytes too many (8 bytes if length==0) */
+#define WILDCOPY_OVERLENGTH 8
+MEM_STATIC void ZSTDv07_wildcopy(void* dst, const void* src, size_t length)
+{
+    const BYTE* ip = (const BYTE*)src;
+    BYTE* op = (BYTE*)dst;
+    BYTE* const oend = op + length;
+    do
+        COPY8(op, ip)
+    while (op < oend);
+}
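
Note on the over-copy behaviour above: because the loop always copies whole 8-byte chunks and runs at least once, it can write past `dst + length`, which is why callers reserve `WILDCOPY_OVERLENGTH` bytes of slack. The sketch below reproduces the loop with a return value added purely for illustration; `wildcopy_sketch` and `OVERLENGTH` are hypothetical names, and both buffers are assumed to have at least 8 spare bytes after `length`.

```c
#include <stddef.h>
#include <string.h>

#define OVERLENGTH 8   /* mirrors WILDCOPY_OVERLENGTH from the diff */

/* Copy `length` bytes in 8-byte chunks.
 * Returns the number of bytes actually written, which is `length`
 * rounded up to a multiple of 8, and 8 when length == 0.
 * Both dst and src must have OVERLENGTH bytes of slack past `length`. */
static size_t wildcopy_sketch(unsigned char* dst, const unsigned char* src, size_t length)
{
    unsigned char* op = dst;
    unsigned char* const oend = dst + length;
    do {
        memcpy(op, src, 8);   /* fixed-size copy, usually compiled to one load/store pair */
        op += 8;
        src += 8;
    } while (op < oend);
    return (size_t)(op - dst);
}
```

A request for 5 bytes therefore writes 8, and a request for 17 writes 24, so the overrun is at most 7 bytes for a non-zero length and 8 bytes for a zero length, matching the comment in the diff.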
+
+
+/*-*******************************************
+*  Private interfaces
+*********************************************/
+typedef struct ZSTDv07_stats_s ZSTDv07_stats_t;
+
+typedef struct {
+    U32 off;
+    U32 len;
+} ZSTDv07_match_t;
+
+typedef struct {
+    U32 price;
+    U32 off;
+    U32 mlen;
+    U32 litlen;
+    U32 rep[ZSTDv07_REP_INIT];
+} ZSTDv07_optimal_t;
+
+struct ZSTDv07_stats_s { U32 unused; };
+MEM_STATIC void ZSTDv07_statsPrint(ZSTDv07_stats_t* stats, U32 searchLength) { (void)stats; (void)searchLength; }
+MEM_STATIC void ZSTDv07_statsInit(ZSTDv07_stats_t* stats) { (void)stats; }
+MEM_STATIC void ZSTDv07_statsResetFreqs(ZSTDv07_stats_t* stats) { (void)stats; }
+MEM_STATIC void ZSTDv07_statsUpdatePrices(ZSTDv07_stats_t* stats, size_t litLength, const BYTE* literals, size_t offset, size_t matchLength) { (void)stats; (void)litLength; (void)literals; (void)offset; (void)matchLength; }
+
+typedef struct {
+    void* buffer;
+    U32*  offsetStart;
+    U32*  offset;
+    BYTE* offCodeStart;
+    BYTE* litStart;
+    BYTE* lit;
+    U16*  litLengthStart;
+    U16*  litLength;
+    BYTE* llCodeStart;
+    U16*  matchLengthStart;
+    U16*  matchLength;
+    BYTE* mlCodeStart;
+    U32   longLengthID;   /* 0 == no longLength; 1 == Lit.longLength; 2 == Match.longLength; */
+    U32   longLengthPos;
+    /* opt */
+    ZSTDv07_optimal_t* priceTable;
+    ZSTDv07_match_t* matchTable;
+    U32* matchLengthFreq;
+    U32* litLengthFreq;
+    U32* litFreq;
+    U32* offCodeFreq;
+    U32  matchLengthSum;
+    U32  matchSum;
+    U32  litLengthSum;
+    U32  litSum;
+    U32  offCodeSum;
+    U32  log2matchLengthSum;
+    U32  log2matchSum;
+    U32  log2litLengthSum;
+    U32  log2litSum;
+    U32  log2offCodeSum;
+    U32  factor;
+    U32  cachedPrice;
+    U32  cachedLitLength;
+    const BYTE* cachedLiterals;
+    ZSTDv07_stats_t stats;
+} seqStore_t;
+
+void ZSTDv07_seqToCodes(const seqStore_t* seqStorePtr, size_t const nbSeq);
 
-const char* ZBUFFv06_getErrorName(size_t errorCode) { return ERR_getErrorName(errorCode); }
+/* custom memory allocation functions */
+void* ZSTDv07_defaultAllocFunction(void* opaque, size_t size);
+void ZSTDv07_defaultFreeFunction(void* opaque, void* address);
+static const ZSTDv07_customMem defaultCustomMem = { ZSTDv07_defaultAllocFunction, ZSTDv07_defaultFreeFunction, NULL };
+
+#endif   /* ZSTDv07_CCOMMON_H_MODULE */
 /*
     zstd - standard compression library
     Copyright (C) 2014-2016, Yann Collet.
@@ -3254,15 +3357,14 @@ const char* ZBUFFv06_getErrorName(size_t errorCode) { return ERR_getErrorName(er
 *****************************************************************/
 /*!
  * HEAPMODE :
- * Select how default decompression function ZSTDv06_decompress() will allocate memory,
+ * Select how default decompression function ZSTDv07_decompress() will allocate memory,
  * in memory stack (0), or in memory heap (1, requires malloc())
  */
-#ifndef ZSTDv06_HEAPMODE
-#  define ZSTDv06_HEAPMODE 1
+#ifndef ZSTDv07_HEAPMODE
+#  define ZSTDv07_HEAPMODE 1
 #endif
 
 
-
 /*-*******************************************************
 *  Compiler specifics
 *********************************************************/
@@ -3283,79 +3385,106 @@ const char* ZBUFFv06_getErrorName(size_t errorCode) { return ERR_getErrorName(er
 /*-*************************************
 *  Macros
 ***************************************/
-#define ZSTDv06_isError ERR_isError   /* for inlining */
-#define FSEv06_isError  ERR_isError
-#define HUFv06_isError  ERR_isError
+#define ZSTDv07_isError ERR_isError   /* for inlining */
+#define FSEv07_isError  ERR_isError
+#define HUFv07_isError  ERR_isError
 
 
 /*_*******************************************************
 *  Memory operations
 **********************************************************/
-static void ZSTDv06_copy4(void* dst, const void* src) { memcpy(dst, src, 4); }
+static void ZSTDv07_copy4(void* dst, const void* src) { memcpy(dst, src, 4); }
 
 
 /*-*************************************************************
 *   Context management
 ***************************************************************/
 typedef enum { ZSTDds_getFrameHeaderSize, ZSTDds_decodeFrameHeader,
-               ZSTDds_decodeBlockHeader, ZSTDds_decompressBlock } ZSTDv06_dStage;
+               ZSTDds_decodeBlockHeader, ZSTDds_decompressBlock,
+               ZSTDds_decodeSkippableHeader, ZSTDds_skipFrame } ZSTDv07_dStage;
 
-struct ZSTDv06_DCtx_s
+struct ZSTDv07_DCtx_s
 {
-    FSEv06_DTable LLTable[FSEv06_DTABLE_SIZE_U32(LLFSELog)];
-    FSEv06_DTable OffTable[FSEv06_DTABLE_SIZE_U32(OffFSELog)];
-    FSEv06_DTable MLTable[FSEv06_DTABLE_SIZE_U32(MLFSELog)];
-    unsigned   hufTableX4[HUFv06_DTABLE_SIZE(HufLog)];
+    FSEv07_DTable LLTable[FSEv07_DTABLE_SIZE_U32(LLFSELog)];
+    FSEv07_DTable OffTable[FSEv07_DTABLE_SIZE_U32(OffFSELog)];
+    FSEv07_DTable MLTable[FSEv07_DTABLE_SIZE_U32(MLFSELog)];
+    HUFv07_DTable hufTable[HUFv07_DTABLE_SIZE(HufLog)];  /* can accommodate HUFv07_decompress4X */
     const void* previousDstEnd;
     const void* base;
     const void* vBase;
     const void* dictEnd;
     size_t expected;
+    U32 rep[3];
+    ZSTDv07_frameParams fParams;
+    blockType_t bType;   /* used in ZSTDv07_decompressContinue(), to transfer blockType between header decoding and block decoding stages */
+    ZSTDv07_dStage stage;
+    U32 litEntropy;
+    U32 fseEntropy;
+    XXH64_state_t xxhState;
     size_t headerSize;
-    ZSTDv06_frameParams fParams;
-    blockType_t bType;   /* used in ZSTDv06_decompressContinue(), to transfer blockType between header decoding and block decoding stages */
-    ZSTDv06_dStage stage;
-    U32 flagRepeatTable;
+    U32 dictID;
     const BYTE* litPtr;
+    ZSTDv07_customMem customMem;
     size_t litBufSize;
     size_t litSize;
-    BYTE litBuffer[ZSTDv06_BLOCKSIZE_MAX + WILDCOPY_OVERLENGTH];
-    BYTE headerBuffer[ZSTDv06_FRAMEHEADERSIZE_MAX];
-};  /* typedef'd to ZSTDv06_DCtx within "zstd_static.h" */
+    BYTE litBuffer[ZSTDv07_BLOCKSIZE_ABSOLUTEMAX + WILDCOPY_OVERLENGTH];
+    BYTE headerBuffer[ZSTDv07_FRAMEHEADERSIZE_MAX];
+};  /* typedef'd to ZSTDv07_DCtx within "zstd_static.h" */
+
+int ZSTDv07_isSkipFrame(ZSTDv07_DCtx* dctx);
+
+size_t ZSTDv07_sizeofDCtx (const ZSTDv07_DCtx* dctx) { return sizeof(*dctx); }
 
-size_t ZSTDv06_sizeofDCtx (void) { return sizeof(ZSTDv06_DCtx); }   /* non published interface */
+size_t ZSTDv07_estimateDCtxSize(void) { return sizeof(ZSTDv07_DCtx); }
 
-size_t ZSTDv06_decompressBegin(ZSTDv06_DCtx* dctx)
+size_t ZSTDv07_decompressBegin(ZSTDv07_DCtx* dctx)
 {
-    dctx->expected = ZSTDv06_frameHeaderSize_min;
+    dctx->expected = ZSTDv07_frameHeaderSize_min;
     dctx->stage = ZSTDds_getFrameHeaderSize;
     dctx->previousDstEnd = NULL;
     dctx->base = NULL;
     dctx->vBase = NULL;
     dctx->dictEnd = NULL;
-    dctx->hufTableX4[0] = HufLog;
-    dctx->flagRepeatTable = 0;
+    dctx->hufTable[0] = (HUFv07_DTable)((HufLog)*0x1000001);
+    dctx->litEntropy = dctx->fseEntropy = 0;
+    dctx->dictID = 0;
+    { int i; for (i=0; i<ZSTDv07_REP_NUM; i++) dctx->rep[i] = repStartValue[i]; }
     return 0;
 }
 
-ZSTDv06_DCtx* ZSTDv06_createDCtx(void)
+ZSTDv07_DCtx* ZSTDv07_createDCtx_advanced(ZSTDv07_customMem customMem)
 {
-    ZSTDv06_DCtx* dctx = (ZSTDv06_DCtx*)malloc(sizeof(ZSTDv06_DCtx));
-    if (dctx==NULL) return NULL;
-    ZSTDv06_decompressBegin(dctx);
+    ZSTDv07_DCtx* dctx;
+
+    if (!customMem.customAlloc && !customMem.customFree)
+        customMem = defaultCustomMem;
+
+    if (!customMem.customAlloc || !customMem.customFree)
+        return NULL;
+
+    dctx = (ZSTDv07_DCtx*) customMem.customAlloc(customMem.opaque, sizeof(ZSTDv07_DCtx));
+    if (!dctx) return NULL;
+    memcpy(&dctx->customMem, &customMem, sizeof(ZSTDv07_customMem));
+    ZSTDv07_decompressBegin(dctx);
     return dctx;
 }
 
-size_t ZSTDv06_freeDCtx(ZSTDv06_DCtx* dctx)
+ZSTDv07_DCtx* ZSTDv07_createDCtx(void)
+{
+    return ZSTDv07_createDCtx_advanced(defaultCustomMem);
+}
+
+size_t ZSTDv07_freeDCtx(ZSTDv07_DCtx* dctx)
 {
-    free(dctx);
+    if (dctx==NULL) return 0;   /* support free on NULL */
+    dctx->customMem.customFree(dctx->customMem.opaque, dctx);
     return 0;   /* reserved as a potential error code in the future */
 }
 
-void ZSTDv06_copyDCtx(ZSTDv06_DCtx* dstDCtx, const ZSTDv06_DCtx* srcDCtx)
+void ZSTDv07_copyDCtx(ZSTDv07_DCtx* dstDCtx, const ZSTDv07_DCtx* srcDCtx)
 {
     memcpy(dstDCtx, srcDCtx,
-           sizeof(ZSTDv06_DCtx) - (ZSTDv06_BLOCKSIZE_MAX+WILDCOPY_OVERLENGTH + ZSTDv06_frameHeaderSize_max));  /* no need to copy workspace */
+           sizeof(ZSTDv07_DCtx) - (ZSTDv07_BLOCKSIZE_ABSOLUTEMAX+WILDCOPY_OVERLENGTH + ZSTDv07_frameHeaderSize_max));  /* no need to copy workspace */
 }
 
 
@@ -3366,7 +3495,7 @@ void ZSTDv06_copyDCtx(ZSTDv06_DCtx* dstDCtx, const ZSTDv06_DCtx* srcDCtx)
 /* Frame format description
    Frame Header -  [ Block Header - Block ] - Frame End
    1) Frame Header
-      - 4 bytes - Magic Number : ZSTDv06_MAGICNUMBER (defined within zstd_static.h)
+      - 4 bytes - Magic Number : ZSTDv07_MAGICNUMBER (defined within zstd.h)
       - 1 byte  - Frame Descriptor
    2) Block Header
       - 3 bytes, starting with a 2-bits descriptor
@@ -3378,19 +3507,34 @@ void ZSTDv06_copyDCtx(ZSTDv06_DCtx* dstDCtx, const ZSTDv06_DCtx* srcDCtx)
 */
 
 
-/* Frame descriptor
-
-   1 byte, using :
-   bit 0-3 : windowLog - ZSTDv06_WINDOWLOG_ABSOLUTEMIN   (see zstd_internal.h)
-   bit 4   : minmatch 4(0) or 3(1)
-   bit 5   : reserved (must be zero)
-   bit 6-7 : Frame content size : unknown, 1 byte, 2 bytes, 8 bytes
-
-   Optional : content size (0, 1, 2 or 8 bytes)
-   0 : unknown
-   1 : 0-255 bytes
-   2 : 256 - 65535+256
-   8 : up to 16 exa
+/* Frame Header :
+
+   1 byte - FrameHeaderDescription :
+   bit 0-1 : dictID (0, 1, 2 or 4 bytes)
+   bit 2   : checksumFlag
+   bit 3   : reserved (must be zero)
+   bit 4   : reserved (unused, can be any value)
+   bit 5   : Single Segment (if 1, WindowLog byte is not present)
+   bit 6-7 : FrameContentFieldSize (0, 2, 4, or 8)
+             if (SkippedWindowLog && !FrameContentFieldsize) FrameContentFieldsize=1;
+
+   Optional : WindowLog (0 or 1 byte)
+   bit 0-2 : octal Fractional (1/8th)
+   bit 3-7 : Power of 2, with 0 = 1 KB (up to 2 TB)
+
+   Optional : dictID (0, 1, 2 or 4 bytes)
+   Automatic adaptation
+   0 : no dictID
+   1 : 1 - 255
+   2 : 256 - 65535
+   4 : all other values
+
+   Optional : content size (0, 1, 2, 4 or 8 bytes)
+   0 : unknown          (fcfs==0 and swl==0)
+   1 : 0-255 bytes      (fcfs==0 and swl==1)
+   2 : 256 - 65535+256  (fcfs==1)
+   4 : 0 - 4GB-1        (fcfs==2)
+   8 : 0 - 16EB-1       (fcfs==3)
 */
 
 
@@ -3460,56 +3604,118 @@ void ZSTDv06_copyDCtx(ZSTDv06_DCtx* dstDCtx, const ZSTDv06_DCtx* srcDCtx)
       TO DO
 */
 
-/** ZSTDv06_frameHeaderSize() :
-*   srcSize must be >= ZSTDv06_frameHeaderSize_min.
+/** ZSTDv07_frameHeaderSize() :
+*   srcSize must be >= ZSTDv07_frameHeaderSize_min.
 *   @return : size of the Frame Header */
-static size_t ZSTDv06_frameHeaderSize(const void* src, size_t srcSize)
+static size_t ZSTDv07_frameHeaderSize(const void* src, size_t srcSize)
 {
-    if (srcSize < ZSTDv06_frameHeaderSize_min) return ERROR(srcSize_wrong);
-    { U32 const fcsId = (((const BYTE*)src)[4]) >> 6;
-      return ZSTDv06_frameHeaderSize_min + ZSTDv06_fcs_fieldSize[fcsId]; }
+    if (srcSize < ZSTDv07_frameHeaderSize_min) return ERROR(srcSize_wrong);
+    {   BYTE const fhd = ((const BYTE*)src)[4];
+        U32 const dictID= fhd & 3;
+        U32 const directMode = (fhd >> 5) & 1;
+        U32 const fcsId = fhd >> 6;
+        return ZSTDv07_frameHeaderSize_min + !directMode + ZSTDv07_did_fieldSize[dictID] + ZSTDv07_fcs_fieldSize[fcsId]
+                + (directMode && !ZSTDv07_fcs_fieldSize[fcsId]);
+    }
 }
 
 
-/** ZSTDv06_getFrameParams() :
-*   decode Frame Header, or provide expected `srcSize`.
+/** ZSTDv07_getFrameParams() :
+*   decode Frame Header, or require larger `srcSize`.
 *   @return : 0, `fparamsPtr` is correctly filled,
 *            >0, `srcSize` is too small, result is expected `srcSize`,
-*             or an error code, which can be tested using ZSTDv06_isError() */
-size_t ZSTDv06_getFrameParams(ZSTDv06_frameParams* fparamsPtr, const void* src, size_t srcSize)
+*             or an error code, which can be tested using ZSTDv07_isError() */
+size_t ZSTDv07_getFrameParams(ZSTDv07_frameParams* fparamsPtr, const void* src, size_t srcSize)
 {
     const BYTE* ip = (const BYTE*)src;
 
-    if (srcSize < ZSTDv06_frameHeaderSize_min) return ZSTDv06_frameHeaderSize_min;
-    if (MEM_readLE32(src) != ZSTDv06_MAGICNUMBER) return ERROR(prefix_unknown);
+    if (srcSize < ZSTDv07_frameHeaderSize_min) return ZSTDv07_frameHeaderSize_min;
+    if (MEM_readLE32(src) != ZSTDv07_MAGICNUMBER) {
+        if ((MEM_readLE32(src) & 0xFFFFFFF0U) == ZSTDv07_MAGIC_SKIPPABLE_START) {
+            if (srcSize < ZSTDv07_skippableHeaderSize) return ZSTDv07_skippableHeaderSize; /* magic number + skippable frame length */
+            memset(fparamsPtr, 0, sizeof(*fparamsPtr));
+            fparamsPtr->frameContentSize = MEM_readLE32((const char *)src + 4);
+            fparamsPtr->windowSize = 0; /* windowSize==0 means a frame is skippable */
+            return 0;
+        }
+        return ERROR(prefix_unknown);
+    }
 
     /* ensure there is enough `srcSize` to fully read/decode frame header */
-    { size_t const fhsize = ZSTDv06_frameHeaderSize(src, srcSize);
+    { size_t const fhsize = ZSTDv07_frameHeaderSize(src, srcSize);
       if (srcSize < fhsize) return fhsize; }
 
-    memset(fparamsPtr, 0, sizeof(*fparamsPtr));
-    {   BYTE const frameDesc = ip[4];
-        fparamsPtr->windowLog = (frameDesc & 0xF) + ZSTDv06_WINDOWLOG_ABSOLUTEMIN;
-        if ((frameDesc & 0x20) != 0) return ERROR(frameParameter_unsupported);   /* reserved 1 bit */
-        switch(frameDesc >> 6)  /* fcsId */
+    {   BYTE const fhdByte = ip[4];
+        size_t pos = 5;
+        U32 const dictIDSizeCode = fhdByte&3;
+        U32 const checksumFlag = (fhdByte>>2)&1;
+        U32 const directMode = (fhdByte>>5)&1;
+        U32 const fcsID = fhdByte>>6;
+        U32 const windowSizeMax = 1U << ZSTDv07_WINDOWLOG_MAX;
+        U32 windowSize = 0;
+        U32 dictID = 0;
+        U64 frameContentSize = 0;
+        if ((fhdByte & 0x08) != 0) return ERROR(frameParameter_unsupported);   /* reserved bits, which must be zero */
+        if (!directMode) {
+            BYTE const wlByte = ip[pos++];
+            U32 const windowLog = (wlByte >> 3) + ZSTDv07_WINDOWLOG_ABSOLUTEMIN;
+            if (windowLog > ZSTDv07_WINDOWLOG_MAX) return ERROR(frameParameter_unsupported);
+            windowSize = (1U << windowLog);
+            windowSize += (windowSize >> 3) * (wlByte&7);
+        }
+
+        switch(dictIDSizeCode)
         {
             default:   /* impossible */
-            case 0 : fparamsPtr->frameContentSize = 0; break;
-            case 1 : fparamsPtr->frameContentSize = ip[5]; break;
-            case 2 : fparamsPtr->frameContentSize = MEM_readLE16(ip+5)+256; break;
-            case 3 : fparamsPtr->frameContentSize = MEM_readLE64(ip+5); break;
-    }   }
+            case 0 : break;
+            case 1 : dictID = ip[pos]; pos++; break;
+            case 2 : dictID = MEM_readLE16(ip+pos); pos+=2; break;
+            case 3 : dictID = MEM_readLE32(ip+pos); pos+=4; break;
+        }
+        switch(fcsID)
+        {
+            default:   /* impossible */
+            case 0 : if (directMode) frameContentSize = ip[pos]; break;
+            case 1 : frameContentSize = MEM_readLE16(ip+pos)+256; break;
+            case 2 : frameContentSize = MEM_readLE32(ip+pos); break;
+            case 3 : frameContentSize = MEM_readLE64(ip+pos); break;
+        }
+        if (!windowSize) windowSize = (U32)frameContentSize;
+        if (windowSize > windowSizeMax) return ERROR(frameParameter_unsupported);
+        fparamsPtr->frameContentSize = frameContentSize;
+        fparamsPtr->windowSize = windowSize;
+        fparamsPtr->dictID = dictID;
+        fparamsPtr->checksumFlag = checksumFlag;
+    }
     return 0;
 }
 
 
-/** ZSTDv06_decodeFrameHeader() :
-*   `srcSize` must be the size provided by ZSTDv06_frameHeaderSize().
-*   @return : 0 if success, or an error code, which can be tested using ZSTDv06_isError() */
-static size_t ZSTDv06_decodeFrameHeader(ZSTDv06_DCtx* zc, const void* src, size_t srcSize)
+/** ZSTDv07_getDecompressedSize() :
+*   compatible with legacy mode
+*   @return : decompressed size if known, 0 otherwise
+              note : 0 can mean any of the following :
+                   - decompressed size is not provided within frame header
+                   - frame header unknown / not supported
+                   - frame header not completely provided (`srcSize` too small) */
+unsigned long long ZSTDv07_getDecompressedSize(const void* src, size_t srcSize)
 {
-    size_t const result = ZSTDv06_getFrameParams(&(zc->fParams), src, srcSize);
-    if ((MEM_32bits()) && (zc->fParams.windowLog > 25)) return ERROR(frameParameter_unsupportedBy32bits);
+    {   ZSTDv07_frameParams fparams;
+        size_t const frResult = ZSTDv07_getFrameParams(&fparams, src, srcSize);
+        if (frResult!=0) return 0;
+        return fparams.frameContentSize;
+    }
+}
+
+
+/** ZSTDv07_decodeFrameHeader() :
+*   `srcSize` must be the size provided by ZSTDv07_frameHeaderSize().
+*   @return : 0 if success, or an error code, which can be tested using ZSTDv07_isError() */
+static size_t ZSTDv07_decodeFrameHeader(ZSTDv07_DCtx* dctx, const void* src, size_t srcSize)
+{
+    size_t const result = ZSTDv07_getFrameParams(&(dctx->fParams), src, srcSize);
+    if (dctx->fParams.dictID && (dctx->dictID != dctx->fParams.dictID)) return ERROR(dictionary_wrong);
+    if (dctx->fParams.checksumFlag) XXH64_reset(&dctx->xxhState, 0);
     return result;
 }
 
@@ -3520,14 +3726,14 @@ typedef struct
     U32 origSize;
 } blockProperties_t;
 
-/*! ZSTDv06_getcBlockSize() :
+/*! ZSTDv07_getcBlockSize() :
 *   Provides the size of compressed block from block header `src` */
-size_t ZSTDv06_getcBlockSize(const void* src, size_t srcSize, blockProperties_t* bpPtr)
+size_t ZSTDv07_getcBlockSize(const void* src, size_t srcSize, blockProperties_t* bpPtr)
 {
     const BYTE* const in = (const BYTE* const)src;
     U32 cSize;
 
-    if (srcSize < ZSTDv06_blockHeaderSize) return ERROR(srcSize_wrong);
+    if (srcSize < ZSTDv07_blockHeaderSize) return ERROR(srcSize_wrong);
 
     bpPtr->blockType = (blockType_t)((*in) >> 6);
     cSize = in[2] + (in[1]<<8) + ((in[0] & 7)<<16);
@@ -3539,7 +3745,7 @@ size_t ZSTDv06_getcBlockSize(const void* src, size_t srcSize, blockProperties_t*
 }
 
 
-static size_t ZSTDv06_copyRawBlock(void* dst, size_t dstCapacity, const void* src, size_t srcSize)
+static size_t ZSTDv07_copyRawBlock(void* dst, size_t dstCapacity, const void* src, size_t srcSize)
 {
     if (srcSize > dstCapacity) return ERROR(dstSize_tooSmall);
     memcpy(dst, src, srcSize);
@@ -3547,21 +3753,20 @@ static size_t ZSTDv06_copyRawBlock(void* dst, size_t dstCapacity, const void* sr
 }
 
 
-/*! ZSTDv06_decodeLiteralsBlock() :
+/*! ZSTDv07_decodeLiteralsBlock() :
     @return : nb of bytes read from src (< srcSize ) */
-size_t ZSTDv06_decodeLiteralsBlock(ZSTDv06_DCtx* dctx,
+size_t ZSTDv07_decodeLiteralsBlock(ZSTDv07_DCtx* dctx,
                           const void* src, size_t srcSize)   /* note : srcSize < BLOCKSIZE */
 {
     const BYTE* const istart = (const BYTE*) src;
 
-    /* any compressed block with literals segment must be at least this size */
     if (srcSize < MIN_CBLOCK_SIZE) return ERROR(corruption_detected);
 
-    switch(istart[0]>> 6)
+    switch((litBlockType_t)(istart[0]>> 6))
     {
-    case IS_HUF:
+    case lbt_huffman:
         {   size_t litSize, litCSize, singleStream=0;
-            U32 lhSize = ((istart[0]) >> 4) & 3;
+            U32 lhSize = (istart[0] >> 4) & 3;
             if (srcSize < 5) return ERROR(corruption_detected);   /* srcSize >= MIN_CBLOCK_SIZE == 3; here we need up to 5 for lhSize, + cSize (+nbSeq) */
             switch(lhSize)
             {
@@ -3585,41 +3790,43 @@ size_t ZSTDv06_decodeLiteralsBlock(ZSTDv06_DCtx* dctx,
                 litCSize = ((istart[2] &  3) << 16) + (istart[3] << 8) + istart[4];
                 break;
             }
-            if (litSize > ZSTDv06_BLOCKSIZE_MAX) return ERROR(corruption_detected);
+            if (litSize > ZSTDv07_BLOCKSIZE_ABSOLUTEMAX) return ERROR(corruption_detected);
             if (litCSize + lhSize > srcSize) return ERROR(corruption_detected);
 
-            if (HUFv06_isError(singleStream ?
-                            HUFv06_decompress1X2(dctx->litBuffer, litSize, istart+lhSize, litCSize) :
-                            HUFv06_decompress   (dctx->litBuffer, litSize, istart+lhSize, litCSize) ))
+            if (HUFv07_isError(singleStream ?
+                            HUFv07_decompress1X2_DCtx(dctx->hufTable, dctx->litBuffer, litSize, istart+lhSize, litCSize) :
+                            HUFv07_decompress4X_hufOnly (dctx->hufTable, dctx->litBuffer, litSize, istart+lhSize, litCSize) ))
                 return ERROR(corruption_detected);
 
             dctx->litPtr = dctx->litBuffer;
-            dctx->litBufSize = ZSTDv06_BLOCKSIZE_MAX+8;
+            dctx->litBufSize = ZSTDv07_BLOCKSIZE_ABSOLUTEMAX+8;
             dctx->litSize = litSize;
+            dctx->litEntropy = 1;
             return litCSize + lhSize;
         }
-    case IS_PCH:
+    case lbt_repeat:
         {   size_t litSize, litCSize;
             U32 lhSize = ((istart[0]) >> 4) & 3;
             if (lhSize != 1)  /* only case supported for now : small litSize, single stream */
                 return ERROR(corruption_detected);
-            if (!dctx->flagRepeatTable)
+            if (dctx->litEntropy==0)
                 return ERROR(dictionary_corrupted);
 
             /* 2 - 2 - 10 - 10 */
             lhSize=3;
             litSize  = ((istart[0] & 15) << 6) + (istart[1] >> 2);
             litCSize = ((istart[1] &  3) << 8) + istart[2];
+            if (litCSize + lhSize > srcSize) return ERROR(corruption_detected);
 
-            {   size_t const errorCode = HUFv06_decompress1X4_usingDTable(dctx->litBuffer, litSize, istart+lhSize, litCSize, dctx->hufTableX4);
-                if (HUFv06_isError(errorCode)) return ERROR(corruption_detected);
+            {   size_t const errorCode = HUFv07_decompress1X4_usingDTable(dctx->litBuffer, litSize, istart+lhSize, litCSize, dctx->hufTable);
+                if (HUFv07_isError(errorCode)) return ERROR(corruption_detected);
             }
             dctx->litPtr = dctx->litBuffer;
-            dctx->litBufSize = ZSTDv06_BLOCKSIZE_MAX+WILDCOPY_OVERLENGTH;
+            dctx->litBufSize = ZSTDv07_BLOCKSIZE_ABSOLUTEMAX+WILDCOPY_OVERLENGTH;
             dctx->litSize = litSize;
             return litCSize + lhSize;
         }
-    case IS_RAW:
+    case lbt_raw:
         {   size_t litSize;
             U32 lhSize = ((istart[0]) >> 4) & 3;
             switch(lhSize)
@@ -3640,7 +3847,7 @@ size_t ZSTDv06_decodeLiteralsBlock(ZSTDv06_DCtx* dctx,
                 if (litSize+lhSize > srcSize) return ERROR(corruption_detected);
                 memcpy(dctx->litBuffer, istart+lhSize, litSize);
                 dctx->litPtr = dctx->litBuffer;
-                dctx->litBufSize = ZSTDv06_BLOCKSIZE_MAX+8;
+                dctx->litBufSize = ZSTDv07_BLOCKSIZE_ABSOLUTEMAX+8;
                 dctx->litSize = litSize;
                 return lhSize+litSize;
             }
@@ -3650,7 +3857,7 @@ size_t ZSTDv06_decodeLiteralsBlock(ZSTDv06_DCtx* dctx,
             dctx->litSize = litSize;
             return lhSize+litSize;
         }
-    case IS_RLE:
+    case lbt_rle:
         {   size_t litSize;
             U32 lhSize = ((istart[0]) >> 4) & 3;
             switch(lhSize)
@@ -3667,10 +3874,10 @@ size_t ZSTDv06_decodeLiteralsBlock(ZSTDv06_DCtx* dctx,
                 if (srcSize<4) return ERROR(corruption_detected);   /* srcSize >= MIN_CBLOCK_SIZE == 3; here we need lhSize+1 = 4 */
                 break;
             }
-            if (litSize > ZSTDv06_BLOCKSIZE_MAX) return ERROR(corruption_detected);
+            if (litSize > ZSTDv07_BLOCKSIZE_ABSOLUTEMAX) return ERROR(corruption_detected);
             memset(dctx->litBuffer, istart[lhSize], litSize);
             dctx->litPtr = dctx->litBuffer;
-            dctx->litBufSize = ZSTDv06_BLOCKSIZE_MAX+WILDCOPY_OVERLENGTH;
+            dctx->litBufSize = ZSTDv07_BLOCKSIZE_ABSOLUTEMAX+WILDCOPY_OVERLENGTH;
             dctx->litSize = litSize;
             return lhSize+1;
         }
@@ -3680,42 +3887,42 @@ size_t ZSTDv06_decodeLiteralsBlock(ZSTDv06_DCtx* dctx,
 }
 
 
-/*! ZSTDv06_buildSeqTable() :
+/*! ZSTDv07_buildSeqTable() :
     @return : nb bytes read from src,
-              or an error code if it fails, testable with ZSTDv06_isError()
+              or an error code if it fails, testable with ZSTDv07_isError()
 */
-size_t ZSTDv06_buildSeqTable(FSEv06_DTable* DTable, U32 type, U32 max, U32 maxLog,
+size_t ZSTDv07_buildSeqTable(FSEv07_DTable* DTable, U32 type, U32 max, U32 maxLog,
                                  const void* src, size_t srcSize,
                                  const S16* defaultNorm, U32 defaultLog, U32 flagRepeatTable)
 {
     switch(type)
     {
-    case FSEv06_ENCODING_RLE :
+    case FSEv07_ENCODING_RLE :
         if (!srcSize) return ERROR(srcSize_wrong);
         if ( (*(const BYTE*)src) > max) return ERROR(corruption_detected);
-        FSEv06_buildDTable_rle(DTable, *(const BYTE*)src);   /* if *src > max, data is corrupted */
+        FSEv07_buildDTable_rle(DTable, *(const BYTE*)src);   /* if *src > max, data is corrupted */
         return 1;
-    case FSEv06_ENCODING_RAW :
-        FSEv06_buildDTable(DTable, defaultNorm, max, defaultLog);
+    case FSEv07_ENCODING_RAW :
+        FSEv07_buildDTable(DTable, defaultNorm, max, defaultLog);
         return 0;
-    case FSEv06_ENCODING_STATIC:
+    case FSEv07_ENCODING_STATIC:
         if (!flagRepeatTable) return ERROR(corruption_detected);
         return 0;
     default :   /* impossible */
-    case FSEv06_ENCODING_DYNAMIC :
+    case FSEv07_ENCODING_DYNAMIC :
         {   U32 tableLog;
             S16 norm[MaxSeq+1];
-            size_t const headerSize = FSEv06_readNCount(norm, &max, &tableLog, src, srcSize);
-            if (FSEv06_isError(headerSize)) return ERROR(corruption_detected);
+            size_t const headerSize = FSEv07_readNCount(norm, &max, &tableLog, src, srcSize);
+            if (FSEv07_isError(headerSize)) return ERROR(corruption_detected);
             if (tableLog > maxLog) return ERROR(corruption_detected);
-            FSEv06_buildDTable(DTable, norm, max, tableLog);
+            FSEv07_buildDTable(DTable, norm, max, tableLog);
             return headerSize;
     }   }
 }
 
 
-size_t ZSTDv06_decodeSeqHeaders(int* nbSeqPtr,
-                             FSEv06_DTable* DTableLL, FSEv06_DTable* DTableML, FSEv06_DTable* DTableOffb, U32 flagRepeatTable,
+size_t ZSTDv07_decodeSeqHeaders(int* nbSeqPtr,
+                             FSEv07_DTable* DTableLL, FSEv07_DTable* DTableML, FSEv07_DTable* DTableOffb, U32 flagRepeatTable,
                              const void* src, size_t srcSize)
 {
     const BYTE* const istart = (const BYTE* const)src;
@@ -3739,7 +3946,7 @@ size_t ZSTDv06_decodeSeqHeaders(int* nbSeqPtr,
 
     /* FSE table descriptors */
     {   U32 const LLtype  = *ip >> 6;
-        U32 const Offtype = (*ip >> 4) & 3;
+        U32 const OFtype = (*ip >> 4) & 3;
         U32 const MLtype  = (*ip >> 2) & 3;
         ip++;
 
@@ -3747,17 +3954,17 @@ size_t ZSTDv06_decodeSeqHeaders(int* nbSeqPtr,
         if (ip > iend-3) return ERROR(srcSize_wrong); /* min : all 3 are "raw", hence no header, but at least xxLog bits per type */
 
         /* Build DTables */
-        {   size_t const bhSize = ZSTDv06_buildSeqTable(DTableLL, LLtype, MaxLL, LLFSELog, ip, iend-ip, LL_defaultNorm, LL_defaultNormLog, flagRepeatTable);
-            if (ZSTDv06_isError(bhSize)) return ERROR(corruption_detected);
-            ip += bhSize;
+        {   size_t const llhSize = ZSTDv07_buildSeqTable(DTableLL, LLtype, MaxLL, LLFSELog, ip, iend-ip, LL_defaultNorm, LL_defaultNormLog, flagRepeatTable);
+            if (ZSTDv07_isError(llhSize)) return ERROR(corruption_detected);
+            ip += llhSize;
         }
-        {   size_t const bhSize = ZSTDv06_buildSeqTable(DTableOffb, Offtype, MaxOff, OffFSELog, ip, iend-ip, OF_defaultNorm, OF_defaultNormLog, flagRepeatTable);
-            if (ZSTDv06_isError(bhSize)) return ERROR(corruption_detected);
-            ip += bhSize;
+        {   size_t const ofhSize = ZSTDv07_buildSeqTable(DTableOffb, OFtype, MaxOff, OffFSELog, ip, iend-ip, OF_defaultNorm, OF_defaultNormLog, flagRepeatTable);
+            if (ZSTDv07_isError(ofhSize)) return ERROR(corruption_detected);
+            ip += ofhSize;
         }
-        {   size_t const bhSize = ZSTDv06_buildSeqTable(DTableML, MLtype, MaxML, MLFSELog, ip, iend-ip, ML_defaultNorm, ML_defaultNormLog, flagRepeatTable);
-            if (ZSTDv06_isError(bhSize)) return ERROR(corruption_detected);
-            ip += bhSize;
+        {   size_t const mlhSize = ZSTDv07_buildSeqTable(DTableML, MLtype, MaxML, MLFSELog, ip, iend-ip, ML_defaultNorm, ML_defaultNormLog, flagRepeatTable);
+            if (ZSTDv07_isError(mlhSize)) return ERROR(corruption_detected);
+            ip += mlhSize;
     }   }
 
     return ip-istart;
@@ -3771,21 +3978,21 @@ typedef struct {
 } seq_t;
 
 typedef struct {
-    BITv06_DStream_t DStream;
-    FSEv06_DState_t stateLL;
-    FSEv06_DState_t stateOffb;
-    FSEv06_DState_t stateML;
-    size_t prevOffset[ZSTDv06_REP_INIT];
+    BITv07_DStream_t DStream;
+    FSEv07_DState_t stateLL;
+    FSEv07_DState_t stateOffb;
+    FSEv07_DState_t stateML;
+    size_t prevOffset[ZSTDv07_REP_INIT];
 } seqState_t;
 
 
-
-static void ZSTDv06_decodeSequence(seq_t* seq, seqState_t* seqState)
+static seq_t ZSTDv07_decodeSequence(seqState_t* seqState)
 {
-    /* Literal length */
-    U32 const llCode = FSEv06_peekSymbol(&(seqState->stateLL));
-    U32 const mlCode = FSEv06_peekSymbol(&(seqState->stateML));
-    U32 const ofCode = FSEv06_peekSymbol(&(seqState->stateOffb));   /* <= maxOff, by table construction */
+    seq_t seq;
+
+    U32 const llCode = FSEv07_peekSymbol(&(seqState->stateLL));
+    U32 const mlCode = FSEv07_peekSymbol(&(seqState->stateML));
+    U32 const ofCode = FSEv07_peekSymbol(&(seqState->stateOffb));   /* <= maxOff, by table construction */
 
     U32 const llBits = LL_bits[llCode];
     U32 const mlBits = ML_bits[mlCode];
@@ -3798,83 +4005,80 @@ static void ZSTDv06_decodeSequence(seq_t* seq, seqState_t* seqState)
                             0x2000, 0x4000, 0x8000, 0x10000 };
 
     static const U32 ML_base[MaxML+1] = {
-                             0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10,   11,    12,    13,    14,    15,
-                            16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26,   27,    28,    29,    30,    31,
-                            32, 34, 36, 38, 40, 44, 48, 56, 64, 80, 96, 0x80, 0x100, 0x200, 0x400, 0x800,
-                            0x1000, 0x2000, 0x4000, 0x8000, 0x10000 };
+                             3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13,   14,    15,    16,    17,    18,
+                            19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29,   30,    31,    32,    33,    34,
+                            35, 37, 39, 41, 43, 47, 51, 59, 67, 83, 99, 0x83, 0x103, 0x203, 0x403, 0x803,
+                            0x1003, 0x2003, 0x4003, 0x8003, 0x10003 };
 
     static const U32 OF_base[MaxOff+1] = {
-                 0,        1,       3,       7,     0xF,     0x1F,     0x3F,     0x7F,
-                 0xFF,   0x1FF,   0x3FF,   0x7FF,   0xFFF,   0x1FFF,   0x3FFF,   0x7FFF,
-                 0xFFFF, 0x1FFFF, 0x3FFFF, 0x7FFFF, 0xFFFFF, 0x1FFFFF, 0x3FFFFF, 0x7FFFFF,
-                 0xFFFFFF, 0x1FFFFFF, 0x3FFFFFF, /*fake*/ 1, 1 };
+                 0,        1,       1,       5,     0xD,     0x1D,     0x3D,     0x7D,
+                 0xFD,   0x1FD,   0x3FD,   0x7FD,   0xFFD,   0x1FFD,   0x3FFD,   0x7FFD,
+                 0xFFFD, 0x1FFFD, 0x3FFFD, 0x7FFFD, 0xFFFFD, 0x1FFFFD, 0x3FFFFD, 0x7FFFFD,
+                 0xFFFFFD, 0x1FFFFFD, 0x3FFFFFD, 0x7FFFFFD, 0xFFFFFFD };
 
     /* sequence */
     {   size_t offset;
         if (!ofCode)
             offset = 0;
         else {
-            offset = OF_base[ofCode] + BITv06_readBits(&(seqState->DStream), ofBits);   /* <=  26 bits */
-            if (MEM_32bits()) BITv06_reloadDStream(&(seqState->DStream));
+            offset = OF_base[ofCode] + BITv07_readBits(&(seqState->DStream), ofBits);   /* <=  (ZSTDv07_WINDOWLOG_MAX-1) bits */
+            if (MEM_32bits()) BITv07_reloadDStream(&(seqState->DStream));
         }
 
-        if (offset < ZSTDv06_REP_NUM) {
-            if (llCode == 0 && offset <= 1) offset = 1-offset;
-
-            if (offset != 0) {
-                size_t temp = seqState->prevOffset[offset];
-                if (offset != 1) {
-                    seqState->prevOffset[2] = seqState->prevOffset[1];
-                }
+        if (ofCode <= 1) {
+            if ((llCode == 0) & (offset <= 1)) offset = 1-offset;
+            if (offset) {
+                size_t const temp = seqState->prevOffset[offset];
+                if (offset != 1) seqState->prevOffset[2] = seqState->prevOffset[1];
                 seqState->prevOffset[1] = seqState->prevOffset[0];
                 seqState->prevOffset[0] = offset = temp;
-
             } else {
                 offset = seqState->prevOffset[0];
             }
         } else {
-            offset -= ZSTDv06_REP_MOVE;
             seqState->prevOffset[2] = seqState->prevOffset[1];
             seqState->prevOffset[1] = seqState->prevOffset[0];
             seqState->prevOffset[0] = offset;
         }
-        seq->offset = offset;
+        seq.offset = offset;
     }
 
-    seq->matchLength = ML_base[mlCode] + MINMATCH + ((mlCode>31) ? BITv06_readBits(&(seqState->DStream), mlBits) : 0);   /* <=  16 bits */
-    if (MEM_32bits() && (mlBits+llBits>24)) BITv06_reloadDStream(&(seqState->DStream));
+    seq.matchLength = ML_base[mlCode] + ((mlCode>31) ? BITv07_readBits(&(seqState->DStream), mlBits) : 0);   /* <=  16 bits */
+    if (MEM_32bits() && (mlBits+llBits>24)) BITv07_reloadDStream(&(seqState->DStream));
 
-    seq->litLength = LL_base[llCode] + ((llCode>15) ? BITv06_readBits(&(seqState->DStream), llBits) : 0);   /* <=  16 bits */
+    seq.litLength = LL_base[llCode] + ((llCode>15) ? BITv07_readBits(&(seqState->DStream), llBits) : 0);   /* <=  16 bits */
     if (MEM_32bits() ||
-       (totalBits > 64 - 7 - (LLFSELog+MLFSELog+OffFSELog)) ) BITv06_reloadDStream(&(seqState->DStream));
+       (totalBits > 64 - 7 - (LLFSELog+MLFSELog+OffFSELog)) ) BITv07_reloadDStream(&(seqState->DStream));
 
     /* ANS state update */
-    FSEv06_updateState(&(seqState->stateLL), &(seqState->DStream));   /* <=  9 bits */
-    FSEv06_updateState(&(seqState->stateML), &(seqState->DStream));   /* <=  9 bits */
-    if (MEM_32bits()) BITv06_reloadDStream(&(seqState->DStream));     /* <= 18 bits */
-    FSEv06_updateState(&(seqState->stateOffb), &(seqState->DStream)); /* <=  8 bits */
+    FSEv07_updateState(&(seqState->stateLL), &(seqState->DStream));   /* <=  9 bits */
+    FSEv07_updateState(&(seqState->stateML), &(seqState->DStream));   /* <=  9 bits */
+    if (MEM_32bits()) BITv07_reloadDStream(&(seqState->DStream));     /* <= 18 bits */
+    FSEv07_updateState(&(seqState->stateOffb), &(seqState->DStream)); /* <=  8 bits */
+
+    return seq;
 }
 
 
-size_t ZSTDv06_execSequence(BYTE* op,
+static
+size_t ZSTDv07_execSequence(BYTE* op,
                                 BYTE* const oend, seq_t sequence,
-                                const BYTE** litPtr, const BYTE* const litLimit_8,
+                                const BYTE** litPtr, const BYTE* const litLimit_w,
                                 const BYTE* const base, const BYTE* const vBase, const BYTE* const dictEnd)
 {
     BYTE* const oLitEnd = op + sequence.litLength;
     size_t const sequenceLength = sequence.litLength + sequence.matchLength;
     BYTE* const oMatchEnd = op + sequenceLength;   /* risk : address space overflow (32-bits) */
-    BYTE* const oend_8 = oend-8;
+    BYTE* const oend_w = oend-WILDCOPY_OVERLENGTH;
     const BYTE* const iLitEnd = *litPtr + sequence.litLength;
     const BYTE* match = oLitEnd - sequence.offset;
 
     /* check */
-    if (oLitEnd > oend_8) return ERROR(dstSize_tooSmall);   /* last match must start at a minimum distance of 8 from oend */
-    if (oMatchEnd > oend) return ERROR(dstSize_tooSmall);   /* overwrite beyond dst buffer */
-    if (iLitEnd > litLimit_8) return ERROR(corruption_detected);   /* over-read beyond lit buffer */
+    if ((oLitEnd>oend_w) | (oMatchEnd>oend)) return ERROR(dstSize_tooSmall); /* last match must start at a minimum distance of WILDCOPY_OVERLENGTH from oend */
+    if (iLitEnd > litLimit_w) return ERROR(corruption_detected);   /* over-read beyond lit buffer */
 
     /* copy Literals */
-    ZSTDv06_wildcopy(op, *litPtr, sequence.litLength);   /* note : oLitEnd <= oend-8 : no risk of overwrite beyond oend */
+    ZSTDv07_wildcopy(op, *litPtr, sequence.litLength);   /* note : since oLitEnd <= oend-WILDCOPY_OVERLENGTH, no risk of overwrite beyond oend */
     op = oLitEnd;
     *litPtr = iLitEnd;   /* update for next sequence */
 
@@ -3906,29 +4110,29 @@ size_t ZSTDv06_execSequence(BYTE* op,
         op[2] = match[2];
         op[3] = match[3];
         match += dec32table[sequence.offset];
-        ZSTDv06_copy4(op+4, match);
+        ZSTDv07_copy4(op+4, match);
         match -= sub2;
     } else {
-        ZSTDv06_copy8(op, match);
+        ZSTDv07_copy8(op, match);
     }
     op += 8; match += 8;
 
     if (oMatchEnd > oend-(16-MINMATCH)) {
-        if (op < oend_8) {
-            ZSTDv06_wildcopy(op, match, oend_8 - op);
-            match += oend_8 - op;
-            op = oend_8;
+        if (op < oend_w) {
+            ZSTDv07_wildcopy(op, match, oend_w - op);
+            match += oend_w - op;
+            op = oend_w;
         }
         while (op < oMatchEnd) *op++ = *match++;
     } else {
-        ZSTDv06_wildcopy(op, match, sequence.matchLength-8);   /* works even if matchLength < 8 */
+        ZSTDv07_wildcopy(op, match, sequence.matchLength-8);   /* works even if matchLength < 8 */
     }
     return sequenceLength;
 }
 
 
-static size_t ZSTDv06_decompressSequences(
-                               ZSTDv06_DCtx* dctx,
+static size_t ZSTDv07_decompressSequences(
+                               ZSTDv07_DCtx* dctx,
                                void* dst, size_t maxDstSize,
                          const void* seqStart, size_t seqSize)
 {
@@ -3938,63 +4142,51 @@ static size_t ZSTDv06_decompressSequences(
     BYTE* const oend = ostart + maxDstSize;
     BYTE* op = ostart;
     const BYTE* litPtr = dctx->litPtr;
-    const BYTE* const litLimit_8 = litPtr + dctx->litBufSize - 8;
+    const BYTE* const litLimit_w = litPtr + dctx->litBufSize - WILDCOPY_OVERLENGTH;
     const BYTE* const litEnd = litPtr + dctx->litSize;
-    FSEv06_DTable* DTableLL = dctx->LLTable;
-    FSEv06_DTable* DTableML = dctx->MLTable;
-    FSEv06_DTable* DTableOffb = dctx->OffTable;
+    FSEv07_DTable* DTableLL = dctx->LLTable;
+    FSEv07_DTable* DTableML = dctx->MLTable;
+    FSEv07_DTable* DTableOffb = dctx->OffTable;
     const BYTE* const base = (const BYTE*) (dctx->base);
     const BYTE* const vBase = (const BYTE*) (dctx->vBase);
     const BYTE* const dictEnd = (const BYTE*) (dctx->dictEnd);
     int nbSeq;
 
     /* Build Decoding Tables */
-    {   size_t const seqHSize = ZSTDv06_decodeSeqHeaders(&nbSeq, DTableLL, DTableML, DTableOffb, dctx->flagRepeatTable, ip, seqSize);
-        if (ZSTDv06_isError(seqHSize)) return seqHSize;
+    {   size_t const seqHSize = ZSTDv07_decodeSeqHeaders(&nbSeq, DTableLL, DTableML, DTableOffb, dctx->fseEntropy, ip, seqSize);
+        if (ZSTDv07_isError(seqHSize)) return seqHSize;
         ip += seqHSize;
-        dctx->flagRepeatTable = 0;
     }
 
     /* Regen sequences */
     if (nbSeq) {
-        seq_t sequence;
         seqState_t seqState;
-
-        memset(&sequence, 0, sizeof(sequence));
-        sequence.offset = REPCODE_STARTVALUE;
-        { U32 i; for (i=0; i<ZSTDv06_REP_INIT; i++) seqState.prevOffset[i] = REPCODE_STARTVALUE; }
-        { size_t const errorCode = BITv06_initDStream(&(seqState.DStream), ip, iend-ip);
+        dctx->fseEntropy = 1;
+        { U32 i; for (i=0; i<ZSTDv07_REP_INIT; i++) seqState.prevOffset[i] = dctx->rep[i]; }
+        { size_t const errorCode = BITv07_initDStream(&(seqState.DStream), ip, iend-ip);
           if (ERR_isError(errorCode)) return ERROR(corruption_detected); }
-        FSEv06_initDState(&(seqState.stateLL), &(seqState.DStream), DTableLL);
-        FSEv06_initDState(&(seqState.stateOffb), &(seqState.DStream), DTableOffb);
-        FSEv06_initDState(&(seqState.stateML), &(seqState.DStream), DTableML);
+        FSEv07_initDState(&(seqState.stateLL), &(seqState.DStream), DTableLL);
+        FSEv07_initDState(&(seqState.stateOffb), &(seqState.DStream), DTableOffb);
+        FSEv07_initDState(&(seqState.stateML), &(seqState.DStream), DTableML);
 
-        for ( ; (BITv06_reloadDStream(&(seqState.DStream)) <= BITv06_DStream_completed) && nbSeq ; ) {
+        for ( ; (BITv07_reloadDStream(&(seqState.DStream)) <= BITv07_DStream_completed) && nbSeq ; ) {
             nbSeq--;
-            ZSTDv06_decodeSequence(&sequence, &seqState);
-
-#if 0  /* debug */
-            static BYTE* start = NULL;
-            if (start==NULL) start = op;
-            size_t pos = (size_t)(op-start);
-            if ((pos >= 5810037) && (pos < 5810400))
-                printf("Dpos %6u :%5u literals & match %3u bytes at distance %6u \n",
-                       pos, (U32)sequence.litLength, (U32)sequence.matchLength, (U32)sequence.offset);
-#endif
-
-            {   size_t const oneSeqSize = ZSTDv06_execSequence(op, oend, sequence, &litPtr, litLimit_8, base, vBase, dictEnd);
-                if (ZSTDv06_isError(oneSeqSize)) return oneSeqSize;
+            {   seq_t const sequence = ZSTDv07_decodeSequence(&seqState);
+                size_t const oneSeqSize = ZSTDv07_execSequence(op, oend, sequence, &litPtr, litLimit_w, base, vBase, dictEnd);
+                if (ZSTDv07_isError(oneSeqSize)) return oneSeqSize;
                 op += oneSeqSize;
         }   }
 
         /* check if reached exact end */
         if (nbSeq) return ERROR(corruption_detected);
+        /* save reps for next block */
+        { U32 i; for (i=0; i<ZSTDv07_REP_INIT; i++) dctx->rep[i] = (U32)(seqState.prevOffset[i]); }
     }
 
     /* last literal segment */
     {   size_t const lastLLSize = litEnd - litPtr;
-        if (litPtr > litEnd) return ERROR(corruption_detected);   /* too many literals already used */
-        if (op+lastLLSize > oend) return ERROR(dstSize_tooSmall);
+        //if (litPtr > litEnd) return ERROR(corruption_detected);   /* too many literals already used */
+        if (lastLLSize > (size_t)(oend-op)) return ERROR(dstSize_tooSmall);
         memcpy(op, litPtr, lastLLSize);
         op += lastLLSize;
     }
@@ -4003,7 +4195,7 @@ static size_t ZSTDv06_decompressSequences(
 }
 
 
-static void ZSTDv06_checkContinuity(ZSTDv06_DCtx* dctx, const void* dst)
+static void ZSTDv07_checkContinuity(ZSTDv07_DCtx* dctx, const void* dst)
 {
     if (dst != dctx->previousDstEnd) {   /* not contiguous */
         dctx->dictEnd = dctx->previousDstEnd;
@@ -4014,89 +4206,112 @@ static void ZSTDv06_checkContinuity(ZSTDv06_DCtx* dctx, const void* dst)
 }
 
 
-static size_t ZSTDv06_decompressBlock_internal(ZSTDv06_DCtx* dctx,
+static size_t ZSTDv07_decompressBlock_internal(ZSTDv07_DCtx* dctx,
                             void* dst, size_t dstCapacity,
                       const void* src, size_t srcSize)
 {   /* blockType == blockCompressed */
     const BYTE* ip = (const BYTE*)src;
 
-    if (srcSize >= ZSTDv06_BLOCKSIZE_MAX) return ERROR(srcSize_wrong);
+    if (srcSize >= ZSTDv07_BLOCKSIZE_ABSOLUTEMAX) return ERROR(srcSize_wrong);
 
     /* Decode literals sub-block */
-    {   size_t const litCSize = ZSTDv06_decodeLiteralsBlock(dctx, src, srcSize);
-        if (ZSTDv06_isError(litCSize)) return litCSize;
+    {   size_t const litCSize = ZSTDv07_decodeLiteralsBlock(dctx, src, srcSize);
+        if (ZSTDv07_isError(litCSize)) return litCSize;
         ip += litCSize;
         srcSize -= litCSize;
     }
-    return ZSTDv06_decompressSequences(dctx, dst, dstCapacity, ip, srcSize);
+    return ZSTDv07_decompressSequences(dctx, dst, dstCapacity, ip, srcSize);
 }
 
 
-size_t ZSTDv06_decompressBlock(ZSTDv06_DCtx* dctx,
+size_t ZSTDv07_decompressBlock(ZSTDv07_DCtx* dctx,
                             void* dst, size_t dstCapacity,
                       const void* src, size_t srcSize)
 {
-    ZSTDv06_checkContinuity(dctx, dst);
-    return ZSTDv06_decompressBlock_internal(dctx, dst, dstCapacity, src, srcSize);
+    size_t dSize;
+    ZSTDv07_checkContinuity(dctx, dst);
+    dSize = ZSTDv07_decompressBlock_internal(dctx, dst, dstCapacity, src, srcSize);
+    dctx->previousDstEnd = (char*)dst + dSize;
+    return dSize;
 }
 
 
-/*! ZSTDv06_decompressFrame() :
+/** ZSTDv07_insertBlock() :
+    Insert the block at `blockStart` into `dctx` history. Useful to track uncompressed blocks. */
+ZSTDLIB_API size_t ZSTDv07_insertBlock(ZSTDv07_DCtx* dctx, const void* blockStart, size_t blockSize)
+{
+    ZSTDv07_checkContinuity(dctx, blockStart);
+    dctx->previousDstEnd = (const char*)blockStart + blockSize;
+    return blockSize;
+}
+
+
+size_t ZSTDv07_generateNxBytes(void* dst, size_t dstCapacity, BYTE byte, size_t length)
+{
+    if (length > dstCapacity) return ERROR(dstSize_tooSmall);
+    memset(dst, byte, length);
+    return length;
+}
+
+
+/*! ZSTDv07_decompressFrame() :
 *   `dctx` must be properly initialized */
-static size_t ZSTDv06_decompressFrame(ZSTDv06_DCtx* dctx,
+static size_t ZSTDv07_decompressFrame(ZSTDv07_DCtx* dctx,
                                  void* dst, size_t dstCapacity,
                                  const void* src, size_t srcSize)
 {
     const BYTE* ip = (const BYTE*)src;
     const BYTE* const iend = ip + srcSize;
     BYTE* const ostart = (BYTE* const)dst;
-    BYTE* op = ostart;
     BYTE* const oend = ostart + dstCapacity;
+    BYTE* op = ostart;
     size_t remainingSize = srcSize;
-    blockProperties_t blockProperties = { bt_compressed, 0 };
 
     /* check */
-    if (srcSize < ZSTDv06_frameHeaderSize_min+ZSTDv06_blockHeaderSize) return ERROR(srcSize_wrong);
+    if (srcSize < ZSTDv07_frameHeaderSize_min+ZSTDv07_blockHeaderSize) return ERROR(srcSize_wrong);
 
     /* Frame Header */
-    {   size_t const frameHeaderSize = ZSTDv06_frameHeaderSize(src, ZSTDv06_frameHeaderSize_min);
-        if (ZSTDv06_isError(frameHeaderSize)) return frameHeaderSize;
-        if (srcSize < frameHeaderSize+ZSTDv06_blockHeaderSize) return ERROR(srcSize_wrong);
-        if (ZSTDv06_decodeFrameHeader(dctx, src, frameHeaderSize)) return ERROR(corruption_detected);
+    {   size_t const frameHeaderSize = ZSTDv07_frameHeaderSize(src, ZSTDv07_frameHeaderSize_min);
+        if (ZSTDv07_isError(frameHeaderSize)) return frameHeaderSize;
+        if (srcSize < frameHeaderSize+ZSTDv07_blockHeaderSize) return ERROR(srcSize_wrong);
+        if (ZSTDv07_decodeFrameHeader(dctx, src, frameHeaderSize)) return ERROR(corruption_detected);
         ip += frameHeaderSize; remainingSize -= frameHeaderSize;
     }
 
     /* Loop on each block */
     while (1) {
-        size_t decodedSize=0;
-        size_t const cBlockSize = ZSTDv06_getcBlockSize(ip, iend-ip, &blockProperties);
-        if (ZSTDv06_isError(cBlockSize)) return cBlockSize;
+        size_t decodedSize;
+        blockProperties_t blockProperties;
+        size_t const cBlockSize = ZSTDv07_getcBlockSize(ip, iend-ip, &blockProperties);
+        if (ZSTDv07_isError(cBlockSize)) return cBlockSize;
 
-        ip += ZSTDv06_blockHeaderSize;
-        remainingSize -= ZSTDv06_blockHeaderSize;
+        ip += ZSTDv07_blockHeaderSize;
+        remainingSize -= ZSTDv07_blockHeaderSize;
         if (cBlockSize > remainingSize) return ERROR(srcSize_wrong);
 
         switch(blockProperties.blockType)
         {
         case bt_compressed:
-            decodedSize = ZSTDv06_decompressBlock_internal(dctx, op, oend-op, ip, cBlockSize);
+            decodedSize = ZSTDv07_decompressBlock_internal(dctx, op, oend-op, ip, cBlockSize);
             break;
         case bt_raw :
-            decodedSize = ZSTDv06_copyRawBlock(op, oend-op, ip, cBlockSize);
+            decodedSize = ZSTDv07_copyRawBlock(op, oend-op, ip, cBlockSize);
             break;
         case bt_rle :
-            return ERROR(GENERIC);   /* not yet supported */
+            decodedSize = ZSTDv07_generateNxBytes(op, oend-op, *ip, blockProperties.origSize);
             break;
         case bt_end :
             /* end of frame */
             if (remainingSize) return ERROR(srcSize_wrong);
+            decodedSize = 0;
             break;
         default:
             return ERROR(GENERIC);   /* impossible */
         }
-        if (cBlockSize == 0) break;   /* bt_end */
+        if (blockProperties.blockType == bt_end) break;   /* bt_end */
 
-        if (ZSTDv06_isError(decodedSize)) return decodedSize;
+        if (ZSTDv07_isError(decodedSize)) return decodedSize;
+        if (dctx->fParams.checksumFlag) XXH64_update(&dctx->xxhState, op, decodedSize);
         op += decodedSize;
         ip += cBlockSize;
         remainingSize -= cBlockSize;
@@ -4106,45 +4321,50 @@ static size_t ZSTDv06_decompressFrame(ZSTDv06_DCtx* dctx,
 }
 
 
-size_t ZSTDv06_decompress_usingPreparedDCtx(ZSTDv06_DCtx* dctx, const ZSTDv06_DCtx* refDCtx,
+/*! ZSTDv07_decompress_usingPreparedDCtx() :
+*   Same as ZSTDv07_decompress_usingDict(), but uses a reference context `refDCtx` into which the dictionary has already been loaded.
+*   This avoids reloading the dictionary on each call.
+*   `refDCtx` must have been properly initialized using ZSTDv07_decompressBegin_usingDict().
+*   Requires 2 contexts : 1 for reference (`refDCtx`), which is not modified, and 1 to run the decompression operation (`dctx`). */
+size_t ZSTDv07_decompress_usingPreparedDCtx(ZSTDv07_DCtx* dctx, const ZSTDv07_DCtx* refDCtx,
                                          void* dst, size_t dstCapacity,
                                    const void* src, size_t srcSize)
 {
-    ZSTDv06_copyDCtx(dctx, refDCtx);
-    ZSTDv06_checkContinuity(dctx, dst);
-    return ZSTDv06_decompressFrame(dctx, dst, dstCapacity, src, srcSize);
+    ZSTDv07_copyDCtx(dctx, refDCtx);
+    ZSTDv07_checkContinuity(dctx, dst);
+    return ZSTDv07_decompressFrame(dctx, dst, dstCapacity, src, srcSize);
 }
 
 
-size_t ZSTDv06_decompress_usingDict(ZSTDv06_DCtx* dctx,
+size_t ZSTDv07_decompress_usingDict(ZSTDv07_DCtx* dctx,
                                  void* dst, size_t dstCapacity,
                                  const void* src, size_t srcSize,
                                  const void* dict, size_t dictSize)
 {
-    ZSTDv06_decompressBegin_usingDict(dctx, dict, dictSize);
-    ZSTDv06_checkContinuity(dctx, dst);
-    return ZSTDv06_decompressFrame(dctx, dst, dstCapacity, src, srcSize);
+    ZSTDv07_decompressBegin_usingDict(dctx, dict, dictSize);
+    ZSTDv07_checkContinuity(dctx, dst);
+    return ZSTDv07_decompressFrame(dctx, dst, dstCapacity, src, srcSize);
 }
 
 
-size_t ZSTDv06_decompressDCtx(ZSTDv06_DCtx* dctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize)
+size_t ZSTDv07_decompressDCtx(ZSTDv07_DCtx* dctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize)
 {
-    return ZSTDv06_decompress_usingDict(dctx, dst, dstCapacity, src, srcSize, NULL, 0);
+    return ZSTDv07_decompress_usingDict(dctx, dst, dstCapacity, src, srcSize, NULL, 0);
 }
 
 
-size_t ZSTDv06_decompress(void* dst, size_t dstCapacity, const void* src, size_t srcSize)
+size_t ZSTDv07_decompress(void* dst, size_t dstCapacity, const void* src, size_t srcSize)
 {
-#if defined(ZSTDv06_HEAPMODE) && (ZSTDv06_HEAPMODE==1)
+#if defined(ZSTDv07_HEAPMODE) && (ZSTDv07_HEAPMODE==1)
     size_t regenSize;
-    ZSTDv06_DCtx* dctx = ZSTDv06_createDCtx();
+    ZSTDv07_DCtx* const dctx = ZSTDv07_createDCtx();
     if (dctx==NULL) return ERROR(memory_allocation);
-    regenSize = ZSTDv06_decompressDCtx(dctx, dst, dstCapacity, src, srcSize);
-    ZSTDv06_freeDCtx(dctx);
+    regenSize = ZSTDv07_decompressDCtx(dctx, dst, dstCapacity, src, srcSize);
+    ZSTDv07_freeDCtx(dctx);
     return regenSize;
 #else   /* stack mode */
-    ZSTDv06_DCtx dctx;
-    return ZSTDv06_decompressDCtx(&dctx, dst, dstCapacity, src, srcSize);
+    ZSTDv07_DCtx dctx;
+    return ZSTDv07_decompressDCtx(&dctx, dst, dstCapacity, src, srcSize);
 #endif
 }
 
@@ -4152,27 +4372,40 @@ size_t ZSTDv06_decompress(void* dst, size_t dstCapacity, const void* src, size_t
 /*_******************************
 *  Streaming Decompression API
 ********************************/
-size_t ZSTDv06_nextSrcSizeToDecompress(ZSTDv06_DCtx* dctx)
+size_t ZSTDv07_nextSrcSizeToDecompress(ZSTDv07_DCtx* dctx)
 {
     return dctx->expected;
 }
 
-size_t ZSTDv06_decompressContinue(ZSTDv06_DCtx* dctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize)
+int ZSTDv07_isSkipFrame(ZSTDv07_DCtx* dctx)
+{
+    return dctx->stage == ZSTDds_skipFrame;
+}
+
+/** ZSTDv07_decompressContinue() :
+*   @return : nb of bytes generated into `dst` (necessarily <= `dstCapacity`)
+*             or an error code, which can be tested using ZSTDv07_isError() */
+size_t ZSTDv07_decompressContinue(ZSTDv07_DCtx* dctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize)
 {
     /* Sanity check */
     if (srcSize != dctx->expected) return ERROR(srcSize_wrong);
-    if (dstCapacity) ZSTDv06_checkContinuity(dctx, dst);
+    if (dstCapacity) ZSTDv07_checkContinuity(dctx, dst);
 
-    /* Decompress : frame header; part 1 */
     switch (dctx->stage)
     {
     case ZSTDds_getFrameHeaderSize :
-        if (srcSize != ZSTDv06_frameHeaderSize_min) return ERROR(srcSize_wrong);   /* impossible */
-        dctx->headerSize = ZSTDv06_frameHeaderSize(src, ZSTDv06_frameHeaderSize_min);
-        if (ZSTDv06_isError(dctx->headerSize)) return dctx->headerSize;
-        memcpy(dctx->headerBuffer, src, ZSTDv06_frameHeaderSize_min);
-        if (dctx->headerSize > ZSTDv06_frameHeaderSize_min) {
-            dctx->expected = dctx->headerSize - ZSTDv06_frameHeaderSize_min;
+        if (srcSize != ZSTDv07_frameHeaderSize_min) return ERROR(srcSize_wrong);   /* impossible */
+        if ((MEM_readLE32(src) & 0xFFFFFFF0U) == ZSTDv07_MAGIC_SKIPPABLE_START) {
+            memcpy(dctx->headerBuffer, src, ZSTDv07_frameHeaderSize_min);
+            dctx->expected = ZSTDv07_skippableHeaderSize - ZSTDv07_frameHeaderSize_min; /* magic number + skippable frame length */
+            dctx->stage = ZSTDds_decodeSkippableHeader;
+            return 0;
+        }
+        dctx->headerSize = ZSTDv07_frameHeaderSize(src, ZSTDv07_frameHeaderSize_min);
+        if (ZSTDv07_isError(dctx->headerSize)) return dctx->headerSize;
+        memcpy(dctx->headerBuffer, src, ZSTDv07_frameHeaderSize_min);
+        if (dctx->headerSize > ZSTDv07_frameHeaderSize_min) {
+            dctx->expected = dctx->headerSize - ZSTDv07_frameHeaderSize_min;
             dctx->stage = ZSTDds_decodeFrameHeader;
             return 0;
         }
@@ -4180,18 +4413,25 @@ size_t ZSTDv06_decompressContinue(ZSTDv06_DCtx* dctx, void* dst, size_t dstCapac
 
     case ZSTDds_decodeFrameHeader:
         {   size_t result;
-            memcpy(dctx->headerBuffer + ZSTDv06_frameHeaderSize_min, src, dctx->expected);
-            result = ZSTDv06_decodeFrameHeader(dctx, dctx->headerBuffer, dctx->headerSize);
-            if (ZSTDv06_isError(result)) return result;
-            dctx->expected = ZSTDv06_blockHeaderSize;
+            memcpy(dctx->headerBuffer + ZSTDv07_frameHeaderSize_min, src, dctx->expected);
+            result = ZSTDv07_decodeFrameHeader(dctx, dctx->headerBuffer, dctx->headerSize);
+            if (ZSTDv07_isError(result)) return result;
+            dctx->expected = ZSTDv07_blockHeaderSize;
             dctx->stage = ZSTDds_decodeBlockHeader;
             return 0;
         }
     case ZSTDds_decodeBlockHeader:
         {   blockProperties_t bp;
-            size_t const cBlockSize = ZSTDv06_getcBlockSize(src, ZSTDv06_blockHeaderSize, &bp);
-            if (ZSTDv06_isError(cBlockSize)) return cBlockSize;
+            size_t const cBlockSize = ZSTDv07_getcBlockSize(src, ZSTDv07_blockHeaderSize, &bp);
+            if (ZSTDv07_isError(cBlockSize)) return cBlockSize;
             if (bp.blockType == bt_end) {
+                if (dctx->fParams.checksumFlag) {
+                    U64 const h64 = XXH64_digest(&dctx->xxhState);
+                    U32 const h32 = (U32)(h64>>11) & ((1<<22)-1);
+                    const BYTE* const ip = (const BYTE*)src;
+                    U32 const check32 = ip[2] + (ip[1] << 8) + ((ip[0] & 0x3F) << 16);
+                    if (check32 != h32) return ERROR(checksum_wrong);
+                }
                 dctx->expected = 0;
                 dctx->stage = ZSTDds_getFrameHeaderSize;
             } else {
@@ -4206,10 +4446,10 @@ size_t ZSTDv06_decompressContinue(ZSTDv06_DCtx* dctx, void* dst, size_t dstCapac
             switch(dctx->bType)
             {
             case bt_compressed:
-                rSize = ZSTDv06_decompressBlock_internal(dctx, dst, dstCapacity, src, srcSize);
+                rSize = ZSTDv07_decompressBlock_internal(dctx, dst, dstCapacity, src, srcSize);
                 break;
             case bt_raw :
-                rSize = ZSTDv06_copyRawBlock(dst, dstCapacity, src, srcSize);
+                rSize = ZSTDv07_copyRawBlock(dst, dstCapacity, src, srcSize);
                 break;
             case bt_rle :
                 return ERROR(GENERIC);   /* not yet handled */
@@ -4221,102 +4461,194 @@ size_t ZSTDv06_decompressContinue(ZSTDv06_DCtx* dctx, void* dst, size_t dstCapac
                 return ERROR(GENERIC);   /* impossible */
             }
             dctx->stage = ZSTDds_decodeBlockHeader;
-            dctx->expected = ZSTDv06_blockHeaderSize;
+            dctx->expected = ZSTDv07_blockHeaderSize;
             dctx->previousDstEnd = (char*)dst + rSize;
+            if (ZSTDv07_isError(rSize)) return rSize;
+            if (dctx->fParams.checksumFlag) XXH64_update(&dctx->xxhState, dst, rSize);
             return rSize;
         }
+    case ZSTDds_decodeSkippableHeader:
+        {   memcpy(dctx->headerBuffer + ZSTDv07_frameHeaderSize_min, src, dctx->expected);
+            dctx->expected = MEM_readLE32(dctx->headerBuffer + 4);
+            dctx->stage = ZSTDds_skipFrame;
+            return 0;
+        }
+    case ZSTDds_skipFrame:
+        {   dctx->expected = 0;
+            dctx->stage = ZSTDds_getFrameHeaderSize;
+            return 0;
+        }
     default:
         return ERROR(GENERIC);   /* impossible */
     }
 }
 
 
-static void ZSTDv06_refDictContent(ZSTDv06_DCtx* dctx, const void* dict, size_t dictSize)
+static size_t ZSTDv07_refDictContent(ZSTDv07_DCtx* dctx, const void* dict, size_t dictSize)
 {
     dctx->dictEnd = dctx->previousDstEnd;
     dctx->vBase = (const char*)dict - ((const char*)(dctx->previousDstEnd) - (const char*)(dctx->base));
     dctx->base = dict;
     dctx->previousDstEnd = (const char*)dict + dictSize;
+    return 0;
 }
 
-static size_t ZSTDv06_loadEntropy(ZSTDv06_DCtx* dctx, const void* dict, size_t dictSize)
+static size_t ZSTDv07_loadEntropy(ZSTDv07_DCtx* dctx, const void* const dict, size_t const dictSize)
 {
-    size_t hSize, offcodeHeaderSize, matchlengthHeaderSize, litlengthHeaderSize;
+    const BYTE* dictPtr = (const BYTE*)dict;
+    const BYTE* const dictEnd = dictPtr + dictSize;
 
-    hSize = HUFv06_readDTableX4(dctx->hufTableX4, dict, dictSize);
-    if (HUFv06_isError(hSize)) return ERROR(dictionary_corrupted);
-    dict = (const char*)dict + hSize;
-    dictSize -= hSize;
+    {   size_t const hSize = HUFv07_readDTableX4(dctx->hufTable, dict, dictSize);
+        if (HUFv07_isError(hSize)) return ERROR(dictionary_corrupted);
+        dictPtr += hSize;
+    }
 
     {   short offcodeNCount[MaxOff+1];
         U32 offcodeMaxValue=MaxOff, offcodeLog=OffFSELog;
-        offcodeHeaderSize = FSEv06_readNCount(offcodeNCount, &offcodeMaxValue, &offcodeLog, dict, dictSize);
-        if (FSEv06_isError(offcodeHeaderSize)) return ERROR(dictionary_corrupted);
-        { size_t const errorCode = FSEv06_buildDTable(dctx->OffTable, offcodeNCount, offcodeMaxValue, offcodeLog);
-          if (FSEv06_isError(errorCode)) return ERROR(dictionary_corrupted); }
-        dict = (const char*)dict + offcodeHeaderSize;
-        dictSize -= offcodeHeaderSize;
+        size_t const offcodeHeaderSize = FSEv07_readNCount(offcodeNCount, &offcodeMaxValue, &offcodeLog, dictPtr, dictEnd-dictPtr);
+        if (FSEv07_isError(offcodeHeaderSize)) return ERROR(dictionary_corrupted);
+        { size_t const errorCode = FSEv07_buildDTable(dctx->OffTable, offcodeNCount, offcodeMaxValue, offcodeLog);
+          if (FSEv07_isError(errorCode)) return ERROR(dictionary_corrupted); }
+        dictPtr += offcodeHeaderSize;
     }
 
     {   short matchlengthNCount[MaxML+1];
         unsigned matchlengthMaxValue = MaxML, matchlengthLog = MLFSELog;
-        matchlengthHeaderSize = FSEv06_readNCount(matchlengthNCount, &matchlengthMaxValue, &matchlengthLog, dict, dictSize);
-        if (FSEv06_isError(matchlengthHeaderSize)) return ERROR(dictionary_corrupted);
-        { size_t const errorCode = FSEv06_buildDTable(dctx->MLTable, matchlengthNCount, matchlengthMaxValue, matchlengthLog);
-          if (FSEv06_isError(errorCode)) return ERROR(dictionary_corrupted); }
-        dict = (const char*)dict + matchlengthHeaderSize;
-        dictSize -= matchlengthHeaderSize;
+        size_t const matchlengthHeaderSize = FSEv07_readNCount(matchlengthNCount, &matchlengthMaxValue, &matchlengthLog, dictPtr, dictEnd-dictPtr);
+        if (FSEv07_isError(matchlengthHeaderSize)) return ERROR(dictionary_corrupted);
+        { size_t const errorCode = FSEv07_buildDTable(dctx->MLTable, matchlengthNCount, matchlengthMaxValue, matchlengthLog);
+          if (FSEv07_isError(errorCode)) return ERROR(dictionary_corrupted); }
+        dictPtr += matchlengthHeaderSize;
     }
 
     {   short litlengthNCount[MaxLL+1];
         unsigned litlengthMaxValue = MaxLL, litlengthLog = LLFSELog;
-        litlengthHeaderSize = FSEv06_readNCount(litlengthNCount, &litlengthMaxValue, &litlengthLog, dict, dictSize);
-        if (FSEv06_isError(litlengthHeaderSize)) return ERROR(dictionary_corrupted);
-        { size_t const errorCode = FSEv06_buildDTable(dctx->LLTable, litlengthNCount, litlengthMaxValue, litlengthLog);
-          if (FSEv06_isError(errorCode)) return ERROR(dictionary_corrupted); }
+        size_t const litlengthHeaderSize = FSEv07_readNCount(litlengthNCount, &litlengthMaxValue, &litlengthLog, dictPtr, dictEnd-dictPtr);
+        if (FSEv07_isError(litlengthHeaderSize)) return ERROR(dictionary_corrupted);
+        { size_t const errorCode = FSEv07_buildDTable(dctx->LLTable, litlengthNCount, litlengthMaxValue, litlengthLog);
+          if (FSEv07_isError(errorCode)) return ERROR(dictionary_corrupted); }
+        dictPtr += litlengthHeaderSize;
     }
 
-    dctx->flagRepeatTable = 1;
-    return hSize + offcodeHeaderSize + matchlengthHeaderSize + litlengthHeaderSize;
+    if (dictPtr+12 > dictEnd) return ERROR(dictionary_corrupted);
+    dctx->rep[0] = MEM_readLE32(dictPtr+0); if (dctx->rep[0] >= dictSize) return ERROR(dictionary_corrupted);
+    dctx->rep[1] = MEM_readLE32(dictPtr+4); if (dctx->rep[1] >= dictSize) return ERROR(dictionary_corrupted);
+    dctx->rep[2] = MEM_readLE32(dictPtr+8); if (dctx->rep[2] >= dictSize) return ERROR(dictionary_corrupted);
+    dictPtr += 12;
+
+    dctx->litEntropy = dctx->fseEntropy = 1;
+    return dictPtr - (const BYTE*)dict;
 }
 
-static size_t ZSTDv06_decompress_insertDictionary(ZSTDv06_DCtx* dctx, const void* dict, size_t dictSize)
+static size_t ZSTDv07_decompress_insertDictionary(ZSTDv07_DCtx* dctx, const void* dict, size_t dictSize)
 {
-    size_t eSize;
-    U32 const magic = MEM_readLE32(dict);
-    if (magic != ZSTDv06_DICT_MAGIC) {
-        /* pure content mode */
-        ZSTDv06_refDictContent(dctx, dict, dictSize);
-        return 0;
-    }
+    if (dictSize < 8) return ZSTDv07_refDictContent(dctx, dict, dictSize);
+    {   U32 const magic = MEM_readLE32(dict);
+        if (magic != ZSTDv07_DICT_MAGIC) {
+            return ZSTDv07_refDictContent(dctx, dict, dictSize);   /* pure content mode */
+    }   }
+    dctx->dictID = MEM_readLE32((const char*)dict + 4);
+
     /* load entropy tables */
-    dict = (const char*)dict + 4;
-    dictSize -= 4;
-    eSize = ZSTDv06_loadEntropy(dctx, dict, dictSize);
-    if (ZSTDv06_isError(eSize)) return ERROR(dictionary_corrupted);
+    dict = (const char*)dict + 8;
+    dictSize -= 8;
+    {   size_t const eSize = ZSTDv07_loadEntropy(dctx, dict, dictSize);
+        if (ZSTDv07_isError(eSize)) return ERROR(dictionary_corrupted);
+        dict = (const char*)dict + eSize;
+        dictSize -= eSize;
+    }
 
     /* reference dictionary content */
-    dict = (const char*)dict + eSize;
-    dictSize -= eSize;
-    ZSTDv06_refDictContent(dctx, dict, dictSize);
+    return ZSTDv07_refDictContent(dctx, dict, dictSize);
+}
+
+
+size_t ZSTDv07_decompressBegin_usingDict(ZSTDv07_DCtx* dctx, const void* dict, size_t dictSize)
+{
+    { size_t const errorCode = ZSTDv07_decompressBegin(dctx);
+      if (ZSTDv07_isError(errorCode)) return errorCode; }
+
+    if (dict && dictSize) {
+        size_t const errorCode = ZSTDv07_decompress_insertDictionary(dctx, dict, dictSize);
+        if (ZSTDv07_isError(errorCode)) return ERROR(dictionary_corrupted);
+    }
 
     return 0;
 }
 
 
-size_t ZSTDv06_decompressBegin_usingDict(ZSTDv06_DCtx* dctx, const void* dict, size_t dictSize)
+struct ZSTDv07_DDict_s {
+    void* dict;
+    size_t dictSize;
+    ZSTDv07_DCtx* refContext;
+};  /* typedef'd to ZSTDv07_DDict within zstd.h */
+
+ZSTDv07_DDict* ZSTDv07_createDDict_advanced(const void* dict, size_t dictSize, ZSTDv07_customMem customMem)
 {
-    { size_t const errorCode = ZSTDv06_decompressBegin(dctx);
-      if (ZSTDv06_isError(errorCode)) return errorCode; }
+    if (!customMem.customAlloc && !customMem.customFree)
+        customMem = defaultCustomMem;
 
-    if (dict && dictSize) {
-        size_t const errorCode = ZSTDv06_decompress_insertDictionary(dctx, dict, dictSize);
-        if (ZSTDv06_isError(errorCode)) return ERROR(dictionary_corrupted);
+    if (!customMem.customAlloc || !customMem.customFree)
+        return NULL;
+
+    {   ZSTDv07_DDict* const ddict = (ZSTDv07_DDict*) customMem.customAlloc(customMem.opaque, sizeof(*ddict));
+        void* const dictContent = customMem.customAlloc(customMem.opaque, dictSize);
+        ZSTDv07_DCtx* const dctx = ZSTDv07_createDCtx_advanced(customMem);
+
+        if (!dictContent || !ddict || !dctx) {
+            customMem.customFree(customMem.opaque, dictContent);
+            customMem.customFree(customMem.opaque, ddict);
+            customMem.customFree(customMem.opaque, dctx);
+            return NULL;
+        }
+
+        memcpy(dictContent, dict, dictSize);
+        {   size_t const errorCode = ZSTDv07_decompressBegin_usingDict(dctx, dictContent, dictSize);
+            if (ZSTDv07_isError(errorCode)) {
+                customMem.customFree(customMem.opaque, dictContent);
+                customMem.customFree(customMem.opaque, ddict);
+                customMem.customFree(customMem.opaque, dctx);
+                return NULL;
+        }   }
+
+        ddict->dict = dictContent;
+        ddict->dictSize = dictSize;
+        ddict->refContext = dctx;
+        return ddict;
     }
+}
+
+/*! ZSTDv07_createDDict() :
+*   Create a digested dictionary, ready to start decompression without startup delay.
+*   `dict` can be released after `ZSTDv07_DDict` creation */
+ZSTDv07_DDict* ZSTDv07_createDDict(const void* dict, size_t dictSize)
+{
+    ZSTDv07_customMem const allocator = { NULL, NULL, NULL };
+    return ZSTDv07_createDDict_advanced(dict, dictSize, allocator);
+}
 
+size_t ZSTDv07_freeDDict(ZSTDv07_DDict* ddict)
+{
+    ZSTDv07_freeFunction const cFree = ddict->refContext->customMem.customFree;
+    void* const opaque = ddict->refContext->customMem.opaque;
+    ZSTDv07_freeDCtx(ddict->refContext);
+    cFree(opaque, ddict->dict);
+    cFree(opaque, ddict);
     return 0;
 }
 
+/*! ZSTDv07_decompress_usingDDict() :
+*   Decompression using a pre-digested Dictionary
+*   Use dictionary without significant overhead. */
+ZSTDLIB_API size_t ZSTDv07_decompress_usingDDict(ZSTDv07_DCtx* dctx,
+                                           void* dst, size_t dstCapacity,
+                                     const void* src, size_t srcSize,
+                                     const ZSTDv07_DDict* ddict)
+{
+    return ZSTDv07_decompress_usingPreparedDCtx(dctx, ddict->refContext,
+                                           dst, dstCapacity,
+                                           src, srcSize);
+}
 /*
     Buffered version of Zstd compression library
     Copyright (C) 2015-2016, Yann Collet.
@@ -4349,38 +4681,39 @@ size_t ZSTDv06_decompressBegin_usingDict(ZSTDv06_DCtx* dctx, const void* dict, s
 */
 
 
+
 /*-***************************************************************************
 *  Streaming decompression howto
 *
-*  A ZBUFFv06_DCtx object is required to track streaming operations.
-*  Use ZBUFFv06_createDCtx() and ZBUFFv06_freeDCtx() to create/release resources.
-*  Use ZBUFFv06_decompressInit() to start a new decompression operation,
-*   or ZBUFFv06_decompressInitDictionary() if decompression requires a dictionary.
-*  Note that ZBUFFv06_DCtx objects can be re-init multiple times.
+*  A ZBUFFv07_DCtx object is required to track streaming operations.
+*  Use ZBUFFv07_createDCtx() and ZBUFFv07_freeDCtx() to create/release resources.
+*  Use ZBUFFv07_decompressInit() to start a new decompression operation,
+*   or ZBUFFv07_decompressInitDictionary() if decompression requires a dictionary.
+*  Note that ZBUFFv07_DCtx objects can be re-init multiple times.
 *
-*  Use ZBUFFv06_decompressContinue() repetitively to consume your input.
+*  Use ZBUFFv07_decompressContinue() repeatedly to consume your input.
 *  *srcSizePtr and *dstCapacityPtr can be any size.
 *  The function will report how many bytes were read or written by modifying *srcSizePtr and *dstCapacityPtr.
 *  Note that it may not consume the entire input, in which case it's up to the caller to present remaining input again.
 *  The content of @dst will be overwritten (up to *dstCapacityPtr) at each function call, so save its content if it matters, or change @dst.
 *  @return : a hint to preferred nb of bytes to use as input for next function call (it's only a hint, to help latency),
 *            or 0 when a frame is completely decoded,
-*            or an error code, which can be tested using ZBUFFv06_isError().
+*            or an error code, which can be tested using ZBUFFv07_isError().
 *
-*  Hint : recommended buffer sizes (not compulsory) : ZBUFFv06_recommendedDInSize() and ZBUFFv06_recommendedDOutSize()
-*  output : ZBUFFv06_recommendedDOutSize==128 KB block size is the internal unit, it ensures it's always possible to write a full block when decoded.
-*  input  : ZBUFFv06_recommendedDInSize == 128KB + 3;
-*           just follow indications from ZBUFFv06_decompressContinue() to minimize latency. It should always be <= 128 KB + 3 .
+*  Hint : recommended buffer sizes (not compulsory) : ZBUFFv07_recommendedDInSize() and ZBUFFv07_recommendedDOutSize()
+*  output : ZBUFFv07_recommendedDOutSize==128 KB block size is the internal unit, it ensures it's always possible to write a full block when decoded.
+*  input  : ZBUFFv07_recommendedDInSize == 128KB + 3;
+*           just follow indications from ZBUFFv07_decompressContinue() to minimize latency. It should always be <= 128 KB + 3 .
 * *******************************************************************************/
 
 typedef enum { ZBUFFds_init, ZBUFFds_loadHeader,
-               ZBUFFds_read, ZBUFFds_load, ZBUFFds_flush } ZBUFFv06_dStage;
+               ZBUFFds_read, ZBUFFds_load, ZBUFFds_flush } ZBUFFv07_dStage;
 
 /* *** Resource management *** */
-struct ZBUFFv06_DCtx_s {
-    ZSTDv06_DCtx* zd;
-    ZSTDv06_frameParams fParams;
-    ZBUFFv06_dStage stage;
+struct ZBUFFv07_DCtx_s {
+    ZSTDv07_DCtx* zd;
+    ZSTDv07_frameParams fParams;
+    ZBUFFv07_dStage stage;
     char*  inBuff;
     size_t inBuffSize;
     size_t inPos;
@@ -4389,51 +4722,68 @@ struct ZBUFFv06_DCtx_s {
     size_t outStart;
     size_t outEnd;
     size_t blockSize;
-    BYTE headerBuffer[ZSTDv06_FRAMEHEADERSIZE_MAX];
+    BYTE headerBuffer[ZSTDv07_FRAMEHEADERSIZE_MAX];
     size_t lhSize;
-};   /* typedef'd to ZBUFFv06_DCtx within "zstd_buffered.h" */
+    ZSTDv07_customMem customMem;
+};   /* typedef'd to ZBUFFv07_DCtx within "zstd_buffered.h" */
+
+ZSTDLIB_API ZBUFFv07_DCtx* ZBUFFv07_createDCtx_advanced(ZSTDv07_customMem customMem);
 
+ZBUFFv07_DCtx* ZBUFFv07_createDCtx(void)
+{
+    return ZBUFFv07_createDCtx_advanced(defaultCustomMem);
+}
 
-ZBUFFv06_DCtx* ZBUFFv06_createDCtx(void)
+ZBUFFv07_DCtx* ZBUFFv07_createDCtx_advanced(ZSTDv07_customMem customMem)
 {
-    ZBUFFv06_DCtx* zbd = (ZBUFFv06_DCtx*)malloc(sizeof(ZBUFFv06_DCtx));
+    ZBUFFv07_DCtx* zbd;
+
+    if (!customMem.customAlloc && !customMem.customFree)
+        customMem = defaultCustomMem;
+
+    if (!customMem.customAlloc || !customMem.customFree)
+        return NULL;
+
+    zbd = (ZBUFFv07_DCtx*)customMem.customAlloc(customMem.opaque, sizeof(ZBUFFv07_DCtx));
     if (zbd==NULL) return NULL;
-    memset(zbd, 0, sizeof(*zbd));
-    zbd->zd = ZSTDv06_createDCtx();
+    memset(zbd, 0, sizeof(ZBUFFv07_DCtx));
+    memcpy(&zbd->customMem, &customMem, sizeof(ZSTDv07_customMem));
+    zbd->zd = ZSTDv07_createDCtx_advanced(customMem);
+    if (zbd->zd == NULL) { ZBUFFv07_freeDCtx(zbd); return NULL; }
     zbd->stage = ZBUFFds_init;
     return zbd;
 }
 
-size_t ZBUFFv06_freeDCtx(ZBUFFv06_DCtx* zbd)
+size_t ZBUFFv07_freeDCtx(ZBUFFv07_DCtx* zbd)
 {
     if (zbd==NULL) return 0;   /* support free on null */
-    ZSTDv06_freeDCtx(zbd->zd);
-    free(zbd->inBuff);
-    free(zbd->outBuff);
-    free(zbd);
+    ZSTDv07_freeDCtx(zbd->zd);
+    if (zbd->inBuff) zbd->customMem.customFree(zbd->customMem.opaque, zbd->inBuff);
+    if (zbd->outBuff) zbd->customMem.customFree(zbd->customMem.opaque, zbd->outBuff);
+    zbd->customMem.customFree(zbd->customMem.opaque, zbd);
     return 0;
 }
 
 
 /* *** Initialization *** */
 
-size_t ZBUFFv06_decompressInitDictionary(ZBUFFv06_DCtx* zbd, const void* dict, size_t dictSize)
+size_t ZBUFFv07_decompressInitDictionary(ZBUFFv07_DCtx* zbd, const void* dict, size_t dictSize)
 {
     zbd->stage = ZBUFFds_loadHeader;
     zbd->lhSize = zbd->inPos = zbd->outStart = zbd->outEnd = 0;
-    return ZSTDv06_decompressBegin_usingDict(zbd->zd, dict, dictSize);
+    return ZSTDv07_decompressBegin_usingDict(zbd->zd, dict, dictSize);
 }
 
-size_t ZBUFFv06_decompressInit(ZBUFFv06_DCtx* zbd)
+size_t ZBUFFv07_decompressInit(ZBUFFv07_DCtx* zbd)
 {
-    return ZBUFFv06_decompressInitDictionary(zbd, NULL, 0);
+    return ZBUFFv07_decompressInitDictionary(zbd, NULL, 0);
 }
 
 
-
-MEM_STATIC size_t ZBUFFv06_limitCopy(void* dst, size_t dstCapacity, const void* src, size_t srcSize)
+/* internal util function */
+MEM_STATIC size_t ZBUFFv07_limitCopy(void* dst, size_t dstCapacity, const void* src, size_t srcSize)
 {
-    size_t length = MIN(dstCapacity, srcSize);
+    size_t const length = MIN(dstCapacity, srcSize);
     memcpy(dst, src, length);
     return length;
 }
@@ -4441,7 +4791,7 @@ MEM_STATIC size_t ZBUFFv06_limitCopy(void* dst, size_t dstCapacity, const void*
 
 /* *** Decompression *** */
 
-size_t ZBUFFv06_decompressContinue(ZBUFFv06_DCtx* zbd,
+size_t ZBUFFv07_decompressContinue(ZBUFFv07_DCtx* zbd,
                                 void* dst, size_t* dstCapacityPtr,
                           const void* src, size_t* srcSizePtr)
 {
@@ -4460,62 +4810,65 @@ size_t ZBUFFv06_decompressContinue(ZBUFFv06_DCtx* zbd,
             return ERROR(init_missing);
 
         case ZBUFFds_loadHeader :
-            {   size_t const hSize = ZSTDv06_getFrameParams(&(zbd->fParams), zbd->headerBuffer, zbd->lhSize);
+            {   size_t const hSize = ZSTDv07_getFrameParams(&(zbd->fParams), zbd->headerBuffer, zbd->lhSize);
                 if (hSize != 0) {
                     size_t const toLoad = hSize - zbd->lhSize;   /* if hSize!=0, hSize > zbd->lhSize */
-                    if (ZSTDv06_isError(hSize)) return hSize;
+                    if (ZSTDv07_isError(hSize)) return hSize;
                     if (toLoad > (size_t)(iend-ip)) {   /* not enough input to load full header */
                         memcpy(zbd->headerBuffer + zbd->lhSize, ip, iend-ip);
-                        zbd->lhSize += iend-ip; ip = iend; notDone = 0;
+                        zbd->lhSize += iend-ip;
                         *dstCapacityPtr = 0;
-                        return (hSize - zbd->lhSize) + ZSTDv06_blockHeaderSize;   /* remaining header bytes + next block header */
+                        return (hSize - zbd->lhSize) + ZSTDv07_blockHeaderSize;   /* remaining header bytes + next block header */
                     }
                     memcpy(zbd->headerBuffer + zbd->lhSize, ip, toLoad); zbd->lhSize = hSize; ip += toLoad;
                     break;
             }   }
 
             /* Consume header */
-            {   size_t const h1Size = ZSTDv06_nextSrcSizeToDecompress(zbd->zd);  /* == ZSTDv06_frameHeaderSize_min */
-                size_t const h1Result = ZSTDv06_decompressContinue(zbd->zd, NULL, 0, zbd->headerBuffer, h1Size);
-                if (ZSTDv06_isError(h1Result)) return h1Result;
+            {   size_t const h1Size = ZSTDv07_nextSrcSizeToDecompress(zbd->zd);  /* == ZSTDv07_frameHeaderSize_min */
+                size_t const h1Result = ZSTDv07_decompressContinue(zbd->zd, NULL, 0, zbd->headerBuffer, h1Size);
+                if (ZSTDv07_isError(h1Result)) return h1Result;
                 if (h1Size < zbd->lhSize) {   /* long header */
-                    size_t const h2Size = ZSTDv06_nextSrcSizeToDecompress(zbd->zd);
-                    size_t const h2Result = ZSTDv06_decompressContinue(zbd->zd, NULL, 0, zbd->headerBuffer+h1Size, h2Size);
-                    if (ZSTDv06_isError(h2Result)) return h2Result;
+                    size_t const h2Size = ZSTDv07_nextSrcSizeToDecompress(zbd->zd);
+                    size_t const h2Result = ZSTDv07_decompressContinue(zbd->zd, NULL, 0, zbd->headerBuffer+h1Size, h2Size);
+                    if (ZSTDv07_isError(h2Result)) return h2Result;
             }   }
 
+            zbd->fParams.windowSize = MAX(zbd->fParams.windowSize, 1U << ZSTDv07_WINDOWLOG_ABSOLUTEMIN);
+
             /* Frame header instruct buffer sizes */
-            {   size_t const blockSize = MIN(1 << zbd->fParams.windowLog, ZSTDv06_BLOCKSIZE_MAX);
+            {   size_t const blockSize = MIN(zbd->fParams.windowSize, ZSTDv07_BLOCKSIZE_ABSOLUTEMAX);
                 zbd->blockSize = blockSize;
                 if (zbd->inBuffSize < blockSize) {
-                    free(zbd->inBuff);
+                    zbd->customMem.customFree(zbd->customMem.opaque, zbd->inBuff);
                     zbd->inBuffSize = blockSize;
-                    zbd->inBuff = (char*)malloc(blockSize);
+                    zbd->inBuff = (char*)zbd->customMem.customAlloc(zbd->customMem.opaque, blockSize);
                     if (zbd->inBuff == NULL) return ERROR(memory_allocation);
                 }
-                {   size_t const neededOutSize = ((size_t)1 << zbd->fParams.windowLog) + blockSize;
+                {   size_t const neededOutSize = zbd->fParams.windowSize + blockSize;
                     if (zbd->outBuffSize < neededOutSize) {
-                        free(zbd->outBuff);
+                        zbd->customMem.customFree(zbd->customMem.opaque, zbd->outBuff);
                         zbd->outBuffSize = neededOutSize;
-                        zbd->outBuff = (char*)malloc(neededOutSize);
+                        zbd->outBuff = (char*)zbd->customMem.customAlloc(zbd->customMem.opaque, neededOutSize);
                         if (zbd->outBuff == NULL) return ERROR(memory_allocation);
             }   }   }
             zbd->stage = ZBUFFds_read;
 
         case ZBUFFds_read:
-            {   size_t const neededInSize = ZSTDv06_nextSrcSizeToDecompress(zbd->zd);
+            {   size_t const neededInSize = ZSTDv07_nextSrcSizeToDecompress(zbd->zd);
                 if (neededInSize==0) {  /* end of frame */
                     zbd->stage = ZBUFFds_init;
                     notDone = 0;
                     break;
                 }
                 if ((size_t)(iend-ip) >= neededInSize) {  /* decode directly from src */
-                    size_t const decodedSize = ZSTDv06_decompressContinue(zbd->zd,
-                        zbd->outBuff + zbd->outStart, zbd->outBuffSize - zbd->outStart,
+                    const int isSkipFrame = ZSTDv07_isSkipFrame(zbd->zd);
+                    size_t const decodedSize = ZSTDv07_decompressContinue(zbd->zd,
+                        zbd->outBuff + zbd->outStart, (isSkipFrame ? 0 : zbd->outBuffSize - zbd->outStart),
                         ip, neededInSize);
-                    if (ZSTDv06_isError(decodedSize)) return decodedSize;
+                    if (ZSTDv07_isError(decodedSize)) return decodedSize;
                     ip += neededInSize;
-                    if (!decodedSize) break;   /* this was just a header */
+                    if (!decodedSize && !isSkipFrame) break;   /* this was just a header */
                     zbd->outEnd = zbd->outStart +  decodedSize;
                     zbd->stage = ZBUFFds_flush;
                     break;
@@ -4525,22 +4878,23 @@ size_t ZBUFFv06_decompressContinue(ZBUFFv06_DCtx* zbd,
             }
 
         case ZBUFFds_load:
-            {   size_t const neededInSize = ZSTDv06_nextSrcSizeToDecompress(zbd->zd);
+            {   size_t const neededInSize = ZSTDv07_nextSrcSizeToDecompress(zbd->zd);
                 size_t const toLoad = neededInSize - zbd->inPos;   /* should always be <= remaining space within inBuff */
                 size_t loadedSize;
                 if (toLoad > zbd->inBuffSize - zbd->inPos) return ERROR(corruption_detected);   /* should never happen */
-                loadedSize = ZBUFFv06_limitCopy(zbd->inBuff + zbd->inPos, toLoad, ip, iend-ip);
+                loadedSize = ZBUFFv07_limitCopy(zbd->inBuff + zbd->inPos, toLoad, ip, iend-ip);
                 ip += loadedSize;
                 zbd->inPos += loadedSize;
                 if (loadedSize < toLoad) { notDone = 0; break; }   /* not enough input, wait for more */
 
                 /* decode loaded input */
-                {   size_t const decodedSize = ZSTDv06_decompressContinue(zbd->zd,
+                {  const int isSkipFrame = ZSTDv07_isSkipFrame(zbd->zd);
+                   size_t const decodedSize = ZSTDv07_decompressContinue(zbd->zd,
                         zbd->outBuff + zbd->outStart, zbd->outBuffSize - zbd->outStart,
                         zbd->inBuff, neededInSize);
-                    if (ZSTDv06_isError(decodedSize)) return decodedSize;
+                    if (ZSTDv07_isError(decodedSize)) return decodedSize;
                     zbd->inPos = 0;   /* input is consumed */
-                    if (!decodedSize) { zbd->stage = ZBUFFds_read; break; }   /* this was just a header */
+                    if (!decodedSize && !isSkipFrame) { zbd->stage = ZBUFFds_read; break; }   /* this was just a header */
                     zbd->outEnd = zbd->outStart +  decodedSize;
                     zbd->stage = ZBUFFds_flush;
                     // break; /* ZBUFFds_flush follows */
@@ -4548,7 +4902,7 @@ size_t ZBUFFv06_decompressContinue(ZBUFFv06_DCtx* zbd,
 
         case ZBUFFds_flush:
             {   size_t const toFlushSize = zbd->outEnd - zbd->outStart;
-                size_t const flushedSize = ZBUFFv06_limitCopy(op, oend-op, zbd->outBuff + zbd->outStart, toFlushSize);
+                size_t const flushedSize = ZBUFFv07_limitCopy(op, oend-op, zbd->outBuff + zbd->outStart, toFlushSize);
                 op += flushedSize;
                 zbd->outStart += flushedSize;
                 if (flushedSize == toFlushSize) {
@@ -4567,8 +4921,7 @@ size_t ZBUFFv06_decompressContinue(ZBUFFv06_DCtx* zbd,
     /* result */
     *srcSizePtr = ip-istart;
     *dstCapacityPtr = op-ostart;
-    {   size_t nextSrcSizeHint = ZSTDv06_nextSrcSizeToDecompress(zbd->zd);
-        if (nextSrcSizeHint > ZSTDv06_blockHeaderSize) nextSrcSizeHint+= ZSTDv06_blockHeaderSize;   /* get following block header too */
+    {   size_t nextSrcSizeHint = ZSTDv07_nextSrcSizeToDecompress(zbd->zd);
         nextSrcSizeHint -= zbd->inPos;   /* already loaded*/
         return nextSrcSizeHint;
     }
@@ -4579,5 +4932,5 @@ size_t ZBUFFv06_decompressContinue(ZBUFFv06_DCtx* zbd,
 /* *************************************
 *  Tool functions
 ***************************************/
-size_t ZBUFFv06_recommendedDInSize(void)  { return ZSTDv06_BLOCKSIZE_MAX + ZSTDv06_blockHeaderSize /* block header size*/ ; }
-size_t ZBUFFv06_recommendedDOutSize(void) { return ZSTDv06_BLOCKSIZE_MAX; }
+size_t ZBUFFv07_recommendedDInSize(void)  { return ZSTDv07_BLOCKSIZE_ABSOLUTEMAX + ZSTDv07_blockHeaderSize /* block header size*/ ; }
+size_t ZBUFFv07_recommendedDOutSize(void) { return ZSTDv07_BLOCKSIZE_ABSOLUTEMAX; }
diff --git a/lib/legacy/zstd_v06.h b/lib/legacy/zstd_v07.h
similarity index 52%
copy from lib/legacy/zstd_v06.h
copy to lib/legacy/zstd_v07.h
index 55619be..162566c 100644
--- a/lib/legacy/zstd_v06.h
+++ b/lib/legacy/zstd_v07.h
@@ -1,5 +1,5 @@
 /*
-    zstd_v06 - decoder for 0.6 format
+    zstd_v07 - decoder for 0.7 format
     Header File
     Copyright (C) 2014-2016, Yann Collet.
 
@@ -29,157 +29,168 @@
     You can contact the author at :
     - zstd source repository : https://github.com/Cyan4973/zstd
 */
-#ifndef ZSTDv06_H
-#define ZSTDv06_H
+#ifndef ZSTDv07_H_235446
+#define ZSTDv07_H_235446
 
 #if defined (__cplusplus)
 extern "C" {
 #endif
 
-/*-*************************************
-*  Dependencies
-***************************************/
+/*======  Dependency  ======*/
 #include <stddef.h>   /* size_t */
 
 
-/*-***************************************************************
-*  Export parameters
-*****************************************************************/
+/*======  Export for Windows  ======*/
 /*!
-*  ZSTDv06_DLL_EXPORT :
+*  ZSTDv07_DLL_EXPORT :
 *  Enable exporting of functions when building a Windows DLL
 */
-#if defined(_WIN32) && defined(ZSTDv06_DLL_EXPORT) && (ZSTDv06_DLL_EXPORT==1)
+#if defined(_WIN32) && defined(ZSTDv07_DLL_EXPORT) && (ZSTDv07_DLL_EXPORT==1)
 #  define ZSTDLIB_API __declspec(dllexport)
 #else
 #  define ZSTDLIB_API
 #endif
 
 
+
 /* *************************************
-*  Simple functions
+*  Simple API
 ***************************************/
-/*! ZSTDv06_decompress() :
-    `compressedSize` : is the _exact_ size of the compressed blob, otherwise decompression will fail.
-    `dstCapacity` must be large enough, equal or larger than originalSize.
+/*! ZSTDv07_getDecompressedSize() :
+*   @return : decompressed size if known, 0 otherwise.
+       note 1 : if `0`, follow up with ZSTDv07_getFrameParams() to know precise failure cause.
+       note 2 : decompressed size could be wrong or intentionally modified !
+                always ensure results fit within application's authorized limits */
+unsigned long long ZSTDv07_getDecompressedSize(const void* src, size_t srcSize);
+
+/*! ZSTDv07_decompress() :
+    `compressedSize` : must be _exact_ size of compressed input, otherwise decompression will fail.
+    `dstCapacity` must be equal to or larger than originalSize.
     @return : the number of bytes decompressed into `dst` (<= `dstCapacity`),
-              or an errorCode if it fails (which can be tested using ZSTDv06_isError()) */
-ZSTDLIB_API size_t ZSTDv06_decompress( void* dst, size_t dstCapacity,
+              or an errorCode if it fails (which can be tested using ZSTDv07_isError()) */
+ZSTDLIB_API size_t ZSTDv07_decompress( void* dst, size_t dstCapacity,
                               const void* src, size_t compressedSize);
 
+/*======  Helper functions  ======*/
+ZSTDLIB_API unsigned    ZSTDv07_isError(size_t code);          /*!< tells if a `size_t` function result is an error code */
+ZSTDLIB_API const char* ZSTDv07_getErrorName(size_t code);     /*!< provides readable string from an error code */
 
-/* *************************************
-*  Helper functions
-***************************************/
-ZSTDLIB_API size_t      ZSTDv06_compressBound(size_t srcSize); /*!< maximum compressed size (worst case scenario) */
-
-/* Error Management */
-ZSTDLIB_API unsigned    ZSTDv06_isError(size_t code);          /*!< tells if a `size_t` function result is an error code */
-ZSTDLIB_API const char* ZSTDv06_getErrorName(size_t code);     /*!< provides readable string for an error code */
 
-
-/* *************************************
+/*-*************************************
 *  Explicit memory management
 ***************************************/
 /** Decompression context */
-typedef struct ZSTDv06_DCtx_s ZSTDv06_DCtx;
-ZSTDLIB_API ZSTDv06_DCtx* ZSTDv06_createDCtx(void);
-ZSTDLIB_API size_t     ZSTDv06_freeDCtx(ZSTDv06_DCtx* dctx);      /*!< @return : errorCode */
+typedef struct ZSTDv07_DCtx_s ZSTDv07_DCtx;
+ZSTDLIB_API ZSTDv07_DCtx* ZSTDv07_createDCtx(void);
+ZSTDLIB_API size_t     ZSTDv07_freeDCtx(ZSTDv07_DCtx* dctx);      /*!< @return : errorCode */
 
-/** ZSTDv06_decompressDCtx() :
-*   Same as ZSTDv06_decompress(), but requires an already allocated ZSTDv06_DCtx (see ZSTDv06_createDCtx()) */
-ZSTDLIB_API size_t ZSTDv06_decompressDCtx(ZSTDv06_DCtx* ctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize);
+/** ZSTDv07_decompressDCtx() :
+*   Same as ZSTDv07_decompress(), requires an allocated ZSTDv07_DCtx (see ZSTDv07_createDCtx()) */
+ZSTDLIB_API size_t ZSTDv07_decompressDCtx(ZSTDv07_DCtx* ctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize);
 
 
-/*-***********************
-*  Dictionary API
-*************************/
-/*! ZSTDv06_decompress_usingDict() :
+/*-************************
+*  Simple dictionary API
+***************************/
+/*! ZSTDv07_decompress_usingDict() :
 *   Decompression using a pre-defined Dictionary content (see dictBuilder).
-*   Dictionary must be identical to the one used during compression, otherwise regenerated data will be corrupted.
-*   Note : dict can be NULL, in which case, it's equivalent to ZSTDv06_decompressDCtx() */
-ZSTDLIB_API size_t ZSTDv06_decompress_usingDict(ZSTDv06_DCtx* dctx,
+*   Dictionary must be identical to the one used during compression.
+*   Note : This function loads the dictionary, resulting in a significant startup time */
+ZSTDLIB_API size_t ZSTDv07_decompress_usingDict(ZSTDv07_DCtx* dctx,
                                              void* dst, size_t dstCapacity,
                                        const void* src, size_t srcSize,
                                        const void* dict,size_t dictSize);
 
 
-/*-************************
-*  Advanced Streaming API
-***************************/
+/*-**************************
+*  Advanced Dictionary API
+****************************/
+/*! ZSTDv07_createDDict() :
+*   Create a digested dictionary, ready to start decompression operation without startup delay.
+*   `dict` can be released after creation */
+typedef struct ZSTDv07_DDict_s ZSTDv07_DDict;
+ZSTDLIB_API ZSTDv07_DDict* ZSTDv07_createDDict(const void* dict, size_t dictSize);
+ZSTDLIB_API size_t      ZSTDv07_freeDDict(ZSTDv07_DDict* ddict);
+
+/*! ZSTDv07_decompress_usingDDict() :
+*   Decompression using a pre-digested Dictionary
+*   Faster startup than ZSTDv07_decompress_usingDict(), recommended when same dictionary is used multiple times. */
+ZSTDLIB_API size_t ZSTDv07_decompress_usingDDict(ZSTDv07_DCtx* dctx,
+                                              void* dst, size_t dstCapacity,
+                                        const void* src, size_t srcSize,
+                                        const ZSTDv07_DDict* ddict);
 
-typedef struct ZSTDv06_frameParams_s ZSTDv06_frameParams;
+typedef struct {
+    unsigned long long frameContentSize;
+    unsigned windowSize;
+    unsigned dictID;
+    unsigned checksumFlag;
+} ZSTDv07_frameParams;
 
-ZSTDLIB_API size_t ZSTDv06_getFrameParams(ZSTDv06_frameParams* fparamsPtr, const void* src, size_t srcSize);   /**< doesn't consume input */
-ZSTDLIB_API size_t ZSTDv06_decompressBegin_usingDict(ZSTDv06_DCtx* dctx, const void* dict, size_t dictSize);
-ZSTDLIB_API void   ZSTDv06_copyDCtx(ZSTDv06_DCtx* dctx, const ZSTDv06_DCtx* preparedDCtx);
+ZSTDLIB_API size_t ZSTDv07_getFrameParams(ZSTDv07_frameParams* fparamsPtr, const void* src, size_t srcSize);   /**< doesn't consume input */
 
-ZSTDLIB_API size_t ZSTDv06_nextSrcSizeToDecompress(ZSTDv06_DCtx* dctx);
-ZSTDLIB_API size_t ZSTDv06_decompressContinue(ZSTDv06_DCtx* dctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize);
 
 
 
 /* *************************************
-*  ZBUFF API
+*  Streaming functions
 ***************************************/
+typedef struct ZBUFFv07_DCtx_s ZBUFFv07_DCtx;
+ZSTDLIB_API ZBUFFv07_DCtx* ZBUFFv07_createDCtx(void);
+ZSTDLIB_API size_t      ZBUFFv07_freeDCtx(ZBUFFv07_DCtx* dctx);
 
-typedef struct ZBUFFv06_DCtx_s ZBUFFv06_DCtx;
-ZSTDLIB_API ZBUFFv06_DCtx* ZBUFFv06_createDCtx(void);
-ZSTDLIB_API size_t      ZBUFFv06_freeDCtx(ZBUFFv06_DCtx* dctx);
+ZSTDLIB_API size_t ZBUFFv07_decompressInit(ZBUFFv07_DCtx* dctx);
+ZSTDLIB_API size_t ZBUFFv07_decompressInitDictionary(ZBUFFv07_DCtx* dctx, const void* dict, size_t dictSize);
 
-ZSTDLIB_API size_t ZBUFFv06_decompressInit(ZBUFFv06_DCtx* dctx);
-ZSTDLIB_API size_t ZBUFFv06_decompressInitDictionary(ZBUFFv06_DCtx* dctx, const void* dict, size_t dictSize);
-
-ZSTDLIB_API size_t ZBUFFv06_decompressContinue(ZBUFFv06_DCtx* dctx,
+ZSTDLIB_API size_t ZBUFFv07_decompressContinue(ZBUFFv07_DCtx* dctx,
                                             void* dst, size_t* dstCapacityPtr,
                                       const void* src, size_t* srcSizePtr);
 
 /*-***************************************************************************
 *  Streaming decompression howto
 *
-*  A ZBUFFv06_DCtx object is required to track streaming operations.
-*  Use ZBUFFv06_createDCtx() and ZBUFFv06_freeDCtx() to create/release resources.
-*  Use ZBUFFv06_decompressInit() to start a new decompression operation,
-*   or ZBUFFv06_decompressInitDictionary() if decompression requires a dictionary.
-*  Note that ZBUFFv06_DCtx objects can be re-init multiple times.
+*  A ZBUFFv07_DCtx object is required to track streaming operations.
+*  Use ZBUFFv07_createDCtx() and ZBUFFv07_freeDCtx() to create/release resources.
+*  Use ZBUFFv07_decompressInit() to start a new decompression operation,
+*   or ZBUFFv07_decompressInitDictionary() if decompression requires a dictionary.
+*  Note that ZBUFFv07_DCtx objects can be re-init multiple times.
 *
-*  Use ZBUFFv06_decompressContinue() repetitively to consume your input.
+*  Use ZBUFFv07_decompressContinue() repeatedly to consume your input.
 *  *srcSizePtr and *dstCapacityPtr can be any size.
 *  The function will report how many bytes were read or written by modifying *srcSizePtr and *dstCapacityPtr.
 *  Note that it may not consume the entire input, in which case it's up to the caller to present remaining input again.
 *  The content of `dst` will be overwritten (up to *dstCapacityPtr) at each function call, so save its content if it matters, or change `dst`.
 *  @return : a hint to preferred nb of bytes to use as input for next function call (it's only a hint, to help latency),
 *            or 0 when a frame is completely decoded,
-*            or an error code, which can be tested using ZBUFFv06_isError().
+*            or an error code, which can be tested using ZBUFFv07_isError().
 *
-*  Hint : recommended buffer sizes (not compulsory) : ZBUFFv06_recommendedDInSize() and ZBUFFv06_recommendedDOutSize()
-*  output : ZBUFFv06_recommendedDOutSize== 128 KB block size is the internal unit, it ensures it's always possible to write a full block when decoded.
-*  input  : ZBUFFv06_recommendedDInSize == 128KB + 3;
-*           just follow indications from ZBUFFv06_decompressContinue() to minimize latency. It should always be <= 128 KB + 3 .
+*  Hint : recommended buffer sizes (not compulsory) : ZBUFFv07_recommendedDInSize() and ZBUFFv07_recommendedDOutSize()
+*  output : ZBUFFv07_recommendedDOutSize== 128 KB block size is the internal unit, it ensures it's always possible to write a full block when decoded.
+*  input  : ZBUFFv07_recommendedDInSize == 128KB + 3;
+*           just follow indications from ZBUFFv07_decompressContinue() to minimize latency. It should always be <= 128 KB + 3 .
 * *******************************************************************************/
 
 
 /* *************************************
 *  Tool functions
 ***************************************/
-ZSTDLIB_API unsigned ZBUFFv06_isError(size_t errorCode);
-ZSTDLIB_API const char* ZBUFFv06_getErrorName(size_t errorCode);
+ZSTDLIB_API unsigned ZBUFFv07_isError(size_t errorCode);
+ZSTDLIB_API const char* ZBUFFv07_getErrorName(size_t errorCode);
 
 /** Functions below provide recommended buffer sizes for Compression or Decompression operations.
 *   These sizes are just hints, they tend to offer better latency */
-ZSTDLIB_API size_t ZBUFFv06_recommendedDInSize(void);
-ZSTDLIB_API size_t ZBUFFv06_recommendedDOutSize(void);
+ZSTDLIB_API size_t ZBUFFv07_recommendedDInSize(void);
+ZSTDLIB_API size_t ZBUFFv07_recommendedDOutSize(void);
 
 
 /*-*************************************
 *  Constants
 ***************************************/
-#define ZSTDv06_MAGICNUMBER 0xFD2FB526   /* v0.6 */
-
+#define ZSTDv07_MAGICNUMBER            0xFD2FB527   /* v0.7 */
 
 
 #if defined (__cplusplus)
 }
 #endif
 
-#endif  /* ZSTDv06_BUFFERED_H */
+#endif  /* ZSTDv07_H_235446 */
diff --git a/lib/common/zstd.h b/lib/zstd.h
similarity index 58%
rename from lib/common/zstd.h
rename to lib/zstd.h
index d6a1cce..cb33b55 100644
--- a/lib/common/zstd.h
+++ b/lib/zstd.h
@@ -36,15 +36,11 @@
 extern "C" {
 #endif
 
-/*-*************************************
-*  Dependencies
-***************************************/
+/*======  Dependency  ======*/
 #include <stddef.h>   /* size_t */
 
 
-/*-***************************************************************
-*  Export parameters
-*****************************************************************/
+/*======  Export for Windows  ======*/
 /*!
 *  ZSTD_DLL_EXPORT :
 *  Enable exporting of functions when building a Windows DLL
@@ -56,12 +52,10 @@ extern "C" {
 #endif
 
 
-/* *************************************
-*  Version
-***************************************/
+/*======  Version  ======*/
 #define ZSTD_VERSION_MAJOR    0
-#define ZSTD_VERSION_MINOR    7
-#define ZSTD_VERSION_RELEASE  1
+#define ZSTD_VERSION_MINOR    8
+#define ZSTD_VERSION_RELEASE  0
 
 #define ZSTD_LIB_VERSION ZSTD_VERSION_MAJOR.ZSTD_VERSION_MINOR.ZSTD_VERSION_RELEASE
 #define ZSTD_QUOTE(str) #str
@@ -73,56 +67,72 @@ ZSTDLIB_API unsigned ZSTD_versionNumber (void);
 
 
 /* *************************************
-*  Simple functions
+*  Simple API
 ***************************************/
 /*! ZSTD_compress() :
-    Compresses `srcSize` bytes from buffer `src` into buffer `dst` of size `dstCapacity`.
-    Destination buffer must be already allocated.
-    Compression runs faster if `dstCapacity` >=  `ZSTD_compressBound(srcSize)`.
-    @return : the number of bytes written into `dst`,
+    Compresses `src` buffer into already allocated `dst`.
+    Hint : compression runs faster if `dstCapacity` >=  `ZSTD_compressBound(srcSize)`.
+    @return : the number of bytes written into `dst` (<= `dstCapacity`),
               or an error code if it fails (which can be tested using ZSTD_isError()) */
-ZSTDLIB_API size_t ZSTD_compress(   void* dst, size_t dstCapacity,
-                              const void* src, size_t srcSize,
-                                     int  compressionLevel);
+ZSTDLIB_API size_t ZSTD_compress( void* dst, size_t dstCapacity,
+                            const void* src, size_t srcSize,
+                                  int compressionLevel);
+
+/*! ZSTD_getDecompressedSize() :
+*   @return : decompressed size as a 64-bits value _if known_, 0 otherwise.
+*    note 1 : decompressed size can be very large (64-bits value),
+*             potentially larger than what local system can handle as a single memory segment.
+*             In which case, it's necessary to use streaming mode to decompress data.
+*    note 2 : decompressed size is an optional field, that may not be present.
+*             When `return==0`, consider data to decompress could have any size.
+*             In which case, it's necessary to use streaming mode to decompress data,
+*             or rely on application's implied limits.
+*             (For example, it may know that its own data is necessarily cut into blocks <= 16 KB).
+*    note 3 : decompressed size could be wrong or intentionally modified !
+*             Always ensure result fits within application's authorized limits !
+*             Each application can have its own set of conditions.
+*             If the intention is to decompress public data compressed by zstd command line utility,
+*             it is recommended to support at least 8 MB for extended compatibility.
+*    note 4 : when `return==0`, if precise failure cause is needed, use ZSTD_getFrameParams() to know more. */
+unsigned long long ZSTD_getDecompressedSize(const void* src, size_t srcSize);
 
 /*! ZSTD_decompress() :
-    `compressedSize` : is the _exact_ size of the compressed blob, otherwise decompression will fail.
-    `dstCapacity` must be large enough, equal or larger than originalSize.
+    `compressedSize` : must be the _exact_ size of compressed input, otherwise decompression will fail.
+    `dstCapacity` must be equal or larger than originalSize (see ZSTD_getDecompressedSize() ).
+    If originalSize is unknown, and if there is no implied application-specific limitations,
+    it's necessary to use streaming mode to decompress data.
     @return : the number of bytes decompressed into `dst` (<= `dstCapacity`),
               or an errorCode if it fails (which can be tested using ZSTD_isError()) */
 ZSTDLIB_API size_t ZSTD_decompress( void* dst, size_t dstCapacity,
                               const void* src, size_t compressedSize);
 
 
-/* *************************************
-*  Helper functions
-***************************************/
-ZSTDLIB_API size_t      ZSTD_compressBound(size_t srcSize); /*!< maximum compressed size (worst case scenario) */
-
-/* Error Management */
+/*======  Helper functions  ======*/
+ZSTDLIB_API int         ZSTD_maxCLevel(void);               /*!< maximum compression level available */
+ZSTDLIB_API size_t      ZSTD_compressBound(size_t srcSize); /*!< maximum compressed size in worst case scenario */
 ZSTDLIB_API unsigned    ZSTD_isError(size_t code);          /*!< tells if a `size_t` function result is an error code */
-ZSTDLIB_API const char* ZSTD_getErrorName(size_t code);     /*!< provides readable string for an error code */
+ZSTDLIB_API const char* ZSTD_getErrorName(size_t code);     /*!< provides readable string from an error code */
 
 
-/* *************************************
+/*-*************************************
 *  Explicit memory management
 ***************************************/
 /** Compression context */
 typedef struct ZSTD_CCtx_s ZSTD_CCtx;                       /*< incomplete type */
 ZSTDLIB_API ZSTD_CCtx* ZSTD_createCCtx(void);
-ZSTDLIB_API size_t     ZSTD_freeCCtx(ZSTD_CCtx* cctx);      /*!< @return : errorCode */
+ZSTDLIB_API size_t     ZSTD_freeCCtx(ZSTD_CCtx* cctx);
 
 /** ZSTD_compressCCtx() :
-    Same as ZSTD_compress(), but requires an already allocated ZSTD_CCtx (see ZSTD_createCCtx()) */
+    Same as ZSTD_compress(), requires an allocated ZSTD_CCtx (see ZSTD_createCCtx()) */
 ZSTDLIB_API size_t ZSTD_compressCCtx(ZSTD_CCtx* ctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize, int compressionLevel);
 
 /** Decompression context */
-typedef struct ZSTD_DCtx_s ZSTD_DCtx;
+typedef struct ZSTD_DCtx_s ZSTD_DCtx;                       /*< incomplete type */
 ZSTDLIB_API ZSTD_DCtx* ZSTD_createDCtx(void);
-ZSTDLIB_API size_t     ZSTD_freeDCtx(ZSTD_DCtx* dctx);      /*!< @return : errorCode */
+ZSTDLIB_API size_t     ZSTD_freeDCtx(ZSTD_DCtx* dctx);
 
 /** ZSTD_decompressDCtx() :
-*   Same as ZSTD_decompress(), but requires an already allocated ZSTD_DCtx (see ZSTD_createDCtx()) */
+*   Same as ZSTD_decompress(), requires an allocated ZSTD_DCtx (see ZSTD_createDCtx()) */
 ZSTDLIB_API size_t ZSTD_decompressDCtx(ZSTD_DCtx* ctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize);
 
 
@@ -130,10 +140,8 @@ ZSTDLIB_API size_t ZSTD_decompressDCtx(ZSTD_DCtx* ctx, void* dst, size_t dstCapa
 *  Simple dictionary API
 ***************************/
 /*! ZSTD_compress_usingDict() :
-*   Compression using a pre-defined Dictionary content (see dictBuilder).
-*   Note 1 : This function load the dictionary, resulting in a significant startup time.
-*   Note 2 : `dict` must remain accessible and unmodified during compression operation.
-*   Note 3 : `dict` can be `NULL`, in which case, it's equivalent to ZSTD_compressCCtx() */
+*   Compression using a predefined Dictionary (see dictBuilder/zdict.h).
+*   Note : This function loads the dictionary, resulting in a significant startup time. */
 ZSTDLIB_API size_t ZSTD_compress_usingDict(ZSTD_CCtx* ctx,
                                            void* dst, size_t dstCapacity,
                                      const void* src, size_t srcSize,
@@ -141,11 +149,9 @@ ZSTDLIB_API size_t ZSTD_compress_usingDict(ZSTD_CCtx* ctx,
                                            int compressionLevel);
 
 /*! ZSTD_decompress_usingDict() :
-*   Decompression using a pre-defined Dictionary content (see dictBuilder).
+*   Decompression using a predefined Dictionary (see dictBuilder/zdict.h).
 *   Dictionary must be identical to the one used during compression.
-*   Note 1 : This function load the dictionary, resulting in a significant startup time
-*   Note 2 : `dict` must remain accessible and unmodified during compression operation.
-*   Note 3 : `dict` can be `NULL`, in which case, it's equivalent to ZSTD_decompressDCtx() */
+*   Note : This function loads the dictionary, resulting in a significant startup time. */
 ZSTDLIB_API size_t ZSTD_decompress_usingDict(ZSTD_DCtx* dctx,
                                              void* dst, size_t dstCapacity,
                                        const void* src, size_t srcSize,
@@ -153,7 +159,7 @@ ZSTDLIB_API size_t ZSTD_decompress_usingDict(ZSTD_DCtx* dctx,
 
 
 /*-**************************
-*  Advanced Dictionary API
+*  Fast Dictionary API
 ****************************/
 /*! ZSTD_createCDict() :
 *   Create a digested dictionary, ready to start compression operation without startup delay.
@@ -163,8 +169,8 @@ ZSTDLIB_API ZSTD_CDict* ZSTD_createCDict(const void* dict, size_t dictSize, int
 ZSTDLIB_API size_t      ZSTD_freeCDict(ZSTD_CDict* CDict);
 
 /*! ZSTD_compress_usingCDict() :
-*   Compression using a pre-digested Dictionary.
-*   Much faster than ZSTD_compress_usingDict() when same dictionary is used multiple times.
+*   Compression using a digested Dictionary.
+*   Faster startup than ZSTD_compress_usingDict(), recommended when same dictionary is used multiple times.
 *   Note that compression level is decided during dictionary creation */
 ZSTDLIB_API size_t ZSTD_compress_usingCDict(ZSTD_CCtx* cctx,
                                             void* dst, size_t dstCapacity,
@@ -179,15 +185,14 @@ ZSTDLIB_API ZSTD_DDict* ZSTD_createDDict(const void* dict, size_t dictSize);
 ZSTDLIB_API size_t      ZSTD_freeDDict(ZSTD_DDict* ddict);
 
 /*! ZSTD_decompress_usingDDict() :
-*   Decompression using a pre-digested Dictionary
-*   Much faster than ZSTD_decompress_usingDict() when same dictionary is used multiple times. */
+*   Decompression using a digested Dictionary
+*   Faster startup than ZSTD_decompress_usingDict(), recommended when same dictionary is used multiple times. */
 ZSTDLIB_API size_t ZSTD_decompress_usingDDict(ZSTD_DCtx* dctx,
                                               void* dst, size_t dstCapacity,
                                         const void* src, size_t srcSize,
                                         const ZSTD_DDict* ddict);
 
 
-
 #ifdef ZSTD_STATIC_LINKING_ONLY
 
 /* ====================================================================================
@@ -197,22 +202,19 @@ ZSTDLIB_API size_t ZSTD_decompress_usingDDict(ZSTD_DCtx* dctx,
  * Use them only in association with static linking.
  * ==================================================================================== */
 
-/*--- Dependency ---*/
-#include "mem.h"   /* U32 */
-
-
 /*--- Constants ---*/
-#define ZSTD_MAGICNUMBER            0xFD2FB527   /* v0.7 */
+#define ZSTD_MAGICNUMBER            0xFD2FB528   /* v0.8 */
 #define ZSTD_MAGIC_SKIPPABLE_START  0x184D2A50U
 
-#define ZSTD_WINDOWLOG_MAX    ((U32)(MEM_32bits() ? 25 : 27))
+#define ZSTD_WINDOWLOG_MAX_32  25
+#define ZSTD_WINDOWLOG_MAX_64  27
+#define ZSTD_WINDOWLOG_MAX    ((U32)(MEM_32bits() ? ZSTD_WINDOWLOG_MAX_32 : ZSTD_WINDOWLOG_MAX_64))
 #define ZSTD_WINDOWLOG_MIN     18
 #define ZSTD_CHAINLOG_MAX     (ZSTD_WINDOWLOG_MAX+1)
 #define ZSTD_CHAINLOG_MIN       4
 #define ZSTD_HASHLOG_MAX       ZSTD_WINDOWLOG_MAX
 #define ZSTD_HASHLOG_MIN       12
 #define ZSTD_HASHLOG3_MAX      17
-#define ZSTD_HASHLOG3_MIN      15
 #define ZSTD_SEARCHLOG_MAX    (ZSTD_WINDOWLOG_MAX-1)
 #define ZSTD_SEARCHLOG_MIN      1
 #define ZSTD_SEARCHLENGTH_MAX   7
@@ -227,22 +229,22 @@ static const size_t ZSTD_skippableHeaderSize = 8;  /* magic number + skippable f
 
 
 /*--- Types ---*/
-typedef enum { ZSTD_fast, ZSTD_greedy, ZSTD_lazy, ZSTD_lazy2, ZSTD_btlazy2, ZSTD_btopt } ZSTD_strategy;   /*< from faster to stronger */
+typedef enum { ZSTD_fast, ZSTD_dfast, ZSTD_greedy, ZSTD_lazy, ZSTD_lazy2, ZSTD_btlazy2, ZSTD_btopt } ZSTD_strategy;   /*< from faster to stronger */
 
 typedef struct {
-    U32 windowLog;     /*< largest match distance : larger == more compression, more memory needed during decompression */
-    U32 chainLog;      /*< fully searched segment : larger == more compression, slower, more memory (useless for fast) */
-    U32 hashLog;       /*< dispatch table : larger == faster, more memory */
-    U32 searchLog;     /*< nb of searches : larger == more compression, slower */
-    U32 searchLength;  /*< match length searched : larger == faster decompression, sometimes less compression */
-    U32 targetLength;  /*< acceptable match size for optimal parser (only) : larger == more compression, slower */
+    unsigned windowLog;      /*< largest match distance : larger == more compression, more memory needed during decompression */
+    unsigned chainLog;       /*< fully searched segment : larger == more compression, slower, more memory (useless for fast) */
+    unsigned hashLog;        /*< dispatch table : larger == faster, more memory */
+    unsigned searchLog;      /*< nb of searches : larger == more compression, slower */
+    unsigned searchLength;   /*< match length searched : larger == faster decompression, sometimes less compression */
+    unsigned targetLength;   /*< acceptable match size for optimal parser (only) : larger == more compression, slower */
     ZSTD_strategy strategy;
 } ZSTD_compressionParameters;
 
 typedef struct {
-    U32 contentSizeFlag;  /*< 1: content size will be in frame header (if known). */
-    U32 checksumFlag;     /*< 1: will generate a 22-bits checksum at end of frame, to be used for error detection by decompressor */
-    U32 noDictIDFlag;     /*< 1: no dict ID will be saved into frame header (if dictionary compression) */
+    unsigned contentSizeFlag; /*< 1: content size will be in frame header (if known). */
+    unsigned checksumFlag;    /*< 1: will generate a 22-bits checksum at end of frame, to be used for error detection by decompressor */
+    unsigned noDictIDFlag;    /*< 1: no dict ID will be saved into frame header (if dictionary compression) */
 } ZSTD_frameParameters;
 
 typedef struct {
@@ -259,6 +261,11 @@ typedef struct { ZSTD_allocFunction customAlloc; ZSTD_freeFunction customFree; v
 /*-*************************************
 *  Advanced compression functions
 ***************************************/
+/*! ZSTD_estimateCCtxSize() :
+ *  Gives the amount of memory allocated for a ZSTD_CCtx given a set of compression parameters.
+ *  `frameContentSize` is an optional parameter, provide `0` if unknown */
+ZSTDLIB_API size_t ZSTD_estimateCCtxSize(ZSTD_compressionParameters cParams);
+
 /*! ZSTD_createCCtx_advanced() :
  *  Create a ZSTD compression context using external alloc and free functions */
 ZSTDLIB_API ZSTD_CCtx* ZSTD_createCCtx_advanced(ZSTD_customMem customMem);
@@ -268,21 +275,28 @@ ZSTDLIB_API ZSTD_CCtx* ZSTD_createCCtx_advanced(ZSTD_customMem customMem);
 ZSTDLIB_API ZSTD_CDict* ZSTD_createCDict_advanced(const void* dict, size_t dictSize,
                                                   ZSTD_parameters params, ZSTD_customMem customMem);
 
-ZSTDLIB_API unsigned ZSTD_maxCLevel (void);
+/*! ZSTD_sizeofCCtx() :
+ *  Gives the amount of memory used by a given ZSTD_CCtx */
+ZSTDLIB_API size_t ZSTD_sizeofCCtx(const ZSTD_CCtx* cctx);
+
+/*! ZSTD_getParams() :
+*   same as ZSTD_getCParams(), but @return a full `ZSTD_parameters` object instead of a `ZSTD_compressionParameters`.
+*   All fields of `ZSTD_frameParameters` are set to default (0) */
+ZSTD_parameters ZSTD_getParams(int compressionLevel, unsigned long long srcSize, size_t dictSize);
 
 /*! ZSTD_getCParams() :
 *   @return ZSTD_compressionParameters structure for a selected compression level and srcSize.
 *   `srcSize` value is optional, select 0 if not known */
-ZSTDLIB_API ZSTD_compressionParameters ZSTD_getCParams(int compressionLevel, U64 srcSize, size_t dictSize);
+ZSTDLIB_API ZSTD_compressionParameters ZSTD_getCParams(int compressionLevel, unsigned long long srcSize, size_t dictSize);
 
-/*! ZSTD_checkParams() :
+/*! ZSTD_checkCParams() :
 *   Ensure param values remain within authorized range */
 ZSTDLIB_API size_t ZSTD_checkCParams(ZSTD_compressionParameters params);
 
-/*! ZSTD_adjustParams() :
+/*! ZSTD_adjustCParams() :
 *   optimize params for a given `srcSize` and `dictSize`.
 *   both values are optional, select `0` if unknown. */
-ZSTDLIB_API ZSTD_compressionParameters ZSTD_adjustCParams(ZSTD_compressionParameters cPar, U64 srcSize, size_t dictSize);
+ZSTDLIB_API ZSTD_compressionParameters ZSTD_adjustCParams(ZSTD_compressionParameters cPar, unsigned long long srcSize, size_t dictSize);
 
 /*! ZSTD_compress_advanced() :
 *   Same as ZSTD_compress_usingDict(), with fine-tune control of each compression parameter */
@@ -295,21 +309,34 @@ ZSTDLIB_API size_t ZSTD_compress_advanced (ZSTD_CCtx* ctx,
 
 /*--- Advanced Decompression functions ---*/
 
+/*! ZSTD_estimateDCtxSize() :
+ *  Gives the potential amount of memory allocated to create a ZSTD_DCtx */
+ZSTDLIB_API size_t ZSTD_estimateDCtxSize(void);
+
 /*! ZSTD_createDCtx_advanced() :
  *  Create a ZSTD decompression context using external alloc and free functions */
 ZSTDLIB_API ZSTD_DCtx* ZSTD_createDCtx_advanced(ZSTD_customMem customMem);
 
+/*! ZSTD_sizeofDCtx() :
+ *  Gives the amount of memory used by a given ZSTD_DCtx */
+ZSTDLIB_API size_t ZSTD_sizeofDCtx(const ZSTD_DCtx* dctx);
+
+
+/* ******************************************************************
+*  Buffer-less streaming functions (synchronous mode)
+********************************************************************/
+/* This is an advanced API, giving full control over buffer management, for users who need direct control over memory.
+*  But it's also a complex one, with a lot of restrictions (documented below).
+*  For an easier streaming API, look into common/zbuff.h
+*  which removes all restrictions by allocating and managing its own internal buffer */
 
-/* ****************************************************************
-*  Streaming functions (direct mode - synchronous and buffer-less)
-******************************************************************/
 ZSTDLIB_API size_t ZSTD_compressBegin(ZSTD_CCtx* cctx, int compressionLevel);
 ZSTDLIB_API size_t ZSTD_compressBegin_usingDict(ZSTD_CCtx* cctx, const void* dict, size_t dictSize, int compressionLevel);
-ZSTDLIB_API size_t ZSTD_compressBegin_advanced(ZSTD_CCtx* cctx, const void* dict, size_t dictSize, ZSTD_parameters params, U64 pledgedSrcSize);
+ZSTDLIB_API size_t ZSTD_compressBegin_advanced(ZSTD_CCtx* cctx, const void* dict, size_t dictSize, ZSTD_parameters params, unsigned long long pledgedSrcSize);
 ZSTDLIB_API size_t ZSTD_copyCCtx(ZSTD_CCtx* cctx, const ZSTD_CCtx* preparedCCtx);
 
 ZSTDLIB_API size_t ZSTD_compressContinue(ZSTD_CCtx* cctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize);
-ZSTDLIB_API size_t ZSTD_compressEnd(ZSTD_CCtx* cctx, void* dst, size_t dstCapacity);
+ZSTDLIB_API size_t ZSTD_compressEnd(ZSTD_CCtx* cctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize);
 
 /*
   A ZSTD_CCtx object is required to track streaming operations.
@@ -324,7 +351,7 @@ ZSTDLIB_API size_t ZSTD_compressEnd(ZSTD_CCtx* cctx, void* dst, size_t dstCapaci
   Then, consume your input using ZSTD_compressContinue().
   There are some important considerations to keep in mind when using this advanced function :
   - ZSTD_compressContinue() has no internal buffer. It uses externally provided buffer only.
-  - Interface is synchronous : input will be entirely consumed and produce 1+ compressed blocks.
+  - Interface is synchronous : input is consumed entirely and produces 1 or more compressed blocks.
   - Caller must ensure there is enough space in `dst` to store compressed data under worst case scenario.
     Worst case evaluation is provided by ZSTD_compressBound().
     ZSTD_compressContinue() doesn't guarantee recover after a failed compression.
@@ -333,21 +360,21 @@ ZSTDLIB_API size_t ZSTD_compressEnd(ZSTD_CCtx* cctx, void* dst, size_t dstCapaci
   - ZSTD_compressContinue() detects that prior input has been overwritten when `src` buffer overlaps.
     In which case, it will "discard" the relevant memory section from its history.
 
-
-  Finish a frame with ZSTD_compressEnd(), which will write the epilogue.
-  Without epilogue, frames will be considered unfinished (broken) by decoders.
+  Finish a frame with ZSTD_compressEnd(), which will write the last block(s) and optional checksum.
+  It's possible to use a NULL,0 src content, in which case, it will write a final empty block to end the frame.
+  Without last block mark, frames will be considered unfinished (broken) by decoders.
 
   You can then reuse `ZSTD_CCtx` (ZSTD_compressBegin()) to compress some new frame.
 */
 
 typedef struct {
-    U64 frameContentSize;
-    U32 windowSize;
-    U32 dictID;
-    U32 checksumFlag;
+    unsigned long long frameContentSize;
+    unsigned windowSize;
+    unsigned dictID;
+    unsigned checksumFlag;
 } ZSTD_frameParams;
 
-ZSTDLIB_API size_t ZSTD_getFrameParams(ZSTD_frameParams* fparamsPtr, const void* src, size_t srcSize);   /**< doesn't consume input */
+ZSTDLIB_API size_t ZSTD_getFrameParams(ZSTD_frameParams* fparamsPtr, const void* src, size_t srcSize);   /**< doesn't consume input, see details below */
 
 ZSTDLIB_API size_t ZSTD_decompressBegin(ZSTD_DCtx* dctx);
 ZSTDLIB_API size_t ZSTD_decompressBegin_usingDict(ZSTD_DCtx* dctx, const void* dict, size_t dictSize);
@@ -356,41 +383,58 @@ ZSTDLIB_API void   ZSTD_copyDCtx(ZSTD_DCtx* dctx, const ZSTD_DCtx* preparedDCtx)
 ZSTDLIB_API size_t ZSTD_nextSrcSizeToDecompress(ZSTD_DCtx* dctx);
 ZSTDLIB_API size_t ZSTD_decompressContinue(ZSTD_DCtx* dctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize);
 
+typedef enum { ZSTDnit_frameHeader, ZSTDnit_blockHeader, ZSTDnit_block, ZSTDnit_lastBlock, ZSTDnit_checksum, ZSTDnit_skippableFrame } ZSTD_nextInputType_e;
+ZSTDLIB_API ZSTD_nextInputType_e ZSTD_nextInputType(ZSTD_DCtx* dctx);
+
 /*
-  Streaming decompression, direct mode (bufferless)
+  Buffer-less streaming decompression (synchronous mode)
 
   A ZSTD_DCtx object is required to track streaming operations.
   Use ZSTD_createDCtx() / ZSTD_freeDCtx() to manage it.
   A ZSTD_DCtx object can be re-used multiple times.
 
-  First optional operation is to retrieve frame parameters, using ZSTD_getFrameParams(), which doesn't consume the input.
-  It can provide the minimum size of rolling buffer required to properly decompress data (`windowSize`),
-  and optionally the final size of uncompressed content.
-  (Note : content size is an optional info that may not be present. 0 means : content size unknown)
-  Frame parameters are extracted from the beginning of compressed frame.
-  The amount of data to read is variable, from ZSTD_frameHeaderSize_min to ZSTD_frameHeaderSize_max (so if `srcSize` >= ZSTD_frameHeaderSize_max, it will always work)
-  If `srcSize` is too small for operation to succeed, function will return the minimum size it requires to produce a result.
-  Result : 0 when successful, it means the ZSTD_frameParams structure has been filled.
-          >0 : means there is not enough data into `src`. Provides the expected size to successfully decode header.
-           errorCode, which can be tested using ZSTD_isError()
+  First typical operation is to retrieve frame parameters, using ZSTD_getFrameParams().
+  It fills a ZSTD_frameParams structure which provides important information to correctly decode the frame,
+  such as the minimum rolling buffer size to allocate to decompress data (`windowSize`),
+  and the dictionary ID used.
+  (Note : content size is optional, it may not be present. 0 means : content size unknown).
+  Note that these values could be wrong, either because of data malformation, or because an attacker is spoofing deliberate false information.
+  As a consequence, check that values remain within valid application range, especially `windowSize`, before allocation.
+  Each application can set its own limit, depending on local restrictions. For extended interoperability, it is recommended to support at least 8 MB.
+  Frame parameters are extracted from the beginning of the compressed frame.
+  Data fragment must be large enough to ensure successful decoding, typically `ZSTD_frameHeaderSize_max` bytes.
+  @result : 0 : successful decoding, the `ZSTD_frameParams` structure is correctly filled.
+           >0 : `srcSize` is too small, please provide at least @result bytes on next attempt.
+           errorCode, which can be tested using ZSTD_isError().
 
   Start decompression, with ZSTD_decompressBegin() or ZSTD_decompressBegin_usingDict().
   Alternatively, you can copy a prepared context, using ZSTD_copyDCtx().
 
   Then use ZSTD_nextSrcSizeToDecompress() and ZSTD_decompressContinue() alternatively.
-  ZSTD_nextSrcSizeToDecompress() tells how much bytes to provide as 'srcSize' to ZSTD_decompressContinue().
-  ZSTD_decompressContinue() requires this exact amount of bytes, or it will fail.
-  ZSTD_decompressContinue() needs previous data blocks during decompression, up to `windowSize`.
-  They should preferably be located contiguously, prior to current block. Alternatively, a round buffer is also possible.
+  ZSTD_nextSrcSizeToDecompress() tells how many bytes to provide as 'srcSize' to ZSTD_decompressContinue().
+  ZSTD_decompressContinue() requires this _exact_ amount of bytes, or it will fail.
 
   @result of ZSTD_decompressContinue() is the number of bytes regenerated within 'dst' (necessarily <= dstCapacity).
-  It can be zero, which is not an error; it just means ZSTD_decompressContinue() has decoded some header.
+  It can be zero, which is not an error; it just means ZSTD_decompressContinue() has decoded some metadata item.
+  It can also be an error code, which can be tested with ZSTD_isError().
+
+  ZSTD_decompressContinue() needs previous data blocks during decompression, up to `windowSize`.
+  They should preferably be located contiguously, prior to current block.
+  Alternatively, a round buffer of sufficient size is also possible. Sufficient size is determined by frame parameters.
+  ZSTD_decompressContinue() is very sensitive to contiguity,
+  if 2 blocks don't follow each other, make sure that either the compressor breaks contiguity at the same place,
+  or that previous contiguous segment is large enough to properly handle maximum back-reference.
 
   A frame is fully decoded when ZSTD_nextSrcSizeToDecompress() returns zero.
   Context can then be reset to start a new decompression.
 
-  Skippable frames allow the integration of user-defined data into a flow of concatenated frames.
-  Skippable frames will be ignored (skipped) by a decompressor. The format of skippable frame is following:
+  Note : it's possible to know if next input to present is a header or a block, using ZSTD_nextInputType().
+  This information is not required to properly decode a frame.
+
+  == Special case : skippable frames ==
+
+  Skippable frames allow integration of user-defined data into a flow of concatenated frames.
+  Skippable frames will be ignored (skipped) by a decompressor. The format of skippable frames is as follows :
   a) Skippable frame ID - 4 Bytes, Little endian format, any value from 0x184D2A50 to 0x184D2A5F
   b) Frame Size - 4 Bytes, Little endian format, unsigned 32-bits
   c) Frame Content - any content (User Data) of length equal to Frame Size
@@ -404,37 +448,33 @@ ZSTDLIB_API size_t ZSTD_decompressContinue(ZSTD_DCtx* dctx, void* dst, size_t ds
 *  Block functions
 ****************************************/
 /*! Block functions produce and decode raw zstd blocks, without frame metadata.
+    Frame metadata cost is typically ~18 bytes, which can be non-negligible for very small blocks (< 100 bytes).
     User will have to take in charge required information to regenerate data, such as compressed and content sizes.
 
     A few rules to respect :
-    - Uncompressed block size must be <= ZSTD_BLOCKSIZE_MAX (128 KB)
-    - Compressing or decompressing requires a context structure
+    - Compressing and decompressing require a context structure
       + Use ZSTD_createCCtx() and ZSTD_createDCtx()
     - It is necessary to init context before starting
       + compression : ZSTD_compressBegin()
       + decompression : ZSTD_decompressBegin()
       + variants _usingDict() are also allowed
       + copyCCtx() and copyDCtx() work too
+    - Block size is limited, it must be <= ZSTD_getBlockSizeMax()
+      + If you need to compress more, cut data into multiple blocks
+      + Consider using the regular ZSTD_compress() instead, as frame metadata costs become negligible when source size is large.
     - When a block is considered not compressible enough, ZSTD_compressBlock() result will be zero.
       In which case, nothing is produced into `dst`.
       + User must test for such outcome and deal directly with uncompressed data
-      + ZSTD_decompressBlock() doesn't accept uncompressed data as input !!
+      + ZSTD_decompressBlock() doesn't accept uncompressed data as input !!!
+      + In case of multiple successive blocks, decoder must be informed of uncompressed block existence to follow proper history.
+        Use ZSTD_insertBlock() in such a case.
 */
 
-#define ZSTD_BLOCKSIZE_MAX (128 * 1024)   /* define, for static allocation */
+#define ZSTD_BLOCKSIZE_ABSOLUTEMAX (128 * 1024)   /* define, for static allocation */
+ZSTDLIB_API size_t ZSTD_getBlockSizeMax(ZSTD_CCtx* cctx);
 ZSTDLIB_API size_t ZSTD_compressBlock  (ZSTD_CCtx* cctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize);
 ZSTDLIB_API size_t ZSTD_decompressBlock(ZSTD_DCtx* dctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize);
-
-
-/*-*************************************
-*  Error management
-***************************************/
-#include "error_public.h"
-/*! ZSTD_getErrorCode() :
-    convert a `size_t` function result into a `ZSTD_ErrorCode` enum type,
-    which can be used to compare directly with enum list published into "error_public.h" */
-ZSTDLIB_API ZSTD_ErrorCode ZSTD_getErrorCode(size_t functionResult);
-ZSTDLIB_API const char* ZSTD_getErrorString(ZSTD_ErrorCode code);
+ZSTDLIB_API size_t ZSTD_insertBlock(ZSTD_DCtx* dctx, const void* blockStart, size_t blockSize);  /**< insert block into `dctx` history. Useful for uncompressed blocks */
 
 
 #endif   /* ZSTD_STATIC_LINKING_ONLY */
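
Editorial note, not part of the commit: the skippable frame layout documented in the lib/zstd.h comment above (4-byte little-endian ID from 0x184D2A50 to 0x184D2A5F, 4-byte little-endian frame size, then the content) can be recognized without calling into the library. The following is a minimal sketch under that description only. The helper names `read_le32` and `skippable_frame_total_size`, and the two `SKIPPABLE_*` macros, are hypothetical and do not exist in zstd; only the magic value and the 8-byte header size come from the header.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed constants, mirroring ZSTD_MAGIC_SKIPPABLE_START and
 * ZSTD_skippableHeaderSize from the header; redefined here so the
 * sketch compiles without zstd.h. */
#define SKIPPABLE_MAGIC_START 0x184D2A50U
#define SKIPPABLE_HEADER_SIZE 8U   /* 4-byte ID + 4-byte frame size */

/* Read an unsigned 32-bit little-endian value, whatever the host endianness. */
static uint32_t read_le32(const unsigned char* p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper (not a zstd function).
 * Returns the total number of bytes (header + content) occupied by the
 * skippable frame starting at `src`, or 0 when `src` does not start with
 * a skippable frame ID or when the buffer is too short to hold the frame. */
static size_t skippable_frame_total_size(const void* src, size_t srcSize)
{
    const unsigned char* const p = (const unsigned char*)src;
    uint32_t magic;
    uint32_t contentSize;

    if (srcSize < SKIPPABLE_HEADER_SIZE) return 0;

    /* The 16 valid IDs differ only in their lowest 4 bits. */
    magic = read_le32(p);
    if ((magic & 0xFFFFFFF0U) != SKIPPABLE_MAGIC_START) return 0;

    /* Compare against the remaining bytes to avoid overflow in the sum. */
    contentSize = read_le32(p + 4);
    if (contentSize > srcSize - SKIPPABLE_HEADER_SIZE) return 0;

    return (size_t)SKIPPABLE_HEADER_SIZE + (size_t)contentSize;
}
```

A caller walking a stream of concatenated frames could advance its read pointer by the returned size when it is non-zero, and hand the data to the regular decoder otherwise. Treating a truncated frame as "not skippable" is a simplification chosen for this sketch; a real reader would probably want to distinguish "need more input" from "not a skippable frame".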
diff --git a/programs/.gitignore b/programs/.gitignore
index cbe39dc..adf7808 100644
--- a/programs/.gitignore
+++ b/programs/.gitignore
@@ -11,6 +11,7 @@ zbufftest
 zbufftest32
 datagen
 paramgrill
+paramgrill32
 roundTripCrash
 
 # Object files
@@ -43,6 +44,7 @@ _*
 tmp*
 *.zst
 result
+out
 
 # fuzzer
 afl
diff --git a/programs/Makefile b/programs/Makefile
index 52a7ca0..be6fbf2 100644
--- a/programs/Makefile
+++ b/programs/Makefile
@@ -38,15 +38,23 @@ MANDIR  = $(PREFIX)/share/man/man1
 
 ZSTDDIR = ../lib
 
-CPPFLAGS= -I$(ZSTDDIR)/common -I$(ZSTDDIR)/dictBuilder -DXXH_NAMESPACE=ZSTD_
-CFLAGS ?= -O3  # -falign-loops=32   # not always beneficial
-CFLAGS += -Wall -Wextra -Wcast-qual -Wcast-align -Wshadow -Wstrict-aliasing=1 -Wswitch-enum -Wdeclaration-after-statement -Wstrict-prototypes -Wundef
-FLAGS   = $(CPPFLAGS) $(CFLAGS) $(LDFLAGS) $(MOREFLAGS)
+ifeq ($(shell $(CC) -v 2>&1 | grep -c "gcc version "), 1)
+ALIGN_LOOP = -falign-loops=32
+else
+ALIGN_LOOP =
+endif
+
+CPPFLAGS= -I$(ZSTDDIR) -I$(ZSTDDIR)/common -I$(ZSTDDIR)/dictBuilder -DXXH_NAMESPACE=ZSTD_
+CFLAGS ?= -O3
+CFLAGS += -Wall -Wextra -Wcast-qual -Wcast-align -Wshadow -Wstrict-aliasing=1 \
+          -Wswitch-enum -Wdeclaration-after-statement -Wstrict-prototypes -Wundef
+CFLAGS += $(MOREFLAGS)
+FLAGS   = $(CPPFLAGS) $(CFLAGS) $(LDFLAGS)
 
 
 ZSTDCOMMON_FILES := $(ZSTDDIR)/common/*.c
 ZSTDCOMP_FILES := $(ZSTDDIR)/compress/zstd_compress.c $(ZSTDDIR)/compress/fse_compress.c $(ZSTDDIR)/compress/huf_compress.c
-ZSTDDECOMP_FILES := $(ZSTDDIR)/decompress/huf_decompress.c $(ZSTDDIR)/decompress/zstd_decompress.c
+ZSTDDECOMP_FILES := $(ZSTDDIR)/decompress/zstd_decompress.o $(ZSTDDIR)/decompress/huf_decompress.c
 ZSTD_FILES := $(ZSTDDECOMP_FILES) $(ZSTDCOMMON_FILES) $(ZSTDCOMP_FILES)
 ZBUFF_FILES := $(ZSTDDIR)/compress/zbuff_compress.c $(ZSTDDIR)/decompress/zbuff_decompress.c
 ZDICT_FILES := $(ZSTDDIR)/dictBuilder/*.c
@@ -74,19 +82,25 @@ ZBUFFTEST = -T2mn
 FUZZERTEST= -T5mn
 ZSTDRTTEST= --test-large-data
 
-.PHONY: default all clean install uninstall test test32 test-all
+.PHONY: default all all32 clean install uninstall test test32 test-all
 
 default: zstd
 
-all: zstd fullbench fuzzer zbufftest paramgrill datagen zstd32 fullbench32 fuzzer32 zbufftest32
+all: zstd fullbench fuzzer zbufftest paramgrill datagen
+
+all32: CFLAGS += -m32
+all32: EXT := 32$(EXT)
+all32: cleano32 all
+
+$(ZSTDDIR)/decompress/zstd_decompress.o: CFLAGS += $(ALIGN_LOOP)
 
 zstd  : $(ZSTD_FILES) $(ZSTDLEGACY_FILES) $(ZBUFF_FILES) $(ZDICT_FILES) \
         zstdcli.c fileio.c bench.c datagen.c dibio.c
 	$(CC)      $(FLAGS) -DZSTD_LEGACY_SUPPORT=$(ZSTD_LEGACY_SUPPORT) $^ -o $@$(EXT)
 
-zstd32: $(ZSTD_FILES) $(ZSTDLEGACY_FILES) $(ZBUFF_FILES) $(ZDICT_FILES) \
-        zstdcli.c fileio.c bench.c datagen.c dibio.c
-	$(CC) -m32 $(FLAGS) -DZSTD_LEGACY_SUPPORT=$(ZSTD_LEGACY_SUPPORT) $^ -o $@$(EXT)
+zstd32: CFLAGS += -m32
+zstd32: EXT := 32$(EXT)
+zstd32: zstd
 
 zstd_nolegacy :
 	$(MAKE) zstd ZSTD_LEGACY_SUPPORT=0
@@ -119,22 +133,24 @@ zstd-small: clean
 fullbench  : $(ZSTD_FILES) $(ZBUFF_FILES) datagen.c fullbench.c
 	$(CC)      $(FLAGS) $^ -o $@$(EXT)
 
-fullbench32: $(ZSTD_FILES) $(ZBUFF_FILES) datagen.c fullbench.c
-	$(CC) -m32 $(FLAGS) $^ -o $@$(EXT)
+fullbench32 : CFLAGS += -m32
+fullbench32 : EXT := 32$(EXT)
+fullbench32 : fullbench
 
 fuzzer  : CPPFLAGS += -I$(ZSTDDIR)/dictBuilder
 fuzzer  : $(ZSTD_FILES) $(ZDICT_FILES) datagen.c fuzzer.c
 	$(CC)      $(FLAGS) $^ -o $@$(EXT)
 
-fuzzer32 : CPPFLAGS += -I$(ZSTDDIR)/dictBuilder
-fuzzer32: $(ZSTD_FILES) $(ZDICT_FILES) datagen.c fuzzer.c
-	$(CC) -m32 $(FLAGS) $^ -o $@$(EXT)
+fuzzer32 : CFLAGS += -m32
+fuzzer32 : EXT := 32$(EXT)
+fuzzer32 : fuzzer
 
 zbufftest  : $(ZSTD_FILES) $(ZBUFF_FILES) datagen.c zbufftest.c
 	$(CC)      $(FLAGS) $^ -o $@$(EXT)
 
-zbufftest32: $(ZSTD_FILES) $(ZBUFF_FILES) datagen.c zbufftest.c
-	$(CC) -m32 $(FLAGS) $^ -o $@$(EXT)
+zbufftest32 : CFLAGS += -m32
+zbufftest32 : EXT := 32$(EXT)
+zbufftest32 : zbufftest
 
 paramgrill : $(ZSTD_FILES) datagen.c paramgrill.c
 	$(CC)      $(FLAGS) $^ -lm -o $@$(EXT)
@@ -146,6 +162,7 @@ roundTripCrash : $(ZSTD_FILES) roundTripCrash.c
 	$(CC)      $(FLAGS) $^ -o $@$(EXT)
 
 clean:
+	$(MAKE) -C ../lib clean
 	@rm -f core *.o tmp* result* *.gcda dictionary *.zst \
         zstd$(EXT) zstd32$(EXT) zstd-compress$(EXT) zstd-decompress$(EXT) \
         fullbench$(EXT) fullbench32$(EXT) \
@@ -153,11 +170,13 @@ clean:
         datagen$(EXT) paramgrill$(EXT) roundTripCrash$(EXT)
 	@echo Cleaning completed
 
+cleano32:
+	@rm -f ../lib/decompress/*.o
 
-#------------------------------------------------------------------------
-#make install is validated only for Linux, OSX, kFreeBSD and Hurd targets
-#------------------------------------------------------------------------
-ifneq (,$(filter $(shell uname),Linux Darwin GNU/kFreeBSD GNU))
+#----------------------------------------------------------------------------------
+#make install is validated only for Linux, OSX, kFreeBSD, Hurd and some BSD targets
+#----------------------------------------------------------------------------------
+ifneq (,$(filter $(shell uname),Linux Darwin GNU/kFreeBSD GNU OpenBSD FreeBSD DragonFly))
 HOST_OS = POSIX
 install: zstd
 	@echo Installing binaries
diff --git a/programs/bench.c b/programs/bench.c
index a8fc740..f4bff88 100644
--- a/programs/bench.c
+++ b/programs/bench.c
@@ -142,22 +142,20 @@ static int BMK_benchMem(const void* srcBuffer, size_t srcSize,
                         const size_t* fileSizes, U32 nbFiles,
                         const void* dictBuffer, size_t dictBufferSize, benchResult_t *result)
 {
-    size_t const blockSize = (g_blockSize>=32 ? g_blockSize : srcSize) + (!srcSize);   /* avoid div by 0 */
+    size_t const blockSize = (g_blockSize>=32 ? g_blockSize : srcSize) + (!srcSize) /* avoid div by 0 */ ;
     U32 const maxNbBlocks = (U32) ((srcSize + (blockSize-1)) / blockSize) + nbFiles;
     blockParam_t* const blockTable = (blockParam_t*) malloc(maxNbBlocks * sizeof(blockParam_t));
     size_t const maxCompressedSize = ZSTD_compressBound(srcSize) + (maxNbBlocks * 1024);   /* add some room for safety */
     void* const compressedBuffer = malloc(maxCompressedSize);
     void* const resultBuffer = malloc(srcSize);
-    ZSTD_CCtx* refCtx = ZSTD_createCCtx();
-    ZSTD_CCtx* ctx = ZSTD_createCCtx();
-    ZSTD_DCtx* refDCtx = ZSTD_createDCtx();
-    ZSTD_DCtx* dctx = ZSTD_createDCtx();
+    ZSTD_CCtx* const ctx = ZSTD_createCCtx();
+    ZSTD_DCtx* const dctx = ZSTD_createDCtx();
     U32 nbBlocks;
     UTIL_time_t ticksPerSecond;
 
     /* checks */
-    if (!compressedBuffer || !resultBuffer || !blockTable || !refCtx || !ctx || !refDCtx || !dctx)
-        EXM_THROW(31, "not enough memory");
+    if (!compressedBuffer || !resultBuffer || !blockTable || !ctx || !dctx)
+        EXM_THROW(31, "allocation error : not enough memory");
 
     /* init */
     if (strlen(displayName)>17) displayName += strlen(displayName)-17;   /* can only display 17 characters */
@@ -204,7 +202,7 @@ static int BMK_benchMem(const void* srcBuffer, size_t srcSize,
 
             /* overheat protection */
             if (UTIL_clockSpanMicro(coolTime, ticksPerSecond) > ACTIVEPERIOD_MICROSEC) {
-                DISPLAY("\rcooling down ...    \r");
+                DISPLAYLEVEL(2, "\rcooling down ...    \r");
                 UTIL_sleep(COOLPERIOD_SEC);
                 UTIL_getTime(&coolTime);
             }
@@ -213,13 +211,17 @@ static int BMK_benchMem(const void* srcBuffer, size_t srcSize,
             DISPLAYLEVEL(2, "%2i-%-17.17s :%10u ->\r", testNb, displayName, (U32)srcSize);
             memset(compressedBuffer, 0xE5, maxCompressedSize);  /* warm up and erase result buffer */
 
-            UTIL_sleepMilli(1); /* give processor time to other processes */
+            UTIL_sleepMilli(1);  /* give processor time to other processes */
             UTIL_waitForNextTick(ticksPerSecond);
             UTIL_getTime(&clockStart);
 
-            {   U32 nbLoops = 0;
-                ZSTD_CDict* cdict = ZSTD_createCDict(dictBuffer, dictBufferSize, cLevel);
-                if (cdict==NULL) EXM_THROW(1, "ZSTD_createCDict() allocation failure");
+            {   /* size_t const refSrcSize = (nbBlocks == 1) ? srcSize : 0; */
+                /* ZSTD_parameters const zparams = ZSTD_getParams(cLevel, refSrcSize, dictBufferSize); */
+                ZSTD_parameters const zparams = ZSTD_getParams(cLevel, blockSize, dictBufferSize);
+                ZSTD_customMem const cmem = { NULL, NULL, NULL };
+                U32 nbLoops = 0;
+                ZSTD_CDict* cdict = ZSTD_createCDict_advanced(dictBuffer, dictBufferSize, zparams, cmem);
+                if (cdict==NULL) EXM_THROW(1, "ZSTD_createCDict_advanced() allocation failure");
                 do {
                     U32 blockNb;
                     for (blockNb=0; blockNb<nbBlocks; blockNb++) {
@@ -227,7 +229,7 @@ static int BMK_benchMem(const void* srcBuffer, size_t srcSize,
                                             blockTable[blockNb].cPtr,  blockTable[blockNb].cRoom,
                                             blockTable[blockNb].srcPtr,blockTable[blockNb].srcSize,
                                             cdict);
-                        if (ZSTD_isError(rSize)) EXM_THROW(1, "ZSTD_compress_usingPreparedCCtx() failed : %s", ZSTD_getErrorName(rSize));
+                        if (ZSTD_isError(rSize)) EXM_THROW(1, "ZSTD_compress_usingCDict() failed : %s", ZSTD_getErrorName(rSize));
                         blockTable[blockNb].cSize = rSize;
                     }
                     nbLoops++;
@@ -264,7 +266,7 @@ static int BMK_benchMem(const void* srcBuffer, size_t srcSize,
                             blockTable[blockNb].cPtr, blockTable[blockNb].cSize,
                             ddict);
                         if (ZSTD_isError(regenSize)) {
-                            DISPLAY("ZSTD_decompress_usingPreparedDCtx() failed on block %u : %s  \n",
+                            DISPLAY("ZSTD_decompress_usingDDict() failed on block %u : %s  \n",
                                       blockNb, ZSTD_getErrorName(regenSize));
                             clockLoop = 0;   /* force immediate test end */
                             break;
@@ -321,9 +323,7 @@ static int BMK_benchMem(const void* srcBuffer, size_t srcSize,
     free(blockTable);
     free(compressedBuffer);
     free(resultBuffer);
-    ZSTD_freeCCtx(refCtx);
     ZSTD_freeCCtx(ctx);
-    ZSTD_freeDCtx(refDCtx);
     ZSTD_freeDCtx(dctx);
     return 0;
 }
@@ -352,7 +352,7 @@ static void BMK_benchCLevel(void* srcBuffer, size_t benchedSize,
                             const size_t* fileSizes, unsigned nbFiles,
                             const void* dictBuffer, size_t dictBufferSize)
 {
-    benchResult_t result, total;
+    benchResult_t result;
     int l;
 
     const char* pch = strrchr(displayName, '\\'); /* Windows */
@@ -362,7 +362,6 @@ static void BMK_benchCLevel(void* srcBuffer, size_t benchedSize,
     SET_HIGH_PRIORITY;
 
     memset(&result, 0, sizeof(result));
-    memset(&total, 0, sizeof(total));
 
     if (g_displayLevel == 1 && !g_additionalParam)
         DISPLAY("bench %s %s: input %u bytes, %i iterations, %u KB blocks\n", ZSTD_VERSION_STRING, ZSTD_GIT_COMMIT_STRING, (U32)benchedSize, g_nbIterations, (U32)(g_blockSize>>10));
@@ -379,18 +378,7 @@ static void BMK_benchCLevel(void* srcBuffer, size_t benchedSize,
                 DISPLAY("%-3i%11i (%5.3f) %6.2f MB/s %6.1f MB/s  %s (param=%d)\n", -l, (int)result.cSize, result.ratio, result.cSpeed, result.dSpeed, displayName, g_additionalParam);
             else
                 DISPLAY("%-3i%11i (%5.3f) %6.2f MB/s %6.1f MB/s  %s\n", -l, (int)result.cSize, result.ratio, result.cSpeed, result.dSpeed, displayName);
-            total.cSize += result.cSize;
-            total.cSpeed += result.cSpeed;
-            total.dSpeed += result.dSpeed;
-            total.ratio += result.ratio;
     }   }
-    if (g_displayLevel == 1 && cLevelLast > cLevel) {
-        total.cSize /= 1+cLevelLast-cLevel;
-        total.cSpeed /= 1+cLevelLast-cLevel;
-        total.dSpeed /= 1+cLevelLast-cLevel;
-        total.ratio /= 1+cLevelLast-cLevel;
-        DISPLAY("avg%11i (%5.3f) %6.2f MB/s %6.1f MB/s  %s\n", (int)total.cSize, total.ratio, total.cSpeed, total.dSpeed, displayName);
-    }
 }
 
 
diff --git a/programs/datagen.c b/programs/datagen.c
index ec118f5..6cb5111 100644
--- a/programs/datagen.c
+++ b/programs/datagen.c
@@ -23,12 +23,19 @@
     - source repository : https://github.com/Cyan4973/zstd
 */
 
+/* *************************************
+*  Compiler Options
+***************************************/
+#define _CRT_SECURE_NO_WARNINGS  /* removes Visual warning on strerror() */
+
+
 /*-************************************
-*  Includes
+*  Dependencies
 **************************************/
 #include <stdlib.h>    /* malloc */
 #include <stdio.h>     /* FILE, fwrite, fprintf */
 #include <string.h>    /* memcpy */
+#include <errno.h>     /* errno */
 #include "mem.h"       /* U32 */
 
 
@@ -87,12 +94,10 @@ static void RDG_fillLiteralDistrib(BYTE* ldt, double ld)
     U32 u;
 
     if (ld<=0.0) ld = 0.0;
-    //TRACE(" percent:%5.2f%% \n", ld*100.);
-    //TRACE(" start:(%c)[%02X] ", character, character);
     for (u=0; u<LTSIZE; ) {
         U32 const weight = (U32)((double)(LTSIZE - u) * ld) + 1;
         U32 const end = MIN ( u + weight , LTSIZE);
-        while (u < end) ldt[u++] = character;   // TRACE(" %u(%c)[%02X] ", u, character, character);
+        while (u < end) ldt[u++] = character;
         character++;
         if (character > lastChar) character = firstChar;
     }
@@ -102,9 +107,7 @@ static void RDG_fillLiteralDistrib(BYTE* ldt, double ld)
 static BYTE RDG_genChar(U32* seed, const BYTE* ldt)
 {
     U32 const id = RDG_rand(seed) & LTMASK;
-    //TRACE(" %u : \n", id);
-    //TRACE(" %4u [%4u] ; val : %4u \n", id, id&255, ldt[id]);
-    return (ldt[id]);  /* memory-sanitizer fails here, stating "uninitialized value" when table initialized with 0.0. Checked : table is fully initialized */
+    return ldt[id];  /* memory-sanitizer fails here, stating "uninitialized value" when table initialized with P==0.0. Checked : table is fully initialized */
 }
 
 
@@ -115,8 +118,7 @@ static U32 RDG_rand15Bits (unsigned* seedPtr)
 
 static U32 RDG_randLength(unsigned* seedPtr)
 {
-    if (RDG_rand(seedPtr) & 7)
-        return (RDG_rand(seedPtr) & 0xF);
+    if (RDG_rand(seedPtr) & 7) return (RDG_rand(seedPtr) & 0xF);   /* small length */
     return (RDG_rand(seedPtr) & 0x1FF) + 0xF;
 }
 
@@ -156,7 +158,6 @@ void RDG_genBlock(void* buffer, size_t buffSize, size_t prefixSize, double match
             U32 const randOffset = RDG_rand15Bits(seedPtr) + 1;
             U32 const offset = repeatOffset ? prevOffset : (U32) MIN(randOffset , pos);
             size_t match = pos - offset;
-            //TRACE("pos : %u; offset: %u ; length : %u \n", (U32)pos, offset, length);
             while (pos < d) buffPtr[pos++] = buffPtr[match++];   /* correctly manages overlaps */
             prevOffset = offset;
         } else {
@@ -171,9 +172,8 @@ void RDG_genBlock(void* buffer, size_t buffSize, size_t prefixSize, double match
 void RDG_genBuffer(void* buffer, size_t size, double matchProba, double litProba, unsigned seed)
 {
     BYTE ldt[LTSIZE];
-    memset(ldt, '0', sizeof(ldt));
+    memset(ldt, '0', sizeof(ldt));  /* yes, character '0', this is intentional */
     if (litProba<=0.0) litProba = matchProba / 4.5;
-    //TRACE(" percent:%5.2f%% \n", litProba*100.);
     RDG_fillLiteralDistrib(ldt, litProba);
     RDG_genBlock(buffer, size, 0, matchProba, ldt, &seed);
 }
@@ -185,12 +185,12 @@ void RDG_genStdout(unsigned long long size, double matchProba, double litProba,
     size_t const stdDictSize = 32 KB;
     BYTE* const buff = (BYTE*)malloc(stdDictSize + stdBlockSize);
     U64 total = 0;
-    BYTE ldt[LTSIZE];
+    BYTE ldt[LTSIZE];   /* literals distribution table */
 
     /* init */
-    if (buff==NULL) { fprintf(stdout, "not enough memory\n"); exit(1); }
+    if (buff==NULL) { fprintf(stderr, "datagen: error: %s \n", strerror(errno)); exit(1); }
     if (litProba<=0.0) litProba = matchProba / 4.5;
-    memset(ldt, '0', sizeof(ldt));
+    memset(ldt, '0', sizeof(ldt));   /* yes, character '0', this is intentional */
     RDG_fillLiteralDistrib(ldt, litProba);
     SET_BINARY_MODE(stdout);
 
diff --git a/programs/datagencli.c b/programs/datagencli.c
index d437d5c..c4fa7f7 100644
--- a/programs/datagencli.c
+++ b/programs/datagencli.c
@@ -39,7 +39,7 @@
 #define MB *(1 <<20)
 #define GB *(1U<<30)
 
-#define SIZE_DEFAULT (64 KB)
+#define SIZE_DEFAULT ((64 KB) + 1)
 #define SEED_DEFAULT 0
 #define COMPRESSIBILITY_DEFAULT 50
 
@@ -72,15 +72,13 @@ static int usage(const char* programName)
 
 int main(int argc, const char** argv)
 {
-    int argNb;
     double proba = (double)COMPRESSIBILITY_DEFAULT / 100;
     double litProba = 0.0;
     U64 size = SIZE_DEFAULT;
     U32 seed = SEED_DEFAULT;
-    const char* programName;
+    const char* const programName = argv[0];
 
-    /* Check command line */
-    programName = argv[0];
+    int argNb;
     for(argNb=1; argNb<argc; argNb++) {
         const char* argument = argv[argNb];
 
diff --git a/programs/dibio.c b/programs/dibio.c
index d23476e..cb864ec 100644
--- a/programs/dibio.c
+++ b/programs/dibio.c
@@ -30,6 +30,7 @@
 #include <string.h>         /* memset */
 #include <stdio.h>          /* fprintf, fopen, ftello64 */
 #include <time.h>           /* clock_t, clock, CLOCKS_PER_SEC */
+#include <errno.h>          /* errno */
 
 #include "mem.h"            /* read */
 #include "error_private.h"
@@ -43,13 +44,10 @@
 #define MB *(1 <<20)
 #define GB *(1U<<30)
 
-#define DICTLISTSIZE 10000
 #define MEMMULT 11
 static const size_t maxMemory = (sizeof(size_t) == 4) ? (2 GB - 64 MB) : ((size_t)(512 MB) << sizeof(size_t));
 
 #define NOISELENGTH 32
-#define PRIME1   2654435761U
-#define PRIME2   2246822519U
 
 
 /*-*************************************
@@ -60,17 +58,13 @@ static const size_t maxMemory = (sizeof(size_t) == 4) ? (2 GB - 64 MB) : ((size_
 static unsigned g_displayLevel = 0;   /* 0 : no display;   1: errors;   2: default;  4: full information */
 
 #define DISPLAYUPDATE(l, ...) if (g_displayLevel>=l) { \
-            if ((DIB_GetMilliSpan(g_time) > refreshRate) || (g_displayLevel>=4)) \
+            if ((DIB_clockSpan(g_time) > refreshRate) || (g_displayLevel>=4)) \
             { g_time = clock(); DISPLAY(__VA_ARGS__); \
             if (g_displayLevel>=4) fflush(stdout); } }
-static const unsigned refreshRate = 150;
+static const clock_t refreshRate = CLOCKS_PER_SEC * 2 / 10;
 static clock_t g_time = 0;
 
-static unsigned DIB_GetMilliSpan(clock_t nPrevious)
-{
-    clock_t const nCurrent = clock();
-    return (unsigned)(((nCurrent - nPrevious) * 1000) / CLOCKS_PER_SEC);
-}
+static clock_t DIB_clockSpan(clock_t nPrevious) { return clock() - nPrevious; }
 
 
 /*-*************************************
@@ -97,13 +91,15 @@ unsigned DiB_isError(size_t errorCode) { return ERR_isError(errorCode); }
 
 const char* DiB_getErrorName(size_t errorCode) { return ERR_getErrorName(errorCode); }
 
+#define MIN(a,b)   ( (a) < (b) ? (a) : (b) )
+
 
 /* ********************************************************
 *  File related operations
 **********************************************************/
 /** DiB_loadFiles() :
 *   @return : nb of files effectively loaded into `buffer` */
-static unsigned DiB_loadFiles(void* buffer, size_t bufferSize,
+static unsigned DiB_loadFiles(void* buffer, size_t* bufferSizePtr,
                               size_t* fileSizes,
                               const char** fileNamesTable, unsigned nbFiles)
 {
@@ -112,18 +108,20 @@ static unsigned DiB_loadFiles(void* buffer, size_t bufferSize,
     unsigned n;
 
     for (n=0; n<nbFiles; n++) {
-        unsigned long long const fs64 = UTIL_getFileSize(fileNamesTable[n]);
-        size_t const fileSize = (size_t)(fs64 > bufferSize-pos ? 0 : fs64);
-        FILE* const f = fopen(fileNamesTable[n], "rb");
-        if (f==NULL) EXM_THROW(10, "impossible to open file %s", fileNamesTable[n]);
-        DISPLAYUPDATE(2, "Loading %s...       \r", fileNamesTable[n]);
-        { size_t const readSize = fread(buff+pos, 1, fileSize, f);
-          if (readSize != fileSize) EXM_THROW(11, "could not read %s", fileNamesTable[n]);
-          pos += readSize; }
-        fileSizes[n] = fileSize;
-        fclose(f);
-        if (fileSize == 0) break;  /* stop there, not enough memory to load all files */
-    }
+        const char* const fileName = fileNamesTable[n];
+        unsigned long long const fs64 = UTIL_getFileSize(fileName);
+        size_t const fileSize = (size_t) MIN(fs64, 128 KB);
+        if (fileSize > *bufferSizePtr-pos) break;
+        {   FILE* const f = fopen(fileName, "rb");
+            if (f==NULL) EXM_THROW(10, "zstd: dictBuilder: %s %s ", fileName, strerror(errno));
+            DISPLAYUPDATE(2, "Loading %s...       \r", fileName);
+            { size_t const readSize = fread(buff+pos, 1, fileSize, f);
+              if (readSize != fileSize) EXM_THROW(11, "Pb reading %s", fileName);
+              pos += readSize; }
+            fileSizes[n] = fileSize;
+            fclose(f);
+    }   }
+    *bufferSizePtr = pos;
     return n;
 }
 
@@ -137,26 +135,28 @@ static size_t DiB_findMaxMem(unsigned long long requiredMem)
     void* testmem = NULL;
 
     requiredMem = (((requiredMem >> 23) + 1) << 23);
-    requiredMem += 2 * step;
+    requiredMem += step;
     if (requiredMem > maxMemory) requiredMem = maxMemory;
 
     while (!testmem) {
-        requiredMem -= step;
         testmem = malloc((size_t)requiredMem);
+        requiredMem -= step;
     }
 
     free(testmem);
-    return (size_t)(requiredMem - step);
+    return (size_t)requiredMem;
 }
 
 
 static void DiB_fillNoise(void* buffer, size_t length)
 {
-    unsigned acc = PRIME1;
+    unsigned const prime1 = 2654435761U;
+    unsigned const prime2 = 2246822519U;
+    unsigned acc = prime1;
     size_t p=0;;
 
     for (p=0; p<length; p++) {
-        acc *= PRIME2;
+        acc *= prime2;
         ((unsigned char*)buffer)[p] = (unsigned char)(acc >> 21);
     }
 }
@@ -188,7 +188,6 @@ size_t ZDICT_trainFromBuffer_unsafe(void* dictBuffer, size_t dictBufferCapacity,
                               ZDICT_params_t parameters);
 
 
-#define MIN(a,b)  ((a)<(b)?(a):(b))
 int DiB_trainFromFiles(const char* dictFileName, unsigned maxDictSize,
                        const char** fileNamesTable, unsigned nbFiles,
                        ZDICT_params_t params)
@@ -197,20 +196,27 @@ int DiB_trainFromFiles(const char* dictFileName, unsigned maxDictSize,
     size_t* const fileSizes = (size_t*)malloc(nbFiles * sizeof(size_t));
     unsigned long long const totalSizeToLoad = UTIL_getTotalFileSize(fileNamesTable, nbFiles);
     size_t const maxMem =  DiB_findMaxMem(totalSizeToLoad * MEMMULT) / MEMMULT;
-    size_t const benchedSize = MIN (maxMem, (size_t)totalSizeToLoad);
+    size_t benchedSize = MIN (maxMem, (size_t)totalSizeToLoad);
     void* const srcBuffer = malloc(benchedSize+NOISELENGTH);
     int result = 0;
 
     /* Checks */
     if ((!fileSizes) || (!srcBuffer) || (!dictBuffer)) EXM_THROW(12, "not enough memory for DiB_trainFiles");   /* should not happen */
+    g_displayLevel = params.notificationLevel;
+    if (nbFiles < 5) {
+        DISPLAYLEVEL(2, "!  Warning : nb of samples too low for proper processing \n");
+        DISPLAYLEVEL(2, "!  Please provide one file per sample \n");
+        DISPLAYLEVEL(2, "!  Avoid concatenating multiple samples into a single file \n");
+        DISPLAYLEVEL(2, "!  otherwise, dictBuilder will be unable to find the beginning of each sample \n");
+        DISPLAYLEVEL(2, "!  resulting in distorted statistics \n");
+    }
 
     /* init */
-    g_displayLevel = params.notificationLevel;
     if (benchedSize < totalSizeToLoad)
         DISPLAYLEVEL(1, "Not enough memory; training on %u MB only...\n", (unsigned)(benchedSize >> 20));
 
     /* Load input buffer */
-    nbFiles = DiB_loadFiles(srcBuffer, benchedSize, fileSizes, fileNamesTable, nbFiles);
+    nbFiles = DiB_loadFiles(srcBuffer, &benchedSize, fileSizes, fileNamesTable, nbFiles);
     DiB_fillNoise((char*)srcBuffer + benchedSize, NOISELENGTH);   /* guard band, for end of buffer condition */
 
     {   size_t const dictSize = ZDICT_trainFromBuffer_unsafe(dictBuffer, maxDictSize,
diff --git a/programs/fileio.c b/programs/fileio.c
index 5e7b26d..b04ee3b 100644
--- a/programs/fileio.c
+++ b/programs/fileio.c
@@ -41,13 +41,14 @@
 /* *************************************
 *  Compiler Options
 ***************************************/
-#define _POSIX_SOURCE 1        /* enable %llu on Windows */
+#define _POSIX_SOURCE 1          /* enable %llu on Windows */
+#define _CRT_SECURE_NO_WARNINGS  /* removes Visual warning on strerror() */
 
 
 /*-*************************************
 *  Includes
 ***************************************/
-#include "util.h"       /* Compiler options, UTIL_GetFileSize */
+#include "util.h"       /* Compiler options, UTIL_getFileSize, _LARGEFILE64_SOURCE */
 #include <stdio.h>      /* fprintf, fopen, fread, _fileno, stdin, stdout */
 #include <stdlib.h>     /* malloc, free */
 #include <string.h>     /* strcmp, strlen */
@@ -58,7 +59,6 @@
 #include "fileio.h"
 #define ZSTD_STATIC_LINKING_ONLY   /* ZSTD_magicNumber, ZSTD_frameHeaderSize_max */
 #include "zstd.h"
-#include "zstd_internal.h" /* MIN, KB, MB */
 #define ZBUFF_STATIC_LINKING_ONLY
 #include "zbuff.h"
 
@@ -84,6 +84,10 @@
 /*-*************************************
 *  Constants
 ***************************************/
+#define KB *(1<<10)
+#define MB *(1<<20)
+#define GB *(1U<<30)
+
 #define _1BIT  0x01
 #define _2BITS 0x03
 #define _3BITS 0x07
@@ -113,21 +117,17 @@ static U32 g_displayLevel = 2;   /* 0 : no display;   1: errors;   2 : + result
 void FIO_setNotificationLevel(unsigned level) { g_displayLevel=level; }
 
 #define DISPLAYUPDATE(l, ...) if (g_displayLevel>=l) { \
-            if ((FIO_GetMilliSpan(g_time) > refreshRate) || (g_displayLevel>=4)) \
+            if ((clock() - g_time > refreshRate) || (g_displayLevel>=4)) \
             { g_time = clock(); DISPLAY(__VA_ARGS__); \
             if (g_displayLevel>=4) fflush(stdout); } }
-static const unsigned refreshRate = 150;
+static const clock_t refreshRate = CLOCKS_PER_SEC * 15 / 100;
 static clock_t g_time = 0;
 
-static unsigned FIO_GetMilliSpan(clock_t nPrevious)
-{
-    clock_t const nCurrent = clock();
-    return (unsigned)(((nCurrent - nPrevious) * 1000) / CLOCKS_PER_SEC);
-}
+#define MIN(a,b)    ((a) < (b) ? (a) : (b))
 
 
 /*-*************************************
-*  Local Parameters
+*  Local Parameters - Not thread safe
 ***************************************/
 static U32 g_overwrite = 0;
 void FIO_overwriteMode(void) { g_overwrite=1; }
@@ -175,12 +175,12 @@ static FILE* FIO_openSrcFile(const char* srcFileName)
         f = fopen(srcFileName, "rb");
     }
 
-    if ( f==NULL ) DISPLAYLEVEL(1, "zstd: %s: No such file\n", srcFileName);
+    if ( f==NULL ) DISPLAYLEVEL(1, "zstd: %s: %s \n", srcFileName, strerror(errno));
 
     return f;
 }
 
-
+/* `dstFileName` must be non-NULL */
 static FILE* FIO_openDstFile(const char* dstFileName)
 {
     FILE* f;
@@ -201,18 +201,20 @@ static FILE* FIO_openDstFile(const char* dstFileName)
                 if (g_displayLevel <= 1) {
                     /* No interaction possible */
                     DISPLAY("zstd: %s already exists; not overwritten  \n", dstFileName);
-                    return 0;
+                    return NULL;
                 }
                 DISPLAY("zstd: %s already exists; do you wish to overwrite (y/N) ? ", dstFileName);
                 {   int ch = getchar();
                     if ((ch!='Y') && (ch!='y')) {
                         DISPLAY("    not overwritten  \n");
-                        return 0;
+                        return NULL;
                     }
                     while ((ch!=EOF) && (ch!='\n')) ch = getchar();  /* flush rest of input line */
         }   }   }
         f = fopen( dstFileName, "wb" );
     }
+
+    if (f==NULL) DISPLAYLEVEL(1, "zstd: %s: %s\n", dstFileName, strerror(errno));
     return f;
 }
 
@@ -233,18 +235,18 @@ static size_t FIO_loadFile(void** bufferPtr, const char* fileName)
 
     DISPLAYLEVEL(4,"Loading %s as dictionary \n", fileName);
     fileHandle = fopen(fileName, "rb");
-    if (fileHandle==0) EXM_THROW(31, "Error opening file %s", fileName);
+    if (fileHandle==0) EXM_THROW(31, "zstd: %s: %s", fileName, strerror(errno));
     fileSize = UTIL_getFileSize(fileName);
     if (fileSize > MAX_DICT_SIZE) {
         int seekResult;
         if (fileSize > 1 GB) EXM_THROW(32, "Dictionary file %s is too large", fileName);   /* avoid extreme cases */
         DISPLAYLEVEL(2,"Dictionary %s is too large : using last %u bytes only \n", fileName, MAX_DICT_SIZE);
         seekResult = fseek(fileHandle, (long int)(fileSize-MAX_DICT_SIZE), SEEK_SET);   /* use end of file */
-        if (seekResult != 0) EXM_THROW(33, "Error seeking into file %s", fileName);
+        if (seekResult != 0) EXM_THROW(33, "zstd: %s: %s", fileName, strerror(errno));
         fileSize = MAX_DICT_SIZE;
     }
-    *bufferPtr = (BYTE*)malloc((size_t)fileSize);
-    if (*bufferPtr==NULL) EXM_THROW(34, "Allocation error : not enough memory for dictBuffer");
+    *bufferPtr = malloc((size_t)fileSize);
+    if (*bufferPtr==NULL) EXM_THROW(34, "zstd: %s", strerror(errno));
     { size_t const readSize = fread(*bufferPtr, 1, (size_t)fileSize, fileHandle);
       if (readSize!=fileSize) EXM_THROW(35, "Error reading dictionary file %s", fileName); }
     fclose(fileHandle);
@@ -271,16 +273,15 @@ typedef struct {
 static cRess_t FIO_createCResources(const char* dictFileName)
 {
     cRess_t ress;
+    memset(&ress, 0, sizeof(ress));
 
     ress.ctx = ZBUFF_createCCtx();
-    if (ress.ctx == NULL) EXM_THROW(30, "Allocation error : can't create ZBUFF context");
-
-    /* Allocate Memory */
+    if (ress.ctx == NULL) EXM_THROW(30, "zstd: allocation error : can't create ZBUFF context");
     ress.srcBufferSize = ZBUFF_recommendedCInSize();
     ress.srcBuffer = malloc(ress.srcBufferSize);
     ress.dstBufferSize = ZBUFF_recommendedCOutSize();
     ress.dstBuffer = malloc(ress.dstBufferSize);
-    if (!ress.srcBuffer || !ress.dstBuffer) EXM_THROW(31, "Allocation error : not enough memory");
+    if (!ress.srcBuffer || !ress.dstBuffer) EXM_THROW(31, "zstd: allocation error : not enough memory");
 
     /* dictionary */
     ress.dictBufferSize = FIO_loadFile(&(ress.dictBuffer), dictFileName);
@@ -295,7 +296,7 @@ static void FIO_freeCResources(cRess_t ress)
     free(ress.dstBuffer);
     free(ress.dictBuffer);
     errorCode = ZBUFF_freeCCtx(ress.ctx);
-    if (ZBUFF_isError(errorCode)) EXM_THROW(38, "Error : can't release ZBUFF context resource : %s", ZBUFF_getErrorName(errorCode));
+    if (ZBUFF_isError(errorCode)) EXM_THROW(38, "zstd: error : can't release ZBUFF context resource : %s", ZBUFF_getErrorName(errorCode));
 }
 
 
@@ -315,9 +316,7 @@ static int FIO_compressFilename_internal(cRess_t ress,
     U64 const fileSize = UTIL_getFileSize(srcFileName);
 
     /* init */
-    {   ZSTD_parameters params;
-        memset(&params, 0, sizeof(params));
-        params.cParams = ZSTD_getCParams(cLevel, fileSize, ress.dictBufferSize);
+    {   ZSTD_parameters params = ZSTD_getParams(cLevel, fileSize, ress.dictBufferSize);
         params.fParams.contentSizeFlag = 1;
         params.fParams.checksumFlag = g_checksumFlag;
         params.fParams.noDictIDFlag = !g_dictIDFlag;
@@ -330,7 +329,6 @@ static int FIO_compressFilename_internal(cRess_t ress,
     }   }
 
     /* Main compression loop */
-    readsize = 0;
     while (1) {
         /* Fill input Buffer */
         size_t const inSize = fread(ress.srcBuffer, (size_t)1, ress.srcBufferSize, srcFile);
@@ -338,8 +336,8 @@ static int FIO_compressFilename_internal(cRess_t ress,
         readsize += inSize;
         DISPLAYUPDATE(2, "\rRead : %u MB  ", (U32)(readsize>>20));
 
-        {   /* Compress using buffered streaming */
-            size_t usedInSize = inSize;
+        /* Compress using buffered streaming */
+        {   size_t usedInSize = inSize;
             size_t cSize = ress.dstBufferSize;
             { size_t const result = ZBUFF_compressContinue(ress.ctx, ress.dstBuffer, &cSize, ress.srcBuffer, &usedInSize);
               if (ZBUFF_isError(result)) EXM_THROW(23, "Compression error : %s ", ZBUFF_getErrorName(result)); }
@@ -366,17 +364,19 @@ static int FIO_compressFilename_internal(cRess_t ress,
     }
 
     /* Status */
+    if (strlen(srcFileName) > 20) srcFileName += strlen(srcFileName)-20; /* display last 20 characters */
     DISPLAYLEVEL(2, "\r%79s\r", "");
-    DISPLAYLEVEL(2,"%-20.20s :%6.2f%%   (%6llu =>%6llu bytes, %s) \n", srcFileName,
-        (double)compressedfilesize/readsize*100, (unsigned long long)readsize, (unsigned long long) compressedfilesize,
-                 dstFileName);
+    DISPLAYLEVEL(2,"%-20.20s :%6.2f%%   (%6llu => %6llu bytes, %s) \n", srcFileName,
+        (double)compressedfilesize/(readsize+(!readsize) /* avoid div by zero */ )*100,
+        (unsigned long long)readsize, (unsigned long long) compressedfilesize,
+         dstFileName);
 
     return 0;
 }
 
 
-/*! FIO_compressFilename_internal() :
- *  same as FIO_compressFilename_extRess(), with ress.destFile already opened (typically stdout)
+/*! FIO_compressFilename_srcFile() :
+ *  note : ress.dstFile already opened
  *  @return : 0 : compression completed correctly,
  *            1 : missing or pb opening srcFileName
  */
@@ -397,7 +397,7 @@ static int FIO_compressFilename_srcFile(cRess_t ress,
     result = FIO_compressFilename_internal(ress, dstFileName, srcFileName, cLevel);
 
     fclose(ress.srcFile);
-    if ((g_removeSrcFile) && (!result)) remove(srcFileName);
+    if ((g_removeSrcFile) && (!result)) { if (remove(srcFileName)) EXM_THROW(1, "zstd: %s: %s", srcFileName, strerror(errno)); }
     return result;
 }
 
@@ -417,8 +417,8 @@ static int FIO_compressFilename_dstFile(cRess_t ress,
 
     result = FIO_compressFilename_srcFile(ress, dstFileName, srcFileName, cLevel);
 
-    if (fclose(ress.dstFile)) EXM_THROW(28, "Write error : cannot properly close %s", dstFileName);
-    if (result!=0) remove(dstFileName);   /* remove operation artefact */
+    if (fclose(ress.dstFile)) { DISPLAYLEVEL(1, "zstd: %s: %s \n", dstFileName, strerror(errno)); result=1; }
+    if (result!=0) { if (remove(dstFileName)) EXM_THROW(1, "zstd: %s: %s", dstFileName, strerror(errno)); }  /* remove operation artefact */
     return result;
 }
 
@@ -429,13 +429,13 @@ int FIO_compressFilename(const char* dstFileName, const char* srcFileName,
     clock_t const start = clock();
 
     cRess_t const ress = FIO_createCResources(dictFileName);
-    int const issueWithSrcFile = FIO_compressFilename_dstFile(ress, dstFileName, srcFileName, compressionLevel);
-    FIO_freeCResources(ress);
+    int const result = FIO_compressFilename_dstFile(ress, dstFileName, srcFileName, compressionLevel);
 
-    {   double const seconds = (double)(clock() - start) / CLOCKS_PER_SEC;
-        DISPLAYLEVEL(4, "Completed in %.2f sec \n", seconds);
-    }
-    return issueWithSrcFile;
+    double const seconds = (double)(clock() - start) / CLOCKS_PER_SEC;
+    DISPLAYLEVEL(4, "Completed in %.2f sec \n", seconds);
+
+    FIO_freeCResources(ress);
+    return result;
 }
 
 
@@ -444,13 +444,14 @@ int FIO_compressMultipleFilenames(const char** inFileNamesTable, unsigned nbFile
                                   const char* dictFileName, int compressionLevel)
 {
     int missed_files = 0;
-    char*  dstFileName = (char*)malloc(FNSPACE);
     size_t dfnSize = FNSPACE;
+    char*  dstFileName = (char*)malloc(FNSPACE);
     size_t const suffixSize = suffix ? strlen(suffix) : 0;
-    cRess_t ress;
+    cRess_t ress = FIO_createCResources(dictFileName);
 
     /* init */
-    ress = FIO_createCResources(dictFileName);
+    if (dstFileName==NULL) EXM_THROW(27, "FIO_compressMultipleFilenames : allocation error for dstFileName");
+    if (suffix == NULL) EXM_THROW(28, "FIO_compressMultipleFilenames : dst unknown");  /* should never happen */
 
     /* loop on each file */
     if (!strcmp(suffix, stdoutmark)) {
@@ -502,12 +503,11 @@ typedef struct {
 static dRess_t FIO_createDResources(const char* dictFileName)
 {
     dRess_t ress;
+    memset(&ress, 0, sizeof(ress));
 
-    /* init */
+    /* Allocation */
     ress.dctx = ZBUFF_createDCtx();
     if (ress.dctx==NULL) EXM_THROW(60, "Can't create ZBUFF decompression context");
-
-    /* Allocate Memory */
     ress.srcBufferSize = ZBUFF_recommendedDInSize();
     ress.srcBuffer = malloc(ress.srcBufferSize);
     ress.dstBufferSize = ZBUFF_recommendedDOutSize();
@@ -636,13 +636,12 @@ unsigned long long FIO_decompressFrame(dRess_t ress,
         DISPLAYUPDATE(2, "\rDecoded : %u MB...     ", (U32)(frameSize>>20) );
 
         if (toRead == 0) break;   /* end of frame */
-        if (readSize) EXM_THROW(38, "Decoding error : should consume entire input");
+        if (readSize) EXM_THROW(37, "Decoding error : should consume entire input");
 
         /* Fill input buffer */
-        if (toRead > ress.srcBufferSize) EXM_THROW(34, "too large block");
+        if (toRead > ress.srcBufferSize) EXM_THROW(38, "too large block");
         readSize = fread(ress.srcBuffer, 1, toRead, finput);
-        if (readSize != toRead)
-            EXM_THROW(35, "Read error");
+        if (readSize == 0) EXM_THROW(39, "Read error : premature end");
     }
 
     FIO_fwriteSparseEnd(foutput, storedSkips);
@@ -683,6 +682,7 @@ static int FIO_decompressSrcFile(dRess_t ress, const char* srcFileName)
     unsigned long long filesize = 0;
     FILE* const dstFile = ress.dstFile;
     FILE* srcFile;
+    unsigned readSomething = 0;
 
     if (UTIL_isDirectory(srcFileName)) {
         DISPLAYLEVEL(1, "zstd: %s is a directory -- ignored \n", srcFileName);
@@ -696,20 +696,27 @@ static int FIO_decompressSrcFile(dRess_t ress, const char* srcFileName)
         /* check magic number -> version */
         size_t const toRead = 4;
         size_t const sizeCheck = fread(ress.srcBuffer, (size_t)1, toRead, srcFile);
-        if (sizeCheck==0) break;   /* no more input */
-        if (sizeCheck != toRead) EXM_THROW(31, "zstd: %s read error : cannot read header", srcFileName);
+        if (sizeCheck==0) {
+            if (readSomething==0) { DISPLAY("zstd: %s: unexpected end of file \n", srcFileName); fclose(srcFile); return 1; }  /* srcFileName is empty */
+            break;   /* no more input */
+        }
+        readSomething = 1;
+        if (sizeCheck != toRead) { DISPLAY("zstd: %s: unknown header \n", srcFileName); fclose(srcFile); return 1; }  /* srcFileName is empty */
         {   U32 const magic = MEM_readLE32(ress.srcBuffer);
 #if defined(ZSTD_LEGACY_SUPPORT) && (ZSTD_LEGACY_SUPPORT>=1)
-            if (ZSTD_isLegacy(magic)) {
+            if (ZSTD_isLegacy(ress.srcBuffer, 4)) {
                 filesize += FIO_decompressLegacyFrame(dstFile, srcFile, ress.dictBuffer, ress.dictBufferSize, magic);
                 continue;
             }
 #endif
-            if (((magic & 0xFFFFFFF0U) != ZSTD_MAGIC_SKIPPABLE_START) && (magic != ZSTD_MAGICNUMBER)) {
-                if (g_overwrite)   /* -df : pass-through mode */
-                    return FIO_passThrough(dstFile, srcFile, ress.srcBuffer, ress.srcBufferSize);
-                else {
+            if (((magic & 0xFFFFFFF0U) != ZSTD_MAGIC_SKIPPABLE_START) & (magic != ZSTD_MAGICNUMBER)) {
+                if ((g_overwrite) && !strcmp (srcFileName, stdinmark)) {  /* pass-through mode */
+                    unsigned const result = FIO_passThrough(dstFile, srcFile, ress.srcBuffer, ress.srcBufferSize);
+                    if (fclose(srcFile)) EXM_THROW(32, "zstd: %s close error", srcFileName);  /* error should never happen */
+                    return result;
+                } else {
                     DISPLAYLEVEL(1, "zstd: %s: not in zstd format \n", srcFileName);
+                    fclose(srcFile);
                     return 1;
         }   }   }
         filesize += FIO_decompressFrame(ress, dstFile, srcFile, toRead);
@@ -720,8 +727,8 @@ static int FIO_decompressSrcFile(dRess_t ress, const char* srcFileName)
     DISPLAYLEVEL(2, "%-20.20s: %llu bytes \n", srcFileName, filesize);
 
     /* Close */
-    fclose(srcFile);
-    if (g_removeSrcFile) remove(srcFileName);
+    if (fclose(srcFile)) EXM_THROW(33, "zstd: %s close error", srcFileName);  /* error should never happen */
+    if (g_removeSrcFile) { if (remove(srcFileName)) EXM_THROW(34, "zstd: %s: %s", srcFileName, strerror(errno)); };
     return 0;
 }
 
@@ -741,7 +748,7 @@ static int FIO_decompressDstFile(dRess_t ress,
     result = FIO_decompressSrcFile(ress, srcFileName);
 
     if (fclose(ress.dstFile)) EXM_THROW(38, "Write error : cannot properly close %s", dstFileName);
-    if (result != 0) remove(dstFileName);
+    if (result != 0) if (remove(dstFileName)) result=1;   /* don't do anything if remove fails */
     return result;
 }
 
@@ -768,19 +775,21 @@ int FIO_decompressMultipleFilenames(const char** srcNamesTable, unsigned nbFiles
     int missingFiles = 0;
     dRess_t ress = FIO_createDResources(dictFileName);
 
+    if (suffix==NULL) EXM_THROW(70, "zstd: decompression: unknown dst");   /* should never happen */
+
     if (!strcmp(suffix, stdoutmark) || !strcmp(suffix, nulmark)) {
         unsigned u;
         ress.dstFile = FIO_openDstFile(suffix);
         if (ress.dstFile == 0) EXM_THROW(71, "cannot open %s", suffix);
         for (u=0; u<nbFiles; u++)
             missingFiles += FIO_decompressSrcFile(ress, srcNamesTable[u]);
-        if (fclose(ress.dstFile)) EXM_THROW(39, "Write error : cannot properly close %s", stdoutmark);
+        if (fclose(ress.dstFile)) EXM_THROW(72, "Write error : cannot properly close %s", stdoutmark);
     } else {
-        size_t const suffixSize = suffix ? strlen(suffix) : 0;
+        size_t const suffixSize = strlen(suffix);
         size_t dfnSize = FNSPACE;
         unsigned u;
         char* dstFileName = (char*)malloc(FNSPACE);
-        if (dstFileName==NULL) EXM_THROW(70, "not enough memory for dstFileName");
+        if (dstFileName==NULL) EXM_THROW(73, "not enough memory for dstFileName");
         for (u=0; u<nbFiles; u++) {   /* create dstFileName */
             const char* const srcFileName = srcNamesTable[u];
             size_t const sfnSize = strlen(srcFileName);
@@ -789,7 +798,7 @@ int FIO_decompressMultipleFilenames(const char** srcNamesTable, unsigned nbFiles
                 free(dstFileName);
                 dfnSize = sfnSize + 20;
                 dstFileName = (char*)malloc(dfnSize);
-                if (dstFileName==NULL) EXM_THROW(71, "not enough memory for dstFileName");
+                if (dstFileName==NULL) EXM_THROW(74, "not enough memory for dstFileName");
             }
             if (sfnSize <= suffixSize || strcmp(suffixPtr, suffix) != 0) {
                 DISPLAYLEVEL(1, "zstd: %s: unknown suffix (%4s expected) -- ignored \n", srcFileName, suffix);
diff --git a/programs/fileio.h b/programs/fileio.h
index 4a4f3d2..06d977d 100644
--- a/programs/fileio.h
+++ b/programs/fileio.h
@@ -31,7 +31,6 @@ extern "C" {
 /* *************************************
 *  Special i/o constants
 **************************************/
-#define nullString "null"
 #define stdinmark "stdin"
 #define stdoutmark "stdout"
 #ifdef _WIN32
diff --git a/programs/fullbench.c b/programs/fullbench.c
index 01e8f59..f6852f6 100644
--- a/programs/fullbench.c
+++ b/programs/fullbench.c
@@ -31,8 +31,9 @@
 #include <time.h>        /* clock_t, clock, CLOCKS_PER_SEC */
 
 #include "mem.h"
+#include "zstd_internal.h"   /* ZSTD_blockHeaderSize, blockType_e, KB, MB */
 #define ZSTD_STATIC_LINKING_ONLY  /* ZSTD_compressBegin, ZSTD_compressContinue, etc. */
-#include "zstd.h"        /* ZSTD_VERSION_STRING */
+#include "zstd.h"            /* ZSTD_VERSION_STRING */
 #define FSE_STATIC_LINKING_ONLY   /* FSE_DTABLE_SIZE_U32 */
 #include "fse.h"
 #include "zbuff.h"
@@ -46,10 +47,6 @@
 #define AUTHOR "Yann Collet"
 #define WELCOME_MESSAGE "*** %s %s %i-bits, by %s (%s) ***\n", PROGRAM_DESCRIPTION, ZSTD_VERSION_STRING, (int)(sizeof(void*)*8), AUTHOR, __DATE__
 
-
-#define KB *(1<<10)
-#define MB *(1<<20)
-
 #define NBLOOPS    6
 #define TIMELOOP_S 2
 
@@ -110,9 +107,8 @@ static size_t BMK_findMaxMem(U64 requiredMem)
 /*_*******************************************************
 *  Benchmark wrappers
 *********************************************************/
-typedef enum { bt_compressed, bt_raw, bt_rle, bt_end } blockType_t;
 typedef struct {
-    blockType_t blockType;
+    blockType_e blockType;
     U32 unusedBits;
     U32 origSize;
 } blockProperties_t;
@@ -177,12 +173,9 @@ static size_t local_ZBUFF_decompress(void* dst, size_t dstCapacity, void* buff2,
 static ZSTD_CCtx* g_zcc = NULL;
 size_t local_ZSTD_compressContinue(void* dst, size_t dstCapacity, void* buff2, const void* src, size_t srcSize)
 {
-    size_t compressedSize;
     (void)buff2;
     ZSTD_compressBegin(g_zcc, 1);
-    compressedSize = ZSTD_compressContinue(g_zcc, dst, dstCapacity, src, srcSize);
-    compressedSize += ZSTD_compressEnd(g_zcc, ((char*)dst)+compressedSize, dstCapacity-compressedSize);
-    return compressedSize;
+    return ZSTD_compressEnd(g_zcc, dst, dstCapacity, src, srcSize);
 }
 
 size_t local_ZSTD_decompressContinue(void* dst, size_t dstCapacity, void* buff2, const void* src, size_t srcSize)
@@ -214,8 +207,8 @@ size_t local_ZSTD_decompressContinue(void* dst, size_t dstCapacity, void* buff2,
 static size_t benchMem(const void* src, size_t srcSize, U32 benchNb)
 {
     BYTE*  dstBuff;
-    size_t dstBuffSize;
-    BYTE*  buff2;
+    size_t const dstBuffSize = ZSTD_compressBound(srcSize);
+    void*  buff2;
     const char* benchName;
     size_t (*benchFunction)(void* dst, size_t dstSize, void* verifBuff, const void* src, size_t srcSize);
     double bestTime = 100000000.;
@@ -252,9 +245,8 @@ static size_t benchMem(const void* src, size_t srcSize, U32 benchNb)
     }
 
     /* Allocation */
-    dstBuffSize = ZSTD_compressBound(srcSize);
     dstBuff = (BYTE*)malloc(dstBuffSize);
-    buff2 = (BYTE*)malloc(dstBuffSize);
+    buff2 = malloc(dstBuffSize);
     if ((!dstBuff) || (!buff2)) {
         DISPLAY("\nError: not enough memory!\n");
         free(dstBuff); free(buff2);
@@ -287,7 +279,7 @@ static size_t benchMem(const void* src, size_t srcSize, U32 benchNb)
                 DISPLAY("ZSTD_decodeLiteralsBlock : impossible to test on this sample (not compressible)\n");
                 goto _cleanOut;
             }
-            skippedSize = frameHeaderSize + 3 /* ZSTD_blockHeaderSize */;
+            skippedSize = frameHeaderSize + ZSTD_blockHeaderSize;
             memcpy(buff2, dstBuff+skippedSize, g_cSize-skippedSize);
             srcSize = srcSize > 128 KB ? 128 KB : srcSize;    /* speed relative to block */
             break;
@@ -309,9 +301,9 @@ static size_t benchMem(const void* src, size_t srcSize, U32 benchNb)
                 DISPLAY("ZSTD_decodeSeqHeaders : impossible to test on this sample (not compressible)\n");
                 goto _cleanOut;
             }
-            iend = ip + 3 /* ZSTD_blockHeaderSize */ + cBlockSize;   /* End of first block */
-            ip += 3 /* ZSTD_blockHeaderSize */;                     /* skip block header */
-            ip += ZSTD_decodeLiteralsBlock(g_zdc, ip, iend-ip);  /* skip literal segment */
+            iend = ip + ZSTD_blockHeaderSize + cBlockSize;   /* End of first block */
+            ip += ZSTD_blockHeaderSize;                      /* skip block header */
+            ip += ZSTD_decodeLiteralsBlock(g_zdc, ip, iend-ip);   /* skip literal segment */
             g_cSize = iend-ip;
             memcpy(buff2, ip, g_cSize);   /* copy rest of block (it starts by SeqHeader) */
             srcSize = srcSize > 128 KB ? 128 KB : srcSize;   /* speed relative to block */
diff --git a/programs/fuzzer.c b/programs/fuzzer.c
index d1dfe51..cb31dc4 100644
--- a/programs/fuzzer.c
+++ b/programs/fuzzer.c
@@ -35,18 +35,19 @@
 /*-************************************
 *  Includes
 **************************************/
-#include <stdlib.h>      /* free */
-#include <stdio.h>       /* fgets, sscanf */
-#include <sys/timeb.h>   /* timeb */
-#include <string.h>      /* strcmp */
-#include <time.h>        /* clock_t */
-#define ZSTD_STATIC_LINKING_ONLY   /* ZSTD_compressContinue */
-#include "zstd.h"        /* ZSTD_VERSION_STRING, ZSTD_getErrorCode */
-#include "zdict.h"       /* ZDICT_trainFromBuffer */
-#include "datagen.h"     /* RDG_genBuffer */
+#include <stdlib.h>       /* free */
+#include <stdio.h>        /* fgets, sscanf */
+#include <sys/timeb.h>    /* timeb */
+#include <string.h>       /* strcmp */
+#include <time.h>         /* clock_t */
+#define ZSTD_STATIC_LINKING_ONLY   /* ZSTD_compressContinue, ZSTD_compressBlock */
+#include "zstd.h"         /* ZSTD_VERSION_STRING */
+#include "error_public.h" /* ZSTD_getErrorCode */
+#include "zdict.h"        /* ZDICT_trainFromBuffer */
+#include "datagen.h"      /* RDG_genBuffer */
 #include "mem.h"
 #define XXH_STATIC_LINKING_ONLY
-#include "xxhash.h"      /* XXH64 */
+#include "xxhash.h"       /* XXH64 */
 
 
 /*-************************************
@@ -109,9 +110,9 @@ static unsigned FUZ_highbit32(U32 v32)
 }
 
 
-#define CHECKTEST(var, fn)  size_t const var = fn; if (ZSTD_isError(var)) goto _output_error
-#define CHECK(fn)  { CHECKTEST(err, fn); }
-#define CHECKPLUS(var, fn, more)  { CHECKTEST(var, fn); more; }
+#define CHECK_V(var, fn)  size_t const var = fn; if (ZSTD_isError(var)) goto _output_error
+#define CHECK(fn)  { CHECK_V(err, fn); }
+#define CHECKPLUS(var, fn, more)  { CHECK_V(var, fn); more; }
 static int basicUnitTests(U32 seed, double compressibility)
 {
     size_t const CNBuffSize = 5 MB;
@@ -137,9 +138,15 @@ static int basicUnitTests(U32 seed, double compressibility)
               cSize=r );
     DISPLAYLEVEL(4, "OK (%u bytes : %.2f%%)\n", (U32)cSize, (double)cSize/CNBuffSize*100);
 
+    DISPLAYLEVEL(4, "test%3i : decompressed size test : ", testNb++);
+    {   unsigned long long const rSize = ZSTD_getDecompressedSize(compressedBuffer, cSize);
+        if (rSize != CNBuffSize) goto _output_error;
+    }
+    DISPLAYLEVEL(4, "OK \n");
+
     DISPLAYLEVEL(4, "test%3i : decompress %u bytes : ", testNb++, (U32)CNBuffSize);
-    CHECKPLUS( r , ZSTD_decompress(decodedBuffer, CNBuffSize, compressedBuffer, cSize),
-               if (r != CNBuffSize) goto _output_error);
+    { size_t const r = ZSTD_decompress(decodedBuffer, CNBuffSize, compressedBuffer, cSize);
+      if (r != CNBuffSize) goto _output_error; }
     DISPLAYLEVEL(4, "OK \n");
 
     DISPLAYLEVEL(4, "test%3i : check decompressed result : ", testNb++);
@@ -179,11 +186,9 @@ static int basicUnitTests(U32 seed, double compressibility)
 
         DISPLAYLEVEL(4, "test%3i : compress with flat dictionary : ", testNb++);
         cSize = 0;
-        CHECKPLUS(r, ZSTD_compressContinue(ctxOrig, compressedBuffer, ZSTD_compressBound(CNBuffSize),
+        CHECKPLUS(r, ZSTD_compressEnd(ctxOrig, compressedBuffer, ZSTD_compressBound(CNBuffSize),
                                            (const char*)CNBuffer + dictSize, CNBuffSize - dictSize),
                   cSize += r);
-        CHECKPLUS(r, ZSTD_compressEnd(ctxOrig, (char*)compressedBuffer+cSize, ZSTD_compressBound(CNBuffSize)-cSize),
-                  cSize += r);
         DISPLAYLEVEL(4, "OK (%u bytes : %.2f%%)\n", (U32)cSize, (double)cSize/CNBuffSize*100);
 
         DISPLAYLEVEL(4, "test%3i : frame built with flat dictionary should be decompressible : ", testNb++);
@@ -197,12 +202,10 @@ static int basicUnitTests(U32 seed, double compressibility)
         DISPLAYLEVEL(4, "test%3i : compress with duplicated context : ", testNb++);
         {   size_t const cSizeOrig = cSize;
             cSize = 0;
-            CHECKPLUS(r, ZSTD_compressContinue(ctxDuplicated, compressedBuffer, ZSTD_compressBound(CNBuffSize),
+            CHECKPLUS(r, ZSTD_compressEnd(ctxDuplicated, compressedBuffer, ZSTD_compressBound(CNBuffSize),
                                                (const char*)CNBuffer + dictSize, CNBuffSize - dictSize),
                       cSize += r);
-            CHECKPLUS(r, ZSTD_compressEnd(ctxDuplicated, (char*)compressedBuffer+cSize, ZSTD_compressBound(CNBuffSize)-cSize),
-                      cSize += r);
-            if (cSize != cSizeOrig) goto _output_error;   /* should be identical ==> have same size */
+            if (cSize != cSizeOrig) goto _output_error;   /* should be identical ==> same size */
         }
         DISPLAYLEVEL(4, "OK (%u bytes : %.2f%%)\n", (U32)cSize, (double)cSize/CNBuffSize*100);
 
@@ -216,10 +219,8 @@ static int basicUnitTests(U32 seed, double compressibility)
 
         DISPLAYLEVEL(4, "test%3i : check content size on duplicated context : ", testNb++);
         {   size_t const testSize = CNBuffSize / 3;
-            {   ZSTD_compressionParameters const cPar = ZSTD_getCParams(2, testSize, dictSize);
-                ZSTD_frameParameters const fPar = { 1 , 0 , 0 };
-                ZSTD_parameters p;
-                p.cParams = cPar; p.fParams = fPar;
+            {   ZSTD_parameters p = ZSTD_getParams(2, testSize, dictSize);
+                p.fParams.contentSizeFlag = 1;
                 CHECK( ZSTD_compressBegin_advanced(ctxOrig, CNBuffer, dictSize, p, testSize-1) );
             }
             CHECK( ZSTD_copyCCtx(ctxDuplicated, ctxOrig) );
@@ -277,10 +278,8 @@ static int basicUnitTests(U32 seed, double compressibility)
         DISPLAYLEVEL(4, "OK \n");
 
         DISPLAYLEVEL(4, "test%3i : compress without dictID : ", testNb++);
-        {   ZSTD_frameParameters const fParams = { 0 /*contentSize*/, 0 /*checksum*/, 1 /*NoDictID*/ };
-            ZSTD_compressionParameters const cParams = ZSTD_getCParams(3, CNBuffSize, dictSize);
-            ZSTD_parameters p;
-            p.cParams = cParams; p.fParams = fParams;
+        {   ZSTD_parameters p = ZSTD_getParams(3, CNBuffSize, dictSize);
+            p.fParams.noDictIDFlag = 1;
             cSize = ZSTD_compress_advanced(cctx, compressedBuffer, ZSTD_compressBound(CNBuffSize),
                                            CNBuffer, CNBuffSize,
                                            dictBuffer, dictSize, p);
@@ -318,8 +317,9 @@ static int basicUnitTests(U32 seed, double compressibility)
     /* block API tests */
     {   ZSTD_CCtx* const cctx = ZSTD_createCCtx();
         ZSTD_DCtx* const dctx = ZSTD_createDCtx();
-        static const size_t blockSize = 100 KB;
-        static const size_t dictSize = 16 KB;
+        static const size_t dictSize = 65 KB;
+        static const size_t blockSize = 100 KB;   /* won't cause pb with small dict size */
+        size_t cSize2;
 
         /* basic block compression */
         DISPLAYLEVEL(4, "test%3i : Block compression test : ", testNb++);
@@ -330,7 +330,7 @@ static int basicUnitTests(U32 seed, double compressibility)
 
         DISPLAYLEVEL(4, "test%3i : Block decompression test : ", testNb++);
         CHECK( ZSTD_decompressBegin(dctx) );
-        { CHECKTEST(r, ZSTD_decompressBlock(dctx, decodedBuffer, CNBuffSize, compressedBuffer, cSize) );
+        { CHECK_V(r, ZSTD_decompressBlock(dctx, decodedBuffer, CNBuffSize, compressedBuffer, cSize) );
           if (r != blockSize) goto _output_error; }
         DISPLAYLEVEL(4, "OK \n");
 
@@ -339,11 +339,20 @@ static int basicUnitTests(U32 seed, double compressibility)
         CHECK( ZSTD_compressBegin_usingDict(cctx, CNBuffer, dictSize, 5) );
         cSize = ZSTD_compressBlock(cctx, compressedBuffer, ZSTD_compressBound(blockSize), (char*)CNBuffer+dictSize, blockSize);
         if (ZSTD_isError(cSize)) goto _output_error;
+        cSize2 = ZSTD_compressBlock(cctx, (char*)compressedBuffer+cSize, ZSTD_compressBound(blockSize), (char*)CNBuffer+dictSize+blockSize, blockSize);
+        if (ZSTD_isError(cSize2)) goto _output_error;
+        memcpy((char*)compressedBuffer+cSize, (char*)CNBuffer+dictSize+blockSize, blockSize);   /* fake non-compressed block */
+        cSize2 = ZSTD_compressBlock(cctx, (char*)compressedBuffer+cSize+blockSize, ZSTD_compressBound(blockSize),
+                                          (char*)CNBuffer+dictSize+2*blockSize, blockSize);
+        if (ZSTD_isError(cSize2)) goto _output_error;
         DISPLAYLEVEL(4, "OK \n");
 
         DISPLAYLEVEL(4, "test%3i : Dictionary Block decompression test : ", testNb++);
         CHECK( ZSTD_decompressBegin_usingDict(dctx, CNBuffer, dictSize) );
-        { CHECKTEST( r, ZSTD_decompressBlock(dctx, decodedBuffer, CNBuffSize, compressedBuffer, cSize) );
+        { CHECK_V( r, ZSTD_decompressBlock(dctx, decodedBuffer, CNBuffSize, compressedBuffer, cSize) );
+          if (r != blockSize) goto _output_error; }
+        ZSTD_insertBlock(dctx, (char*)decodedBuffer+blockSize, blockSize);   /* insert non-compressed block into dctx history */
+        { CHECK_V( r, ZSTD_decompressBlock(dctx, (char*)decodedBuffer+2*blockSize, CNBuffSize, (char*)compressedBuffer+cSize+blockSize, cSize2) );
           if (r != blockSize) goto _output_error; }
         DISPLAYLEVEL(4, "OK \n");
 
@@ -361,7 +370,7 @@ static int basicUnitTests(U32 seed, double compressibility)
         sampleSize += 96 KB;
         cSize = ZSTD_compress(compressedBuffer, ZSTD_compressBound(sampleSize), CNBuffer, sampleSize, 1);
         if (ZSTD_isError(cSize)) goto _output_error;
-        { CHECKTEST(regenSize, ZSTD_decompress(decodedBuffer, sampleSize, compressedBuffer, cSize));
+        { CHECK_V(regenSize, ZSTD_decompress(decodedBuffer, sampleSize, compressedBuffer, cSize));
           if (regenSize!=sampleSize) goto _output_error; }
         DISPLAYLEVEL(4, "OK \n");
     }
@@ -370,12 +379,12 @@ static int basicUnitTests(U32 seed, double compressibility)
     #define ZEROESLENGTH 100
     DISPLAYLEVEL(4, "test%3i : compress %u zeroes : ", testNb++, ZEROESLENGTH);
     memset(CNBuffer, 0, ZEROESLENGTH);
-    { CHECKTEST(r, ZSTD_compress(compressedBuffer, ZSTD_compressBound(ZEROESLENGTH), CNBuffer, ZEROESLENGTH, 1) );
+    { CHECK_V(r, ZSTD_compress(compressedBuffer, ZSTD_compressBound(ZEROESLENGTH), CNBuffer, ZEROESLENGTH, 1) );
       cSize = r; }
     DISPLAYLEVEL(4, "OK (%u bytes : %.2f%%)\n", (U32)cSize, (double)cSize/ZEROESLENGTH*100);
 
     DISPLAYLEVEL(4, "test%3i : decompress %u zeroes : ", testNb++, ZEROESLENGTH);
-    { CHECKTEST(r, ZSTD_decompress(decodedBuffer, ZEROESLENGTH, compressedBuffer, cSize) );
+    { CHECK_V(r, ZSTD_decompress(decodedBuffer, ZEROESLENGTH, compressedBuffer, cSize) );
       if (r != ZEROESLENGTH) goto _output_error; }
     DISPLAYLEVEL(4, "OK \n");
 
@@ -389,27 +398,29 @@ static int basicUnitTests(U32 seed, double compressibility)
         U32 rSeed = 1;
 
         /* create batch of 3-bytes sequences */
-        { int i; for (i=0; i < NB3BYTESSEQ; i++) {
-            _3BytesSeqs[i][0] = (BYTE)(FUZ_rand(&rSeed) & 255);
-            _3BytesSeqs[i][1] = (BYTE)(FUZ_rand(&rSeed) & 255);
-            _3BytesSeqs[i][2] = (BYTE)(FUZ_rand(&rSeed) & 255);
-        }}
+        {   int i;
+            for (i=0; i < NB3BYTESSEQ; i++) {
+                _3BytesSeqs[i][0] = (BYTE)(FUZ_rand(&rSeed) & 255);
+                _3BytesSeqs[i][1] = (BYTE)(FUZ_rand(&rSeed) & 255);
+                _3BytesSeqs[i][2] = (BYTE)(FUZ_rand(&rSeed) & 255);
+        }   }
 
         /* randomly fills CNBuffer with prepared 3-bytes sequences */
-        { int i; for (i=0; i < _3BYTESTESTLENGTH; i += 3) {   /* note : CNBuffer size > _3BYTESTESTLENGTH+3 */
-            U32 const id = FUZ_rand(&rSeed) & NB3BYTESSEQMASK;
-            ((BYTE*)CNBuffer)[i+0] = _3BytesSeqs[id][0];
-            ((BYTE*)CNBuffer)[i+1] = _3BytesSeqs[id][1];
-            ((BYTE*)CNBuffer)[i+2] = _3BytesSeqs[id][2];
-    }   }}
+        {   int i;
+            for (i=0; i < _3BYTESTESTLENGTH; i += 3) {   /* note : CNBuffer size > _3BYTESTESTLENGTH+3 */
+                U32 const id = FUZ_rand(&rSeed) & NB3BYTESSEQMASK;
+                ((BYTE*)CNBuffer)[i+0] = _3BytesSeqs[id][0];
+                ((BYTE*)CNBuffer)[i+1] = _3BytesSeqs[id][1];
+                ((BYTE*)CNBuffer)[i+2] = _3BytesSeqs[id][2];
+    }   }   }
     DISPLAYLEVEL(4, "test%3i : compress lots 3-bytes sequences : ", testNb++);
-    { CHECKTEST(r, ZSTD_compress(compressedBuffer, ZSTD_compressBound(_3BYTESTESTLENGTH),
+    { CHECK_V(r, ZSTD_compress(compressedBuffer, ZSTD_compressBound(_3BYTESTESTLENGTH),
                                  CNBuffer, _3BYTESTESTLENGTH, 19) );
       cSize = r; }
     DISPLAYLEVEL(4, "OK (%u bytes : %.2f%%)\n", (U32)cSize, (double)cSize/_3BYTESTESTLENGTH*100);
 
     DISPLAYLEVEL(4, "test%3i : decompress lots 3-bytes sequence : ", testNb++);
-    { CHECKTEST(r, ZSTD_decompress(decodedBuffer, _3BYTESTESTLENGTH, compressedBuffer, cSize) );
+    { CHECK_V(r, ZSTD_decompress(decodedBuffer, _3BYTESTESTLENGTH, compressedBuffer, cSize) );
       if (r != _3BYTESTESTLENGTH) goto _output_error; }
     DISPLAYLEVEL(4, "OK \n");
 
@@ -555,6 +566,11 @@ static int fuzzerTests(U32 seed, U32 nbTests, unsigned startTest, U32 const maxD
                   CHECK(endCheck != endMark, "ZSTD_compressCCtx : dst buffer overflow"); }
         }   }
 
+        /* Decompressed size test */
+        {   unsigned long long const rSize = ZSTD_getDecompressedSize(cBuffer, cSize);
+            CHECK(rSize != sampleSize, "decompressed size incorrect");
+        }
+
         /* frame header decompression test */
         {   ZSTD_frameParams dParams;
             size_t const check = ZSTD_getFrameParams(&dParams, cBuffer, cSize);
@@ -676,7 +692,7 @@ static int fuzzerTests(U32 seed, U32 nbTests, unsigned startTest, U32 const maxD
                 totalTestSize += segmentSize;
         }   }
 
-        {   size_t const flushResult = ZSTD_compressEnd(ctx, cBuffer+cSize, cBufferSize-cSize);
+        {   size_t const flushResult = ZSTD_compressEnd(ctx, cBuffer+cSize, cBufferSize-cSize, NULL, 0);
             CHECK (ZSTD_isError(flushResult), "multi-segments epilogue error : %s", ZSTD_getErrorName(flushResult));
             cSize += flushResult;
         }
@@ -691,7 +707,7 @@ static int fuzzerTests(U32 seed, U32 nbTests, unsigned startTest, U32 const maxD
         while (totalCSize < cSize) {
             size_t const inSize = ZSTD_nextSrcSizeToDecompress(dctx);
             size_t const genSize = ZSTD_decompressContinue(dctx, dstBuffer+totalGenSize, dstBufferSize-totalGenSize, cBuffer+totalCSize, inSize);
-            CHECK (ZSTD_isError(genSize), "streaming decompression error : %s", ZSTD_getErrorName(genSize));
+            CHECK (ZSTD_isError(genSize), "ZSTD_decompressContinue error : %s", ZSTD_getErrorName(genSize));
             totalGenSize += genSize;
             totalCSize += inSize;
         }
diff --git a/programs/legacy/fileio_legacy.c b/programs/legacy/fileio_legacy.c
index 7723933..c07b6e5 100644
--- a/programs/legacy/fileio_legacy.c
+++ b/programs/legacy/fileio_legacy.c
@@ -548,6 +548,81 @@ unsigned long long FIOv06_decompressFrame(dRessv06_t ress,
 }
 
 
+/*=====    v0.7.x    =====*/
+
+typedef struct {
+    void*  srcBuffer;
+    size_t srcBufferSize;
+    void*  dstBuffer;
+    size_t dstBufferSize;
+    const void*  dictBuffer;
+    size_t dictBufferSize;
+    ZBUFFv07_DCtx* dctx;
+} dRessv07_t;
+
+static dRessv07_t FIOv07_createDResources(void)
+{
+    dRessv07_t ress;
+
+    /* init */
+    ress.dctx = ZBUFFv07_createDCtx();
+    if (ress.dctx==NULL) EXM_THROW(60, "Can't create ZBUFF decompression context");
+    ress.dictBuffer = NULL; ress.dictBufferSize=0;
+
+    /* Allocate Memory */
+    ress.srcBufferSize = ZBUFFv07_recommendedDInSize();
+    ress.srcBuffer = malloc(ress.srcBufferSize);
+    ress.dstBufferSize = ZBUFFv07_recommendedDOutSize();
+    ress.dstBuffer = malloc(ress.dstBufferSize);
+    if (!ress.srcBuffer || !ress.dstBuffer) EXM_THROW(61, "Allocation error : not enough memory");
+
+    return ress;
+}
+
+static void FIOv07_freeDResources(dRessv07_t ress)
+{
+    size_t const errorCode = ZBUFFv07_freeDCtx(ress.dctx);
+    if (ZBUFFv07_isError(errorCode)) EXM_THROW(69, "Error : can't free ZBUFF context resource : %s", ZBUFFv07_getErrorName(errorCode));
+    free(ress.srcBuffer);
+    free(ress.dstBuffer);
+}
+
+
+unsigned long long FIOv07_decompressFrame(dRessv07_t ress,
+                                          FILE* foutput, FILE* finput)
+{
+    U64    frameSize = 0;
+    size_t readSize  = 4;
+
+    MEM_writeLE32(ress.srcBuffer, ZSTDv07_MAGICNUMBER);
+    ZBUFFv07_decompressInitDictionary(ress.dctx, ress.dictBuffer, ress.dictBufferSize);
+
+    while (1) {
+        /* Decode */
+        size_t inSize=readSize, decodedSize=ress.dstBufferSize;
+        size_t toRead = ZBUFFv07_decompressContinue(ress.dctx, ress.dstBuffer, &decodedSize, ress.srcBuffer, &inSize);
+        if (ZBUFFv07_isError(toRead)) EXM_THROW(36, "Decoding error : %s", ZBUFFv07_getErrorName(toRead));
+        readSize -= inSize;
+
+        /* Write block */
+        { size_t const sizeCheck = fwrite(ress.dstBuffer, 1, decodedSize, foutput);
+          if (sizeCheck != decodedSize) EXM_THROW(37, "Write error : unable to write data block to destination file"); }
+        frameSize += decodedSize;
+        DISPLAYUPDATE(2, "\rDecoded : %u MB...     ", (U32)(frameSize>>20) );
+
+        if (toRead == 0) break;
+        if (readSize) EXM_THROW(38, "Decoding error : should consume entire input");
+
+        /* Fill input buffer */
+        if (toRead > ress.srcBufferSize) EXM_THROW(34, "too large block");
+        readSize = fread(ress.srcBuffer, 1, toRead, finput);
+        if (readSize != toRead) EXM_THROW(35, "Read error");
+    }
+
+    return frameSize;
+}
+
+
 /*=====   General legacy dispatcher   =====*/
 
 unsigned long long FIO_decompressLegacyFrame(FILE* foutput, FILE* finput,
@@ -584,6 +659,14 @@ unsigned long long FIO_decompressLegacyFrame(FILE* foutput, FILE* finput,
                     FIOv06_freeDResources(r);
                     return s;
             }   }
+        case ZSTDv07_MAGICNUMBER :
+            {   dRessv07_t r = FIOv07_createDResources();
+                r.dictBuffer = dictBuffer;
+                r.dictBufferSize = dictSize;
+                {   unsigned long long const s = FIOv07_decompressFrame(r, foutput, finput);
+                    FIOv07_freeDResources(r);
+                    return s;
+            }   }
         default :
             return ERROR(prefix_unknown);
     }
diff --git a/programs/paramgrill.c b/programs/paramgrill.c
index 6cf4ccd..9348a40 100644
--- a/programs/paramgrill.c
+++ b/programs/paramgrill.c
@@ -22,33 +22,19 @@
     - zstd homepage : http://www.zstd.net/
 */
 
-/*-************************************
-*  Compiler Options
-**************************************/
-/* gettimeofday() are not supported by MSVC */
-#if defined(_MSC_VER) || defined(_WIN32)
-#  define BMK_LEGACY_TIMER 1
-#endif
-
 
 /*-************************************
 *  Dependencies
 **************************************/
-#include "util.h"         /* Compiler options, UTIL_GetFileSize */
-#include <stdlib.h>       /* malloc */
-#include <stdio.h>        /* fprintf, fopen, ftello64 */
-#include <string.h>       /* strcmp */
-#include <math.h>         /* log */
-
-/* Use ftime() if gettimeofday() is not available on your target */
-#if defined(BMK_LEGACY_TIMER)
-#  include <sys/timeb.h>  /* timeb, ftime */
-#else
-#  include <sys/time.h>   /* gettimeofday */
-#endif
+#include "util.h"      /* Compiler options, UTIL_GetFileSize */
+#include <stdlib.h>    /* malloc */
+#include <stdio.h>     /* fprintf, fopen, ftello64 */
+#include <string.h>    /* strcmp */
+#include <math.h>      /* log */
+#include <time.h>      /* clock_t */
 
 #include "mem.h"
-#define ZSTD_STATIC_LINKING_ONLY   /* ZSTD_parameters */
+#define ZSTD_STATIC_LINKING_ONLY   /* ZSTD_parameters, ZSTD_estimateCCtxSize */
 #include "zstd.h"
 #include "datagen.h"
 #include "xxhash.h"
@@ -67,7 +53,7 @@
 #define GB *(1ULL<<30)
 
 #define NBLOOPS    2
-#define TIMELOOP   2000
+#define TIMELOOP   (2 * CLOCKS_PER_SEC)
 
 #define NB_LEVELS_TRACKED 30
 
@@ -76,9 +62,9 @@ static const size_t maxMemory = (sizeof(size_t)==4)  ?  (2 GB - 64 MB) : (size_t
 #define COMPRESSIBILITY_DEFAULT 0.50
 static const size_t sampleSize = 10000000;
 
-static const int g_grillDuration = 50000000;   /* about 13 hours */
-static const int g_maxParamTime = 15000;   /* 15 sec */
-static const int g_maxVariationTime = 60000;   /* 60 sec */
+static const U32 g_grillDuration_s = 60000;   /* about 16 hours */
+static const clock_t g_maxParamTime = 15 * CLOCKS_PER_SEC;
+static const clock_t g_maxVariationTime = 60 * CLOCKS_PER_SEC;
 static const int g_maxNbVariations = 64;
 
 
@@ -111,49 +97,15 @@ void BMK_SetNbIterations(int nbLoops)
 *  Private functions
 *********************************************************/
 
-#if defined(BMK_LEGACY_TIMER)
-
-static int BMK_GetMilliStart(void)
-{
-  /* Based on Legacy ftime()
-  *  Rolls over every ~ 12.1 days (0x100000/24/60/60)
-  *  Use GetMilliSpan to correct for rollover */
-  struct timeb tb;
-  int nCount;
-  ftime( &tb );
-  nCount = (int) (tb.millitm + (tb.time & 0xfffff) * 1000);
-  return nCount;
-}
-
-#else
+static clock_t BMK_clockSpan(clock_t cStart) { return clock() - cStart; }  /* works even if clock() overflows ; max span ~ 30 min */
 
-static int BMK_GetMilliStart(void)
-{
-  /* Based on newer gettimeofday()
-  *  Use GetMilliSpan to correct for rollover */
-  struct timeval tv;
-  int nCount;
-  gettimeofday(&tv, NULL);
-  nCount = (int) (tv.tv_usec/1000 + (tv.tv_sec & 0xfffff) * 1000);
-  return nCount;
-}
-
-#endif
-
-
-static int BMK_GetMilliSpan( int nTimeStart )
-{
-  int nSpan = BMK_GetMilliStart() - nTimeStart;
-  if ( nSpan < 0 )
-    nSpan += 0x100000 * 1000;
-  return nSpan;
-}
+static U32 BMK_timeSpan(time_t tStart) { return (U32)difftime(time(NULL), tStart); }  /* accuracy in seconds only, span can be multiple years */
 
 
 static size_t BMK_findMaxMem(U64 requiredMem)
 {
-    size_t step = 64 MB;
-    BYTE* testmem=NULL;
+    size_t const step = 64 MB;
+    void* testmem = NULL;
 
     requiredMem = (((requiredMem >> 26) + 1) << 26);
     if (requiredMem > maxMemory) requiredMem = maxMemory;
@@ -161,7 +113,7 @@ static size_t BMK_findMaxMem(U64 requiredMem)
     requiredMem += 2*step;
     while (!testmem) {
         requiredMem -= step;
-        testmem = (BYTE*) malloc ((size_t)requiredMem);
+        testmem = malloc ((size_t)requiredMem);
     }
 
     free (testmem);
@@ -188,8 +140,8 @@ U32 FUZ_rand(U32* src)
 *********************************************************/
 typedef struct {
     size_t cSize;
-    U32 cSpeed;
-    U32 dSpeed;
+    double cSpeed;
+    double dSpeed;
 } BMK_result_t;
 
 typedef struct
@@ -265,35 +217,33 @@ static size_t BMK_benchParam(BMK_result_t* resultPtr,
     RDG_genBuffer(compressedBuffer, maxCompressedSize, 0.10, 0.10, 1);
 
     /* Bench */
-    {
-        U32 loopNb;
+    {   U32 loopNb;
         size_t cSize = 0;
         double fastestC = 100000000., fastestD = 100000000.;
         double ratio = 0.;
         U64 crcCheck = 0;
-        const int startTime =BMK_GetMilliStart();
+        clock_t const benchStart = clock();
 
         DISPLAY("\r%79s\r", "");
+        memset(&params, 0, sizeof(params));
         params.cParams = cParams;
-        params.fParams.contentSizeFlag = 0;
         for (loopNb = 1; loopNb <= g_nbIterations; loopNb++) {
             int nbLoops;
-            int milliTime;
             U32 blockNb;
-            const int totalTime = BMK_GetMilliSpan(startTime);
+            clock_t roundStart, roundClock;
 
-            /* early break (slow params) */
-            if (totalTime > g_maxParamTime) break;
+            { clock_t const benchTime = BMK_clockSpan(benchStart);
+              if (benchTime > g_maxParamTime) break; }
 
             /* Compression */
             DISPLAY("\r%1u-%s : %9u ->", loopNb, name, (U32)srcSize);
             memset(compressedBuffer, 0xE5, maxCompressedSize);
 
             nbLoops = 0;
-            milliTime = BMK_GetMilliStart();
-            while (BMK_GetMilliStart() == milliTime);
-            milliTime = BMK_GetMilliStart();
-            while (BMK_GetMilliSpan(milliTime) < TIMELOOP) {
+            roundStart = clock();
+            while (clock() == roundStart);
+            roundStart = clock();
+            while (BMK_clockSpan(roundStart) < TIMELOOP) {
                 for (blockNb=0; blockNb<nbBlocks; blockNb++)
                     blockTable[blockNb].cSize = ZSTD_compress_advanced(ctx,
                                                     blockTable[blockNb].cPtr,  blockTable[blockNb].cRoom,
@@ -302,40 +252,40 @@ static size_t BMK_benchParam(BMK_result_t* resultPtr,
                                                     params);
                 nbLoops++;
             }
-            milliTime = BMK_GetMilliSpan(milliTime);
+            roundClock = BMK_clockSpan(roundStart);
 
             cSize = 0;
             for (blockNb=0; blockNb<nbBlocks; blockNb++)
                 cSize += blockTable[blockNb].cSize;
-            if ((double)milliTime < fastestC*nbLoops) fastestC = (double)milliTime / nbLoops;
+            if ((double)roundClock < fastestC * CLOCKS_PER_SEC * nbLoops) fastestC = ((double)roundClock / CLOCKS_PER_SEC) / nbLoops;
             ratio = (double)srcSize / (double)cSize;
             DISPLAY("\r");
             DISPLAY("%1u-%s : %9u ->", loopNb, name, (U32)srcSize);
-            DISPLAY(" %9u (%4.3f),%7.1f MB/s", (U32)cSize, ratio, (double)srcSize / fastestC / 1000.);
+            DISPLAY(" %9u (%4.3f),%7.1f MB/s", (U32)cSize, ratio, (double)srcSize / fastestC / 1000000.);
             resultPtr->cSize = cSize;
-            resultPtr->cSpeed = (U32)((double)srcSize / fastestC);
+            resultPtr->cSpeed = (double)srcSize / fastestC;
 
 #if 1
             /* Decompression */
             memset(resultBuffer, 0xD6, srcSize);
 
             nbLoops = 0;
-            milliTime = BMK_GetMilliStart();
-            while (BMK_GetMilliStart() == milliTime);
-            milliTime = BMK_GetMilliStart();
-            for ( ; BMK_GetMilliSpan(milliTime) < TIMELOOP; nbLoops++) {
+            roundStart = clock();
+            while (clock() == roundStart);
+            roundStart = clock();
+            for ( ; BMK_clockSpan(roundStart) < TIMELOOP; nbLoops++) {
                 for (blockNb=0; blockNb<nbBlocks; blockNb++)
                     blockTable[blockNb].resSize = ZSTD_decompress(blockTable[blockNb].resPtr, blockTable[blockNb].srcSize,
                                                                   blockTable[blockNb].cPtr, blockTable[blockNb].cSize);
             }
-            milliTime = BMK_GetMilliSpan(milliTime);
+            roundClock = BMK_clockSpan(roundStart);
 
-            if ((double)milliTime < fastestD*nbLoops) fastestD = (double)milliTime / nbLoops;
+            if ((double)roundClock < fastestD * CLOCKS_PER_SEC * nbLoops) fastestD = ((double)roundClock / CLOCKS_PER_SEC) / nbLoops;
             DISPLAY("\r");
             DISPLAY("%1u-%s : %9u -> ", loopNb, name, (U32)srcSize);
-            DISPLAY("%9u (%4.3f),%7.1f MB/s, ", (U32)cSize, ratio, (double)srcSize / fastestC / 1000.);
-            DISPLAY("%7.1f MB/s", (double)srcSize / fastestD / 1000.);
-            resultPtr->dSpeed = (U32)((double)srcSize / fastestD);
+            DISPLAY("%9u (%4.3f),%7.1f MB/s, ", (U32)cSize, ratio, (double)srcSize / fastestC / 1000000.);
+            DISPLAY("%7.1f MB/s", (double)srcSize / fastestD / 1000000.);
+            resultPtr->dSpeed = (double)srcSize / fastestD;
 
             /* CRC Checking */
             crcCheck = XXH64(resultBuffer, srcSize, 0);
@@ -362,6 +312,7 @@ static size_t BMK_benchParam(BMK_result_t* resultPtr,
 
 
 const char* g_stratName[] = { "ZSTD_fast   ",
+                              "ZSTD_dfast  ",
                               "ZSTD_greedy ",
                               "ZSTD_lazy   ",
                               "ZSTD_lazy2  ",
@@ -376,11 +327,11 @@ static void BMK_printWinner(FILE* f, U32 cLevel, BMK_result_t result, ZSTD_compr
             params.targetLength, g_stratName[(U32)(params.strategy)]);
     fprintf(f,
             "/* level %2u */   /* R:%5.3f at %5.1f MB/s - %5.1f MB/s */\n",
-            cLevel, (double)srcSize / result.cSize, (double)result.cSpeed / 1000., (double)result.dSpeed / 1000.);
+            cLevel, (double)srcSize / result.cSize, result.cSpeed / 1000000., result.dSpeed / 1000000.);
 }
 
 
-static U32 g_cSpeedTarget[NB_LEVELS_TRACKED] = { 0 };   /* NB_LEVELS_TRACKED : checked at main() */
+static double g_cSpeedTarget[NB_LEVELS_TRACKED] = { 0. };   /* NB_LEVELS_TRACKED : checked at main() */
 
 typedef struct {
     BMK_result_t result;
@@ -389,7 +340,7 @@ typedef struct {
 
 static void BMK_printWinners2(FILE* f, const winnerInfo_t* winners, size_t srcSize)
 {
-    unsigned cLevel;
+    int cLevel;
 
     fprintf(f, "\n /* Proposed configurations : */ \n");
     fprintf(f, "    /* W,  C,  H,  S,  L,  T, strat */ \n");
@@ -407,15 +358,13 @@ static void BMK_printWinners(FILE* f, const winnerInfo_t* winners, size_t srcSiz
     BMK_printWinners2(stdout, winners, srcSize);
 }
 
-size_t ZSTD_sizeofCCtx(ZSTD_compressionParameters params);   /* hidden interface, declared here */
-
 static int BMK_seed(winnerInfo_t* winners, const ZSTD_compressionParameters params,
               const void* srcBuffer, size_t srcSize,
                     ZSTD_CCtx* ctx)
 {
     BMK_result_t testResult;
     int better = 0;
-    unsigned cLevel;
+    int cLevel;
 
     BMK_benchParam(&testResult, srcBuffer, srcSize, ctx, params);
 
@@ -442,17 +391,16 @@ static int BMK_seed(winnerInfo_t* winners, const ZSTD_compressionParameters para
             double W_DMemUsed_note = W_ratioNote * ( 40 + 9*cLevel) - log((double)W_DMemUsed);
             double O_DMemUsed_note = O_ratioNote * ( 40 + 9*cLevel) - log((double)O_DMemUsed);
 
-            size_t W_CMemUsed = (1 << params.windowLog) + ZSTD_sizeofCCtx(params);
-            size_t O_CMemUsed = (1 << winners[cLevel].params.windowLog) + ZSTD_sizeofCCtx(winners[cLevel].params);
+            size_t W_CMemUsed = (1 << params.windowLog) + ZSTD_estimateCCtxSize(params);
+            size_t O_CMemUsed = (1 << winners[cLevel].params.windowLog) + ZSTD_estimateCCtxSize(winners[cLevel].params);
             double W_CMemUsed_note = W_ratioNote * ( 50 + 13*cLevel) - log((double)W_CMemUsed);
             double O_CMemUsed_note = O_ratioNote * ( 50 + 13*cLevel) - log((double)O_CMemUsed);
 
-            double W_CSpeed_note = W_ratioNote * ( 30 + 10*cLevel) + log((double)testResult.cSpeed);
-            double O_CSpeed_note = O_ratioNote * ( 30 + 10*cLevel) + log((double)winners[cLevel].result.cSpeed);
-
-            double W_DSpeed_note = W_ratioNote * ( 20 + 2*cLevel) + log((double)testResult.dSpeed);
-            double O_DSpeed_note = O_ratioNote * ( 20 + 2*cLevel) + log((double)winners[cLevel].result.dSpeed);
+            double W_CSpeed_note = W_ratioNote * ( 30 + 10*cLevel) + log(testResult.cSpeed);
+            double O_CSpeed_note = O_ratioNote * ( 30 + 10*cLevel) + log(winners[cLevel].result.cSpeed);
 
+            double W_DSpeed_note = W_ratioNote * ( 20 + 2*cLevel) + log(testResult.dSpeed);
+            double O_DSpeed_note = O_ratioNote * ( 20 + 2*cLevel) + log(winners[cLevel].result.dSpeed);
 
             if (W_DMemUsed_note < O_DMemUsed_note) {
                 /* uses too much Decompression memory for too little benefit */
@@ -474,16 +422,16 @@ static int BMK_seed(winnerInfo_t* winners, const ZSTD_compressionParameters para
                 /* too large compression speed difference for the compression benefit */
                 if (W_ratio > O_ratio)
                 DISPLAY ("Compression Speed : %5.3f @ %4.1f MB/s  vs  %5.3f @ %4.1f MB/s   : not enough for level %i\n",
-                         W_ratio, (double)(testResult.cSpeed) / 1000.,
-                         O_ratio, (double)(winners[cLevel].result.cSpeed) / 1000.,   cLevel);
+                         W_ratio, testResult.cSpeed / 1000000.,
+                         O_ratio, winners[cLevel].result.cSpeed / 1000000.,   cLevel);
                 continue;
             }
             if (W_DSpeed_note   < O_DSpeed_note  ) {
                 /* too large decompression speed difference for the compression benefit */
                 if (W_ratio > O_ratio)
                 DISPLAY ("Decompression Speed : %5.3f @ %4.1f MB/s  vs  %5.3f @ %4.1f MB/s   : not enough for level %i\n",
-                         W_ratio, (double)(testResult.dSpeed) / 1000.,
-                         O_ratio, (double)(winners[cLevel].result.dSpeed) / 1000.,   cLevel);
+                         W_ratio, testResult.dSpeed / 1000000.,
+                         O_ratio, winners[cLevel].result.dSpeed / 1000000.,   cLevel);
                 continue;
             }
 
@@ -507,6 +455,8 @@ static ZSTD_compressionParameters* sanitizeParams(ZSTD_compressionParameters par
     g_params = params;
     if (params.strategy == ZSTD_fast)
         g_params.chainLog = 0, g_params.searchLog = 0;
+    if (params.strategy == ZSTD_dfast)
+        g_params.searchLog = 0;
     if (params.strategy != ZSTD_btopt )
         g_params.targetLength = 0;
     return &g_params;
@@ -577,9 +527,9 @@ static void playAround(FILE* f, winnerInfo_t* winners,
                        ZSTD_CCtx* ctx)
 {
     int nbVariations = 0;
-    const int startTime = BMK_GetMilliStart();
+    clock_t const clockStart = clock();
 
-    while (BMK_GetMilliSpan(startTime) < g_maxVariationTime) {
+    while (BMK_clockSpan(clockStart) < g_maxVariationTime) {
         ZSTD_compressionParameters p = params;
 
         if (nbVariations++ > g_maxNbVariations) break;
@@ -637,15 +587,18 @@ static void BMK_selectRandomStart(
 
 static void BMK_benchMem(void* srcBuffer, size_t srcSize)
 {
-    ZSTD_CCtx* ctx = ZSTD_createCCtx();
+    ZSTD_CCtx* const ctx = ZSTD_createCCtx();
     ZSTD_compressionParameters params;
     winnerInfo_t winners[NB_LEVELS_TRACKED];
-    int i;
-    unsigned u;
-    const char* rfName = "grillResults.txt";
-    FILE* f;
+    const char* const rfName = "grillResults.txt";
+    FILE* const f = fopen(rfName, "w");
     const size_t blockSize = g_blockSize ? g_blockSize : srcSize;
 
+    /* init */
+    if (ctx==NULL) { DISPLAY("ZSTD_createCCtx() failed \n"); exit(1); }
+    memset(winners, 0, sizeof(winners));
+    if (f==NULL) { DISPLAY("error opening %s \n", rfName); exit(1); }
+
     if (g_singleRun) {
         BMK_result_t testResult;
         g_params = ZSTD_adjustCParams(g_params, srcSize, 0);
@@ -654,41 +607,36 @@ static void BMK_benchMem(void* srcBuffer, size_t srcSize)
         return;
     }
 
-    /* init */
-    memset(winners, 0, sizeof(winners));
-    f = fopen(rfName, "w");
-    if (f==NULL) { DISPLAY("error opening %s \n", rfName); exit(1); }
-
     if (g_target)
-        g_cSpeedTarget[1] = g_target * 1000;
+        g_cSpeedTarget[1] = g_target * 1000000;
     else {
         /* baseline config for level 1 */
         BMK_result_t testResult;
         params = ZSTD_getCParams(1, blockSize, 0);
         BMK_benchParam(&testResult, srcBuffer, srcSize, ctx, params);
-        g_cSpeedTarget[1] = (testResult.cSpeed * 31) >> 5;
+        g_cSpeedTarget[1] = (testResult.cSpeed * 31) / 32;
     }
 
     /* establish speed objectives (relative to level 1) */
-    for (u=2; u<=ZSTD_maxCLevel(); u++)
-        g_cSpeedTarget[u] = (g_cSpeedTarget[u-1] * 25) >> 5;
+    {   int i;
+        for (i=2; i<=ZSTD_maxCLevel(); i++)
+            g_cSpeedTarget[i] = (g_cSpeedTarget[i-1] * 25) / 32;
+    }
 
     /* populate initial solution */
-    {
-        const int maxSeeds = g_noSeed ? 1 : ZSTD_maxCLevel();
-        for (i=1; i<=maxSeeds; i++) {
+    {   const int maxSeeds = g_noSeed ? 1 : ZSTD_maxCLevel();
+        int i;
+        for (i=0; i<=maxSeeds; i++) {
             params = ZSTD_getCParams(i, blockSize, 0);
             BMK_seed(winners, params, srcBuffer, srcSize, ctx);
-        }
-    }
+    }   }
     BMK_printWinners(f, winners, srcSize);
 
     /* start tests */
-    {
-        const int milliStart = BMK_GetMilliStart();
+    {   const time_t grillStart = time(NULL);
         do {
             BMK_selectRandomStart(f, winners, srcBuffer, srcSize, ctx);
-        } while (BMK_GetMilliSpan(milliStart) < g_grillDuration);
+        } while (BMK_timeSpan(grillStart) < g_grillDuration_s);
     }
 
     /* end summary */
@@ -704,8 +652,8 @@ static void BMK_benchMem(void* srcBuffer, size_t srcSize)
 static int benchSample(void)
 {
     void* origBuff;
-    size_t benchedSize = sampleSize;
-    const char* name = "Sample 10MiB";
+    size_t const benchedSize = sampleSize;
+    const char* const name = "Sample 10MiB";
 
     /* Allocation */
     origBuff = malloc(benchedSize);
@@ -724,37 +672,31 @@ static int benchSample(void)
 }
 
 
-int benchFiles(char** fileNamesTable, int nbFiles)
+int benchFiles(const char** fileNamesTable, int nbFiles)
 {
     int fileIdx=0;
 
     /* Loop for each file */
     while (fileIdx<nbFiles) {
-        FILE* inFile;
-        char* inFileName;
-        U64   inFileSize;
+        const char* const inFileName = fileNamesTable[fileIdx++];
+        FILE* const inFile = fopen( inFileName, "rb" );
+        U64 const inFileSize = UTIL_getFileSize(inFileName);
         size_t benchedSize;
-        size_t readSize;
-        char* origBuff;
+        void* origBuff;
 
         /* Check file existence */
-        inFileName = fileNamesTable[fileIdx++];
-        inFile = fopen( inFileName, "rb" );
         if (inFile==NULL) {
             DISPLAY( "Pb opening %s\n", inFileName);
             return 11;
         }
 
-        /* Memory allocation & restrictions */
-        inFileSize = UTIL_getFileSize(inFileName);
+        /* Memory allocation */
         benchedSize = BMK_findMaxMem(inFileSize*3) / 3;
         if ((U64)benchedSize > inFileSize) benchedSize = (size_t)inFileSize;
         if (benchedSize < inFileSize)
             DISPLAY("Not enough memory for '%s' full size; testing %i MB only...\n", inFileName, (int)(benchedSize>>20));
-
-        /* Alloc */
-        origBuff = (char*) malloc((size_t)benchedSize);
-        if(!origBuff) {
+        origBuff = malloc(benchedSize);
+        if (origBuff==NULL) {
             DISPLAY("\nError: not enough memory!\n");
             fclose(inFile);
             return 12;
@@ -762,49 +704,44 @@ int benchFiles(char** fileNamesTable, int nbFiles)
 
         /* Fill input buffer */
         DISPLAY("Loading %s...       \r", inFileName);
-        readSize = fread(origBuff, 1, benchedSize, inFile);
-        fclose(inFile);
-
-        if(readSize != benchedSize) {
-            DISPLAY("\nError: problem reading file '%s' !!    \n", inFileName);
-            free(origBuff);
-            return 13;
-        }
+        {   size_t const readSize = fread(origBuff, 1, benchedSize, inFile);
+            fclose(inFile);
+            if(readSize != benchedSize) {
+                DISPLAY("\nError: problem reading file '%s' !!    \n", inFileName);
+                free(origBuff);
+                return 13;
+        }   }
 
         /* bench */
         DISPLAY("\r%79s\r", "");
         DISPLAY("using %s : \n", inFileName);
         BMK_benchMem(origBuff, benchedSize);
+
+        /* clean */
+        free(origBuff);
     }
 
     return 0;
 }
 
 
-int optimizeForSize(char* inFileName)
+int optimizeForSize(const char* inFileName, U32 targetSpeed)
 {
-    FILE* inFile;
-    U64   inFileSize;
-    size_t benchedSize;
-    size_t readSize;
-    char* origBuff;
-
-    /* Check file existence */
-    inFile = fopen( inFileName, "rb" );
-    if (inFile==NULL) {
-        DISPLAY( "Pb opening %s\n", inFileName);
-        return 11;
-    }
+    FILE* const inFile = fopen( inFileName, "rb" );
+    U64 const inFileSize = UTIL_getFileSize(inFileName);
+    size_t benchedSize = BMK_findMaxMem(inFileSize*3) / 3;
+    void* origBuff;
+
+    /* Init */
+    if (inFile==NULL) { DISPLAY( "Pb opening %s\n", inFileName); return 11; }
 
     /* Memory allocation & restrictions */
-    inFileSize = UTIL_getFileSize(inFileName);
-    benchedSize = (size_t) BMK_findMaxMem(inFileSize*3) / 3;
     if ((U64)benchedSize > inFileSize) benchedSize = (size_t)inFileSize;
     if (benchedSize < inFileSize)
         DISPLAY("Not enough memory for '%s' full size; testing %i MB only...\n", inFileName, (int)(benchedSize>>20));
 
     /* Alloc */
-    origBuff = (char*) malloc((size_t)benchedSize);
+    origBuff = malloc(benchedSize);
     if(!origBuff) {
         DISPLAY("\nError: not enough memory!\n");
         fclose(inFile);
@@ -813,39 +750,40 @@ int optimizeForSize(char* inFileName)
 
     /* Fill input buffer */
     DISPLAY("Loading %s...       \r", inFileName);
-    readSize = fread(origBuff, 1, benchedSize, inFile);
-    fclose(inFile);
-
-    if(readSize != benchedSize) {
-        DISPLAY("\nError: problem reading file '%s' !!    \n", inFileName);
-        free(origBuff);
-        return 13;
-    }
+    {   size_t const readSize = fread(origBuff, 1, benchedSize, inFile);
+        fclose(inFile);
+        if(readSize != benchedSize) {
+            DISPLAY("\nError: problem reading file '%s' !!    \n", inFileName);
+            free(origBuff);
+            return 13;
+    }   }
 
     /* bench */
     DISPLAY("\r%79s\r", "");
-    DISPLAY("optimizing for %s : \n", inFileName);
+    DISPLAY("optimizing for %s - limit speed %u MB/s \n", inFileName, targetSpeed);
+    targetSpeed *= 1000000;   /* MB/s -> bytes/s, the unit of BMK_result_t.cSpeed */
 
-    {
-        ZSTD_CCtx* ctx = ZSTD_createCCtx();
+    {   ZSTD_CCtx* const ctx = ZSTD_createCCtx();
         ZSTD_compressionParameters params;
         winnerInfo_t winner;
         BMK_result_t candidate;
         const size_t blockSize = g_blockSize ? g_blockSize : benchedSize;
-        int i;
 
         /* init */
+        if (ctx==NULL) { DISPLAY("\n ZSTD_createCCtx error \n"); free(origBuff); return 14;}
         memset(&winner, 0, sizeof(winner));
         winner.result.cSize = (size_t)(-1);
 
         /* find best solution from default params */
-        {
-            const int maxSeeds = g_noSeed ? 1 : ZSTD_maxCLevel();
+        {   const int maxSeeds = g_noSeed ? 1 : ZSTD_maxCLevel();
+            int i;
             for (i=1; i<=maxSeeds; i++) {
                 params = ZSTD_getCParams(i, blockSize, 0);
                 BMK_benchParam(&candidate, origBuff, benchedSize, ctx, params);
+                if (candidate.cSpeed < targetSpeed)
+                    break;
                 if ( (candidate.cSize < winner.result.cSize)
-                   ||((candidate.cSize == winner.result.cSize) && (candidate.cSpeed > winner.result.cSpeed)) )
+                   | ((candidate.cSize == winner.result.cSize) & (candidate.cSpeed > winner.result.cSpeed)) )
                 {
                     winner.params = params;
                     winner.result = candidate;
@@ -855,12 +793,11 @@ int optimizeForSize(char* inFileName)
         BMK_printWinner(stdout, 99, winner.result, winner.params, benchedSize);
 
         /* start tests */
-        {
-            const int milliStart = BMK_GetMilliStart();
+        {   time_t const grillStart = time(NULL);
             do {
                 params = winner.params;
                 paramVariation(&params);
-                if ((FUZ_rand(&g_rand) & 15) == 1) params = randomParams();
+                if ((FUZ_rand(&g_rand) & 15) == 3) params = randomParams();
 
                 /* exclude faster if already played set of params */
                 if (FUZ_rand(&g_rand) & ((1 << NB_TESTS_PLAYED(params))-1)) continue;
@@ -870,13 +807,15 @@ int optimizeForSize(char* inFileName)
                 BMK_benchParam(&candidate, origBuff, benchedSize, ctx, params);
 
                 /* improvement found => new winner */
-                if ( (candidate.cSize < winner.result.cSize)
-                   ||((candidate.cSize == winner.result.cSize) && (candidate.cSpeed > winner.result.cSpeed)) ) {
+                if ( (candidate.cSpeed > targetSpeed)
+                   & ( (candidate.cSize < winner.result.cSize)
+                     | ((candidate.cSize == winner.result.cSize) & (candidate.cSpeed > winner.result.cSpeed)) )  )
+                {
                     winner.params = params;
                     winner.result = candidate;
                     BMK_printWinner(stdout, 99, winner.result, winner.params, benchedSize);
                 }
-            } while (BMK_GetMilliSpan(milliStart) < g_grillDuration);
+            } while (BMK_timeSpan(grillStart) < g_grillDuration_s);
         }
 
         /* end summary */
@@ -887,11 +826,12 @@ int optimizeForSize(char* inFileName)
         ZSTD_freeCCtx(ctx);
     }
 
+    free(origBuff);
     return 0;
 }
 
 
-static int usage(char* exename)
+static int usage(const char* exename)
 {
     DISPLAY( "Usage :\n");
     DISPLAY( "      %s [arg] file\n", exename);
@@ -904,29 +844,32 @@ static int usage(char* exename)
 static int usage_advanced(void)
 {
     DISPLAY( "\nAdvanced options :\n");
-    DISPLAY( " -i#    : iteration loops [1-9](default : %i)\n", NBLOOPS);
-    DISPLAY( " -B#    : cut input into blocks of size # (default : single block)\n");
-    DISPLAY( " -P#    : generated sample compressibility (default : %.1f%%)\n", COMPRESSIBILITY_DEFAULT * 100);
-    DISPLAY( " -S     : Single run\n");
+    DISPLAY( " -T#    : set level 1 speed objective \n");
+    DISPLAY( " -B#    : cut input into blocks of size # (default : single block) \n");
+    DISPLAY( " -i#    : iteration loops [1-9](default : %i) \n", NBLOOPS);
+    DISPLAY( " -O#    : find Optimized parameters for # target speed (default : 0) \n");
+    DISPLAY( " -S     : Single run \n");
+    DISPLAY( " -P#    : generated sample compressibility (default : %.1f%%) \n", COMPRESSIBILITY_DEFAULT * 100);
     return 0;
 }
 
-static int badusage(char* exename)
+static int badusage(const char* exename)
 {
     DISPLAY("Wrong parameters\n");
     usage(exename);
     return 1;
 }
 
-int main(int argc, char** argv)
+int main(int argc, const char** argv)
 {
     int i,
         filenamesStart=0,
         result;
-    char* exename=argv[0];
-    char* input_filename=0;
+    const char* exename=argv[0];
+    const char* input_filename=0;
     U32 optimizer = 0;
     U32 main_pause = 0;
+    U32 targetSpeed = 0;
 
     /* checks */
     if (NB_LEVELS_TRACKED <= ZSTD_maxCLevel()) {
@@ -940,7 +883,7 @@ int main(int argc, char** argv)
     if (argc<1) { badusage(exename); return 1; }
 
     for(i=1; i<argc; i++) {
-        char* argument = argv[i];
+        const char* argument = argv[i];
 
         if(!argument) continue;   /* Protection if argument empty */
 
@@ -964,7 +907,7 @@ int main(int argc, char** argv)
                     /* Modify Nb Iterations */
                 case 'i':
                     argument++;
-                    if ((argument[0] >='0') && (argument[0] <='9'))
+                    if ((argument[0] >='0') & (argument[0] <='9'))
                         g_nbIterations = *argument++ - '0';
                     break;
 
@@ -972,7 +915,7 @@ int main(int argc, char** argv)
                 case 'P':
                     argument++;
                     {   U32 proba32 = 0;
-                        while ((argument[0]>= '0') && (argument[0]<= '9'))
+                        while ((argument[0]>= '0') & (argument[0]<= '9'))
                             proba32 = (proba32*10) + (*argument++ - '0');
                         g_compressibility = (double)proba32 / 100.;
                     }
@@ -981,6 +924,9 @@ int main(int argc, char** argv)
                 case 'O':
                     argument++;
                     optimizer=1;
+                    targetSpeed = 0;
+                    while ((*argument >= '0') & (*argument <= '9'))
+                        targetSpeed = (targetSpeed*10) + (*argument++ - '0');
                     break;
 
                     /* Run Single conf */
@@ -1058,7 +1004,7 @@ int main(int argc, char** argv)
                 case 'B':
                     g_blockSize = 0;
                     argument++;
-                    while ((*argument >='0') && (*argument <='9'))
+                    while ((*argument >='0') & (*argument <='9'))
                         g_blockSize = (g_blockSize*10) + (*argument++ - '0');
                     if (*argument=='K') g_blockSize<<=10, argument++;  /* allows using KB notation */
                     if (*argument=='M') g_blockSize<<=20, argument++;
@@ -1081,7 +1027,7 @@ int main(int argc, char** argv)
         result = benchSample();
     else {
         if (optimizer)
-            result = optimizeForSize(input_filename);
+            result = optimizeForSize(input_filename, targetSpeed);
         else
             result = benchFiles(argv+filenamesStart, argc-filenamesStart);
     }
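
Note on the paramgrill.c changes above: timing moves from millisecond counters to clock(), so speeds are now held as doubles in bytes per second and divided by 1000000 for display. The sketch below illustrates that unit handling under stated assumptions. clock_span mirrors BMK_clockSpan from the patch, while bytes_per_second and megabytes_per_second are hypothetical helpers that do not exist in the source.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Processor time elapsed since cStart, in clock ticks (same idea as BMK_clockSpan). */
static clock_t clock_span(clock_t cStart) { return clock() - cStart; }

/* Hypothetical helper: turn one timed round (nbLoops passes over srcSize bytes)
 * into a speed in bytes per second. */
static double bytes_per_second(size_t srcSize, clock_t roundClock, int nbLoops)
{
    double secondsPerLoop;
    assert(roundClock > 0 && nbLoops > 0);
    secondsPerLoop = ((double)roundClock / CLOCKS_PER_SEC) / nbLoops;
    return (double)srcSize / secondsPerLoop;
}

/* Hypothetical helper: display unit used by the patch (1 MB = 1 000 000 bytes). */
static double megabytes_per_second(double bytesPerSecond)
{
    return bytesPerSecond / 1000000.;
}
```

Because clock() measures processor time rather than wall-clock time, the patch keeps a separate time()-based helper (BMK_timeSpan) for the multi-hour grill duration.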
diff --git a/programs/playTests.sh b/programs/playTests.sh
index 8afd9cb..1fc508f 100755
--- a/programs/playTests.sh
+++ b/programs/playTests.sh
@@ -16,7 +16,7 @@ roundTripTest() {
     rm -f tmp1 tmp2
     $ECHO "roundTripTest: ./datagen $1 $p | $ZSTD -v$c | $ZSTD -d"
     ./datagen $1 $p | $MD5SUM > tmp1
-    ./datagen $1 $p | $ZSTD -vq$c | $ZSTD -d  | $MD5SUM > tmp2
+    ./datagen $1 $p | $ZSTD -v$c | $ZSTD -d  | $MD5SUM > tmp2
     diff -q tmp1 tmp2
 }
 
@@ -96,6 +96,13 @@ cat hello.zstd world.zstd > helloworld.zstd
 $ZSTD -dc helloworld.zstd > result.tmp
 cat result.tmp
 sdiff helloworld.tmp result.tmp
+$ECHO "frame concatenation without checksum"
+$ZSTD -c hello.tmp > hello.zstd --no-check
+$ZSTD -c world.tmp > world.zstd --no-check
+cat hello.zstd world.zstd > helloworld.zstd
+$ZSTD -dc helloworld.zstd > result.tmp
+cat result.tmp
+sdiff helloworld.tmp result.tmp
 rm ./*.tmp ./*.zstd
 $ECHO "frame concatenation tests completed"
 
@@ -142,8 +149,8 @@ $ECHO "\n**** multiple files tests **** "
 ./datagen -s1        > tmp1 2> $INTOVOID
 ./datagen -s2 -g100K > tmp2 2> $INTOVOID
 ./datagen -s3 -g1M   > tmp3 2> $INTOVOID
-$ZSTD -f tmp*
 $ECHO "compress tmp* : "
+$ZSTD -f tmp*
 ls -ls tmp*
 rm tmp1 tmp2 tmp3
 $ECHO "decompress tmp* : "
@@ -204,8 +211,16 @@ $ZSTD -t tmp1.zst
 $ZSTD --test tmp1.zst
 $ECHO "test multiple files (*.zst) "
 $ZSTD -t *.zst
-$ECHO "test good and bad files (*) "
+$ECHO "test bad files (*) "
 $ZSTD -t * && die "bad files not detected !"
+$ZSTD -t tmp1 && die "bad file not detected !"
+cp tmp1 tmp2.zst
+$ZSTD -t tmp2.zst && die "bad file not detected !"
+./datagen -g0 > tmp3
+$ZSTD -t tmp3 && die "bad file not detected !"   # detects 0-sized files as bad
+$ECHO "test --rm and --test combined "
+$ZSTD -t --rm tmp1.zst
+ls -ls tmp1.zst  # check file is still present
 
 
 $ECHO "\n**** zstd round-trip tests **** "
diff --git a/programs/util.h b/programs/util.h
index 2b739dc..72a40ca 100644
--- a/programs/util.h
+++ b/programs/util.h
@@ -284,6 +284,7 @@ UTIL_STATIC int UTIL_prepareFileList(const char *dirName, char** bufStart, size_
         return 0;
     }
 
+    errno = 0;
     while ((entry = readdir(dir)) != NULL) {
         if (strcmp (entry->d_name, "..") == 0 ||
             strcmp (entry->d_name, ".") == 0) continue;
@@ -310,8 +311,14 @@ UTIL_STATIC int UTIL_prepareFileList(const char *dirName, char** bufStart, size_
             }
          //   printf ("%s/%s nbFiles=%d left=%d\n", dirName, entry->d_name, nbFiles, (int)(bufEnd - *bufStart));
         }
+        errno = 0; // clear errno after UTIL_isDirectory, UTIL_prepareFileList
     }
 
+    if (errno != 0) {
+        fprintf(stderr, "readdir(%s) error: %s\n", dirName, strerror(errno));
+        free(*bufStart);
+        *bufStart = NULL;
+    }
     closedir(dir);
     return nbFiles;
 }
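
Note on the util.h hunk above: readdir() returns NULL both at end-of-directory and on failure, so the only way to tell the two apart is to clear errno before the call and inspect it afterwards. The following is a minimal standalone sketch of that pattern, not the zstd code itself; `count_dir_entries` is a hypothetical helper introduced only for illustration.

```c
#include <dirent.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of zstd): counts the entries of dirName,
 * skipping "." and "..". Returns -1 if the directory cannot be opened or
 * if readdir() reports an error part-way through. */
static int count_dir_entries(const char* dirName)
{
    DIR* const dir = opendir(dirName);
    struct dirent* entry;
    int nbEntries = 0;
    int readError;

    if (dir == NULL) {
        fprintf(stderr, "opendir(%s) error: %s\n", dirName, strerror(errno));
        return -1;
    }

    errno = 0;   /* readdir() returns NULL for both end-of-directory and error */
    while ((entry = readdir(dir)) != NULL) {
        if (strcmp(entry->d_name, ".") == 0 ||
            strcmp(entry->d_name, "..") == 0) continue;
        nbEntries++;
        errno = 0;   /* work done in the loop body may have set errno */
    }
    readError = errno;   /* save before closedir() can overwrite it */

    closedir(dir);
    if (readError != 0) {
        fprintf(stderr, "readdir(%s) error: %s\n", dirName, strerror(readError));
        return -1;
    }
    return nbEntries;
}
```

The sketch saves errno into a local before calling closedir(), which is a slightly more defensive ordering than the hunk uses; the reset at the end of the loop body mirrors the patch's "clear errno after UTIL_isDirectory, UTIL_prepareFileList" line.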
diff --git a/programs/zbufftest.c b/programs/zbufftest.c
index 41dfa33..ce6beb2 100644
--- a/programs/zbufftest.c
+++ b/programs/zbufftest.c
@@ -381,13 +381,9 @@ static int fuzzerTests(U32 seed, U32 nbTests, unsigned startTest, double compres
             {   size_t const dictStart = FUZ_rand(&lseed) % (srcBufferSize - dictSize);
                 dict = srcBuffer + dictStart;
             }
-            {   ZSTD_compressionParameters const cPar = ZSTD_getCParams(cLevel, 0, dictSize);
-                U32 const checksum = FUZ_rand(&lseed) & 1;
-                U32 const noDictIDFlag = FUZ_rand(&lseed) & 1;
-                ZSTD_frameParameters const fPar = { 0, checksum, noDictIDFlag };
-                ZSTD_parameters params;
-                params.cParams = cPar;
-                params.fParams = fPar;
+            {   ZSTD_parameters params = ZSTD_getParams(cLevel, 0, dictSize);
+                params.fParams.checksumFlag = FUZ_rand(&lseed) & 1;
+                params.fParams.noDictIDFlag = FUZ_rand(&lseed) & 1;
                 {   size_t const initError = ZBUFF_compressInit_advanced(zc, dict, dictSize, params, 0);
                     CHECK (ZBUFF_isError(initError),"init error : %s", ZBUFF_getErrorName(initError));
         }   }   }
@@ -428,23 +424,22 @@ static int fuzzerTests(U32 seed, U32 nbTests, unsigned startTest, double compres
                 U32 const enoughDstSize = dstBuffSize >= remainingToFlush;
                 remainingToFlush = ZBUFF_compressEnd(zc, cBuffer+cSize, &dstBuffSize);
                 CHECK (ZBUFF_isError(remainingToFlush), "flush error : %s", ZBUFF_getErrorName(remainingToFlush));
-                //DISPLAY("flush %u bytes : still within context : %i \n", (U32)dstBuffSize, (int)remainingToFlush);
-                CHECK (enoughDstSize && remainingToFlush, "ZBUFF_compressEnd() not fully flushed, but enough space available");
+                CHECK (enoughDstSize && remainingToFlush, "ZBUFF_compressEnd() not fully flushed (%u remaining), but enough space available", (U32)remainingToFlush);
                 cSize += dstBuffSize;
         }   }
         crcOrig = XXH64_digest(&xxhState);
 
         /* multi - fragments decompression test */
         ZBUFF_decompressInitDictionary(zd, dict, dictSize);
-        for (totalCSize = 0, totalGenSize = 0 ; totalCSize < cSize ; ) {
+        errorCode = 1;
+        for (totalCSize = 0, totalGenSize = 0 ; errorCode ; ) {
             size_t readCSrcSize = FUZ_randomLength(&lseed, maxSampleLog);
             size_t const randomDstSize = FUZ_randomLength(&lseed, maxSampleLog);
             size_t dstBuffSize = MIN(dstBufferSize - totalGenSize, randomDstSize);
-            size_t const decompressError = ZBUFF_decompressContinue(zd, dstBuffer+totalGenSize, &dstBuffSize, cBuffer+totalCSize, &readCSrcSize);
-            CHECK (ZBUFF_isError(decompressError), "decompression error : %s", ZBUFF_getErrorName(decompressError));
+            errorCode = ZBUFF_decompressContinue(zd, dstBuffer+totalGenSize, &dstBuffSize, cBuffer+totalCSize, &readCSrcSize);
+            CHECK (ZBUFF_isError(errorCode), "decompression error : %s", ZBUFF_getErrorName(errorCode));
             totalGenSize += dstBuffSize;
             totalCSize += readCSrcSize;
-            errorCode = decompressError;   /* needed for != 0 last test */
         }
         CHECK (errorCode != 0, "frame not fully decoded");
         CHECK (totalGenSize != totalTestSize, "decompressed data : wrong size")
diff --git a/programs/zstd.1 b/programs/zstd.1
index d7760f7..d2dfc3c 100644
--- a/programs/zstd.1
+++ b/programs/zstd.1
@@ -33,17 +33,17 @@ It is based on the \fBLZ77\fR family, with further FSE & huff0 entropy stages.
 It also features a very fast decoder, with speed > 500 MB/s per core.
 
 \fBzstd\fR command line is generally similar to gzip, but features the following differences :
- - Original files are preserved
+ - Source files are preserved by default
+   They can be removed automatically with the \fB--rm\fR option
  - By default, when compressing a single file, \fBzstd\fR displays progress notifications and result summary.
      Use \fB-q\fR to turn them off
 
 
-\fBzstd\fR supports the following options :
 
 .SH OPTIONS
 .TP
 .B \-#
- # compression level [1-22] (default:1)
+ # compression level [1-22] (default:3)
 .TP
 .BR \-d ", " --decompress
  decompression
@@ -57,6 +57,19 @@ It also features a very fast decoder, with speed > 500 MB/s per core.
 .BR \-f ", " --force
  overwrite output without prompting
 .TP
+.BR \-c ", " --stdout
+ force write to standard output, even if it is the console
+.TP
+.BR \--rm
+ remove source file(s) after successful compression or decompression
+.TP
+.BR \-k ", " --keep
+ keep source file(s) after successful compression or decompression.
+ This is the default behavior.
+.TP
+.BR \-r
+ operate recursively on directories
+.TP
 .BR \-h/\-H ", " --help
  display help/long help and exit
 .TP
@@ -67,16 +80,14 @@ It also features a very fast decoder, with speed > 500 MB/s per core.
  verbose mode
 .TP
 .BR \-q ", " --quiet
- suppress warnings and notifications; specify twice to suppress errors too
-.TP
-.BR \-c ", " --stdout
- force write to standard output, even if it is the console
+ suppress warnings, interactivity and notifications.
+ specify twice to suppress errors too.
 .TP
 .BR \-C ", " --check
  add integrity check computed from uncompressed data
 .TP
 .BR \-t ", " --test
- Test the integrity of compressed files.  This option is equivalent to \fB--decompress --stdout > /dev/null\fR.
+ Test the integrity of compressed files. This option is equivalent to \fB--decompress --stdout > /dev/null\fR.
  No files are created or removed.
 
 .SH DICTIONARY
@@ -121,9 +132,6 @@ Typical gains range from ~10% (at 64KB) to x5 better (at <1KB).
 .TP
 .B \-B#
  cut file into independent blocks of size # (default: no block)
-.TP
-.B \-r#
- test all compression levels from 1 to # (default: disabled)
 
 
 .SH BUGS
diff --git a/programs/zstdcli.c b/programs/zstdcli.c
index bf40dad..4668232 100644
--- a/programs/zstdcli.c
+++ b/programs/zstdcli.c
@@ -29,11 +29,20 @@
 
 
 /*-************************************
+*  Tuning parameters
+**************************************/
+#ifndef ZSTDCLI_CLEVEL_DEFAULT
+#  define ZSTDCLI_CLEVEL_DEFAULT 3
+#endif
+
+
+/*-************************************
 *  Includes
 **************************************/
 #include "util.h"     /* Compiler options, UTIL_HAS_CREATEFILELIST */
 #include <string.h>   /* strcmp, strlen */
 #include <ctype.h>    /* toupper */
+#include <errno.h>    /* errno */
 #include "fileio.h"
 #ifndef ZSTD_NOBENCH
 #  include "bench.h"  /* BMK_benchFiles, BMK_SetNbIterations */
@@ -45,7 +54,6 @@
 #include "zstd.h"     /* ZSTD_VERSION_STRING */
 
 
-
 /*-************************************
 *  OS-specific Includes
 **************************************/
@@ -53,12 +61,12 @@
 #  include <io.h>       /* _isatty */
 #  define IS_CONSOLE(stdStream) _isatty(_fileno(stdStream))
 #else
-#if defined(_POSIX_C_SOURCE) || defined(_XOPEN_SOURCE) || defined(_POSIX_SOURCE)
-#  include <unistd.h>   /* isatty */
-#  define IS_CONSOLE(stdStream) isatty(fileno(stdStream))
-#else
-#  define IS_CONSOLE(stdStream) 0
-#endif
+#  if defined(_POSIX_C_SOURCE) || defined(_XOPEN_SOURCE) || defined(_POSIX_SOURCE)
+#    include <unistd.h>   /* isatty */
+#    define IS_CONSOLE(stdStream) isatty(fileno(stdStream))
+#  else
+#    define IS_CONSOLE(stdStream) 0
+#  endif
 #endif
 
 
@@ -82,7 +90,7 @@
 
 static const char*    g_defaultDictName = "dictionary";
 static const unsigned g_defaultMaxDictSize = 110 KB;
-static const unsigned g_defaultDictCLevel = 5;
+static const int      g_defaultDictCLevel = 5;
 static const unsigned g_defaultSelectivityLevel = 9;
 
 
@@ -107,7 +115,7 @@ static int usage(const char* programName)
     DISPLAY( "          with no FILE, or when FILE is - , read standard input\n");
     DISPLAY( "Arguments :\n");
 #ifndef ZSTD_NOCOMPRESS
-    DISPLAY( " -#     : # compression level (1-%u, default:1) \n", ZSTD_maxCLevel());
+    DISPLAY( " -#     : # compression level (1-%u, default:%u) \n", ZSTD_maxCLevel(), ZSTDCLI_CLEVEL_DEFAULT);
 #endif
 #ifndef ZSTD_NODECOMPRESS
     DISPLAY( " -d     : decompression \n");
@@ -115,6 +123,8 @@ static int usage(const char* programName)
     DISPLAY( " -D file: use `file` as Dictionary \n");
     DISPLAY( " -o file: result stored into `file` (only if 1 input file) \n");
     DISPLAY( " -f     : overwrite output without prompting \n");
+    DISPLAY( "--rm    : remove source file(s) after successful de/compression \n");
+    DISPLAY( " -k     : preserve source file(s) (default) \n");
     DISPLAY( " -h/-H  : display help/long help and exit\n");
     return 0;
 }
@@ -132,7 +142,6 @@ static int usage_advanced(const char* programName)
 #ifdef UTIL_HAS_CREATEFILELIST
     DISPLAY( " -r     : operate recursively on directories\n");
 #endif
-    DISPLAY( "--rm    : remove source files after successful de/compression \n");
 #ifndef ZSTD_NOCOMPRESS
     DISPLAY( "--ultra : enable ultra modes (requires more memory to decompress)\n");
     DISPLAY( "--no-dictID : don't write dictID into header (dictionary compression)\n");
@@ -169,7 +178,6 @@ static int badusage(const char* programName)
     return 1;
 }
 
-
 static void waitEnter(void)
 {
     int unused;
@@ -181,7 +189,7 @@ static void waitEnter(void)
 /*! readU32FromChar() :
     @return : unsigned integer value reach from input in `char` format
     Will also modify `*stringPtr`, advancing it to position where it stopped reading.
-    Note : this function can overflow if result > MAX_UNIT */
+    Note : this function can overflow if result > MAX_UINT */
 static unsigned readU32FromChar(const char** stringPtr)
 {
     unsigned result = 0;
@@ -198,6 +206,7 @@ int main(int argCount, const char** argv)
     int argNb,
         bench=0,
         decode=0,
+        testmode=0,
         forceStdout=0,
         main_pause=0,
         nextEntryIsDictionary=0,
@@ -205,9 +214,10 @@ int main(int argCount, const char** argv)
         dictBuild=0,
         nextArgumentIsOutFileName=0,
         nextArgumentIsMaxDict=0,
-        nextArgumentIsDictID=0;
-    unsigned cLevel = 1;
-    unsigned cLevelLast = 1;
+        nextArgumentIsDictID=0,
+        nextArgumentIsFile=0;
+    int cLevel = ZSTDCLI_CLEVEL_DEFAULT;
+    int cLevelLast = 1;
     unsigned recursive = 0;
     const char** filenameTable = (const char**)malloc(argCount * sizeof(const char*));   /* argCount >= 1 */
     unsigned filenameIdx = 0;
@@ -217,7 +227,7 @@ int main(int argCount, const char** argv)
     char* dynNameSpace = NULL;
     unsigned maxDictSize = g_defaultMaxDictSize;
     unsigned dictID = 0;
-    unsigned dictCLevel = g_defaultDictCLevel;
+    int dictCLevel = g_defaultDictCLevel;
     unsigned dictSelect = g_defaultSelectivityLevel;
 #ifdef UTIL_HAS_CREATEFILELIST
     const char** fileNamesTable = NULL;
@@ -229,7 +239,7 @@ int main(int argCount, const char** argv)
     (void)recursive; (void)cLevelLast;    /* not used when ZSTD_NOBENCH set */
     (void)dictCLevel; (void)dictSelect; (void)dictID;  /* not used when ZSTD_NODICT set */
     (void)decode; (void)cLevel; /* not used when ZSTD_NOCOMPRESS set */
-    if (filenameTable==NULL) { DISPLAY("not enough memory\n"); exit(1); }
+    if (filenameTable==NULL) { DISPLAY("zstd: %s \n", strerror(errno)); exit(1); }
     filenameTable[0] = stdinmark;
     displayOut = stderr;
     /* Pick out program name from path. Don't rely on stdlib because of conflicting behavior */
@@ -247,142 +257,165 @@ int main(int argCount, const char** argv)
         const char* argument = argv[argNb];
         if(!argument) continue;   /* Protection if argument empty */
 
-        /* long commands (--long-word) */
-        if (!strcmp(argument, "--decompress")) { decode=1; continue; }
-        if (!strcmp(argument, "--force")) {  FIO_overwriteMode(); continue; }
-        if (!strcmp(argument, "--version")) { displayOut=stdout; DISPLAY(WELCOME_MESSAGE); CLEAN_RETURN(0); }
-        if (!strcmp(argument, "--help")) { displayOut=stdout; CLEAN_RETURN(usage_advanced(programName)); }
-        if (!strcmp(argument, "--verbose")) { displayLevel=4; continue; }
-        if (!strcmp(argument, "--quiet")) { displayLevel--; continue; }
-        if (!strcmp(argument, "--stdout")) { forceStdout=1; outFileName=stdoutmark; displayLevel=1; continue; }
-        if (!strcmp(argument, "--ultra")) { FIO_setMaxWLog(0); continue; }
-        if (!strcmp(argument, "--check")) { FIO_setChecksumFlag(2); continue; }
-        if (!strcmp(argument, "--no-check")) { FIO_setChecksumFlag(0); continue; }
-        if (!strcmp(argument, "--no-dictID")) { FIO_setDictIDFlag(0); continue; }
-        if (!strcmp(argument, "--sparse")) { FIO_setSparseWrite(2); continue; }
-        if (!strcmp(argument, "--no-sparse")) { FIO_setSparseWrite(0); continue; }
-        if (!strcmp(argument, "--test")) { decode=1; outFileName=nulmark; FIO_overwriteMode(); continue; }
-        if (!strcmp(argument, "--train")) { dictBuild=1; outFileName=g_defaultDictName; continue; }
-        if (!strcmp(argument, "--maxdict")) { nextArgumentIsMaxDict=1; continue; }
-        if (!strcmp(argument, "--dictID")) { nextArgumentIsDictID=1; continue; }
-        if (!strcmp(argument, "--keep")) { continue; }   /* does nothing, since preserving input is default; for gzip/xz compatibility */
-        if (!strcmp(argument, "--rm")) { FIO_setRemoveSrcFile(1); continue; }
-
-        /* '-' means stdin/stdout */
-        if (!strcmp(argument, "-")){
-            if (!filenameIdx) { filenameIdx=1, filenameTable[0]=stdinmark; outFileName=stdoutmark; continue; }
-        }
+        if (nextArgumentIsFile==0) {
+
+            /* long commands (--long-word) */
+            if (!strcmp(argument, "--")) { nextArgumentIsFile=1; continue; }
+            if (!strcmp(argument, "--decompress")) { decode=1; continue; }
+            if (!strcmp(argument, "--force")) {  FIO_overwriteMode(); continue; }
+            if (!strcmp(argument, "--version")) { displayOut=stdout; DISPLAY(WELCOME_MESSAGE); CLEAN_RETURN(0); }
+            if (!strcmp(argument, "--help")) { displayOut=stdout; CLEAN_RETURN(usage_advanced(programName)); }
+            if (!strcmp(argument, "--verbose")) { displayLevel++; continue; }
+            if (!strcmp(argument, "--quiet")) { displayLevel--; continue; }
+            if (!strcmp(argument, "--stdout")) { forceStdout=1; outFileName=stdoutmark; displayLevel-=(displayLevel==2); continue; }
+            if (!strcmp(argument, "--ultra")) { FIO_setMaxWLog(0); continue; }
+            if (!strcmp(argument, "--check")) { FIO_setChecksumFlag(2); continue; }
+            if (!strcmp(argument, "--no-check")) { FIO_setChecksumFlag(0); continue; }
+            if (!strcmp(argument, "--no-dictID")) { FIO_setDictIDFlag(0); continue; }
+            if (!strcmp(argument, "--sparse")) { FIO_setSparseWrite(2); continue; }
+            if (!strcmp(argument, "--no-sparse")) { FIO_setSparseWrite(0); continue; }
+            if (!strcmp(argument, "--test")) { testmode=1; decode=1; continue; }
+            if (!strcmp(argument, "--train")) { dictBuild=1; outFileName=g_defaultDictName; continue; }
+            if (!strcmp(argument, "--maxdict")) { nextArgumentIsMaxDict=1; continue; }
+            if (!strcmp(argument, "--dictID")) { nextArgumentIsDictID=1; continue; }
+            if (!strcmp(argument, "--keep")) { FIO_setRemoveSrcFile(0); continue; }
+            if (!strcmp(argument, "--rm")) { FIO_setRemoveSrcFile(1); continue; }
+
+            /* '-' means stdin/stdout */
+            if (!strcmp(argument, "-")){
+                if (!filenameIdx) {
+                    filenameIdx=1, filenameTable[0]=stdinmark;
+                    outFileName=stdoutmark;
+                    displayLevel-=(displayLevel==2);
+                    continue;
+            }   }
 
-        /* Decode commands (note : aggregated commands are allowed) */
-        if (argument[0]=='-') {
-            argument++;
+            /* Decode commands (note : aggregated commands are allowed) */
+            if (argument[0]=='-') {
+                argument++;
 
-            while (argument[0]!=0) {
-#ifndef ZSTD_NOCOMPRESS
-                /* compression Level */
-                if ((*argument>='0') && (*argument<='9')) {
-                    cLevel = readU32FromChar(&argument);
-                    dictCLevel = cLevel;
-                    if (dictCLevel > ZSTD_maxCLevel())
-                        CLEAN_RETURN(badusage(programName));
-                    continue;
-                }
-#endif
+                while (argument[0]!=0) {
+    #ifndef ZSTD_NOCOMPRESS
+                    /* compression Level */
+                    if ((*argument>='0') && (*argument<='9')) {
+                        cLevel = readU32FromChar(&argument);
+                        dictCLevel = cLevel;
+                        if (dictCLevel > ZSTD_maxCLevel())
+                            CLEAN_RETURN(badusage(programName));
+                        continue;
+                    }
+    #endif
 
-                switch(argument[0])
-                {
-                    /* Display help */
-                case 'V': displayOut=stdout; DISPLAY(WELCOME_MESSAGE); CLEAN_RETURN(0);   /* Version Only */
-                case 'H':
-                case 'h': displayOut=stdout; CLEAN_RETURN(usage_advanced(programName));
+                    switch(argument[0])
+                    {
+                        /* Display help */
+                    case 'V': displayOut=stdout; DISPLAY(WELCOME_MESSAGE); CLEAN_RETURN(0);   /* Version Only */
+                    case 'H':
+                    case 'h': displayOut=stdout; CLEAN_RETURN(usage_advanced(programName));
 
-                     /* Decoding */
-                case 'd': decode=1; argument++; break;
+                         /* Decoding */
+                    case 'd': decode=1; argument++; break;
 
-                    /* Force stdout, even if stdout==console */
-                case 'c': forceStdout=1; outFileName=stdoutmark; displayLevel=1; argument++; break;
+                        /* Force stdout, even if stdout==console */
+                    case 'c': forceStdout=1; outFileName=stdoutmark; displayLevel-=(displayLevel==2); argument++; break;
 
-                    /* Use file content as dictionary */
-                case 'D': nextEntryIsDictionary = 1; argument++; break;
+                        /* Use file content as dictionary */
+                    case 'D': nextEntryIsDictionary = 1; argument++; break;
 
-                    /* Overwrite */
-                case 'f': FIO_overwriteMode(); forceStdout=1; argument++; break;
+                        /* Overwrite */
+                    case 'f': FIO_overwriteMode(); forceStdout=1; argument++; break;
 
-                    /* Verbose mode */
-                case 'v': displayLevel=4; argument++; break;
+                        /* Verbose mode */
+                    case 'v': displayLevel++; argument++; break;
 
-                    /* Quiet mode */
-                case 'q': displayLevel--; argument++; break;
+                        /* Quiet mode */
+                    case 'q': displayLevel--; argument++; break;
 
-                    /* keep source file (default anyway, so useless; for gzip/xz compatibility) */
-                case 'k': argument++; break;
+                        /* keep source file (default); for gzip/xz compatibility */
+                    case 'k': FIO_setRemoveSrcFile(0); argument++; break;
 
-                    /* Checksum */
-                case 'C': argument++; FIO_setChecksumFlag(2); break;
+                        /* Checksum */
+                    case 'C': argument++; FIO_setChecksumFlag(2); break;
 
-                    /* test compressed file */
-                case 't': decode=1; outFileName=nulmark; argument++; break;
+                        /* test compressed file */
+                    case 't': testmode=1; decode=1; argument++; break;
 
-                    /* dictionary name */
-                case 'o': nextArgumentIsOutFileName=1; argument++; break;
+                        /* destination file name */
+                    case 'o': nextArgumentIsOutFileName=1; argument++; break;
 
-                    /* recursive */
-                case 'r': recursive=1; argument++; break;
+                        /* recursive */
+                    case 'r': recursive=1; argument++; break;
 
-#ifndef ZSTD_NOBENCH
-                    /* Benchmark */
-                case 'b': bench=1; argument++; break;
+    #ifndef ZSTD_NOBENCH
+                        /* Benchmark */
+                    case 'b': bench=1; argument++; break;
 
-                    /* range bench (benchmark only) */
-                case 'e':
-                        /* compression Level */
+                        /* range bench (benchmark only) */
+                    case 'e':
+                            /* compression Level */
+                            argument++;
+                            cLevelLast = readU32FromChar(&argument);
+                            break;
+
+                        /* Modify Nb Iterations (benchmark only) */
+                    case 'i':
                         argument++;
-                        cLevelLast = readU32FromChar(&argument);
+                        {   U32 const iters = readU32FromChar(&argument);
+                            BMK_setNotificationLevel(displayLevel);
+                            BMK_SetNbIterations(iters);
+                        }
                         break;
 
-                    /* Modify Nb Iterations (benchmark only) */
-                case 'i':
-                    argument++;
-                    {   U32 const iters = readU32FromChar(&argument);
-                        BMK_setNotificationLevel(displayLevel);
-                        BMK_SetNbIterations(iters);
-                    }
-                    break;
-
-                    /* cut input into blocks (benchmark only) */
-                case 'B':
-                    argument++;
-                    {   size_t bSize = readU32FromChar(&argument);
-                        if (toupper(*argument)=='K') bSize<<=10, argument++;  /* allows using KB notation */
-                        if (toupper(*argument)=='M') bSize<<=20, argument++;
-                        if (toupper(*argument)=='B') argument++;
-                        BMK_setNotificationLevel(displayLevel);
-                        BMK_SetBlockSize(bSize);
-                    }
-                    break;
-#endif   /* ZSTD_NOBENCH */
+                        /* cut input into blocks (benchmark only) */
+                    case 'B':
+                        argument++;
+                        {   size_t bSize = readU32FromChar(&argument);
+                            if (toupper(*argument)=='K') bSize<<=10, argument++;  /* allows using KB notation */
+                            if (toupper(*argument)=='M') bSize<<=20, argument++;
+                            if (toupper(*argument)=='B') argument++;
+                            BMK_setNotificationLevel(displayLevel);
+                            BMK_SetBlockSize(bSize);
+                        }
+                        break;
+    #endif   /* ZSTD_NOBENCH */
 
-                    /* Dictionary Selection level */
-                case 's':
-                    argument++;
-                    dictSelect = readU32FromChar(&argument);
-                    break;
+                        /* Dictionary Selection level */
+                    case 's':
+                        argument++;
+                        dictSelect = readU32FromChar(&argument);
+                        break;
 
-                    /* Pause at the end (-p) or set an additional param (-p#) (hidden option) */
-                case 'p': argument++;
-#ifndef ZSTD_NOBENCH
-                    if ((*argument>='0') && (*argument<='9')) {
-                        BMK_setAdditionalParam(readU32FromChar(&argument));
-                    } else
-#endif
-                        main_pause=1;
-                    break;
-                    /* unknown command */
-                default : CLEAN_RETURN(badusage(programName));
+                        /* Pause at the end (-p) or set an additional param (-p#) (hidden option) */
+                    case 'p': argument++;
+    #ifndef ZSTD_NOBENCH
+                        if ((*argument>='0') && (*argument<='9')) {
+                            BMK_setAdditionalParam(readU32FromChar(&argument));
+                        } else
+    #endif
+                            main_pause=1;
+                        break;
+                        /* unknown command */
+                    default : CLEAN_RETURN(badusage(programName));
+                    }
                 }
+                continue;
+            }   /* if (argument[0]=='-') */
+
+            if (nextArgumentIsMaxDict) {
+                nextArgumentIsMaxDict = 0;
+                maxDictSize = readU32FromChar(&argument);
+                if (toupper(*argument)=='K') maxDictSize <<= 10;
+                if (toupper(*argument)=='M') maxDictSize <<= 20;
+                continue;
+            }
+
+            if (nextArgumentIsDictID) {
+                nextArgumentIsDictID = 0;
+                dictID = readU32FromChar(&argument);
+                continue;
             }
-            continue;
-        }   /* if (argument[0]=='-') */
+
+        }   /* if (nextArgumentIsFile==0) */
 
         if (nextEntryIsDictionary) {
             nextEntryIsDictionary = 0;
@@ -397,20 +430,6 @@ int main(int argCount, const char** argv)
             continue;
         }
 
-        if (nextArgumentIsMaxDict) {
-            nextArgumentIsMaxDict = 0;
-            maxDictSize = readU32FromChar(&argument);
-            if (toupper(*argument)=='K') maxDictSize <<= 10;
-            if (toupper(*argument)=='M') maxDictSize <<= 20;
-            continue;
-        }
-
-        if (nextArgumentIsDictID) {
-            nextArgumentIsDictID = 0;
-            dictID = readU32FromChar(&argument);
-            continue;
-        }
-
         /* add filename to list */
         filenameTable[filenameIdx++] = argument;
     }
@@ -423,7 +442,7 @@ int main(int argCount, const char** argv)
         fileNamesTable = UTIL_createFileList(filenameTable, filenameIdx, &fileNamesBuf, &fileNamesNb);
         if (fileNamesTable) {
             unsigned i;
-            for (i=0; i<fileNamesNb; i++) DISPLAYLEVEL(3, "%d %s\n", i, fileNamesTable[i]);
+            for (i=0; i<fileNamesNb; i++) DISPLAYLEVEL(4, "%d %s\n", i, fileNamesTable[i]);
             free((void*)filenameTable);
             filenameTable = fileNamesTable;
             filenameIdx = fileNamesNb;
@@ -444,6 +463,7 @@ int main(int argCount, const char** argv)
     if (dictBuild) {
 #ifndef ZSTD_NODICT
         ZDICT_params_t dictParams;
+        memset(&dictParams, 0, sizeof(dictParams));
         dictParams.compressionLevel = dictCLevel;
         dictParams.selectivityLevel = dictSelect;
         dictParams.notificationLevel = displayLevel;
@@ -455,7 +475,7 @@ int main(int argCount, const char** argv)
 
     /* No input filename ==> use stdin and stdout */
     filenameIdx += !filenameIdx;   /*< default input is stdin */
-    if (!strcmp(filenameTable[0], stdinmark) && !outFileName ) outFileName = stdoutmark;   /*< when input is stdin, default output is stdout */
+    if (!strcmp(filenameTable[0], stdinmark) && !outFileName) outFileName = stdoutmark;   /*< when input is stdin, default output is stdout */
 
     /* Check if input/output defined as console; trigger an error in this case */
     if (!strcmp(filenameTable[0], stdinmark) && IS_CONSOLE(stdin) ) CLEAN_RETURN(badusage(programName));
@@ -470,7 +490,7 @@ int main(int argCount, const char** argv)
 
     /* No warning message in pipe mode (stdin + stdout) or multiple mode */
     if (!strcmp(filenameTable[0], stdinmark) && outFileName && !strcmp(outFileName,stdoutmark) && (displayLevel==2)) displayLevel=1;
-    if ((filenameIdx>1) && (displayLevel==2)) displayLevel=1;
+    if ((filenameIdx>1) & (displayLevel==2)) displayLevel=1;
 
     /* IO Stream/File */
     FIO_setNotificationLevel(displayLevel);
@@ -484,6 +504,7 @@ int main(int argCount, const char** argv)
 #endif
     {  /* decompression */
 #ifndef ZSTD_NODECOMPRESS
+        if (testmode) { outFileName=nulmark; FIO_setRemoveSrcFile(0); } /* test mode */
         if (filenameIdx==1 && outFileName)
             operationResult = FIO_decompressFilename(outFileName, filenameTable[0], dictFileName);
         else
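
Note on the zstdcli.c argument loop above: the new `nextArgumentIsFile` flag implements the usual "--" convention, where everything after a bare "--" is taken as a file name even if it begins with '-'. The following is a simplified sketch of that idea under stated assumptions, not the actual zstd parser: `split_operands` is a hypothetical helper, options are only counted rather than decoded, and a lone "-" is always treated as an operand.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not zstd code): copies the file operands of argv into
 * `files` and returns how many were found. `files` must have room for argc
 * entries. Any argument starting with '-' is counted in *nbOptions instead,
 * unless a bare "--" has already been seen. */
static size_t split_operands(int argc, const char** argv,
                             const char** files, size_t* nbOptions)
{
    size_t nbFiles = 0;
    int onlyFiles = 0;   /* set once "--" is seen */
    int i;

    *nbOptions = 0;
    for (i = 1; i < argc; i++) {
        const char* const arg = argv[i];
        if (!onlyFiles) {
            if (!strcmp(arg, "--")) { onlyFiles = 1; continue; }
            if (arg[0] == '-' && arg[1] != 0) { (*nbOptions)++; continue; }
        }
        files[nbFiles++] = arg;   /* includes a lone "-" (stdin marker) */
    }
    return nbFiles;
}
```

With an argument list such as `zstd -v -- -weird.txt --rm plain.txt`, this sketch counts `-v` as the only option and returns the three remaining strings as file operands, so `--rm` after the separator is a file name rather than a request to delete sources.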
diff --git a/projects/README.md b/projects/README.md
index b6831ce..c2fa747 100644
--- a/projects/README.md
+++ b/projects/README.md
@@ -1,4 +1,4 @@
-projects for various integrated development environments (IDE) 
+projects for various integrated development environments (IDE)
 ================================
 
 #### Included projects
@@ -7,3 +7,4 @@ The following projects are included with the zstd distribution:
 - cmake - CMake project contributed by Artyom Dymchenko
 - VS2008 - Visual Studio 2008 project
 - VS2010 - Visual Studio 2010 project (which also works well with Visual Studio 2012, 2013, 2015)
+- build - command line scripts prepared for Visual Studio compilation without IDE
diff --git a/projects/VS2008/fullbench/fullbench.vcproj b/projects/VS2008/fullbench/fullbench.vcproj
index 50cbcc2..b669539 100644
--- a/projects/VS2008/fullbench/fullbench.vcproj
+++ b/projects/VS2008/fullbench/fullbench.vcproj
@@ -44,7 +44,7 @@
 			<Tool
 				Name="VCCLCompilerTool"
 				Optimization="0"
-				AdditionalIncludeDirectories="$(SolutionDir)..\..\lib\common;$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\programs\legacy"
+				AdditionalIncludeDirectories="$(SolutionDir)..\..\lib;$(SolutionDir)..\..\lib\common;$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\programs\legacy"
 				PreprocessorDefinitions="WIN32;_DEBUG;_CONSOLE"
 				MinimalRebuild="true"
 				BasicRuntimeChecks="3"
@@ -120,7 +120,7 @@
 				Optimization="2"
 				EnableIntrinsicFunctions="true"
 				OmitFramePointers="true"
-				AdditionalIncludeDirectories="$(SolutionDir)..\..\lib\common;$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\programs\legacy"
+				AdditionalIncludeDirectories="$(SolutionDir)..\..\lib;$(SolutionDir)..\..\lib\common;$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\programs\legacy"
 				PreprocessorDefinitions="WIN32;NDEBUG;_CONSOLE"
 				RuntimeLibrary="0"
 				EnableFunctionLevelLinking="true"
@@ -194,7 +194,7 @@
 			<Tool
 				Name="VCCLCompilerTool"
 				Optimization="0"
-				AdditionalIncludeDirectories="$(SolutionDir)..\..\lib\common;$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\programs\legacy"
+				AdditionalIncludeDirectories="$(SolutionDir)..\..\lib;$(SolutionDir)..\..\lib\common;$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\programs\legacy"
 				PreprocessorDefinitions="WIN32;_DEBUG;_CONSOLE"
 				MinimalRebuild="true"
 				BasicRuntimeChecks="3"
@@ -271,7 +271,7 @@
 				Optimization="2"
 				EnableIntrinsicFunctions="true"
 				OmitFramePointers="true"
-				AdditionalIncludeDirectories="$(SolutionDir)..\..\lib\common;$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\programs\legacy"
+				AdditionalIncludeDirectories="$(SolutionDir)..\..\lib;$(SolutionDir)..\..\lib\common;$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\programs\legacy"
 				PreprocessorDefinitions="WIN32;NDEBUG;_CONSOLE"
 				RuntimeLibrary="0"
 				EnableFunctionLevelLinking="true"
@@ -427,7 +427,7 @@
 				>
 			</File>
 			<File
-				RelativePath="..\..\..\lib\common\zstd.h"
+				RelativePath="..\..\..\lib\zstd.h"
 				>
 			</File>
 			<File
diff --git a/projects/VS2008/fuzzer/fuzzer.vcproj b/projects/VS2008/fuzzer/fuzzer.vcproj
index ab0bab2..b88ae6d 100644
--- a/projects/VS2008/fuzzer/fuzzer.vcproj
+++ b/projects/VS2008/fuzzer/fuzzer.vcproj
@@ -44,7 +44,7 @@
 			<Tool
 				Name="VCCLCompilerTool"
 				Optimization="0"
-				AdditionalIncludeDirectories="$(SolutionDir)..\..\lib\common;$(SolutionDir)..\..\lib\dictBuilder;$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\programs\legacy"
+				AdditionalIncludeDirectories="$(SolutionDir)..\..\lib;$(SolutionDir)..\..\lib\common;$(SolutionDir)..\..\lib\dictBuilder;$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\programs\legacy"
 				PreprocessorDefinitions="WIN32;_DEBUG;_CONSOLE"
 				MinimalRebuild="true"
 				BasicRuntimeChecks="3"
@@ -120,7 +120,7 @@
 				Optimization="2"
 				EnableIntrinsicFunctions="true"
 				OmitFramePointers="true"
-				AdditionalIncludeDirectories="$(SolutionDir)..\..\lib\common;$(SolutionDir)..\..\lib\dictBuilder;$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\programs\legacy"
+				AdditionalIncludeDirectories="$(SolutionDir)..\..\lib;$(SolutionDir)..\..\lib\common;$(SolutionDir)..\..\lib\dictBuilder;$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\programs\legacy"
 				PreprocessorDefinitions="WIN32;NDEBUG;_CONSOLE"
 				RuntimeLibrary="0"
 				EnableFunctionLevelLinking="true"
@@ -194,7 +194,7 @@
 			<Tool
 				Name="VCCLCompilerTool"
 				Optimization="0"
-				AdditionalIncludeDirectories="$(SolutionDir)..\..\lib\common;$(SolutionDir)..\..\lib\dictBuilder;$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\programs\legacy"
+				AdditionalIncludeDirectories="$(SolutionDir)..\..\lib;$(SolutionDir)..\..\lib\common;$(SolutionDir)..\..\lib\dictBuilder;$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\programs\legacy"
 				PreprocessorDefinitions="WIN32;_DEBUG;_CONSOLE"
 				MinimalRebuild="true"
 				BasicRuntimeChecks="3"
@@ -271,7 +271,7 @@
 				Optimization="2"
 				EnableIntrinsicFunctions="true"
 				OmitFramePointers="true"
-				AdditionalIncludeDirectories="$(SolutionDir)..\..\lib\common;$(SolutionDir)..\..\lib\dictBuilder;$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\programs\legacy"
+				AdditionalIncludeDirectories="$(SolutionDir)..\..\lib;$(SolutionDir)..\..\lib\common;$(SolutionDir)..\..\lib\dictBuilder;$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\programs\legacy"
 				PreprocessorDefinitions="WIN32;NDEBUG;_CONSOLE"
 				RuntimeLibrary="0"
 				EnableFunctionLevelLinking="true"
@@ -439,7 +439,7 @@
 				>
 			</File>
 			<File
-				RelativePath="..\..\..\lib\common\zstd.h"
+				RelativePath="..\..\..\lib\zstd.h"
 				>
 			</File>
 			<File
diff --git a/projects/VS2008/zstd/zstd.vcproj b/projects/VS2008/zstd/zstd.vcproj
index b9b0d1e..85a9f6b 100644
--- a/projects/VS2008/zstd/zstd.vcproj
+++ b/projects/VS2008/zstd/zstd.vcproj
@@ -44,7 +44,7 @@
 			<Tool
 				Name="VCCLCompilerTool"
 				Optimization="0"
-				AdditionalIncludeDirectories="$(SolutionDir)..\..\lib\common;$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\programs\legacy;$(SolutionDir)..\..\lib\dictBuilder"
+				AdditionalIncludeDirectories="$(SolutionDir)..\..\lib;$(SolutionDir)..\..\lib\common;$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\programs\legacy;$(SolutionDir)..\..\lib\dictBuilder"
 				PreprocessorDefinitions="WIN32;_DEBUG;_CONSOLE"
 				MinimalRebuild="true"
 				BasicRuntimeChecks="3"
@@ -121,7 +121,7 @@
 				Optimization="2"
 				EnableIntrinsicFunctions="true"
 				OmitFramePointers="true"
-				AdditionalIncludeDirectories="$(SolutionDir)..\..\lib\common;$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\programs\legacy;$(SolutionDir)..\..\lib\dictBuilder"
+				AdditionalIncludeDirectories="$(SolutionDir)..\..\lib;$(SolutionDir)..\..\lib\common;$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\programs\legacy;$(SolutionDir)..\..\lib\dictBuilder"
 				PreprocessorDefinitions="WIN32;NDEBUG;_CONSOLE"
 				RuntimeLibrary="0"
 				EnableFunctionLevelLinking="true"
@@ -196,7 +196,7 @@
 			<Tool
 				Name="VCCLCompilerTool"
 				Optimization="0"
-				AdditionalIncludeDirectories="$(SolutionDir)..\..\lib\common;$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\programs\legacy;$(SolutionDir)..\..\lib\dictBuilder"
+				AdditionalIncludeDirectories="$(SolutionDir)..\..\lib;$(SolutionDir)..\..\lib\common;$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\programs\legacy;$(SolutionDir)..\..\lib\dictBuilder"
 				PreprocessorDefinitions="WIN32;_DEBUG;_CONSOLE"
 				MinimalRebuild="true"
 				BasicRuntimeChecks="3"
@@ -274,7 +274,7 @@
 				Optimization="2"
 				EnableIntrinsicFunctions="true"
 				OmitFramePointers="true"
-				AdditionalIncludeDirectories="$(SolutionDir)..\..\lib\common;$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\programs\legacy;$(SolutionDir)..\..\lib\dictBuilder"
+				AdditionalIncludeDirectories="$(SolutionDir)..\..\lib;$(SolutionDir)..\..\lib\common;$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\programs\legacy;$(SolutionDir)..\..\lib\dictBuilder"
 				PreprocessorDefinitions="WIN32;NDEBUG;_CONSOLE"
 				RuntimeLibrary="0"
 				EnableFunctionLevelLinking="true"
@@ -429,6 +429,10 @@
 				>
 			</File>
 			<File
+				RelativePath="..\..\..\lib\legacy\zstd_v07.c"
+				>
+			</File>
+			<File
 				RelativePath="..\..\..\programs\zstdcli.c"
 				>
 			</File>
@@ -495,7 +499,7 @@
 				>
 			</File>
 			<File
-				RelativePath="..\..\..\lib\common\zstd.h"
+				RelativePath="..\..\..\lib\zstd.h"
 				>
 			</File>
 			<File
@@ -538,6 +542,10 @@
 				RelativePath="..\..\..\lib\legacy\zstd_v06.h"
 				>
 			</File>
+			<File
+				RelativePath="..\..\..\lib\legacy\zstd_v07.h"
+				>
+			</File>
 		</Filter>
 	</Files>
 	<Globals>
diff --git a/projects/VS2008/zstdlib/zstdlib.vcproj b/projects/VS2008/zstdlib/zstdlib.vcproj
index 2051da5..db596b4 100644
--- a/projects/VS2008/zstdlib/zstdlib.vcproj
+++ b/projects/VS2008/zstdlib/zstdlib.vcproj
@@ -44,7 +44,7 @@
 			<Tool
 				Name="VCCLCompilerTool"
 				Optimization="0"
-				AdditionalIncludeDirectories="$(SolutionDir)..\..\lib\common;$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\programs\legacy;$(SolutionDir)..\..\lib\dictBuilder"
+				AdditionalIncludeDirectories="$(SolutionDir)..\..\lib;$(SolutionDir)..\..\lib\common;$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\programs\legacy;$(SolutionDir)..\..\lib\dictBuilder"
 				PreprocessorDefinitions="ZSTD_DLL_EXPORT=1;ZSTD_HEAPMODE=0;ZSTD_LEGACY_SUPPORT=0;WIN32;_DEBUG;_CONSOLE"
 				MinimalRebuild="true"
 				BasicRuntimeChecks="3"
@@ -120,7 +120,7 @@
 				Optimization="2"
 				EnableIntrinsicFunctions="true"
 				OmitFramePointers="true"
-				AdditionalIncludeDirectories="$(SolutionDir)..\..\lib\common;$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\programs\legacy;$(SolutionDir)..\..\lib\dictBuilder"
+				AdditionalIncludeDirectories="$(SolutionDir)..\..\lib;$(SolutionDir)..\..\lib\common;$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\programs\legacy;$(SolutionDir)..\..\lib\dictBuilder"
 				PreprocessorDefinitions="ZSTD_DLL_EXPORT=1;ZSTD_HEAPMODE=0;ZSTD_LEGACY_SUPPORT=0;WIN32;NDEBUG;_CONSOLE"
 				RuntimeLibrary="0"
 				EnableFunctionLevelLinking="true"
@@ -194,7 +194,7 @@
 			<Tool
 				Name="VCCLCompilerTool"
 				Optimization="0"
-				AdditionalIncludeDirectories="$(SolutionDir)..\..\lib\common;$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\programs\legacy;$(SolutionDir)..\..\lib\dictBuilder"
+				AdditionalIncludeDirectories="$(SolutionDir)..\..\lib;$(SolutionDir)..\..\lib\common;$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\programs\legacy;$(SolutionDir)..\..\lib\dictBuilder"
 				PreprocessorDefinitions="ZSTD_DLL_EXPORT=1;ZSTD_HEAPMODE=0;ZSTD_LEGACY_SUPPORT=0;WIN32;_DEBUG;_CONSOLE"
 				MinimalRebuild="true"
 				BasicRuntimeChecks="3"
@@ -271,7 +271,7 @@
 				Optimization="2"
 				EnableIntrinsicFunctions="true"
 				OmitFramePointers="true"
-				AdditionalIncludeDirectories="$(SolutionDir)..\..\lib\common;$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\programs\legacy;$(SolutionDir)..\..\lib\dictBuilder"
+				AdditionalIncludeDirectories="$(SolutionDir)..\..\lib;$(SolutionDir)..\..\lib\common;$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\programs\legacy;$(SolutionDir)..\..\lib\dictBuilder"
 				PreprocessorDefinitions="ZSTD_DLL_EXPORT=1;ZSTD_HEAPMODE=0;ZSTD_LEGACY_SUPPORT=0;WIN32;NDEBUG;_CONSOLE"
 				RuntimeLibrary="0"
 				EnableFunctionLevelLinking="true"
@@ -443,7 +443,7 @@
 				>
 			</File>
 			<File
-				RelativePath="..\..\..\lib\common\zstd.h"
+				RelativePath="..\..\..\lib\zstd.h"
 				>
 			</File>
 			<File
diff --git a/projects/VS2010/datagen/datagen.vcxproj.filters b/projects/VS2010/datagen/datagen.vcxproj.filters
deleted file mode 100644
index 1ebbd6b..0000000
--- a/projects/VS2010/datagen/datagen.vcxproj.filters
+++ /dev/null
@@ -1,26 +0,0 @@
-<?xml version="1.0" encoding="utf-8"?>
-<Project ToolsVersion="4.0" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
-  <ItemGroup>
-    <Filter Include="Header Files">
-      <UniqueIdentifier>{93995380-89BD-4b04-88EB-625FBE52EBFB}</UniqueIdentifier>
-      <Extensions>h;hpp;hxx;hm;inl;inc;xsd</Extensions>
-    </Filter>
-    <Filter Include="Source Files">
-      <UniqueIdentifier>{4FC737F1-C7A5-4376-A066-2A32D752A2FF}</UniqueIdentifier>
-      <Extensions>cpp;c;cc;cxx;def;odl;idl;hpj;bat;asm;asmx</Extensions>
-    </Filter>
-  </ItemGroup>
-  <ItemGroup>
-    <ClCompile Include="..\..\..\programs\datagen.c">
-      <Filter>Source Files</Filter>
-    </ClCompile>
-    <ClCompile Include="..\..\..\programs\datagencli.c">
-      <Filter>Source Files</Filter>
-    </ClCompile>
-  </ItemGroup>
-  <ItemGroup>
-    <ClInclude Include="..\..\..\programs\datagen.h">
-      <Filter>Header Files</Filter>
-    </ClInclude>
-  </ItemGroup>
-</Project>
\ No newline at end of file
diff --git a/projects/VS2010/fullbench/fullbench.vcxproj b/projects/VS2010/fullbench/fullbench.vcxproj
index 0cd32d8..159a58d 100644
--- a/projects/VS2010/fullbench/fullbench.vcxproj
+++ b/projects/VS2010/fullbench/fullbench.vcxproj
@@ -65,24 +65,24 @@
   <PropertyGroup Label="UserMacros" />
   <PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Debug|Win32'">
     <LinkIncremental>true</LinkIncremental>
-    <IncludePath>$(IncludePath);$(SolutionDir)..\..\programs\legacy;$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\lib\common;$(UniversalCRT_IncludePath);</IncludePath>
+    <IncludePath>$(IncludePath);$(SolutionDir)..\..\lib;$(SolutionDir)..\..\programs\legacy;$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\lib\common;$(UniversalCRT_IncludePath);</IncludePath>
     <RunCodeAnalysis>false</RunCodeAnalysis>
     <IntDir>$(Platform)\$(Configuration)\</IntDir>
   </PropertyGroup>
   <PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Debug|x64'">
     <LinkIncremental>true</LinkIncremental>
-    <IncludePath>$(IncludePath);$(SolutionDir)..\..\programs\legacy;$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\lib\common;$(UniversalCRT_IncludePath);</IncludePath>
+    <IncludePath>$(IncludePath);$(SolutionDir)..\..\lib;$(SolutionDir)..\..\programs\legacy;$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\lib\common;$(UniversalCRT_IncludePath);</IncludePath>
     <RunCodeAnalysis>false</RunCodeAnalysis>
   </PropertyGroup>
   <PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release|Win32'">
     <LinkIncremental>false</LinkIncremental>
-    <IncludePath>$(IncludePath);$(SolutionDir)..\..\programs\legacy;$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\lib\common;$(UniversalCRT_IncludePath);</IncludePath>
+    <IncludePath>$(IncludePath);$(SolutionDir)..\..\lib;$(SolutionDir)..\..\programs\legacy;$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\lib\common;$(UniversalCRT_IncludePath);</IncludePath>
     <RunCodeAnalysis>false</RunCodeAnalysis>
     <IntDir>$(Platform)\$(Configuration)\</IntDir>
   </PropertyGroup>
   <PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release|x64'">
     <LinkIncremental>false</LinkIncremental>
-    <IncludePath>$(IncludePath);$(SolutionDir)..\..\programs\legacy;$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\lib\common;$(UniversalCRT_IncludePath);</IncludePath>
+    <IncludePath>$(IncludePath);$(SolutionDir)..\..\lib;$(SolutionDir)..\..\programs\legacy;$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\lib\common;$(UniversalCRT_IncludePath);</IncludePath>
     <RunCodeAnalysis>false</RunCodeAnalysis>
   </PropertyGroup>
   <ItemDefinitionGroup Condition="'$(Configuration)|$(Platform)'=='Debug|Win32'">
@@ -175,7 +175,7 @@
     <ClInclude Include="..\..\..\lib\common\huf.h" />
     <ClInclude Include="..\..\..\lib\common\xxhash.h" />
     <ClInclude Include="..\..\..\lib\common\zbuff.h" />
-    <ClInclude Include="..\..\..\lib\common\zstd.h" />
+    <ClInclude Include="..\..\..\lib\zstd.h" />
     <ClInclude Include="..\..\..\lib\common\zstd_internal.h" />
     <ClInclude Include="..\..\..\lib\compress\zstd_opt.h" />
     <ClInclude Include="..\..\..\lib\legacy\zstd_legacy.h" />
@@ -185,4 +185,4 @@
   <Import Project="$(VCTargetsPath)\Microsoft.Cpp.targets" />
   <ImportGroup Label="ExtensionTargets">
   </ImportGroup>
-</Project>
\ No newline at end of file
+</Project>
diff --git a/projects/VS2010/fullbench/fullbench.vcxproj.filters b/projects/VS2010/fullbench/fullbench.vcxproj.filters
deleted file mode 100644
index a81b251..0000000
--- a/projects/VS2010/fullbench/fullbench.vcxproj.filters
+++ /dev/null
@@ -1,86 +0,0 @@
-<?xml version="1.0" encoding="utf-8"?>
-<Project ToolsVersion="4.0" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
-  <ItemGroup>
-    <Filter Include="Header Files">
-      <UniqueIdentifier>{93995380-89BD-4b04-88EB-625FBE52EBFB}</UniqueIdentifier>
-      <Extensions>h;hpp;hxx;hm;inl;inc;xsd</Extensions>
-    </Filter>
-    <Filter Include="Source Files">
-      <UniqueIdentifier>{4FC737F1-C7A5-4376-A066-2A32D752A2FF}</UniqueIdentifier>
-      <Extensions>cpp;c;cc;cxx;def;odl;idl;hpj;bat;asm;asmx</Extensions>
-    </Filter>
-  </ItemGroup>
-  <ItemGroup>
-    <ClCompile Include="..\..\..\lib\common\zstd_common.c">
-      <Filter>Source Files</Filter>
-    </ClCompile>
-    <ClCompile Include="..\..\..\lib\compress\fse_compress.c">
-      <Filter>Source Files</Filter>
-    </ClCompile>
-    <ClCompile Include="..\..\..\programs\fullbench.c">
-      <Filter>Source Files</Filter>
-    </ClCompile>
-    <ClCompile Include="..\..\..\programs\datagen.c">
-      <Filter>Source Files</Filter>
-    </ClCompile>
-    <ClCompile Include="..\..\..\lib\compress\huf_compress.c">
-      <Filter>Source Files</Filter>
-    </ClCompile>
-    <ClCompile Include="..\..\..\lib\decompress\huf_decompress.c">
-      <Filter>Source Files</Filter>
-    </ClCompile>
-    <ClCompile Include="..\..\..\lib\compress\zstd_compress.c">
-      <Filter>Source Files</Filter>
-    </ClCompile>
-    <ClCompile Include="..\..\..\lib\decompress\zstd_decompress.c">
-      <Filter>Source Files</Filter>
-    </ClCompile>
-    <ClCompile Include="..\..\..\lib\compress\zbuff_compress.c">
-      <Filter>Source Files</Filter>
-    </ClCompile>
-    <ClCompile Include="..\..\..\lib\decompress\zbuff_decompress.c">
-      <Filter>Source Files</Filter>
-    </ClCompile>
-    <ClCompile Include="..\..\..\lib\common\fse_decompress.c">
-      <Filter>Source Files</Filter>
-    </ClCompile>
-    <ClCompile Include="..\..\..\lib\common\entropy_common.c">
-      <Filter>Source Files</Filter>
-    </ClCompile>
-    <ClCompile Include="..\..\..\lib\common\xxhash.c">
-      <Filter>Source Files</Filter>
-    </ClCompile>
-  </ItemGroup>
-  <ItemGroup>
-    <ClInclude Include="..\..\..\lib\common\fse.h">
-      <Filter>Header Files</Filter>
-    </ClInclude>
-    <ClInclude Include="..\..\..\lib\common\zstd.h">
-      <Filter>Header Files</Filter>
-    </ClInclude>
-    <ClInclude Include="..\..\..\programs\datagen.h">
-      <Filter>Header Files</Filter>
-    </ClInclude>
-    <ClInclude Include="..\..\..\lib\common\huf.h">
-      <Filter>Header Files</Filter>
-    </ClInclude>
-    <ClInclude Include="..\..\..\lib\legacy\zstd_legacy.h">
-      <Filter>Header Files</Filter>
-    </ClInclude>
-    <ClInclude Include="..\..\..\lib\common\zbuff.h">
-      <Filter>Header Files</Filter>
-    </ClInclude>
-    <ClInclude Include="..\..\..\lib\compress\zstd_opt.h">
-      <Filter>Header Files</Filter>
-    </ClInclude>
-    <ClInclude Include="..\..\..\lib\common\zstd_internal.h">
-      <Filter>Header Files</Filter>
-    </ClInclude>
-    <ClInclude Include="..\..\..\programs\util.h">
-      <Filter>Header Files</Filter>
-    </ClInclude>
-    <ClInclude Include="..\..\..\lib\common\xxhash.h">
-      <Filter>Header Files</Filter>
-    </ClInclude>
-  </ItemGroup>
-</Project>
\ No newline at end of file
diff --git a/projects/VS2010/fuzzer/fuzzer.vcxproj b/projects/VS2010/fuzzer/fuzzer.vcxproj
index 5605257..5c8d800 100644
--- a/projects/VS2010/fuzzer/fuzzer.vcxproj
+++ b/projects/VS2010/fuzzer/fuzzer.vcxproj
@@ -66,24 +66,24 @@
   <PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Debug|Win32'">
     <LinkIncremental>true</LinkIncremental>
     <RunCodeAnalysis>false</RunCodeAnalysis>
-    <IncludePath>$(IncludePath);$(SolutionDir)..\..\programs\legacy;$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\lib\common;$(SolutionDir)..\..\lib\dictBuilder;$(UniversalCRT_IncludePath);</IncludePath>
+    <IncludePath>$(IncludePath);$(SolutionDir)..\..\lib;$(SolutionDir)..\..\programs\legacy;$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\lib\common;$(SolutionDir)..\..\lib\dictBuilder;$(UniversalCRT_IncludePath);</IncludePath>
     <IntDir>$(Platform)\$(Configuration)\</IntDir>
   </PropertyGroup>
   <PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Debug|x64'">
     <LinkIncremental>true</LinkIncremental>
     <RunCodeAnalysis>false</RunCodeAnalysis>
-    <IncludePath>$(IncludePath);$(SolutionDir)..\..\programs\legacy;$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\lib\common;$(SolutionDir)..\..\lib\dictBuilder;$(UniversalCRT_IncludePath);</IncludePath>
+    <IncludePath>$(IncludePath);$(SolutionDir)..\..\lib;$(SolutionDir)..\..\programs\legacy;$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\lib\common;$(SolutionDir)..\..\lib\dictBuilder;$(UniversalCRT_IncludePath);</IncludePath>
   </PropertyGroup>
   <PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release|Win32'">
     <LinkIncremental>false</LinkIncremental>
     <RunCodeAnalysis>false</RunCodeAnalysis>
-    <IncludePath>$(IncludePath);$(SolutionDir)..\..\programs\legacy;$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\lib\common;$(SolutionDir)..\..\lib\dictBuilder;$(UniversalCRT_IncludePath);</IncludePath>
+    <IncludePath>$(IncludePath);$(SolutionDir)..\..\lib;$(SolutionDir)..\..\programs\legacy;$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\lib\common;$(SolutionDir)..\..\lib\dictBuilder;$(UniversalCRT_IncludePath);</IncludePath>
     <IntDir>$(Platform)\$(Configuration)\</IntDir>
   </PropertyGroup>
   <PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release|x64'">
     <LinkIncremental>false</LinkIncremental>
     <RunCodeAnalysis>false</RunCodeAnalysis>
-    <IncludePath>$(IncludePath);$(SolutionDir)..\..\programs\legacy;$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\lib\common;$(SolutionDir)..\..\lib\dictBuilder;$(UniversalCRT_IncludePath);</IncludePath>
+    <IncludePath>$(IncludePath);$(SolutionDir)..\..\lib;$(SolutionDir)..\..\programs\legacy;$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\lib\common;$(SolutionDir)..\..\lib\dictBuilder;$(UniversalCRT_IncludePath);</IncludePath>
   </PropertyGroup>
   <ItemDefinitionGroup Condition="'$(Configuration)|$(Platform)'=='Debug|Win32'">
     <ClCompile>
@@ -176,7 +176,7 @@
     <ClInclude Include="..\..\..\lib\common\xxhash.h" />
     <ClInclude Include="..\..\..\lib\common\zbuff.h" />
     <ClInclude Include="..\..\..\lib\common\zstd_internal.h" />
-    <ClInclude Include="..\..\..\lib\common\zstd.h" />
+    <ClInclude Include="..\..\..\lib\zstd.h" />
     <ClInclude Include="..\..\..\lib\compress\zstd_opt.h" />
     <ClInclude Include="..\..\..\lib\dictBuilder\divsufsort.h" />
     <ClInclude Include="..\..\..\lib\dictBuilder\zdict.h" />
@@ -187,4 +187,4 @@
   <Import Project="$(VCTargetsPath)\Microsoft.Cpp.targets" />
   <ImportGroup Label="ExtensionTargets">
   </ImportGroup>
-</Project>
\ No newline at end of file
+</Project>
diff --git a/projects/VS2010/fuzzer/fuzzer.vcxproj.filters b/projects/VS2010/fuzzer/fuzzer.vcxproj.filters
deleted file mode 100644
index 5161ea0..0000000
--- a/projects/VS2010/fuzzer/fuzzer.vcxproj.filters
+++ /dev/null
@@ -1,92 +0,0 @@
-<?xml version="1.0" encoding="utf-8"?>
-<Project ToolsVersion="4.0" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
-  <ItemGroup>
-    <Filter Include="Header Files">
-      <UniqueIdentifier>{93995380-89BD-4b04-88EB-625FBE52EBFB}</UniqueIdentifier>
-      <Extensions>h;hpp;hxx;hm;inl;inc;xsd</Extensions>
-    </Filter>
-    <Filter Include="Source Files">
-      <UniqueIdentifier>{4FC737F1-C7A5-4376-A066-2A32D752A2FF}</UniqueIdentifier>
-      <Extensions>cpp;c;cc;cxx;def;odl;idl;hpj;bat;asm;asmx</Extensions>
-    </Filter>
-  </ItemGroup>
-  <ItemGroup>
-    <ClCompile Include="..\..\..\programs\fuzzer.c">
-      <Filter>Source Files</Filter>
-    </ClCompile>
-    <ClCompile Include="..\..\..\programs\datagen.c">
-      <Filter>Source Files</Filter>
-    </ClCompile>
-    <ClCompile Include="..\..\..\lib\common\zstd_common.c">
-      <Filter>Source Files</Filter>
-    </ClCompile>
-    <ClCompile Include="..\..\..\lib\compress\fse_compress.c">
-      <Filter>Source Files</Filter>
-    </ClCompile>
-    <ClCompile Include="..\..\..\lib\compress\huf_compress.c">
-      <Filter>Source Files</Filter>
-    </ClCompile>
-    <ClCompile Include="..\..\..\lib\decompress\huf_decompress.c">
-      <Filter>Source Files</Filter>
-    </ClCompile>
-    <ClCompile Include="..\..\..\lib\compress\zstd_compress.c">
-      <Filter>Source Files</Filter>
-    </ClCompile>
-    <ClCompile Include="..\..\..\lib\decompress\zstd_decompress.c">
-      <Filter>Source Files</Filter>
-    </ClCompile>
-    <ClCompile Include="..\..\..\lib\common\fse_decompress.c">
-      <Filter>Source Files</Filter>
-    </ClCompile>
-    <ClCompile Include="..\..\..\lib\common\entropy_common.c">
-      <Filter>Source Files</Filter>
-    </ClCompile>
-    <ClCompile Include="..\..\..\lib\common\xxhash.c">
-      <Filter>Source Files</Filter>
-    </ClCompile>
-    <ClCompile Include="..\..\..\lib\dictBuilder\divsufsort.c">
-      <Filter>Source Files</Filter>
-    </ClCompile>
-    <ClCompile Include="..\..\..\lib\dictBuilder\zdict.c">
-      <Filter>Source Files</Filter>
-    </ClCompile>
-  </ItemGroup>
-  <ItemGroup>
-    <ClInclude Include="..\..\..\programs\datagen.h">
-      <Filter>Header Files</Filter>
-    </ClInclude>
-    <ClInclude Include="..\..\..\lib\legacy\zstd_legacy.h">
-      <Filter>Header Files</Filter>
-    </ClInclude>
-    <ClInclude Include="..\..\..\lib\common\fse.h">
-      <Filter>Header Files</Filter>
-    </ClInclude>
-    <ClInclude Include="..\..\..\lib\common\huf.h">
-      <Filter>Header Files</Filter>
-    </ClInclude>
-    <ClInclude Include="..\..\..\lib\common\zbuff.h">
-      <Filter>Header Files</Filter>
-    </ClInclude>
-    <ClInclude Include="..\..\..\lib\common\zstd_internal.h">
-      <Filter>Header Files</Filter>
-    </ClInclude>
-    <ClInclude Include="..\..\..\lib\common\zstd.h">
-      <Filter>Header Files</Filter>
-    </ClInclude>
-    <ClInclude Include="..\..\..\lib\compress\zstd_opt.h">
-      <Filter>Header Files</Filter>
-    </ClInclude>
-    <ClInclude Include="..\..\..\programs\util.h">
-      <Filter>Header Files</Filter>
-    </ClInclude>
-    <ClInclude Include="..\..\..\lib\common\xxhash.h">
-      <Filter>Header Files</Filter>
-    </ClInclude>
-    <ClInclude Include="..\..\..\lib\dictBuilder\divsufsort.h">
-      <Filter>Header Files</Filter>
-    </ClInclude>
-    <ClInclude Include="..\..\..\lib\dictBuilder\zdict.h">
-      <Filter>Header Files</Filter>
-    </ClInclude>
-  </ItemGroup>
-</Project>
\ No newline at end of file
diff --git a/projects/VS2010/zstd/zstd.vcxproj b/projects/VS2010/zstd/zstd.vcxproj
index 3c1e80b..2dbfc34 100644
--- a/projects/VS2010/zstd/zstd.vcxproj
+++ b/projects/VS2010/zstd/zstd.vcxproj
@@ -1,4 +1,4 @@
-<?xml version="1.0" encoding="utf-8"?>
+<?xml version="1.0" encoding="utf-8"?>
 <Project DefaultTargets="Build" ToolsVersion="4.0" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
   <ItemGroup Label="ProjectConfigurations">
     <ProjectConfiguration Include="Debug|Win32">
@@ -38,6 +38,7 @@
     <ClCompile Include="..\..\..\lib\legacy\zstd_v04.c" />
     <ClCompile Include="..\..\..\lib\legacy\zstd_v05.c" />
     <ClCompile Include="..\..\..\lib\legacy\zstd_v06.c" />
+    <ClCompile Include="..\..\..\lib\legacy\zstd_v07.c" />
     <ClCompile Include="..\..\..\programs\bench.c" />
     <ClCompile Include="..\..\..\programs\datagen.c" />
     <ClCompile Include="..\..\..\programs\dibio.c" />
@@ -52,7 +53,7 @@
     <ClInclude Include="..\..\..\lib\common\fse.h" />
     <ClInclude Include="..\..\..\lib\common\huf.h" />
     <ClInclude Include="..\..\..\lib\common\zbuff.h" />
-    <ClInclude Include="..\..\..\lib\common\zstd.h" />
+    <ClInclude Include="..\..\..\lib\zstd.h" />
     <ClInclude Include="..\..\..\lib\common\zstd_internal.h" />
     <ClInclude Include="..\..\..\lib\compress\zstd_opt.h" />
     <ClInclude Include="..\..\..\lib\legacy\zstd_legacy.h" />
@@ -62,6 +63,7 @@
     <ClInclude Include="..\..\..\lib\legacy\zstd_v04.h" />
     <ClInclude Include="..\..\..\lib\legacy\zstd_v05.h" />
     <ClInclude Include="..\..\..\lib\legacy\zstd_v06.h" />
+    <ClInclude Include="..\..\..\lib\legacy\zstd_v07.h" />
     <ClInclude Include="..\..\..\programs\bench.h" />
     <ClInclude Include="..\..\..\programs\datagen.h" />
     <ClInclude Include="..\..\..\programs\dibio.h" />
@@ -116,27 +118,27 @@
   <PropertyGroup Label="UserMacros" />
   <PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Debug|Win32'">
     <LinkIncremental>true</LinkIncremental>
-    <IncludePath>$(IncludePath);$(SolutionDir)..\..\programs\legacy;$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\lib\common;$(SolutionDir)..\..\lib\dictBuilder;$(UniversalCRT_IncludePath);</IncludePath>
+    <IncludePath>$(IncludePath);$(SolutionDir)..\..\lib;$(SolutionDir)..\..\programs\legacy;$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\lib\common;$(SolutionDir)..\..\lib\dictBuilder;$(UniversalCRT_IncludePath);</IncludePath>
     <RunCodeAnalysis>false</RunCodeAnalysis>
     <LibraryPath>$(LibraryPath)</LibraryPath>
     <IntDir>$(Platform)\$(Configuration)\</IntDir>
   </PropertyGroup>
   <PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Debug|x64'">
     <LinkIncremental>true</LinkIncremental>
-    <IncludePath>$(IncludePath);$(SolutionDir)..\..\programs\legacy;$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\lib\common;$(SolutionDir)..\..\lib\dictBuilder;$(UniversalCRT_IncludePath);</IncludePath>
+    <IncludePath>$(IncludePath);$(SolutionDir)..\..\lib;$(SolutionDir)..\..\programs\legacy;$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\lib\common;$(SolutionDir)..\..\lib\dictBuilder;$(UniversalCRT_IncludePath);</IncludePath>
     <RunCodeAnalysis>false</RunCodeAnalysis>
     <LibraryPath>$(LibraryPath);</LibraryPath>
   </PropertyGroup>
   <PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release|Win32'">
     <LinkIncremental>false</LinkIncremental>
-    <IncludePath>$(IncludePath);$(SolutionDir)..\..\programs\legacy;$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\lib\common;$(SolutionDir)..\..\lib\dictBuilder;$(UniversalCRT_IncludePath);</IncludePath>
+    <IncludePath>$(IncludePath);$(SolutionDir)..\..\lib;$(SolutionDir)..\..\programs\legacy;$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\lib\common;$(SolutionDir)..\..\lib\dictBuilder;$(UniversalCRT_IncludePath);</IncludePath>
     <RunCodeAnalysis>false</RunCodeAnalysis>
     <LibraryPath>$(LibraryPath)</LibraryPath>
     <IntDir>$(Platform)\$(Configuration)\</IntDir>
   </PropertyGroup>
   <PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release|x64'">
     <LinkIncremental>false</LinkIncremental>
-    <IncludePath>$(IncludePath);$(SolutionDir)..\..\programs\legacy;$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\lib\common;$(SolutionDir)..\..\lib\dictBuilder;$(UniversalCRT_IncludePath);</IncludePath>
+    <IncludePath>$(IncludePath);$(SolutionDir)..\..\lib;$(SolutionDir)..\..\programs\legacy;$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\lib\common;$(SolutionDir)..\..\lib\dictBuilder;$(UniversalCRT_IncludePath);</IncludePath>
     <RunCodeAnalysis>false</RunCodeAnalysis>
     <LibraryPath>$(LibraryPath);</LibraryPath>
   </PropertyGroup>
@@ -217,4 +219,4 @@
   <Import Project="$(VCTargetsPath)\Microsoft.Cpp.targets" />
   <ImportGroup Label="ExtensionTargets">
   </ImportGroup>
-</Project>
\ No newline at end of file
+</Project>
diff --git a/projects/VS2010/zstd/zstd.vcxproj.filters b/projects/VS2010/zstd/zstd.vcxproj.filters
deleted file mode 100644
index 0e1e927..0000000
--- a/projects/VS2010/zstd/zstd.vcxproj.filters
+++ /dev/null
@@ -1,158 +0,0 @@
-<?xml version="1.0" encoding="utf-8"?>
-<Project ToolsVersion="4.0" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
-  <ItemGroup>
-    <Filter Include="Header Files">
-      <UniqueIdentifier>{93995380-89BD-4b04-88EB-625FBE52EBFB}</UniqueIdentifier>
-      <Extensions>h;hpp;hxx;hm;inl;inc;xsd</Extensions>
-    </Filter>
-    <Filter Include="Source Files">
-      <UniqueIdentifier>{4FC737F1-C7A5-4376-A066-2A32D752A2FF}</UniqueIdentifier>
-      <Extensions>cpp;c;cc;cxx;def;odl;idl;hpj;bat;asm;asmx</Extensions>
-    </Filter>
-  </ItemGroup>
-  <ItemGroup>
-    <ClCompile Include="..\..\..\programs\bench.c">
-      <Filter>Source Files</Filter>
-    </ClCompile>
-    <ClCompile Include="..\..\..\programs\fileio.c">
-      <Filter>Source Files</Filter>
-    </ClCompile>
-    <ClCompile Include="..\..\..\programs\zstdcli.c">
-      <Filter>Source Files</Filter>
-    </ClCompile>
-    <ClCompile Include="..\..\..\programs\dibio.c">
-      <Filter>Source Files</Filter>
-    </ClCompile>
-    <ClCompile Include="..\..\..\programs\datagen.c">
-      <Filter>Source Files</Filter>
-    </ClCompile>
-    <ClCompile Include="..\..\..\programs\legacy\fileio_legacy.c">
-      <Filter>Source Files</Filter>
-    </ClCompile>
-    <ClCompile Include="..\..\..\lib\legacy\zstd_v01.c">
-      <Filter>Source Files</Filter>
-    </ClCompile>
-    <ClCompile Include="..\..\..\lib\legacy\zstd_v02.c">
-      <Filter>Source Files</Filter>
-    </ClCompile>
-    <ClCompile Include="..\..\..\lib\legacy\zstd_v03.c">
-      <Filter>Source Files</Filter>
-    </ClCompile>
-    <ClCompile Include="..\..\..\lib\legacy\zstd_v04.c">
-      <Filter>Source Files</Filter>
-    </ClCompile>
-    <ClCompile Include="..\..\..\lib\legacy\zstd_v05.c">
-      <Filter>Source Files</Filter>
-    </ClCompile>
-    <ClCompile Include="..\..\..\lib\legacy\zstd_v06.c">
-      <Filter>Source Files</Filter>
-    </ClCompile>
-    <ClCompile Include="..\..\..\lib\common\zstd_common.c">
-      <Filter>Source Files</Filter>
-    </ClCompile>
-    <ClCompile Include="..\..\..\lib\compress\fse_compress.c">
-      <Filter>Source Files</Filter>
-    </ClCompile>
-    <ClCompile Include="..\..\..\lib\compress\huf_compress.c">
-      <Filter>Source Files</Filter>
-    </ClCompile>
-    <ClCompile Include="..\..\..\lib\compress\zbuff_compress.c">
-      <Filter>Source Files</Filter>
-    </ClCompile>
-    <ClCompile Include="..\..\..\lib\compress\zstd_compress.c">
-      <Filter>Source Files</Filter>
-    </ClCompile>
-    <ClCompile Include="..\..\..\lib\decompress\huf_decompress.c">
-      <Filter>Source Files</Filter>
-    </ClCompile>
-    <ClCompile Include="..\..\..\lib\decompress\zbuff_decompress.c">
-      <Filter>Source Files</Filter>
-    </ClCompile>
-    <ClCompile Include="..\..\..\lib\decompress\zstd_decompress.c">
-      <Filter>Source Files</Filter>
-    </ClCompile>
-    <ClCompile Include="..\..\..\lib\dictBuilder\divsufsort.c">
-      <Filter>Source Files</Filter>
-    </ClCompile>
-    <ClCompile Include="..\..\..\lib\dictBuilder\zdict.c">
-      <Filter>Source Files</Filter>
-    </ClCompile>
-    <ClCompile Include="..\..\..\lib\common\fse_decompress.c">
-      <Filter>Source Files</Filter>
-    </ClCompile>
-    <ClCompile Include="..\..\..\lib\common\entropy_common.c">
-      <Filter>Source Files</Filter>
-    </ClCompile>
-    <ClCompile Include="..\..\..\lib\common\xxhash.c">
-      <Filter>Source Files</Filter>
-    </ClCompile>
-  </ItemGroup>
-  <ItemGroup>
-    <ClInclude Include="..\..\..\programs\bench.h">
-      <Filter>Header Files</Filter>
-    </ClInclude>
-    <ClInclude Include="..\..\..\programs\fileio.h">
-      <Filter>Header Files</Filter>
-    </ClInclude>
-    <ClInclude Include="..\..\..\programs\datagen.h">
-      <Filter>Header Files</Filter>
-    </ClInclude>
-    <ClInclude Include="..\..\..\programs\legacy\fileio_legacy.h">
-      <Filter>Header Files</Filter>
-    </ClInclude>
-    <ClInclude Include="..\..\..\lib\legacy\zstd_legacy.h">
-      <Filter>Header Files</Filter>
-    </ClInclude>
-    <ClInclude Include="..\..\..\lib\legacy\zstd_v01.h">
-      <Filter>Header Files</Filter>
-    </ClInclude>
-    <ClInclude Include="..\..\..\lib\legacy\zstd_v02.h">
-      <Filter>Header Files</Filter>
-    </ClInclude>
-    <ClInclude Include="..\..\..\lib\legacy\zstd_v03.h">
-      <Filter>Header Files</Filter>
-    </ClInclude>
-    <ClInclude Include="..\..\..\lib\legacy\zstd_v04.h">
-      <Filter>Header Files</Filter>
-    </ClInclude>
-    <ClInclude Include="..\..\..\lib\legacy\zstd_v05.h">
-      <Filter>Header Files</Filter>
-    </ClInclude>
-    <ClInclude Include="..\..\..\lib\legacy\zstd_v06.h">
-      <Filter>Header Files</Filter>
-    </ClInclude>
-    <ClInclude Include="..\..\..\programs\dibio.h">
-      <Filter>Header Files</Filter>
-    </ClInclude>
-    <ClInclude Include="..\..\..\lib\dictBuilder\zdict.h">
-      <Filter>Header Files</Filter>
-    </ClInclude>
-    <ClInclude Include="..\..\..\lib\dictBuilder\divsufsort.h">
-      <Filter>Header Files</Filter>
-    </ClInclude>
-    <ClInclude Include="..\..\..\lib\common\fse.h">
-      <Filter>Header Files</Filter>
-    </ClInclude>
-    <ClInclude Include="..\..\..\lib\common\huf.h">
-      <Filter>Header Files</Filter>
-    </ClInclude>
-    <ClInclude Include="..\..\..\lib\common\zbuff.h">
-      <Filter>Header Files</Filter>
-    </ClInclude>
-    <ClInclude Include="..\..\..\lib\common\zstd.h">
-      <Filter>Header Files</Filter>
-    </ClInclude>
-    <ClInclude Include="..\..\..\lib\common\zstd_internal.h">
-      <Filter>Header Files</Filter>
-    </ClInclude>
-    <ClInclude Include="..\..\..\lib\compress\zstd_opt.h">
-      <Filter>Header Files</Filter>
-    </ClInclude>
-    <ClInclude Include="..\..\..\programs\util.h">
-      <Filter>Header Files</Filter>
-    </ClInclude>
-    <ClInclude Include="..\..\..\lib\common\xxhash.h">
-      <Filter>Header Files</Filter>
-    </ClInclude>
-  </ItemGroup>
-</Project>
\ No newline at end of file
diff --git a/projects/VS2010/zstdlib/zstdlib.vcxproj b/projects/VS2010/zstdlib/zstdlib.vcxproj
index 70f8064..8a5bc8b 100644
--- a/projects/VS2010/zstdlib/zstdlib.vcxproj
+++ b/projects/VS2010/zstdlib/zstdlib.vcxproj
@@ -40,7 +40,7 @@
     <ClInclude Include="..\..\..\lib\common\huf.h" />
     <ClInclude Include="..\..\..\lib\common\xxhash.h" />
     <ClInclude Include="..\..\..\lib\common\zbuff.h" />
-    <ClInclude Include="..\..\..\lib\common\zstd.h" />
+    <ClInclude Include="..\..\..\lib\zstd.h" />
     <ClInclude Include="..\..\..\lib\common\zstd_internal.h" />
     <ClInclude Include="..\..\..\lib\compress\zstd_opt.h" />
     <ClInclude Include="..\..\..\programs\util.h" />
@@ -97,28 +97,28 @@
     <LinkIncremental>true</LinkIncremental>
     <TargetName>zstdlib_x86</TargetName>
     <IntDir>$(Platform)\$(Configuration)\</IntDir>
-    <IncludePath>$(IncludePath);$(SolutionDir)..\..\programs\legacy;$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\lib\common;$(SolutionDir)..\..\lib\dictBuilder;$(UniversalCRT_IncludePath);</IncludePath>
+    <IncludePath>$(IncludePath);$(SolutionDir)..\..\lib;$(SolutionDir)..\..\programs\legacy;$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\lib\common;$(SolutionDir)..\..\lib\dictBuilder;$(UniversalCRT_IncludePath);</IncludePath>
     <RunCodeAnalysis>false</RunCodeAnalysis>
   </PropertyGroup>
   <PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Debug|x64'">
     <LinkIncremental>true</LinkIncremental>
     <TargetName>zstdlib_x64</TargetName>
     <IntDir>$(Platform)\$(Configuration)\</IntDir>
-    <IncludePath>$(IncludePath);$(SolutionDir)..\..\programs\legacy;$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\lib\common;$(SolutionDir)..\..\lib\dictBuilder;$(UniversalCRT_IncludePath);</IncludePath>
+    <IncludePath>$(IncludePath);$(SolutionDir)..\..\lib;$(SolutionDir)..\..\programs\legacy;$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\lib\common;$(SolutionDir)..\..\lib\dictBuilder;$(UniversalCRT_IncludePath);</IncludePath>
     <RunCodeAnalysis>false</RunCodeAnalysis>
   </PropertyGroup>
   <PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release|Win32'">
     <LinkIncremental>false</LinkIncremental>
     <TargetName>zstdlib_x86</TargetName>
     <IntDir>$(Platform)\$(Configuration)\</IntDir>
-    <IncludePath>$(IncludePath);$(SolutionDir)..\..\programs\legacy;$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\lib\common;$(SolutionDir)..\..\lib\dictBuilder;$(UniversalCRT_IncludePath);</IncludePath>
+    <IncludePath>$(IncludePath);$(SolutionDir)..\..\lib;$(SolutionDir)..\..\programs\legacy;$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\lib\common;$(SolutionDir)..\..\lib\dictBuilder;$(UniversalCRT_IncludePath);</IncludePath>
     <RunCodeAnalysis>false</RunCodeAnalysis>
   </PropertyGroup>
   <PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release|x64'">
     <LinkIncremental>false</LinkIncremental>
     <TargetName>zstdlib_x64</TargetName>
     <IntDir>$(Platform)\$(Configuration)\</IntDir>
-    <IncludePath>$(IncludePath);$(SolutionDir)..\..\programs\legacy;$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\lib\common;$(SolutionDir)..\..\lib\dictBuilder;$(UniversalCRT_IncludePath);</IncludePath>
+    <IncludePath>$(IncludePath);$(SolutionDir)..\..\lib;$(SolutionDir)..\..\programs\legacy;$(SolutionDir)..\..\lib\legacy;$(SolutionDir)..\..\lib\common;$(SolutionDir)..\..\lib\dictBuilder;$(UniversalCRT_IncludePath);</IncludePath>
     <RunCodeAnalysis>false</RunCodeAnalysis>
   </PropertyGroup>
   <ItemDefinitionGroup Condition="'$(Configuration)|$(Platform)'=='Debug|Win32'">
@@ -208,4 +208,4 @@
   <Import Project="$(VCTargetsPath)\Microsoft.Cpp.targets" />
   <ImportGroup Label="ExtensionTargets">
   </ImportGroup>
-</Project>
\ No newline at end of file
+</Project>
diff --git a/projects/VS2010/zstdlib/zstdlib.vcxproj.filters b/projects/VS2010/zstdlib/zstdlib.vcxproj.filters
deleted file mode 100644
index 439e3ce..0000000
--- a/projects/VS2010/zstdlib/zstdlib.vcxproj.filters
+++ /dev/null
@@ -1,95 +0,0 @@
-<?xml version="1.0" encoding="utf-8"?>
-<Project ToolsVersion="4.0" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
-  <ItemGroup>
-    <Filter Include="Source Files">
-      <UniqueIdentifier>{4FC737F1-C7A5-4376-A066-2A32D752A2FF}</UniqueIdentifier>
-      <Extensions>cpp;c;cc;cxx;def;odl;idl;hpj;bat;asm;asmx</Extensions>
-    </Filter>
-    <Filter Include="Header Files">
-      <UniqueIdentifier>{93995380-89BD-4b04-88EB-625FBE52EBFB}</UniqueIdentifier>
-      <Extensions>h;hh;hpp;hxx;hm;inl;inc;xsd</Extensions>
-    </Filter>
-  </ItemGroup>
-  <ItemGroup>
-    <ClCompile Include="..\..\..\lib\common\zstd_common.c">
-      <Filter>Source Files</Filter>
-    </ClCompile>
-    <ClCompile Include="..\..\..\lib\compress\fse_compress.c">
-      <Filter>Source Files</Filter>
-    </ClCompile>
-    <ClCompile Include="..\..\..\lib\compress\huf_compress.c">
-      <Filter>Source Files</Filter>
-    </ClCompile>
-    <ClCompile Include="..\..\..\lib\compress\zbuff_compress.c">
-      <Filter>Source Files</Filter>
-    </ClCompile>
-    <ClCompile Include="..\..\..\lib\compress\zstd_compress.c">
-      <Filter>Source Files</Filter>
-    </ClCompile>
-    <ClCompile Include="..\..\..\lib\decompress\huf_decompress.c">
-      <Filter>Source Files</Filter>
-    </ClCompile>
-    <ClCompile Include="..\..\..\lib\decompress\zbuff_decompress.c">
-      <Filter>Source Files</Filter>
-    </ClCompile>
-    <ClCompile Include="..\..\..\lib\decompress\zstd_decompress.c">
-      <Filter>Source Files</Filter>
-    </ClCompile>
-    <ClCompile Include="..\..\..\lib\dictBuilder\divsufsort.c">
-      <Filter>Source Files</Filter>
-    </ClCompile>
-    <ClCompile Include="..\..\..\lib\dictBuilder\zdict.c">
-      <Filter>Source Files</Filter>
-    </ClCompile>
-    <ClCompile Include="..\..\..\lib\common\fse_decompress.c">
-      <Filter>Source Files</Filter>
-    </ClCompile>
-    <ClCompile Include="..\..\..\lib\common\entropy_common.c">
-      <Filter>Source Files</Filter>
-    </ClCompile>
-    <ClCompile Include="..\..\..\lib\common\xxhash.c">
-      <Filter>Source Files</Filter>
-    </ClCompile>
-  </ItemGroup>
-  <ItemGroup>
-    <ClInclude Include="..\..\..\lib\common\bitstream.h">
-      <Filter>Header Files</Filter>
-    </ClInclude>
-    <ClInclude Include="..\..\..\lib\common\error_private.h">
-      <Filter>Header Files</Filter>
-    </ClInclude>
-    <ClInclude Include="..\..\..\lib\common\error_public.h">
-      <Filter>Header Files</Filter>
-    </ClInclude>
-    <ClInclude Include="..\..\..\lib\common\mem.h">
-      <Filter>Header Files</Filter>
-    </ClInclude>
-    <ClInclude Include="..\..\..\lib\common\fse.h">
-      <Filter>Header Files</Filter>
-    </ClInclude>
-    <ClInclude Include="..\..\..\lib\common\huf.h">
-      <Filter>Header Files</Filter>
-    </ClInclude>
-    <ClInclude Include="..\..\..\lib\common\zbuff.h">
-      <Filter>Header Files</Filter>
-    </ClInclude>
-    <ClInclude Include="..\..\..\lib\common\zstd.h">
-      <Filter>Header Files</Filter>
-    </ClInclude>
-    <ClInclude Include="..\..\..\lib\common\zstd_internal.h">
-      <Filter>Header Files</Filter>
-    </ClInclude>
-    <ClInclude Include="..\..\..\lib\compress\zstd_opt.h">
-      <Filter>Header Files</Filter>
-    </ClInclude>
-    <ClInclude Include="..\..\..\programs\util.h">
-      <Filter>Header Files</Filter>
-    </ClInclude>
-    <ClInclude Include="..\..\..\lib\common\xxhash.h">
-      <Filter>Header Files</Filter>
-    </ClInclude>
-  </ItemGroup>
-  <ItemGroup>
-    <ResourceCompile Include="zstdlib.rc" />
-  </ItemGroup>
-</Project>
\ No newline at end of file
diff --git a/projects/cmake/lib/CMakeLists.txt b/projects/cmake/lib/CMakeLists.txt
index 35553b9..36e8afa 100644
--- a/projects/cmake/lib/CMakeLists.txt
+++ b/projects/cmake/lib/CMakeLists.txt
@@ -47,10 +47,10 @@ SET(ROOT_DIR ../../..)
 
 # Define library directory, where sources and header files are located
 SET(LIBRARY_DIR ${ROOT_DIR}/lib)
-INCLUDE_DIRECTORIES(${LIBRARY_DIR}/common)
+INCLUDE_DIRECTORIES(${LIBRARY_DIR} ${LIBRARY_DIR}/common)
 
 # Read file content
-FILE(READ ${LIBRARY_DIR}/common/zstd.h HEADER_CONTENT)
+FILE(READ ${LIBRARY_DIR}/zstd.h HEADER_CONTENT)
 
 # Parse version
 GetLibraryVersion("${HEADER_CONTENT}" LIBVER_MAJOR LIBVER_MINOR LIBVER_RELEASE)
@@ -80,7 +80,7 @@ SET(Headers
         ${LIBRARY_DIR}/common/mem.h
         ${LIBRARY_DIR}/common/zbuff.h
         ${LIBRARY_DIR}/common/zstd_internal.h
-        ${LIBRARY_DIR}/common/zstd.h
+        ${LIBRARY_DIR}/zstd.h
         ${LIBRARY_DIR}/dictBuilder/zdict.h)
 
 IF (ZSTD_LEGACY_SUPPORT)
@@ -93,7 +93,8 @@ IF (ZSTD_LEGACY_SUPPORT)
             ${LIBRARY_LEGACY_DIR}/zstd_v03.c
             ${LIBRARY_LEGACY_DIR}/zstd_v04.c
             ${LIBRARY_LEGACY_DIR}/zstd_v05.c
-            ${LIBRARY_LEGACY_DIR}/zstd_v06.c)
+            ${LIBRARY_LEGACY_DIR}/zstd_v06.c
+            ${LIBRARY_LEGACY_DIR}/zstd_v07.c)
 
     SET(Headers ${Headers}
             ${LIBRARY_LEGACY_DIR}/zstd_legacy.h
@@ -102,7 +103,8 @@ IF (ZSTD_LEGACY_SUPPORT)
             ${LIBRARY_LEGACY_DIR}/zstd_v03.h
             ${LIBRARY_LEGACY_DIR}/zstd_v04.h
             ${LIBRARY_LEGACY_DIR}/zstd_v05.h
-            ${LIBRARY_LEGACY_DIR}/zstd_v06.h)
+            ${LIBRARY_LEGACY_DIR}/zstd_v06.h
+            ${LIBRARY_LEGACY_DIR}/zstd_v07.h)
 ENDIF (ZSTD_LEGACY_SUPPORT)
 
 IF (MSVC)
@@ -162,7 +164,7 @@ IF (UNIX)
     SET(INSTALL_INCLUDE_DIR ${PREFIX}/include)
 
     # install target
-    INSTALL(FILES ${LIBRARY_DIR}/common/zstd.h ${LIBRARY_DIR}/common/zbuff.h ${LIBRARY_DIR}/dictBuilder/zdict.h DESTINATION ${INSTALL_INCLUDE_DIR})
+    INSTALL(FILES ${LIBRARY_DIR}/zstd.h ${LIBRARY_DIR}/common/zbuff.h ${LIBRARY_DIR}/dictBuilder/zdict.h DESTINATION ${INSTALL_INCLUDE_DIR})
     INSTALL(TARGETS libzstd_static DESTINATION ${INSTALL_LIBRARY_DIR})
     INSTALL(TARGETS libzstd_shared LIBRARY DESTINATION ${INSTALL_LIBRARY_DIR})
 
diff --git a/projects/cmake/programs/CMakeLists.txt b/projects/cmake/programs/CMakeLists.txt
index c8fe5d2..fddfc7d 100644
--- a/projects/cmake/programs/CMakeLists.txt
+++ b/projects/cmake/programs/CMakeLists.txt
@@ -40,7 +40,7 @@ SET(ROOT_DIR ../../..)
 # Define programs directory, where sources and header files are located
 SET(LIBRARY_DIR ${ROOT_DIR}/lib)
 SET(PROGRAMS_DIR ${ROOT_DIR}/programs)
-INCLUDE_DIRECTORIES(${PROGRAMS_DIR} ${LIBRARY_DIR}/common ${LIBRARY_DIR}/dictBuilder)
+INCLUDE_DIRECTORIES(${PROGRAMS_DIR} ${LIBRARY_DIR} ${LIBRARY_DIR}/common ${LIBRARY_DIR}/dictBuilder)
 
 IF (ZSTD_LEGACY_SUPPORT)
     SET(PROGRAMS_LEGACY_DIR ${PROGRAMS_DIR}/legacy)
diff --git a/tests/.gitignore b/tests/.gitignore
index 4d14ba0..bda081a 100644
--- a/tests/.gitignore
+++ b/tests/.gitignore
@@ -2,3 +2,7 @@
 zstdtest
 speedTest
 versionsTest
+
+# Local script
+startSpeedTest
+speedTest.pid
diff --git a/tests/test-zstd-speed.py b/tests/test-zstd-speed.py
index 522227a..c517097 100755
--- a/tests/test-zstd-speed.py
+++ b/tests/test-zstd-speed.py
@@ -3,27 +3,29 @@
 import argparse
 import os
 import string
+import subprocess
 import time
 import traceback
-import subprocess
-import signal
- 
+
 
 default_repo_url = 'https://github.com/Cyan4973/zstd.git'
 working_dir_name = 'speedTest'
-working_path = os.getcwd() + '/' + working_dir_name     # /path/to/zstd/tests/speedTest 
-clone_path = working_path + '/' + 'zstd'                # /path/to/zstd/tests/speedTest/zstd 
+working_path = os.getcwd() + '/' + working_dir_name     # /path/to/zstd/tests/speedTest
+clone_path = working_path + '/' + 'zstd'                # /path/to/zstd/tests/speedTest/zstd
 email_header = '[ZSTD_speedTest]'
 pid = str(os.getpid())
+verbose = False
 
 
 def log(text):
     print(time.strftime("%Y/%m/%d %H:%M:%S") + ' - ' + text)
 
 
-def execute(command, print_output=False, print_error=True, param_shell=True):
-    log("> " + command)
-    popen = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, shell=param_shell, cwd=execute.cwd)
+def execute(command, print_command=True, print_output=False, print_error=True, param_shell=True):
+    if print_command:
+        log("> " + command)
+    popen = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
+                             shell=param_shell, cwd=execute.cwd)
     stdout = popen.communicate()[0]
     stdout_lines = stdout.splitlines()
     if print_output:
@@ -38,8 +40,8 @@ execute.cwd = None
 
 def does_command_exist(command):
     try:
-        execute(command, False, False);
-    except Exception as e:
+        execute(command, verbose, False, False)
+    except Exception:
         return False
     return True
 
@@ -50,33 +52,38 @@ def send_email(emails, topic, text, have_mutt, have_mail):
         myfile.writelines(text)
         myfile.close()
         if have_mutt:
-            execute('mutt -s "' + topic + '" ' + emails + ' < ' + logFileName)
+            execute('mutt -s "' + topic + '" ' + emails + ' < ' + logFileName, verbose)
         elif have_mail:
-            execute('mail -s "' + topic + '" ' + emails + ' < ' + logFileName)
+            execute('mail -s "' + topic + '" ' + emails + ' < ' + logFileName, verbose)
         else:
             log("e-mail cannot be sent (mail or mutt not found)")
 
 
-def send_email_with_attachments(branch, commit, last_commit, emails, text, results_files, logFileName, lower_limit, have_mutt, have_mail):
+def send_email_with_attachments(branch, commit, last_commit, args, text, results_files,
+                                logFileName, have_mutt, have_mail):
     with open(logFileName, "w") as myfile:
         myfile.writelines(text)
         myfile.close()
-        email_topic = '%s:%s Warning for %s:%s last_commit=%s speed<%s' % (email_header, pid, branch, commit, last_commit, lower_limit)
+        email_topic = '%s:%s Warning for %s:%s last_commit=%s speed<%s ratio<%s' \
+                      % (email_header, pid, branch, commit, last_commit,
+                         args.lowerLimit, args.ratioLimit)
         if have_mutt:
-            execute('mutt -s "' + email_topic + '" ' + emails + ' -a ' + results_files + ' < ' + logFileName)
+            execute('mutt -s "' + email_topic + '" ' + args.emails + ' -a ' + results_files
+                    + ' < ' + logFileName)
         elif have_mail:
-            execute('mail -s "' + email_topic + '" ' + emails + ' < ' + logFileName)
+            execute('mail -s "' + email_topic + '" ' + args.emails + ' < ' + logFileName)
         else:
             log("e-mail cannot be sent (mail or mutt not found)")
 
 
 def git_get_branches():
-    execute('git fetch -p')
-    output = execute('git branch -rl')
-    for line in output:
-        if "HEAD" in line: 
-            output.remove(line)  # remove "origin/HEAD -> origin/dev"
-    return map(lambda l: l.strip(), output)
+    execute('git fetch -p', verbose)
+    branches = execute('git branch -rl', verbose)
+    output = []
+    for line in branches:
+        if ("HEAD" not in line) and ("coverity_scan" not in line) and ("gh-pages" not in line):
+            output.append(line.strip())
+    return output
 
 
 def git_get_changes(branch, commit, last_commit):
@@ -90,32 +97,38 @@ def git_get_changes(branch, commit, last_commit):
 
 def get_last_results(resultsFileName):
     if not os.path.isfile(resultsFileName):
-        return None, None, None
+        return None, None, None, None
     commit = None
+    csize = []
     cspeed = []
     dspeed = []
-    with open(resultsFileName,'r') as f:
+    with open(resultsFileName, 'r') as f:
         for line in f:
             words = line.split()
             if len(words) == 2:   # branch + commit
-                commit = words[1];
+                commit = words[1]
+                csize = []
                 cspeed = []
                 dspeed = []
             if (len(words) == 8):  # results
+                csize.append(int(words[1]))
                 cspeed.append(float(words[3]))
                 dspeed.append(float(words[5]))
-    return commit, cspeed, dspeed
+    return commit, csize, cspeed, dspeed
 
 
-def benchmark_and_compare(branch, commit, resultsFileName, lastCLevel, testFilePath, fileName, last_cspeed, last_dspeed, lower_limit, maxLoadAvg, message):
+def benchmark_and_compare(branch, commit, last_commit, args, executableName, resultsFileName,
+                          testFilePath, fileName, last_csize, last_cspeed, last_dspeed):
     sleepTime = 30
-    while os.getloadavg()[0] > maxLoadAvg:
-        log("WARNING: bench loadavg=%.2f is higher than %s, sleeping for %s seconds" % (os.getloadavg()[0], maxLoadAvg, sleepTime))
+    while os.getloadavg()[0] > args.maxLoadAvg:
+        log("WARNING: bench loadavg=%.2f is higher than %s, sleeping for %s seconds"
+            % (os.getloadavg()[0], args.maxLoadAvg, sleepTime))
         time.sleep(sleepTime)
     start_load = str(os.getloadavg())
-    result = execute('programs/zstd -qi5b1e%s %s' % (lastCLevel, testFilePath), print_output=True)
+    result = execute('programs/%s -qi5b1e%s %s' % (executableName, args.lastCLevel, testFilePath),
+                     print_output=True)
     end_load = str(os.getloadavg())
-    linesExpected = lastCLevel + 2;
+    linesExpected = args.lastCLevel + 1
     if len(result) != linesExpected:
         raise RuntimeError("ERROR: number of result lines=%d is different that expected %d\n%s" % (len(result), linesExpected, '\n'.join(result)))
     with open(resultsFileName, "a") as myfile:
@@ -125,16 +138,18 @@ def benchmark_and_compare(branch, commit, resultsFileName, lastCLevel, testFileP
         if (last_cspeed == None):
             log("WARNING: No data for comparison for branch=%s file=%s " % (branch, fileName))
             return ""
-        commit, cspeed, dspeed = get_last_results(resultsFileName)
+        commit, csize, cspeed, dspeed = get_last_results(resultsFileName)
         text = ""
         for i in range(0, min(len(cspeed), len(last_cspeed))):
-            print("%s:%s -%d cspeed=%6.2f clast=%6.2f cdiff=%1.4f dspeed=%6.2f dlast=%6.2f ddiff=%1.4f %s" % (branch, commit, i+1, cspeed[i], last_cspeed[i], cspeed[i]/last_cspeed[i], dspeed[i], last_dspeed[i], dspeed[i]/last_dspeed[i], fileName))
-            if (cspeed[i]/last_cspeed[i] < lower_limit):
-                text += "WARNING: -%d cspeed=%.2f clast=%.2f cdiff=%.4f %s\n" % (i+1, cspeed[i], last_cspeed[i], cspeed[i]/last_cspeed[i], fileName)
-            if (dspeed[i]/last_dspeed[i] < lower_limit):
-                text += "WARNING: -%d dspeed=%.2f dlast=%.2f ddiff=%.4f %s\n" % (i+1, dspeed[i], last_dspeed[i], dspeed[i]/last_dspeed[i], fileName)
+            print("%s:%s -%d cSpeed=%6.2f cLast=%6.2f cDiff=%1.4f dSpeed=%6.2f dLast=%6.2f dDiff=%1.4f ratioDiff=%1.4f %s" % (branch, commit, i+1, cspeed[i], last_cspeed[i], cspeed[i]/last_cspeed[i], dspeed[i], last_dspeed[i], dspeed[i]/last_dspeed[i], float(last_csize[i])/csize[i], fileName))
+            if (cspeed[i]/last_cspeed[i] < args.lowerLimit):
+                text += "WARNING: %s -%d cSpeed=%.2f cLast=%.2f cDiff=%.4f %s\n" % (executableName, i+1, cspeed[i], last_cspeed[i], cspeed[i]/last_cspeed[i], fileName)
+            if (dspeed[i]/last_dspeed[i] < args.lowerLimit):
+                text += "WARNING: %s -%d dSpeed=%.2f dLast=%.2f dDiff=%.4f %s\n" % (executableName, i+1, dspeed[i], last_dspeed[i], dspeed[i]/last_dspeed[i], fileName)
+            if (float(last_csize[i])/csize[i] < args.ratioLimit):
+                text += "WARNING: %s -%d cSize=%d last_cSize=%d diff=%.4f %s\n" % (executableName, i+1, csize[i], last_csize[i], float(last_csize[i])/csize[i], fileName)
         if text:
-            text = message + ("\nmaxLoadAvg=%s  load average at start=%s end=%s\n" % (maxLoadAvg, start_load, end_load)) + text
+            text = args.message + ("\nmaxLoadAvg=%s  load average at start=%s end=%s  last_commit=%s\n" % (args.maxLoadAvg, start_load, end_load, last_commit)) + text
         return text
 
 
@@ -147,28 +162,38 @@ def update_config_file(branch, commit):
     return last_commit
 
 
+def double_check(branch, commit, args, executableName, resultsFileName, filePath, fileName):
+    last_commit, csize, cspeed, dspeed = get_last_results(resultsFileName)
+    if not args.dry_run:
+        text = benchmark_and_compare(branch, commit, last_commit, args, executableName, resultsFileName, filePath, fileName, csize, cspeed, dspeed)
+        if text:
+            log("WARNING: redoing tests for branch %s: commit %s" % (branch, commit))
+            text = benchmark_and_compare(branch, commit, last_commit, args, executableName, resultsFileName, filePath, fileName, csize, cspeed, dspeed)
+    return text
+
+
 def test_commit(branch, commit, last_commit, args, testFilePaths, have_mutt, have_mail):
     local_branch = string.split(branch, '/')[1]
     version = local_branch.rpartition('-')[2] + '_' + commit
     if not args.dry_run:
-        execute('make clean zstdprogram MOREFLAGS="-DZSTD_GIT_COMMIT=%s"' % version)
+        execute('make -C programs clean zstd MOREFLAGS="-DZSTD_GIT_COMMIT=%s" && make -B -C programs zstd32 MOREFLAGS="-DZSTD_GIT_COMMIT=%s"' % (version, version))
     logFileName = working_path + "/log_" + branch.replace("/", "_") + ".txt"
     text_to_send = []
     results_files = ""
     for filePath in testFilePaths:
         fileName = filePath.rpartition('/')[2]
         resultsFileName = working_path + "/results_" + branch.replace("/", "_") + "_" + fileName.replace(".", "_") + ".txt"
-        last_commit, cspeed, dspeed = get_last_results(resultsFileName)
-        if not args.dry_run:
-            text = benchmark_and_compare(branch, commit, resultsFileName, args.lastCLevel, filePath, fileName, cspeed, dspeed, args.lowerLimit, args.maxLoadAvg, args.message)
-            if text:
-                log("WARNING: redoing tests for branch %s: commit %s" % (branch, commit))
-                text = benchmark_and_compare(branch, commit, resultsFileName, args.lastCLevel, filePath, fileName, cspeed, dspeed, args.lowerLimit, args.maxLoadAvg, args.message)
-                if text:
-                    text_to_send.append(text)
-                    results_files += resultsFileName + " "
+        text = double_check(branch, commit, args, 'zstd', resultsFileName, filePath, fileName)
+        if text:
+            text_to_send.append(text)
+            results_files += resultsFileName + " "
+        resultsFileName = working_path + "/results32_" + branch.replace("/", "_") + "_" + fileName.replace(".", "_") + ".txt"
+        text = double_check(branch, commit, args, 'zstd32', resultsFileName, filePath, fileName)
+        if text:
+            text_to_send.append(text)
+            results_files += resultsFileName + " "
     if text_to_send:
-        send_email_with_attachments(branch, commit, last_commit, args.emails, text_to_send, results_files, logFileName, args.lowerLimit, have_mutt, have_mail)
+        send_email_with_attachments(branch, commit, last_commit, args, text_to_send, results_files, logFileName, have_mutt, have_mail)
 
 
 if __name__ == '__main__':
@@ -178,11 +203,14 @@ if __name__ == '__main__':
     parser.add_argument('--message', help='attach an additional message to e-mail', default="")
     parser.add_argument('--repoURL', help='changes default repository URL', default=default_repo_url)
     parser.add_argument('--lowerLimit', type=float, help='send email if speed is lower than given limit', default=0.98)
+    parser.add_argument('--ratioLimit', type=float, help='send email if ratio is lower than given limit', default=0.999)
     parser.add_argument('--maxLoadAvg', type=float, help='maximum load average to start testing', default=0.75)
     parser.add_argument('--lastCLevel', type=int, help='last compression level for testing', default=5)
     parser.add_argument('--sleepTime', type=int, help='frequency of repository checking in seconds', default=300)
     parser.add_argument('--dry-run', dest='dry_run', action='store_true', help='not build', default=False)
+    parser.add_argument('--verbose', action='store_true', help='more verbose logs', default=False)
     args = parser.parse_args()
+    verbose = args.verbose
 
     # check if test files are accessible
     testFileNames = args.testFileNames.split()
@@ -196,24 +224,27 @@ if __name__ == '__main__':
             exit(1)
 
     # check availability of e-mail senders
-    have_mutt = does_command_exist("mutt -h");
-    have_mail = does_command_exist("mail -V");
+    have_mutt = does_command_exist("mutt -h")
+    have_mail = does_command_exist("mail -V")
     if not have_mutt and not have_mail:
         log("ERROR: e-mail senders 'mail' or 'mutt' not found")
         exit(1)
 
-    print("PARAMETERS:\nrepoURL=%s" % args.repoURL)
-    print("working_path=%s" % working_path)
-    print("clone_path=%s" % clone_path)
-    print("testFilePath(%s)=%s" % (len(testFilePaths), testFilePaths))
-    print("message=%s" % args.message)
-    print("emails=%s" % args.emails)
-    print("maxLoadAvg=%s" % args.maxLoadAvg)
-    print("lowerLimit=%s" % args.lowerLimit)
-    print("lastCLevel=%s" % args.lastCLevel)
-    print("sleepTime=%s" % args.sleepTime)
-    print("dry_run=%s" % args.dry_run)
-    print("have_mutt=%s have_mail=%s" % (have_mutt, have_mail))
+    if verbose:
+        print("PARAMETERS:\nrepoURL=%s" % args.repoURL)
+        print("working_path=%s" % working_path)
+        print("clone_path=%s" % clone_path)
+        print("testFilePath(%s)=%s" % (len(testFilePaths), testFilePaths))
+        print("message=%s" % args.message)
+        print("emails=%s" % args.emails)
+        print("maxLoadAvg=%s" % args.maxLoadAvg)
+        print("lowerLimit=%s" % args.lowerLimit)
+        print("ratioLimit=%s" % args.ratioLimit)
+        print("lastCLevel=%s" % args.lastCLevel)
+        print("sleepTime=%s" % args.sleepTime)
+        print("dry_run=%s" % args.dry_run)
+        print("verbose=%s" % args.verbose)
+        print("have_mutt=%s have_mail=%s" % (have_mutt, have_mail))
 
     # clone ZSTD repo if needed
     if not os.path.isdir(working_path):
@@ -241,7 +272,7 @@ if __name__ == '__main__':
             if (loadavg <= args.maxLoadAvg):
                 branches = git_get_branches()
                 for branch in branches:
-                    commit = execute('git show -s --format=%h ' + branch)[0]
+                    commit = execute('git show -s --format=%h ' + branch, verbose)[0]
                     last_commit = update_config_file(branch, commit)
                     if commit == last_commit:
                         log("skipping branch %s: head %s already processed" % (branch, commit))
@@ -252,13 +283,15 @@ if __name__ == '__main__':
                         test_commit(branch, commit, last_commit, args, testFilePaths, have_mutt, have_mail)
             else:
                 log("WARNING: main loadavg=%.2f is higher than %s" % (loadavg, args.maxLoadAvg))
-            log("sleep for %s seconds" % args.sleepTime)
+            if verbose:
+                log("sleep for %s seconds" % args.sleepTime)
             time.sleep(args.sleepTime)
         except Exception as e:
             stack = traceback.format_exc()
             email_topic = '%s:%s ERROR in %s:%s' % (email_header, pid, branch, commit)
             send_email(args.emails, email_topic, stack, have_mutt, have_mail)
             print(stack)
+            time.sleep(args.sleepTime)
         except KeyboardInterrupt:
             os.unlink(pidfile)
             send_email(args.emails, email_header + ':%s test-zstd-speed.py has been stopped' % pid, args.message, have_mutt, have_mail)
diff --git a/zlibWrapper/Makefile b/zlibWrapper/Makefile
index 21d56c5..9ad1c01 100644
--- a/zlibWrapper/Makefile
+++ b/zlibWrapper/Makefile
@@ -17,8 +17,8 @@ endif
 
 ZLIBWRAPPER_PATH = .
 EXAMPLE_PATH = examples
-CC = gcc
-CFLAGS = $(LOC) -I../lib/common -I$(ZLIBDIR) -I$(ZLIBWRAPPER_PATH) -O3 -std=gnu90
+CC ?= gcc
+CFLAGS = $(LOC) -I../lib -I../lib/common -I$(ZLIBDIR) -I$(ZLIBWRAPPER_PATH) -O3 -std=gnu90
 CFLAGS += -Wall -Wextra -Wcast-qual -Wcast-align -Wshadow -Wswitch-enum -Wdeclaration-after-statement -Wstrict-prototypes -Wundef
 LDFLAGS = $(LOC)
 RM = rm -f
diff --git a/zstd.rb b/zstd.rb
new file mode 100644
index 0000000..9992383
--- /dev/null
+++ b/zstd.rb
@@ -0,0 +1,18 @@
+class Zstd < Formula
+  desc "Zstandard - Fast real-time compression algorithm"
+  homepage "http://www.zstd.net/"
+  url "https://github.com/Cyan4973/zstd/archive/v0.7.4.tar.gz"
+  sha256 "35ab3a5084d0194e9ff08e702edb6f507eab1bfb8c09c913639241cec852e2b7"
+
+  def install
+    system "make", "install", "PREFIX=#{prefix}"
+  end
+
+  test do
+    (testpath/"input.txt").write("Hello, world." * 10)
+    system "#{bin}/zstd", "input.txt", "-o", "compressed.zst"
+    system "#{bin}/zstd", "--test", "compressed.zst"
+    system "#{bin}/zstd", "-d", "compressed.zst", "-o", "decompressed.txt"
+    system "cmp", "input.txt", "decompressed.txt"
+  end
+end
diff --git a/zstd_compression_format.md b/zstd_compression_format.md
new file mode 100644
index 0000000..da5c94a
--- /dev/null
+++ b/zstd_compression_format.md
@@ -0,0 +1,1149 @@
+Zstandard Compression Format
+============================
+
+### Notices
+
+Copyright (c) 2016 Yann Collet
+
+Permission is granted to copy and distribute this document
+for any purpose and without charge,
+including translations into other languages
+and incorporation into compilations,
+provided that the copyright notice and this notice are preserved,
+and that any substantive changes or deletions from the original
+are clearly marked.
+Distribution of this document is unlimited.
+
+### Version
+
+0.2.0 (22/07/16)
+
+
+Introduction
+------------
+
+The purpose of this document is to define a lossless compressed data format,
+that is independent of CPU type, operating system,
+file system and character set, suitable for
+file compression, pipe and streaming compression,
+using the [Zstandard algorithm](http://www.zstandard.org).
+
+The data can be produced or consumed,
+even for an arbitrarily long sequentially presented input data stream,
+using only an a priori bounded amount of intermediate storage,
+and hence can be used in data communications.
+The format uses the Zstandard compression method,
+and optional [xxHash-64 checksum method](http://www.xxhash.org),
+for detection of data corruption.
+
+The data format defined by this specification
+does not attempt to allow random access to compressed data.
+
+This specification is intended for use by implementers of software
+to compress data into Zstandard format and/or decompress data from Zstandard format.
+The text of the specification assumes a basic background in programming
+at the level of bits and other primitive data representations.
+
+Unless otherwise indicated below,
+a compliant compressor must produce data sets
+that conform to the specifications presented here.
+It doesn’t need to support all options though.
+
+A compliant decompressor must be able to decompress
+at least one working set of parameters
+that conforms to the specifications presented here.
+It may also ignore informative fields, such as checksum.
+Whenever it does not support a parameter defined in the compressed stream,
+it must produce a non-ambiguous error code and associated error message
+explaining which parameter is unsupported.
+
+
+Overall conventions
+-----------
+In this document square brackets i.e. `[` and `]` are used to indicate optional fields or parameters.
+
+
+Definitions
+-----------
+Content compressed by Zstandard is transformed into a Zstandard __frame__.
+Multiple frames can be appended into a single file or stream.
+A frame is totally independent, has a defined beginning and end,
+and a set of parameters which tells the decoder how to decompress it.
+
+A frame encapsulates one or multiple __blocks__.
+Each block can be compressed or not,
+and has a guaranteed maximum content size, which depends on frame parameters.
+Unlike frames, each block depends on previous blocks for proper decoding.
+However, each block can be decompressed without waiting for its successor,
+allowing streaming operations.
+
+
+Frame Concatenation
+-------------------
+
+In some circumstances, it may be required to append multiple frames,
+for example in order to add new data to an existing compressed file
+without re-framing it.
+
+In such a case, each frame brings its own set of descriptor flags.
+Each frame is considered independent.
+The only relation between frames is their sequential order.
+
+The ability to decode multiple concatenated frames
+within a single stream or file is left outside of this specification.
+As an example, the reference `zstd` command line utility is able
+to decode all concatenated frames in their sequential order,
+delivering the final decompressed result as if it were a single piece of content.
+
+
+General Structure of Zstandard Frame format
+-------------------------------------------
+The structure of a single Zstandard frame is as follows:
+
+| `Magic_Number` | `Frame_Header` |`Data_Block`| [More data blocks] | [`Content_Checksum`] |
+|:--------------:|:--------------:|:----------:| ------------------ |:--------------------:|
+| 4 bytes        |  2-14 bytes    | n bytes    |                    |   0-4 bytes          |
+
+__`Magic_Number`__
+
+4 Bytes, Little-endian format.
+Value : 0xFD2FB527
+
+__`Frame_Header`__
+
+2 to 14 Bytes, detailed in [next part](#the-structure-of-frame_header).
+
+__`Data_Block`__
+
+Detailed in [next chapter](#the-structure-of-data_block).
+That’s where compressed data is stored.
+
+__`Content_Checksum`__
+
+An optional 32-bit checksum, only present if `Content_Checksum_flag` is set.
+The content checksum is the result
+of [xxh64() hash function](https://www.xxHash.com)
+digesting the original (decoded) data as input, and a seed of zero.
+The low 4 bytes of the checksum are stored in little endian format.
+
+
+The structure of `Frame_Header`
+-------------------------------
+The `Frame_Header` has a variable size, which uses a minimum of 2 bytes,
+and up to 14 bytes depending on optional parameters.
+The structure of `Frame_Header` is as follows:
+
+| `Frame_Header_Descriptor` | [`Window_Descriptor`] | [`Dictionary_ID`] | [`Frame_Content_Size`] |
+| ------------------------- | --------------------- | ----------------- | ---------------------- |
+| 1 byte                    | 0-1 byte              | 0-4 bytes         | 0-8 bytes              |
+
+### `Frame_Header_Descriptor`
+
+The first byte of the header is called the `Frame_Header_Descriptor`.
+It tells which other fields are present.
+Decoding this byte is enough to tell the size of `Frame_Header`.
+
+| Bit number | Field name                |
+| ---------- | ----------                |
+| 7-6        | `Frame_Content_Size_flag` |
+| 5          | `Single_Segment_flag`     |
+| 4          | `Unused_bit`              |
+| 3          | `Reserved_bit`            |
+| 2          | `Content_Checksum_flag`   |
+| 1-0        | `Dictionary_ID_flag`      |
+
+In this table, bit 7 is highest bit, while bit 0 is lowest.
+
+__`Frame_Content_Size_flag`__
+
+This is a 2-bit flag (`= Frame_Header_Descriptor >> 6`),
+specifying if decompressed data size is provided within the header.
+The `Flag_Value` can be converted into `Field_Size`,
+which is the number of bytes used by `Frame_Content_Size`
+according to the following table:
+
+|`Flag_Value`|  0  |  1  |  2  |  3  |
+| ---------- | --- | --- | --- | --- |
+|`Field_Size`| 0-1 |  2  |  4  |  8  |
+
+When `Flag_Value` is `0`, `Field_Size` depends on `Single_Segment_flag` :
+if `Single_Segment_flag` is set, `Field_Size` is 1.
+Otherwise, `Field_Size` is 0 (content size not provided).
+
+__`Single_Segment_flag`__
+
+If this flag is set,
+data must be regenerated within a single continuous memory segment.
+
+In this case, `Frame_Content_Size` is necessarily present,
+but `Window_Descriptor` byte is skipped.
+As a consequence, the decoder must allocate a memory segment
+of size equal to or larger than `Frame_Content_Size`.
+
+In order to protect the decoder from unreasonable memory requirements,
+a decoder can reject a compressed frame
+which requests a memory size beyond decoder's authorized range.
+
+For broader compatibility, decoders are recommended to support
+memory sizes of at least 8 MB.
+This is just a recommendation,
+each decoder is free to support higher or lower limits,
+depending on local limitations.
+
+__`Unused_bit`__
+
+The value of this bit should be set to zero.
+A decoder compliant with this specification version shall not interpret it.
+It might be used in a future version,
+to signal a property which is not mandatory to properly decode the frame.
+
+__`Reserved_bit`__
+
+This bit is reserved for some future feature.
+Its value _must be zero_.
+A decoder compliant with this specification version must ensure it is not set.
+This bit may be used in a future revision,
+to signal a feature that must be interpreted to decode the frame correctly.
+
+__`Content_Checksum_flag`__
+
+If this flag is set, a 32-bit `Content_Checksum` will be present at the end of the frame.
+See `Content_Checksum` paragraph.
+
+__`Dictionary_ID_flag`__
+
+This is a 2-bit flag (`= Frame_Header_Descriptor & 3`),
+telling if a dictionary ID is provided within the header.
+It also specifies the size of this field.
+
+|  Value   |  0  |  1  |  2  |  3  |
+| -------- | --- | --- | --- | --- |
+|Field size|  0  |  1  |  2  |  4  |
+
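The flag layout of `Frame_Header_Descriptor` can be turned into a header-size calculation. The following is a minimal Python sketch, not taken from the reference implementation; the function name `parse_frame_header_descriptor` and the dictionary keys are hypothetical.

```python
def parse_frame_header_descriptor(fhd: int) -> dict:
    """Split the descriptor byte into flags and derive the size of each optional field."""
    fcs_flag = fhd >> 6                 # bits 7-6: Frame_Content_Size_flag
    single_segment = (fhd >> 5) & 1     # bit 5: Single_Segment_flag
    reserved = (fhd >> 3) & 1           # bit 3: Reserved_bit, must be zero
    checksum = (fhd >> 2) & 1           # bit 2: Content_Checksum_flag
    dict_id_flag = fhd & 3              # bits 1-0: Dictionary_ID_flag
    if reserved:
        raise ValueError("Reserved_bit is set")
    # Flag value 0 means 1 byte when Single_Segment_flag is set, else no field.
    fcs_bytes = (1 if single_segment else 0, 2, 4, 8)[fcs_flag]
    dict_id_bytes = (0, 1, 2, 4)[dict_id_flag]
    window_bytes = 0 if single_segment else 1   # Window_Descriptor is skipped
    return {
        "single_segment": bool(single_segment),
        "content_checksum": bool(checksum),
        "window_descriptor_bytes": window_bytes,
        "dictionary_id_bytes": dict_id_bytes,
        "frame_content_size_bytes": fcs_bytes,
        "header_size": 1 + window_bytes + dict_id_bytes + fcs_bytes,
    }
```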
+### `Window_Descriptor`
+
+Provides guarantees on maximum back-reference distance
+that will be used within compressed data.
+This information is important for decoders to allocate enough memory.
+
+The `Window_Descriptor` byte is optional. It is absent when `Single_Segment_flag` is set.
+In this case, the maximum back-reference distance is the content size itself,
+which can be any value from 1 to 2^64-1 bytes (16 EB).
+
+| Bit numbers |    7-3   |    2-0   |
+| ----------- | -------- | -------- |
+| Field name  | Exponent | Mantissa |
+
+Maximum distance is given by the following formulae :
+```
+windowLog = 10 + Exponent;
+windowBase = 1 << windowLog;
+windowAdd = (windowBase / 8) * Mantissa;
+windowSize = windowBase + windowAdd;
+```
+The minimum window size is 1 KB.
+The maximum size is `15*(1<<38)` bytes, which is 3.75 TB.
+
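As a worked illustration of the window-size formulae, here is a short Python sketch. The function name `window_size` is hypothetical; the arithmetic follows the four lines of the formulae directly.

```python
def window_size(window_descriptor: int) -> int:
    """Compute windowSize in bytes from a Window_Descriptor byte."""
    exponent = window_descriptor >> 3      # bits 7-3
    mantissa = window_descriptor & 7       # bits 2-0
    window_log = 10 + exponent
    window_base = 1 << window_log
    window_add = (window_base // 8) * mantissa
    return window_base + window_add
```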
+To properly decode compressed data,
+a decoder will need to allocate a buffer of at least `windowSize` bytes.
+
+In order to protect the decoder from unreasonable memory requirements,
+a decoder can refuse a compressed frame
+which requests a memory size beyond decoder's authorized range.
+
+For improved interoperability,
+decoders are recommended to be compatible with window sizes of 8 MB,
+and encoders are recommended to not request more than 8 MB.
+It's merely a recommendation though;
+decoders are free to support higher or lower limits,
+depending on local limitations.
+
+### `Dictionary_ID`
+
+This is a variable size field, which contains
+the ID of the dictionary required to properly decode the frame.
+Note that this field is optional. When it's not present,
+it's up to the caller to make sure it uses the correct dictionary.
+
+Field size depends on `Dictionary_ID_flag`.
+1 byte can represent an ID 0-255.
+2 bytes can represent an ID 0-65535.
+4 bytes can represent an ID 0-4294967295.
+
+It's allowed to represent a small ID (for example `13`)
+with a large 4-byte dictionary ID, at some cost in compactness.
+
+_Reserved ranges :_
+If the frame is going to be distributed in a private environment,
+any dictionary ID can be used.
+However, for public distribution of compressed frames using a dictionary,
+the following ranges are reserved for future use and should not be used :
+- low range : 1 - 32767
+- high range : >= (2^31)
+
+
+### `Frame_Content_Size`
+
+This is the original (uncompressed) size. This information is optional.
+The `Field_Size` is provided according to value of `Frame_Content_Size_flag`.
+The `Field_Size` can be equal to 0 (not present), 1, 2, 4 or 8 bytes.
+Format is Little-endian.
+
+| `Field_Size` |    Range   |
+| ------------ | ---------- |
+|      1       |   0 - 255  |
+|      2       | 256 - 65791|
+|      4       | 0 - 2^32-1 |
+|      8       | 0 - 2^64-1 |
+
+When `Field_Size` is 1, 4 or 8 bytes, the value is read directly.
+When `Field_Size` is 2, _the offset of 256 is added_.
+It's allowed to represent a small size (for example `18`) using any compatible variant.
+
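The reading rule for `Frame_Content_Size`, including the offset applied to the 2-byte variant, can be sketched in Python as below. The function name `decode_frame_content_size` is hypothetical.

```python
def decode_frame_content_size(field: bytes) -> int:
    """Decode the little-endian Frame_Content_Size field (1, 2, 4 or 8 bytes)."""
    if len(field) not in (1, 2, 4, 8):
        raise ValueError("Field_Size must be 1, 2, 4 or 8 bytes")
    value = int.from_bytes(field, "little")
    # Only the 2-byte variant carries the offset of 256.
    return value + 256 if len(field) == 2 else value
```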
+
+The structure of `Data_Block`
+-----------------------------
+The structure of `Data_Block` is as follows:
+
+| `Last_Block` | `Block_Type` | `Block_Size` | `Block_Content` |
+|:------------:|:------------:|:------------:|:---------------:|
+|   1 bit      |  2 bits      |  21 bits     |  n bytes        |
+
+The block header uses 3 bytes.
+
+__`Last_Block`__
+
+The lowest bit signals if this block is the last one.
+Frame ends right after this block.
+It may be followed by an optional `Content_Checksum`.
+
+__`Block_Type` and `Block_Size`__
+
+The next 2 bits represent the `Block_Type`,
+while the remaining 21 bits represent the `Block_Size`.
+Format is __little-endian__.
+
+There are 4 block types :
+
+|    Value     |      0      |     1       |  2                 |    3      |
+| ------------ | ----------- | ----------- | ------------------ | --------- |
+| `Block_Type` | `Raw_Block` | `RLE_Block` | `Compressed_Block` | `Reserved`|
+
+- `Raw_Block` - this is an uncompressed block.
+  `Block_Size` is the number of bytes to read and copy.
+- `RLE_Block` - this is a single byte, repeated N times.
+  In which case, `Block_Size` is the size to regenerate,
+  while the "compressed" block is just 1 byte (the byte to repeat).
+- `Compressed_Block` - this is a [Zstandard compressed block](#the-format-of-compressed_block),
+  detailed in another section of this specification.
+  `Block_Size` is the compressed size.
+  Decompressed size is unknown,
+  but its maximum possible value is guaranteed (see below)
+- `Reserved` - this is not a block.
+  This value cannot be used with current version of this specification.
+
+Block sizes must respect a few rules :
+- In compressed mode, compressed size is always strictly `< decompressed size`.
+- Block decompressed size is always <= maximum back-reference distance.
+- Block decompressed size is always <= 128 KB.
+
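The 3-byte block header can be unpacked as sketched below in Python. This assumes the three bytes form one little-endian 24-bit value, with `Last_Block` in the lowest bit, as described in the field descriptions; the name `parse_block_header` is hypothetical.

```python
BLOCK_TYPES = ("Raw_Block", "RLE_Block", "Compressed_Block", "Reserved")

def parse_block_header(header: bytes):
    """Return (last_block, block_type, block_size) from a 3-byte block header."""
    if len(header) != 3:
        raise ValueError("block header is exactly 3 bytes")
    value = int.from_bytes(header, "little")
    last_block = bool(value & 1)               # lowest bit
    block_type = BLOCK_TYPES[(value >> 1) & 3] # next 2 bits
    block_size = value >> 3                    # remaining 21 bits
    if block_type == "Reserved":
        raise ValueError("Reserved block type cannot be used")
    return last_block, block_type, block_size
```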
+
+__`Block_Content`__
+
+The `Block_Content` is where the actual data to decode stands.
+It might be compressed or not, depending on previous field indications.
+A data block is not necessarily "full" :
+since an arbitrary “flush” may happen anytime,
+block decompressed content can be any size,
+up to `Block_Maximum_Decompressed_Size`, which is the smallest of :
+- Maximum back-reference distance
+- 128 KB
+
+
+Skippable Frames
+----------------
+
+| `Magic_Number` | `Frame_Size` | `User_Data` |
+|:--------------:|:------------:|:-----------:|
+|   4 bytes      |  4 bytes     |   n bytes   |
+
+Skippable frames allow the insertion of user-defined data
+into a flow of concatenated frames.
+Their design is straightforward,
+with the sole objective to allow the decoder to quickly skip
+over user-defined data and continue decoding.
+
+Skippable frames defined in this specification are compatible with [LZ4] ones.
+
+[LZ4]:http://www.lz4.org
+
+__`Magic_Number`__
+
+4 Bytes, Little-endian format.
+Value : 0x184D2A5X, which means any value from 0x184D2A50 to 0x184D2A5F.
+All 16 values are valid to identify a skippable frame.
+
+__`Frame_Size`__
+
+This is the size, in bytes, of the following `User_Data`
+(not including the magic number or the size field itself).
+This field is an unsigned 32-bit value, stored in 4 bytes, little-endian format.
+This means `User_Data` can’t be bigger than (2^32-1) bytes.
+
+__`User_Data`__
+
+The `User_Data` can be anything. Data will just be skipped by the decoder.
+
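Skipping over a skippable frame only requires the magic number check and the size field. A minimal Python sketch follows; the function name `skippable_frame_length` and its return convention (`None` when the data is not a skippable frame) are assumptions made for illustration.

```python
def skippable_frame_length(data: bytes, pos: int = 0):
    """Return the total length of the skippable frame at pos, or None if there is none."""
    if len(data) - pos < 8:
        return None
    magic = int.from_bytes(data[pos:pos + 4], "little")
    # Any value from 0x184D2A50 to 0x184D2A5F identifies a skippable frame.
    if magic & 0xFFFFFFF0 != 0x184D2A50:
        return None
    frame_size = int.from_bytes(data[pos + 4:pos + 8], "little")
    return 8 + frame_size   # magic number + size field + User_Data
```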
+
+The format of `Compressed_Block`
+--------------------------------
+The size of `Compressed_Block` must be provided using `Block_Size` field from `Data_Block`.
+The `Compressed_Block` has a guaranteed maximum regenerated size,
+in order to properly allocate destination buffer.
+See [`Data_Block`](#the-structure-of-data_block) for more details.
+
+A compressed block consists of 2 sections :
+- [Literals section](#literals-section)
+- [Sequences section](#sequences-section)
+
+### Prerequisites
+To decode a compressed block, the following elements are necessary :
+- Previous decoded blocks, up to a distance of `windowSize`,
+  or all previous blocks when `Single_Segment_flag` is set.
+- List of "recent offsets" from previous compressed block.
+- Decoding tables of previous compressed block for each symbol type
+  (literals, litLength, matchLength, offset).
+
+
+### Literals section
+
+During the sequence phase, literals are interleaved with match copy operations.
+All literals are grouped together in the first part of the block.
+They can be decoded first, and then copied during sequence operations,
+or they can be decoded on the fly, as needed by sequence commands.
+
+| Literals section header | [Huffman Tree Description] | Stream1 | [Stream2] | [Stream3] | [Stream4] |
+| ----------------------- | -------------------------- | ------- | --------- | --------- | --------- |
+
+Literals can be stored uncompressed or compressed using Huffman prefix codes.
+When compressed, an optional tree description can be present,
+followed by 1 or 4 streams.
+
+
+#### Literals section header
+
+Header is in charge of describing how literals are packed.
+It's a byte-aligned variable-size bitfield, ranging from 1 to 5 bytes,
+using little-endian convention.
+
+| Literals Block Type | sizes format | regenerated size | [compressed size] |
+| ------------------- | ------------ | ---------------- | ----------------- |
+|   2 bits            |  1 - 2 bits  |    5 - 20 bits   |    0 - 18 bits    |
+
+In this representation, bits on the left are the lowest bits.
+
+__Literals Block Type__ :
+
+This field uses the 2 lowest bits of the first byte, describing 4 different block types :
+
+|       Value         |  0  |  1  |      2     |      3      |
+| ------------------- | --- | --- | ---------- | ----------- |
+| Literals Block Type | Raw | RLE | Compressed | RepeatStats |
+
+- Raw literals block - Literals are stored uncompressed.
+- RLE literals block - Literals consist of a single byte value repeated N times.
+- Compressed literals block - This is a standard huffman-compressed block,
+        starting with a huffman tree description.
+        See details below.
+- Repeat Stats literals block - This is a huffman-compressed block,
+        using huffman tree _from previous huffman-compressed literals block_.
+        Huffman tree description will be skipped.
+
+__Sizes format__ :
+
+Size formats are divided into 2 families :
+
+- For compressed blocks, both the compressed size
+  and the decompressed size must be decoded, as well as the number of streams.
+- For Raw or RLE blocks, it's enough to decode the size to regenerate.
+
+For values spanning several bytes, convention is Little-endian.
+
+__Sizes format for Raw and RLE literals block__ :
+
+- Value : x0 : Regenerated size uses 5 bits (0-31).
+               Total literal header size is 1 byte.
+               `size = h[0]>>3;`
+- Value : 01 : Regenerated size uses 12 bits (0-4095).
+               Total literal header size is 2 bytes.
+               `size = (h[0]>>4) + (h[1]<<4);`
+- Value : 11 : Regenerated size uses 20 bits (0-1048575).
+               Total literal header size is 3 bytes.
+               `size = (h[0]>>4) + (h[1]<<4) + (h[2]<<12);`
+
+Note : it's allowed to represent a short value (ex : `13`)
+using a long format, accepting the reduced compactness.
+
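The three size formats for Raw and RLE literals blocks can be decoded as in the Python sketch below. It assumes the sizes format occupies the bits right after the 2-bit block type, which is what the shift amounts in the expressions for `size` imply; the function name is hypothetical.

```python
def parse_raw_rle_literals_header(h: bytes):
    """Return (header_size, regenerated_size) for a Raw or RLE literals block."""
    block_type = h[0] & 3                 # 0 = Raw, 1 = RLE
    if block_type > 1:
        raise ValueError("not a Raw or RLE literals block")
    size_format = (h[0] >> 2) & 3
    if size_format & 1 == 0:              # "x0": 5-bit size, 1-byte header
        return 1, h[0] >> 3
    if size_format == 1:                  # "01": 12-bit size, 2-byte header
        return 2, (h[0] >> 4) + (h[1] << 4)
    # "11": 20-bit size, 3-byte header
    return 3, (h[0] >> 4) + (h[1] << 4) + (h[2] << 12)
```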
+__Sizes format for Compressed literals block and Repeat Stats literals block__ :
+
+- Value : 00 : _Single stream_.
+               Compressed and regenerated sizes use 10 bits (0-1023).
+               Total literal header size is 3 bytes.
+- Value : 01 : 4 streams.
+               Compressed and regenerated sizes use 10 bits (0-1023).
+               Total literal header size is 3 bytes.
+- Value : 10 : 4 streams.
+               Compressed and regenerated sizes use 14 bits (0-16383).
+               Total literal header size is 4 bytes.
+- Value : 11 : 4 streams.
+               Compressed and regenerated sizes use 18 bits (0-262143).
+               Total literal header size is 5 bytes.
+
+Compressed and regenerated size fields follow little-endian convention.
+
+#### Huffman Tree description
+
+This section is only present when literals block type is `Compressed` (`2`).
+
+Prefix coding represents symbols from an a priori known alphabet
+by bit sequences (codewords), one codeword for each symbol,
+in a manner such that different symbols may be represented
+by bit sequences of different lengths,
+but a parser can always parse an encoded string
+unambiguously symbol-by-symbol.
+
+Given an alphabet with known symbol frequencies,
+the Huffman algorithm allows the construction of an optimal prefix code
+using the fewest bits of any possible prefix codes for that alphabet.
+
+Prefix code must not exceed a maximum code length.
+More bits improve accuracy but cost more header size,
+and require more memory or more complex decoding operations.
+This specification limits maximum code length to 11 bits.
+
+
+##### Representation
+
+All literal values from zero (included) to last present one (excluded)
+are represented by `weight` values, from 0 to `maxBits`.
+Transformation from `weight` to `nbBits` follows this formula :
+`nbBits = weight ? maxBits + 1 - weight : 0;` .
+The last symbol's weight is deduced from previously decoded ones,
+by completing to the nearest power of 2.
+This power of 2 gives `maxBits`, the depth of the current tree.
+
+__Example__ :
+Let's presume the following huffman tree must be described :
+
+| literal |  0  |  1  |  2  |  3  |  4  |  5  |
+| ------- | --- | --- | --- | --- | --- | --- |
+| nbBits  |  1  |  2  |  3  |  0  |  4  |  4  |
+
+The tree depth is 4, since its longest code uses 4 bits.
+Value `5` will not be listed, nor will values above `5`.
+Values from `0` to `4` will be listed using `weight` instead of `nbBits`.
+Weight formula is : `weight = nbBits ? maxBits + 1 - nbBits : 0;`
+It gives the following series of weights :
+
+| weights |  4  |  3  |  2  |  0  |  1  |
+| ------- | --- | --- | --- | --- | --- |
+| literal |  0  |  1  |  2  |  3  |  4  |
+
+The decoder will do the inverse operation :
+having collected weights of literals from `0` to `4`,
+it knows the last literal, `5`, is present with a non-zero weight.
+The weight of `5` can be deduced by completing to the nearest power of 2.
+Sum of 2^(weight-1) (excluding 0) is :
+`8 + 4 + 2 + 0 + 1 = 15`
+Nearest power of 2 is 16.
+Therefore, `maxBits = 4` and `weight[5] = 1`.
+
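The deduction of the last weight can be sketched in Python as follows. This is an illustration, not the reference implementation: the function name `complete_weights` is hypothetical, and "nearest power of 2" is assumed to mean the smallest power of 2 strictly greater than the partial sum.

```python
def complete_weights(weights):
    """Deduce the last symbol's weight; return (max_bits, all_weights, nb_bits)."""
    total = sum(1 << (w - 1) for w in weights if w > 0)
    if total == 0:
        raise ValueError("at least one non-zero weight is required")
    max_bits = total.bit_length()          # 1 << max_bits is the next power of 2
    rest = (1 << max_bits) - total         # share left for the last symbol
    if rest & (rest - 1):
        raise ValueError("remainder is not a power of 2")
    last_weight = rest.bit_length()        # rest == 2^(last_weight - 1)
    full = list(weights) + [last_weight]
    nb_bits = [max_bits + 1 - w if w else 0 for w in full]
    return max_bits, full, nb_bits
```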
+##### Huffman Tree header
+
+This is a single byte value (0-255),
+which tells how to decode the list of weights.
+
+- if headerByte >= 128 : this is a direct representation,
+  where each weight is written directly as a 4 bits field (0-15).
+  The full representation occupies `((nbSymbols+1)/2)` bytes,
+  meaning it uses a last full byte even if nbSymbols is odd.
+  `nbSymbols = headerByte - 127;`.
+  Note that maximum nbSymbols is 255-127 = 128.
+  A larger series must necessarily use FSE compression.
+
+- if headerByte < 128 :
+  the series of weights is compressed by FSE.
+  The length of the FSE-compressed series is `headerByte` (0-127).
+
+##### FSE (Finite State Entropy) compression of huffman weights
+
+The series of weights is compressed using FSE compression.
+It's a single bitstream with 2 interleaved states,
+sharing a single distribution table.
+
+To decode an FSE bitstream, it is necessary to know its compressed size.
+Compressed size is provided by `headerByte`.
+It's also necessary to know its _maximum possible_ decompressed size,
+which is `255`, since literal values span from `0` to `255`,
+and last symbol value is not represented.
+
+An FSE bitstream starts with a header, describing the probability distribution.
+It will create a Decoding Table.
+The table must be pre-allocated, which requires a known maximum accuracy.
+For a list of huffman weights, maximum accuracy is 7 bits.
+
+FSE header is [described in relevant chapter](#fse-distribution-table--condensed-format),
+and so is [FSE bitstream](#bitstream).
+The main difference is that Huffman header compression uses 2 states,
+which share the same FSE distribution table.
+Bitstream contains only FSE symbols (no interleaved "raw bitfields").
+The number of symbols to decode is discovered
+by tracking bitStream overflow condition.
+When both states have overflowed the bitstream, end is reached.
+
+
+##### Conversion from weights to huffman prefix codes
+
+All present symbols shall now have a `weight` value.
+It is possible to transform weights into nbBits, using this formula :
+`nbBits = weight ? maxBits + 1 - weight : 0;` .
+
+Symbols are sorted by weight. Within same weight, symbols keep natural order.
+Symbols with a weight of zero are removed.
+Then, starting from lowest weight, prefix codes are distributed in order.
+
+__Example__ :
+Let's presume the following list of weights has been decoded :
+
+| Literal |  0  |  1  |  2  |  3  |  4  |  5  |
+| ------- | --- | --- | --- | --- | --- | --- |
+|  weight |  4  |  3  |  2  |  0  |  1  |  1  |
+
+Sorted by weight and then natural order,
+it gives the following distribution :
+
+| Literal      |  3  |  4  |  5  |  2  |  1  |   0  |
+| ------------ | --- | --- | --- | --- | --- | ---- |
+| weight       |  0  |  1  |  1  |  2  |  3  |   4  |
+| nb bits      |  0  |  4  |  4  |  3  |  2  |   1  |
+| prefix codes | N/A | 0000| 0001| 001 | 01  |   1  |
+
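The distribution of prefix codes shown in the table can be reproduced with the Python sketch below. The function name `assign_prefix_codes` is hypothetical, and codes are returned as bit strings purely for readability.

```python
def assign_prefix_codes(weights):
    """Map each present symbol to its prefix code, given the complete list of weights."""
    total = sum(1 << (w - 1) for w in weights if w > 0)
    if total == 0 or total & (total - 1):
        raise ValueError("weights do not sum to a power of 2")
    max_bits = total.bit_length() - 1
    # Sort by weight, then natural order; zero-weight symbols are removed.
    order = sorted((s for s, w in enumerate(weights) if w > 0),
                   key=lambda s: (weights[s], s))
    codes = {}
    code = 0
    prev_bits = max_bits
    for symbol in order:
        nb_bits = max_bits + 1 - weights[symbol]
        code >>= prev_bits - nb_bits       # shorter codes drop the low bits
        codes[symbol] = format(code, "0{}b".format(nb_bits))
        code += 1
        prev_bits = nb_bits
    return codes
```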
+
+#### Literals bitstreams
+
+##### Bitstreams sizes
+
+As seen in a previous paragraph,
+there are 2 flavors of huffman-compressed literals :
+single stream, and 4-streams.
+
+4-streams is useful for CPUs with multiple execution units and out-of-order execution.
+Since each stream can be decoded independently,
+it's possible to decode them up to 4x faster than a single stream,
+presuming the CPU has enough parallelism available.
+
+For single stream, header provides both the compressed and regenerated size.
+For 4-streams though,
+header only provides compressed and regenerated size of all 4 streams combined.
+In order to properly decode the 4 streams,
+it's necessary to know the compressed and regenerated size of each stream.
+
+Regenerated size of each stream can be calculated by `(totalSize+3)/4`,
+except for last one, which can be up to 3 bytes smaller, to reach `totalSize`.
+
+Compressed size is provided explicitly : in the 4-streams variant,
+bitstreams are preceded by 3 unsigned Little-Endian 16-bits values.
+Each value represents the compressed size of one stream, in order.
+The last stream size is deduced from total compressed size
+and from previously decoded stream sizes :
+`stream4CSize = totalCSize - 6 - stream1CSize - stream2CSize - stream3CSize;`
+
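The per-stream sizes of the 4-streams variant can be derived as in this Python sketch. The function name `split_four_streams` and its parameter names are hypothetical; `jump_table` stands for the 6 bytes holding the three 16-bit compressed sizes.

```python
def split_four_streams(total_regen_size: int, total_csize: int, jump_table: bytes):
    """Return (compressed_sizes, regenerated_sizes), each a list of 4 values."""
    s1, s2, s3 = (int.from_bytes(jump_table[i:i + 2], "little") for i in (0, 2, 4))
    # The 6 bytes of the size table are part of the total compressed size.
    s4 = total_csize - 6 - s1 - s2 - s3
    segment = (total_regen_size + 3) // 4
    last = total_regen_size - 3 * segment   # up to 3 bytes smaller than the others
    if s4 < 0 or last < 0:
        raise ValueError("inconsistent stream sizes")
    return [s1, s2, s3, s4], [segment, segment, segment, last]
```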
+##### Bitstreams read and decode
+
+Each bitstream must be read _backward_,
+that is starting from the end down to the beginning.
+Therefore it's necessary to know the size of each bitstream.
+
+It's also necessary to know exactly which _bit_ is the last one.
+This is detected by a final bit flag :
+the highest set bit of the last byte is a final-bit-flag.
+Consequently, a last byte of `0` is not possible.
+And the final-bit-flag itself is not part of the useful bitstream.
+Hence, the last byte contains between 0 and 7 useful bits.
+
+Starting from the end,
+it's possible to read the bitstream in a little-endian fashion,
+keeping track of already used bits.
+
+Reading the last `maxBits` bits,
+it's then possible to compare extracted value to decoding table,
+determining the symbol to decode and number of bits to discard.
+
+The process continues until the required number of symbols per stream has been read.
+A bitstream must be entirely and exactly consumed,
+reaching exactly its beginning position with _all_ bits consumed;
+otherwise, the decoding process is considered faulty.
+
+
+### Sequences section
+
+A compressed block is a succession of _sequences_ .
+A sequence is a literal copy command, followed by a match copy command.
+A literal copy command specifies a length.
+It is the number of bytes to be copied (or extracted) from the literal section.
+A match copy command specifies an offset and a length.
+The offset gives the position to copy from,
+which can be within a previous block.
+
+There are 3 symbol types, `literalLength`, `matchLength` and `offset`,
+which are encoded together, interleaved in a single _bitstream_.
+
+Each symbol is a _code_ in its own context,
+which specifies a baseline and a number of bits to add.
+_Codes_ are FSE compressed,
+and interleaved with raw additional bits in the same bitstream.
+
+The Sequences section starts with a header,
+followed by optional Probability tables for each symbol type,
+followed by the bitstream.
+
+| Header | [LitLengthTable] | [OffsetTable] | [MatchLengthTable] | bitStream |
+| ------ | ---------------- | ------------- | ------------------ | --------- |
+
+To decode the Sequence section, it's required to know its size.
+This size is deduced from `blockSize - literalSectionSize`.
+
+
+#### Sequences section header
+
+It consists of 2 items :
+- Nb of Sequences
+- Flags providing Symbol compression types
+
+__Nb of Sequences__
+
+This is a variable size field, `nbSeqs`, using between 1 and 3 bytes.
+Let's call its first byte `byte0`.
+- `if (byte0 == 0)` : there are no sequences.
+            The sequence section stops there.
+            Regenerated content is defined entirely by literals section.
+- `if (byte0 < 128)` : `nbSeqs = byte0;` . Uses 1 byte.
+- `if (byte0 < 255)` : `nbSeqs = ((byte0-128) << 8) + byte1;` . Uses 2 bytes.
+- `if (byte0 == 255)`: `nbSeqs = byte1 + (byte2<<8) + 0x7F00;` . Uses 3 bytes.
+
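The variable-size `nbSeqs` field can be decoded as in the following Python sketch, which applies the four cases in order. The function name `decode_nb_seqs` is hypothetical; it also returns how many bytes the field used.

```python
def decode_nb_seqs(data: bytes):
    """Return (nbSeqs, bytes_used) for the field starting at data[0]."""
    byte0 = data[0]
    if byte0 == 0:
        return 0, 1                                   # no sequences at all
    if byte0 < 128:
        return byte0, 1
    if byte0 < 255:
        return ((byte0 - 128) << 8) + data[1], 2
    return data[1] + (data[2] << 8) + 0x7F00, 3
```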
+__Symbol encoding modes__
+
+This is a single byte, defining the compression mode of each symbol type.
+
+|  BitNb  |   7-6  |   5-4  |   3-2  |    1-0   |
+| ------- | ------ | ------ | ------ | -------- |
+|FieldName| LLType | OFType | MLType | Reserved |
+
+The last field, `Reserved`, must be all-zeroes.
+
+`LLType`, `OFType` and `MLType` define the compression mode of
+Literal Lengths, Offsets and Match Lengths respectively.
+
+They follow the same enumeration :
+
+|       Value      |    0   |  1  |      2     |    3   |
+| ---------------- | ------ | --- | ---------- | ------ |
+| Compression Mode | predef | RLE | Compressed | Repeat |
+
+- "predef" : uses a pre-defined distribution table.
+- "RLE" : it's a single code, repeated `nbSeqs` times.
+- "Repeat" : re-use distribution table from previous compressed block.
+- "Compressed" : standard FSE compression.
+          A distribution table will be present.
+          It will be described in [next part](#distribution-tables).
+
+#### Symbols decoding
+
+##### Literal Lengths codes
+
+Literal lengths codes are values ranging from `0` to `35` included.
+They define lengths from 0 to 131071 bytes.
+
+|  Code  | 0-15 |
+| ------ | ---- |
+| length | Code |
+| nbBits |   0  |
+
+
+|   Code   |  16  |  17  |  18  |  19  |  20  |  21  |  22  |  23  |
+| -------- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- |
+| Baseline |  16  |  18  |  20  |  22  |  24  |  28  |  32  |  40  |
+| nb Bits  |   1  |   1  |   1  |   1  |   2  |   2  |   3  |   3  |
+
+|   Code   |  24  |  25  |  26  |  27  |  28  |  29  |  30  |  31  |
+| -------- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- |
+| Baseline |  48  |  64  |  128 |  256 |  512 | 1024 | 2048 | 4096 |
+| nb Bits  |   4  |   6  |   7  |   8  |   9  |  10  |  11  |  12  |
+
+|   Code   |  32  |  33  |  34  |  35  |
+| -------- | ---- | ---- | ---- | ---- |
+| Baseline | 8192 |16384 |32768 |65536 |
+| nb Bits  |  13  |  14  |  15  |  16  |
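
The tables above translate directly into two lookup arrays. The sketch below is only one possible transcription: the array and function names are made up, and `extraBits` is assumed to have been read from the bitstream by the caller.

```c
#include <assert.h>

/* Baseline and number of additional bits for each literal length code. */
static const unsigned LL_baseline[36] = {
       0,    1,    2,    3,    4,    5,    6,    7,
       8,    9,   10,   11,   12,   13,   14,   15,
      16,   18,   20,   22,   24,   28,   32,   40,
      48,   64,  128,  256,  512, 1024, 2048, 4096,
    8192, 16384, 32768, 65536 };

static const unsigned char LL_nbBits[36] = {
     0,  0,  0,  0,  0,  0,  0,  0,
     0,  0,  0,  0,  0,  0,  0,  0,
     1,  1,  1,  1,  2,  2,  3,  3,
     4,  6,  7,  8,  9, 10, 11, 12,
    13, 14, 15, 16 };

/* Length = baseline of the code + value of its additional bits. */
static unsigned literal_length(unsigned code, unsigned extraBits)
{
    assert(code < 36);
    assert(extraBits < (1u << LL_nbBits[code]));
    return LL_baseline[code] + extraBits;
}
```

The largest representable value is `literal_length(35, 65535)`, which gives 131071 as stated above.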
+
+__Default distribution__
+
+When "compression mode" is "predef",
+a pre-defined distribution is used for FSE compression.
+
+Below is its definition. It uses an accuracy of 6 bits (64 states).
+```
+short literalLengths_defaultDistribution[36] =
+        { 4, 3, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1,
+          2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 2, 1, 1, 1, 1, 1,
+         -1,-1,-1,-1 };
+```
+
+##### Match Lengths codes
+
+Match lengths codes are values ranging from `0` to `52` included.
+They define lengths from 3 to 131074 bytes.
+
+|  Code  |   0-31   |
+| ------ | -------- |
+| value  | Code + 3 |
+| nbBits |     0    |
+
+|   Code   |  32  |  33  |  34  |  35  |  36  |  37  |  38  |  39  |
+| -------- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- |
+| Baseline |  35  |  37  |  39  |  41  |  43  |  47  |  51  |  59  |
+| nb Bits  |   1  |   1  |   1  |   1  |   2  |   2  |   3  |   3  |
+
+|   Code   |  40  |  41  |  42  |  43  |  44  |  45  |  46  |  47  |
+| -------- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- |
+| Baseline |  67  |  83  |  99  |  131 |  259 |  515 | 1027 | 2051 |
+| nb Bits  |   4  |   4  |   5  |   7  |   8  |   9  |  10  |  11  |
+
+|   Code   |  48  |  49  |  50  |  51  |  52  |
+| -------- | ---- | ---- | ---- | ---- | ---- |
+| Baseline | 4099 | 8195 |16387 |32771 |65539 |
+| nb Bits  |  12  |  13  |  14  |  15  |  16  |
+
+__Default distribution__
+
+When "compression mode" is defined as "predef",
+a pre-defined distribution is used for FSE compression.
+
+Here is its definition. It uses an accuracy of 6 bits (64 states).
+```
+short matchLengths_defaultDistribution[53] =
+        { 1, 4, 3, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1,
+          1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
+          1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,-1,-1,
+         -1,-1,-1,-1,-1 };
+```
+
+##### Offset codes
+
+Offset codes are values ranging from `0` to `N`,
+with `N` being limited by maximum backreference distance.
+
+A decoder is free to limit its maximum `N` supported.
+Recommendation is to support at least up to `22`.
+For information, at the time of this writing,
+the reference decoder supports a maximum `N` value of `28` in 64-bits mode.
+
+An offset code is also the number of additional bits to read,
+and can be translated into an `OFValue` using the following formula :
+
+```
+OFValue = (1 << offsetCode) + readNBits(offsetCode);
+if (OFValue > 3) offset = OFValue - 3;
+```
+
+OFValue from 1 to 3 are special : they define "repeat codes",
+which means one of the previous offsets will be repeated.
+They are sorted in recency order, with 1 meaning the most recent one.
+See [Repeat offsets](#repeat-offsets) paragraph.
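
A small sketch of this translation, kept separate from the repeat-offset logic. The struct and function names are hypothetical, `extraBits` is assumed to be supplied by the bitstream reader, and the limit of `28` is simply the reference decoder maximum quoted above.

```c
#include <assert.h>

typedef struct {
    unsigned ofValue;   /* (1 << offsetCode) + additional bits */
    int      isRepeat;  /* non-zero when ofValue is 1, 2 or 3 */
    unsigned offset;    /* ofValue - 3, meaningful only when isRepeat == 0 */
} OffsetDecode;

static OffsetDecode decode_offset_code(unsigned offsetCode, unsigned extraBits)
{
    OffsetDecode r;
    assert(offsetCode <= 28);                /* assumed decoder limit */
    assert(extraBits < (1u << offsetCode));  /* exactly offsetCode bits */
    r.ofValue  = (1u << offsetCode) + extraBits;
    r.isRepeat = (r.ofValue <= 3);
    r.offset   = r.isRepeat ? 0 : r.ofValue - 3;
    return r;
}
```

Offset code `0` always yields `OFValue = 1`, code `1` yields `2` or `3`, and code `2` is the first one able to describe a regular offset.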
+
+__Default distribution__
+
+When "compression mode" is defined as "predef",
+a pre-defined distribution is used for FSE compression.
+
+Here is its definition. It uses an accuracy of 5 bits (32 states),
+and supports a maximum `N` of 28, allowing offset values up to 536,870,908 .
+
+If any sequence in the compressed block requires an offset larger than this,
+it's not possible to use the default distribution to represent it.
+
+```
+short offsetCodes_defaultDistribution[29] =
+        { 1, 1, 1, 1, 1, 1, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1,
+          1, 1, 1, 1, 1, 1, 1, 1,-1,-1,-1,-1,-1 };
+```
+
+#### Distribution tables
+
+Following the header, up to 3 distribution tables can be described.
+When present, they are in this order :
+- Literal lengths
+- Offsets
+- Match Lengths
+
+The content to decode depends on their respective encoding mode :
+- Predef : no content. Use pre-defined distribution table.
+- RLE : 1 byte. This is the only code to use across the whole compressed block.
+- Compressed (FSE) : A distribution table is present.
+- Repeat mode : no content. Re-use distribution from previous compressed block.
+
+##### FSE distribution table : condensed format
+
+An FSE distribution table describes the probabilities of all symbols
+from `0` to the last present one (included)
+on a normalized scale of `1 << AccuracyLog` .
+
+It's a bitstream which is read forward, in little-endian fashion.
+It's not necessary to know its exact size,
+since it will be discovered and reported by the decoding process.
+
+The bitstream starts by reporting on which scale it operates.
+`AccuracyLog = low4bits + 5;`
+Note that maximum `AccuracyLog` for literal and match lengths is `9`,
+and for offsets it is `8`. Higher values are considered errors.
+
+Then follows each symbol value, from `0` to last present one.
+The number of bits used by each field is variable.
+It depends on :
+
+- Remaining probabilities + 1 :
+  __example__ :
+  Presuming an AccuracyLog of 8,
+  and presuming 100 probabilities points have already been distributed,
+  the decoder may read any value from `0` to `255 - 100 + 1 == 156` (included).
+  Therefore, it must read `log2sup(156) == 8` bits.
+
+- Value decoded : small values use 1 less bit :
+  __example__ :
+  Presuming values from 0 to 156 (included) are possible,
+  255-156 = 99 values are remaining in an 8-bits field.
+  They are used this way :
+  first 99 values (hence from 0 to 98) use only 7 bits,
+  values from 99 to 156 use 8 bits.
+  This is achieved through this scheme :
+
+  | Value read | Value decoded | nb Bits used |
+  | ---------- | ------------- | ------------ |
+  |   0 -  98  |   0 -  98     |  7           |
+  |  99 - 127  |  99 - 127     |  8           |
+  | 128 - 226  |   0 -  98     |  7           |
+  | 227 - 255  | 128 - 156     |  8           |
+
+Symbols probabilities are read one by one, in order.
+
+Probability is obtained from the decoded value using the following formula :
+`Proba = value - 1;`
+
+It means value `0` becomes negative probability `-1`.
+`-1` is a special probability, which means `less than 1`.
+Its effect on distribution table is described in [next paragraph].
+For the purpose of calculating cumulated distribution, it counts as one.
+
+[next paragraph]:#fse-decoding--from-normalized-distribution-to-decoding-tables
+
+When a symbol has a probability of `zero`,
+it is followed by a 2-bits repeat flag.
+This repeat flag tells how many probabilities of zeroes follow the current one.
+It provides a number ranging from 0 to 3.
+If it is a 3, another 2-bits repeat flag follows, and so on.
+
+When last symbol reaches cumulated total of `1 << AccuracyLog`,
+decoding is complete.
+If the last symbol makes cumulated total go above `1 << AccuracyLog`,
+distribution is considered corrupted.
+
+Then the decoder can tell how many bytes were used in this process,
+and how many symbols are present.
+The bitstream consumes a round number of bytes.
+Any remaining bit within the last byte is just unused.
+
+##### FSE decoding : from normalized distribution to decoding tables
+
+The distribution of normalized probabilities is enough
+to create a unique decoding table.
+
+It follows the following build rule :
+
+The table has a size of `tableSize = 1 << AccuracyLog;`.
+Each cell describes the symbol decoded,
+and instructions to get the next state.
+
+Symbols are scanned in their natural order for `less than 1` probabilities.
+Symbols with this probability are being attributed a single cell,
+starting from the end of the table.
+These symbols define a full state reset, reading `AccuracyLog` bits.
+
+All remaining symbols are sorted in their natural order.
+Starting from symbol `0` and table position `0`,
+each symbol gets attributed as many cells as its probability.
+Cell allocation is spread, not linear :
+each successor position follows this rule :
+
+```
+position += (tableSize>>1) + (tableSize>>3) + 3;
+position &= tableSize-1;
+```
+
+A position is skipped if already occupied,
+typically by a "less than 1" probability symbol.
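
The build rule can be sketched as below. This is an illustration under stated assumptions, not the reference decoder: the function name and the use of `-1` to mark a free cell are choices made here, and the input is assumed to hold only probabilities `>= -1`.

```c
#include <assert.h>

/* Fills cells[0 .. tableSize-1] with the symbol decoded by each state.
 * proba[s] is the normalized probability of symbol s (-1 = "less than 1").
 * Returns 0 on success, -1 if probabilities do not sum to tableSize. */
static int spread_symbols(const short* proba, int nbSymbols,
                          int accuracyLog, short* cells)
{
    int const tableSize = 1 << accuracyLog;
    int const step = (tableSize >> 1) + (tableSize >> 3) + 3;
    int high = tableSize - 1;
    int total = 0, pos = 0, s, i;

    assert(accuracyLog >= 5 && accuracyLog <= 9);
    for (s = 0; s < nbSymbols; s++)
        total += (proba[s] == -1) ? 1 : proba[s];
    if (total != tableSize) return -1;

    for (i = 0; i < tableSize; i++) cells[i] = -1;   /* -1 marks a free cell */

    /* "less than 1" symbols : one cell each, starting from the end */
    for (s = 0; s < nbSymbols; s++)
        if (proba[s] == -1) cells[high--] = (short)s;

    /* remaining symbols : spread with a fixed step, skipping occupied cells */
    for (s = 0; s < nbSymbols; s++) {
        for (i = 0; i < proba[s]; i++) {
            while (cells[pos] != -1) pos = (pos + step) & (tableSize - 1);
            cells[pos] = (short)s;
            pos = (pos + step) & (tableSize - 1);
        }
    }
    return 0;
}
```

Since `step` is odd and `tableSize` is a power of 2, the walk visits every position once per cycle, so each free cell receives exactly one symbol.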
+
+The result is a list of state values.
+Each state will decode the current symbol.
+
+To get the number of bits and baseline required for next state,
+it's first necessary to sort all states in their natural order.
+The lower states will need 1 more bit than higher ones.
+
+__Example__ :
+Presuming a symbol has a probability of 5.
+It receives 5 state values. States are sorted in natural order.
+
+Next power of 2 is 8.
+Space of probabilities is divided into 8 equal parts.
+Presuming the AccuracyLog is 7, it defines 128 states.
+Divided by 8, each share is 16 large.
+
+In order to reach 8, 8-5=3 lowest states will count "double",
+taking shares twice larger,
+requiring one more bit in the process.
+
+Numbering starts from higher states using less bits.
+
+| state order |   0   |   1   |    2   |   3  |   4   |
+| ----------- | ----- | ----- | ------ | ---- | ----- |
+| width       |  32   |  32   |   32   |  16  |  16   |
+| nb Bits     |   5   |   5   |    5   |   4  |   4   |
+| range nb    |   2   |   4   |    6   |   0  |   1   |
+| baseline    |  32   |  64   |   96   |   0  |  16   |
+| range       | 32-63 | 64-95 | 96-127 | 0-15 | 16-31 |
+
+Next state is determined from current state
+by reading the required number of bits, and adding the specified baseline.
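
The example table can be reproduced with a closed form commonly used by FSE decoders. The sketch below assumes that form and has only been checked against the worked example (probability 5, AccuracyLog 7); the function names are hypothetical.

```c
#include <assert.h>

/* Position (0-based) of the highest set bit; v must be non-zero. */
static int highest_bit(unsigned v)
{
    int n = 0;
    assert(v != 0);
    while (v >>= 1) n++;
    return n;
}

/* For the state of rank `order` (natural order, 0-based) among the `proba`
 * states of one symbol, computes the number of bits to read
 * and the baseline to add in order to get the next state. */
static void next_state_rule(int accuracyLog, unsigned proba, unsigned order,
                            int* nbBits, unsigned* baseline)
{
    unsigned const tableSize = 1u << accuracyLog;
    unsigned const n = proba + order;   /* ranges over proba .. 2*proba-1 */
    assert(proba >= 1 && order < proba);
    *nbBits   = accuracyLog - highest_bit(n);
    *baseline = (n << *nbBits) - tableSize;
}
```

With `accuracyLog = 7` and `proba = 5`, orders 0 to 4 give (5 bits, 32), (5 bits, 64), (5 bits, 96), (4 bits, 0) and (4 bits, 16), matching the table.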
+
+
+#### Bitstream
+
+All sequences are stored in a single bitstream, read _backward_.
+It is therefore necessary to know the bitstream size,
+which is deduced from compressed block size.
+
+The last useful bit of the stream is followed by an end-bit-flag.
+Highest bit of last byte is this flag.
+It does not belong to the useful part of the bitstream.
+Therefore, last byte has 0-7 useful bits.
+Note that it also means that last byte cannot be `0`.
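
A possible way to locate the end-bit-flag, reading "highest bit" as the highest bit set in the last byte. The function name and the `-1` error convention are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the number of useful bits in a bitstream of `size` bytes,
 * excluding the end-bit-flag, or -1 if the last byte is 0 (no flag found). */
static long useful_bit_count(const unsigned char* src, size_t size)
{
    unsigned last;
    int flagPos = 0;
    assert(src != NULL && size > 0);
    last = src[size - 1];
    if (last == 0) return -1;          /* last byte cannot be 0 */
    while (last >>= 1) flagPos++;      /* position of the end-bit-flag, 0-7 */
    return (long)(size - 1) * 8 + flagPos;
}
```

A last byte of `0x01` therefore carries no useful bit at all, while `0x80` carries 7.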
+
+##### Starting states
+
+The bitstream starts with initial state values,
+each using the required number of bits in their respective _accuracy_,
+decoded previously from their normalized distribution.
+
+It starts by `Literal Length State`,
+followed by `Offset State`,
+and finally `Match Length State`.
+
+Reminder : always keep in mind that all values are read _backward_.
+
+##### Decoding a sequence
+
+A state gives a code.
+A code provides a baseline and number of bits to add.
+See [Symbol Decoding] section for details on each symbol.
+
+Decoding starts by reading the nb of bits required to decode offset.
+It then does the same for match length,
+and then for literal length.
+
+Offset / matchLength / litLength define a sequence.
+It starts by inserting the number of literals defined by `litLength`,
+then continues by copying `matchLength` bytes from `currentPos - offset`.
+
+The next operation is to update states.
+Using rules pre-calculated in the decoding tables,
+`Literal Length State` is updated,
+followed by `Match Length State`,
+and then `Offset State`.
+
+This operation will be repeated `NbSeqs` times.
+At the end, the bitstream shall be entirely consumed,
+otherwise bitstream is considered corrupted.
+
+[Symbol Decoding]:#symbols-decoding
+
+##### Repeat offsets
+
+As seen in [Offset Codes], the first 3 values define a repeated offset.
+They are sorted in recency order, with 1 meaning "most recent one".
+
+There is an exception though, when current sequence's literal length is `0`.
+In which case, repcodes are "pushed by one",
+so 1 becomes 2, 2 becomes 3,
+and 3 becomes "offset_1 - 1_byte".
+
+On first block, offset history is populated by the following values : 1, 4 and 8 (in order).
+
+Then each block receives its start value from previous compressed block.
+Note that non-compressed blocks are skipped,
+they do not contribute to offset history.
+
+[Offset Codes]: #offset-codes
+
+###### Offset updates rules
+
+A new offset takes the lead in offset history,
+up to its previous place if it was already present.
+
+It means that when repeat offset 1 (most recent) is used, history is unmodified.
+When repeat offset 2 is used, it's swapped with offset 1.
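
The repeat-offset rules can be gathered into one function. This is a sketch of one reading of the text above, with hypothetical names; `rep[0]` holds the most recent offset, and the history starts at 1, 4, 8 for the first block.

```c
#include <assert.h>

typedef struct { unsigned rep[3]; } OffsetHistory;   /* rep[0] = most recent */

static const OffsetHistory kInitialHistory = { { 1, 4, 8 } };

/* Translates an OFValue into an actual offset and updates the history. */
static unsigned resolve_offset(OffsetHistory* h, unsigned ofValue,
                               unsigned litLength)
{
    unsigned offset, idx;
    assert(h != 0 && ofValue >= 1);
    if (ofValue > 3) {                       /* regular offset : full shift */
        offset = ofValue - 3;
        h->rep[2] = h->rep[1];
        h->rep[1] = h->rep[0];
        h->rep[0] = offset;
        return offset;
    }
    /* repeat code, "pushed by one" when literal length is 0 */
    idx = ofValue - 1 + (litLength == 0 ? 1u : 0u);   /* 0..3 */
    if (idx == 0) return h->rep[0];          /* history unmodified */
    offset = (idx == 3) ? h->rep[0] - 1 : h->rep[idx];
    if (idx != 1) h->rep[2] = h->rep[1];     /* shift up to previous place */
    h->rep[1] = h->rep[0];
    h->rep[0] = offset;
    return offset;
}
```

Starting from 1, 4, 8, using repeat offset 2 with a non-zero literal length returns 4 and leaves the history as 4, 1, 8.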
+
+
+Dictionary format
+-----------------
+
+`zstd` is compatible with "pure content" dictionaries, free of any format restriction.
+But dictionaries created by `zstd --train` follow a format, described here.
+
+__Pre-requisites__ : a dictionary has a known length,
+                     defined either by a buffer limit, or a file size.
+
+| Header | DictID | Stats | Content |
+| ------ | ------ | ----- | ------- |
+
+__Header__ : 4 bytes ID, value 0xEC30A437, Little-Endian format
+
+__Dict_ID__ : 4 bytes, stored in Little-Endian format.
+              DictID can be any value, except 0 (which means no DictID).
+              It's used by decoders to check if they use the correct dictionary.
+              _Reserved ranges :_
+              If the frame is going to be distributed in a private environment,
+              any dictionary ID can be used.
+              However, for public distribution of compressed frames,
+              some ranges are reserved for future use :
+
+              - low range : 1 - 32767 : reserved
+              - high range : >= (2^31) : reserved
+
+__Stats__ : Entropy tables, following the same format as [compressed blocks].
+            They are stored in following order :
+            Huffman tables for literals, FSE table for offset,
+            FSE table for matchLength, and FSE table for litLength.
+            It's finally followed by 3 offset values, populating recent offsets,
+            stored in order, 4-bytes little-endian each, for a total of 12 bytes.
+
+__Content__ : Where the actual dictionary content is.
+              Content size depends on Dictionary size.
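
Reading the first two fields of a formatted dictionary might look like this. The helper names are hypothetical, and returning `0` for a buffer without the magic number is a convention chosen here to mean "treat as pure content".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DICT_MAGIC 0xEC30A437u

/* Reads a 4-byte little-endian value. */
static uint32_t read_le32(const unsigned char* p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns the DictID of a formatted dictionary,
 * or 0 when the buffer does not start with the Header magic number. */
static uint32_t dict_id_of(const unsigned char* dict, size_t size)
{
    assert(dict != NULL || size == 0);
    if (size < 8) return 0;
    if (read_le32(dict) != DICT_MAGIC) return 0;
    return read_le32(dict + 4);
}

/* Reserved ranges listed above : 1 - 32767 and >= 2^31. */
static int dict_id_is_reserved(uint32_t id)
{
    return (id >= 1 && id <= 32767) || id >= 0x80000000u;
}
```

On disk, the magic number appears as the byte sequence `37 A4 30 EC`.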
+
+[compressed blocks]: #the-format-of-compressed_block
+
+
+Version changes
+---------------
+- 0.2.0 : numerous format adjustments for zstd v0.8
+- 0.1.2 : limit huffman tree depth to 11 bits
+- 0.1.1 : reserved dictID ranges
+- 0.1.0 : initial release

-- 
Alioth's /usr/local/bin/git-commit-notice on /srv/git.debian.org/git/debian-med/libzstd.git


