[Git][java-team/compress-lzf][upstream] 2 commits: New upstream version 1.0.3
Emmanuel Bourg
gitlab at salsa.debian.org
Mon Jan 28 21:15:35 GMT 2019
Emmanuel Bourg pushed to branch upstream at Debian Java Maintainers / compress-lzf
Commits:
35c53663 by Emmanuel Bourg at 2019-01-28T18:38:40Z
New upstream version 1.0.3
- - - - -
8e9a254a by Emmanuel Bourg at 2019-01-28T21:06:37Z
New upstream version 1.0.4
- - - - -
26 changed files:
- README.md
- VERSION.txt
- pom.xml
- src/main/java/com/ning/compress/Uncompressor.java
- src/main/java/com/ning/compress/gzip/GZIPUncompressor.java
- src/main/java/com/ning/compress/gzip/OptimizedGZIPInputStream.java
- src/main/java/com/ning/compress/lzf/ChunkEncoder.java
- src/main/java/com/ning/compress/lzf/LZFCompressingInputStream.java
- src/main/java/com/ning/compress/lzf/LZFEncoder.java
- src/main/java/com/ning/compress/lzf/LZFInputStream.java
- src/main/java/com/ning/compress/lzf/LZFOutputStream.java
- src/main/java/com/ning/compress/lzf/LZFUncompressor.java
- src/main/java/com/ning/compress/lzf/impl/UnsafeChunkEncoder.java
- src/main/java/com/ning/compress/lzf/impl/UnsafeChunkEncoderBE.java
- src/main/java/com/ning/compress/lzf/impl/UnsafeChunkEncoderLE.java
- src/main/java/com/ning/compress/lzf/impl/UnsafeChunkEncoders.java
- src/main/java/com/ning/compress/lzf/impl/VanillaChunkEncoder.java
- src/main/java/com/ning/compress/lzf/parallel/PLZFOutputStream.java
- src/main/java/com/ning/compress/lzf/util/ChunkEncoderFactory.java
- src/main/java/com/ning/compress/lzf/util/LZFFileInputStream.java
- src/main/java/com/ning/compress/lzf/util/LZFFileOutputStream.java
- src/test/java/com/ning/compress/gzip/TestGzipStreams.java
- src/test/java/com/ning/compress/gzip/TestGzipUncompressor.java
- src/test/java/com/ning/compress/lzf/TestLZFEncoder.java → src/test/java/com/ning/compress/lzf/LZFEncoderTest.java
- src/test/java/com/ning/compress/lzf/TestLZFRoundTrip.java
- src/test/java/com/ning/compress/lzf/TestLZFUncompressor.java
Changes:
=====================================
README.md
=====================================
@@ -1,15 +1,22 @@
-# Ning-Compress
+# LZF Compressor
## Overview
-Ning-compress is a Java library for encoding and decoding data in LZF format, written by Tatu Saloranta (tatu.saloranta at iki.fi)
+LZF-compress is a Java library for encoding and decoding data in LZF format,
+written by Tatu Saloranta (tatu.saloranta at iki.fi)
-Data format and algorithm based on original [LZF library](http://freshmeat.net/projects/liblzf) by Marc A Lehmann. See [LZF Format](https://github.com/ning/compress/wiki/LZFFormat) for full description.
+Data format and algorithm based on original [LZF library](http://freshmeat.net/projects/liblzf) by Marc A Lehmann.
+See [LZF Format Specification](https://github.com/ning/compress/wiki/LZFFormat) for full description.
-Format differs slightly from some other adaptations, such as one used by [H2 database project](http://www.h2database.com) (by Thomas Mueller); although internal block compression structure is the same, block identifiers differ.
-This package uses the original LZF identifiers to be 100% compatible with existing command-line lzf tool(s).
+Format differs slightly from some other adaptations, such as the one used
+by [H2 database project](http://www.h2database.com) (by Thomas Mueller);
+although internal block compression structure is the same, block identifiers differ.
+This package uses the original LZF identifiers to be 100% compatible with existing command-line `lzf` tool(s).
-LZF alfgorithm itself is optimized for speed, with somewhat more modest compression: compared to Deflate (algorithm gzip uses) LZF can be 5-6 times as fast to compress, and twice as fast to decompress.
+LZF algorithm itself is optimized for speed, with somewhat more modest compression.
+Compared to the standard `Deflate` (the algorithm gzip uses) LZF can be 5-6 times as fast to compress,
+and twice as fast to decompress. Compression rate is lower since no Huffman encoding is used
+after Lempel-Ziv substring elimination.
## Usage
@@ -17,28 +24,52 @@ See [Wiki](https://github.com/ning/compress/wiki) for more details; here's a "TL
Both compression and decompression can be done either by streaming approach:
- InputStream in = new LZFInputStream(new FileInputStream("data.lzf"));
- OutputStream out = new LZFOutputStream(new FileOutputStream("results.lzf"));
- InputStream compIn = new LZFCompressingInputStream(new FileInputStream("stuff.txt"));
+```java
+InputStream in = new LZFInputStream(new FileInputStream("data.lzf"));
+OutputStream out = new LZFOutputStream(new FileOutputStream("results.lzf"));
+InputStream compIn = new LZFCompressingInputStream(new FileInputStream("stuff.txt"));
+```
or by block operation:
- byte[] compressed = LZFEncoder.encode(uncompressedData);
- byte[] uncompressed = LZFDecoder.decode(compressedData);
+```java
+byte[] compressed = LZFEncoder.encode(uncompressedData);
+byte[] uncompressed = LZFDecoder.decode(compressedData);
+```
and you can even use the LZF jar as a command-line tool (it has a manifest that points to `com.ning.compress.lzf.LZF` as the class having the main() method to call), like so:
- java -jar compress-lzf-1.0.0.jar
+ java -jar compress-lzf-1.0.3.jar
(which will display necessary usage arguments for `-c`(ompressing) or `-d`(ecompressing) files).
+### Parallel processing
+
+Since compression is more CPU-heavy than decompression, it can benefit from concurrent operation.
+This works well with LZF because of its block-oriented nature: although there is a need for
+sequential processing within a block (of up to 64kB), encoding of separate blocks can be done
+completely independently; there are no dependencies on earlier blocks.
+
+The main abstraction to use is `PLZFOutputStream`, which is a `FilterOutputStream` and implements
+`java.nio.channels.WritableByteChannel` as well. Its use is like that of any `OutputStream`:
+
+```java
+PLZFOutputStream output = new PLZFOutputStream(new FileOutputStream("stuff.lzf"));
+// then write contents:
+output.write(buffer);
+// ...
+output.close();
+
+```
+
## Interoperability
Besides Java support, LZF codecs / bindings exist for non-JVM languages as well:
* C: [liblzf](http://oldhome.schmorp.de/marc/liblzf.html) (the original LZF package!)
+* C#: [C# LZF](https://csharplzfcompression.codeplex.com/)
* Go: [Golly](https://github.com/tav/golly)
-* Javascript(!): [http://freecode.com/projects/lzf](freecode LZF) (or via [SourceForge](http://sourceforge.net/projects/lzf/))
+* Javascript(!): [freecode LZF](http://freecode.com/projects/lzf) (or via [SourceForge](http://sourceforge.net/projects/lzf/))
* Perl: [Compress::LZF](http://search.cpan.org/dist/Compress-LZF/LZF.pm)
* Python: [Python-LZF](https://github.com/teepark/python-lzf)
* Ruby: [glebtv/lzf](https://github.com/glebtv/lzf), [LZF/Ruby](https://rubyforge.org/projects/lzfruby/)
@@ -50,3 +81,20 @@ Check out [jvm-compress-benchmark](https://github.com/ning/jvm-compressor-benchm
## More
[Project Wiki](https://github.com/ning/compress/wiki).
+
+## Alternative High-Speed Lempel-Ziv Compressors
+
+LZF belongs to a family of compression codecs called "simple Lempel-Ziv" codecs.
+Since LZ compression is also the first part of `deflate` compression (which is used,
+along with simple framing, for `gzip`), it can be viewed as "first-part of gzip"
+(second part being Huffman-encoding of compressed content).
+
+There are many other codecs in this category, most notable (and competitive) being:
+
+* [Snappy](http://en.wikipedia.org/wiki/Snappy_%28software%29)
+* [LZ4](http://en.wikipedia.org/wiki/LZ4_%28compression_algorithm%29)
+
+all of which have very similar compression ratios (due to the same underlying algorithm,
+differences coming from slight encoding variations, and efficiency differences in
+back-reference matching), and similar performance profiles regarding the ratio of
+compression vs. uncompression speeds.
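
Since `PLZFOutputStream` implements `java.nio.channels.WritableByteChannel` (see the parallel-processing section above), channel-style writes work as well. A minimal sketch under that assumption; only the constructor shown in the README excerpt is used:

```java
import java.io.FileOutputStream;
import java.nio.ByteBuffer;
import com.ning.compress.lzf.parallel.PLZFOutputStream;

public class ParallelLzfDemo {
    public static void main(String[] args) throws Exception {
        // PLZFOutputStream compresses blocks on a thread pool; blocks are
        // independent, so they can be encoded concurrently.
        PLZFOutputStream output = new PLZFOutputStream(new FileOutputStream("stuff.lzf"));
        ByteBuffer buf = ByteBuffer.wrap("some data to compress".getBytes("UTF-8"));
        output.write(buf);   // WritableByteChannel-style write
        output.close();      // flushes pending blocks and shuts down workers
    }
}
```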
=====================================
VERSION.txt
=====================================
@@ -1,3 +1,22 @@
+1.0.4 (12-Mar-2017)
+
+#43: estimateMaxWorkspaceSize() is too small
+ (reported by Roman L, leventow at github)
+
+1.0.3 (15-Aug-2014)
+
+#37: Incorrect de-serialization on Big Endian systems, due to incorrect usage of #numberOfTrailingZeroes
+ (pointed out by Gireesh P, gireeshpunathil at github)
+
+1.0.2 (09-Aug-2014)
+
+#38: Overload of factory methods and constructors in Encoders and Streams
+ to allow specifying custom `BufferRecycler` instance
+ (contributed by `serverperformance at github`)
+#39: VanillaChunkEncoder.tryCompress() not using 'inPos' as it should, potentially
+ causing corruption in rare cases
+ (contributed by Ryan E, rjerns at github)
+
1.0.1 (08-Apr-2014)
#35: Fix a problem with closing of `DeflaterOutputStream` (for gzip output)
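
To illustrate entry #38 above, a minimal sketch of passing a caller-managed `BufferRecycler` to a stream; the `LZFOutputStream(OutputStream, BufferRecycler)` constructor appears later in this diff:

```java
import java.io.ByteArrayOutputStream;
import com.ning.compress.BufferRecycler;
import com.ning.compress.lzf.LZFOutputStream;

public class RecyclerDemo {
    public static void main(String[] args) throws Exception {
        // Default behavior: a ThreadLocal, soft-referenced recycler.
        BufferRecycler shared = BufferRecycler.instance();

        // Per #38, streams and encoders now accept the recycler explicitly,
        // so callers that manage pooling themselves can pass their own instance.
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        LZFOutputStream out = new LZFOutputStream(sink, shared);
        out.write(new byte[] { 1, 2, 3 });
        out.close();
    }
}
```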
=====================================
pom.xml
=====================================
@@ -1,29 +1,30 @@
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
+ <!-- 13-Mar-2017, tatu: use FasterXML oss-parent over sonatype's, more
+ likely to get settings that work for releases
+ -->
<parent>
- <groupId>org.sonatype.oss</groupId>
- <artifactId>oss-parent</artifactId>
- <version>7</version>
+ <groupId>com.fasterxml</groupId>
+ <artifactId>oss-parent</artifactId>
+ <version>24</version>
</parent>
<groupId>com.ning</groupId>
<artifactId>compress-lzf</artifactId>
<name>Compress-LZF</name>
- <version>1.0.1</version>
+ <version>1.0.4</version>
<packaging>bundle</packaging>
<description>
Compression codec for LZF encoding for particularly encoding/decoding, with reasonable compression.
Compressor is basic Lempel-Ziv codec, without Huffman (deflate/gzip) or statistical post-encoding.
See "http://oldhome.schmorp.de/marc/liblzf.html" for more on original LZF package.
- </description>
- <prerequisites>
- <maven>2.2.1</maven>
- </prerequisites>
+ </description>
<url>http://github.com/ning/compress</url>
<scm>
<connection>scm:git:git at github.com:ning/compress.git</connection>
<developerConnection>scm:git:git at github.com:ning/compress.git</developerConnection>
<url>http://github.com/ning/compress</url>
+ <tag>compress-lzf-1.0.4</tag>
</scm>
<issueManagement>
<url>http://github.com/ning/compress/issues</url>
@@ -58,11 +59,11 @@ See "http://oldhome.schmorp.de/marc/liblzf.html" for more on original LZF packag
</licenses>
<dependencies>
<dependency>
- <groupId>org.testng</groupId>
- <artifactId>testng</artifactId>
- <version>6.5.2</version>
- <type>jar</type>
- <scope>test</scope>
+ <groupId>org.testng</groupId>
+ <artifactId>testng</artifactId>
+ <version>6.8.21</version>
+ <type>jar</type>
+ <scope>test</scope>
</dependency>
</dependencies>
<build>
@@ -71,7 +72,7 @@ See "http://oldhome.schmorp.de/marc/liblzf.html" for more on original LZF packag
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
- <version>2.3.2</version>
+ <version>3.1</version>
<!-- 1.6 since 0.9.7 -->
<configuration>
<source>1.6</source>
@@ -82,7 +83,7 @@ See "http://oldhome.schmorp.de/marc/liblzf.html" for more on original LZF packag
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-source-plugin</artifactId>
- <version>2.1.2</version>
+ <version>2.1.2</version>
<executions>
<execution>
<id>attach-sources</id>
@@ -95,13 +96,13 @@ See "http://oldhome.schmorp.de/marc/liblzf.html" for more on original LZF packag
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-javadoc-plugin</artifactId>
- <version>2.6.1</version>
+ <version>${version.plugin.javadoc}</version>
<configuration>
<source>1.6</source>
<target>1.6</target>
<encoding>UTF-8</encoding>
<links>
- <link>http://docs.oracle.com/javase/6/docs/api/</link>
+ <link>http://docs.oracle.com/javase/7/docs/api/</link>
</links>
</configuration>
<executions>
@@ -117,16 +118,15 @@ See "http://oldhome.schmorp.de/marc/liblzf.html" for more on original LZF packag
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-release-plugin</artifactId>
- <version>2.1</version>
<configuration>
<mavenExecutorId>forked-path</mavenExecutorId>
</configuration>
</plugin>
<!-- Plus, let's make jars OSGi bundles as well -->
- <plugin>
+ <plugin>
<groupId>org.apache.felix</groupId>
<artifactId>maven-bundle-plugin</artifactId>
- <version>2.3.7</version>
+ <version>2.5.3</version>
<extensions>true</extensions>
<configuration>
<instructions><!-- note: artifact id, name, version and description use defaults (which are fine) -->
@@ -148,7 +148,62 @@ com.ning.compress.lzf.util
<Main-Class>com.ning.compress.lzf.LZF</Main-Class>
</instructions>
</configuration>
- </plugin>
+ </plugin>
+ <!-- EVEN BETTER; make executable! -->
+
+<!-- 08-Sep-2014, tatu: except, doesn't quite work yet. Sigh.
+ <plugin>
+ <groupId>org.apache.maven.plugins</groupId>
+ <artifactId>maven-shade-plugin</artifactId>
+ <version>2.2</version>
+ <executions>
+ <execution>
+ <phase>package</phase>
+ <goals>
+ <goal>shade</goal>
+ </goals>
+ <configuration>
+ <transformers>
+ <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer" />
+ <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
+ <mainClass>com.ning.compress.lzf.LZF</mainClass>
+ </transformer>
+ </transformers>
+ <createDependencyReducedPom>false</createDependencyReducedPom>
+ <filters>
+ <filter>
+ <artifact>*:*</artifact>
+ <excludes>
+ <exclude>META-INF/*.SF</exclude>
+ <exclude>META-INF/*.DSA</exclude>
+ <exclude>META-INF/*.RSA</exclude>
+ </excludes>
+ </filter>
+ </filters>
+ </configuration>
+ </execution>
+ </executions>
+ </plugin>
+
+ <plugin>
+ <groupId>org.skife.maven</groupId>
+ <artifactId>really-executable-jar-maven-plugin</artifactId>
+ <version>1.2.0</version>
+ <configuration>
+ <programFile>lzf</programFile>
+ <flags>-Xmx200m</flags>
+ </configuration>
+
+ <executions>
+ <execution>
+ <phase>package</phase>
+ <goals>
+ <goal>really-executable-jar</goal>
+ </goals>
+ </execution>
+ </executions>
+ </plugin>
+-->
</plugins>
</build>
@@ -179,33 +234,5 @@ com.ning.compress.lzf.util
</plugins>
</build>
</profile>
- <profile>
- <id>offline-testing</id>
- <build>
- <plugins>
- <plugin>
- <groupId>org.apache.maven.plugins</groupId>
- <artifactId>maven-surefire-plugin</artifactId>
- <configuration>
- <groups>standalone</groups>
- </configuration>
- </plugin>
- </plugins>
- </build>
- </profile>
- <profile>
- <id>online-testing</id>
- <build>
- <plugins>
- <plugin>
- <groupId>org.apache.maven.plugins</groupId>
- <artifactId>maven-surefire-plugin</artifactId>
- <configuration>
- <groups>standalone, online</groups>
- </configuration>
- </plugin>
- </plugins>
- </build>
- </profile>
</profiles>
</project>
=====================================
src/main/java/com/ning/compress/Uncompressor.java
=====================================
@@ -4,7 +4,7 @@ import java.io.IOException;
/**
* Abstract class that defines "push" style API for various uncompressors
- * (aka decompressors or decoders). Implements are alternatives to stream
+ * (aka decompressors or decoders). Implementations are alternatives to stream
* based uncompressors (such as {@link com.ning.compress.lzf.LZFInputStream})
* in cases where "push" operation is important and/or blocking is not allowed;
* for example, when handling asynchronous HTTP responses.
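
A minimal sketch of the push-style usage described above, paired with `LZFUncompressor` (whose recycler-aware constructors appear later in this diff). The `DataHandler` callback signatures used here are assumptions based on the library's tests, not shown verbatim in this diff:

```java
import java.io.IOException;
import com.ning.compress.DataHandler;
import com.ning.compress.lzf.LZFEncoder;
import com.ning.compress.lzf.LZFUncompressor;

public class PushStyleDemo {
    public static void main(String[] args) throws IOException {
        byte[] compressed = LZFEncoder.encode("hello hello hello".getBytes("UTF-8"));

        // Receives uncompressed bytes as they become available.
        // NOTE: handleData/allDataHandled signatures are assumed, not taken from this diff.
        DataHandler handler = new DataHandler() {
            @Override
            public boolean handleData(byte[] buffer, int offset, int len) {
                System.out.print(new String(buffer, offset, len));
                return true; // true = keep feeding more data
            }
            @Override
            public void allDataHandled() {
                System.out.println();
            }
        };

        LZFUncompressor uncomp = new LZFUncompressor(handler);
        // "Push" style: caller feeds compressed bytes; nothing blocks on reads.
        uncomp.feedCompressedData(compressed, 0, compressed.length);
        uncomp.complete();
    }
}
```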
=====================================
src/main/java/com/ning/compress/gzip/GZIPUncompressor.java
=====================================
@@ -171,17 +171,22 @@ public class GZIPUncompressor extends Uncompressor
public GZIPUncompressor(DataHandler h)
{
- this(h, DEFAULT_CHUNK_SIZE);
+ this(h, DEFAULT_CHUNK_SIZE, BufferRecycler.instance(), GZIPRecycler.instance());
}
public GZIPUncompressor(DataHandler h, int inputChunkLength)
+ {
+ this(h, inputChunkLength, BufferRecycler.instance(), GZIPRecycler.instance());
+ }
+
+ public GZIPUncompressor(DataHandler h, int inputChunkLength, BufferRecycler bufferRecycler, GZIPRecycler gzipRecycler)
{
_inputChunkLength = inputChunkLength;
_handler = h;
- _recycler = BufferRecycler.instance();
- _decodeBuffer = _recycler.allocDecodeBuffer(DECODE_BUFFER_SIZE);
- _gzipRecycler = GZIPRecycler.instance();
- _inflater = _gzipRecycler.allocInflater();
+ _recycler = bufferRecycler;
+ _decodeBuffer = bufferRecycler.allocDecodeBuffer(DECODE_BUFFER_SIZE);
+ _gzipRecycler = gzipRecycler;
+ _inflater = gzipRecycler.allocInflater();
_crc = new CRC32();
}
=====================================
src/main/java/com/ning/compress/gzip/OptimizedGZIPInputStream.java
=====================================
@@ -77,15 +77,20 @@ public class OptimizedGZIPInputStream
*/
public OptimizedGZIPInputStream(InputStream in) throws IOException
+ {
+ this(in, BufferRecycler.instance(), GZIPRecycler.instance());
+ }
+
+ public OptimizedGZIPInputStream(InputStream in, BufferRecycler bufferRecycler, GZIPRecycler gzipRecycler) throws IOException
{
super();
- _bufferRecycler = BufferRecycler.instance();
- _gzipRecycler = GZIPRecycler.instance();
+ _bufferRecycler = bufferRecycler;
+ _gzipRecycler = gzipRecycler;
_rawInput = in;
- _buffer = _bufferRecycler.allocInputBuffer(INPUT_BUFFER_SIZE);
+ _buffer = bufferRecycler.allocInputBuffer(INPUT_BUFFER_SIZE);
_bufferPtr = _bufferEnd = 0;
- _inflater = _gzipRecycler.allocInflater();
+ _inflater = gzipRecycler.allocInflater();
_crc = new CRC32();
// And then need to process header...
=====================================
src/main/java/com/ning/compress/lzf/ChunkEncoder.java
=====================================
@@ -75,34 +75,56 @@ public abstract class ChunkEncoder
protected byte[] _headerBuffer;
/**
+ * Uses a ThreadLocal soft-referenced BufferRecycler instance.
+ *
* @param totalLength Total encoded length; used for calculating size
* of hash table to use
*/
protected ChunkEncoder(int totalLength)
+ {
+ this(totalLength, BufferRecycler.instance());
+ }
+
+ /**
+ * @param totalLength Total encoded length; used for calculating size
+ * of hash table to use
+ * @param bufferRecycler Buffer recycler instance, for usages where the
+ * caller manages the recycler instances
+ */
+ protected ChunkEncoder(int totalLength, BufferRecycler bufferRecycler)
{
// Need room for at most a single full chunk
int largestChunkLen = Math.min(totalLength, LZFChunk.MAX_CHUNK_LEN);
int suggestedHashLen = calcHashLen(largestChunkLen);
- _recycler = BufferRecycler.instance();
- _hashTable = _recycler.allocEncodingHash(suggestedHashLen);
+ _recycler = bufferRecycler;
+ _hashTable = bufferRecycler.allocEncodingHash(suggestedHashLen);
_hashModulo = _hashTable.length - 1;
// Ok, then, what's the worst case output buffer length?
// length indicator for each 32 literals, so:
// 21-Feb-2013, tatu: Plus we want to prepend chunk header in place:
int bufferLen = largestChunkLen + ((largestChunkLen + 31) >> 5) + LZFChunk.MAX_HEADER_LEN;
- _encodeBuffer = _recycler.allocEncodingBuffer(bufferLen);
+ _encodeBuffer = bufferRecycler.allocEncodingBuffer(bufferLen);
}
-
+
/**
* Alternate constructor used when we want to avoid allocation encoding
* buffer, in cases where caller wants full control over allocations.
*/
protected ChunkEncoder(int totalLength, boolean bogus)
+ {
+ this(totalLength, BufferRecycler.instance(), bogus);
+ }
+
+ /**
+ * Alternate constructor used when we want to avoid allocation encoding
+ * buffer, in cases where caller wants full control over allocations.
+ */
+ protected ChunkEncoder(int totalLength, BufferRecycler bufferRecycler, boolean bogus)
{
int largestChunkLen = Math.max(totalLength, LZFChunk.MAX_CHUNK_LEN);
int suggestedHashLen = calcHashLen(largestChunkLen);
- _recycler = BufferRecycler.instance();
- _hashTable = _recycler.allocEncodingHash(suggestedHashLen);
+ _recycler = bufferRecycler;
+ _hashTable = bufferRecycler.allocEncodingHash(suggestedHashLen);
_hashModulo = _hashTable.length - 1;
_encodeBuffer = null;
}
@@ -297,6 +319,10 @@ public abstract class ChunkEncoder
return false;
}
+ public BufferRecycler getBufferRecycler() {
+ return _recycler;
+ }
+
/*
///////////////////////////////////////////////////////////////////////
// Abstract methods for sub-classes
=====================================
src/main/java/com/ning/compress/lzf/LZFCompressingInputStream.java
=====================================
@@ -76,16 +76,24 @@ public class LZFCompressingInputStream extends InputStream
public LZFCompressingInputStream(InputStream in)
{
- this(null, in);
+ this(null, in, BufferRecycler.instance());
}
public LZFCompressingInputStream(final ChunkEncoder encoder, InputStream in)
+ {
+ this(encoder, in, null);
+ }
+
+ public LZFCompressingInputStream(final ChunkEncoder encoder, InputStream in, BufferRecycler bufferRecycler)
{
// may be passed by caller, or could be null
_encoder = encoder;
_inputStream = in;
- _recycler = BufferRecycler.instance();
- _inputBuffer = _recycler.allocInputBuffer(LZFChunk.MAX_CHUNK_LEN);
+ if (bufferRecycler==null) {
+ bufferRecycler = (encoder!=null) ? _encoder._recycler : BufferRecycler.instance();
+ }
+ _recycler = bufferRecycler;
+ _inputBuffer = bufferRecycler.allocInputBuffer(LZFChunk.MAX_CHUNK_LEN);
// let's not yet allocate encoding buffer; don't know optimal size
}
@@ -259,7 +267,7 @@ public class LZFCompressingInputStream extends InputStream
if (_encoder == null) {
// need 7 byte header, plus regular max buffer size:
int bufferLen = chunkLength + ((chunkLength + 31) >> 5) + 7;
- _encoder = ChunkEncoderFactory.optimalNonAllocatingInstance(bufferLen);
+ _encoder = ChunkEncoderFactory.optimalNonAllocatingInstance(bufferLen, _recycler);
}
if (_encodedBytes == null) {
int bufferLen = chunkLength + ((chunkLength + 31) >> 5) + 7;
=====================================
src/main/java/com/ning/compress/lzf/LZFEncoder.java
=====================================
@@ -11,6 +11,7 @@
package com.ning.compress.lzf;
+import com.ning.compress.BufferRecycler;
import com.ning.compress.lzf.util.ChunkEncoderFactory;
/**
@@ -22,15 +23,23 @@ import com.ning.compress.lzf.util.ChunkEncoderFactory;
*/
public class LZFEncoder
{
- /* Approximate maximum size for a full chunk, in case where it does not compress
- * at all. Such chunks are converted to uncompressed chunks, but during compression
- * process this amount of space is still needed.
+ /* Approximate maximum size for a full chunk DURING PROCESSING, in case where it does
+ * not compress at all. Such chunks are converted to uncompressed chunks,
+ * but during compression process this amount of space is still needed.
+ *<p>
+ * NOTE: eventual maximum size is different, see below
*/
- public final static int MAX_CHUNK_RESULT_SIZE = LZFChunk.MAX_HEADER_LEN + LZFChunk.MAX_CHUNK_LEN + (LZFChunk.MAX_CHUNK_LEN * 32 / 31);
+ public final static int MAX_CHUNK_RESULT_SIZE = LZFChunk.MAX_HEADER_LEN + LZFChunk.MAX_CHUNK_LEN + ((LZFChunk.MAX_CHUNK_LEN + 30) / 31);
+
+ // since 1.0.4 (better name than MAX_CHUNK_RESULT_SIZE, same value)
+ private final static int MAX_CHUNK_WORKSPACE_SIZE = LZFChunk.MAX_HEADER_LEN + LZFChunk.MAX_CHUNK_LEN + ((LZFChunk.MAX_CHUNK_LEN + 30) / 31);
+
+ // since 1.0.4
+ private final static int FULL_UNCOMP_ENCODED_CHUNK = LZFChunk.MAX_HEADER_LEN + LZFChunk.MAX_CHUNK_LEN;
// Static methods only, no point in instantiating
private LZFEncoder() { }
-
+
/*
///////////////////////////////////////////////////////////////////////
// Helper methods
@@ -49,20 +58,27 @@ public class LZFEncoder
*/
public static int estimateMaxWorkspaceSize(int inputSize)
{
- // single chunk; give a rough estimate with +5% (1 + 1/32 + 1/64)
+ // single chunk; give a rough estimate with +4.6% (1 + 1/32 + 1/64)
+ // 12-Mar-2017, tatu: as per [compress-lzf#43], rounding down would mess this
+ // up for small sizes; but effect should go away after sizes of 64 and more,
+ // before which we may need up to 2 markers
if (inputSize <= LZFChunk.MAX_CHUNK_LEN) {
- return LZFChunk.MAX_HEADER_LEN + inputSize + (inputSize >> 5) + (inputSize >> 6);
+ return LZFChunk.MAX_HEADER_LEN + 2 + inputSize + (inputSize >> 5) + (inputSize >> 6);
}
// one more special case, 2 chunks
inputSize -= LZFChunk.MAX_CHUNK_LEN;
if (inputSize <= LZFChunk.MAX_CHUNK_LEN) { // uncompressed chunk actually has 5 byte header but
- return MAX_CHUNK_RESULT_SIZE + inputSize + LZFChunk.MAX_HEADER_LEN;
+ return MAX_CHUNK_WORKSPACE_SIZE + (LZFChunk.MAX_HEADER_LEN + inputSize);
}
- // check number of chunks we should be creating (assuming use of full chunks)
- int chunkCount = 1 + ((inputSize + (LZFChunk.MAX_CHUNK_LEN-1)) / LZFChunk.MAX_CHUNK_LEN);
- return MAX_CHUNK_RESULT_SIZE + chunkCount * (LZFChunk.MAX_CHUNK_LEN + LZFChunk.MAX_HEADER_LEN);
+ // check number of full chunks we should be creating:
+ int chunkCount = inputSize / LZFChunk.MAX_CHUNK_LEN;
+ inputSize -= chunkCount * LZFChunk.MAX_CHUNK_LEN; // will now be remainders
+ // So: first chunk has type marker, rest not, but for simplicity assume as if they all
+ // could. But take into account that last chunk is smaller
+ return MAX_CHUNK_WORKSPACE_SIZE + (chunkCount * FULL_UNCOMP_ENCODED_CHUNK)
+ + (LZFChunk.MAX_HEADER_LEN + inputSize);
}
-
+
/*
///////////////////////////////////////////////////////////////////////
// Encoding methods, blocks
@@ -121,6 +137,36 @@ public class LZFEncoder
return result;
}
+ /**
+ * Method for compressing given input data using LZF encoding and
+ * block structure (compatible with lzf command line utility).
+ * Result consists of a sequence of chunks.
+ *<p>
+ * Note that {@link ChunkEncoder} instance used is one produced by
+ * {@link ChunkEncoderFactory#optimalInstance}, which typically
+ * is "unsafe" instance if one can be used on current JVM.
+ */
+ public static byte[] encode(byte[] data, int offset, int length, BufferRecycler bufferRecycler)
+ {
+ ChunkEncoder enc = ChunkEncoderFactory.optimalInstance(length, bufferRecycler);
+ byte[] result = encode(enc, data, offset, length);
+ enc.close(); // important for buffer reuse!
+ return result;
+ }
+
+ /**
+ * Method that will use "safe" {@link ChunkEncoder}, as produced by
+ * {@link ChunkEncoderFactory#safeInstance}, for encoding. Safe here
+ * means that it does not use any non-compliant features beyond core JDK.
+ */
+ public static byte[] safeEncode(byte[] data, int offset, int length, BufferRecycler bufferRecycler)
+ {
+ ChunkEncoder enc = ChunkEncoderFactory.safeInstance(length, bufferRecycler);
+ byte[] result = encode(enc, data, offset, length);
+ enc.close();
+ return result;
+ }
+
/**
* Compression method that uses specified {@link ChunkEncoder} for actual
* encoding.
@@ -207,6 +253,36 @@ public class LZFEncoder
/**
* Alternate version that accepts pre-allocated output buffer.
+ *<p>
+ * Note that {@link ChunkEncoder} instance used is one produced by
+ * {@link ChunkEncoderFactory#optimalNonAllocatingInstance}, which typically
+ * is "unsafe" instance if one can be used on current JVM.
+ */
+ public static int appendEncoded(byte[] input, int inputPtr, int inputLength,
+ byte[] outputBuffer, int outputPtr, BufferRecycler bufferRecycler) {
+ ChunkEncoder enc = ChunkEncoderFactory.optimalNonAllocatingInstance(inputLength, bufferRecycler);
+ int len = appendEncoded(enc, input, inputPtr, inputLength, outputBuffer, outputPtr);
+ enc.close();
+ return len;
+ }
+
+ /**
+ * Alternate version that accepts pre-allocated output buffer.
+ *<p>
+ * Method that will use "safe" {@link ChunkEncoder}, as produced by
+ * {@link ChunkEncoderFactory#safeInstance}, for encoding. Safe here
+ * means that it does not use any non-compliant features beyond core JDK.
+ */
+ public static int safeAppendEncoded(byte[] input, int inputPtr, int inputLength,
+ byte[] outputBuffer, int outputPtr, BufferRecycler bufferRecycler) {
+ ChunkEncoder enc = ChunkEncoderFactory.safeNonAllocatingInstance(inputLength, bufferRecycler);
+ int len = appendEncoded(enc, input, inputPtr, inputLength, outputBuffer, outputPtr);
+ enc.close();
+ return len;
+ }
+
+ /**
+ * Alternate version that accepts pre-allocated output buffer.
*/
public static int appendEncoded(ChunkEncoder enc, byte[] input, int inputPtr, int inputLength,
byte[] outputBuffer, int outputPtr)
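
A minimal sketch of how the corrected estimate above is meant to be used with the non-allocating `appendEncoded()` path, mirroring the new `testSmallSizeEstimate` test later in this diff:

```java
import com.ning.compress.lzf.LZFEncoder;

public class WorkspaceDemo {
    public static void main(String[] args) {
        // Incompressible-looking input similar to the regression test for #43.
        byte[] in = { 0, 0, 0, 0, 1, 0, 0, 0, 2, 0, 0, 0, 3, 0, 0, 0, 4, 0, 0, 0 };

        // Upper bound on the bytes appendEncoded() may need, including chunk
        // headers and literal-run length markers; the 1.0.4 fix adds 2 spare
        // bytes so small uncompressable inputs no longer overflow.
        int maxSize = LZFEncoder.estimateMaxWorkspaceSize(in.length);
        byte[] out = new byte[maxSize];

        int end = LZFEncoder.appendEncoded(in, 0, in.length, out, 0);
        System.out.println("encoded " + end + " of at most " + maxSize + " bytes");
    }
}
```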
=====================================
src/main/java/com/ning/compress/lzf/LZFInputStream.java
=====================================
@@ -83,7 +83,7 @@ public class LZFInputStream extends InputStream
public LZFInputStream(final ChunkDecoder decoder, final InputStream in)
throws IOException
{
- this(decoder, in, false);
+ this(decoder, in, BufferRecycler.instance(), false);
}
/**
@@ -94,21 +94,45 @@ public class LZFInputStream extends InputStream
*/
public LZFInputStream(final InputStream in, boolean fullReads) throws IOException
{
- this(ChunkDecoderFactory.optimalInstance(), in, fullReads);
+ this(ChunkDecoderFactory.optimalInstance(), in, BufferRecycler.instance(), fullReads);
}
public LZFInputStream(final ChunkDecoder decoder, final InputStream in, boolean fullReads)
throws IOException
+ {
+ this(decoder, in, BufferRecycler.instance(), fullReads);
+ }
+
+ public LZFInputStream(final InputStream inputStream, final BufferRecycler bufferRecycler) throws IOException
+ {
+ this(inputStream, bufferRecycler, false);
+ }
+
+ /**
+ * @param in Underlying input stream to use
+ * @param fullReads Whether {@link #read(byte[])} should try to read exactly
+ * as many bytes as requested (true); or just however many happen to be
+ * available (false)
+ * @param bufferRecycler Buffer recycler instance, for usages where the
+ * caller manages the recycler instances
+ */
+ public LZFInputStream(final InputStream in, final BufferRecycler bufferRecycler, boolean fullReads) throws IOException
+ {
+ this(ChunkDecoderFactory.optimalInstance(), in, bufferRecycler, fullReads);
+ }
+
+ public LZFInputStream(final ChunkDecoder decoder, final InputStream in, final BufferRecycler bufferRecycler, boolean fullReads)
+ throws IOException
{
super();
_decoder = decoder;
- _recycler = BufferRecycler.instance();
+ _recycler = bufferRecycler;
_inputStream = in;
_inputStreamClosed = false;
_cfgFullReads = fullReads;
- _inputBuffer = _recycler.allocInputBuffer(LZFChunk.MAX_CHUNK_LEN);
- _decodedBytes = _recycler.allocDecodeBuffer(LZFChunk.MAX_CHUNK_LEN);
+ _inputBuffer = bufferRecycler.allocInputBuffer(LZFChunk.MAX_CHUNK_LEN);
+ _decodedBytes = bufferRecycler.allocDecodeBuffer(LZFChunk.MAX_CHUNK_LEN);
}
/**
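
A minimal sketch of the `fullReads` flag documented above, using the new recycler-aware constructor:

```java
import java.io.ByteArrayInputStream;
import com.ning.compress.BufferRecycler;
import com.ning.compress.lzf.LZFEncoder;
import com.ning.compress.lzf.LZFInputStream;

public class FullReadsDemo {
    public static void main(String[] args) throws Exception {
        byte[] compressed = LZFEncoder.encode("abc abc abc".getBytes("UTF-8"));

        // fullReads=true: read(byte[]) tries to fill the whole array (or hit
        // EOF), instead of returning whatever happens to be decoded so far.
        LZFInputStream in = new LZFInputStream(
                new ByteArrayInputStream(compressed), BufferRecycler.instance(), true);
        byte[] buf = new byte[11];
        int n = in.read(buf);
        System.out.println(n + ": " + new String(buf, 0, n, "UTF-8"));
        in.close();
    }
}
```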
=====================================
src/main/java/com/ning/compress/lzf/LZFOutputStream.java
=====================================
@@ -28,7 +28,7 @@ import com.ning.compress.lzf.util.ChunkEncoderFactory;
*/
public class LZFOutputStream extends FilterOutputStream implements WritableByteChannel
{
- private static final int OUTPUT_BUFFER_SIZE = LZFChunk.MAX_CHUNK_LEN;
+ private static final int DEFAULT_OUTPUT_BUFFER_SIZE = LZFChunk.MAX_CHUNK_LEN;
private final ChunkEncoder _encoder;
private final BufferRecycler _recycler;
@@ -58,15 +58,34 @@ public class LZFOutputStream extends FilterOutputStream implements WritableByteC
public LZFOutputStream(final OutputStream outputStream)
{
- this(ChunkEncoderFactory.optimalInstance(OUTPUT_BUFFER_SIZE), outputStream);
+ this(ChunkEncoderFactory.optimalInstance(DEFAULT_OUTPUT_BUFFER_SIZE), outputStream);
}
public LZFOutputStream(final ChunkEncoder encoder, final OutputStream outputStream)
+ {
+ this(encoder, outputStream, DEFAULT_OUTPUT_BUFFER_SIZE, encoder._recycler);
+ }
+
+ public LZFOutputStream(final OutputStream outputStream, final BufferRecycler bufferRecycler)
+ {
+ this(ChunkEncoderFactory.optimalInstance(bufferRecycler), outputStream, bufferRecycler);
+ }
+
+ public LZFOutputStream(final ChunkEncoder encoder, final OutputStream outputStream, final BufferRecycler bufferRecycler)
+ {
+ this(encoder, outputStream, DEFAULT_OUTPUT_BUFFER_SIZE, bufferRecycler);
+ }
+
+ public LZFOutputStream(final ChunkEncoder encoder, final OutputStream outputStream,
+ final int bufferSize, BufferRecycler bufferRecycler)
{
super(outputStream);
_encoder = encoder;
- _recycler = BufferRecycler.instance();
- _outputBuffer = _recycler.allocOutputBuffer(OUTPUT_BUFFER_SIZE);
+ if (bufferRecycler==null) {
+ bufferRecycler = _encoder._recycler;
+ }
+ _recycler = bufferRecycler;
+ _outputBuffer = bufferRecycler.allocOutputBuffer(bufferSize);
_outputStreamClosed = false;
}
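
A minimal sketch of the new four-argument constructor above, which also exposes the output buffer size (default `LZFChunk.MAX_CHUNK_LEN`); the 16kB size here is an arbitrary assumption for memory-constrained callers:

```java
import java.io.ByteArrayOutputStream;
import com.ning.compress.BufferRecycler;
import com.ning.compress.lzf.ChunkEncoder;
import com.ning.compress.lzf.LZFOutputStream;
import com.ning.compress.lzf.util.ChunkEncoderFactory;

public class BufferSizeDemo {
    public static void main(String[] args) throws Exception {
        BufferRecycler recycler = BufferRecycler.instance();
        ChunkEncoder enc = ChunkEncoderFactory.optimalInstance(0x4000, recycler);

        // 16kB output buffer instead of the 64kB default; the stream simply
        // emits smaller chunks whenever the buffer fills.
        LZFOutputStream out =
                new LZFOutputStream(enc, new ByteArrayOutputStream(), 0x4000, recycler);
        out.write(new byte[] { 1, 2, 3 });
        out.close();
    }
}
```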
=====================================
src/main/java/com/ning/compress/lzf/LZFUncompressor.java
=====================================
@@ -109,14 +109,23 @@ public class LZFUncompressor extends Uncompressor
*/
public LZFUncompressor(DataHandler handler) {
- this(handler, ChunkDecoderFactory.optimalInstance());
+ this(handler, ChunkDecoderFactory.optimalInstance(), BufferRecycler.instance());
+ }
+
+ public LZFUncompressor(DataHandler handler, BufferRecycler bufferRecycler) {
+ this(handler, ChunkDecoderFactory.optimalInstance(), bufferRecycler);
}
public LZFUncompressor(DataHandler handler, ChunkDecoder dec)
+ {
+ this(handler, dec, BufferRecycler.instance());
+ }
+
+ public LZFUncompressor(DataHandler handler, ChunkDecoder dec, BufferRecycler bufferRecycler)
{
_handler = handler;
_decoder = dec;
- _recycler = BufferRecycler.instance();
+ _recycler = bufferRecycler;
}
/*
=====================================
src/main/java/com/ning/compress/lzf/impl/UnsafeChunkEncoder.java
=====================================
@@ -1,5 +1,6 @@
package com.ning.compress.lzf.impl;
+import com.ning.compress.BufferRecycler;
import java.lang.reflect.Field;
import sun.misc.Unsafe;
@@ -44,6 +45,14 @@ public abstract class UnsafeChunkEncoder
super(totalLength, bogus);
}
+ public UnsafeChunkEncoder(int totalLength, BufferRecycler bufferRecycler) {
+ super(totalLength, bufferRecycler);
+ }
+
+ public UnsafeChunkEncoder(int totalLength, BufferRecycler bufferRecycler, boolean bogus) {
+ super(totalLength, bufferRecycler, bogus);
+ }
+
/*
///////////////////////////////////////////////////////////////////////
// Shared helper methods
=====================================
src/main/java/com/ning/compress/lzf/impl/UnsafeChunkEncoderBE.java
=====================================
@@ -1,5 +1,6 @@
package com.ning.compress.lzf.impl;
+import com.ning.compress.BufferRecycler;
import com.ning.compress.lzf.LZFChunk;
/**
@@ -16,6 +17,14 @@ public final class UnsafeChunkEncoderBE
public UnsafeChunkEncoderBE(int totalLength, boolean bogus) {
super(totalLength, bogus);
}
+ public UnsafeChunkEncoderBE(int totalLength, BufferRecycler bufferRecycler) {
+ super(totalLength, bufferRecycler);
+ }
+
+ public UnsafeChunkEncoderBE(int totalLength, BufferRecycler bufferRecycler, boolean bogus) {
+ super(totalLength, bufferRecycler, bogus);
+ }
+
@Override
protected int tryCompress(byte[] in, int inPos, int inEnd, byte[] out, int outPos)
@@ -120,7 +129,7 @@ public final class UnsafeChunkEncoderBE
long l1 = unsafe.getLong(in, BYTE_ARRAY_OFFSET + ptr1);
long l2 = unsafe.getLong(in, BYTE_ARRAY_OFFSET + ptr2);
if (l1 != l2) {
- return ptr1 - base + (Long.numberOfLeadingZeros(l1 ^ l2) >> 3);
+ return ptr1 - base + _leadingBytes(l1, l2);
}
ptr1 += 8;
ptr2 += 8;
@@ -133,7 +142,15 @@ public final class UnsafeChunkEncoderBE
return ptr1 - base; // i.e.
}
+ /* With Big-Endian, in-memory layout is "natural", so what we consider
+ * leading is also leading for in-register.
+ */
+
private final static int _leadingBytes(int i1, int i2) {
- return (Long.numberOfLeadingZeros(i1 ^ i2) >> 3);
+ return Integer.numberOfLeadingZeros(i1 ^ i2) >> 3;
+ }
+
+ private final static int _leadingBytes(long l1, long l2) {
+ return Long.numberOfLeadingZeros(l1 ^ l2) >> 3;
}
}
=====================================
src/main/java/com/ning/compress/lzf/impl/UnsafeChunkEncoderLE.java
=====================================
@@ -1,5 +1,6 @@
package com.ning.compress.lzf.impl;
+import com.ning.compress.BufferRecycler;
import com.ning.compress.lzf.LZFChunk;
/**
@@ -17,7 +18,15 @@ public class UnsafeChunkEncoderLE
super(totalLength, bogus);
}
- @Override
+ public UnsafeChunkEncoderLE(int totalLength, BufferRecycler bufferRecycler) {
+ super(totalLength, bufferRecycler);
+ }
+
+ public UnsafeChunkEncoderLE(int totalLength, BufferRecycler bufferRecycler, boolean bogus) {
+ super(totalLength, bufferRecycler, bogus);
+ }
+
+ @Override
protected int tryCompress(byte[] in, int inPos, int inEnd, byte[] out, int outPos)
{
final int[] hashTable = _hashTable;
@@ -122,7 +131,7 @@ public class UnsafeChunkEncoderLE
long l1 = unsafe.getLong(in, BYTE_ARRAY_OFFSET + ptr1);
long l2 = unsafe.getLong(in, BYTE_ARRAY_OFFSET + ptr2);
if (l1 != l2) {
- return ptr1 - base + (Long.numberOfTrailingZeros(l1 ^ l2) >> 3);
+ return ptr1 - base + _leadingBytes(l1, l2);
}
ptr1 += 8;
ptr2 += 8;
@@ -135,7 +144,16 @@ public class UnsafeChunkEncoderLE
return ptr1 - base; // i.e.
}
+ /* With Little-Endian, in-memory layout is reverse of what we expect for
+ * in-register, so we either have to reverse bytes, or, simpler,
+ * calculate trailing zeroes instead.
+ */
+
private final static int _leadingBytes(int i1, int i2) {
- return (Long.numberOfTrailingZeros(i1 ^ i2) >> 3);
+ return Integer.numberOfTrailingZeros(i1 ^ i2) >> 3;
+ }
+
+ private final static int _leadingBytes(long l1, long l2) {
+ return Long.numberOfTrailingZeros(l1 ^ l2) >> 3;
}
}
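
To make the endianness comments in these two encoders concrete: both count matching leading bytes of two 8-byte words, via leading zeros of the XOR on big-endian layouts and trailing zeros on little-endian ones. A small self-contained check, independent of the `Unsafe`-based readers:

```java
public class EndianMatchDemo {
    public static void main(String[] args) {
        // Memory bytes: 11 22 33 44 ... vs 11 22 33 FF ... (first difference at byte 3).

        // Big-endian register: memory byte 0 is the MOST significant byte,
        // so the first differing byte shows up in the leading bits of the XOR.
        long be1 = 0x1122334455667788L;
        long be2 = 0x112233FF55667788L;
        System.out.println(Long.numberOfLeadingZeros(be1 ^ be2) >> 3); // prints 3

        // Little-endian register: memory byte 0 is the LEAST significant byte,
        // so the same memory difference lands in the trailing bits instead.
        long le1 = 0x8877665544332211L;
        long le2 = 0x88776655FF332211L;
        System.out.println(Long.numberOfTrailingZeros(le1 ^ le2) >> 3); // prints 3
    }
}
```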
=====================================
src/main/java/com/ning/compress/lzf/impl/UnsafeChunkEncoders.java
=====================================
@@ -11,6 +11,7 @@
package com.ning.compress.lzf.impl;
+import com.ning.compress.BufferRecycler;
import java.nio.ByteOrder;
@@ -39,4 +40,18 @@ public final class UnsafeChunkEncoders
}
return new UnsafeChunkEncoderBE(totalLength, false);
}
+
+ public static UnsafeChunkEncoder createEncoder(int totalLength, BufferRecycler bufferRecycler) {
+ if (LITTLE_ENDIAN) {
+ return new UnsafeChunkEncoderLE(totalLength, bufferRecycler);
+ }
+ return new UnsafeChunkEncoderBE(totalLength, bufferRecycler);
+ }
+
+ public static UnsafeChunkEncoder createNonAllocatingEncoder(int totalLength, BufferRecycler bufferRecycler) {
+ if (LITTLE_ENDIAN) {
+ return new UnsafeChunkEncoderLE(totalLength, bufferRecycler, false);
+ }
+ return new UnsafeChunkEncoderBE(totalLength, bufferRecycler, false);
+ }
}
=====================================
src/main/java/com/ning/compress/lzf/impl/VanillaChunkEncoder.java
=====================================
@@ -1,5 +1,6 @@
package com.ning.compress.lzf.impl;
+import com.ning.compress.BufferRecycler;
import com.ning.compress.lzf.ChunkEncoder;
import com.ning.compress.lzf.LZFChunk;
@@ -22,10 +23,31 @@ public class VanillaChunkEncoder
super(totalLength, bogus);
}
+ /**
+ * @param totalLength Total encoded length; used for calculating size
+ * of hash table to use
+ * @param bufferRecycler The BufferRecycler instance
+ */
+ public VanillaChunkEncoder(int totalLength, BufferRecycler bufferRecycler) {
+ super(totalLength, bufferRecycler);
+ }
+
+ /**
+ * Alternate constructor used when we want to avoid allocation encoding
+ * buffer, in cases where caller wants full control over allocations.
+ */
+ protected VanillaChunkEncoder(int totalLength, BufferRecycler bufferRecycler, boolean bogus) {
+ super(totalLength, bufferRecycler, bogus);
+ }
+
public static VanillaChunkEncoder nonAllocatingEncoder(int totalLength) {
return new VanillaChunkEncoder(totalLength, true);
}
+ public static VanillaChunkEncoder nonAllocatingEncoder(int totalLength, BufferRecycler bufferRecycler) {
+ return new VanillaChunkEncoder(totalLength, bufferRecycler, true);
+ }
+
/*
///////////////////////////////////////////////////////////////////////
// Abstract method implementations
@@ -44,7 +66,7 @@ public class VanillaChunkEncoder
{
final int[] hashTable = _hashTable;
++outPos; // To leave one byte for literal-length indicator
- int seen = first(in, 0); // past 4 bytes we have seen... (last one is LSB)
+ int seen = first(in, inPos); // past 4 bytes we have seen... (last one is LSB)
int literals = 0;
inEnd -= TAIL_LENGTH;
final int firstPos = inPos; // so that we won't have back references across block boundary
=====================================
src/main/java/com/ning/compress/lzf/parallel/PLZFOutputStream.java
=====================================
@@ -41,7 +41,7 @@ import com.ning.compress.lzf.LZFChunk;
*/
public class PLZFOutputStream extends FilterOutputStream implements WritableByteChannel
{
- private static final int OUTPUT_BUFFER_SIZE = LZFChunk.MAX_CHUNK_LEN;
+ private static final int DEFAULT_OUTPUT_BUFFER_SIZE = LZFChunk.MAX_CHUNK_LEN;
protected byte[] _outputBuffer;
protected int _position = 0;
@@ -65,16 +65,20 @@ public class PLZFOutputStream extends FilterOutputStream implements WritableByte
*/
public PLZFOutputStream(final OutputStream outputStream) {
- this(outputStream, getNThreads());
+ this(outputStream, DEFAULT_OUTPUT_BUFFER_SIZE, getNThreads());
}
protected PLZFOutputStream(final OutputStream outputStream, int nThreads) {
+ this(outputStream, DEFAULT_OUTPUT_BUFFER_SIZE, nThreads);
+ }
+
+ protected PLZFOutputStream(final OutputStream outputStream, final int bufferSize, int nThreads) {
super(outputStream);
_outputStreamClosed = false;
compressExecutor = new ThreadPoolExecutor(nThreads, nThreads, 60L, TimeUnit.SECONDS, new LinkedBlockingQueue<Runnable>()); // unbounded
((ThreadPoolExecutor)compressExecutor).allowCoreThreadTimeOut(true);
writeExecutor = Executors.newSingleThreadExecutor(); // unbounded
- blockManager = new BlockManager(nThreads * 2, OUTPUT_BUFFER_SIZE); // this is where the bounds will be enforced!
+ blockManager = new BlockManager(nThreads * 2, bufferSize); // this is where the bounds will be enforced!
_outputBuffer = blockManager.getBlockFromPool();
}
=====================================
src/main/java/com/ning/compress/lzf/util/ChunkEncoderFactory.java
=====================================
@@ -1,5 +1,6 @@
package com.ning.compress.lzf.util;
+import com.ning.compress.BufferRecycler;
import com.ning.compress.lzf.ChunkEncoder;
import com.ning.compress.lzf.LZFChunk;
import com.ning.compress.lzf.impl.UnsafeChunkEncoders;
@@ -35,6 +36,8 @@ public class ChunkEncoderFactory
* non-standard platforms it may be necessary to either directly load
* instances, or use {@link #safeInstance}.
*
+ * <p>Uses a ThreadLocal soft-referenced BufferRecycler instance.
+ *
* @param totalLength Expected total length of content to compress; only matters
* for content that is smaller than maximum chunk size (64k), to optimize
* encoding hash tables
@@ -50,6 +53,8 @@ public class ChunkEncoderFactory
/**
* Factory method for constructing encoder that is always passed buffer
* externally, so that it will not (nor need) allocate encoding buffer.
+ * <p>
+ * Uses a ThreadLocal soft-referenced BufferRecycler instance.
*/
public static ChunkEncoder optimalNonAllocatingInstance(int totalLength) {
try {
@@ -68,9 +73,12 @@ public class ChunkEncoderFactory
public static ChunkEncoder safeInstance() {
return safeInstance(LZFChunk.MAX_CHUNK_LEN);
}
+
/**
* Method that can be used to ensure that a "safe" compressor instance is loaded.
* Safe here means that it should work on any and all Java platforms.
+ * <p>
+ * Uses a ThreadLocal soft-referenced BufferRecycler instance.
*
* @param totalLength Expected total length of content to compress; only matters
* for content that is smaller than maximum chunk size (64k), to optimize
@@ -83,8 +91,81 @@ public class ChunkEncoderFactory
/**
* Factory method for constructing encoder that is always passed buffer
* externally, so that it will not (nor need) allocate encoding buffer.
+ *<p>Uses a ThreadLocal soft-referenced BufferRecycler instance.
*/
public static ChunkEncoder safeNonAllocatingInstance(int totalLength) {
return VanillaChunkEncoder.nonAllocatingEncoder(totalLength);
}
+
+ /**
+ * Convenience method, equivalent to:
+ *<code>
+ * return optimalInstance(LZFChunk.MAX_CHUNK_LEN, bufferRecycler);
+ *</code>
+ */
+ public static ChunkEncoder optimalInstance(BufferRecycler bufferRecycler) {
+ return optimalInstance(LZFChunk.MAX_CHUNK_LEN, bufferRecycler);
+ }
+
+ /**
+ * Method to use for getting compressor instance that uses the most optimal
+ * available methods for underlying data access. It should be safe to call
+ * this method as implementations are dynamically loaded; however, on some
+ * non-standard platforms it may be necessary to either directly load
+ * instances, or use {@link #safeInstance}.
+ *
+ * @param totalLength Expected total length of content to compress; only matters
+ * for content that is smaller than maximum chunk size (64k), to optimize
+ * encoding hash tables
+ * @param bufferRecycler The BufferRecycler instance
+ */
+ public static ChunkEncoder optimalInstance(int totalLength, BufferRecycler bufferRecycler) {
+ try {
+ return UnsafeChunkEncoders.createEncoder(totalLength, bufferRecycler);
+ } catch (Exception e) {
+ return safeInstance(totalLength, bufferRecycler);
+ }
+ }
+
+ /**
+ * Factory method for constructing encoder that is always passed buffer
+ * externally, so that it will not (nor need) allocate encoding buffer.
+ */
+ public static ChunkEncoder optimalNonAllocatingInstance(int totalLength, BufferRecycler bufferRecycler) {
+ try {
+ return UnsafeChunkEncoders.createNonAllocatingEncoder(totalLength, bufferRecycler);
+ } catch (Exception e) {
+ return safeNonAllocatingInstance(totalLength, bufferRecycler);
+ }
+ }
+
+ /**
+ * Convenience method, equivalent to:
+ *<code>
+ * return safeInstance(LZFChunk.MAX_CHUNK_LEN, bufferRecycler);
+ *</code>
+ */
+ public static ChunkEncoder safeInstance(BufferRecycler bufferRecycler) {
+ return safeInstance(LZFChunk.MAX_CHUNK_LEN, bufferRecycler);
+ }
+ /**
+ * Method that can be used to ensure that a "safe" compressor instance is loaded.
+ * Safe here means that it should work on any and all Java platforms.
+ *
+ * @param totalLength Expected total length of content to compress; only matters
+ * for content that is smaller than maximum chunk size (64k), to optimize
+ * encoding hash tables
+ * @param bufferRecycler The BufferRecycler instance
+ */
+ public static ChunkEncoder safeInstance(int totalLength, BufferRecycler bufferRecycler) {
+ return new VanillaChunkEncoder(totalLength, bufferRecycler);
+ }
+
+ /**
+ * Factory method for constructing encoder that is always passed buffer
+ * externally, so that it will not (nor need) allocate encoding buffer.
+ */
+ public static ChunkEncoder safeNonAllocatingInstance(int totalLength, BufferRecycler bufferRecycler) {
+ return VanillaChunkEncoder.nonAllocatingEncoder(totalLength, bufferRecycler);
+ }
}
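
Tying the new factory overloads together, a minimal sketch of sharing one recycler between the factory and a block encode; `optimalInstance(int, BufferRecycler)` and `LZFEncoder.encode(ChunkEncoder, ...)` are both shown in this diff:

```java
import com.ning.compress.BufferRecycler;
import com.ning.compress.lzf.ChunkEncoder;
import com.ning.compress.lzf.LZFEncoder;
import com.ning.compress.lzf.util.ChunkEncoderFactory;

public class FactoryDemo {
    public static void main(String[] args) throws Exception {
        byte[] data = "sample sample sample sample".getBytes("UTF-8");
        BufferRecycler recycler = BufferRecycler.instance();

        // Unsafe-backed encoder when available, Vanilla fallback otherwise;
        // buffers come from (and return to) the supplied recycler.
        ChunkEncoder enc = ChunkEncoderFactory.optimalInstance(data.length, recycler);
        byte[] compressed = LZFEncoder.encode(enc, data, 0, data.length);
        enc.close(); // important for buffer reuse, per encode() above
        System.out.println("compressed to " + compressed.length + " bytes");
    }
}
```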
=====================================
src/main/java/com/ning/compress/lzf/util/LZFFileInputStream.java
=====================================
@@ -77,47 +77,62 @@ public class LZFFileInputStream
*/
public LZFFileInputStream(File file) throws FileNotFoundException {
- this(file, ChunkDecoderFactory.optimalInstance());
+ this(file, ChunkDecoderFactory.optimalInstance(), BufferRecycler.instance());
}
public LZFFileInputStream(FileDescriptor fdObj) {
- this(fdObj, ChunkDecoderFactory.optimalInstance());
+ this(fdObj, ChunkDecoderFactory.optimalInstance(), BufferRecycler.instance());
}
public LZFFileInputStream(String name) throws FileNotFoundException {
- this(name, ChunkDecoderFactory.optimalInstance());
+ this(name, ChunkDecoderFactory.optimalInstance(), BufferRecycler.instance());
}
public LZFFileInputStream(File file, ChunkDecoder decompressor) throws FileNotFoundException
+ {
+ this(file, decompressor, BufferRecycler.instance());
+ }
+
+ public LZFFileInputStream(FileDescriptor fdObj, ChunkDecoder decompressor)
+ {
+ this(fdObj, decompressor, BufferRecycler.instance());
+ }
+
+ public LZFFileInputStream(String name, ChunkDecoder decompressor) throws FileNotFoundException
+ {
+ this(name, decompressor, BufferRecycler.instance());
+ }
+
+ public LZFFileInputStream(File file, ChunkDecoder decompressor, BufferRecycler bufferRecycler) throws FileNotFoundException
{
super(file);
_decompressor = decompressor;
- _recycler = BufferRecycler.instance();
+ _recycler = bufferRecycler;
_inputStreamClosed = false;
- _inputBuffer = _recycler.allocInputBuffer(LZFChunk.MAX_CHUNK_LEN);
- _decodedBytes = _recycler.allocDecodeBuffer(LZFChunk.MAX_CHUNK_LEN);
+ _inputBuffer = bufferRecycler.allocInputBuffer(LZFChunk.MAX_CHUNK_LEN);
+ _decodedBytes = bufferRecycler.allocDecodeBuffer(LZFChunk.MAX_CHUNK_LEN);
_wrapper = new Wrapper();
}
- public LZFFileInputStream(FileDescriptor fdObj, ChunkDecoder decompressor)
+ public LZFFileInputStream(FileDescriptor fdObj, ChunkDecoder decompressor, BufferRecycler bufferRecycler)
{
super(fdObj);
_decompressor = decompressor;
- _recycler = BufferRecycler.instance();
+ _recycler = bufferRecycler;
_inputStreamClosed = false;
- _inputBuffer = _recycler.allocInputBuffer(LZFChunk.MAX_CHUNK_LEN);
- _decodedBytes = _recycler.allocDecodeBuffer(LZFChunk.MAX_CHUNK_LEN);
+ _inputBuffer = bufferRecycler.allocInputBuffer(LZFChunk.MAX_CHUNK_LEN);
+ _decodedBytes = bufferRecycler.allocDecodeBuffer(LZFChunk.MAX_CHUNK_LEN);
_wrapper = new Wrapper();
}
- public LZFFileInputStream(String name, ChunkDecoder decompressor) throws FileNotFoundException
+ public LZFFileInputStream(String name, ChunkDecoder decompressor, BufferRecycler bufferRecycler) throws FileNotFoundException
{
super(name);
_decompressor = decompressor;
- _recycler = BufferRecycler.instance();
+ _recycler = bufferRecycler;
_inputStreamClosed = false;
- _inputBuffer = _recycler.allocInputBuffer(LZFChunk.MAX_CHUNK_LEN);
- _decodedBytes = _recycler.allocDecodeBuffer(LZFChunk.MAX_CHUNK_LEN);
+ _inputBuffer = bufferRecycler.allocInputBuffer(LZFChunk.MAX_CHUNK_LEN);
+ _decodedBytes = bufferRecycler.allocDecodeBuffer(LZFChunk.MAX_CHUNK_LEN);
_wrapper = new Wrapper();
}
=====================================
src/main/java/com/ning/compress/lzf/util/LZFFileOutputStream.java
=====================================
@@ -86,42 +86,65 @@ public class LZFFileOutputStream extends FileOutputStream implements WritableByt
}
public LZFFileOutputStream(ChunkEncoder encoder, File file) throws FileNotFoundException {
+ this(encoder, file, encoder.getBufferRecycler());
+ }
+
+ public LZFFileOutputStream(ChunkEncoder encoder, File file, boolean append) throws FileNotFoundException {
+ this(encoder, file, append, encoder.getBufferRecycler());
+ }
+
+ public LZFFileOutputStream(ChunkEncoder encoder, FileDescriptor fdObj) {
+ this(encoder, fdObj, encoder.getBufferRecycler());
+ }
+
+ public LZFFileOutputStream(ChunkEncoder encoder, String name) throws FileNotFoundException {
+ this(encoder, name, encoder.getBufferRecycler());
+ }
+
+ public LZFFileOutputStream(ChunkEncoder encoder, String name, boolean append) throws FileNotFoundException {
+ this(encoder, name, append, encoder.getBufferRecycler());
+ }
+
+ public LZFFileOutputStream(ChunkEncoder encoder, File file, BufferRecycler bufferRecycler) throws FileNotFoundException {
super(file);
_encoder = encoder;
- _recycler = BufferRecycler.instance();
- _outputBuffer = _recycler.allocOutputBuffer(OUTPUT_BUFFER_SIZE);
+ if (bufferRecycler==null) {
+ bufferRecycler = encoder.getBufferRecycler();
+ }
+ _recycler = bufferRecycler;
+ _outputBuffer = bufferRecycler.allocOutputBuffer(OUTPUT_BUFFER_SIZE);
_wrapper = new Wrapper();
}
- public LZFFileOutputStream(ChunkEncoder encoder, File file, boolean append) throws FileNotFoundException {
+ public LZFFileOutputStream(ChunkEncoder encoder, File file, boolean append, BufferRecycler bufferRecycler) throws FileNotFoundException {
super(file, append);
_encoder = encoder;
- _recycler = BufferRecycler.instance();
- _outputBuffer = _recycler.allocOutputBuffer(OUTPUT_BUFFER_SIZE);
+ _recycler = bufferRecycler;
+ _outputBuffer = bufferRecycler.allocOutputBuffer(OUTPUT_BUFFER_SIZE);
_wrapper = new Wrapper();
}
- public LZFFileOutputStream(ChunkEncoder encoder, FileDescriptor fdObj) {
+ public LZFFileOutputStream(ChunkEncoder encoder, FileDescriptor fdObj, BufferRecycler bufferRecycler) {
super(fdObj);
_encoder = encoder;
- _recycler = BufferRecycler.instance();
- _outputBuffer = _recycler.allocOutputBuffer(OUTPUT_BUFFER_SIZE);
+ _recycler = bufferRecycler;
+ _outputBuffer = bufferRecycler.allocOutputBuffer(OUTPUT_BUFFER_SIZE);
_wrapper = new Wrapper();
}
- public LZFFileOutputStream(ChunkEncoder encoder, String name) throws FileNotFoundException {
+ public LZFFileOutputStream(ChunkEncoder encoder, String name, BufferRecycler bufferRecycler) throws FileNotFoundException {
super(name);
_encoder = encoder;
- _recycler = BufferRecycler.instance();
- _outputBuffer = _recycler.allocOutputBuffer(OUTPUT_BUFFER_SIZE);
+ _recycler = bufferRecycler;
+ _outputBuffer = bufferRecycler.allocOutputBuffer(OUTPUT_BUFFER_SIZE);
_wrapper = new Wrapper();
}
- public LZFFileOutputStream(ChunkEncoder encoder, String name, boolean append) throws FileNotFoundException {
+ public LZFFileOutputStream(ChunkEncoder encoder, String name, boolean append, BufferRecycler bufferRecycler) throws FileNotFoundException {
super(name, append);
_encoder = encoder;
- _recycler = BufferRecycler.instance();
- _outputBuffer = _recycler.allocOutputBuffer(OUTPUT_BUFFER_SIZE);
+ _recycler = bufferRecycler;
+ _outputBuffer = bufferRecycler.allocOutputBuffer(OUTPUT_BUFFER_SIZE);
_wrapper = new Wrapper();
}
=====================================
src/test/java/com/ning/compress/gzip/TestGzipStreams.java
=====================================
@@ -3,7 +3,7 @@ package com.ning.compress.gzip;
import java.io.*;
import java.util.zip.*;
-import org.junit.Assert;
+import org.testng.Assert;
import org.testng.annotations.Test;
import com.ning.compress.BaseForTests;
@@ -19,7 +19,7 @@ public class TestGzipStreams extends BaseForTests
throw new RuntimeException(e);
}
}
-
+
@Test
public void testReusableInputStreams() throws IOException
{
@@ -33,7 +33,8 @@ public class TestGzipStreams extends BaseForTests
byte[] raw = bytes.toByteArray();
OptimizedGZIPInputStream re = new OptimizedGZIPInputStream(new ByteArrayInputStream(raw));
byte[] b = _readAll(re);
- Assert.assertArrayEquals(INPUT_BYTES, b);
+ Assert.assertEquals(INPUT_BYTES, b);
+ re.close();
}
@Test
@@ -47,7 +48,7 @@ public class TestGzipStreams extends BaseForTests
byte[] raw = bytes.toByteArray();
byte[] b = _readAll(new GZIPInputStream(new ByteArrayInputStream(raw)));
- Assert.assertArrayEquals(INPUT_BYTES, b);
+ Assert.assertEquals(INPUT_BYTES, b);
}
private byte[] _readAll(InputStream in) throws IOException
=====================================
src/test/java/com/ning/compress/gzip/TestGzipUncompressor.java
=====================================
@@ -3,7 +3,7 @@ package com.ning.compress.gzip;
import java.io.*;
import java.util.Random;
-import org.junit.Assert;
+import org.testng.Assert;
import org.testng.annotations.Test;
import com.ning.compress.BaseForTests;
@@ -26,7 +26,7 @@ public class TestGzipUncompressor extends BaseForTests
uncomp.complete();
byte[] result = co.getBytes();
- Assert.assertArrayEquals(fluff, result);
+ Assert.assertEquals(fluff, result);
}
@Test
@@ -41,7 +41,7 @@ public class TestGzipUncompressor extends BaseForTests
uncomp.feedCompressedData(comp, 0, comp.length);
uncomp.complete();
byte[] result = co.getBytes();
- Assert.assertArrayEquals(fluff, result);
+ Assert.assertEquals(fluff, result);
}
@Test
@@ -62,7 +62,7 @@ public class TestGzipUncompressor extends BaseForTests
uncomp.complete();
byte[] result = co.getBytes();
- Assert.assertArrayEquals(fluff, result);
+ Assert.assertEquals(fluff, result);
}
@Test
@@ -78,7 +78,7 @@ public class TestGzipUncompressor extends BaseForTests
uncomp.complete();
byte[] result = co.getBytes();
- Assert.assertArrayEquals(fluff, result);
+ Assert.assertEquals(fluff, result);
}
@Test
@@ -92,7 +92,7 @@ public class TestGzipUncompressor extends BaseForTests
out.close();
byte[] result = co.getBytes();
- Assert.assertArrayEquals(fluff, result);
+ Assert.assertEquals(fluff, result);
}
/*
=====================================
src/test/java/com/ning/compress/lzf/TestLZFEncoder.java → src/test/java/com/ning/compress/lzf/LZFEncoderTest.java
=====================================
@@ -9,18 +9,39 @@ import org.testng.annotations.Test;
import com.ning.compress.BaseForTests;
import com.ning.compress.lzf.util.ChunkEncoderFactory;
-public class TestLZFEncoder extends BaseForTests
+public class LZFEncoderTest extends BaseForTests
{
@Test
- public void testSizeEstimate()
+ public void testBigSizeEstimate()
{
- int max = LZFEncoder.estimateMaxWorkspaceSize(10000);
- // somewhere between 103 and 105%
- if (max < 10300 || max > 10500) {
- Assert.fail("Expected ratio to be 1010 <= x <= 1050, was: "+max);
+ for (int amt : new int[] {
+ 100, 250, 600,
+ 10000, 50000, 65000, 120000, 130000,
+ 3 * 0x10000 + 4,
+ 15 * 0x10000 + 4,
+ 1000 * 0x10000 + 4,
+ }) {
+ int estimate = LZFEncoder.estimateMaxWorkspaceSize(amt);
+ int chunks = ((amt + 0xFFFE) / 0xFFFF);
+ int expMin = 2 + amt + (chunks * 5); // 5-byte header for uncompressed; however, not enough workspace
+ int expMax = ((int) (0.05 * 0xFFFF)) + amt + (chunks * 7);
+ if (estimate < expMin || estimate > expMax) {
+ Assert.fail("Expected ratio for "+amt+" to be "+expMin+" <= x <= "+expMax+", was: "+estimate);
+ }
+//System.err.printf("%d < %d < %d\n", expMin, estimate, expMax);
}
}
+ // as per [compress-lzf#43]
+ @Test
+ public void testSmallSizeEstimate()
+ {
+ // and here we ensure that specific uncompressible case won't fail
+ byte[] in = new byte[] {0, 0, 0, 0, 1, 0, 0, 0, 2, 0, 0, 0, 3, 0, 0, 0, 4, 0, 0, 0};
+ int outSize = LZFEncoder.estimateMaxWorkspaceSize(in.length);
+ LZFEncoder.appendEncoded(in, 0, in.length, new byte[outSize], 0);
+ }
+
@Test
public void testCompressableChunksSingle() throws Exception
{
=====================================
src/test/java/com/ning/compress/lzf/TestLZFRoundTrip.java
=====================================
@@ -1,12 +1,16 @@
package com.ning.compress.lzf;
import java.io.*;
+import java.util.Arrays;
import org.testng.Assert;
import org.testng.annotations.Test;
+import com.ning.compress.BaseForTests;
+import com.ning.compress.lzf.LZFChunk;
import com.ning.compress.lzf.impl.UnsafeChunkDecoder;
import com.ning.compress.lzf.impl.VanillaChunkDecoder;
+import com.ning.compress.lzf.util.ChunkEncoderFactory;
public class TestLZFRoundTrip
{
@@ -86,6 +90,43 @@ public class TestLZFRoundTrip
in.close();
compressedIn.close();
}
+
+ @Test
+ public void testHashCollision() throws IOException
+ {
+ // this test generates a hash collision: [0,1,153,64] hashes the same as [1,153,64,64]
+ // and then leverages the bug s/inPos/0/ to corrupt the array
+ // the first array is used to insert a reference from this hash to offset 6
+ // and then the hash table is reused and still thinks that there is such a hash at position 6
+ // and at position 7, it finds a sequence with the same hash
+ // so it inserts a buggy reference
+ final byte[] b1 = new byte[] {0,1,2,3,4,(byte)153,64,64,64,9,9,9,9,9,9,9,9,9,9};
+ final byte[] b2 = new byte[] {1,(byte)153,0,0,0,0,(byte)153,64,64,64,0,0,0,0,0,0,0,0,0,0,0,0,0,0};
+ final int off = 6;
+
+ ChunkEncoder encoder = ChunkEncoderFactory.safeInstance();
+ ChunkDecoder decoder = new VanillaChunkDecoder();
+ _testCollision(encoder, decoder, b1, 0, b1.length);
+ _testCollision(encoder, decoder, b2, off, b2.length - off);
+
+ encoder = ChunkEncoderFactory.optimalInstance();
+ decoder = new UnsafeChunkDecoder();
+ _testCollision(encoder, decoder, b1, 0, b1.length);
+ _testCollision(encoder, decoder, b2, off, b2.length - off);
+ }
+
+ private void _testCollision(ChunkEncoder encoder, ChunkDecoder decoder, byte[] bytes, int offset, int length) throws IOException
+ {
+ ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
+ byte[] expected = new byte[length];
+ byte[] buffer = new byte[LZFChunk.MAX_CHUNK_LEN];
+ byte[] output = new byte[length];
+ System.arraycopy(bytes, offset, expected, 0, length);
+ encoder.encodeAndWriteChunk(bytes, offset, length, outputStream);
+ InputStream inputStream = new ByteArrayInputStream(outputStream.toByteArray());
+ Assert.assertEquals(decoder.decodeChunk(inputStream, buffer, output), length);
+ Assert.assertEquals(expected, output);
+ }
/*
///////////////////////////////////////////////////////////////////////
=====================================
src/test/java/com/ning/compress/lzf/TestLZFUncompressor.java
=====================================
@@ -3,7 +3,7 @@ package com.ning.compress.lzf;
import java.io.*;
import java.util.Random;
-import org.junit.Assert;
+import org.testng.Assert;
import org.testng.annotations.Test;
import com.ning.compress.BaseForTests;
@@ -26,7 +26,7 @@ public class TestLZFUncompressor extends BaseForTests
uncomp.complete();
byte[] result = co.getBytes();
- Assert.assertArrayEquals(fluff, result);
+ Assert.assertEquals(fluff, result);
}
@Test
@@ -41,7 +41,7 @@ public class TestLZFUncompressor extends BaseForTests
uncomp.feedCompressedData(comp, 0, comp.length);
uncomp.complete();
byte[] result = co.getBytes();
- Assert.assertArrayEquals(fluff, result);
+ Assert.assertEquals(fluff, result);
}
@Test
@@ -62,7 +62,7 @@ public class TestLZFUncompressor extends BaseForTests
uncomp.complete();
byte[] result = co.getBytes();
- Assert.assertArrayEquals(fluff, result);
+ Assert.assertEquals(fluff, result);
}
@Test
@@ -78,7 +78,7 @@ public class TestLZFUncompressor extends BaseForTests
uncomp.complete();
byte[] result = co.getBytes();
- Assert.assertArrayEquals(fluff, result);
+ Assert.assertEquals(fluff, result);
}
@@ -93,7 +93,7 @@ public class TestLZFUncompressor extends BaseForTests
out.close();
byte[] result = co.getBytes();
- Assert.assertArrayEquals(fluff, result);
+ Assert.assertEquals(fluff, result);
}
private final static class Collector implements DataHandler
View it on GitLab: https://salsa.debian.org/java-team/compress-lzf/compare/71cfbdcdcd4a268def6863df9afe0f8312133016...8e9a254a5368e69a4e43f637159398d7a25294b1