Squashed 'Sources/LZ4/' content from commit 641b453d9d

git-subtree-dir: Sources/LZ4
git-subtree-split: 641b453d9db536ee020851bfcb1dc39f61006f0a
Sergey Abramchuk
2020-02-24 14:40:17 +03:00
commit c0cd028912
120 changed files with 27654 additions and 0 deletions


148
doc/lz4_Block_format.md Normal file

@@ -0,0 +1,148 @@
LZ4 Block Format Description
============================
Last revised: 2018-04-25.
Author : Yann Collet
This specification is intended for developers
willing to produce LZ4-compatible compressed data blocks
using any programming language.
LZ4 is an LZ77-type compressor with a fixed, byte-oriented encoding.
There is no entropy encoder back-end nor framing layer.
The latter is assumed to be handled by other parts of the system (see [LZ4 Frame format]).
This design is assumed to favor simplicity and speed.
It helps later on for optimizations, compactness, and features.
This document describes only the block format,
not how the compressor or decompressor actually works.
The correctness of the decompressor should not depend
on implementation details of the compressor, and vice versa.
[LZ4 Frame format]: lz4_Frame_format.md
Compressed block format
-----------------------
An LZ4 compressed block is composed of sequences.
A sequence is a run of literals (uncompressed bytes),
followed by a match copy.
Each sequence starts with a `token`.
The `token` is a one-byte value, split into two 4-bit fields.
Therefore each field ranges from 0 to 15.
The first field uses the 4 high-bits of the token.
It provides the length of literals to follow.
If the field value is 0, then there is no literal.
If it is 15, then more bytes are needed to indicate the full length.
Each additional byte then represents a value from 0 to 255,
which is added to the previous value to produce a total length.
When the byte value is 255, another byte is output.
There can be any number of bytes following `token`. There is no "size limit".
(Side note : this is why an incompressible input block is expanded by 0.4%).
Example 1 : A literal length of 48 will be represented as :
- 15 : value for the 4-bits High field
- 33 : (=48-15) remaining length to reach 48
Example 2 : A literal length of 280 will be represented as :
- 15 : value for the 4-bits High field
- 255 : following byte is maxed, since 280-15 >= 255
- 10 : (=280 - 15 - 255) remaining length to reach 280
Example 3 : A literal length of 15 will be represented as :
- 15 : value for the 4-bits High field
- 0 : (=15-15) yes, the zero must be output
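The three examples above follow a single rule, which can be sketched in C as shown here. This is only an illustration: `write_extra_length` and `read_length` are hypothetical names, not functions of the LZ4 library, and the reader does no bounds checking on its input.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: writes the additional length bytes that follow a
 * token whose 4-bit field is 15. Returns the number of bytes written. */
static size_t write_extra_length(size_t length, uint8_t *out)
{
    size_t n = 0;
    size_t remaining;

    assert(length >= 15);            /* shorter lengths fit in the token field */
    remaining = length - 15;
    while (remaining >= 255) {       /* a 255 byte announces another byte */
        out[n++] = 255;
        remaining -= 255;
    }
    out[n++] = (uint8_t)remaining;   /* final byte, possibly 0 */
    return n;
}

/* Hypothetical helper: rebuilds a length from the 4-bit token field and the
 * additional bytes at `in`. Stores the number of bytes consumed in *consumed. */
static size_t read_length(unsigned field, const uint8_t *in, size_t *consumed)
{
    size_t length = field;
    size_t n = 0;

    if (field == 15) {
        uint8_t b;
        do {
            b = in[n++];
            length += b;
        } while (b == 255);
    }
    *consumed = n;
    return length;
}
```

With these helpers, a length of 280 produces the two bytes 255 and 10, and a length of 15 produces a single 0 byte, matching Examples 2 and 3.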
Following `token` and the optional length bytes are the literals themselves.
Their number is exactly the literal length decoded previously.
It's possible that there are zero literals.
Following the literals is the match copy operation.
It starts with the `offset`.
This is a 2-byte value, in little-endian format
(the 1st byte is the "low" byte, the 2nd one is the "high" byte).
The `offset` represents the position of the match to be copied from.
1 means "current position - 1 byte".
The maximum `offset` value is 65535; 65536 cannot be coded.
Note that 0 is an invalid value, not used.
Then we need to extract the `matchlength`.
For this, we use the second token field, the low 4 bits.
Its value ranges from 0 to 15.
Here, however, 0 means that the copy operation is of minimal length.
The minimum length of a match, called `minmatch`, is 4.
As a consequence, a 0 value means 4 bytes, and a value of 15 means 19+ bytes.
Similar to literal length, on reaching the highest possible value (15),
we output additional bytes, one at a time, with values ranging from 0 to 255.
They are added to the total to provide the final match length.
A 255 value means there is another byte to read and add.
There is no limit to the number of optional bytes that can be output this way.
(This points towards a maximum achievable compression ratio of about 250).
Decoding the `matchlength` reaches the end of the current sequence.
The next byte will be the start of another sequence.
But before moving to the next sequence,
it's time to use the decoded match position and length.
The decoder copies `matchlength` bytes from match position to current position.
In some cases, `matchlength` is larger than `offset`.
Therefore, `match_pos + matchlength > current_pos`,
which means that later bytes to copy are not yet decoded.
This is called an "overlap match", and must be handled with special care.
A common case is an offset of 1,
meaning the last byte is repeated `matchlength` times.
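As an illustration of the decoding steps described so far, here is a minimal, unoptimized block decoder sketched in C. `lz4_block_decode_sketch` is a hypothetical name, not the reference implementation. It copies matches one byte at a time, which is one simple way to handle overlap matches correctly, and it does not enforce the end-of-block rules from the Parsing restrictions section.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative decoder for one LZ4 block. Returns the number of bytes
 * written to dst, or -1 on malformed input or insufficient capacity. */
static long lz4_block_decode_sketch(const uint8_t *src, size_t srcSize,
                                    uint8_t *dst, size_t dstCapacity)
{
    size_t ip = 0;   /* read position in src */
    size_t op = 0;   /* write position in dst */

    assert(src != NULL && dst != NULL);
    while (ip < srcSize) {
        unsigned token = src[ip++];
        size_t litLen = token >> 4;      /* high 4 bits */
        size_t matchLen = token & 15u;   /* low 4 bits */
        size_t offset, i;
        uint8_t b;

        if (litLen == 15) {              /* additional literal length bytes */
            do {
                if (ip >= srcSize) return -1;
                b = src[ip++];
                litLen += b;
            } while (b == 255);
        }
        if (litLen > srcSize - ip || litLen > dstCapacity - op) return -1;
        for (i = 0; i < litLen; i++) dst[op++] = src[ip++];

        if (ip == srcSize) break;        /* last sequence stops after literals */

        if (srcSize - ip < 2) return -1;
        offset = (size_t)src[ip] | ((size_t)src[ip + 1] << 8);  /* little endian */
        ip += 2;
        if (offset == 0 || offset > op) return -1;  /* 0 is invalid; no data before dst */

        if (matchLen == 15) {            /* additional match length bytes */
            do {
                if (ip >= srcSize) return -1;
                b = src[ip++];
                matchLen += b;
            } while (b == 255);
        }
        matchLen += 4;                   /* minmatch */
        if (matchLen > dstCapacity - op) return -1;

        /* Byte-by-byte copy: with offset < matchLen, bytes written earlier in
         * this loop are read again, which repeats the pattern as required. */
        for (i = 0; i < matchLen; i++, op++) dst[op] = dst[op - offset];
    }
    return (long)op;
}
```

For instance, the 10-byte block `1B 61 01 00 50 62 62 62 62 62` decodes to sixteen `a` bytes followed by five `b` bytes: one literal `a`, an overlap match of length 15 at offset 1, then a final sequence holding five literals.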
Parsing restrictions
-----------------------
There are specific parsing rules to respect in order to remain compatible
with assumptions made by the decoder :
1. The last 5 bytes are always literals. In other words, the last five bytes
from the uncompressed input (or all bytes, if the input has fewer than five
bytes) must be encoded as literals on behalf of the last sequence.
The last sequence is incomplete, and stops right after the literals.
2. The last match must start at least 12 bytes before the end of the block.
The last match is part of the penultimate sequence,
since the last sequence stops right after literals.
Note that, as a consequence, blocks < 13 bytes cannot be compressed.
These rules are in place to ensure that the decoder
can speculatively execute copy instructions
without ever reading nor writing beyond provided I/O buffers.
1. To copy literals from a non-last sequence, an 8-byte copy instruction
can always be safely issued (without reading past the input),
because literals are followed by a 2-byte offset,
and the last sequence is at least 1+5 bytes long.
2. Similarly, a match operation can speculatively copy up to 12 bytes
while remaining within output buffer boundaries.
Empty inputs can be represented with a zero byte,
interpreted as a token without literals and without a match.
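Seen from the compressor side, parsing rules 1 and 2 reduce to a small check before a match is emitted. The sketch below is an assumption about how such a check could look, not code from the LZ4 library: `match_allowed` is a hypothetical name, and positions are byte offsets within the uncompressed block.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical check: may a match of matchLen bytes, starting at position
 * matchStart of the uncompressed block, be emitted by a compressor?
 * Returns 1 if allowed, 0 otherwise. */
static int match_allowed(size_t matchStart, size_t matchLen, size_t blockSize)
{
    assert(matchLen >= 4);                        /* minmatch */
    if (blockSize < 13) return 0;                 /* too small to hold any match */
    if (matchStart > blockSize - 12) return 0;    /* rule 2: start >= 12 bytes before end */
    if (matchLen > blockSize - 5 - matchStart) return 0;  /* rule 1: last 5 bytes are literals */
    return 1;
}
```

In a 21-byte block, a match starting at position 1 may therefore be at most 15 bytes long, and no match may start at position 10 or later.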
Additional notes
-----------------------
There are no assumptions or limits on the way the compressor
searches and selects matches within the source data block.
It could be a fast scan, a multi-probe, a full search using BST,
standard hash chains, MMC, or anything else.
Advanced parsing strategies can also be implemented, such as lazy match,
or full optimal parsing.
All these trade-offs offer distinctive speed/memory/compression advantages.
Whatever the method used by the compressor, its result will be decodable
by any LZ4 decoder if it follows the format specification described above.

419
doc/lz4_Frame_format.md Normal file

@@ -0,0 +1,419 @@
LZ4 Frame Format Description
============================
### Notices
Copyright (c) 2013-2015 Yann Collet
Permission is granted to copy and distribute this document
for any purpose and without charge,
including translations into other languages
and incorporation into compilations,
provided that the copyright notice and this notice are preserved,
and that any substantive changes or deletions from the original
are clearly marked.
Distribution of this document is unlimited.
### Version
1.6.1 (30/01/2018)
Introduction
------------
The purpose of this document is to define a lossless compressed data format,
that is independent of CPU type, operating system,
file system and character set, suitable for
file compression, pipe and streaming compression
using the [LZ4 algorithm](http://www.lz4.org).
The data can be produced or consumed,
even for an arbitrarily long sequentially presented input data stream,
using only an a priori bounded amount of intermediate storage,
and hence can be used in data communications.
The format uses the LZ4 compression method,
and optional [xxHash-32 checksum method](https://github.com/Cyan4973/xxHash),
for detection of data corruption.
The data format defined by this specification
does not attempt to allow random access to compressed data.
This specification is intended for use by implementers of software
to compress data into LZ4 format and/or decompress data from LZ4 format.
The text of the specification assumes a basic background in programming
at the level of bits and other primitive data representations.
Unless otherwise indicated below,
a compliant compressor must produce data sets
that conform to the specifications presented here.
It doesn't need to support all options, though.
A compliant decompressor must be able to decompress
at least one working set of parameters
that conforms to the specifications presented here.
It may also ignore checksums.
Whenever it does not support a specific parameter within the compressed stream,
it must produce a non-ambiguous error code
and associated error message explaining which parameter is unsupported.
General Structure of LZ4 Frame format
-------------------------------------
| MagicNb | F. Descriptor | Block | (...) | EndMark | C. Checksum |
|:-------:|:-------------:| ----- | ----- | ------- | ----------- |
| 4 bytes | 3-15 bytes | | | 4 bytes | 0-4 bytes |
__Magic Number__
4 Bytes, Little endian format.
Value : 0x184D2204
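Because the magic number is stored in little-endian order, the first bytes of a frame are `04 22 4D 18`. A small sketch of the check in C follows. The helper names `read_le32` and `is_lz4_frame_magic` are hypothetical, chosen only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reads a 32-bit little-endian value, independent of host byte order. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Hypothetical check: does the buffer start with the LZ4 frame magic number? */
static int is_lz4_frame_magic(const uint8_t *p, size_t size)
{
    assert(p != NULL);
    return size >= 4 && read_le32(p) == 0x184D2204u;
}
```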
__Frame Descriptor__
3 to 15 Bytes, to be detailed in its own paragraph,
as it is the most important part of the spec.
The combined __Magic Number__ and __Frame Descriptor__ fields are sometimes
called ___LZ4 Frame Header___. Its size varies between 7 and 19 bytes.
__Data Blocks__
To be detailed in its own paragraph.
That's where compressed data is stored.
__EndMark__
The flow of blocks ends when the last data block has a size of “0”.
The size is expressed as a 32-bit value.
__Content Checksum__
The Content Checksum verifies that the full content has been decoded correctly.
The content checksum is the result
of [xxh32() hash function](https://github.com/Cyan4973/xxHash)
digesting the original (decoded) data as input, and a seed of zero.
Content checksum is only present when its associated flag
is set in the frame descriptor.
Content Checksum validates the result,
that all blocks were fully transmitted in the correct order and without error,
and also that the encoding/decoding process itself generated no distortion.
Its usage is recommended.
The combined __EndMark__ and __Content Checksum__ fields might sometimes be
referred to as ___LZ4 Frame Footer___. Its size varies between 4 and 8 bytes.
__Frame Concatenation__
In some circumstances, it may be preferable to append multiple frames,
for example in order to add new data to an existing compressed file
without re-framing it.
In such a case, each frame has its own set of descriptor flags.
Each frame is considered independent.
The only relation between frames is their sequential order.
The ability to decode multiple concatenated frames
within a single stream or file
is left outside of this specification.
As an example, the reference lz4 command line utility behavior is
to decode all concatenated frames in their sequential order.
Frame Descriptor
----------------
| FLG | BD | (Content Size) | (Dictionary ID) | HC |
| ------- | ------- |:--------------:|:---------------:| ------- |
| 1 byte | 1 byte | 0 - 8 bytes | 0 - 4 bytes | 1 byte |
The descriptor uses a minimum of 3 bytes,
and up to 15 bytes depending on optional parameters.
__FLG byte__
| BitNb | 7-6 | 5 | 4 | 3 | 2 | 1 | 0 |
| ------- |-------|-------|----------|------|----------|----------|------|
|FieldName|Version|B.Indep|B.Checksum|C.Size|C.Checksum|*Reserved*|DictID|
__BD byte__
| BitNb | 7 | 6-5-4 | 3-2-1-0 |
| ------- | -------- | ------------- | -------- |
|FieldName|*Reserved*| Block MaxSize |*Reserved*|
In the tables, bit 7 is highest bit, while bit 0 is lowest.
__Version Number__
2-bits field, must be set to `01`.
Any other value cannot be decoded by this version of the specification.
Other version numbers will use different flag layouts.
__Block Independence flag__
If this flag is set to “1”, blocks are independent.
If this flag is set to “0”, each block depends on previous ones
(up to LZ4 window size, which is 64 KB).
In such a case, it's necessary to decode all blocks in sequence.
Block dependency improves compression ratio, especially for small blocks.
On the other hand, it makes random access or multi-threaded decoding impossible.
__Block checksum flag__
If this flag is set, each data block will be followed by a 4-bytes checksum,
calculated by using the xxHash-32 algorithm on the raw (compressed) data block.
The intention is to detect data corruption (storage or transmission errors)
immediately, before decoding.
Block checksum usage is optional.
__Content Size flag__
If this flag is set, the uncompressed size of data included within the frame
will be present as an 8-byte unsigned little-endian value, after the flags.
Content Size usage is optional.
__Content checksum flag__
If this flag is set, a 32-bits content checksum will be appended
after the EndMark.
__Dictionary ID flag__
If this flag is set, a 4-bytes Dict-ID field will be present,
after the descriptor flags and the Content Size.
__Block Maximum Size__
This information is useful to help the decoder allocate memory.
Size here refers to the original (uncompressed) data size.
Block Maximum Size is one value among the following table :
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
| --- | --- | --- | --- | ----- | ------ | ---- | ---- |
| N/A | N/A | N/A | N/A | 64 KB | 256 KB | 1 MB | 4 MB |
The decoder may refuse to allocate block sizes above any system-specific size.
Unused values may be used in a future revision of the spec.
A decoder conformant with the current version of the spec
is only able to decode block sizes defined in this spec.
__Reserved bits__
Value of reserved bits **must** be 0 (zero).
Reserved bits might be used in a future version of the specification,
typically enabling new optional features.
When this happens, a decoder respecting the current specification version
will not be able to decode such a frame.
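The FLG and BD layouts above can be decoded with a few shifts and masks. The following C sketch is an illustration under stated assumptions: `FrameFlagsSketch` and `parse_flg_bd` are hypothetical names, and the function rejects any frame that the current version of the specification cannot describe (wrong version, reserved bits set, or an unused Block Maximum Size value).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the decoded FLG and BD fields. */
typedef struct {
    int blockIndependence;   /* FLG bit 5 */
    int blockChecksum;       /* FLG bit 4 */
    int contentSize;         /* FLG bit 3 */
    int contentChecksum;     /* FLG bit 2 */
    int dictID;              /* FLG bit 0 */
    size_t blockMaxSize;     /* in bytes, from BD bits 6-5-4 */
    size_t descriptorSize;   /* FLG + BD + optional fields + HC, in bytes */
} FrameFlagsSketch;

/* Returns 0 on success, -1 if the two bytes cannot be decoded. */
static int parse_flg_bd(uint8_t flg, uint8_t bd, FrameFlagsSketch *out)
{
    static const size_t blockSizes[8] = {
        0, 0, 0, 0,                       /* values 0-3 : N/A */
        64u * 1024, 256u * 1024, 1024u * 1024, 4u * 1024 * 1024
    };
    unsigned blockSizeId = (bd >> 4) & 0x7u;

    assert(out != NULL);
    if ((flg >> 6) != 1) return -1;       /* version must be 01 */
    if (flg & 0x02u) return -1;           /* reserved FLG bit 1 must be 0 */
    if (bd & 0x8Fu) return -1;            /* reserved BD bits 7 and 3-0 must be 0 */
    if (blockSizeId < 4) return -1;       /* unused table values */

    out->blockIndependence = (flg >> 5) & 1;
    out->blockChecksum     = (flg >> 4) & 1;
    out->contentSize       = (flg >> 3) & 1;
    out->contentChecksum   = (flg >> 2) & 1;
    out->dictID            = flg & 1;
    out->blockMaxSize      = blockSizes[blockSizeId];
    out->descriptorSize    = 3u + (out->contentSize ? 8u : 0u)
                                + (out->dictID ? 4u : 0u);
    return 0;
}
```

The computed `descriptorSize` ranges from 3 to 15 bytes, in line with the descriptor table.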
__Content Size__
This is the original (uncompressed) size.
This information is optional, and only present if the associated flag is set.
Content size is provided as an unsigned 8-byte value, for a maximum of 16 exabytes.
Format is Little endian.
This value is informational, typically for display or memory allocation.
It can be skipped by a decoder, or used to validate content correctness.
__Dictionary ID__
Dict-ID is only present if the associated flag is set.
It's an unsigned 32-bits value, stored using little-endian convention.
A dictionary is useful to compress short input sequences.
The compressor can take advantage of the dictionary context
to encode the input in a more compact manner.
It works as a kind of “known prefix” which is used by
both the compressor and the decompressor to “warm-up” reference tables.
The decompressor can use Dict-ID identifier to determine
which dictionary must be used to correctly decode data.
The compressor and the decompressor must use exactly the same dictionary.
It's presumed that the 32-bits dictID uniquely identifies a dictionary.
Within a single frame, a single dictionary can be defined.
When the frame descriptor defines independent blocks,
each block will be initialized with the same dictionary.
If the frame descriptor defines linked blocks,
the dictionary will only be used once, at the beginning of the frame.
__Header Checksum__
One-byte checksum of combined descriptor fields, including optional ones.
The value is the second byte of `xxh32()` : ` (xxh32()>>8) & 0xFF `
using zero as a seed, and the full Frame Descriptor as an input
(including optional fields when they are present).
A wrong checksum indicates an error in the descriptor.
Header checksum is informational and can be skipped.
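To make the header checksum rule concrete, here is a C sketch that computes it. It rests on an assumption worth stating loudly: `xxh32_short` is a hypothetical, reduced re-implementation of XXH32 that only handles inputs shorter than 16 bytes. That is enough here because the descriptor without its HC byte is at most 14 bytes long, but real code would more likely call `XXH32()` from the xxHash library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint32_t rotl32(uint32_t x, int r)
{
    return (x << r) | (x >> (32 - r));
}

/* Reduced XXH32 for inputs shorter than 16 bytes (assumption: illustration
 * only, not a general-purpose replacement for the xxHash library). */
static uint32_t xxh32_short(const uint8_t *p, size_t len, uint32_t seed)
{
    const uint32_t P1 = 0x9E3779B1u, P2 = 0x85EBCA77u, P3 = 0xC2B2AE3Du,
                   P4 = 0x27D4EB2Fu, P5 = 0x165667B1u;
    uint32_t h;

    assert(len < 16);
    h = seed + P5 + (uint32_t)len;
    while (len >= 4) {                    /* 4-byte little-endian words */
        uint32_t w = (uint32_t)p[0] | ((uint32_t)p[1] << 8)
                   | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
        h += w * P3;
        h = rotl32(h, 17) * P4;
        p += 4;
        len -= 4;
    }
    while (len > 0) {                     /* remaining single bytes */
        h += (uint32_t)(*p) * P5;
        h = rotl32(h, 11) * P1;
        p++;
        len--;
    }
    h ^= h >> 15;                         /* final mixing */
    h *= P2;
    h ^= h >> 13;
    h *= P3;
    h ^= h >> 16;
    return h;
}

/* Hypothetical helper: HC byte for a descriptor of `len` bytes
 * (FLG, BD and optional fields, excluding the HC byte itself). */
static uint8_t frame_header_checksum(const uint8_t *descriptor, size_t len)
{
    return (uint8_t)((xxh32_short(descriptor, len, 0) >> 8) & 0xFF);
}
```

For the two-byte descriptor `64 40` (version 01, independent blocks, content checksum, 64 KB blocks), this sketch yields an HC byte of `0xA7`.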
Data Blocks
-----------
| Block Size | data | (Block Checksum) |
|:----------:| ------ |:----------------:|
| 4 bytes | | 0 - 4 bytes |
__Block Size__
This field uses 4 bytes, in little-endian format.
The highest bit is “1” if data in the block is uncompressed.
The highest bit is “0” if data in the block is compressed by LZ4.
All other bits give the size, in bytes, of the following data block
(the size does not include the block checksum if present).
Block Size shall never be larger than Block Maximum Size.
Compressing incompressible source data could otherwise produce a larger block.
In such a case, the data block shall be passed in uncompressed format.
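The Block Size field therefore carries two pieces of information in one 32-bit value. A minimal C sketch of splitting it is given here. `BlockHeaderSketch` and `parse_block_header` are hypothetical names, and the EndMark test relies on the earlier statement that the flow of blocks ends with a size of 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of a decoded Block Size field. */
typedef struct {
    int isUncompressed;   /* highest bit: 1 = stored uncompressed */
    uint32_t dataSize;    /* lower 31 bits: size of the data that follows */
    int isEndMark;        /* 1 if the whole 32-bit value is 0 */
} BlockHeaderSketch;

static BlockHeaderSketch parse_block_header(const uint8_t *p)
{
    BlockHeaderSketch h;
    uint32_t v;

    assert(p != NULL);
    v = (uint32_t)p[0] | ((uint32_t)p[1] << 8)
      | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);   /* little endian */
    h.isUncompressed = (int)(v >> 31);
    h.dataSize = v & 0x7FFFFFFFu;
    h.isEndMark = (v == 0);
    return h;
}
```

A decoder would still need to compare `dataSize` against the Block Maximum Size announced in the frame descriptor before reading the data.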
__Data__
Where the actual data to decode stands.
It might be compressed or not, depending on previous field indications.
Uncompressed size of Data can be any size, up to “block maximum size”.
Note that data block is not necessarily full :
an arbitrary “flush” may happen anytime. Any block can be “partially filled”.
__Block checksum__
Only present if the associated flag is set.
This is a 4-bytes checksum value, in little endian format,
calculated by using the xxHash-32 algorithm on the raw (undecoded) data block,
and a seed of zero.
The intention is to detect data corruption (storage or transmission errors)
before decoding.
Block checksum is cumulative with Content checksum.
Skippable Frames
----------------
| Magic Number | Frame Size | User Data |
|:------------:|:----------:| --------- |
| 4 bytes | 4 bytes | |
Skippable frames allow the integration of user-defined data
into a flow of concatenated frames.
Its design is pretty straightforward,
with the sole objective to allow the decoder to quickly skip
over user-defined data and continue decoding.
For the purpose of facilitating identification,
it is discouraged to start a flow of concatenated frames with a skippable frame.
If there is a need to start such a flow with some user data
encapsulated into a skippable frame,
it's recommended to start with a zero-byte LZ4 frame
followed by a skippable frame.
This will make it easier for file type identifiers.
__Magic Number__
4 Bytes, Little endian format.
Value : 0x184D2A5X, which means any value from 0x184D2A50 to 0x184D2A5F.
All 16 values are valid to identify a skippable frame.
__Frame Size__
This is the size, in bytes, of the following User Data
(without including the magic number nor the size field itself).
4 Bytes, Little endian format, unsigned 32-bits.
This means User Data can't be bigger than (2^32-1) Bytes.
__User Data__
User Data can be anything. Data will just be skipped by the decoder.
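Skipping such a frame only requires the 8-byte header. The C sketch below shows one possible way to do it. `skippable_frame_total_size` is a hypothetical name, and returning 0 for "not a skippable frame" is a convention chosen for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: returns the total size of a skippable frame
 * (magic number + Frame Size field + User Data), or 0 if the buffer
 * does not start with a complete skippable frame header. */
static uint64_t skippable_frame_total_size(const uint8_t *p, size_t available)
{
    uint32_t magic;

    assert(p != NULL);
    if (available < 8) return 0;
    magic = read_le32(p);
    if ((magic & 0xFFFFFFF0u) != 0x184D2A50u) return 0;  /* 0x184D2A50..5F */
    return 8u + (uint64_t)read_le32(p + 4);
}
```

The result is widened to 64 bits because the header plus a User Data field of (2^32-1) bytes does not fit in 32 bits.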
Legacy frame
------------
The Legacy frame format was defined in the initial versions of “LZ4Demo”.
Newer compressors should not use this format anymore, as it is too restrictive.
Main characteristics of the legacy format :
- Fixed block size : 8 MB.
- All blocks must be completely filled, except the last one.
- All blocks are always compressed, even when compression is detrimental.
- The last block is detected either because
it is followed by the “EOF” (End of File) mark,
or because it is followed by a known Frame Magic Number.
- No checksum
- Convention is Little endian
| MagicNb | B.CSize | CData | B.CSize | CData | (...) | EndMark |
| ------- | ------- | ----- | ------- | ----- | ------- | ------- |
| 4 bytes | 4 bytes | CSize | 4 bytes | CSize | x times | EOF |
__Magic Number__
4 Bytes, Little endian format.
Value : 0x184C2102
__Block Compressed Size__
This is the size, in bytes, of the following compressed data block.
4 Bytes, Little endian format.
__Data__
Where the actual compressed data stands.
Data is always compressed, even when compression is detrimental.
__EndMark__
End of legacy frame is implicit only.
It must be followed by a standard EOF (End Of File) signal,
whether it is a file or a stream.
Alternatively, if the frame is followed by a valid Frame Magic Number,
it is considered completed.
This policy makes it possible to concatenate legacy frames.
Any other value will be interpreted as a block size,
and trigger an error if it does not fit within acceptable range.
Version changes
---------------
1.6.1 : introduced terms "LZ4 Frame Header" and "LZ4 Frame Footer"
1.6.0 : restored Dictionary ID field in Frame header
1.5.1 : changed document format to MarkDown
1.5 : removed Dictionary ID from specification
1.4.1 : changed wording from “stream” to “frame”
1.4 : added skippable streams, re-added stream checksum
1.3 : modified header checksum
1.2 : reduced choice of “block size”, to postpone decision on “dynamic size of BlockSize Field”.
1.1 : optional fields are now part of the descriptor
1.0 : changed “block size” specification, adding a compressed/uncompressed flag
0.9 : reduced scale of “block maximum size” table
0.8 : removed : high compression flag
0.7 : removed : stream checksum
0.6 : settled : stream size uses 8 bytes, endian convention is little endian
0.5 : added copyright notice
0.4 : changed format to Google Doc compatible OpenDocument

458
doc/lz4_manual.html Normal file

@@ -0,0 +1,458 @@
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<title>1.8.3 Manual</title>
</head>
<body>
<h1>1.8.3 Manual</h1>
<hr>
<a name="Contents"></a><h2>Contents</h2>
<ol>
<li><a href="#Chapter1">Introduction</a></li>
<li><a href="#Chapter2">Version</a></li>
<li><a href="#Chapter3">Tuning parameter</a></li>
<li><a href="#Chapter4">Simple Functions</a></li>
<li><a href="#Chapter5">Advanced Functions</a></li>
<li><a href="#Chapter6">Streaming Compression Functions</a></li>
<li><a href="#Chapter7">Streaming Decompression Functions</a></li>
<li><a href="#Chapter8">Unstable declarations</a></li>
<li><a href="#Chapter9">Private definitions</a></li>
<li><a href="#Chapter10">Obsolete Functions</a></li>
</ol>
<hr>
<a name="Chapter1"></a><h2>Introduction</h2><pre>
LZ4 is a lossless compression algorithm, providing compression speed at 400 MB/s per core,
scalable with multi-cores CPU. It features an extremely fast decoder, with speed in
multiple GB/s per core, typically reaching RAM speed limits on multi-core systems.
The LZ4 compression library provides in-memory compression and decompression functions.
Compression can be done in:
- a single step (described as Simple Functions)
- a single step, reusing a context (described in Advanced Functions)
- unbounded multiple steps (described as Streaming compression)
lz4.h provides block compression functions. It gives full buffer control to user.
Decompressing an lz4-compressed block also requires metadata (such as compressed size).
Each application is free to encode such metadata in whichever way it wants.
An additional format, called LZ4 frame specification (doc/lz4_Frame_format.md),
takes care of encoding standard metadata alongside LZ4-compressed blocks.
If your application requires interoperability, it's recommended to use it.
A library is provided to take care of it, see lz4frame.h.
<BR></pre>
<a name="Chapter2"></a><h2>Version</h2><pre></pre>
<pre><b>int LZ4_versionNumber (void); </b>/**< library version number; useful to check dll version */<b>
</b></pre><BR>
<pre><b>const char* LZ4_versionString (void); </b>/**< library version string; useful to check dll version */<b>
</b></pre><BR>
<a name="Chapter3"></a><h2>Tuning parameter</h2><pre></pre>
<pre><b>#ifndef LZ4_MEMORY_USAGE
# define LZ4_MEMORY_USAGE 14
#endif
</b><p> Memory usage formula : N->2^N Bytes (examples : 10 -> 1KB; 12 -> 4KB ; 16 -> 64KB; 20 -> 1MB; etc.)
Increasing memory usage improves compression ratio
Reduced memory usage may improve speed, thanks to cache effect
Default value is 14, for 16KB, which nicely fits into Intel x86 L1 cache
</p></pre><BR>
<a name="Chapter4"></a><h2>Simple Functions</h2><pre></pre>
<pre><b>int LZ4_compress_default(const char* src, char* dst, int srcSize, int dstCapacity);
</b><p> Compresses 'srcSize' bytes from buffer 'src'
into already allocated 'dst' buffer of size 'dstCapacity'.
Compression is guaranteed to succeed if 'dstCapacity' >= LZ4_compressBound(srcSize).
It also runs faster, so it's a recommended setting.
If the function cannot compress 'src' into a more limited 'dst' budget,
compression stops *immediately*, and the function result is zero.
Note : as a consequence, 'dst' content is not valid.
Note 2 : This function is protected against buffer overflow scenarios (never writes outside 'dst' buffer, nor reads outside 'src' buffer).
srcSize : max supported value is LZ4_MAX_INPUT_SIZE.
dstCapacity : size of buffer 'dst' (which must be already allocated)
return : the number of bytes written into buffer 'dst' (necessarily <= dstCapacity)
or 0 if compression fails
</p></pre><BR>
<pre><b>int LZ4_decompress_safe (const char* src, char* dst, int compressedSize, int dstCapacity);
</b><p> compressedSize : is the exact complete size of the compressed block.
dstCapacity : is the size of destination buffer, which must be already allocated.
return : the number of bytes decompressed into destination buffer (necessarily <= dstCapacity)
If destination buffer is not large enough, decoding will stop and output an error code (negative value).
If the source stream is detected malformed, the function will stop decoding and return a negative result.
This function is protected against malicious data packets.
</p></pre><BR>
<a name="Chapter5"></a><h2>Advanced Functions</h2><pre></pre>
<pre><b>int LZ4_compressBound(int inputSize);
</b><p> Provides the maximum size that LZ4 compression may output in a "worst case" scenario (input data not compressible)
This function is primarily useful for memory allocation purposes (destination buffer size).
Macro LZ4_COMPRESSBOUND() is also provided for compilation-time evaluation (stack memory allocation for example).
Note that LZ4_compress_default() compresses faster when dstCapacity is >= LZ4_compressBound(srcSize)
inputSize : max supported value is LZ4_MAX_INPUT_SIZE
return : maximum output size in a "worst case" scenario
or 0, if input size is incorrect (too large or negative)
</p></pre><BR>
<pre><b>int LZ4_compress_fast (const char* src, char* dst, int srcSize, int dstCapacity, int acceleration);
</b><p> Same as LZ4_compress_default(), but allows selection of "acceleration" factor.
The larger the acceleration value, the faster the algorithm, but also the lesser the compression.
It's a trade-off. It can be fine tuned, with each successive value providing roughly +~3% to speed.
An acceleration value of "1" is the same as regular LZ4_compress_default()
Values <= 0 will be replaced by ACCELERATION_DEFAULT (currently == 1, see lz4.c).
</p></pre><BR>
<pre><b>int LZ4_sizeofState(void);
int LZ4_compress_fast_extState (void* state, const char* src, char* dst, int srcSize, int dstCapacity, int acceleration);
</b><p> Same compression function, just using an externally allocated memory space to store compression state.
Use LZ4_sizeofState() to know how much memory must be allocated,
and allocate it on 8-bytes boundaries (using malloc() typically).
Then, provide this buffer as 'void* state' to compression function.
</p></pre><BR>
<pre><b>int LZ4_compress_destSize (const char* src, char* dst, int* srcSizePtr, int targetDstSize);
</b><p> Reverse the logic : compresses as much data as possible from 'src' buffer
into already allocated buffer 'dst', of size >= 'targetDstSize'.
This function either compresses the entire 'src' content into 'dst' if it's large enough,
or fills 'dst' buffer completely with as much data as possible from 'src'.
note: acceleration parameter is fixed to "default".
*srcSizePtr : will be modified to indicate how many bytes were read from 'src' to fill 'dst'.
New value is necessarily <= input value.
@return : Nb bytes written into 'dst' (necessarily <= targetDstSize)
or 0 if compression fails.
</p></pre><BR>
<pre><b>int LZ4_decompress_fast (const char* src, char* dst, int originalSize);
</b><p> This function used to be a bit faster than LZ4_decompress_safe(),
though situation has changed in recent versions,
and now `LZ4_decompress_safe()` can be as fast and sometimes faster than `LZ4_decompress_fast()`.
Moreover, LZ4_decompress_fast() is not protected vs malformed input, as it doesn't perform full validation of compressed data.
As a consequence, this function is no longer recommended, and may be deprecated in future versions.
Its only remaining specificity is that it can decompress data without knowing its compressed size.
originalSize : is the uncompressed size to regenerate.
`dst` must be already allocated, its size must be >= 'originalSize' bytes.
@return : number of bytes read from source buffer (== compressed size).
If the source stream is detected malformed, the function stops decoding and returns a negative result.
note : This function requires uncompressed originalSize to be known in advance.
The function never writes past the output buffer.
However, since it doesn't know its 'src' size, it may read past the intended input.
Also, because match offsets are not validated during decoding,
reads from 'src' may underflow.
Use this function in trusted environment **only**.
</p></pre><BR>
<pre><b>int LZ4_decompress_safe_partial (const char* src, char* dst, int srcSize, int targetOutputSize, int dstCapacity);
</b><p> Decompress an LZ4 compressed block, of size 'srcSize' at position 'src',
into destination buffer 'dst' of size 'dstCapacity'.
Up to 'targetOutputSize' bytes will be decoded.
The function stops decoding on reaching this objective,
which can boost performance when only the beginning of a block is required.
@return : the number of bytes decoded in `dst` (necessarily <= dstCapacity)
If source stream is detected malformed, function returns a negative result.
Note : @return can be < targetOutputSize, if compressed block contains less data.
Note 2 : this function features 2 parameters, targetOutputSize and dstCapacity,
and expects targetOutputSize <= dstCapacity.
It effectively stops decoding on reaching targetOutputSize,
so dstCapacity is kind of redundant.
This is because in a previous version of this function,
decoding operation would not "break" a sequence in the middle.
As a consequence, there was no guarantee that decoding would stop at exactly targetOutputSize,
it could write more bytes, though only up to dstCapacity.
Some "margin" used to be required for this operation to work properly.
This is no longer necessary.
The function nonetheless keeps its signature, in an effort to not break API.
</p></pre><BR>
<a name="Chapter6"></a><h2>Streaming Compression Functions</h2><pre></pre>
<pre><b>LZ4_stream_t* LZ4_createStream(void);
int LZ4_freeStream (LZ4_stream_t* streamPtr);
</b><p> LZ4_createStream() will allocate and initialize an `LZ4_stream_t` structure.
LZ4_freeStream() releases its memory.
</p></pre><BR>
<pre><b>void LZ4_resetStream (LZ4_stream_t* streamPtr);
</b><p> An LZ4_stream_t structure can be allocated once and re-used multiple times.
Use this function to start compressing a new stream.
</p></pre><BR>
<pre><b>int LZ4_loadDict (LZ4_stream_t* streamPtr, const char* dictionary, int dictSize);
</b><p> Use this function to load a static dictionary into LZ4_stream_t.
Any previous data will be forgotten, only 'dictionary' will remain in memory.
Loading a size of 0 is allowed, and is the same as reset.
@return : dictionary size, in bytes (necessarily <= 64 KB)
</p></pre><BR>
<pre><b>int LZ4_compress_fast_continue (LZ4_stream_t* streamPtr, const char* src, char* dst, int srcSize, int dstCapacity, int acceleration);
</b><p> Compress 'src' content using data from previously compressed blocks, for better compression ratio.
'dst' buffer must be already allocated.
If dstCapacity >= LZ4_compressBound(srcSize), compression is guaranteed to succeed, and runs faster.
@return : size of compressed block
or 0 if there is an error (typically, cannot fit into 'dst').
Note 1 : Each invocation to LZ4_compress_fast_continue() generates a new block.
Each block has precise boundaries.
It's not possible to append blocks together and expect a single invocation of LZ4_decompress_*() to decompress them together.
Each block must be decompressed separately, calling LZ4_decompress_*() with associated metadata.
Note 2 : The previous 64KB of source data is __assumed__ to remain present, unmodified, at same address in memory!
Note 3 : When input is structured as a double-buffer, each buffer can have any size, including < 64 KB.
Make sure that buffers are separated, by at least one byte.
This construction ensures that each block only depends on previous block.
Note 4 : If input buffer is a ring-buffer, it can have any size, including < 64 KB.
Note 5 : After an error, the stream status is invalid, it can only be reset or freed.
</p></pre><BR>
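<p>Note 1 implies that the caller has to record where each block starts and ends. One possible way to carry that metadata is a length prefix in front of every block. The layout below is a hypothetical container made up for this sketch (it is not the LZ4 frame format), and `demo_put_block` / `demo_get_block` are illustrative names only.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical container layout, NOT the LZ4 frame format:
 * [4-byte little-endian block size][block bytes], repeated. */

/* Appends one block record to 'out' at position 'pos'.
 * Returns the position after the record, or 0 if it does not fit. */
static size_t demo_put_block(unsigned char* out, size_t outCapacity, size_t pos,
                             const void* block, size_t blockSize)
{
    if (blockSize > 0x7FFFFFFFu) return 0;            /* keep sizes within int range */
    if (pos > outCapacity || outCapacity - pos < 4) return 0;
    if (outCapacity - pos - 4 < blockSize) return 0;
    out[pos + 0] = (unsigned char)(blockSize & 0xFF);
    out[pos + 1] = (unsigned char)((blockSize >> 8) & 0xFF);
    out[pos + 2] = (unsigned char)((blockSize >> 16) & 0xFF);
    out[pos + 3] = (unsigned char)((blockSize >> 24) & 0xFF);
    memcpy(out + pos + 4, block, blockSize);
    return pos + 4 + blockSize;
}

/* Reads the record starting at 'pos'. On success stores the block start and
 * size, and returns the position of the next record. Returns 0 when the
 * input is truncated or exhausted. */
static size_t demo_get_block(const unsigned char* in, size_t inSize, size_t pos,
                             const unsigned char** block, size_t* blockSize)
{
    size_t n;
    if (pos > inSize || inSize - pos < 4) return 0;
    n = (size_t)in[pos]
      | ((size_t)in[pos + 1] << 8)
      | ((size_t)in[pos + 2] << 16)
      | ((size_t)in[pos + 3] << 24);
    if (inSize - pos - 4 < n) return 0;
    *block = in + pos + 4;
    *blockSize = n;
    return pos + 4 + n;
}
```

<p>On the decoding side, each (block, blockSize) pair recovered this way would be handed to one decompression call, in the same order in which the blocks were produced.</p>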
<pre><b>int LZ4_saveDict (LZ4_stream_t* streamPtr, char* safeBuffer, int maxDictSize);
</b><p> If last 64KB data cannot be guaranteed to remain available at its current memory location,
save it into a safer place (char* safeBuffer).
This is schematically equivalent to a memcpy() followed by LZ4_loadDict(),
but is much faster, because LZ4_saveDict() doesn't need to rebuild tables.
@return : saved dictionary size in bytes (necessarily <= maxDictSize), or 0 if error.
</p></pre><BR>
<a name="Chapter7"></a><h2>Streaming Decompression Functions</h2><pre> Bufferless synchronous API
<BR></pre>
<pre><b>LZ4_streamDecode_t* LZ4_createStreamDecode(void);
int LZ4_freeStreamDecode (LZ4_streamDecode_t* LZ4_stream);
</b><p> creation / destruction of streaming decompression tracking context.
A tracking context can be re-used multiple times.
</p></pre><BR>
<pre><b>int LZ4_setStreamDecode (LZ4_streamDecode_t* LZ4_streamDecode, const char* dictionary, int dictSize);
</b><p> An LZ4_streamDecode_t context can be allocated once and re-used multiple times.
Use this function to start decompression of a new stream of blocks.
A dictionary can optionally be set. Use NULL or size 0 for a reset order.
Dictionary is presumed stable : it must remain accessible and unmodified during next decompression.
@return : 1 if OK, 0 if error
</p></pre><BR>
<pre><b>int LZ4_decoderRingBufferSize(int maxBlockSize);
#define LZ4_DECODER_RING_BUFFER_SIZE(mbs) (65536 + 14 + (mbs)) </b>/* for static allocation; mbs presumed valid */<b>
</b><p> Note : in a ring buffer scenario (optional),
blocks are presumed decompressed next to each other
up to the moment there is not enough remaining space for next block (remainingSize < maxBlockSize),
at which stage it resumes from beginning of ring buffer.
When setting such a ring buffer for streaming decompression,
this function provides the minimum size of this ring buffer
to be compatible with any source respecting maxBlockSize condition.
@return : minimum ring buffer size,
or 0 if there is an error (invalid maxBlockSize).
</p></pre><BR>
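<p>The sizing formula and the wrap rule described above can be checked with a few lines of arithmetic. This is a minimal sketch: the macro body is copied from LZ4_DECODER_RING_BUFFER_SIZE but renamed so that the code compiles without lz4.h, and `demo_next_decode_offset` is a hypothetical helper, not a library function.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Same formula as the LZ4_DECODER_RING_BUFFER_SIZE macro quoted above,
 * restated under a demo name so the sketch compiles without lz4.h. */
#define DEMO_DECODER_RING_BUFFER_SIZE(mbs) (65536 + 14 + (mbs))

/* Hypothetical helper: returns the offset at which the next block should be
 * decoded, given the offset just past the previously decoded block.
 * Wraps to the beginning when fewer than maxBlockSize bytes remain. */
static size_t demo_next_decode_offset(size_t ringSize, size_t endOfPrevious,
                                      size_t maxBlockSize)
{
    assert(endOfPrevious <= ringSize);
    if (ringSize - endOfPrevious < maxBlockSize) return 0;  /* not enough room: wrap */
    return endOfPrevious;                                   /* decode right after previous block */
}
```

<p>With maxBlockSize = 4096 the minimum ring size is 69646 bytes. A block ending at offset 65000 leaves 4646 bytes, so decoding continues in place, while a block ending at offset 66000 leaves only 3646 bytes and the next block starts again at offset 0.</p>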
<pre><b>int LZ4_decompress_safe_continue (LZ4_streamDecode_t* LZ4_streamDecode, const char* src, char* dst, int srcSize, int dstCapacity);
int LZ4_decompress_fast_continue (LZ4_streamDecode_t* LZ4_streamDecode, const char* src, char* dst, int originalSize);
</b><p> These decoding functions allow decompression of consecutive blocks in "streaming" mode.
A block is an unsplittable entity, it must be presented entirely to a decompression function.
Decompression functions only accept one block at a time.
The last 64KB of previously decoded data *must* remain available and unmodified at the memory position where they were decoded.
If less than 64KB of data has been decoded, all the data must be present.
Special : if decompression side sets a ring buffer, it must respect one of the following conditions :
- Decompression buffer size is _at least_ LZ4_decoderRingBufferSize(maxBlockSize).
maxBlockSize is the maximum size of any single block. It can have any value > 16 bytes.
In which case, encoding and decoding buffers do not need to be synchronized.
Actually, data can be produced by any source compliant with LZ4 format specification, and respecting maxBlockSize.
- Synchronized mode :
Decompression buffer size is _exactly_ the same as compression buffer size,
and follows exactly same update rule (block boundaries at same positions),
and decoding function is provided with exact decompressed size of each block (exception for last block of the stream),
_then_ decoding & encoding ring buffer can have any size, including small ones ( < 64 KB).
- Decompression buffer is larger than encoding buffer, by a minimum of maxBlockSize more bytes.
In which case, encoding and decoding buffers do not need to be synchronized,
and encoding ring buffer can have any size, including small ones ( < 64 KB).
Whenever these conditions are not possible,
save the last 64KB of decoded data into a safe buffer where it can't be modified during decompression,
then indicate where this data is saved using LZ4_setStreamDecode(), before decompressing next block.
</p></pre><BR>
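<p>The three ring buffer conditions above can be read as a small decision procedure. The sketch below is one possible encoding of it; `demo_ring_mode` and `demo_classify_ring` are hypothetical names, and the `sameUpdateRule` flag is simply the caller's promise that block boundaries and decompressed sizes match the encoder, which code cannot verify by itself.</p>

```c
#include <assert.h>
#include <stddef.h>

typedef enum {
    DEMO_RING_NEEDS_SAFE_COPY = 0,  /* no condition holds: save last 64 KB elsewhere */
    DEMO_RING_LARGE_ENOUGH,         /* size >= 65536 + 14 + maxBlockSize */
    DEMO_RING_SYNCHRONIZED,         /* same size and same update rule as the encoder */
    DEMO_RING_LARGER_THAN_ENCODER   /* at least maxBlockSize bytes larger than encoder buffer */
} demo_ring_mode;

/* Hypothetical helper: decides which documented condition, if any,
 * a decoding ring buffer satisfies. Sizes are in bytes. */
static demo_ring_mode demo_classify_ring(size_t decSize, size_t encSize,
                                         size_t maxBlockSize, int sameUpdateRule)
{
    if (decSize >= (size_t)65536 + 14 + maxBlockSize)
        return DEMO_RING_LARGE_ENOUGH;
    if (decSize == encSize && sameUpdateRule)
        return DEMO_RING_SYNCHRONIZED;
    if (decSize > encSize && decSize - encSize >= maxBlockSize)
        return DEMO_RING_LARGER_THAN_ENCODER;
    return DEMO_RING_NEEDS_SAFE_COPY;
}
```

<p>When the result is DEMO_RING_NEEDS_SAFE_COPY, the fallback described in the text applies: copy the last 64 KB of decoded data to a stable location and point the decoder at it with LZ4_setStreamDecode() before the next block.</p>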
<pre><b>int LZ4_decompress_safe_usingDict (const char* src, char* dst, int srcSize, int dstCapacity, const char* dictStart, int dictSize);
int LZ4_decompress_fast_usingDict (const char* src, char* dst, int originalSize, const char* dictStart, int dictSize);
</b><p> These decoding functions work the same as
a combination of LZ4_setStreamDecode() followed by LZ4_decompress_*_continue().
They are stand-alone, and don't need an LZ4_streamDecode_t structure.
Dictionary is presumed stable : it must remain accessible and unmodified during next decompression.
</p></pre><BR>
<a name="Chapter8"></a><h2>Unstable declarations</h2><pre>
Declarations in this section should be considered unstable.
Use at your own peril, etc., etc.
They may be removed in the future.
Their signatures may change.
<BR></pre>
<pre><b>void LZ4_resetStream_fast (LZ4_stream_t* streamPtr);
</b><p> Use this, like LZ4_resetStream(), to prepare a context for a new chain of
calls to a streaming API (e.g., LZ4_compress_fast_continue()).
Note:
Using this in advance of a non-streaming compression function is redundant,
and potentially bad for performance, since they all perform their own custom
reset internally.
Differences from LZ4_resetStream():
When an LZ4_stream_t is known to be in an internally coherent state,
it can often be prepared for a new compression with almost no work, only
sometimes falling back to the full, expensive reset that is always required
when the stream is in an indeterminate state (i.e., the reset performed by
LZ4_resetStream()).
LZ4_streams are guaranteed to be in a valid state when:
- returned from LZ4_createStream()
- reset by LZ4_resetStream()
- memset(stream, 0, sizeof(LZ4_stream_t)), though this is discouraged
- the stream was in a valid state and was reset by LZ4_resetStream_fast()
- the stream was in a valid state and was then used in any compression call
that returned success
- the stream was in an indeterminate state and was used in a compression
call that fully reset the state (e.g., LZ4_compress_fast_extState()) and
that returned success
When a stream isn't known to be in a valid state, it is not safe to pass to
any fastReset or streaming function. It must first be cleansed by the full
LZ4_resetStream().
</p></pre><BR>
<pre><b>int LZ4_compress_fast_extState_fastReset (void* state, const char* src, char* dst, int srcSize, int dstCapacity, int acceleration);
</b><p> A variant of LZ4_compress_fast_extState().
Using this variant avoids an expensive initialization step. It is only safe
to call if the state buffer is known to be correctly initialized already
(see above comment on LZ4_resetStream_fast() for a definition of "correctly
initialized"). From a high level, the difference is that this function
initializes the provided state with a call to something like
LZ4_resetStream_fast() while LZ4_compress_fast_extState() starts with a
call to LZ4_resetStream().
</p></pre><BR>
<pre><b>void LZ4_attach_dictionary(LZ4_stream_t *working_stream, const LZ4_stream_t *dictionary_stream);
</b><p> This is an experimental API that allows for the efficient use of a
static dictionary many times.
Rather than re-loading the dictionary buffer into a working context before
each compression, or copying a pre-loaded dictionary's LZ4_stream_t into a
working LZ4_stream_t, this function introduces a no-copy setup mechanism,
in which the working stream references the dictionary stream in-place.
Several assumptions are made about the state of the dictionary stream.
Currently, only streams which have been prepared by LZ4_loadDict() should
be expected to work.
Alternatively, the provided dictionary stream pointer may be NULL, in which
case any existing dictionary stream is unset.
If a dictionary is provided, it replaces any pre-existing stream history.
The dictionary contents are the only history that can be referenced and
logically immediately precede the data compressed in the first subsequent
compression call.
The dictionary will only remain attached to the working stream through the
first compression call, at the end of which it is cleared. The dictionary
stream (and source buffer) must remain in-place / accessible / unchanged
through the completion of the first compression call on the stream.
</p></pre><BR>
<a name="Chapter9"></a><h2>Private definitions</h2><pre>
Do not use these definitions.
They are exposed to allow static allocation of `LZ4_stream_t` and `LZ4_streamDecode_t`.
Using these definitions will expose code to API and/or ABI break in future versions of the library.
<BR></pre>
<pre><b>typedef struct {
const uint8_t* externalDict;
size_t extDictSize;
const uint8_t* prefixEnd;
size_t prefixSize;
} LZ4_streamDecode_t_internal;
</b></pre><BR>
<pre><b>typedef struct {
const unsigned char* externalDict;
size_t extDictSize;
const unsigned char* prefixEnd;
size_t prefixSize;
} LZ4_streamDecode_t_internal;
</b></pre><BR>
<pre><b>#define LZ4_STREAMSIZE_U64 ((1 << (LZ4_MEMORY_USAGE-3)) + 4)
#define LZ4_STREAMSIZE (LZ4_STREAMSIZE_U64 * sizeof(unsigned long long))
union LZ4_stream_u {
unsigned long long table[LZ4_STREAMSIZE_U64];
LZ4_stream_t_internal internal_donotuse;
} ; </b>/* previously typedef'd to LZ4_stream_t */<b>
</b><p> information structure to track an LZ4 stream.
init this structure before first use.
note : only use in association with static linking !
this definition is not API/ABI safe,
it may change in a future version !
</p></pre><BR>
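<p>The two size macros above reduce to simple arithmetic. The sketch below restates them as functions so the numbers can be checked without lz4.h; the function names are hypothetical, and the value 14 used in the note is assumed to be the default LZ4_MEMORY_USAGE.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Restates LZ4_STREAMSIZE_U64 with memoryUsage standing in for
 * LZ4_MEMORY_USAGE: number of 64-bit slots in the stream union. */
static size_t demo_streamsize_u64(unsigned memoryUsage)
{
    assert(memoryUsage >= 3 && memoryUsage <= 20);  /* keep the shift well defined */
    return ((size_t)1 << (memoryUsage - 3)) + 4;
}

/* Restates LZ4_STREAMSIZE: size of the stream union in bytes. */
static size_t demo_streamsize_bytes(unsigned memoryUsage)
{
    return demo_streamsize_u64(memoryUsage) * sizeof(unsigned long long);
}
```

<p>With memoryUsage = 14 this gives 2052 slots, which is 16416 bytes on platforms where unsigned long long is 8 bytes wide: a 16 KB table plus 32 bytes of bookkeeping.</p>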
<pre><b>#define LZ4_STREAMDECODESIZE_U64 4
#define LZ4_STREAMDECODESIZE (LZ4_STREAMDECODESIZE_U64 * sizeof(unsigned long long))
union LZ4_streamDecode_u {
unsigned long long table[LZ4_STREAMDECODESIZE_U64];
LZ4_streamDecode_t_internal internal_donotuse;
} ; </b>/* previously typedef'd to LZ4_streamDecode_t */<b>
</b><p> information structure to track an LZ4 stream during decompression.
init this structure using LZ4_setStreamDecode (or memset()) before first use
note : only use in association with static linking !
this definition is not API/ABI safe,
and may change in a future version !
</p></pre><BR>
<a name="Chapter10"></a><h2>Obsolete Functions</h2><pre></pre>
<pre><b>#ifdef LZ4_DISABLE_DEPRECATE_WARNINGS
# define LZ4_DEPRECATED(message) </b>/* disable deprecation warnings */<b>
#else
# define LZ4_GCC_VERSION (__GNUC__ * 100 + __GNUC_MINOR__)
# if defined (__cplusplus) && (__cplusplus >= 201402) </b>/* C++14 or greater */<b>
# define LZ4_DEPRECATED(message) [[deprecated(message)]]
# elif (LZ4_GCC_VERSION >= 405) || defined(__clang__)
# define LZ4_DEPRECATED(message) __attribute__((deprecated(message)))
# elif (LZ4_GCC_VERSION >= 301)
# define LZ4_DEPRECATED(message) __attribute__((deprecated))
# elif defined(_MSC_VER)
# define LZ4_DEPRECATED(message) __declspec(deprecated(message))
# else
# pragma message("WARNING: You need to implement LZ4_DEPRECATED for this compiler")
# define LZ4_DEPRECATED(message)
# endif
#endif </b>/* LZ4_DISABLE_DEPRECATE_WARNINGS */<b>
</b><p> Should deprecation warnings be a problem,
it is generally possible to disable them,
typically with -Wno-deprecated-declarations for gcc
or _CRT_SECURE_NO_WARNINGS in Visual Studio.
Otherwise, it's also possible to define LZ4_DISABLE_DEPRECATE_WARNINGS.
</p></pre><BR>
</body>
</html>

352
doc/lz4frame_manual.html Normal file

@@ -0,0 +1,352 @@
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<title>1.8.3 Manual</title>
</head>
<body>
<h1>1.8.3 Manual</h1>
<hr>
<a name="Contents"></a><h2>Contents</h2>
<ol>
<li><a href="#Chapter1">Introduction</a></li>
<li><a href="#Chapter2">Compiler specifics</a></li>
<li><a href="#Chapter3">Error management</a></li>
<li><a href="#Chapter4">Frame compression types</a></li>
<li><a href="#Chapter5">Simple compression function</a></li>
<li><a href="#Chapter6">Advanced compression functions</a></li>
<li><a href="#Chapter7">Resource Management</a></li>
<li><a href="#Chapter8">Compression</a></li>
<li><a href="#Chapter9">Decompression functions</a></li>
<li><a href="#Chapter10">Streaming decompression functions</a></li>
<li><a href="#Chapter11">Bulk processing dictionary API</a></li>
</ol>
<hr>
<a name="Chapter1"></a><h2>Introduction</h2><pre>
lz4frame.h implements LZ4 frame specification (doc/lz4_Frame_format.md).
lz4frame.h provides frame compression functions that take care
of encoding standard metadata alongside LZ4-compressed blocks.
<BR></pre>
<a name="Chapter2"></a><h2>Compiler specifics</h2><pre></pre>
<a name="Chapter3"></a><h2>Error management</h2><pre></pre>
<pre><b>unsigned LZ4F_isError(LZ4F_errorCode_t code); </b>/**< tells when a function result is an error code */<b>
</b></pre><BR>
<pre><b>const char* LZ4F_getErrorName(LZ4F_errorCode_t code); </b>/**< return error code string; for debugging */<b>
</b></pre><BR>
<a name="Chapter4"></a><h2>Frame compression types</h2><pre></pre>
<pre><b>typedef enum {
LZ4F_default=0,
LZ4F_max64KB=4,
LZ4F_max256KB=5,
LZ4F_max1MB=6,
LZ4F_max4MB=7
LZ4F_OBSOLETE_ENUM(max64KB)
LZ4F_OBSOLETE_ENUM(max256KB)
LZ4F_OBSOLETE_ENUM(max1MB)
LZ4F_OBSOLETE_ENUM(max4MB)
} LZ4F_blockSizeID_t;
</b></pre><BR>
<pre><b>typedef enum {
LZ4F_blockLinked=0,
LZ4F_blockIndependent
LZ4F_OBSOLETE_ENUM(blockLinked)
LZ4F_OBSOLETE_ENUM(blockIndependent)
} LZ4F_blockMode_t;
</b></pre><BR>
<pre><b>typedef enum {
LZ4F_noContentChecksum=0,
LZ4F_contentChecksumEnabled
LZ4F_OBSOLETE_ENUM(noContentChecksum)
LZ4F_OBSOLETE_ENUM(contentChecksumEnabled)
} LZ4F_contentChecksum_t;
</b></pre><BR>
<pre><b>typedef enum {
LZ4F_noBlockChecksum=0,
LZ4F_blockChecksumEnabled
} LZ4F_blockChecksum_t;
</b></pre><BR>
<pre><b>typedef enum {
LZ4F_frame=0,
LZ4F_skippableFrame
LZ4F_OBSOLETE_ENUM(skippableFrame)
} LZ4F_frameType_t;
</b></pre><BR>
<pre><b>typedef struct {
LZ4F_blockSizeID_t blockSizeID; </b>/* max64KB, max256KB, max1MB, max4MB; 0 == default */<b>
LZ4F_blockMode_t blockMode; </b>/* LZ4F_blockLinked, LZ4F_blockIndependent; 0 == default */<b>
LZ4F_contentChecksum_t contentChecksumFlag; </b>/* 1: frame terminated with 32-bit checksum of decompressed data; 0: disabled (default) */<b>
LZ4F_frameType_t frameType; </b>/* read-only field : LZ4F_frame or LZ4F_skippableFrame */<b>
unsigned long long contentSize; </b>/* Size of uncompressed content ; 0 == unknown */<b>
unsigned dictID; </b>/* Dictionary ID, sent by compressor to help decoder select correct dictionary; 0 == no dictID provided */<b>
LZ4F_blockChecksum_t blockChecksumFlag; </b>/* 1: each block followed by a checksum of block's compressed data; 0: disabled (default) */<b>
} LZ4F_frameInfo_t;
</b><p> makes it possible to set or read frame parameters.
It's not required to set all fields, as long as the structure was initially memset() to zero.
For every field, 0 selects the default value.
</p></pre><BR>
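<p>The blockSizeID values name their limits directly, so turning an ID into a byte count is a small lookup. The sketch below uses a local mirror of the enum values listed earlier (the `DEMO_` names and `demo_block_size` are hypothetical, not part of lz4frame.h) and assumes that the default ID 0 resolves to 64 KB.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Local mirror of the LZ4F_blockSizeID_t values, renamed so the
 * sketch compiles without lz4frame.h. */
typedef enum {
    DEMO_default  = 0,
    DEMO_max64KB  = 4,
    DEMO_max256KB = 5,
    DEMO_max1MB   = 6,
    DEMO_max4MB   = 7
} demo_blockSizeID;

/* Returns the maximum block size in bytes, or 0 for an unknown ID.
 * Assumption: the default ID (0) behaves like max64KB. */
static size_t demo_block_size(demo_blockSizeID id)
{
    switch (id) {
    case DEMO_default:
    case DEMO_max64KB:  return (size_t)64 * 1024;
    case DEMO_max256KB: return (size_t)256 * 1024;
    case DEMO_max1MB:   return (size_t)1024 * 1024;
    case DEMO_max4MB:   return (size_t)4 * 1024 * 1024;
    default:            return 0;
    }
}
```

<p>A decoder could use such a lookup, after reading the frame parameters, to size a buffer that is guaranteed to hold any single decompressed block.</p>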
<pre><b>typedef struct {
LZ4F_frameInfo_t frameInfo;
int compressionLevel; </b>/* 0: default (fast mode); values > LZ4HC_CLEVEL_MAX count as LZ4HC_CLEVEL_MAX; values < 0 trigger "fast acceleration" */<b>
unsigned autoFlush; </b>/* 1: always flush, to reduce usage of internal buffers */<b>
unsigned favorDecSpeed; </b>/* 1: parser favors decompression speed vs compression ratio. Only works for high compression modes (>= LZ4HC_CLEVEL_OPT_MIN) */ /* >= v1.8.2 */<b>
unsigned reserved[3]; </b>/* must be zero for forward compatibility */<b>
} LZ4F_preferences_t;
</b><p> makes it possible to supply detailed compression parameters to the stream interface.
Structure is presumed initially memset() to zero, representing default settings.
All reserved fields must be set to zero.
</p></pre><BR>
<a name="Chapter5"></a><h2>Simple compression function</h2><pre></pre>
<pre><b>size_t LZ4F_compressFrameBound(size_t srcSize, const LZ4F_preferences_t* preferencesPtr);
</b><p> Returns the maximum possible compressed size with LZ4F_compressFrame() given srcSize and preferences.
`preferencesPtr` is optional. It can be replaced by NULL, in which case, the function will assume default preferences.
Note : this result is only usable with LZ4F_compressFrame().
It may also be used with LZ4F_compressUpdate() _if no flush() operation_ is performed.
</p></pre><BR>
<pre><b>size_t LZ4F_compressFrame(void* dstBuffer, size_t dstCapacity,
const void* srcBuffer, size_t srcSize,
const LZ4F_preferences_t* preferencesPtr);
</b><p> Compress an entire srcBuffer into a valid LZ4 frame.
dstCapacity MUST be >= LZ4F_compressFrameBound(srcSize, preferencesPtr).
The LZ4F_preferences_t structure is optional : you can provide NULL as argument. All preferences will be set to default.
@return : number of bytes written into dstBuffer.
or an error code if it fails (can be tested using LZ4F_isError())
</p></pre><BR>
<a name="Chapter6"></a><h2>Advanced compression functions</h2><pre></pre>
<pre><b>typedef struct {
unsigned stableSrc; </b>/* 1 == src content will remain present on future calls to LZ4F_compress(); skip copying src content within tmp buffer */<b>
unsigned reserved[3];
} LZ4F_compressOptions_t;
</b></pre><BR>
<a name="Chapter7"></a><h2>Resource Management</h2><pre></pre>
<pre><b>LZ4F_errorCode_t LZ4F_createCompressionContext(LZ4F_cctx** cctxPtr, unsigned version);
LZ4F_errorCode_t LZ4F_freeCompressionContext(LZ4F_cctx* cctx);
</b><p> The first thing to do is to create a compressionContext object, which will be used in all compression operations.
This is achieved using LZ4F_createCompressionContext(), which takes as argument a version.
The version provided MUST be LZ4F_VERSION. It is intended to track potential version mismatch, notably when using DLL.
The function will provide a pointer to a fully allocated LZ4F_cctx object.
If @return != zero, there was an error during context creation.
The object's memory can be released using LZ4F_freeCompressionContext().
</p></pre><BR>
<a name="Chapter8"></a><h2>Compression</h2><pre></pre>
<pre><b>size_t LZ4F_compressBegin(LZ4F_cctx* cctx,
void* dstBuffer, size_t dstCapacity,
const LZ4F_preferences_t* prefsPtr);
</b><p> will write the frame header into dstBuffer.
dstCapacity must be >= LZ4F_HEADER_SIZE_MAX bytes.
`prefsPtr` is optional : you can provide NULL as argument, all preferences will then be set to default.
@return : number of bytes written into dstBuffer for the header
or an error code (which can be tested using LZ4F_isError())
</p></pre><BR>
<pre><b>size_t LZ4F_compressBound(size_t srcSize, const LZ4F_preferences_t* prefsPtr);
</b><p> Provides minimum dstCapacity required to guarantee compression success
given a srcSize and preferences, covering worst case scenario.
prefsPtr is optional : when NULL is provided, preferences will be set to cover worst case scenario.
Estimation is valid for either LZ4F_compressUpdate(), LZ4F_flush() or LZ4F_compressEnd().
Estimation includes the possibility that internal buffer might already be filled by up to (blockSize-1) bytes.
It also includes frame footer (ending + checksum), which would have to be generated by LZ4F_compressEnd().
Estimation doesn't include frame header, as it was already generated by LZ4F_compressBegin().
Result is always the same for a srcSize and prefsPtr, so it can be trusted to size reusable buffers.
When srcSize==0, LZ4F_compressBound() provides an upper bound for LZ4F_flush() and LZ4F_compressEnd() operations.
</p></pre><BR>
<pre><b>size_t LZ4F_compressUpdate(LZ4F_cctx* cctx,
void* dstBuffer, size_t dstCapacity,
const void* srcBuffer, size_t srcSize,
const LZ4F_compressOptions_t* cOptPtr);
</b><p> LZ4F_compressUpdate() can be called repeatedly to compress as much data as necessary.
Important rule: dstCapacity MUST be large enough to ensure operation success even in worst case situations.
This value is provided by LZ4F_compressBound().
If this condition is not respected, LZ4F_compressUpdate() will fail (result is an errorCode).
LZ4F_compressUpdate() doesn't guarantee error recovery.
When an error occurs, compression context must be freed or resized.
`cOptPtr` is optional : NULL can be provided, in which case all options are set to default.
@return : number of bytes written into `dstBuffer` (it can be zero, meaning input data was just buffered).
or an error code if it fails (which can be tested using LZ4F_isError())
</p></pre><BR>
<pre><b>size_t LZ4F_flush(LZ4F_cctx* cctx,
void* dstBuffer, size_t dstCapacity,
const LZ4F_compressOptions_t* cOptPtr);
</b><p> When data must be generated and sent immediately, without waiting for a block to be completely filled,
it's possible to call LZ4F_flush(). It will immediately compress any data buffered within cctx.
`dstCapacity` must be large enough to ensure the operation will be successful.
`cOptPtr` is optional : it's possible to provide NULL, all options will be set to default.
@return : nb of bytes written into dstBuffer (can be zero, when there is no data stored within cctx)
or an error code if it fails (which can be tested using LZ4F_isError())
</p></pre><BR>
<pre><b>size_t LZ4F_compressEnd(LZ4F_cctx* cctx,
void* dstBuffer, size_t dstCapacity,
const LZ4F_compressOptions_t* cOptPtr);
</b><p> To properly finish an LZ4 frame, invoke LZ4F_compressEnd().
It will flush whatever data remained within `cctx` (like LZ4F_flush())
and properly finalize the frame, with an endMark and a checksum.
`cOptPtr` is optional : NULL can be provided, in which case all options will be set to default.
@return : nb of bytes written into dstBuffer, necessarily >= 4 (endMark),
or an error code if it fails (which can be tested using LZ4F_isError())
A successful call to LZ4F_compressEnd() makes `cctx` available again for another compression task.
</p></pre><BR>
<a name="Chapter9"></a><h2>Decompression functions</h2><pre></pre>
<pre><b>typedef struct {
unsigned stableDst; </b>/* pledges that last 64KB decompressed data will remain available unmodified. This optimization skips storage operations in tmp buffers. */<b>
unsigned reserved[3]; </b>/* must be set to zero for forward compatibility */<b>
} LZ4F_decompressOptions_t;
</b></pre><BR>
<pre><b>LZ4F_errorCode_t LZ4F_createDecompressionContext(LZ4F_dctx** dctxPtr, unsigned version);
LZ4F_errorCode_t LZ4F_freeDecompressionContext(LZ4F_dctx* dctx);
</b><p> Create an LZ4F_dctx object, to track all decompression operations.
The version provided MUST be LZ4F_VERSION.
The function provides a pointer to an allocated and initialized LZ4F_dctx object.
The result is an errorCode, which can be tested using LZ4F_isError().
dctx memory can be released using LZ4F_freeDecompressionContext();
Result of LZ4F_freeDecompressionContext() indicates current state of decompressionContext when being released.
That is, it should be == 0 if decompression has been completed fully and correctly.
</p></pre><BR>
<a name="Chapter10"></a><h2>Streaming decompression functions</h2><pre></pre>
<pre><b>size_t LZ4F_getFrameInfo(LZ4F_dctx* dctx,
LZ4F_frameInfo_t* frameInfoPtr,
const void* srcBuffer, size_t* srcSizePtr);
</b><p> This function extracts frame parameters (max blockSize, dictID, etc.).
Its usage is optional.
Extracted information is typically useful for allocation and dictionary.
This function works in 2 situations :
- At the beginning of a new frame, in which case
it will decode information from `srcBuffer`, starting the decoding process.
Input size must be large enough to successfully decode the entire frame header.
Frame header size is variable, but is guaranteed to be <= LZ4F_HEADER_SIZE_MAX bytes.
It's allowed to provide more input data than this minimum.
- After decoding has been started.
In which case, no input is read, frame parameters are extracted from dctx.
- If decoding has barely started, but not yet extracted information from header,
LZ4F_getFrameInfo() will fail.
The number of bytes consumed from srcBuffer will be updated within *srcSizePtr (necessarily <= original value).
Decompression must resume from (srcBuffer + *srcSizePtr).
@return : a hint about how many srcSize bytes LZ4F_decompress() expects for next call,
or an error code which can be tested using LZ4F_isError().
note 1 : in case of error, dctx is not modified. Decoding operation can resume from beginning safely.
note 2 : frame parameters are *copied into* an already allocated LZ4F_frameInfo_t structure.
</p></pre><BR>
<pre><b>size_t LZ4F_decompress(LZ4F_dctx* dctx,
void* dstBuffer, size_t* dstSizePtr,
const void* srcBuffer, size_t* srcSizePtr,
const LZ4F_decompressOptions_t* dOptPtr);
</b><p> Call this function repeatedly to regenerate the original data from the compressed content of `srcBuffer`.
The function will read up to *srcSizePtr bytes from srcBuffer,
and decompress data into dstBuffer, of capacity *dstSizePtr.
The nb of bytes consumed from srcBuffer will be written into *srcSizePtr (necessarily <= original value).
The nb of bytes decompressed into dstBuffer will be written into *dstSizePtr (necessarily <= original value).
The function does not necessarily read all input bytes, so always check value in *srcSizePtr.
Unconsumed source data must be presented again in subsequent invocations.
`dstBuffer` can freely change between each consecutive function invocation.
`dstBuffer` content will be overwritten.
@return : a hint of how many `srcSize` bytes LZ4F_decompress() expects for next call.
Schematically, it's the size of the current (or remaining) compressed block + header of next block.
Respecting the hint provides some small speed benefit, because it skips intermediate buffers.
This is just a hint though, it's always possible to provide any srcSize.
When a frame is fully decoded, @return will be 0 (no more data expected).
When provided with more bytes than necessary to decode a frame,
LZ4F_decompress() will stop reading exactly at end of current frame, and @return 0.
If decompression failed, @return is an error code, which can be tested using LZ4F_isError().
After a decompression error, the `dctx` context is not resumable.
Use LZ4F_resetDecompressionContext() to return to clean state.
After a frame is fully decoded, dctx can be used again to decompress another frame.
</p></pre><BR>
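<p>The in/out size convention described above (capacity in, bytes used out) leads to a characteristic driver loop. The sketch below shows that loop against a hypothetical step function with the same calling convention; `demo_step_fn`, `demo_drain` and the toy `demo_copy_step` are illustrative names, not part of lz4frame.h, and error codes are left out for brevity.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical step function with the same in/out size convention as
 * LZ4F_decompress(): on entry *dstSize is the capacity and *srcSize the
 * available input; on exit they hold bytes written and bytes consumed.
 * Returns 0 when finished, otherwise a non-zero hint. */
typedef size_t (*demo_step_fn)(void* state, void* dst, size_t* dstSize,
                               const void* src, size_t* srcSize);

/* Feeds 'src' to 'step' until the input is exhausted or 'step' reports
 * completion, appending output to 'out'. Returns total bytes produced.
 * A real decoder loop would also keep calling while output is pending. */
static size_t demo_drain(demo_step_fn step, void* state,
                         const unsigned char* src, size_t srcSize,
                         unsigned char* out, size_t outCapacity)
{
    unsigned char chunk[8];                 /* deliberately tiny staging buffer */
    size_t srcPos = 0, outPos = 0, hint = 1;
    while (hint != 0 && srcPos < srcSize) {
        size_t dstLen = sizeof chunk;       /* in: capacity, out: bytes produced */
        size_t srcLen = srcSize - srcPos;   /* in: available, out: bytes consumed */
        hint = step(state, chunk, &dstLen, src + srcPos, &srcLen);
        if (dstLen > outCapacity - outPos) break;   /* caller's buffer is full */
        memcpy(out + outPos, chunk, dstLen);
        outPos += dstLen;
        srcPos += srcLen;                   /* unconsumed input is presented again */
        if (dstLen == 0 && srcLen == 0) break;      /* no progress: stop spinning */
    }
    return outPos;
}

/* Toy stand-in for a decoder: copies at most 3 bytes per call. */
static size_t demo_copy_step(void* state, void* dst, size_t* dstSize,
                             const void* src, size_t* srcSize)
{
    size_t n = (*srcSize < 3) ? *srcSize : 3;
    size_t remaining;
    (void)state;
    if (n > *dstSize) n = *dstSize;
    memcpy(dst, src, n);
    remaining = *srcSize - n;
    *dstSize = n;       /* bytes written */
    *srcSize = n;       /* bytes consumed */
    return remaining;   /* 0 once everything has been consumed */
}
```

<p>The point of the pattern is that the caller never assumes the whole input was consumed: it advances by the reported count and presents the remainder again on the next call.</p>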
<pre><b>void LZ4F_resetDecompressionContext(LZ4F_dctx* dctx); </b>/* always successful */<b>
</b><p> In case of an error, the context is left in "undefined" state.
In which case, it's necessary to reset it, before re-using it.
This method can also be used to abruptly stop any unfinished decompression,
and start a new one using same context resources.
</p></pre><BR>
<pre><b>typedef enum { LZ4F_LIST_ERRORS(LZ4F_GENERATE_ENUM) } LZ4F_errorCodes;
</b></pre><BR>
<a name="Chapter11"></a><h2>Bulk processing dictionary API</h2><pre></pre>
<pre><b>LZ4FLIB_STATIC_API LZ4F_CDict* LZ4F_createCDict(const void* dictBuffer, size_t dictSize);
LZ4FLIB_STATIC_API void LZ4F_freeCDict(LZ4F_CDict* CDict);
</b><p> When compressing multiple messages / blocks with the same dictionary, it's recommended to load it just once.
LZ4F_createCDict() will create a digested dictionary, ready to start future compression operations without startup delay.
LZ4F_CDict can be created once and shared by multiple threads concurrently, since its usage is read-only.
`dictBuffer` can be released after LZ4F_CDict creation, since its content is copied within CDict.
</p></pre><BR>
<pre><b>LZ4FLIB_STATIC_API size_t LZ4F_compressFrame_usingCDict(
LZ4F_cctx* cctx,
void* dst, size_t dstCapacity,
const void* src, size_t srcSize,
const LZ4F_CDict* cdict,
const LZ4F_preferences_t* preferencesPtr);
</b><p> Compress an entire srcBuffer into a valid LZ4 frame using a digested Dictionary.
cctx must point to a context created by LZ4F_createCompressionContext().
If cdict==NULL, compress without a dictionary.
dstCapacity MUST be >= LZ4F_compressFrameBound(srcSize, preferencesPtr).
If this condition is not respected, function will fail (@return an errorCode).
The LZ4F_preferences_t structure is optional : you may provide NULL as argument,
but it's not recommended, as it's the only way to provide dictID in the frame header.
@return : number of bytes written into dstBuffer.
or an error code if it fails (can be tested using LZ4F_isError())
</p></pre><BR>
<pre><b>LZ4FLIB_STATIC_API size_t LZ4F_compressBegin_usingCDict(
LZ4F_cctx* cctx,
void* dstBuffer, size_t dstCapacity,
const LZ4F_CDict* cdict,
const LZ4F_preferences_t* prefsPtr);
</b><p> Inits streaming dictionary compression, and writes the frame header into dstBuffer.
dstCapacity must be >= LZ4F_HEADER_SIZE_MAX bytes.
`prefsPtr` is optional : you may provide NULL as argument,
however, it's the only way to provide dictID in the frame header.
@return : number of bytes written into dstBuffer for the header,
or an error code (which can be tested using LZ4F_isError())
</p></pre><BR>
<pre><b>LZ4FLIB_STATIC_API size_t LZ4F_decompress_usingDict(
LZ4F_dctx* dctxPtr,
void* dstBuffer, size_t* dstSizePtr,
const void* srcBuffer, size_t* srcSizePtr,
const void* dict, size_t dictSize,
const LZ4F_decompressOptions_t* decompressOptionsPtr);
</b><p> Same as LZ4F_decompress(), using a predefined dictionary.
Dictionary is used "in place", without any preprocessing.
It must remain accessible throughout the entire frame decoding.
</p></pre><BR>
</body>
</html>