lz4.stream sub-package

Warning

This module is unmaintained.

This sub-package is considered experimental. It was submitted by a community member who is not able to continue to maintain the module.

This module is not built as part of the distributed wheels. If you wish to build and use this module you will need to download and build from source with the environment variable PYLZ4_EXPERIMENTAL set to TRUE.

The module needs some rewriting, and the tests need extensive work, before it can be considered production ready. If you are interested in working on this, please reach out to the package maintainers.

This sub-package provides the capability to compress and decompress data using the LZ4 stream specification, in particular the variant of it that is based on a double buffer.

Because the LZ4 stream format does not define a container format, the Python bindings will by default insert the compressed data size as an integer at the start of each compressed block. The width of this size field is set, in bytes, with the store_comp_size argument (4 by default). Setting it to 0 omits the field, in which case the block sizes must be recorded out of band.
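The size-prefix framing described above can be illustrated without the bindings. The following is a minimal pure-Python sketch, not the library's implementation: the helper names frame_block and split_first_block are hypothetical, and the little-endian byte order of the size field is an assumption that should be checked against the module source.

```python
from typing import Tuple


def frame_block(block: bytes, size_field_len: int = 4) -> bytes:
    """Prefix a compressed block with its length (hypothetical helper)."""
    if size_field_len == 0:
        # Out-of-band configuration: no size field is written.
        return bytes(block)
    # Assumption: the size field is little-endian.
    # int.to_bytes raises OverflowError if the length does not fit the field.
    return len(block).to_bytes(size_field_len, "little") + bytes(block)


def split_first_block(stream: bytes, size_field_len: int = 4) -> Tuple[bytes, int]:
    """Return (first block, number of bytes consumed) from a framed stream."""
    if size_field_len == 0:
        raise ValueError("block sizes are recorded out of band")
    if len(stream) < size_field_len:
        raise BufferError("size field is incomplete")
    size = int.from_bytes(stream[:size_field_len], "little")
    end = size_field_len + size
    if len(stream) < end:
        raise BufferError("block is incomplete")
    return bytes(stream[size_field_len:end]), end
```

With a 2-byte size field, a block can be no longer than 65535 bytes, so a narrower field saves a little space per block at the cost of a lower limit on the compressed block size.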

So far, only the double-buffer based approach is implemented.

Example usage

Using the LZ4 stream format bindings is straightforward:

>>> from lz4.stream import LZ4StreamCompressor, LZ4StreamDecompressor
>>> import os
>>> block_size_length = 2 # LZ4 compressed block size stored on 2 bytes
>>> page_size = 8192 # LZ4 context double buffer page size
>>> origin_stream = 10 * 1024 * os.urandom(1024) # 10MiB
>>> # LZ4 stream compression of origin_stream into compressed_stream:
>>> compressed_stream = bytearray()
>>> with LZ4StreamCompressor("double_buffer", page_size, store_comp_size=block_size_length) as proc:
...     offset = 0
...     while offset < len(origin_stream):
...         chunk = origin_stream[offset:offset + page_size]
...         block = proc.compress(chunk)
...         compressed_stream.extend(block)
...         offset += page_size
>>> # LZ4 stream decompression of compressed_stream into decompressed_stream:
>>> decompressed_stream = bytearray()
>>> with LZ4StreamDecompressor("double_buffer", page_size, store_comp_size=block_size_length) as proc:
...     offset = 0
...     while offset < len(compressed_stream):
...         block = proc.get_block(compressed_stream[offset:])
...         chunk = proc.decompress(block)
...         decompressed_stream.extend(chunk)
...         offset += block_size_length + len(block)
>>> decompressed_stream == origin_stream
True

Out-of-band block size record example

>>> from lz4.stream import LZ4StreamCompressor, LZ4StreamDecompressor, LZ4StreamError
>>> import os
>>> page_size = 8192 # LZ4 context double buffer page size
>>> out_of_band_block_sizes = [] # Store the block sizes
>>> origin_stream = 10 * 1024 * os.urandom(1024) # 10MiB
>>> # LZ4 stream compression of origin_stream into compressed_stream:
>>> compressed_stream = bytearray()
>>> with LZ4StreamCompressor("double_buffer", page_size, store_comp_size=0) as proc:
...     offset = 0
...     while offset < len(origin_stream):
...         chunk = origin_stream[offset:offset + page_size]
...         block = proc.compress(chunk)
...         out_of_band_block_sizes.append(len(block))
...         compressed_stream.extend(block)
...         offset += page_size
>>> # LZ4 stream decompression of compressed_stream into decompressed_stream:
>>> decompressed_stream = bytearray()
>>> with LZ4StreamDecompressor("double_buffer", page_size, store_comp_size=0) as proc:
...     offset = 0
...     for block_len in out_of_band_block_sizes:
...         # Sanity check:
...         if offset >= len(compressed_stream):
...             raise LZ4StreamError("Truncated stream")
...         block = compressed_stream[offset:offset + block_len]
...         chunk = proc.decompress(block)
...         decompressed_stream.extend(chunk)
...         offset += block_len
>>> decompressed_stream == origin_stream
True

Contents

A Python wrapper for the LZ4 stream protocol.

class lz4.stream.LZ4StreamCompressor(strategy, buffer_size, mode='default', acceleration=True, compression_level=9, return_bytearray=False, store_comp_size=4, dictionary='')

LZ4 stream compressing context.

__enter__()

Enter the LZ4 stream context.

__exit__(exc_type, exc, exc_tb)

Exit the LZ4 stream context.

compress(chunk)

Stream compress given chunk of data.

Compress the given chunk, using the given LZ4 stream context, returning the compressed data as a bytearray or as a bytes object.

Parameters:

chunk (str, bytes or buffer-compatible object) – Data to compress

Returns:

Compressed data.

Return type:

bytes or bytearray

Raises:
  • Exceptions occurring during compression.

  • OverflowError – raised if the source is too large to be compressed in the given context.

  • LZ4StreamError – raised if the call to the LZ4 library fails.
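To stay clear of the OverflowError case, the examples above feed compress() slices of page_size bytes. A small helper can make that slicing explicit. This is only a sketch: iter_chunks is a hypothetical name, and the assumption that a chunk of at most buffer_size bytes always fits the context is taken from the examples rather than from a documented guarantee.

```python
def iter_chunks(data, chunk_size):
    """Yield successive slices of a bytes-like object (hypothetical helper).

    Slices are memoryview objects, so no data is copied; compress() is
    documented to accept buffer-compatible objects.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    view = memoryview(data)
    for offset in range(0, len(view), chunk_size):
        # The last slice may be shorter than chunk_size.
        yield view[offset:offset + chunk_size]
```

Inside a compression context, the loop body then reduces to calling proc.compress(chunk) for each chunk in iter_chunks(origin_stream, page_size).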

class lz4.stream.LZ4StreamDecompressor(strategy, buffer_size, return_bytearray=False, store_comp_size=4, dictionary='')

LZ4 stream decompression context.

__enter__()

Enter the LZ4 stream context.

__exit__(exc_type, exc, exc_tb)

Exit the LZ4 stream context.

decompress(chunk)

Decompress streamed compressed data.

Decompress the given chunk, using the given LZ4 stream context, returning the decompressed data as a bytearray or as a bytes object. Raises an exception if any error occurs.

Parameters:

chunk (str, bytes or buffer-compatible object) – Data to decompress

Returns:

Decompressed data.

Return type:

bytes or bytearray

Raises:
  • Exceptions occurring during decompression.

  • ValueError – raised if the source is inconsistent with a finite LZ4 stream block chain.

  • MemoryError – raised if the work output buffer cannot be allocated.

  • OverflowError – raised if the source is too large to be decompressed in the given context.

  • LZ4StreamError – raised if the call to the LZ4 library fails. This can be caused by decompressed_size being too small, or invalid data.

get_block(stream)

Return the first LZ4 compressed block from stream.

Parameters:

stream (str, bytes or buffer-compatible object) – LZ4 compressed stream.

Returns:

LZ4 compressed data block.

Return type:

bytes or bytearray

Raises:
  • Exceptions occurring while getting the first block from stream.

  • BufferError – raised if the function cannot return a complete LZ4 compressed block from the stream (i.e. the stream does not hold a complete block).

  • MemoryError – raised if the output buffer cannot be allocated.

  • OverflowError – raised if the source is too large to be handled by the given context.

  • LZ4StreamError – raised if used while in an out-of-band block size record configuration.
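When compressed data arrives piecemeal (for example from a socket or a file read in fixed-size pieces), the BufferError raised by get_block() can serve as a signal to wait for more input. The following sketch shows one way to do this. The name iter_decompressed is hypothetical, proc is assumed to be an entered LZ4StreamDecompressor created with a non-zero store_comp_size, and it is assumed that BufferError is also raised when even the size field itself is incomplete.

```python
def iter_decompressed(proc, pieces, size_field_len):
    """Yield decompressed chunks as complete blocks become available.

    proc           -- object exposing get_block() and decompress()
    pieces         -- iterable of bytes-like fragments of the compressed stream
    size_field_len -- value passed as store_comp_size (must be non-zero)
    """
    pending = bytearray()
    for piece in pieces:
        pending.extend(piece)
        while pending:
            try:
                block = proc.get_block(pending)
            except BufferError:
                # Not enough data for a complete block yet: wait for more.
                break
            yield proc.decompress(block)
            # Drop the size field and the block that was just consumed.
            del pending[:size_field_len + len(block)]
    if pending:
        raise ValueError("compressed stream ends with an incomplete block")
```

Blocks are decompressed in the order in which they were produced, which the stream context requires because each block may refer to data from earlier ones.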