Usage
hdf5plugin allows using additional HDF5 compression filters with h5py for reading and writing compressed datasets.
Read compressed datasets
To read compressed datasets with h5py, use:
import hdf5plugin
This registers the compression filters supported by hdf5plugin with the HDF5 library used by h5py.
Hence, compressed HDF5 datasets can be read like any other dataset (see the h5py documentation).
Write compressed datasets
As with reading compressed datasets, import hdf5plugin is required to enable the supported compression filters.
To create a compressed dataset, use h5py.Group.create_dataset and set the compression and compression_opts arguments.
hdf5plugin provides helpers to prepare those compression options: Bitshuffle, Blosc, BZip2, FciDecomp, LZ4, SZ, Zfp, Zstd.
Sample code:
import numpy
import h5py
import hdf5plugin
# Compression
f = h5py.File('test.h5', 'w')
f.create_dataset('data', data=numpy.arange(100), **hdf5plugin.LZ4())
f.close()
# Decompression
f = h5py.File('test.h5', 'r')
data = f['data'][()]
f.close()
Relevant h5py documentation: Filter pipeline and Chunked Storage.
Get information about hdf5plugin
Constants:
- hdf5plugin.PLUGIN_PATH: Directory where the provided HDF5 filter plugins are stored.
Functions:
Manage registered filters
When imported, hdf5plugin initialises and registers the filters it embeds, unless a filter is already registered for the corresponding filter ID.
h5py gives access to HDF5 functions handling registered filters in h5py.h5z. This module allows checking the filter availability and registering/unregistering filters.
hdf5plugin provides an extra register function to register the filters it provides, e.g., to override already loaded filters. Registering with this function is required to perform additional initialisation and to enable writing compressed data with the given filter.
Use HDF5 filters in other applications
Users of applications other than h5py, including non-Python applications, can also benefit from the supplied HDF5 compression filters for reading compressed datasets by setting the HDF5_PLUGIN_PATH environment variable to the value of hdf5plugin.PLUGIN_PATH, which can be retrieved from the command line with:
python -c "import hdf5plugin; print(hdf5plugin.PLUGIN_PATH)"
For instance:
export HDF5_PLUGIN_PATH=$(python -c "import hdf5plugin; print(hdf5plugin.PLUGIN_PATH)")
should allow MATLAB or IDL users to read data compressed using the supported plugins.
Setting the HDF5_PLUGIN_PATH environment variable allows already existing programs or Python code to read compressed data without any modification.
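When the existing program is started from Python, the variable can be set for the child process only. The sketch below uses a hypothetical helper, environment_with_plugins, and a placeholder directory; in practice the directory would be hdf5plugin.PLUGIN_PATH, and the child command would be the unmodified program rather than the small check used here.

```python
import os
import subprocess
import sys


def environment_with_plugins(plugin_path, base_env=None):
    """Return a copy of the environment with HDF5_PLUGIN_PATH set to plugin_path.

    Hypothetical helper written for this example; it is not part of hdf5plugin.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["HDF5_PLUGIN_PATH"] = plugin_path
    return env


# Placeholder directory; pass hdf5plugin.PLUGIN_PATH in real use.
env = environment_with_plugins("/path/to/hdf5plugin/plugins")

# The child process stands in for an existing program: it only reports the
# value it sees, to show that the variable reaches the child unchanged.
result = subprocess.run(
    [sys.executable, "-c", "import os; print(os.environ['HDF5_PLUGIN_PATH'])"],
    env=env,
    capture_output=True,
    text=True,
    check=True,
)
child_value = result.stdout.strip()
print(child_value)
```

Setting the variable on a copy of the environment leaves the parent process untouched, which avoids side effects on other code running in the same interpreter.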