Commit fecf7dd

Merge pull request #44 from pycompression/release_0.5.0
Release 0.5.0
2 parents 8c99519 + e06c7c8 commit fecf7dd

16 files changed: 375 additions and 78 deletions

.github/workflows/ci.yml

Lines changed: 19 additions & 0 deletions
@@ -44,6 +44,25 @@ jobs:
         run: tox -e docs
         env:
           PYTHON_ISAL_LINK_DYNAMIC: True
+  mypy:
+    needs: lint
+    runs-on: ubuntu-20.04
+    steps:
+      - uses: actions/[email protected]
+        with:
+          submodules: recursive
+      - name: Set up Python 3.6
+        uses: actions/[email protected]
+        with:
+          python-version: 3.6
+      - name: Install isal
+        run: sudo apt-get install libisal-dev
+      - name: Install tox and upgrade setuptools and pip
+        run: pip install --upgrade tox setuptools pip
+      - name: Mypy checks
+        run: tox -e mypy
+        env:
+          PYTHON_ISAL_LINK_DYNAMIC: True
   twine_check:
     needs: lint
     runs-on: ${{ matrix.os }}

CHANGELOG.rst

Lines changed: 13 additions & 0 deletions
@@ -7,6 +7,19 @@ Changelog
 .. This document is user facing. Please word the changes in such a way
 .. that users understand how the changes affect the new version.

+version 0.5.0
+-----------------
++ Fix a bug where negative integers were not allowed for the ``adler32`` and
+  ``crc32`` functions in ``isal_zlib``.
++ Provided stubs (type-hint files) for ``isal_zlib`` and ``_isal`` modules.
+  Package is now tested with mypy to ensure correct type information.
++ The command-line interface now reads in blocks of 32K instead of 8K. This
+  improves performance by about 6% when compressing and 11% when decompressing.
+  A hidden ``-b`` flag was added to adjust the buffer size for benchmarks.
++ A ``-c`` or ``--stdout`` flag was added to the CLI interface of isal.igzip.
+  This allows it to behave more like the ``gzip`` or ``pigz`` command line
+  interfaces.
+
 version 0.4.0
 -----------------
 + Move wheel building to cibuildwheel on github actions CI. Wheels are now
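
The first 0.5.0 changelog entry concerns the start value passed to ``adler32`` and ``crc32``. As an illustration only (this is not taken from the python-isal source), one common way to accept negative integers is to reduce the value to an unsigned 32-bit integer before use. The sketch uses the stdlib ``zlib`` module so that it runs without isa-l installed; ``to_uint32`` and ``running_crc32`` are hypothetical helper names.

```python
import zlib


def to_uint32(value: int) -> int:
    """Map any Python int onto the unsigned 32-bit range (two's complement)."""
    return value & 0xFFFFFFFF


def running_crc32(chunks, start: int = 0) -> int:
    """Compute a CRC-32 over several chunks, accepting a negative start value."""
    crc = to_uint32(start)
    for chunk in chunks:
        # Feed the previous checksum back in to continue the running CRC.
        crc = zlib.crc32(chunk, crc)
    return crc


# -1 and 0xFFFFFFFF describe the same 32-bit pattern, so both give one result.
same = running_crc32([b"hello ", b"world"], -1) == running_crc32(
    [b"hello ", b"world"], 0xFFFFFFFF)
```

Masking first means callers that carry checksums around as signed integers do not need to convert them by hand.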

README.rst

Lines changed: 27 additions & 29 deletions
@@ -43,6 +43,29 @@ a variety of functions to provide zlib/gzip-compatible compression.
 ``igzip`` module which are usable as drop-in replacements for the ``zlib``
 and ``gzip`` modules from the stdlib (with some minor exceptions, see below).

+Usage
+-----
+
+Python-isal has faster versions of the stdlib's ``zlib`` and ``gzip`` module
+these are called ``isal_zlib`` and ``igzip`` respectively.
+
+They can be imported as follows
+
+.. code-block:: python
+
+    from isal import isal_zlib
+    from isal import igzip
+
+``isal_zlib`` and ``igzip`` are meant to be used as drop in replacements so
+their api and functions are the same as the stdlib's modules. Except where
+isa-l does not support the same calls as zlib (See differences below).
+
+A full API documentation can be found on `our readthedocs page
+<https://python-isal.readthedocs.io>`_.
+
+``python -m isal.igzip`` implements a simple gzip-like command line
+application (just like ``python -m gzip``).
+
 Installation
 ------------
 Installation with pip
@@ -99,39 +122,13 @@ python-isal is available on conda-forge and can be installed with
 This will automatically install the isa-l library dependency as well, since
 it is available on conda-forge.

-
-Usage
------
-
-Python-isal has faster versions of the stdlib's ``zlib`` and ``gzip`` module
-these are called ``isal_zlib`` and ``igzip`` respectively.
-
-They can be imported as follows
-
-.. code-block:: python
-
-    from isal import isal_zlib
-    from isal import igzip
-
-``isal_zlib`` and ``igzip`` were meant to be used as drop in replacements so
-their api and functions are the same as the stdlib's modules. Except where
-isa-l does not support the same calls as zlib (See differences below).
-
-A full API documentation can be found on `our readthedocs page
-<https://python-isal.readthedocs.io>`_.
-
-``python -m isal.igzip`` implements a simple gzip-like command line
-application (just like ``python -m gzip``).
-
 Differences with zlib and gzip modules
 --------------------------------------

 + Compression level 0 in ``zlib`` and ``gzip`` means **no compression**, while
   in ``isal_zlib`` and ``igzip`` this is the **lowest compression level**.
   This is a design choice that was inherited from the ISA-L library.
 + Compression levels range from 0 to 3, not 1 to 9.
-+ ``isal_zlib.crc32`` and ``isal_zlib.adler32`` do not support negative
-  numbers for the value parameter.
 + ``zlib.Z_DEFAULT_STRATEGY``, ``zlib.Z_RLE`` etc. are exposed as
   ``isal_zlib.Z_DEFAULT_STRATEGY``, ``isal_zlib.Z_RLE`` etc. for compatibility
   reasons. However, ``isal_zlib`` only supports a default strategy and will
@@ -140,13 +137,14 @@ Differences with zlib and gzip modules
   ``isal_zlib`` supports memory levels smallest, small, medium, large and
   largest. These have been mapped to levels 1, 2-3, 4-6, 7-8 and 9. So
   ``isal_zlib`` can be used with zlib compatible memory levels.
-+ ``isal_zlib`` only supports ``FLUSH``, ``SYNC_FLUSH`` and ``FULL_FLUSH``
-  ``FINISH`` is aliased to ``FULL_FLUSH`` (and works correctly as such).
 + ``isal_zlib`` has a ``compressobj`` and ``decompressobj`` implementation.
   However, the unused_data and unconsumed_tail for the Decompress object, only
   work properly when using gzip compatible compression. (25 <= wbits <= 31).
 + The flush implementation for the Compress object behaves differently from
-  the zlib equivalent.
+  the zlib equivalent. The flush implementation is sufficient for
+  the ``igzip`` module to work 100% in compliance with the ``gzip`` tests from
+  CPython. It does not however work for all the ``zlib`` compliance tests
+  (see above). This is an area that still needs work.

 Contributing
 ------------
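
The README's Usage section describes ``isal_zlib`` and ``igzip`` as drop-in replacements for ``zlib`` and ``gzip``. A minimal sketch of what that can look like in calling code follows. The fallback to the stdlib modules and the alias names ``gzip_impl`` and ``zlib_impl`` are assumptions of this example, not something the README prescribes.

```python
try:
    # Preferred: the accelerated modules provided by python-isal.
    from isal import igzip as gzip_impl
    from isal import isal_zlib as zlib_impl
except ImportError:
    # Fallback (an assumption of this sketch): use the stdlib modules.
    import gzip as gzip_impl
    import zlib as zlib_impl

payload = b"python-isal drop-in example " * 100

# One-shot gzip round trip through whichever module was imported.
gz_blob = gzip_impl.compress(payload)
restored = gzip_impl.decompress(gz_blob)

# One-shot zlib-format round trip.
z_blob = zlib_impl.compress(payload)
z_restored = zlib_impl.decompress(z_blob)
```

Compression levels are deliberately left at their defaults, because the README notes that the level ranges differ between the two families of modules (0 to 3 for isa-l).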

codecov.yml

Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
+comment: off
+coverage:
+  status:
+    project:
+      default:
+        target: 90 # let's try to hit high standards
+    patch:
+      default:
+        target: 90 # Tests should be written for new features

setup.py

Lines changed: 2 additions & 2 deletions
@@ -114,7 +114,7 @@ def build_isa_l():

 setup(
     name="isal",
-    version="0.4.0",
+    version="0.5.0",
     description="Faster zlib and gzip compatible compression and "
                 "decompression by providing python bindings for the isa-l "
                 "library.",
@@ -128,7 +128,7 @@ def build_isa_l():
     zip_safe=False,
     packages=find_packages('src'),
     package_dir={'': 'src'},
-    package_data={'isal': ['*.pxd', '*.pyx',
+    package_data={'isal': ['*.pxd', '*.pyx', '*.pyi', 'py.typed',
                            # Include isa-l LICENSE and other relevant files
                            # with the binary distribution.
                            'isa-l/LICENSE', 'isa-l/README.md',

src/isal/__init__.py

Lines changed: 1 addition & 1 deletion
@@ -29,4 +29,4 @@
     "__version__"
 ]

-__version__ = "0.4.0"
+__version__ = "0.5.0"

src/isal/_isal.pyi

Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
+# Copyright (c) 2020 Leiden University Medical Center
+#
+# Permission is hereby granted, free of charge, to any person obtaining a copy
+# of this software and associated documentation files (the "Software"), to deal
+# in the Software without restriction, including without limitation the rights
+# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+# copies of the Software, and to permit persons to whom the Software is
+# furnished to do so, subject to the following conditions:
+#
+# The above copyright notice and this permission notice shall be included in
+# all copies or substantial portions of the Software.
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+# SOFTWARE.
+
+ISAL_MAJOR_VERSION: int
+ISAL_MINOR_VERSION: int
+ISAL_PATCH_VERSION: int
+ISAL_VERSION: str

src/isal/igzip.py

Lines changed: 37 additions & 27 deletions
@@ -27,8 +27,6 @@
 import os
 import sys

-import _compression
-
 from . import isal_zlib

 __all__ = ["IGzipFile", "open", "compress", "decompress", "BadGzipFile"]
@@ -37,10 +35,8 @@
 _COMPRESS_LEVEL_TRADEOFF = isal_zlib.ISAL_DEFAULT_COMPRESSION
 _COMPRESS_LEVEL_BEST = isal_zlib.ISAL_BEST_COMPRESSION

-BUFFER_SIZE = _compression.BUFFER_SIZE
-
 try:
-    BadGzipFile = gzip.BadGzipFile
+    BadGzipFile = gzip.BadGzipFile  # type: ignore
 except AttributeError:  # Versions lower than 3.8 do not have BadGzipFile
     BadGzipFile = OSError

@@ -80,7 +76,8 @@ def open(filename, mode="rb", compresslevel=_COMPRESS_LEVEL_TRADEOFF,
             raise ValueError("Argument 'newline' not supported in binary mode")

     gz_mode = mode.replace("t", "")
-    if isinstance(filename, (str, bytes, os.PathLike)):
+    # __fspath__ method is os.PathLike
+    if isinstance(filename, (str, bytes)) or hasattr(filename, "__fspath__"):
         binary_file = IGzipFile(filename, gz_mode, compresslevel)
     elif hasattr(filename, "read") or hasattr(filename, "write"):
         binary_file = IGzipFile(None, gz_mode, compresslevel, filename)
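
The hunk above replaces the ``isinstance(..., os.PathLike)`` test with a check for the ``__fspath__`` method. The sketch below shows that duck-typed check in isolation. ``is_path_like`` and ``HypotheticalPath`` are names invented for this example and do not appear in python-isal.

```python
import io
import os
import pathlib


def is_path_like(obj) -> bool:
    """Duck-typed path check: str, bytes, or anything with __fspath__."""
    return isinstance(obj, (str, bytes)) or hasattr(obj, "__fspath__")


class HypotheticalPath:
    """Illustrative class that implements the os.PathLike protocol."""

    def __init__(self, location: str) -> None:
        self._location = location

    def __fspath__(self) -> str:
        return self._location


candidates = ["reads.gz", b"reads.gz", pathlib.Path("reads.gz"),
              HypotheticalPath("reads.gz"), io.BytesIO()]
# File objects such as BytesIO have read/write but no __fspath__.
results = [is_path_like(candidate) for candidate in candidates]
```

Any object exposing ``__fspath__`` can also be passed to ``os.fspath`` to obtain the underlying ``str`` or ``bytes`` path.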
@@ -276,33 +273,46 @@ def main():
         "-d", "--decompress", action="store_false",
         dest="compress",
         help="Decompress the file instead of compressing.")
+    parser.add_argument("-c", "--stdout", action="store_true",
+                        help="write on standard output")
+    # -b flag not taken by either gzip or igzip. Hidden attribute. Above 32K
+    # diminishing returns hit. _compression.BUFFER_SIZE = 8k. But 32K is about
+    # ~6% faster.
+    parser.add_argument("-b", "--buffer-size",
+                        default=32 * 1024, type=int,
+                        help=argparse.SUPPRESS)
     args = parser.parse_args()

     compresslevel = args.compresslevel or _COMPRESS_LEVEL_TRADEOFF

-    if args.file is None:
-        if args.compress:
-            in_file = sys.stdin.buffer
-            out_file = IGzipFile(mode="wb", compresslevel=compresslevel,
-                                 fileobj=sys.stdout.buffer)
-        else:
-            in_file = IGzipFile(mode="rb", fileobj=sys.stdin.buffer)
-            out_file = sys.stdout.buffer
-    else:
-        if args.compress:
-            in_file = io.open(args.file, mode="rb")
-            out_file = open(args.file + ".gz", mode="wb",
-                            compresslevel=compresslevel)
-        else:
-            base, extension = os.path.splitext(args.file)
-            if extension != ".gz":
-                print(f"filename doesn't end in .gz: {args.file}")
-                return
-            in_file = open(args.file, "rb")
-            out_file = io.open(base, "wb")
+    # Determine input file
+    if args.compress and args.file is None:
+        in_file = sys.stdin.buffer
+    elif args.compress and args.file is not None:
+        in_file = io.open(args.file, mode="rb")
+    elif not args.compress and args.file is None:
+        in_file = IGzipFile(mode="rb", fileobj=sys.stdin.buffer)
+    elif not args.compress and args.file is not None:
+        base, extension = os.path.splitext(args.file)
+        if extension != ".gz":
+            raise ValueError(f"filename doesn't end in .gz: {args.file}. ")
+        in_file = open(args.file, "rb")
+
+    # Determine output file
+    if args.compress and (args.file is None or args.stdout):
+        out_file = IGzipFile(mode="wb", compresslevel=compresslevel,
+                             fileobj=sys.stdout.buffer)
+    elif args.compress and args.file is not None:
+        out_file = open(args.file + ".gz", mode="wb",
+                        compresslevel=compresslevel)
+    elif not args.compress and (args.file is None or args.stdout):
+        out_file = sys.stdout.buffer
+    elif not args.compress and args.file is not None:
+        out_file = io.open(base, "wb")
+
     try:
         while True:
-            block = in_file.read(BUFFER_SIZE)
+            block = in_file.read(args.buffer_size)
             if block == b"":
                 break
             out_file.write(block)
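
The ``main()`` changes combine a hidden ``-b``/``--buffer-size`` option with a block-wise copy loop. A self-contained sketch of that pattern follows. It is not the project's ``main()``: it uses the stdlib ``gzip`` module in place of ``IGzipFile``, and ``build_parser`` and ``copy_in_blocks`` are helper names assumed for this example.

```python
import argparse
import gzip
import io


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Sketch of a gzip-like CLI")
    parser.add_argument("file", nargs="?")
    parser.add_argument("-c", "--stdout", action="store_true",
                        help="write on standard output")
    # help=argparse.SUPPRESS keeps the option out of the --help text.
    parser.add_argument("-b", "--buffer-size", default=32 * 1024, type=int,
                        help=argparse.SUPPRESS)
    return parser


def copy_in_blocks(in_file, out_file, buffer_size: int) -> int:
    """Copy in_file to out_file block by block; return the bytes copied."""
    copied = 0
    while True:
        block = in_file.read(buffer_size)
        if block == b"":  # an empty read signals end of input
            break
        out_file.write(block)
        copied += len(block)
    return copied


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(["-c", "-b", "4096", "reads.txt"])

raw = b"ACGT" * 25000  # 100,000 bytes of stand-in input
sink = io.BytesIO()
with gzip.GzipFile(mode="wb", fileobj=sink) as gz_out:
    copied = copy_in_blocks(io.BytesIO(raw), gz_out, args.buffer_size)
```

Exposing the block size as an option makes it possible to benchmark different sizes without editing the module, which is the purpose the changelog gives for the hidden flag.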

src/isal/isal_zlib.pyi

Lines changed: 75 additions & 0 deletions
@@ -0,0 +1,75 @@
+# Copyright (c) 2020 Leiden University Medical Center
+#
+# Permission is hereby granted, free of charge, to any person obtaining a copy
+# of this software and associated documentation files (the "Software"), to deal
+# in the Software without restriction, including without limitation the rights
+# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+# copies of the Software, and to permit persons to whom the Software is
+# furnished to do so, subject to the following conditions:
+#
+# The above copyright notice and this permission notice shall be included in
+# all copies or substantial portions of the Software.
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+# SOFTWARE.
+
+ISAL_BEST_SPEED: int
+ISAL_BEST_COMPRESSION: int
+ISAL_DEFAULT_COMPRESSION: int
+Z_BEST_SPEED: int
+Z_BEST_COMPRESSION: int
+Z_DEFAULT_COMPRESSION: int
+
+DEF_BUF_SIZE: int
+DEF_MEM_LEVEL: int
+MAX_WBITS: int
+ISAL_DEFAULT_HIST_BITS: int
+
+DEFLATED: int
+
+Z_DEFAULT_STRATEGY: int
+Z_RLE: int
+Z_HUFFMAN_ONLY: int
+Z_FILTERED: int
+Z_FIXED: int
+
+ISAL_NO_FLUSH: int
+ISAL_SYNC_FLUSH: int
+ISAL_FULL_FLUSH: int
+
+Z_NO_FLUSH: int
+Z_SYNC_FLUSH: int
+Z_FINISH: int
+
+class IsalError(OSError): ...
+
+error: IsalError
+
+def adler32(data, value: int = ...) -> int: ...
+def crc32(data, value: int = ...) -> int: ...
+
+def compress(data, level: int = ..., wbits: int = ...) -> bytes: ...
+def decompress(data, wbits: int = ..., bufsize: int = ...) -> bytes: ...
+
+class Compress:
+    def compress(self, data) -> bytes: ...
+    def flush(self, mode: int = ...) -> bytes: ...
+
+class Decompress:
+    unused_data: bytes
+    unconsumed_tail: bytes
+    eof: bool
+    crc: int
+
+    def decompress(self, data, max_length: int = ...) -> bytes: ...
+    def flush(self, length: int = ...) -> bytes: ...
+
+def compressobj(level: int = ..., method: int = ..., wbits: int = ...,
+                memLevel: int = ..., strategy: int = ..., zdict = ...
+                ) -> Compress: ...
+def decompressobj(wbits: int = ..., zdict = ...) -> Decompress: ...
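
The stub above declares the streaming interface (``compressobj``, ``decompressobj``, ``Compress.flush``, ``Decompress.eof``). A short sketch of how those pieces are typically used together follows. The stdlib fallback, the alias ``zlib_impl`` and the constant ``GZIP_WBITS`` are assumptions made for this example, and gzip-compatible ``wbits`` is chosen because the README states that ``unused_data`` and ``unconsumed_tail`` only work properly in that range.

```python
try:
    from isal import isal_zlib as zlib_impl
except ImportError:
    import zlib as zlib_impl  # fallback assumed for this sketch

GZIP_WBITS = 31  # inside the gzip-compatible range 25 <= wbits <= 31

chunks = [b"first chunk, ", b"second chunk, ", b"third chunk"]

# Compress incrementally, then flush to finish the stream.
compressor = zlib_impl.compressobj(wbits=GZIP_WBITS)
stream = b"".join(compressor.compress(chunk) for chunk in chunks)
stream += compressor.flush()

# Decompress the finished stream and collect any buffered remainder.
decompressor = zlib_impl.decompressobj(wbits=GZIP_WBITS)
output = decompressor.decompress(stream) + decompressor.flush()
```

Because the stub annotates the return types as ``bytes``, a type checker such as mypy can verify the concatenations above.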
