Skip to content

Commit

Permalink
Update contrib/libs/zstd to 1.5.7
Browse files Browse the repository at this point in the history
Release highlights are:

* The compression speed for small data blocks has been notably (~10%) improved at fast compression levels.
* The `--patch-from` functionality of the zstd CLI ... had its speed reduced to uncomfortable levels. v1.5.7 largely mitigates the speed impact of high compression levels 18+
* The compression ratio has been enhanced slightly for large data across all compression levels,
commit_hash:c66283dd802e1fd38484f7179848006c58f2baa4
  • Loading branch information
robot-piglet committed Feb 21, 2025
1 parent 03d24c0 commit 0f416bc
Show file tree
Hide file tree
Showing 79 changed files with 3,183 additions and 1,914 deletions.
30 changes: 28 additions & 2 deletions contrib/libs/zstd/CHANGELOG
Original file line number Diff line number Diff line change
@@ -1,9 +1,35 @@
V1.5.7 (Feb 2025)
fix: compression bug in 32-bit mode associated with long-lasting sessions
api: new method `ZSTD_compressSequencesAndLiterals()` (#4217, #4232)
api: `ZSTD_getFrameHeader()` works on skippable frames (#4228)
perf: substantial compression speed improvements (up to +30%) on small data, by @TocarIP (#4144) and @cyan4973 (#4165)
perf: improved compression speed (~+5%) for dictionary compression at low levels (#4170)
perf: much faster speed for `--patch-from` at high compression levels (#4276)
perf: higher `--patch-from` compression ratios, notably at high levels (#4288)
perf: better speed for binaries on Windows (@pps83) and when compiled with Visual Studio (@MessyHack)
perf: slight compression ratio improvement thanks to better block boundaries (#4136, #4176, #4178)
perf: slight compression ratio improvement for `dfast`, aka levels 3 and 4 (#4171)
perf: runtime bmi2 detection enabled on x86 32-bit mode (#4251)
cli: multi-threading as default CLI setting, by @daniellerozenblit
cli: new `--max` command (#4290)
build: improve `msbuild` version autodetection, support VS2022, by @ManuelBlanc
build: fix `meson` build by @artem and @Victor-C-Zhang, and on Windows by @bgilbert
build: compatibility with Apple Framework, by @Treata11
build: improve icc/icx compatibility, by @josepho0918 and @luau-project
build: improve compatibility with Android NDK, by Adenilson Cavalcanti
portability: linux kernel branch, with improved support for Sequence producers (@embg, @gcabiddu, @cyan4973)
portability: improved qnx compatibility, suggested by @rainbowball
portability: improved install script for FreeBSD, by @sunpoet
portability: fixed test suite compatibility with gnu hurd, by @diegonc
doc: clarify specification, by @elasota
misc: improved tests/decodecorpus validation tool (#4102), by antmicro

V1.5.6 (Mar 2024)
api: Promote `ZSTD_c_targetCBlockSize` to Stable API by @felixhandte
api: new `ZSTD_d_maxBlockSize` experimental parameter, to reduce streaming decompression memory, by @terrelln
perf: improve performance of param `ZSTD_c_targetCBlockSize`, by @Cyan4973
perf: improved compression of arrays of integers at high compression, by @Cyan4973
lib: reduce binary size with selective built-time exclusion, by @felixhandte
lib: reduce binary size with selective build-time exclusion, by @felixhandte
lib: improved huffman speed on small data and linux kernel, by @terrelln
lib: accept dictionaries with partial literal tables, by @terrelln
lib: fix CCtx size estimation with external sequence producer, by @embg
Expand Down Expand Up @@ -489,7 +515,7 @@ misc: added /contrib/docker script by @gyscos

v1.3.3 (Dec 21, 2017)
perf: faster zstd_opt strategy (levels 16-19)
fix : bug #944 : multithreading with shared ditionary and large data, reported by @gsliepen
fix : bug #944 : multithreading with shared dictionary and large data, reported by @gsliepen
cli : fix : content size written in header by default
cli : fix : improved LZ4 format support, by @felixhandte
cli : new : hidden command `-S`, to benchmark multiple files while generating one result per file
Expand Down
2 changes: 1 addition & 1 deletion contrib/libs/zstd/CONTRIBUTING.md
Original file line number Diff line number Diff line change
Expand Up @@ -60,7 +60,7 @@ Our contribution process works in three main stages:
* Note: run local tests to ensure that your changes didn't break existing functionality
* Quick check
```
make shortest
make check
```
* Longer check
```
Expand Down
38 changes: 24 additions & 14 deletions contrib/libs/zstd/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -29,10 +29,10 @@ a list of known ports and bindings is provided on [Zstandard homepage](https://f
## Benchmarks

For reference, several fast compression algorithms were tested and compared
on a desktop running Ubuntu 20.04 (`Linux 5.11.0-41-generic`),
with a Core i7-9700K CPU @ 4.9GHz,
on a desktop featuring a Core i7-9700K CPU @ 4.9GHz
and running Ubuntu 20.04 (`Linux ubu20 5.15.0-101-generic`),
using [lzbench], an open-source in-memory benchmark by @inikep
compiled with [gcc] 9.3.0,
compiled with [gcc] 9.4.0,
on the [Silesia compression corpus].

[lzbench]: https://github.com/inikep/lzbench
Expand All @@ -41,24 +41,23 @@ on the [Silesia compression corpus].

| Compressor name | Ratio | Compression| Decompress.|
| --------------- | ------| -----------| ---------- |
| **zstd 1.5.1 -1** | 2.887 | 530 MB/s | 1700 MB/s |
| **zstd 1.5.6 -1** | 2.887 | 510 MB/s | 1580 MB/s |
| [zlib] 1.2.11 -1 | 2.743 | 95 MB/s | 400 MB/s |
| brotli 1.0.9 -0 | 2.702 | 395 MB/s | 450 MB/s |
| **zstd 1.5.1 --fast=1** | 2.437 | 600 MB/s | 2150 MB/s |
| **zstd 1.5.1 --fast=3** | 2.239 | 670 MB/s | 2250 MB/s |
| quicklz 1.5.0 -1 | 2.238 | 540 MB/s | 760 MB/s |
| **zstd 1.5.1 --fast=4** | 2.148 | 710 MB/s | 2300 MB/s |
| lzo1x 2.10 -1 | 2.106 | 660 MB/s | 845 MB/s |
| [lz4] 1.9.3 | 2.101 | 740 MB/s | 4500 MB/s |
| lzf 3.6 -1 | 2.077 | 410 MB/s | 830 MB/s |
| snappy 1.1.9 | 2.073 | 550 MB/s | 1750 MB/s |
| brotli 1.0.9 -0 | 2.702 | 395 MB/s | 430 MB/s |
| **zstd 1.5.6 --fast=1** | 2.437 | 545 MB/s | 1890 MB/s |
| **zstd 1.5.6 --fast=3** | 2.239 | 650 MB/s | 2000 MB/s |
| quicklz 1.5.0 -1 | 2.238 | 525 MB/s | 750 MB/s |
| lzo1x 2.10 -1 | 2.106 | 650 MB/s | 825 MB/s |
| [lz4] 1.9.4 | 2.101 | 700 MB/s | 4000 MB/s |
| lzf 3.6 -1 | 2.077 | 420 MB/s | 830 MB/s |
| snappy 1.1.9 | 2.073 | 530 MB/s | 1660 MB/s |

[zlib]: https://www.zlib.net/
[lz4]: https://lz4.github.io/lz4/

The negative compression levels, specified with `--fast=#`,
offer faster compression and decompression speed
at the cost of compression ratio (compared to level 1).
at the cost of compression ratio.

Zstd can also offer stronger compression ratios at the cost of compression speed.
Speed vs Compression trade-off is configurable by small increments.
Expand Down Expand Up @@ -185,6 +184,17 @@ You can build and install zstd [vcpkg](https://github.com/Microsoft/vcpkg/) depe
The zstd port in vcpkg is kept up to date by Microsoft team members and community contributors.
If the version is out of date, please [create an issue or pull request](https://github.com/Microsoft/vcpkg) on the vcpkg repository.

### Conan

You can install pre-built binaries for zstd or build it from source using [Conan](https://conan.io/). Use the following command:

```bash
conan install --requires="zstd/[*]" --build=missing
```

The zstd Conan recipe is kept up to date by Conan maintainers and community contributors.
If the version is out of date, please [create an issue or pull request](https://github.com/conan-io/conan-center-index) on the ConanCenterIndex repository.

### Visual Studio (Windows)

Going into `build` directory, you will find additional possibilities:
Expand Down
11 changes: 11 additions & 0 deletions contrib/libs/zstd/lib/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -27,12 +27,16 @@ Enabling multithreading requires 2 conditions :

For convenience, we provide a build target to generate multi and single threaded libraries:
- Force enable multithreading on both dynamic and static libraries by appending `-mt` to the target, e.g. `make lib-mt`.
Note that the `.pc` generated on calling `make lib-mt` will already include the require Libs and Cflags.
- Force disable multithreading on both dynamic and static libraries by appending `-nomt` to the target, e.g. `make lib-nomt`.
- By default, as mentioned before, dynamic library is multithreaded, and static library is single-threaded, e.g. `make lib`.

When linking a POSIX program with a multithreaded version of `libzstd`,
note that it's necessary to invoke the `-pthread` flag during link stage.

The `.pc` generated from `make install` or `make install-pc` always assume a single-threaded static library
is compiled. To correctly generate a `.pc` for the multi-threaded static library, set `MT=1` as ENV variable.

Multithreading capabilities are exposed
via the [advanced API defined in `lib/zstd.h`](https://github.com/facebook/zstd/blob/v1.4.3/lib/zstd.h#L351).

Expand Down Expand Up @@ -145,6 +149,13 @@ The file structure is designed to make this selection manually achievable for an
will expose the deprecated `ZSTDMT` API exposed by `zstdmt_compress.h` in
the shared library, which is now hidden by default.

- The build macro `STATIC_BMI2` can be set to 1 to force usage of `bmi2` instructions.
It is generally not necessary to set this build macro,
because `STATIC_BMI2` will be automatically set to 1
on detecting the presence of the corresponding instruction set in the compilation target.
It's nonetheless available as an optional manual toggle for better control,
and can also be used to forcefully disable `bmi2` instructions by setting it to 0.

- The build macro `DYNAMIC_BMI2` can be set to 1 or 0 in order to generate binaries
which can detect at runtime the presence of BMI2 instructions, and use them only if present.
These instructions contribute to better performance, notably on the decoder side.
Expand Down
179 changes: 92 additions & 87 deletions contrib/libs/zstd/lib/common/bits.h
Original file line number Diff line number Diff line change
Expand Up @@ -28,27 +28,29 @@ MEM_STATIC unsigned ZSTD_countTrailingZeros32_fallback(U32 val)
MEM_STATIC unsigned ZSTD_countTrailingZeros32(U32 val)
{
assert(val != 0);
# if defined(_MSC_VER)
# if STATIC_BMI2 == 1
return (unsigned)_tzcnt_u32(val);
# else
if (val != 0) {
unsigned long r;
_BitScanForward(&r, val);
return (unsigned)r;
} else {
/* Should not reach this code path */
__assume(0);
}
# endif
# elif defined(__GNUC__) && (__GNUC__ >= 4)
return (unsigned)__builtin_ctz(val);
# else
return ZSTD_countTrailingZeros32_fallback(val);
# endif
#if defined(_MSC_VER)
# if STATIC_BMI2
return (unsigned)_tzcnt_u32(val);
# else
if (val != 0) {
unsigned long r;
_BitScanForward(&r, val);
return (unsigned)r;
} else {
__assume(0); /* Should not reach this code path */
}
# endif
#elif defined(__GNUC__) && (__GNUC__ >= 4)
return (unsigned)__builtin_ctz(val);
#elif defined(__ICCARM__)
return (unsigned)__builtin_ctz(val);
#else
return ZSTD_countTrailingZeros32_fallback(val);
#endif
}

MEM_STATIC unsigned ZSTD_countLeadingZeros32_fallback(U32 val) {
MEM_STATIC unsigned ZSTD_countLeadingZeros32_fallback(U32 val)
{
assert(val != 0);
{
static const U32 DeBruijnClz[32] = {0, 9, 1, 10, 13, 21, 2, 29,
Expand All @@ -67,86 +69,89 @@ MEM_STATIC unsigned ZSTD_countLeadingZeros32_fallback(U32 val) {
MEM_STATIC unsigned ZSTD_countLeadingZeros32(U32 val)
{
assert(val != 0);
# if defined(_MSC_VER)
# if STATIC_BMI2 == 1
return (unsigned)_lzcnt_u32(val);
# else
if (val != 0) {
unsigned long r;
_BitScanReverse(&r, val);
return (unsigned)(31 - r);
} else {
/* Should not reach this code path */
__assume(0);
}
# endif
# elif defined(__GNUC__) && (__GNUC__ >= 4)
return (unsigned)__builtin_clz(val);
# else
return ZSTD_countLeadingZeros32_fallback(val);
# endif
#if defined(_MSC_VER)
# if STATIC_BMI2
return (unsigned)_lzcnt_u32(val);
# else
if (val != 0) {
unsigned long r;
_BitScanReverse(&r, val);
return (unsigned)(31 - r);
} else {
__assume(0); /* Should not reach this code path */
}
# endif
#elif defined(__GNUC__) && (__GNUC__ >= 4)
return (unsigned)__builtin_clz(val);
#elif defined(__ICCARM__)
return (unsigned)__builtin_clz(val);
#else
return ZSTD_countLeadingZeros32_fallback(val);
#endif
}

MEM_STATIC unsigned ZSTD_countTrailingZeros64(U64 val)
{
assert(val != 0);
# if defined(_MSC_VER) && defined(_WIN64)
# if STATIC_BMI2 == 1
return (unsigned)_tzcnt_u64(val);
# else
if (val != 0) {
unsigned long r;
_BitScanForward64(&r, val);
return (unsigned)r;
} else {
/* Should not reach this code path */
__assume(0);
}
# endif
# elif defined(__GNUC__) && (__GNUC__ >= 4) && defined(__LP64__)
return (unsigned)__builtin_ctzll(val);
# else
{
U32 mostSignificantWord = (U32)(val >> 32);
U32 leastSignificantWord = (U32)val;
if (leastSignificantWord == 0) {
return 32 + ZSTD_countTrailingZeros32(mostSignificantWord);
} else {
return ZSTD_countTrailingZeros32(leastSignificantWord);
}
#if defined(_MSC_VER) && defined(_WIN64)
# if STATIC_BMI2
return (unsigned)_tzcnt_u64(val);
# else
if (val != 0) {
unsigned long r;
_BitScanForward64(&r, val);
return (unsigned)r;
} else {
__assume(0); /* Should not reach this code path */
}
# endif
#elif defined(__GNUC__) && (__GNUC__ >= 4) && defined(__LP64__)
return (unsigned)__builtin_ctzll(val);
#elif defined(__ICCARM__)
return (unsigned)__builtin_ctzll(val);
#else
{
U32 mostSignificantWord = (U32)(val >> 32);
U32 leastSignificantWord = (U32)val;
if (leastSignificantWord == 0) {
return 32 + ZSTD_countTrailingZeros32(mostSignificantWord);
} else {
return ZSTD_countTrailingZeros32(leastSignificantWord);
}
# endif
}
#endif
}

MEM_STATIC unsigned ZSTD_countLeadingZeros64(U64 val)
{
assert(val != 0);
# if defined(_MSC_VER) && defined(_WIN64)
# if STATIC_BMI2 == 1
return (unsigned)_lzcnt_u64(val);
# else
if (val != 0) {
unsigned long r;
_BitScanReverse64(&r, val);
return (unsigned)(63 - r);
} else {
/* Should not reach this code path */
__assume(0);
}
# endif
# elif defined(__GNUC__) && (__GNUC__ >= 4)
return (unsigned)(__builtin_clzll(val));
# else
{
U32 mostSignificantWord = (U32)(val >> 32);
U32 leastSignificantWord = (U32)val;
if (mostSignificantWord == 0) {
return 32 + ZSTD_countLeadingZeros32(leastSignificantWord);
} else {
return ZSTD_countLeadingZeros32(mostSignificantWord);
}
#if defined(_MSC_VER) && defined(_WIN64)
# if STATIC_BMI2
return (unsigned)_lzcnt_u64(val);
# else
if (val != 0) {
unsigned long r;
_BitScanReverse64(&r, val);
return (unsigned)(63 - r);
} else {
__assume(0); /* Should not reach this code path */
}
# endif
#elif defined(__GNUC__) && (__GNUC__ >= 4)
return (unsigned)(__builtin_clzll(val));
#elif defined(__ICCARM__)
return (unsigned)(__builtin_clzll(val));
#else
{
U32 mostSignificantWord = (U32)(val >> 32);
U32 leastSignificantWord = (U32)val;
if (mostSignificantWord == 0) {
return 32 + ZSTD_countLeadingZeros32(leastSignificantWord);
} else {
return ZSTD_countLeadingZeros32(mostSignificantWord);
}
# endif
}
#endif
}

MEM_STATIC unsigned ZSTD_NbCommonBytes(size_t val)
Expand Down
Loading

0 comments on commit 0f416bc

Please sign in to comment.