-
Notifications
You must be signed in to change notification settings - Fork 68
Use go slices #49
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Use go slices #49
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looked through this and it looks good to me 👍 (altho, no expert on gozstd)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great. Big simplification using go buffers and using pooled buffers 👍 💯
|
||
func TestDecompressTooLarge(t *testing.T) { | ||
src := []byte{40, 181, 47, 253, 228, 122, 118, 105, 67, 140, 234, 85, 20, 159, 67} | ||
_, err := Decompress(nil, src) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From the test name I gather that the error here is that the decompressed size is too larger for the dst buf (nil)? It would be a bit easier to read if the dst buf was non-nil, like maybe 1 byte or something
zr.sizes.dstPos = 0 | ||
|
||
inHdr := (*reflect.SliceHeader)(unsafe.Pointer(&zr.inBuf)) | ||
outHdr := (*reflect.SliceHeader)(unsafe.Pointer(&dst)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you use the noescape() trick in compressInternal(), but not for dst here... I guess there is no chance stack allocated buffers could be used in this context?
- Replace uintptr_t with void* in C wrapper functions - Use reflect.SliceHeader to access Go slice data pointers directly - Add zstdIsError helper function for cleaner error checking - Remove unnecessary stdint.h include These changes improve performance by 5-7% for large buffer operations by avoiding pointer indirection and simplifying the CGO interface. Based on valyala#49
…ng codebase Integrated community contributions: - PR valyala#49: CGO wrapper improvements for 5-7% performance gain on large buffers - Use void* instead of uintptr_t to avoid memory allocations - Direct Go slice usage via reflect.SliceHeader - PR valyala#25: Advanced Compression API with checksum support - Added CCtx type for advanced compression contexts - Added SetParameter/GetParameter methods - Added Reset and Compress2 methods - Full support for all ZSTD compression parameters - PR valyala#63: Exposed CompressDictLevel as public API - Allows fine-grained control over dictionary compression levels - PR valyala#66: RISC-V 64-bit architecture support - Updated Zig builder to 0.13.0 - Added linux_riscv64 target - PR valyala#60: Memory-optimized dictionary functions - Added NewCDictByRef/NewDDictByRef to avoid data copying - Reduces memory usage for large dictionaries Infrastructure improvements: - Created modern Dockerfile with Alpine Linux and latest Zig - Fixed build process issues with clean target - Updated minimum Go version to 1.24 Code organization: - Moved Docker configs to build/docker/ - Moved scripts to scripts/ - Moved upstream zstd to contrib/ - Moved test data to test/ - Created comprehensive examples in examples/ - Kept all Go source files in root for package compatibility Testing enhancements: - Added Silesia Corpus compression tests with speed measurements - Created 33 aggressive fuzz tests targeting known vulnerabilities - Added comprehensive tests for Advanced API - Added benchmarks comparing raw zstd vs wrapper performance The wrapper now shows 6-10% performance improvements for compression while maintaining identical compression ratios.
Fixes #48, #33
I've rewritten how the CGO wrapper is done to achieve two things:
Here's benchmark results against master (run on M1 using go 1.18):
CPU time
Throughput
Now you will notice that many results (particularly ones working with tiny buffers) are reporting being up to 20% slower, turns out this is because the CGO pointers checks are now taking significant amount of time, then again we're talking a few nanoseconds and this is completely negligible with larger buffers, so IMO this isn't that bad.
I've also made the Reader write directly into the provided buffer (if it's large enough), and those benchmarks show the biggest gain - about 5% faster when using large buffers. The ability to use the go slice directly could be also used in the Writer, but let's leave that for another PR.
I've also re-run the benchmarks with
GODEBUG=cgocheck=0
, and the results definitely look even better:Had to use a gist, github didn't like this long PR description - https://gist.github.com/mhr3/84f58f62353ef3b9db30288df00fa2b3