-
Notifications
You must be signed in to change notification settings - Fork 1.9k
zfs-2.4.0-rc4 patch set #17930
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Draft
behlendorf
wants to merge
37
commits into
openzfs:zfs-2.4-release
Choose a base branch
from
behlendorf:zfs-2.4.0-rc4-wip
base: zfs-2.4-release
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Draft
zfs-2.4.0-rc4 patch set #17930
+1,386
−1,059
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
zfs-tests.sh executes test-runner.py to do the actual test work. Any exit code < 4 is interpreted as success, with the actual value describing the outcome of the tests inside. If a Python program crashes in some way (eg an uncaught exception), the process exit code is 1. Taken together, this means that test-runner.py can crash during setup, but return a "success" error code to zfs-tests.sh, which will report and exit 0. This in turn causes the CI runner to believe the test run completed successfully. This commit addresses this by making zfs-tests.sh interpret an exit code of 255 as a failure in the runner itself. Then, in test-runner.py, the "fail()" function defaults to a 255 return, and the main function gets wrapped in a generic exception handler, which prints it and calls fail(). All together, this should mean that any unexpected failure in the test runner itself will be propagated out of zfs-tests.sh for CI or any other calling program to deal with. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Tony Hutter <[email protected]> Signed-off-by: Rob Norris <[email protected]> Closes openzfs#17858
We’re not always on the same page, but at least we’re in the same book. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: George Melikov <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Signed-off-by: Rob Norris <[email protected]> Closes openzfs#17860
In zio_crypt_key_wrap and zio_crypt_key_unwrap, the cuio_s variable was not initialized before the calls to zfs_uio_init, leading to uninitialized access to cuio_s.uio_offset. Initialize it to avoid gcc warnings. Similar issue as fixed in 2bf1520 ("Fix gcc uninitialized warning in FreeBSD zio_crypt.c") Signed-off-by: Ryan Libby <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Closes openzfs#17863
functional/trim tests do create pools of different types to test trim, autotrim_config.ksh is missing the type from zpool create command line while we are looping over different pool types. Sponsored-by: Edgecast Cloud LLC. Signed-off-by: Toomas Soome <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Closes openzfs#17874
Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Quartz <[email protected]> Closes openzfs#17868
Over the time many of DMU functions got flags argument to control prefetch, caching, etc. Few functions though left without it, even though closer look shown that many of them do not require prefetch due to their access pattern. This patch adds the flags argument to dmu_write(), dmu_buf_hold_array() and dmu_buf_hold_array_by_bonus(), passing DMU_READ_NO_PREFETCH where applicable. I am going to also pass DMU_UNCACHEDIO to some of them later. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Rob Norris <[email protected]> Signed-off-by: Alexander Motin <[email protected]> Closes openzfs#17872
Remove the out of date helper scripts originally used to port Illumos commits to the ZoL repository. Due to layout changes made to this repository they're no longer entirely correct. Remove them to make it clear they're no longer being used or actively maintained. Reviewed-by: Rob Norris <[email protected]> Reviewed-by: George Melikov <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes openzfs#17880
Ultimately this is a revert of 779ac93, which according to @nabijaczleweli is to paper over automake <1.14's lack of %reldir% support. As I understand it, EL8 is the lowest current build target. Reviewed-by: Rob Norris <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Joseph Holsten <[email protected]> Closes openzfs#17878
Since we set bv_mos_brtvdev block size, and since we keep dirty bitmap at the same granularity, we should keep the allocations and writes done with. Otherwise it makes the last block write short, that will be odd once we implement writing of only dirty blocks, but also requires read-modify-write on DMU layer. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Alexander Motin <[email protected]> Closes openzfs#17875
Both DDT log and BRT counters we read on pool import and then only append or overwrite in full blocks. We don't need them in DMU or ARC caches. Fortunately we have DMU_UNCACHEDIO for this now. Even more we don't need BRT in non-evictable metadata DMU caches, since it will likely never fit there, while block the cache from its original users. Since DMU_OT_IS_METADATA_CACHED() has no way to differentiate the new metadata types, mark BRT with storage type of DMU_OT_DDT_ZAP. As side effect it will also put it on dedup device, but that should actually be right. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Alexander Motin <[email protected]> Closes openzfs#17875
In cases where all issued ZIOs must succeed, and we can't do anything clever about the errors, we should just explicitly set ZIO_FLAG_TRYHARD and let OS to do all the reasonable retries. In other cases, where retries can be different from the original, for example, some ZIOs are allowed to fail due to redundancy, or we can disable aggregation on retrial to get at least some of the data, we can do first pass without TRYHARD, and only if needed retry with ZIO_FLAG_IO_RETRY (which implies TRYHARD semantics). Reviewed-by: Rob Norris <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Alexander Motin <[email protected]> Closes openzfs#17877
Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: George Melikov <[email protected]> Signed-off-by: Sivesh Kumar <[email protected]> Closes openzfs#17889
Currently this function uses L0 offsets which: 1. is hard to read since it maps offsets to blkid and back each call 2. necessitates dnode_next_block to handle edge cases at limits 3. makes it hard to tell if the traversal can loop infinitely Instead, update this and dnode_next_offset to work in (blkid, index). This way the blkid manipulations are clear, and it's also clear that the traversal always terminates since blkid goes one direction. I've also considered updating dnode_next_offset to operate on blkid. Callers use both patterns, so maybe another PR can split the cases? While here tidy up dnode_next_offset_level comments. Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Robert Evans <[email protected]> Closes openzfs#17792
The label 'kfdok' is only used with O_TMPFILE, we need to use the same #ifdef around this label. Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Toomas Soome <[email protected]> Closes openzfs#17894
Update FreeBSD versions: - add FreeBSD 15.0-STABLE - add FreeBSD 16.0-CURRENT So we use the latest versions of each line now: - Freebsd 14.3 (RELEASE) - FreeBSD 15.0 (STABLE) - FreeBSD 16.0 (CURRENT) In commits - you may specify which type of CI should run: - ZFS-CI-Type: quick - ZFS-CI-Type: linux - ZFS-CI-Type: freebsd - ZFS-CI-Type: full Reviewed-by: Alexx Saver <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Tino Reichardt <[email protected]> Closes openzfs#17896
When a write comes in via dmu_sync_late_arrival, its txg is equal to the open TXG. If that write gangs, and we have not yet activated the new gang header feature, and the gang header we pick can store a larger gang header, we will try to schedule the upgrade for the open TXG + 1. In debug mode, this causes an assertion to trip. This PR sets the TXG for activating the feature to be the larger of either the current open TXG or the syncing TXG + 1. Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Paul Dagnelie <[email protected]> Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Closes openzfs#17824
We need to specifically use the FX_XFLAG_* macros in zpl_ioctl_*attr() codepaths, and the FS_*_FL macros in the zpl_ioctl_*flags() codepaths. The earlier code just assumes the FS_*_FL macros for both codepaths. The 6.17 kernel add a bitmask check in copy_fsxattr_from_user() that exposed this error via failing 'projectquota' ZTS tests. Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Tony Hutter <[email protected]> Closes openzfs#17884 Closes openzfs#17869
Add an introductory sentance explaining why the reader may want to use this command, and establishing the requirement that the jail must be running. Move other requirements from the description of the subcommands to follow this for flow and structure. Move the caveat that this is for FreeBSD down to a cannonical CAVEATS section, and crossreference Linux's equivelant functionality. Mention that this utility can not be used to delegate the root directory of the jail to that section also. Reported by: Jan Brankamp <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Allan Jude <[email protected]> Signed-off-by: Alexander Ziaee <[email protected]> Closes openzfs#17883
We've heard anecdotes that suggest some confusion/surprise/disappointment that a changed recordsize is not applied during rewrite. Until such time as we actually can do that, we can at least explicitly mention it at something that doesn't work. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Allan Jude <[email protected]> Signed-off-by: Rob Norris <[email protected]> Closes openzfs#17898
We have infinite loop and on certain condition, we exit this loop and thread with pthread_exit(). But also after this loop, we have a code to perform pthread_cleanup_pop() and return from the thread. The problem is that modern compilers are able to recognize that we actually never get to the statements after loop and therefore it is dead code there. I think, instead of pthread_exit(), it is better to break out of loop and let the last statements to work as intended. This is because we do need to keep pthread_cleanup_pop() anyhow. Of course, it is matter of taste if we want to use return or pthread_exit as very last statement in this function. Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Igor Kozhukhov <[email protected]> Signed-off-by: Toomas Soome <[email protected]> Closes openzfs#17900
Change the spelling of "begining" on line 4875 to "beginning". Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Adi Gollamudi <[email protected]> Closes openzfs#17905
Disable the aarch64 NEON SIMD intrinsics for kernel builds. Safely using them in the kernel context requires saving/restoring the FPU registers which is not currently done. Additionally, remove the aarch64 optimized PREFETCH_L1 and PREFETCH_L2 instruction. Rely on the more portable compiler built ins. This lets us remove the problematic workaround in the aarch64_compat.h header which undefines the __aarch64__ macro. Reviewed-by: Rob Norris <[email protected]> Reviewed-by: Tino Reichardt <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes openzfs#17904 Closes openzfs#17852
FreeBSD now has a pathconf name called _PC_CASE_INSENSITIVE used to check if a file system performs case insensitive name lookups. This patch adds support for this name. Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Rick Macklem <[email protected]> Closes openzfs#17908
Free issue threads might block waiting for synchronous DDT, BRT or GANG header reads. So unlike other taskqs using ZTI_SCALE to scale with number of CPUs, here we also need some amount of threads to potentially saturate pool reads. I am not sure we always want the 96 threads we had before ZTI_SCALE introduction at openzfs#11966 on small systems, but lets make it at least 32. While here, make free taskqs configurable, similar to read and write ones. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Rob Norris <[email protected]> Signed-off-by: Alexander Motin <[email protected]> Closes openzfs#17903
Update description of zpool import --rewind-to-checkpoint in man/man7/zpoolconcepts.7 to explain that rewinding automatically discards a checkpoint. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Adi Gollamudi <[email protected]> Closes openzfs#12646 Closes openzfs#17918
BRT_RANGESIZE_TO_NBLOCKS() takes number of ranges as its argument. To get number of blocks we should multiply it by the entry size, not divide by it, as it was due to missing parentheses. Before openzfs#17875 this could cause small memory corruptions for vdevs bigger than 64TB, but the change made the bug more noticeable. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Alexander Motin <[email protected]> Closes openzfs#17886 Closes openzfs#17915
This is useful as debugging support, as it lets namespace lock operations be traced directly. It will also be useful for future work to reduce the use of spa_namespace_lock, traditionally a source of difficult deadlocks. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Tony Hutter <[email protected]> Signed-off-by: Rob Norris <[email protected]> Closes openzfs#17906
dmu_object_info_from_dnode() takes two locks and copies plenty of data that we don't need in zap_lockdir_impl(). Just read dn_type directly in this hot path. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Alexander Motin <[email protected]> Closes openzfs#17921
According to my observations, BRT ZAPs are typically compressible 3:1 for data and 2:1 for indirects. With ashift=12, typical these days, it means increasing the block sizes to 8KB we may get most of possible compression, reducing on-disk and in-ARC BRT footprint in half by the cost of some compression/decompression overhead, but without real write inflation, only some dirty data increase. Increase to 32KB similar to DDT could further increase compression and storage efficiency, but at the cost of write inflation and much bigger dirty data increase, which we can not properly control now. So lets leave this for a time when BRT log gets implemented. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Alexander Motin <[email protected]> Closes openzfs#17916
Implement BRT (Block Reference Table) prefetch functionality similar to existing DDT prefetch. This allows preloading BRT metadata into ARC to improve performance for block cloning operations and frees of earlier cloned blocks. Make -t parameter optional. When omitted, prefetch all supported metadata types (both DDT and BRT now). Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Alexander Motin <[email protected]> Closes openzfs#17890
Introduce a new vdev property `VDEV_PROP_SLOW_IO_REPORTING` that allows users to disable notifications for slow devices. This prevents ZED and/or ZFSD from degrading the pool due to slow I/O. Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Mariusz Zaborski <[email protected]> Closes 17477
Detect container environments and set timeout to zero unless ZFS_MODULE_TIMEOUT is already set. This avoids an unnecessary ten second delay after running zfs/zpool commands in a container where /dev/zfs is unavailable. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Adi Gollamudi <[email protected]> Closes openzfs#15165 Closes openzfs#17922
The nvlist_snprintf() function was added to the ABI of libnvpair. No other symbols were modified or removed. Bump the library-info SONAME current and age args to reflect this is a minor library version update. Reviewed-by: Rob Norris <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes openzfs#17911
The ABI of libzfs and libzpool have breaking changes since the last major release. Bump the SONAME for the upcoming 2.4 release branch to libzfs7 and libzpool7. Reviewed-by: Rob Norris <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes openzfs#17911
The zfs_tunable_* functions are a public interface which are part of the internal libspl convenience library. They should be hidden to prevent an unnecessary ABI change in installed libraries which link against libspl (e.g. libzfs_core, libuutil). We do already leak long standing libspl symbols. This commit is solely intended to prevent leaking these new ones until this is properly sorted out. Reviewed-by: Rob Norris <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes openzfs#17911
Refresh all ABI files using the CI generated files to reflect the library interfaces to be published for the 2.4 release. Reviewed-by: Rob Norris <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes openzfs#17911
Signed-off-by: Brian Behlendorf <[email protected]>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
Status: Code Review Needed
Ready for review and testing
Status: Work in Progress
Not yet ready for general review
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Motivation and Context
Proposed patch set for zfs-2.4.0-rc3
Description
This includes important bug fixes, doc improvements, performance tuning, and CI updates landed in master since zfs-2.4.0-rc3.
Note that the
libzfsandlibzpoollibraries have had their soname bumped from 6 to 7 in preparation for the final 2.4.0 release. External consumers of these libraries will need to recompile their applications.How Has This Been Tested?
Will be tested by the CI. Additionally, all of these changes were clean cherry-pick from the master branch where they were previously tested.
Checklist:
Signed-off-by.