Add "zpool status -vv" #17502

Open: asomers wants to merge 2 commits into master

@asomers asomers commented Jul 1, 2025

Specifying the verbose flag twice will display a list of all corrupt sectors within each corrupt file, as opposed to just the name of the file.

Signed-off-by: Alan Somers [email protected]
Sponsored by: ConnectWise

Motivation and Context

Displays the record number of every corrupt record in every corrupt file. I find this very useful when cleaning up the fallout from #16626.

Description

The kernel already tracks the blkid of every corrupt record, and already transmits that information to userland. But libzfs has always thrown it away, until now. This PR adds a -vv option to zpool status. When used, it will print the level and blkid of every corrupt record. It works in combination with -j, too.
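For readers who want to turn a reported level and blkid into a byte range within the file, here is a rough sketch. It is not part of this PR. It assumes the file uses a 128 KiB recordsize, 128 KiB indirect blocks (indblkshift of 17), and 128-byte block pointers; check the actual values for your dataset before relying on it. The function name `record_byte_range` is hypothetical.

```python
def record_byte_range(level, blkid, recordsize=128 * 1024, indblkshift=17):
    """Return the half-open byte range [start, end) covered by a block.

    Assumptions (not taken from the PR): 128-byte block pointers, and
    every indirect block is 2**indblkshift bytes.
    """
    blkptr_shift = 7  # assumed: log2 of the 128-byte block pointer size
    ptrs_per_indirect = 1 << (indblkshift - blkptr_shift)
    # A level-0 block spans one record; each level above multiplies the
    # span by the number of block pointers held in an indirect block.
    span = recordsize * ptrs_per_indirect ** level
    return blkid * span, (blkid + 1) * span


# "L0 record 3" with the assumed 128 KiB recordsize
start, end = record_byte_range(0, 3)
print(start, end)
```

The computed end can run past the end of the file, so clamp it to the file size when reading the affected range.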

How Has This Been Tested?

Manually tested on about half a dozen production datasets that had on-disk corruption as a result of #16626, in both L0 and L1 blocks.
Manually tested on a test dataset that I intentionally corrupted. That one had multiple corrupted records on multiple files.

Example output, in human readable mode:

...
errors: Permanent errors have been detected in the following files:

        /testpool/randfile.7 L0 record 3
        /testpool/randfile.7 L0 record 9
        /testpool/randfile.7 L0 record 16
        /testpool/randfile.9 L0 record 8
        /testpool/randfile.9 L0 record 15
        /testpool/randfile.10 L0 record 3
        /testpool/randfile.10 L0 record 11
        /testpool/randfile.5 L0 record 17
        /testpool/randfile.8 L0 record 11
        /testpool/randfile.8 L0 record 19
        /testpool/randfile.6 L0 record 3
        /testpool/randfile.6 L0 record 12
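
Each line in the listing above has the form "path L<level> record <number>". A script that scrapes the human-readable output could split it as sketched below. This is only an illustration based on the sample lines shown here; the helper name `parse_error_line` is hypothetical, and the JSON mode is likely the more robust interface for scripting.

```python
import re

# Matches lines such as "        /testpool/randfile.7 L0 record 3".
# The path is matched greedily so that paths containing spaces still work.
ERR_LINE = re.compile(
    r"^\s*(?P<path>.+) L(?P<level>\d+) record (?P<record>\d+)\s*$"
)


def parse_error_line(line):
    """Return (path, level, record), or None if the line does not match."""
    m = ERR_LINE.match(line)
    if m is None:
        return None
    return m.group("path"), int(m.group("level")), int(m.group("record"))


print(parse_error_line("        /testpool/randfile.7 L0 record 3"))
```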

Example output, in JSON mode:

{
  "output_version": {
    "command": "zpool status",
    "vers_major": 0,
    "vers_minor": 1
  },
  "pools": {
    "testpool": {
      "name": "testpool",
      "state": "ONLINE",
      "pool_guid": "10305967396160717712",
      "txg": "1523",
      "spa_version": "5000",
      "zpl_version": "5",
      "status": "One or more devices has experienced an error resulting in data\n\tcorruption.  Applications may be affected.\n",
      "action": "Restore the file in question if possible.  Otherwise restore the\n\tentire pool from backup.\n",
      "msgid": "ZFS-8000-8A",
      "moreinfo": "https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A",
      "scan_stats": {
        "function": "SCRUB",
        "state": "FINISHED",
        "start_time": "Tue Jul  1 12:11:14 2025",
        "end_time": "Tue Jul  1 12:11:14 2025",
        "to_examine": "56.0M",
        "examined": "56.0M",
        "skipped": "92K",
        "processed": "0B",
        "errors": "12",
        "bytes_per_scan": "0B",
        "pass_start": "1751393474",
        "scrub_pause": "-",
        "scrub_spent_paused": "0",
        "issued_bytes_per_scan": "55.9M",
        "issued": "55.9M"
      },
      "vdevs": {
        "testpool": {
          "name": "testpool",
          "vdev_type": "root",
          "guid": "10305967396160717712",
          "class": "normal",
          "state": "ONLINE",
          "alloc_space": "56.0M",
          "total_space": "112M",
          "def_space": "112M",
          "read_errors": "0",
          "write_errors": "0",
          "checksum_errors": "0",
          "vdevs": {
            "/tmp/zfs.img": {
              "name": "/tmp/zfs.img",
              "vdev_type": "file",
              "guid": "1719526601577822810",
              "path": "/tmp/zfs.img",
              "class": "normal",
              "state": "ONLINE",
              "alloc_space": "56.0M",
              "total_space": "112M",
              "def_space": "112M",
              "rep_dev_size": "116M",
              "self_healed": "1.50K",
              "phys_space": "128M",
              "read_errors": "0",
              "write_errors": "0",
              "checksum_errors": "27",
              "slow_ios": "0"
            }
          }
        }
      },
      "error_count": "12",
      "errlist": [
        {
          "path": "/testpool/randfile.7",
          "level": 0,
          "record": 3
        },
        {
          "path": "/testpool/randfile.7",
          "level": 0,
          "record": 9
        },
        {
          "path": "/testpool/randfile.7",
          "level": 0,
          "record": 16
        },
        {
          "path": "/testpool/randfile.9",
          "level": 0,
          "record": 8
        },
        {
          "path": "/testpool/randfile.9",
          "level": 0,
          "record": 15
        },
        {
          "path": "/testpool/randfile.10",
          "level": 0,
          "record": 3
        },
        {
          "path": "/testpool/randfile.10",
          "level": 0,
          "record": 11
        },
        {
          "path": "/testpool/randfile.5",
          "level": 0,
          "record": 17
        },
        {
          "path": "/testpool/randfile.8",
          "level": 0,
          "record": 11
        },
        {
          "path": "/testpool/randfile.8",
          "level": 0,
          "record": 19
        },
        {
          "path": "/testpool/randfile.6",
          "level": 0,
          "record": 3
        },
        {
          "path": "/testpool/randfile.6",
          "level": 0,
          "record": 12
        }
      ]
    }
  }
}
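
As a sketch of how the "errlist" array could be consumed, the snippet below groups the corrupt records by file. It relies only on the field names visible in the sample above ("pools", "errlist", "path", "level", "record"). The function name `corrupt_records_by_file` is hypothetical, and the JSON text is passed in as a string rather than obtained by running zpool.

```python
import json
from collections import defaultdict


def corrupt_records_by_file(status_json, pool):
    """Map each corrupt file path to a list of (level, record) tuples."""
    status = json.loads(status_json)
    grouped = defaultdict(list)
    # A pool without errors may have no "errlist" key at all (assumption).
    for entry in status["pools"][pool].get("errlist", []):
        grouped[entry["path"]].append((entry["level"], entry["record"]))
    return dict(grouped)


# Trimmed-down version of the sample output above
sample = """{"pools": {"testpool": {"errlist": [
  {"path": "/testpool/randfile.7", "level": 0, "record": 3},
  {"path": "/testpool/randfile.7", "level": 0, "record": 9},
  {"path": "/testpool/randfile.5", "level": 0, "record": 17}
]}}}"""

print(corrupt_records_by_file(sample, "testpool"))
```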

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Quality assurance (non-breaking change which makes the code more robust against bugs)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

asomers added 2 commits July 1, 2025 12:18
Specifying the verbose flag twice will display a list of all corrupt
sectors within each corrupt file, as opposed to just the name of the
file.

Signed-off-by:	Alan Somers <[email protected]>
Sponsored by:	ConnectWise

Signed-off-by: Alan Somers <[email protected]>
@gamanakis
Contributor

On a first pass it looks good to me, thanks! I'm not really sure why the checks are failing, though. Could you squash and re-push?

@asomers
Contributor Author

asomers commented Jul 20, 2025

Well, I think the "checkstyle" check is failing because I didn't update libzfs.abi. But I can find no instructions for how to do that. @ixhamza you were the last to do it. Could you please tell me how to update libzfs.abi due to a function prototype change?

@gmelikov
Member

gmelikov commented Jul 20, 2025

In addition to the ABI, I see:

cmd/zpool/zpool_main.c: In function ‘errors_nvlist’:
cmd/zpool/zpool_main.c:9590:41: error: ‘errnvl’ may be used uninitialized [-Werror=maybe-uninitialized]
 9590 |                                         fnvlist_add_nvlist_array(item,
      |                                         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 9591 |                                             "errlist",
      |                                             ~~~~~~~~~~
 9592 |                                             (const nvlist_t **)errnvl,
      |                                             ~~~~~~~~~~~~~~~~~~~~~~~~~~
 9593 |                                             count);
      |                                             ~~~~~~
cmd/zpool/zpool_main.c:9532:44: note: ‘errnvl’ was declared here
 9532 |                                 nvlist_t **errnvl;
      |                                            ^~~~~~
cc1: all warnings being treated as errors

You can get the new ABI file here: https://github.com/openzfs/zfs/actions/runs/16008660282 (see the artifact; direct link: https://github.com/openzfs/zfs/actions/runs/16008660282/artifacts/3443942581r )
