
vdev_id: multi-lun disks & slot num zero pad #16603

Merged
merged 1 commit into openzfs:master from hellermf/vdev_id_for_multi_lun_drives on Oct 9, 2024

Conversation

@hellermf (Contributor) commented Oct 4, 2024

Add the ability to generate disk names that contain both a slot number and a LUN number, in order to support multi-actuator SAS hard drives with multiple LUNs. Also add the ability to zero-pad slot numbers to a desired digit length for easier sorting.

Motivation and Context

I manage many servers with ZFS pools composed of dozens of Seagate Mach.2 dual-actuator SAS hard drives. These drives have two SCSI LUNs per physical drive, one per actuator, each giving access to half the storage capacity of the physical drive. Without these changes I can only configure vdev_id.conf to generate slot-number-based aliases for the first LUN of each drive, reaching only half the capacity.

Description

This PR introduces only minor changes to the vdev_id script and the associated documentation in the vdev_id.conf man page.

The first change adds the option of generating disk names that include both a slot number and a LUN number. This feature is enabled by adding one more value, 'bay_lun', to the list of values accepted by the 'slot' setting in vdev_id.conf, as in the sketch below.
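For illustration, a minimal vdev_id.conf sketch using the new value. The topology and channel mapping here are hypothetical, and the exact alias format produced is determined by the vdev_id script:

  # /etc/vdev_id.conf -- illustrative sketch only
  multipath     no
  topology      sas_direct
  phys_per_port 4
  # Include the lun number in generated names so each actuator
  # (LUN) of a dual-actuator drive gets its own alias.
  slot          bay_lun
  # channel <PCI_SLOT> <PORT> <NAME>
  channel 85:00.0  1  disk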

The second change adds the option of padding the slot number component of generated disk names to a specific number of digits with leading zeros. It is activated by a new 'zpad_slot' setting in vdev_id.conf whose value is the desired width; for example, 'zpad_slot 3' could generate disk001, disk002, etc.
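Continuing the sketch above, the padding setting could be added like this (the three-digit width and the resulting disk001, disk002, ... names follow the example in the description; 'disk' is the hypothetical channel name from the earlier sketch):

  # Pad slot numbers to three digits for stable sorting:
  # disk001, disk002, ...
  zpad_slot     3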

How Has This Been Tested?

I have RHEL 8 and Ubuntu 22.04 servers with dual-actuator SAS disks that have been using these changes for a couple of years. I have not tested any other use cases, but I figured it was time to offer my changes back to the community in case this enhancement is useful to others.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)


@behlendorf added the Status: Code Review Needed (Ready for review and testing) label on Oct 4, 2024
@tonyhutter (Contributor):

Please squash commits and add a Signed-off-by line

hellermf added a commit to hellermf/zfs that referenced this pull request Oct 8, 2024
Signed-off-by: Matthew Heller <matthew.f.heller@accre.vanderbilt.edu>
@hellermf force-pushed the vdev_id_for_multi_lun_drives branch from 22f5bb6 to 0c6e1aa on October 8, 2024 01:06
Add ability to generate disk names that contain both a slot number
and a lun number in order to support multi-actuator SAS hard drives
with multiple luns. Also add the ability to zero pad slot numbers to
a desired digit length for easier sorting.

Signed-off-by: Matthew Heller <matthew.f.heller@accre.vanderbilt.edu>
@hellermf force-pushed the vdev_id_for_multi_lun_drives branch from 0c6e1aa to 144c1e5 on October 8, 2024 11:44
@hellermf (Contributor, Author) commented Oct 8, 2024

I see the zloop CI check is failing in a test that interrupts raidz expansion reflow. Any thoughts? I wouldn't have thought my changes would be anywhere near the failing code paths. I know raidz expansion reflow is a new feature; is there any chance this PR isn't the cause?

https://github.com/openzfs/zfs/actions/runs/11234698822/job/31248072312?pr=16603

waiting for reflow to finish ...
verifying an interrupted raidz expansion using a pool scrub ...
ASSERT at cmd/ztest.c:8086:ztest_raidz_expand_check()
ztest_scrub_impl(spa) == 0 (0x10 == 0)
  PID: 117586    COMM: ztest
  TID: 117586    NAME: ztest
Call trace:
  [0x7f786a18af13] libspl_assertf+0x263
  [0x55821930a452] ztest_raidz_expand_check+0x2e2
  [0x55821932c188] ztest_run+0xdd8
  [0x5582193013c4] main+0x5e4
  [0x7f7867c2a1ca] __libc_init_first+0x8a
  [0x7f7867c2a28b] __libc_start_main+0x8b
  [0x5582193030f5] _start+0x25
Call trace:
  [0x558219308dc7] sig_handler+0xa7
  [0x7f7867c45320] __sigaction+0x50
  [0x7f7867c9eb1c] pthread_kill+0x11c
  [0x7f7867c4526e] gsignal+0x1e
  [0x7f7867c288ff] abort+0xdf
  [0x7f78696746af] libspl_assertf.cold+0x0a
  [0x55821930a452] ztest_raidz_expand_check+0x2e2
  [0x55821932c188] ztest_run+0xdd8
  [0x5582193013c4] main+0x5e4
  [0x7f7867c2a1ca] __libc_init_first+0x8a
  [0x7f7867c2a28b] __libc_start_main+0x8b
  [0x5582193030f5] _start+0x25

@behlendorf (Contributor):

@hellermf right, that failure is definitely unrelated. We've still got a few of those which the CI occasionally hits.

@behlendorf merged commit cefef28 into openzfs:master on Oct 9, 2024
18 of 20 checks passed
@behlendorf added the Status: Accepted (Ready to integrate: reviewed, tested) label and removed the Status: Code Review Needed (Ready for review and testing) label on Oct 9, 2024
behlendorf pushed a commit to behlendorf/zfs that referenced this pull request Oct 9, 2024
Add ability to generate disk names that contain both a slot number
and a lun number in order to support multi-actuator SAS hard drives
with multiple luns. Also add the ability to zero pad slot numbers to
a desired digit length for easier sorting.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matthew Heller <matthew.f.heller@accre.vanderbilt.edu>
Closes openzfs#16603