[2.2] BRT and other fixes into 2.2.3-staging #15714

Merged

Commits on Dec 27, 2023

  1. ABD: Be more assertive in iterators

    Once we have verified the ABDs and asserted the sizes, we should never
    see premature ABD ends.  Assert that and remove the extra branches
    from production builds.
    
    Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
    Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Signed-off-by:	Alexander Motin <mav@FreeBSD.org>
    Sponsored by:	iXsystems, Inc.
    Closes openzfs#15428
    amotin authored and mmatuska committed Dec 27, 2023
    ba6a530
  2. Update the kstat dataset_name when renaming a zvol

    Add a dataset_kstats_rename function, and call it when renaming
    a zvol on FreeBSD and Linux.
    
    Reviewed-by: Alexander Motin <mav@FreeBSD.org>
    Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Signed-off-by: Alan Somers <asomers@gmail.com>
    Sponsored-by: Axcient
    Closes openzfs#15482
    Closes openzfs#15486
    asomers authored and mmatuska committed Dec 27, 2023
    de438c4
  3. FreeBSD: Optimize large kstat outputs

    - Use sbuf_new_for_sysctl() to reduce double-buffering on sysctl
    output.
    - Use much faster sbuf_cat() instead of sbuf_printf("%s").
    
    Together it reduces `sysctl kstat.zfs.misc.dbufs` time from minutes
    to seconds, making dbufstat almost usable.
    
    Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Signed-off-by: Alexander Motin <mav@FreeBSD.org>
    Sponsored by: iXsystems, Inc.
    Closes openzfs#15495
    amotin authored and mmatuska committed Dec 27, 2023
    615d55a
  4. Linux: Reclaim unused spl_kmem_cache_reclaim

    It has been unused for 3 years, since openzfs#10576.
    
    Reviewed-by: George Melikov <mail@gmelikov.ru>
    Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Signed-off-by:	Alexander Motin <mav@FreeBSD.org>
    Sponsored by:	iXsystems, Inc.
    Closes openzfs#15507
    amotin authored and mmatuska committed Dec 27, 2023
    ac7bfef
  5. L2ARC: Restrict write size to 1/4 of the device

    PR openzfs#15457 exposed weird logic in L2ARC write sizing. If the write
    size appeared bigger than the device size, instead of limiting the write
    it reset all the system-wide tunables to their defaults.  Aside from
    being excessive, it did not actually help with the problem, still
    allowing an infinite loop to happen.
    
    This patch removes the tunables reverting logic, but instead limits
    L2ARC writes (or at least eviction/trim) to 1/4 of the capacity.
    
    Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Reviewed-by: George Amanakis <gamanakis@gmail.com>
    Signed-off-by: Alexander Motin <mav@FreeBSD.org>
    Sponsored by: iXsystems, Inc.
    Closes openzfs#15519
    amotin authored and mmatuska committed Dec 27, 2023
    a8763f9
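
The 1/4-of-capacity limit described in this commit message can be illustrated with a minimal sketch. The helper name and parameters below are hypothetical and are not the OpenZFS function; the real sizing code also accounts for other factors that this sketch ignores.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not the OpenZFS code): cap a requested L2ARC
 * write size at one quarter of the usable device range.
 */
static uint64_t
clamp_l2arc_write(uint64_t requested, uint64_t dev_start, uint64_t dev_end)
{
	assert(dev_end >= dev_start);
	uint64_t limit = (dev_end - dev_start) / 4;

	return (requested < limit ? requested : limit);
}
```

With a 16 MiB device range, an 8 MiB request would be cut to 4 MiB, while a 1 MiB request passes through unchanged.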
  6. ZIL: Assert record sizes in different places

    This should make sure the log is written without overflows.
    
    Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Signed-off-by:	Alexander Motin <mav@FreeBSD.org>
    Sponsored by:	iXsystems, Inc.
    Closes openzfs#15517
    amotin authored and mmatuska committed Dec 27, 2023
    f9abc6d
  7. ZIO: Add overflow checks for linear buffers

    Since we use a limited set of kmem caches, quite often we have unused
    memory after the end of the buffer.  In debug builds, put a canary of
    up to 512 bytes there to detect buffer overflows at free time.
    
    Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Signed-off-by:	Alexander Motin <mav@FreeBSD.org>
    Sponsored by:	iXsystems, Inc.
    Closes openzfs#15553
    amotin authored and mmatuska committed Dec 27, 2023
    650c4b0
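
The canary idea from this commit message can be sketched roughly as follows. This is an illustration under assumptions, not the OpenZFS implementation: the function names, the byte-wise pattern and the `CANARY_BYTE` value are all invented here; only the 512-byte upper bound comes from the text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define	CANARY_MAX	512	/* upper bound on canary length, per the text */
#define	CANARY_BYTE	0xC5	/* arbitrary pattern chosen for this sketch */

/* Length of the slack region after the buffer that receives a canary. */
static size_t
canary_len(size_t size, size_t cache_size)
{
	assert(size <= cache_size);
	size_t slack = cache_size - size;

	return (slack < CANARY_MAX ? slack : CANARY_MAX);
}

/* Call after allocation: stamp the slack past the requested size. */
static void
canary_fill(uint8_t *buf, size_t size, size_t cache_size)
{
	memset(buf + size, CANARY_BYTE, canary_len(size, cache_size));
}

/* Call before free: returns 1 if the canary is intact, 0 on overflow. */
static int
canary_check(const uint8_t *buf, size_t size, size_t cache_size)
{
	size_t n = canary_len(size, cache_size);

	for (size_t i = 0; i < n; i++) {
		if (buf[size + i] != CANARY_BYTE)
			return (0);
	}
	return (1);
}
```

Here `size` is the length the caller asked for and `cache_size` is the larger size of the cache object that actually backs it; a write past `size` that lands within the canary is noticed when the buffer is checked at free time.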
  8. ZIL: Remove TX_CLONE_RANGE replay for ZVOLs.

    zil_claim_clone_range() takes references on cloned blocks before ZIL
    replay.  Later zil_free_clone_range() drops them after replay or on
    dataset destroy.  The total balance is neutral.  It means we do not
    need to do anything (drop the references) for the not yet implemented
    TX_CLONE_RANGE replay for ZVOLs.
    
    This is a logical follow up to openzfs#15603.
    
    Reviewed-by: Kay Pedersen <mail@mkwg.de>
    Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Signed-off-by:	Alexander Motin <mav@FreeBSD.org>
    Sponsored by:	iXsystems, Inc.
    Closes openzfs#15612
    amotin authored and mmatuska committed Dec 27, 2023
    46c0bfc
  9. ZIL: Do not clone blocks from the future

    ZIL claim can not handle block pointers cloned from the future,
    since they are not yet allocated at that point.  It may happen
    either if the block was just written when it was cloned, or if
    the pool was frozen or somehow else rewound on import.
    
    Handle it from two sides: prevent cloning of blocks whose physical
    birth time is in a not yet synced or frozen TXG, and abort ZIL claim
    if we still detect such blocks due to rewind or something else.
    
    While there, assert that any cloned blocks we claim are really
    allocated by calling metaslab_check_free().
    
    Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Signed-off-by: Alexander Motin <mav@FreeBSD.org>
    Sponsored by:	iXsystems, Inc.
    Closes openzfs#15617
    amotin authored and mmatuska committed Dec 27, 2023
    d2ff592
  10. Allow block cloning across encrypted datasets

    When two datasets share the same master encryption key, it is safe
    to clone encrypted blocks. Currently only snapshots and clones
    of a dataset share the same encryption key with it.
    
    Added a test for:
    - Clone from encrypted sibling to encrypted sibling with
      non encrypted parent
    - Clone from encrypted parent to inherited encrypted child
    - Clone from child to sibling with encrypted parent
    - Clone from snapshot to the original datasets
    - Clone from foreign snapshot to a foreign dataset
    - Cloning from non-encrypted to encrypted datasets
    - Cloning from encrypted to non-encrypted datasets
    
    Reviewed-by: Alexander Motin <mav@FreeBSD.org>
    Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Original-patch-by: Pawel Jakub Dawidek <pawel@dawidek.net>
    Signed-off-by: Kay Pedersen <mail@mkwg.de>
    Closes openzfs#15544
    oromenahar authored and mmatuska committed Dec 27, 2023
    ff9eb7a
  11. zdb: Dump encrypted write and clone ZIL records

    Block pointers are not encrypted in TX_WRITE and TX_CLONE_RANGE
    records, so we can dump them, which may be useful for debugging.
    
    Related to openzfs#15543.
    Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Signed-off-by: Alexander Motin <mav@FreeBSD.org>
    Sponsored by:	iXsystems, Inc.
    Closes openzfs#15629
    amotin authored and mmatuska committed Dec 27, 2023
    4d74a5d
  12. ZIL: Remove 128K into 2x68K LWB split optimization

    To improve 128KB block write performance in case of multiple VDEVs
    ZIL used to split those writes into two 64KB ones.  Unfortunately it
    was found to cause LWB buffer overflow, trying to write a
    maximum-sized 128KB TX_CLONE_RANGE record with 1022 block pointers
    into a 68KB buffer, since unlike TX_WRITE ZIL code can't split it.
    
    This is a minimally-invasive temporary block cloning fix until the
    following more invasive prediction code refactoring.
    
    Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Reviewed-by: Ameer Hamza <ahamza@ixsystems.com>
    Signed-off-by: Alexander Motin <mav@FreeBSD.org>
    Sponsored by:	iXsystems, Inc.
    Closes openzfs#15634
    amotin authored and mmatuska committed Dec 27, 2023
    532cf12
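
The overflow described in this commit message follows from simple arithmetic, sketched below. The sketch assumes the standard 128-byte ZFS block pointer and counts only the block pointers, ignoring record and block headers; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define	BLKPTR_SIZE	128u	/* assumed size of one block pointer, bytes */

/* Bytes taken by the block pointers of one TX_CLONE_RANGE record. */
static uint64_t
clone_bps_bytes(uint64_t nbps)
{
	return (nbps * BLKPTR_SIZE);
}

/* 1 if the block pointers alone fit into a buffer of buf_bytes. */
static int
bps_fit(uint64_t nbps, uint64_t buf_bytes)
{
	assert(buf_bytes > 0);
	return (clone_bps_bytes(nbps) <= buf_bytes);
}
```

Under that assumption, 1022 block pointers occupy 1022 * 128 = 130816 bytes, which is well above a 68KB (69632-byte) buffer, so a record that cannot be split does not fit.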
  13. BRT: Limit brt_vdev_dump() to only one vdev

    Without this patch, on a pool of 60 vdevs with ZFS_DEBUG enabled, a
    clone takes much more time than a copy, while heavily trashing dbgmsg
    for no good reason, repeatedly dumping all vdevs' BRTs again and
    again, even unmodified ones.
    
    I am generally not sure this dumping is not excessive, but decided
    to keep it for now, just restricting its scope to something more
    reasonable.
    
    Reviewed-by: Kay Pedersen <mail@mkwg.de>
    Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Signed-off-by: Alexander Motin <mav@FreeBSD.org>
    Sponsored by:	iXsystems, Inc.
    Closes openzfs#15625
    amotin authored and mmatuska committed Dec 27, 2023
    47ff01e
  14. DMU: Fix lock leak on dbuf_hold() error

    dmu_assign_arcbuf_by_dnode() should drop the dn_struct_rwlock lock in
    case dbuf_hold() fails.  I don't have a reproduction for this, but
    it looks inconsistent with dmu_buf_hold_noread_by_dnode() and co.
    
    Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Signed-off-by: Alexander Motin <mav@FreeBSD.org>
    Sponsored by: iXsystems, Inc.
    Closes openzfs#15644
    amotin authored and mmatuska committed Dec 27, 2023
    25753e0
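
The class of bug fixed by this commit, a lock that stays held when an inner call fails, can be shown with a toy model. Everything below is hypothetical: the `toy_*` names stand in for the real lock and hold functions and this is not the OpenZFS code.

```c
#include <assert.h>
#include <stddef.h>

/* Toy lock that only counts how many times it is currently held. */
typedef struct toy_lock {
	int held;
} toy_lock_t;

static void
toy_enter(toy_lock_t *l)
{
	l->held++;
}

static void
toy_exit(toy_lock_t *l)
{
	assert(l->held > 0);
	l->held--;
}

/* Stand-in for a hold that can fail: returns NULL when 'fail' is set. */
static void *
toy_hold(void *obj, int fail)
{
	return (fail ? NULL : obj);
}

static int
toy_assign(toy_lock_t *l, void *obj, int fail)
{
	toy_enter(l);
	void *db = toy_hold(obj, fail);
	if (db == NULL) {
		toy_exit(l);	/* drop the lock on the error path too */
		return (-1);
	}
	/* ... work with db under the lock ... */
	toy_exit(l);
	return (0);
}
```

Without the `toy_exit()` call in the error branch, a failed hold would return with `held` still at 1, which is the leak the commit message describes.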
  15. dbuf: Handle arcbuf assignment after block cloning

    In some cases dbuf_assign_arcbuf() may be called on a block that
    was recently cloned.  If that happened in the current TXG, we must
    undo the block cloning first, since the single dirty record per TXG
    can't and shouldn't mean both cloning and overwrite at the same time.
    
    Reviewed-by: Kay Pedersen <mail@mkwg.de>
    Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Signed-off-by:	Alexander Motin <mav@FreeBSD.org>
    Sponsored by:	iXsystems, Inc.
    Closes openzfs#15653
    amotin authored and mmatuska committed Dec 27, 2023
    784c9cc
  16. dbuf: Set dr_data when unoverriding after clone

    Block cloning normally creates a dirty record without dr_data.  But if
    the block is read after cloning, it is moved into DB_CACHED state and
    receives the data buffer.  If after that we call dbuf_unoverride()
    to convert the dirty record into a normal write, we should give it the
    data buffer from the dbuf and release one.
    
    Reviewed-by: Kay Pedersen <mail@mkwg.de>
    Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Signed-off-by: Alexander Motin <mav@FreeBSD.org>
    Sponsored by: iXsystems, Inc.
    Closes openzfs#15654
    Closes openzfs#15656
    amotin authored and mmatuska committed Dec 27, 2023
    ef34c71
  17. Don't panic on unencrypted block in encrypted dataset

    While 763ca47 closes the situation of block cloning creating
    unencrypted records in encrypted datasets, existing data still causes
    panic on read. Setting zfs_recover bypasses this but at the cost of
    potentially ignoring more serious issues.
    
    Reviewed-by: Alexander Motin <mav@FreeBSD.org>
    Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Signed-off-by: Chris Peredun <chris.peredun@ixsystems.com>
    Closes openzfs#15677
    chrisperedun authored and mmatuska committed Dec 27, 2023
    f183b0b