Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nixos/zfs: introduce option to control hibernation #171680

Merged
merged 1 commit into from
Oct 29, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
12 changes: 12 additions & 0 deletions nixos/doc/manual/from_md/release-notes/rl-2211.section.xml
Original file line number Diff line number Diff line change
Expand Up @@ -847,6 +847,18 @@
<literal>hardware.saleae-logic.package</literal>.
</para>
</listitem>
<listitem>
<para>
ZFS module will not allow hibernation by default, this is a
safety measure to prevent data loss cases like the ones
described at
<link xlink:href="https://github.com/openzfs/zfs/issues/260">OpenZFS/260</link>
and
<link xlink:href="https://github.com/openzfs/zfs/issues/12842">OpenZFS/12842</link>.
Use the <literal>boot.zfs.allowHibernation</literal> option to
configure this behaviour.
</para>
</listitem>
<listitem>
<para>
The Redis module now disables RDB persistence when
Expand Down
2 changes: 2 additions & 0 deletions nixos/doc/manual/release-notes/rl-2211.section.md
Original file line number Diff line number Diff line change
Expand Up @@ -274,6 +274,8 @@ Available as [services.patroni](options.html#opt-services.patroni.enable).

- A new module was added for the Saleae Logic device family, providing the options `hardware.saleae-logic.enable` and `hardware.saleae-logic.package`.

- ZFS module will not allow hibernation by default, this is a safety measure to prevent data loss cases like the ones described at [OpenZFS/260](https://github.com/openzfs/zfs/issues/260) and [OpenZFS/12842](https://github.com/openzfs/zfs/issues/12842). Use the `boot.zfs.allowHibernation` option to configure this behaviour.

- The Redis module now disables RDB persistence when `services.redis.servers.<name>.save = []` instead of using the Redis default.

- Neo4j was updated from version 3 to version 4. See this [migration guide](https://neo4j.com/docs/upgrade-migration-guide/current/) on how to migrate your Neo4j instance.
Expand Down
13 changes: 13 additions & 0 deletions nixos/modules/tasks/filesystems/zfs.nix
Original file line number Diff line number Diff line change
Expand Up @@ -226,6 +226,15 @@ in
'';
};

allowHibernation = mkOption {
type = types.bool;
default = false;
description = lib.mdDoc ''
Allow hibernation support, this may be a unsafe option depending on your
setup. Make sure to NOT use Swap on ZFS.
'';
};

extraPools = mkOption {
type = types.listOf types.str;
default = [];
Expand Down Expand Up @@ -498,6 +507,10 @@ in

boot = {
kernelModules = [ "zfs" ];
# https://github.com/openzfs/zfs/issues/260
# https://github.com/openzfs/zfs/issues/12842
# https://github.com/NixOS/nixpkgs/issues/106093
kernelParams = lib.optionals (!config.boot.zfs.allowHibernation) [ "nohibernate" ];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is the right workaround. Because it allows people to assign a swapDevices but that not taking any effect. I think it would be better to have an assertion like this:

{
  assertions = [{
    assertion = (config.boot.zfs.enable && config.swapDevices != []) -> config.config.boot.zfs.allowHibernation;
    message = "Using swap with ZFS can corrupt data. Either disable swap by removing all `swapDevices` entries, or if you know what you're doing, ignore this assertion by enabling `boot.zfs.allowHibernation`";
  }];

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should also have another assertion which throws when ZFS is enabled and you have a swap file on a ZFS filesystem

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using swap with ZFS can corrupt data.

Does this apply even for a swap partition that's not on a ZVOL or otherwise on ZFS?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is the right workaround. Because it allows people to assign a swapDevices but that not taking any effect. I think it would be better to have an assertion like this:

{
  assertions = [{
    assertion = (config.boot.zfs.enable && config.swapDevices != []) -> config.config.boot.zfs.allowHibernation;
    message = "Using swap with ZFS can corrupt data. Either disable swap by removing all `swapDevices` entries, or if you know what you're doing, ignore this assertion by enabling `boot.zfs.allowHibernation`";
  }];

I just got bit by this. I'm only using ZFS on a block file in an EXT4 filesystem so I'm not concerned with hibernating ZFS filesystems. I think this suggestion would aid discoverability.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there's a misunderstanding in @infinisil's year old comment. The issue is not that you cannot have swap devices while using ZFS; you totally can (though, for other unrelated reasons, those cannot be stored on ZFS). The issue is that you cannot hibernate while ZFS pools are imported, period, no matter where the swap is stored, or else those pools have a chance of being irreparably corrupted.

Point being: ZFS + swap on non-ZFS: Good!. ZFS + hibernation in any case whatsoever: Bad! Hence what we have now is the right fix. If you want to ignore it, well it is configurable.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue is that you cannot hibernate while ZFS pools are imported, period, no matter where the swap is stored, or else those pools have a chance of being irreparably corrupted.

Makes sense! Though, the discoverability for hibernation being disabled is not great. I only realized ZFS was the problem after reading this discourse post. I'm really not using ZFS in anger, so I'm happy to risk data loss for being able to hibernate my laptop, and I think @infinisil's suggestion would make this clear to new users.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

infinisil's suggestion would disable swap entirely with ZFS, which is certainly not desirable. We don't really have a way to know at eval-time whether a user intends to use hibernation, so it's difficult to warn about.

Actually, we should probably have an assertion that boot.resumeDevice cannot be set while allowHibernation is false, and that would cover some use cases. But NixOS has auto-detection of resume devices in stage 1 (if you're using EFI and scripted initrd exactly because... reasons), so this is very often not set.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

infinisil's suggestion would disable swap entirely with ZFS, which is certainly not desirable. We don't really have a way to know at eval-time whether a user intends to use hibernation, so it's difficult to warn about.

Hmm, how about:

{
  assertions = [{
    assertion = (config.boot.zfs.enable && ! (builtins.elem "nohibernate" config.boot.kernelParams)) -> config.config.boot.zfs.allowHibernation;
    message = "Using hibernation with ZFS can corrupt data. Either disable hibernation with `boot.kernelParams = [ "nohibernate" ]` or if you know what you're doing, ignore this assertion by enabling `boot.zfs.allowHibernation`";
  }];

Copy link
Member

@infinisil infinisil Nov 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As also hinted at in #171680 (comment), I've been using ZFS with swap (on a separate partition) with hibernation for a while now and nothing has corrupted 🤷. I am using a patch on top of Nixpkgs though: infinisil@448fe5d. This makes swap restore before ZFS imports.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@infinisil that patch helps. Certainly, it should be quite rare with that patch. But "quite a rare, but known, source of data loss" is not something we want people using by accident.

@RyanGibb That assertion doesn't really do anything though? allowHibernation implies that nohibernate is not in the cmdline. Unless the user manually added it themselves and enabled allowHibernation. While that would be a nonsensical thing for them to do, it would at least be safe, and nothing to worry about, and means that they explicitly added both configuration lines.


extraModulePackages = [
(if config.boot.zfs.enableUnstable then
Expand Down