WARN[0000] "/" is not a shared mount, this could cause issues or missing mounts with rootless containers #3726

Closed
bcaton85 opened this issue Jan 20, 2022 · 22 comments

Comments

@bcaton85

Description

Running Podman as a pod in OpenShift 4.9.10 using CodeReady Containers (CRC). The pod runs unprivileged and rootless, uses VFS storage, and is set to chroot isolation. The image is based on podman/stable, adds podman:100000:65536 to the subuid/subgid files, and sets the storage driver to VFS.

When running a build with podman build . --isolation chroot the following warning appears at the start of each command:

WARN[0000] "/" is not a shared mount, this could cause issues or missing mounts with rootless containers 

The builds work but I would like to know what could be causing this warning and if it could lead to more issues in the future.

Steps to reproduce the issue:

  1. Run an unprivileged/rootless podman
  • Run pod as podman (1000) user
  • Currently have subuid/subgid set to podman:100000:65536
  • VFS storage option set in /home/podman/.config/containers/storage.conf
  2. exec into pod and run: podman build . --isolation chroot

  3. See the following warning.

WARN[0000] "/" is not a shared mount, this could cause issues or missing mounts with rootless containers 
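
The setup described in the steps can be sketched roughly as follows. This is a minimal sketch, not the reporter's actual image build: write_rootless_config is a hypothetical helper, the scratch directory layout is an assumption, and the two-line storage.conf is only a guess at a minimal per-user override that selects the VFS driver.

```shell
#!/bin/sh
# Hypothetical helper: writes the rootless settings from the report into a
# scratch directory ($1) instead of the real /etc and /home/podman paths.
write_rootless_config() {
    dest=$1
    mkdir -p "$dest/etc" "$dest/home/podman/.config/containers"
    # Sub-ID range from the report: 65536 IDs starting at 100000 for user "podman".
    echo 'podman:100000:65536' > "$dest/etc/subuid"
    echo 'podman:100000:65536' > "$dest/etc/subgid"
    # Assumed minimal per-user override that selects the VFS storage driver.
    printf '[storage]\ndriver = "vfs"\n' \
        > "$dest/home/podman/.config/containers/storage.conf"
}

# Usage (not run automatically):
#   write_rootless_config /tmp/rootless-demo
```

Writing into a scratch directory keeps the sketch safe to try outside a container; in a real image the same files would live under /etc and /home/podman.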

Describe the results you received:

The previous warning.

Describe the results you expected:

For the warning to not appear.

Output of rpm -q buildah or apt list buildah:

package buildah is not installed

Output of buildah version:

bash: buildah: command not found

Output of podman version if reporting a podman build issue:

podman version 3.4.4

Output of podman info --log-level debug

host:
  arch: amd64
  buildahVersion: 1.23.1
  cgroupControllers: []
  cgroupManager: cgroupfs
  cgroupVersion: v1
  conmon:
    package: conmon-2.0.30-2.fc35.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.0.30, commit: '
  cpus: 4
  distribution:
    distribution: fedora
    variant: container
    version: "35"
  eventLogger: file
  hostname: d7df40bd-e525-4458-9688-640721efb359-5mqkj--1-d5624
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 4.18.0-305.28.1.el8_4.x86_64
  linkmode: dynamic
  logDriver: k8s-file
  memFree: 332173312
  memTotal: 9404579840
  ociRuntime:
    name: crun
    package: crun-1.4-1.fc35.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.4
      commit: 3daded072ef008ef0840e8eccb0b52a7efbd165d
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
  os: linux
  remoteSocket:
    path: /tmp/podman-run-1000/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: false
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.1.12-2.fc35.x86_64
    version: |-
      slirp4netns version 1.1.12
      commit: 7a104a101aa3278a2152351a082a6df71f57c9a3
      libslirp: 4.6.1
      SLIRP_CONFIG_VERSION_MAX: 3
      libseccomp: 2.5.3
  swapFree: 0
  swapTotal: 0
  uptime: 38m 48.99s
plugins:
  log:
  - k8s-file
  - none
  - journald
  network:
  - bridge
  - macvlan
  volume:
  - local
registries:
  search:
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - docker.io
  - quay.io
store:
  configFile: /home/podman/.config/containers/storage.conf
  containerStore:
    number: 0
    paused: 0
    running: 0
    stopped: 0
  graphDriverName: vfs
  graphOptions: {}
  graphRoot: /home/podman/.local/share/containers/storage
  graphStatus: {}
  imageStore:
    number: 11
  runRoot: /tmp/containers-user-1000/containers
  volumePath: /home/podman/.local/share/containers/storage/volumes
version:
  APIVersion: 3.4.4
  Built: 1638999907
  BuiltTime: Wed Dec  8 21:45:07 2021
  GitCommit: ""
  GoVersion: go1.16.8
  OsArch: linux/amd64
  Version: 3.4.4

Output of cat /etc/*release:

Fedora release 35 (Thirty Five)
NAME="Fedora Linux"
VERSION="35 (Container Image)"
ID=fedora
VERSION_ID=35
VERSION_CODENAME=""
PLATFORM_ID="platform:f35"
PRETTY_NAME="Fedora Linux 35 (Container Image)"
ANSI_COLOR="0;38;2;60;110;180"
LOGO=fedora-logo-icon
CPE_NAME="cpe:/o:fedoraproject:fedora:35"
HOME_URL="https://fedoraproject.org/"
DOCUMENTATION_URL="https://docs.fedoraproject.org/en-US/fedora/f35/system-administrators-guide/"
SUPPORT_URL="https://ask.fedoraproject.org/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_BUGZILLA_PRODUCT="Fedora"
REDHAT_BUGZILLA_PRODUCT_VERSION=35
REDHAT_SUPPORT_PRODUCT="Fedora"
REDHAT_SUPPORT_PRODUCT_VERSION=35
PRIVACY_POLICY_URL="https://fedoraproject.org/wiki/Legal:PrivacyPolicy"
VARIANT="Container Image"
VARIANT_ID=container
Fedora release 35 (Thirty Five)
Fedora release 35 (Thirty Five)

Output of uname -a:

Linux d7df40bd-e525-4458-9688-640721efb359-5mqkj--1-d5624 4.18.0-305.28.1.el8_4.x86_64 #1 SMP Mon Nov 8 07:45:47 EST 2021 x86_64 x86_64 x86_64 GNU/Linux

Output of cat /etc/containers/storage.conf:
The storage.driver option is being overridden with VFS in the /home/podman/.config/containers/storage.conf file.

# This file is is the configuration file for all tools
# that use the containers/storage library.
# See man 5 containers-storage.conf for more information
# The "container storage" table contains all of the server options.
[storage]

# Default Storage Driver, Must be set for proper operation.
driver = "overlay"

# Temporary storage location
runroot = "/run/containers/storage"

# Primary Read/Write location of container storage
graphroot = "/var/lib/containers/storage"

# Storage path for rootless users
#
# rootless_storage_path = "$HOME/.local/share/containers/storage"

[storage.options]
# Storage options to be passed to underlying storage drivers

# AdditionalImageStores is used to pass paths to additional Read/Only image stores
# Must be comma separated list.
additionalimagestores = [
"/var/lib/shared",
]

# Remap-UIDs/GIDs is the mapping from UIDs/GIDs as they should appear inside of
# a container, to the UIDs/GIDs as they should appear outside of the container,
# and the length of the range of UIDs/GIDs.  Additional mapped sets can be
# listed and will be heeded by libraries, but there are limits to the number of
# mappings which the kernel will allow when you later attempt to run a
# container.
#
# remap-uids = 0:1668442479:65536
# remap-gids = 0:1668442479:65536

# Remap-User/Group is a user name which can be used to look up one or more UID/GID
# ranges in the /etc/subuid or /etc/subgid file.  Mappings are set up starting
# with an in-container ID of 0 and then a host-level ID taken from the lowest
# range that matches the specified name, and using the length of that range.
# Additional ranges are then assigned, using the ranges which specify the
# lowest host-level IDs first, to the lowest not-yet-mapped in-container ID,
# until all of the entries have been used for maps.
#
# remap-user = "containers"
# remap-group = "containers"

# Root-auto-userns-user is a user name which can be used to look up one or more UID/GID
# ranges in the /etc/subuid and /etc/subgid file.  These ranges will be partitioned
# to containers configured to create automatically a user namespace.  Containers
# configured to automatically create a user namespace can still overlap with containers
# having an explicit mapping set.
# This setting is ignored when running as rootless.
# root-auto-userns-user = "storage"
#
# Auto-userns-min-size is the minimum size for a user namespace created automatically.
# auto-userns-min-size=1024
#
# Auto-userns-max-size is the minimum size for a user namespace created automatically.
# auto-userns-max-size=65536

[storage.options.overlay]
# ignore_chown_errors can be set to allow a non privileged user running with
# a single UID within a user namespace to run containers. The user can pull
# and use any image even those with multiple uids.  Note multiple UIDs will be
# squashed down to the default uid in the container.  These images will have no
# separation between the users in the container. Only supported for the overlay
# and vfs drivers.
#ignore_chown_errors = "false"

# Inodes is used to set a maximum inodes of the container image.
# inodes = ""

# Path to an helper program to use for mounting the file system instead of mounting it
# directly.
mount_program = "/usr/bin/fuse-overlayfs"

# mountopt specifies comma separated list of extra mount options
mountopt = "nodev,fsync=0"

# Set to skip a PRIVATE bind mount on the storage home directory.
# skip_mount_home = "false"

# Size is used to set a maximum size of the container image.
# size = ""

# ForceMask specifies the permissions mask that is used for new files and
# directories.
#
# The values "shared" and "private" are accepted.
# Octal permission masks are also accepted.
#
#  "": No value specified.
#     All files/directories, get set with the permissions identified within the
#     image.
#  "private": it is equivalent to 0700.
#     All files/directories get set with 0700 permissions.  The owner has rwx
#     access to the files. No other users on the system can access the files.
#     This setting could be used with networked based homedirs.
#  "shared": it is equivalent to 0755.
#     The owner has rwx access to the files and everyone else can read, access
#     and execute them. This setting is useful for sharing containers storage
#     with other users.  For instance have a storage owned by root but shared
#     to rootless users as an additional store.
#     NOTE:  All files within the image are made readable and executable by any
#     user on the system. Even /etc/shadow within your image is now readable by
#     any user.
#
#   OCTAL: Users can experiment with other OCTAL Permissions.
#
#  Note: The force_mask Flag is an experimental feature, it could change in the
#  future.  When "force_mask" is set the original permission mask is stored in
#  the "user.containers.override_stat" xattr and the "mount_program" option must
#  be specified. Mount programs like "/usr/bin/fuse-overlayfs" present the
#  extended attribute permissions to processes within containers rather then the
#  "force_mask"  permissions.
#
# force_mask = ""

[storage.options.thinpool]
# Storage Options for thinpool

# autoextend_percent determines the amount by which pool needs to be
# grown. This is specified in terms of % of pool size. So a value of 20 means
# that when threshold is hit, pool will be grown by 20% of existing
# pool size.
# autoextend_percent = "20"

# autoextend_threshold determines the pool extension threshold in terms
# of percentage of pool size. For example, if threshold is 60, that means when
# pool is 60% full, threshold has been hit.
# autoextend_threshold = "80"

# basesize specifies the size to use when creating the base device, which
# limits the size of images and containers.
# basesize = "10G"

# blocksize specifies a custom blocksize to use for the thin pool.
# blocksize="64k"

# directlvm_device specifies a custom block storage device to use for the
# thin pool. Required if you setup devicemapper.
# directlvm_device = ""

# directlvm_device_force wipes device even if device already has a filesystem.
# directlvm_device_force = "True"

# fs specifies the filesystem type to use for the base device.
# fs="xfs"

# log_level sets the log level of devicemapper.
# 0: LogLevelSuppress 0 (Default)
# 2: LogLevelFatal
# 3: LogLevelErr
# 4: LogLevelWarn
# 5: LogLevelNotice
# 6: LogLevelInfo
# 7: LogLevelDebug
# log_level = "7"

# min_free_space specifies the min free space percent in a thin pool require for
# new device creation to succeed. Valid values are from 0% - 99%.
# Value 0% disables
# min_free_space = "10%"

# mkfsarg specifies extra mkfs arguments to be used when creating the base
# device.
# mkfsarg = ""

# metadata_size is used to set the `pvcreate --metadatasize` options when
# creating thin devices. Default is 128k
# metadata_size = ""

# Size is used to set a maximum size of the container image.
# size = ""

# use_deferred_removal marks devicemapper block device for deferred removal.
# If the thinpool is in use when the driver attempts to remove it, the driver
# tells the kernel to remove it as soon as possible. Note this does not free
# up the disk space, use deferred deletion to fully remove the thinpool.
# use_deferred_removal = "True"

# use_deferred_deletion marks thinpool device for deferred deletion.
# If the device is busy when the driver attempts to delete it, the driver
# will attempt to delete device every 30 seconds until successful.
# If the program using the driver exits, the driver will continue attempting
# to cleanup the next time the driver is used. Deferred deletion permanently
# deletes the device and all data stored in device will be lost.
# use_deferred_deletion = "True"

# xfs_nospace_max_retries specifies the maximum number of retries XFS should
# attempt to complete IO when ENOSPC (no space) error is returned by
# underlying storage device.
# xfs_nospace_max_retries = "0"
@flouthoc
Collaborator

@bcaton85 Thanks for creating the issue.

I think it's because the host / has its propagation set to MS_PRIVATE; it is usually expected to be MS_SHARED.
I don't think you will face any issues during regular builds.

But I think mounting a file into a running container or build using -v /something:/something might have some side effects.

I'll also tag others if they could also suggest something here: @rhatdan @nalind @giuseppe @vrothberg @mtrmac

@giuseppe
Member

what is the underlying host?

podman/buildah expect the following:

$ findmnt -o PROPAGATION /
PROPAGATION
shared

I guess in your case it is something different?

The issue with not having shared propagation on the root mount is that some mounts might not be propagated into the inner container, causing all sorts of weird failures.
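
The findmnt check above can be wrapped in a small script. This is a minimal sketch, assuming findmnt from util-linux is installed; root_propagation and is_shared_propagation are hypothetical helper names, not part of Podman or Buildah.

```shell
#!/bin/sh
# Print the propagation flags findmnt reports for "/",
# for example "shared" or "private" (-n suppresses the header line).
root_propagation() {
    findmnt -n -o PROPAGATION /
}

# Succeed when the given propagation string includes "shared".
# Combined values such as "shared,slave" also count as shared.
is_shared_propagation() {
    case "$1" in
        *shared*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage (not run automatically):
#   if is_shared_propagation "$(root_propagation)"; then
#       echo "/ is shared"
#   else
#       echo "/ is not shared; rootless mounts may not propagate"
#   fi
```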

@bcaton85
Author

bcaton85 commented Jan 21, 2022

Yeah this returns private. I am on CRC, not sure if that would make a difference.

sh-4.4# findmnt -o PROPAGATION /
PROPAGATION
private

We're using this strictly for builds but not sure if there will ever be a time we need to mount a volume for a build.

@rhatdan
Member

rhatdan commented Jan 21, 2022

Why does CRC have PROPAGATION of / set to private?

@bcaton85
Author

Not sure. No CRC settings were modified.

@rhatdan
Member

rhatdan commented Jan 24, 2022

Could you open an issue with CRC?

@bcaton85
Author

Sure, I can do that.

@giuseppe
Member

could you please link the issue here once you've opened it?

I am closing this issue for now, because there is nothing we can do to address it in Buildah

@bcaton85
Author

Issue raised with CRC

@dhirschfeld

I'm seeing the same issue in WSL2:

$ podman images
WARN[0000] "/" is not a shared mount, this could cause issues or missing mounts with rootless containers
REPOSITORY  TAG         IMAGE ID    CREATED     SIZE
$ findmnt -o PROPAGATION /
PROPAGATION
private
$ uname -r
5.10.16.3-microsoft-standard-WSL2

As a user, is there anything I can do to "fix" the problem?

@rhatdan
Member

rhatdan commented Jun 30, 2022

Does the following fix it?

sudo mount -o remount,shared / /

@giuseppe
Member

I think you need sudo mount --make-rshared /

@dhirschfeld

☝️ yeah, that ran without problems and seemed to fix it.

I thought there may be something WSL specific why that was set to private, but it did seem to work so I guess that's fine then?! (IANALE - I Am Not a Linux Expert!)

@rhatdan
Member

rhatdan commented Jun 30, 2022

Kernel default is Private. Systemd modifies the system to rshared by default.
Since WSL is not using systemd, you don't get the change. (I am not a WSL expert, but I believe that is the issue).

Is there a way to set up an init script in WSL to make this permanent?

@dhirschfeld

Is there a way to set up an init script in WSL to make this permanent?

Ahhh. I thought it was permanent when I checked in a new terminal... but WSL keeps running in the background. After doing wsl --shutdown, a new terminal shows it has been reset to private 🙁

I'll have to do some research to see how I can make that permanent... 🤔

@dhirschfeld

What's the usual way in Linux to configure this? Does it have to do with /etc/fstab?

$ cat /etc/fstab
LABEL=cloudimg-rootfs   /        ext4   discard,errors=remount-ro       0 1

(sorry for the noob linux questions! 😬)

@dhirschfeld

dhirschfeld commented Jul 1, 2022

I tried adding shared to the fstab entry but with no luck.

I did find some good info in https://superuser.com/a/1701393 and ended up using the wsl.exe hack! 🤢
(I couldn't use boot settings as I'm on Win10)
It does seem to work though 🎉

$ cat /etc/profile.d/02-shared-root.sh
wsl.exe -u root -e mount --make-rshared /
$ findmnt -o PROPAGATION /
PROPAGATION
shared
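
Because a profile.d script runs on every login shell, it may be worth skipping the remount when / is already shared. This is a minimal sketch of that idea, not a tested WSL recipe: ensure_shared_root is a hypothetical helper, and it takes the current propagation and the fix-up command as arguments so the logic can be exercised without root.

```shell
#!/bin/sh
# Hypothetical helper: run the fix-up command only when "/" is not yet shared.
#   $1     current propagation of "/", e.g. from: findmnt -n -o PROPAGATION /
#   $2...  command that makes "/" shared
ensure_shared_root() {
    current=$1
    shift
    case "$current" in
        *shared*) return 0 ;;   # already shared, nothing to do
    esac
    "$@"                        # otherwise run the supplied command
}

# Usage in /etc/profile.d/02-shared-root.sh, reusing the command shown above
# (not run automatically):
#   ensure_shared_root "$(findmnt -n -o PROPAGATION /)" \
#       wsl.exe -u root -e mount --make-rshared /
```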

@rhatdan
Member

rhatdan commented Jul 1, 2022

The usual way would be an init script (for example, a systemd unit file).
I have no idea how this works on Windows though.

I don't believe fstab supports setting the sharing mode.
@n1hility Thoughts?

@rhatdan
Member

rhatdan commented Jul 1, 2022

@n1hility PTAL

@dhirschfeld

I have no idea how this works on Windows though.

I gather WSL is a bit of an odd beast. I think the boot settings are the blessed way forward for Win11 users and, in the interim, the wsl.exe trick seems to work well enough for me.

@n1hility
Member

n1hility commented Jul 1, 2022

The way WSL works is that it shares the kernel between all "distros". This is accomplished through namespaces, so each distro gets a private mount namespace. If you are using podman machine for windows, we create a nested namespace to be able to run systemd, and that namespace is created with a shared mount namespace. So if you use podman machine this is handled for you. Alternatively you can remount like you are doing for something custom.
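
Since propagation is tracked per mount namespace, each distro can inspect its own view through /proc/self/mountinfo, where a shared mount carries a shared:N tag among the optional fields before the "-" separator. This is a minimal sketch under that assumption; mountinfo_root_state is a hypothetical helper, and if several entries list / as the mount point it reports the last one.

```shell
#!/bin/sh
# Hypothetical helper: report whether the "/" entry in a mountinfo file
# carries a "shared:N" tag. Reads /proc/self/mountinfo unless a file is given.
mountinfo_root_state() {
    awk '
        $5 == "/" {
            # Field 5 is the mount point; optional fields start at field 7
            # and end at the "-" separator.
            state = "not shared"
            for (i = 7; i <= NF && $i != "-"; i++)
                if ($i ~ /^shared:/) state = "shared"
        }
        END { if (state != "") print state }
    ' "${1:-/proc/self/mountinfo}"
}

# Usage (not run automatically):
#   mountinfo_root_state
```

"not shared" is reported for private mounts as well as for mounts that are only slave or unbindable, since none of those carry a shared:N tag.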

@mwoodpatrick

mwoodpatrick commented Jul 16, 2022

I am on Windows 11 and the command

wsl.exe -u root -e mount --make-rshared /

generates

<3>init: (8) ERROR: CreateProcessParseCommon:746: Failed to translate \wsl.localhost\Debian\home\mwoodpatrick

and I still get

findmnt -o PROPAGATION /
PROPAGATION
private

I filed Microsoft issue: 8623

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 31, 2023