
[batch] Check /proc/mounts for straggler cloudfuse mounts #12986

Merged

6 commits merged into hail-is:main on May 4, 2023

Conversation

@jigold (Contributor) commented May 4, 2023

No description provided.

@@ -419,6 +419,9 @@ async def _fuse_mount(self, destination: str, *, credentials_path: str, tmp_path
async def _fuse_unmount(self, path: str):
    assert CLOUD_WORKER_API
    await CLOUD_WORKER_API.unmount_cloudfuse(path)
    proc_mounts_stdout, _ = await check_shell_output('cat /proc/mounts')
    if path in str(proc_mounts_stdout):
        raise IncompleteCloudFuseCleanup(f'incomplete cloudfuse unmounting: {proc_mounts_stdout}')
danking (Contributor) previously requested changes May 4, 2023

Ah, I see now. This looks good, but can we just use Python's open instead of cat to read the contents of /proc/mounts?

with open('/proc/mounts', 'r') as f:
    output = f.read()
if self.cloudfuse_base_path() in output:
    raise IncompleteCloudFuseCleanup(f'incomplete cloudfuse unmounting: {output}')
Contributor

I'm fairly sure exceptions raised in DockerJob.cleanup are not logged anywhere. Using the raise to skip the xfs_quota and rmtree seems fine but we need to log the exception here.
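A minimal sketch of logging before raising, assuming a module-level logger; check_cloudfuse_cleanup and the local exception definition are illustrative, not the PR's actual code:

import logging

log = logging.getLogger('worker')


class IncompleteCloudFuseCleanup(Exception):
    pass


def check_cloudfuse_cleanup(base_path: str):
    # Read /proc/mounts once; if base_path is still mounted, log the problem
    # before raising, since an exception raised from cleanup() may otherwise
    # never appear in the worker log.
    with open('/proc/mounts', 'r') as f:
        output = f.read()
    if base_path in output:
        log.error(f'incomplete cloudfuse unmounting: {output}')
        raise IncompleteCloudFuseCleanup(f'incomplete cloudfuse unmounting: {output}')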

with open('/proc/mounts', 'r') as f:
    output = f.read()
if path in output:
    raise IncompleteCloudFuseCleanup(f'incomplete cloudfuse unmounting: {output}')
Contributor

This would read this file for each mount rather than once after all attempts to unmount. Let's make JVMJob symmetric to DockerJob and check the mounts in cleanup in both Jobs. I think we should also make sure we attempt to upload the log before we check /proc/mounts.

Contributor

Ah true, there's no need to put this logic in the cloudfuse cache because we never rmtree things in here. The attack vector is any bind mount left over in a directory that we will subsequently rmtree, so the way to handle the JVMJob case is to check /proc/mounts at the end of a JVM job for any leftover mounts in the scratch directory.
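A minimal sketch of the kind of scratch-directory check described here, assuming a plain prefix match on the mount-point field of /proc/mounts is sufficient; leftover_mounts_under is a hypothetical helper, not code from this PR:

def leftover_mounts_under(directory: str) -> list:
    # Scan /proc/mounts for mount points under `directory` (e.g. a job's scratch
    # directory). Any hit means a later rmtree of `directory` could reach through
    # a still-attached bind or FUSE mount and delete data it shouldn't.
    prefix = directory.rstrip('/') + '/'
    leftover = []
    with open('/proc/mounts', 'r') as f:
        for line in f:
            mount_point = line.split()[1]
            if mount_point == directory or mount_point.startswith(prefix):
                leftover.append(mount_point)
    return leftover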

@jigold (Contributor Author) commented May 4, 2023

Why are we checking for paths in the scratch directory? Should I be checking for the scratch directory in Docker jobs as well?

@@ -2234,6 +2230,11 @@ async def cleanup(self):
    await self.jvm.cloudfuse_mount_manager.unmount(mount_path, user=self.user, bucket=bucket)
    config['mounted'] = False

    with open('/proc/mounts', 'r') as f:
        output = f.read()
    if self.cloudfuse_base_path() in output:
        raise IncompleteCloudFuseCleanup(f'incomplete cloudfuse unmounting: {output}')
@jigold (Contributor Author)

I think my question is: won't this fuse path remain alive if there are other jobs reading the data? That's why I had the check after we actually call fusermount.

Contributor

This is not a fuse path; it's a bind mount path, which is the thing that the JVMJob actually owns. Here's an example:

  1. JVM 1 wants bucket foo at /cloudfuse/JVM-1/foo.
  2. The FUSE cache manager makes two mounts:
     • a FUSE mount at /cloudfuse/read_only/abc123/foo
     • a bind mount of /cloudfuse/read_only/abc123/foo at /cloudfuse/JVM-1/foo
  3. JVM 2 wants bucket foo at /cloudfuse/JVM-2/foo.
  4. The FUSE cache manager reuses the underlying FUSE mount and bind mounts /cloudfuse/read_only/abc123/foo to /cloudfuse/JVM-2/foo.
  5. JVM 1 finishes and wants to clean up; it calls cloudfuse_mount_manager.unmount.
  6. The FUSE cache manager unmounts the bind mount at /cloudfuse/JVM-1/foo. The underlying FUSE mount is still being used by another job, so it doesn't do anything else.

At this point, the only thing JVM 1 needs to validate to make sure that it is properly cleaned up is that the bind mount that it owns (whatever is under /cloudfuse/JVM-1) is gone. Without that bind mount it has no connection to the underlying FUSE mount created by the cache manager.
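To make the ownership point concrete, here is a small illustration with made-up /proc/mounts entries (the device, filesystem, and option fields are guesses, not captured from a real worker): after step 4 there is one shared FUSE mount plus one bind-mount entry per JVM, and after JVM 1's cleanup only its own entry under /cloudfuse/JVM-1 must be gone.

# Illustrative only: made-up /proc/mounts lines for the example above.
proc_mounts_after_step_4 = '''\
foo /cloudfuse/read_only/abc123/foo fuse.gcsfuse rw,relatime 0 0
foo /cloudfuse/JVM-1/foo fuse.gcsfuse rw,relatime 0 0
foo /cloudfuse/JVM-2/foo fuse.gcsfuse rw,relatime 0 0
'''

# After JVM 1's unmount (step 6), only its bind-mount line disappears.
proc_mounts_after_step_6 = proc_mounts_after_step_4.replace(
    'foo /cloudfuse/JVM-1/foo fuse.gcsfuse rw,relatime 0 0\n', ''
)

# JVM 1's cleanup check only needs to look for its own base path; the shared
# FUSE mount and JVM 2's bind mount legitimately remain.
assert '/cloudfuse/JVM-1' not in proc_mounts_after_step_6
assert '/cloudfuse/read_only/abc123/foo' in proc_mounts_after_step_6
assert '/cloudfuse/JVM-2/foo' in proc_mounts_after_step_6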

@jigold (Contributor Author)

Thanks! This was really helpful.

@danking danking merged commit da6ba69 into hail-is:main May 4, 2023