`os.listdir` (and presumably other os functions) fails in the face of signals #955

emeryberger · 2024-07-29T00:50:44Z

Mountpoint for Amazon S3 version

mount-s3 1.7.2

AWS Region

us-east-1

Describe the running environment

Running on EC2, accessing an S3 bucket through my account, using this AMI: Deep Learning Base Proprietary Nvidia Driver GPU AMI (Ubuntu 20.04) 20240314. Same setup as here: plasma-umass/scalene#841

Mountpoint options

mount-s3 <MY_BUCKET> /mnt/mounted_bucket_dir

What happened?

This error (failure when running in a mounted S3 system) was brought to my attention with this issue with the Scalene profiler: plasma-umass/scalene#841

The root cause turns out to be the CPU timer signal; if the os.listdir function is interrupted by a signal.SIGALRM, the call fails with an OSError. I set the frequency below to a level that triggers the failure roughly half the time; setting it to 1 second makes it never happen. Since the default CPU sampling frequency used by Scalene is 0.01 seconds, it fails consistently. Note that the profiler py-spy also causes this failure.

MRE here:

import os
import signal
import time

def signal_handler(signum, frame):
    print("signal")
    pass

signal.setitimer(signal.ITIMER_REAL, 0.02) # setting to 0.01 triggers failures reliably                              
signal.signal(signal.SIGALRM, signal_handler)

dirname = "/mnt/mounted_bucket_dir"

assert os.path.exists(dirname)
assert os.access(dirname, os.R_OK)

print(len(os.listdir(dirname)))

Example failure:

Traceback (most recent call last):
  File "test-mount-s3.py", line 16, in <module>
    print(len(os.listdir(dirname)))
OSError: [Errno 22] Invalid argument: '/mnt/mounted_bucket_dir'

I have implemented a workaround for this situation (wrapping all os functions so that they block the SIGALRM signal) - plasma-umass/scalene#842 - , but it seems like it is exposing a race condition in mount-s3, or at minimum, undesirable behavior.

Relevant log output

2024-07-29T00:47:59.230133Z  WARN mountpoint_s3::cli: failed to detect network throughput. Using 10 gbps as throughput. Use --maximum-throughput-gbps CLI flag to configure a target throughput appropriate for the instance. Detection failed due to: failed to get network throughput
2024-07-29T00:48:15.393878Z  WARN readdirplus{req=10 ino=1 fh=1 offset=1}: mountpoint_s3::fuse: readdirplus failed: out-of-order readdir, expected=3, actual=1

The text was updated successfully, but these errors were encountered:

vladem · 2024-08-13T16:02:18Z

Hey @emeryberger, thanks for reporting this issue and sorry for the delay in the response. We've reproduced it and are able to confirm that the problem exists. Here is a debug log:

2024-07-31T15:56:07.404177Z DEBUG fuser::request: FUSE( 20) ino 0x0000000000000001 READDIRPLUS fh FileHandle(2), offset 0, size 4096
2024-07-31T15:56:07.404334Z DEBUG <request to s3 created>
2024-07-31T15:56:07.503964Z DEBUG fuser::request: FUSE( 21) ino 0x0000000000000000 INTERRUPT unique RequestId(20)
2024-07-31T15:56:07.745586Z DEBUG <got response from s3>
2024-07-31T15:56:07.746746Z DEBUG fuser::request: FUSE( 22) ino 0x0000000000000001 READDIRPLUS fh FileHandle(2), offset 1, size 4096
2024-07-31T15:56:07.746792Z  WARN readdirplus{req=22 ino=1 fh=2 offset=1}: mountpoint_s3::fuse: readdirplus failed: out-of-order readdir, expected=24, actual=1

From the log we can see that when an application gets interrupted in a readdir syscall the underlying chain of readdir fuse requests gets reset to an offset which is considered stale by Mountpoint (in the log the offset is 1 instead of 24). Due to a streaming implementation of readdir in Mountpoint such behaviour is not currently supported. We'll continue exploring possible solutions for this problem and provide updates in this issue.

A separate problem are possible errors caused by interrupts during other system calls, for which we don't have a reproduction yet. I've created a separate issue for that.

## Description of change When user application gets interrupted in a `readdir` syscall the underlying chain of `readdir` fuse requests gets reset to an offset which is considered stale by Mountpoint. In that case Mountpoint still completes the interrupted `readdir` request, but kernel partially discards the response. We already cache the last response, so we can use it to serve the request which follows the interrupt. Relevant issues: #955 ## Does this change impact existing behavior? This is not a breaking change. Previously an error was returned, now it'll be handled properly. --- By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and I agree to the terms of the [Developer Certificate of Origin (DCO)](https://developercertificate.org/). --------- Signed-off-by: Vladislav Volodkin <vlaad@amazon.co.uk> Signed-off-by: Vlad Volodkin <vlaad@amazon.com> Co-authored-by: Vladislav Volodkin <vlaad@amazon.co.uk> Co-authored-by: Vlad Volodkin <vlaad@amazon.com>

emeryberger added the bug Something isn't working label Jul 29, 2024

vladem mentioned this issue Aug 1, 2024

Allow partial repeat of readdir response #965

Merged

vladem mentioned this issue Aug 13, 2024

Possible failures caused by interrupts during system call #981

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

`os.listdir` (and presumably other os functions) fails in the face of signals #955

`os.listdir` (and presumably other os functions) fails in the face of signals #955

emeryberger commented Jul 29, 2024

vladem commented Aug 13, 2024

os.listdir (and presumably other os functions) fails in the face of signals #955

os.listdir (and presumably other os functions) fails in the face of signals #955

Comments

emeryberger commented Jul 29, 2024

Mountpoint for Amazon S3 version

AWS Region

Describe the running environment

Mountpoint options

What happened?

Relevant log output

vladem commented Aug 13, 2024

`os.listdir` (and presumably other os functions) fails in the face of signals #955

`os.listdir` (and presumably other os functions) fails in the face of signals #955