Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Disable fsspec Cache on Integration Tests #646

Merged
merged 1 commit into from
Nov 13, 2023

Conversation

ranchodeluxe
Copy link
Contributor

@ranchodeluxe ranchodeluxe commented Nov 6, 2023

Addresses #645

I can't find the deeper bug here to ticket with zarr or fsspec but potentially getting close

@ranchodeluxe ranchodeluxe added the test-integration Apply this label to run integration tests on a PR. label Nov 6, 2023
@ranchodeluxe
Copy link
Contributor Author

Super fascinating bug. I don't think this is actually minio in the end (for reasons I will explain later). Wondering if this is something with s3fs 🤔

@norlandrhagen
Copy link
Contributor

Oof, well thanks for digging into it!

@ranchodeluxe
Copy link
Contributor Author

ranchodeluxe commented Nov 7, 2023

Oof, well thanks for digging into it!

Breakdown

Here's some evidence for it not being just minio:

  1. I remove any dynamic paths here and hardcode a reusable one using pathlib.Path('/tmp')

  2. I comment out spinning up a pytest minio instnace and hardcode the secrets for my minio container running on my host (see next step) here

  3. I then run a dedicated minio container on my host:

docker run \
   -p 9000:9000 
   --name minio666 \
   -e "MINIO_ACCESS_KEY=YOURACCESSKEY" \
   -e "MINIO_SECRET_KEY=YOURSECRETKEY" \
   -v /tmp/:/data \                                  
   minio/minio server /data 
  1. I can ssh into that container and trace requests using:
mc alias set 'myminio' 'http://localhost:9000' 'YOURACCESSKEY' 'YOURSECRETKEY'
mc admin trace -v -a myminio
  1. Since the problem is intermittent I wait for a successful run to happen and check outputs on minio using mc ls <target>. If everything looks good I know the next runs should ALWAYS succeed b/c the lat key (for the noaa-oisst test specifically) exists already and we are reusing the hardcoded target from step 1 above for the next run

  2. More runs either succeed and fail intermittently even with this set up (pytest -s tests/test_integration.py -k 'test_integration[minio_confpath-noaa-oisst]' --run-integration --timeout=500)

  3. There is nothing in the verbose mc trace logging that shows bad PUT(s) or 500(s) or anything

  4. My hunch is if this was just minio we wouldn't be seeing the tests failing constantly for exactly the same key (which is lat for noaa-oisst test) but different keys across runs. I'm not seeing that. For the other test (gpcp) it's failing for the a different key constantly (time)

Proposals

  • Is there a reason these tests can't use an actually dedicated s3 or google storage bucket? Tests run fine against them

  • I imagine we don't want to use fsspec.implementations.local.LocalFileSystem for the tests b/c it's not really flexing what this project is about?

  • If we truly think it's just minio a possible work around here that will take more time is to do what pangeo-forge-runner does and set up a k3's instance and use minio there. But if it's not just minio it's all for nothing. Thoughts? 🤔

@ranchodeluxe
Copy link
Contributor Author

Proposals

Is there a reason these tests can't use an actually dedicated s3 or google storage bucket? Tests run fine against them

I imagine we don't want to use fsspec.implementations.local.LocalFileSystem for the tests b/c it's not really flexing what this project is about?

If we truly think it's just minio a possible work around here that will take more time is to do what pangeo-forge-runner does and set up a k3's instance and use minio there. But if it's not just minio it's all for nothing. Thoughts? 🤔

well, if I run minio as a service in k8s locally everything works fine so 🤷 looks like that's the way forward. won't get to that today though

@ranchodeluxe ranchodeluxe changed the title WIP: Add Back Assertion Use k3s on Integration Tests Nov 9, 2023
@ranchodeluxe
Copy link
Contributor Author

ranchodeluxe commented Nov 9, 2023

Wow, humbled. Things just got more interesting with these failures after the last commits

Cannot get it to fail locally with a similar set up

Looks like we cannot ignore the subtle interaction that is causing this and have to weed it out

@ranchodeluxe
Copy link
Contributor Author

ranchodeluxe commented Nov 10, 2023

It was s3fs the whole time!!!!!!! I had to disable cache settings and now everything works 😞 Will swap things back to the docker image and make sure that solves it there too

@ranchodeluxe ranchodeluxe changed the title Use k3s on Integration Tests Disable Cache on Integration Tests Nov 10, 2023
@ranchodeluxe ranchodeluxe changed the title Disable Cache on Integration Tests Disable fsspec Cache on Integration Tests Nov 10, 2023
Copy link
Contributor

@norlandrhagen norlandrhagen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@ranchodeluxe ranchodeluxe merged commit dc125ab into main Nov 13, 2023
8 checks passed
@ranchodeluxe ranchodeluxe deleted the gcorradini/fix/minio-tests branch November 13, 2023 16:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
test-integration Apply this label to run integration tests on a PR.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants