Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

vsimem: Implicitly create parent directories when creating a file #8369

Merged
merged 13 commits into from
Oct 29, 2023

Conversation

dbaston
Copy link
Member

@dbaston dbaston commented Sep 8, 2023

What does this PR do?

Resolves #8232 by implicitly creating parent directories when needed. A side-effect is that it is no longer possible to have a a file /vsimem/foo and a file /vsimem/foo/baz. Some test updates are included to accommodate this.

What are related issues/pull requests?

#8232

Tasklist

  • Review
  • Adjust for comments
  • All CI builds and checks have passed

@dbaston
Copy link
Member Author

dbaston commented Sep 8, 2023

The test failure in gdrivers/jp2openjpeg.py::test_jp2openjpeg_45 is a bit beyond me. It looks like the following call on line 2549

    out_ds = gdaltest.jp2openjpeg_drv.CreateCopy(
        "/vsimem/jp2openjpeg_45.jp2",
        src_ds,
        options=["GMLJP2V2_DEF=" + json.dumps(conf)],
    )

now creates a directory /vsimem/gmljp2 that was previously not created, and this causes the assert on 2578 to fail.

@rouault
Copy link
Member

rouault commented Sep 8, 2023

The test failure in gdrivers/jp2openjpeg.py::test_jp2openjpeg_45 is a bit beyond me.

fix for that and similar (hidden) issues in dbaston#1

I basically grepped "/vsimem/" in the code base an looked for "/vsimem/foo/..." type of filenames . In most cases a simple replacement from "/vsimem/foo/bar" to "/vsimem/foo_bar" does it (for single files). A few particular cases for the GMLJP2 stuff and the WFS driver that actually did a RmdirRecursive(). In some cases anonymous vsimem files could also be created to simplify things

@rouault
Copy link
Member

rouault commented Sep 8, 2023

@dbaston It could be appropriate to add a note in MIGRATION_GUIDE.TXT to describe the new behaviour of /vsimem/ . It could cause some issues for external code doing very_long_loop_on(i) : create "/vsimem/{i}/foo" ; unlink "/vsimem/{i}/foo" with unbounded creation of /vsimem/{i} directories

@rouault
Copy link
Member

rouault commented Sep 8, 2023

Actually I'm wondering if GDAL drivers & components shouldn't write in a /vsimem/gdal/ directory to limit polluting the user /vsimem/ space (possibly with a special behaviour of ReadDir("/vsimem/") hidding the "_gdal" subdirectory ?), and autotest could check that the content of /vsimem/gdal/ is empty when tests have tear down, to detect failure of GDAL code to clean after itself

@dbaston dbaston force-pushed the vsimem-implicit-dir-create branch from 975e102 to 9a69968 Compare October 2, 2023 15:44
@dbaston
Copy link
Member Author

dbaston commented Oct 3, 2023

So it appears that all necessary test adjustments have been made, but:

  1. PLScenes tests were fixed by adding a magic query parameter ?fortest=true that is handled by that driver's HTTP request code
  2. ElasticSearch tests were fixed by a similar method, except that the name of the magic query parameter is controlled by a new option CPL_CURL_VSIMEM_DEFAULT_QUERY_PARAMETERS that is understood by CPLHTTPFetchEx.
  3. PLMosaic tests were fixed by switching from responses written in /vsimem to a webserver that is backed by a combination of in-memory responses (served via a dictionary) with a fallback to a VSI location. This fallback was needed because we need to write tiff files that are later requested by the driver.

Is it worth consolidating on a single strategy here? As much as I'd like to be done here, it seems worth raising 😅. I think #3 is the only strategy that can work for all drivers.

@rouault
Copy link
Member

rouault commented Oct 4, 2023

I think #3 is the only strategy that can work for all drivers.

yes, could be nice to consolidate that on that. And potentially remove /vsimem/ special processing in CPLHTTPFetchEx() (but we might use that in other tests) and int PLScenes driver that was done only for tests.
That can be for another pull request if you prefer

@dbaston
Copy link
Member Author

dbaston commented Oct 10, 2023

a webserver that is backed by a combination of in-memory responses (served via a dictionary) with a fallback to a VSI location.

Of course, this is more complicated than I envisioned -- for ElasticSearch, we need to correctly handle different HTTP verbs (vsimem CUSTOMREQUEST and POSTFIELDS functionality). I am working through the ElasticSearch tests to try and do this, but finding myself wondering if this is actually better than hitting a live ElasticSearch server, as is done for Postgres, MySQL, etc.

@rouault
Copy link
Member

rouault commented Oct 10, 2023

for ElasticSearch, we need to correctly handle different HTTP verbs (vsimem CUSTOMREQUEST and POSTFIELDS functionality).

/vsis3/ deals with different HTTP verbs

if this is actually better than hitting a live ElasticSearch server

Both approaches have their advantages:

  • stubbed server: easy for anyone to run the test suite, can test behaviour related to erroneous/hostile responses from server
  • real server: checks real interoperability (for ElasticSearch, my memories is that it is quite hard to have a reliable server using Docker image due it to be Java based with some important RAM requirements)

@dbaston dbaston force-pushed the vsimem-implicit-dir-create branch from 1d0f874 to 0205fec Compare October 13, 2023 14:30
@coveralls
Copy link
Collaborator

coveralls commented Oct 13, 2023

Coverage Status

coverage: 67.844% (+0.003%) from 67.841% when pulling 6754d5a on dbaston:vsimem-implicit-dir-create into b8bba1b on OSGeo:master.

@dbaston dbaston force-pushed the vsimem-implicit-dir-create branch 3 times, most recently from bdcc820 to 9abf1d4 Compare October 19, 2023 13:14
@rouault
Copy link
Member

rouault commented Oct 27, 2023

@dbaston Is it still WIP ?

@dbaston dbaston force-pushed the vsimem-implicit-dir-create branch from db09285 to bc566c5 Compare October 28, 2023 00:19
@dbaston dbaston force-pushed the vsimem-implicit-dir-create branch from bc566c5 to 6754d5a Compare October 28, 2023 01:14
@dbaston dbaston changed the title vsimem: Implicitly create parent directories when creating a file [WIP] vsimem: Implicitly create parent directories when creating a file Oct 28, 2023
@dbaston
Copy link
Member Author

dbaston commented Oct 28, 2023

I think it can be considered ready.

@rouault rouault merged commit 593b51a into OSGeo:master Oct 29, 2023
30 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

vsimem: Files can be created in non-existing directories but then cannot be found
3 participants