Fix some inconsistancies with the maven scheme #2149

Open
DavyLandman opened this issue Feb 18, 2025 · 1 comment

Comments

@DavyLandman
Member

DavyLandman commented Feb 18, 2025

Current Design

The current mvn scheme works as follows: |mvn:///<groupId>!<artifactId>!<version>|. This points to a jar file in the local .m2 repository.

Here are the current design choices:

  • the root loc (so no path) acts as an alias to |home:///.m2/repository/<groupId>/<artifactId>/<version>/<artifactId>-<version>.jar|
  • a loc with a path acts as an alias to |jar+home:///.m2/repository/<groupId>/<artifactId>/<version>/<artifactId>-<version>.jar!/<subpath>|
  • because the root path could be either a folder or a file, the two are distinguished as follows:
    • |mvn:///<groupId>!<artifactId>!<version>| is a file (always a jar file)
    • |mvn:///<groupId>!<artifactId>!<version>/!| is a folder (always inside of a jar file)
    • |mvn:///<groupId>!<artifactId>!<version>/!someFile| == |mvn:///<groupId>--<artifactId>--<version>/someFile|
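
The aliasing rules above can be sketched as a small function. This is only an illustration in Python, not the actual resolver: the name resolve_mvn and the string representation of locations are assumptions, and the repository path follows the standard Maven layout (dots in the groupId become folders, and each version has its own folder).

```python
def resolve_mvn(identifier: str, subpath: str = "") -> str:
    """Map '<groupId>!<artifactId>!<version>' plus an optional subpath to its alias.

    Hypothetical helper for illustration; the real resolver lives in the Rascal runtime.
    """
    parts = identifier.split("!")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected <groupId>!<artifactId>!<version>, got {identifier!r}")
    group_id, artifact_id, version = parts

    # Assumption: standard Maven repository layout.
    jar = (
        "home:///.m2/repository/"
        f"{group_id.replace('.', '/')}/{artifact_id}/{version}/"
        f"{artifact_id}-{version}.jar"
    )

    if not subpath:
        # Root loc: the jar file itself.
        return jar
    # Loc with a path: an entry inside the jar.
    return f"jar+{jar}!/{subpath.lstrip('/')}"


print(resolve_mvn("org.rascalmpl!rascal!0.40.17"))
print(resolve_mvn("org.rascalmpl!rascal!0.40.17", "org/rascalmpl"))
```

The sketch makes the file/folder split visible: the same identifier yields a plain file alias without a subpath and a jar+ alias with one.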

context:

  • see the discussion about the design of this URI in "New mvn scheme to replace lib scheme." #1916:
    • The characters available in the authority of a URI are limited. Other tools have used : and ! to separate the parts of the maven identifier, but we cannot do that freely in an authority, where such a character can have a special meaning.
  • the actual PR: "Introduced a new resolver for the mvn:/// scheme." #1962
    • in the end we went for ! as a separator (we also tried --)
    • we tested the separator against the entire mvn grand central as it was in 2015, to make sure that no known groupIds or artifactIds contain !. The -- separator is also safe for the current transitive dependencies of rascal and any of its libraries (flybytes).
    • we chose not to do |mvn://<groupId>/<artifactId>/<version>| as the intent was to model how the project scheme worked.
    • we only want to access local files, not remote maven repositories.
    • the mvn scheme should parallel the design of the project:/// scheme as much as possible, where the authority defines the root of a project folder and what is in it is the contents of the project.
    • the mvn locations should be useful in both PathConfig.srcs and PathConfig.libs
      • srcs: folders with Rascal source files
      • libs: folders with Rascal source and .tpl files, and references to locs that an IClassLoaderResolver can handle to produce URLClassLoaders for use in the SourceLocationClassLoader implementation.
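
The separator check mentioned above can be pictured with a short sketch. It is not the script that was actually run: the function name unsafe_coordinates and the sample coordinates are made up for illustration (only org.rascalmpl / rascal appears in this issue).

```python
from typing import Iterable, List, Tuple

Coordinate = Tuple[str, str]  # (groupId, artifactId)


def unsafe_coordinates(coords: Iterable[Coordinate], separator: str) -> List[Coordinate]:
    """Return the coordinates whose groupId or artifactId contains the separator."""
    return [(g, a) for g, a in coords if separator in g or separator in a]


# Illustrative sample only; not the data set used for the real check.
sample = [
    ("org.rascalmpl", "rascal"),
    ("com.example", "plain-name"),
    ("com.example", "odd--name"),  # hypothetical artifact that would clash with --
]

print(unsafe_coordinates(sample, "!"))
print(unsafe_coordinates(sample, "--"))
```

A separator is safe for a given set of coordinates when the returned list is empty.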

Challenges with current design

I have some concerns with the current design:

  1. the root acts like a file, even though we can go into sub directories:
    1. this is visible by isDirectory(|mvn:///org.rascalmpl!rascal!0.40.17|) returning false and .ls on that repo throwing an IO exception
    2. however, isDirectory(|mvn:///org.rascalmpl!rascal!0.40.17/org|) returns true and allows .ls.
  2. In all other rascal locs the exclamation mark ! is used to denote going into a compressed file, for example |jar+file:///a/file.jar!/path/in/that/file|. Now we have 3 of them in the authority. Since the root file points to a jar, one way to make it work consistently is |jar+mvn:///org.rascalmpl!rascal!0.40.17/!/|, which does support .ls.
  3. It's not possible to ask: what are all the local versions of rascal in my m2? This is not why the scheme was added, but (together with points 1 and 2) it does make it harder to add auto-complete support in the REPL without a lot of custom support just for this scheme.
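
For comparison, the question in point 3 is easy to answer against the plain folder structure of a local repository. The sketch below assumes the standard Maven layout; the function name local_versions is hypothetical, and the repository root is passed in explicitly rather than looked up.

```python
from pathlib import Path
from typing import List


def local_versions(repo_root: Path, group_id: str, artifact_id: str) -> List[str]:
    """List the versions of an artifact that have a jar in the given repository folder."""
    # Assumption: dots in the groupId map to nested folders.
    artifact_dir = repo_root.joinpath(*group_id.split("."), artifact_id)
    if not artifact_dir.is_dir():
        return []
    # Sorted as plain strings, not as semantic versions.
    return sorted(
        d.name
        for d in artifact_dir.iterdir()
        if d.is_dir() and (d / f"{artifact_id}-{d.name}.jar").is_file()
    )
```

A call such as local_versions(Path.home() / ".m2" / "repository", "org.rascalmpl", "rascal") would be the kind of listing an auto-complete feature needs, which the current scheme cannot express as a directory listing.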

Suggested alternatives

If we agree that these challenges are something we want to fix, I see the following options we could consider:

  1. we only store the groupId in the authority (this has been discussed in both "New mvn scheme to replace lib scheme." #1916 and "Introduced a new resolver for the mvn:/// scheme." #1962), for example: |mvn://org.rascalmpl/rascal/0.40.17|. Here isFile is only true if the loc has both an artifactId and a version; otherwise it behaves like a directory.
    1. We could then consider making ! a feature that goes into a jar, but only if the ! is typed, for example: |mvn://org.rascalmpl/rascal/0.40.17/!/org|.
    2. I think that jar+mvn:///..!/ is actually more consistent.
  2. we make mvn:/// an alias to the .m2/repository/ folder (just like |home:///| behaves). A valid location would then look like |mvn:///org/rascalmpl/rascal/0.40.17/rascal-0.40.17.jar|. It would still be stable, but a bit more verbose. The benefit is that you can even provide auto-completion once a user has typed /org/.
  3. mvn always points to a file; if you want to "dive in" you have to use jar+mvn (example: |jar+mvn://org.rascalmpl!rascal!0.40.17/!/org|). This is the least invasive option, but it wouldn't allow for auto-complete (at least not in the options of 3).
  4. We can remove the "file" interpretation for mvn://groupId--artifactId--version/ and always make it list the contents of the jar:
    • the only reason to have the file interpretation is for the classloader, so if we add a simple IClassLoaderLocationResolver to the mvn resolver, it can produce the proper URLClassLoader for the root path of an authority as it should.
    • all the other uses consider the root authority to be an unpacked jar in this way, even .list and .listEntries.
    • this still parallels the project URI and clearly separates the jar file (authority) from its contents (path).
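
To make the file/directory rule of option 1 concrete, here is a minimal sketch. It is an assumption-laden illustration rather than a proposed implementation: the function name classify and the "jar-entry" label are invented here, and anything past the version segment is left undecided because it depends on the jar contents.

```python
from urllib.parse import urlsplit


def classify(loc: str) -> str:
    """Classify an option-1 style loc (groupId in the authority) without touching disk."""
    parts = urlsplit(loc)
    if parts.scheme != "mvn" or not parts.netloc:
        raise ValueError(f"expected mvn://<groupId>/..., got {loc!r}")

    # Path segments after the authority: artifactId, version, then jar contents.
    segments = [s for s in parts.path.split("/") if s]
    if len(segments) < 2:
        return "directory"  # groupId only, or groupId plus artifactId
    if len(segments) == 2:
        return "file"  # artifactId and version present: the jar itself
    return "jar-entry"  # inside the jar; file or folder depends on its contents


for example in (
    "mvn://org.rascalmpl",
    "mvn://org.rascalmpl/rascal",
    "mvn://org.rascalmpl/rascal/0.40.17",
    "mvn://org.rascalmpl/rascal/0.40.17/!/org",
):
    print(example, "->", classify(example))
```

With this shape, listing |mvn://org.rascalmpl/rascal| would naturally yield the locally available versions, which addresses the auto-complete concern.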

DavyLandman changed the title from "Reconsider maven scheme: use path for artifacts & versions" to "Fix some inconsistancies with the maven scheme" on Feb 19, 2025

@jurgenvinju
Member

Thanks for the analysis! Let's talk about this face-to-face; the confusion (serious bug!) between folders and files has started some other confusions.

This scheme is supposed to work like all the other schemes that hide jar files under a thin layer of abstraction (like plugin, bundle, bundleresource, etc) and parallel to the project scheme, like they are.

The goal for all schemes is to have the root of the abstracted filesystem coincide with the root path of the URI scheme. This all prevents people from having to write specific code for specific schemes (like scattered and tangled calls to jarify and unjarify, or having to split the path manually or using the relativize function).
