Crawl (remote) Maven repos #1071

Open
vorburger opened this issue Feb 9, 2025 · 3 comments

@vorburger

@cstamas is there something somewhere (in Mima, Maven, or elsewhere that you know of) to "crawl" e.g. https://repo1.maven.org/maven2/ ? I mean something which doesn't start with a specific Artifact, but which, given a (remote) Maven repo URL, just gives you "all the Artifacts on that repo".

If that does not exist, I guess I could also do it by hacking my very own HTTP Directory Index HTML page parser (probably using jsoup) ... but maybe there are already libraries which can do this in the Maven ecosystem which I am not aware of?
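
For reference, a hand-rolled directory index parser like the one described above could look roughly like the following. This is only a sketch under stated assumptions: it uses Python's standard-library `html.parser` rather than jsoup to stay dependency-free, the names `IndexLinkParser` and `list_index_entries` and the `repo.example.org` URL are made up for illustration, and the sample HTML is invented rather than copied from a real server. It parses a single, already-fetched page and does no network access.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class IndexLinkParser(HTMLParser):
    """Collects the href value of every <a> tag on a directory index page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def list_index_entries(base_url, html):
    """Return (subdirectories, files) as absolute URLs found on one index page."""
    parser = IndexLinkParser()
    parser.feed(html)
    dirs, files = [], []
    for href in parser.hrefs:
        # Skip parent links, column-sort links ("?C=N;O=D"), anchors,
        # and anything that is not relative to the current directory.
        if href.startswith(("../", "/", "?", "#")) or "://" in href:
            continue
        target = urljoin(base_url, href)
        # Index pages conventionally mark subdirectories with a trailing slash.
        (dirs if href.endswith("/") else files).append(target)
    return dirs, files


# Invented sample page (assumption: a typical Apache/nginx style listing).
SAMPLE = """
<html><body><pre>
<a href="../">../</a>
<a href="apache-maven/">apache-maven/</a>
<a href="archetype/">archetype/</a>
<a href="maven-metadata.xml">maven-metadata.xml</a>
</pre></body></html>
"""

# Hypothetical base URL, not a real repository.
BASE = "https://repo.example.org/maven2/org/apache/maven/"
dirs, files = list_index_entries(BASE, SAMPLE)
print(dirs)
print(files)
```

Walking a whole repository would mean repeating this for every returned subdirectory, which is exactly the crawling pattern that, per the reply below, gets IPs banned on Central. A sketch like this would only be appropriate against a repository whose operator permits listing.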

@cstamas

cstamas commented Feb 9, 2025

Ha! This debate was just renewed in the Maven PMC, as the state of affairs with the Maven repo part is a bit strange: it is held by a US company (history). We (the Maven PMC) do want access to Maven Central.

On the surface:

  • crawling is forbidden (your IP will be banned)
  • the published index is scarce, and may become even scarcer

So do not crawl.

There ARE simple workarounds:

[cstamas@blondie mima (main)]$ jbang toolbox@maveniverse
Toolbox 0.6.1 (MIMA Runtime 'standalone-static' version 2.4.21)

          Maven version 3.9.9
                Managed true
                Basedir /home/cstamas/Worx/maveniverse/mima
                Offline false

             MAVEN_HOME /home/cstamas/.sdkman/candidates/maven/current
           settings.xml /home/cstamas/.sdkman/candidates/maven/current/conf/settings.xml
         toolchains.xml /home/cstamas/.sdkman/candidates/maven/current/conf/toolchains.xml

              USER_HOME /home/cstamas/.m2
           settings.xml /home/cstamas/.m2/settings.xml
  settings-security.xml /home/cstamas/.m2/settings-security.xml
       local repository /home/cstamas/.m2/repository-oss

               PROFILES
                 Active [oss-development]
               Inactive []

    REMOTE REPOSITORIES
                        central (https://repo.maven.apache.org/maven2/, default, releases)
prompt> list org.apache.maven
org.apache.maven:apache-maven
org.apache.maven:archetype
...

Etc. But these are all (broken) circumventions; again, the Maven PMC is working on getting access to the whole of Central for exactly this purpose.
https://bsky.app/profile/brunoborges.bsky.social/post/3lexnq3cimc2e

@vorburger vorburger mentioned this issue Feb 16, 2025