Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add list-jdks and list-vendors commands #22

Merged
merged 25 commits into from
Aug 9, 2024

Conversation

ctrueden
Copy link
Contributor

@ctrueden ctrueden commented Aug 1, 2024

This patchset adds a command to list matching JDKs, and another to list available vendors.

I don't love it, so I dragged my feet before filing this PR, but the two new commands ostensibly work.

I already discussed these changes directly with @marktsuchida 1.5 years ago now... as a starting point for discussion, here is a transcript of that conversation:

@ctrueden wrote:

Here are some questions and thoughts:

  • Would it be desirable to be able to filter more criteria (vendors, os, arch) more flexibly? E.g. match vendors via regex rather than just specifying the desired one? Or is this overkill? Maybe just globs, e.g. to match zulu* or even just *? Right now the list-vendors command I wrote just unions together all the vendors from all os/arch into one set...
  • Right now cjdk list-jdks lists all available jdks for the conf (e.g. cjdk -j zulu:17+ list-jdks). But I wanted to make cjdk list-jdks --all do that, and have cjdk list-jdks no-args only list the ones already cached. Or I guess cjdk list-jdks --cached could do that; either way. But I think both these modes are useful.
  • Would it make sense to have an Index class to provide an object-oriented of the index, instead of needing to pass the raw nested dict object around to all the _index functions? (asks the Java programmer 😝)
  • Right now the _api.list_jdks function duplicates a little bit of code from _index.resolve_jdk_version, because the logic for filtering JDKs according to conf is not well-separated enough from the logic to resolve a single version. I split things up a little by making a _match_versions to return all whereas _match_version returns the best... but I didn't take it all the way. Would be good to think about an architectural adjustment to make filtering into its own standalone piece of the library.

If you'd prefer I could file a draft PR and paste in these comments, whatever you like.

You know what's really funny? I did this work as a rabbit hole so I could find out how to install Zulu JDK+FX with cjdk... and now that I've done it, and dug through the index, it looks like that version of the JDK isn't in there? Not that I can find, anyway. 😂

@marktsuchida: We could build our own index, or supplement it. Either by forking and extending the original one (it's Scala code) or upstreaming, or by making cjdk use multiple indexes.

@ctrueden: Supplementing the index would make sense to me... I'll see if anyone filed an issue in the Coursier repo.

@ctrueden: Heh, they don't have Issues. So yeah, either updating their code to include it, or making cjdk support unioning multiple indices. Eh, punting!

@marktsuchida wrote:

Some not-so-organized thoughts/opinions on your comments above (I haven't yet looked at your patch in detail):

  • Globs maybe; regex seems overkill to me and harder to use for the majority of expected use cases (and people can/should pipe to grep if they really need to). Maybe limit to the vendor part, until we see a clear need for anything more.
    • Examples: *-jdk (regular JDKs only), graalvm-* (GraalVM for any Java version), zulu*-fx (Zulu JRE or JDK but with OpenJFX; not currently available)
  • Is * the same as not giving a vendor? That seems natural when listing available JDKs, but the existing behavior is that empty vendor means adoptium (overridable). I think this suggests that matching expressions for listing are not necessarily the same concept as the JDK spec used by existing commands (as long as we preserve the current behavior for the latter).
  • If allowing globs also for the existing commands, we also need to define what happens when two JDKs match that differ only in vendor (my preference: print the ambiguous matches and fail).
  • I'm uncertain about allowing globs for OS and architecture. The main issue here is that there is already a pragmatic transformation to map OSs and architectures to canonical names, and it's not clear how this would work in the presence of globbing.
  • The current vendor:version spec syntax was specifically designed to result in zero or one unambiguous matches in all cases (no unbreakable ties). The only natural way to extend this to the filtering use case (that I can think of) is to select, in addition to the primary match, the JDKs that "would have matched had there not been a newer version that also matches".
    • Changing the meaning of the spec between commands doesn't seem like a very good idea.
    • But this (again) leads to slightly counterintuitive results when combined with globbing: 17 will only list adoptium (JRE only); you would need *:17 to list all vendors' Java 17 JREs and JDKs. (Though maybe that is not the end of the world.)

list-jdks

  • Defaulting to showing only installed JDKs and providing a flag to list all available in the index sounds good to me.
    • Maybe something more specific than --all (such as --available) would be good, because other criteria may come in to play later (such as whether or not to include system-installed JDKs that we detect)
  • I think I prefer ls over list-jdks, this being a command-line tool and this subcommand probably being one that will be frequently invoked
    • In which case also ls-vendors instead of list-vendors
    • Perhaps you were following the example of cache-jdk etc., but those are advanced or less frequently used subcommands
    • Later there may also be added ls-files and ls-packages
    • One could argue for renaming cache-jdk to just cache for consistency with ls
    • The Python API function can be list_jdks

Make Index a class?

  • I vaguely remember thinking about this but can't remember if there was any particular reason I decided against it.
  • Maybe the fact that it was hard to come up with a good set of primitive operations on it (it seemed necessary to either expose the internal structure or build all query logic into what I would like to be a pure-data object).
  • Still, giving it a name (so it can be used in type hints) may help with readability even if all internal data is exposed.
  • I don't know that binding functions with data is particularly necessary in Python, but I'm not opposed if it can be made to fit in well with the rest of the code and unit tests.
  • The other idea I had was to ingest all the data into an in-memory sqlite3 database and use proper queries on it. This might still make sense (and the db could even be cached on disk if it records the hash of the index from which it was created).

@ctrueden: Agreed about ls over list-jdks. This is also the command used by pass for example. (git uses ls-files but I just aliased it.)

@ctrueden
Copy link
Contributor Author

ctrueden commented Aug 1, 2024

August 2024 update of my thoughts relating to the above:

  1. I'm less concerned about making the JDK+FX version of Zulu available these days, since Fiji is on the cusp of migrating beyond Java 8, at which point JavaFX can be handled as its own regular dependency rather than bundled with the JDK.

  2. As discussed, I will change list-jdks to ls, and list-vendors to ls-vendors.

  3. I'll make ls default to showing only installed JDKs, and make ls --available list all available in the index.

  4. I'll look into adding cache as a shorthand for cache-jdk.

  5. For Index, for now I'll look into just type aliasing it—something like Index = dict and then using Index everywhere applicable for the type hint. That should help readability and static type analyzers while still keeping the code simple.

@ctrueden ctrueden force-pushed the listings branch 4 times, most recently from 6263565 to 5951468 Compare August 1, 2024 12:45
And simplify the existing _match_version function
by delegating to the new _match_versions.

Technically this is slower time-complexity-wise, but
it's also more Pythonic, with performance unlikely to
matter for the number of available JDKs in the index.
It's helpful to be able to query the index to know what is available.
* list-jdks -> ls
* list-vendors -> ls-vendors
@ctrueden ctrueden force-pushed the listings branch 2 times, most recently from 9a53465 to 2b0bca3 Compare August 1, 2024 13:01
And add some type hints around adjacent functions, too.

And rename _cached_index function to _cached_index_path
because it returns a Path, not an Index dict.
And let `ls --available` list all matches from the index.
@ctrueden
Copy link
Contributor Author

ctrueden commented Aug 1, 2024

Most things above are now done. Still to do:

  1. Lint the code.
  2. Change default vendor from adoptium to None, and use adoptium only with certain commands but not others: a single match is needed (e.g. for cache), then None should be late-resolved to adoptium, whereas when multiple matches are desired, None should mean "any vendor".

This patch is kind of horrible, but it gets the job done correctly
for more challenging version strings like 21.0.1+12_openj9-0.42.0.
@ctrueden ctrueden marked this pull request as ready for review August 2, 2024 04:19
@ctrueden
Copy link
Contributor Author

ctrueden commented Aug 2, 2024

I believe everything is addressed now. Probably simplest to just read through the patches to see how I solved various things. Happy to elaborate verbally here as needed. The CI should pass now as well.

Edit: I should probably add more tests, but... 😴

@ctrueden
Copy link
Contributor Author

ctrueden commented Aug 7, 2024

@marktsuchida What do you think? Is this a good direction?

@marktsuchida
Copy link
Collaborator

Yes! Thanks so much for this.
I want to make sure the docs and tests are kept up to date (maybe in this PR, maybe later, and I'm happy to help), but other than that I like this.
Were you planning to add more? If not, I'll plan to merge this shortly, after I take a closer look at some details (and remind myself how this whole thing works :).

@ctrueden
Copy link
Contributor Author

ctrueden commented Aug 7, 2024

I'll write some tests right now.

_api:
- _get_vendors
- _get_jdks

_conf:
- configure: fallback_to_default_vendor=False

_index:
- jdk_url: exact_version=non-None
- _postprocess_index
- _match_versions
* Add ls and ls-vendors commands.
* Rename cache-jdk command to cache.
@ctrueden
Copy link
Contributor Author

ctrueden commented Aug 8, 2024

@marktsuchida OK, all set. The tests are pretty basic, but do cover the new functions. Here's a code coverage report diff from pytest --cov on this branch versus main:

diff --git a/../before.txt b/../after.txt
index 2e11e0e..8547965 100644
--- a/../before.txt
+++ b/../after.txt
@@ -2,26 +2,26 @@
 Name                                      Stmts   Miss  Cover
 -------------------------------------------------------------
 src/cjdk/__init__.py                          3      0   100%
-src/cjdk/_api.py                             56     12    79%
+src/cjdk/_api.py                             96     17    82%
 src/cjdk/_cache.py                          128     11    91%
-src/cjdk/_conf.py                           113     13    88%
+src/cjdk/_conf.py                           114     13    89%
 src/cjdk/_download.py                        51      8    84%
-src/cjdk/_index.py                           81      7    91%
-src/cjdk/_install.py                         16      0   100%
+src/cjdk/_index.py                          106      8    92%
+src/cjdk/_install.py                         17      0   100%
 src/cjdk/_jdk.py                             33      0   100%
 src/cjdk/_progress.py                        62     26    58%
 src/cjdk/_version.py                         11      2    82%
 tests/mock_server.py                         80      0   100%
-tests/test_api.py                            68      0   100%
+tests/test_api.py                            97      0   100%
 tests/test_cache.py                          83      3    96%
 tests/test_cache_atomic_file.py             114     15    87%
 tests/test_cache_permanent_directory.py      54      3    94%
 tests/test_cjdk.py                           16      0   100%
-tests/test_conf.py                          107      0   100%
+tests/test_conf.py                          110      0   100%
 tests/test_download.py                       84      1    99%
-tests/test_index.py                          98      0   100%
+tests/test_index.py                         115      0   100%
 tests/test_install.py                        38      2    95%
 tests/test_jdk.py                            43      0   100%
 tests/test_mock_server.py                    25      0   100%
 -------------------------------------------------------------
-TOTAL                                      1364    103    92%
+TOTAL                                      1480    109    93%

I have no idea if I got the syntax right for the directives in the docs, but I followed existing patterns.

@marktsuchida
Copy link
Collaborator

marktsuchida commented Aug 9, 2024

Pushed a commit to clarify test results. I can fix the tests for macOS and Windows.

It's not an error for the versions to be empty unless we are trying to
resolve to a single version.
(Better if we avoid remote-dependent unit tests, or course, but I'll
leave that for another time :)
Also avoid printing an empty line when the result is empty.
@marktsuchida
Copy link
Collaborator

Thanks again, @ctrueden, for putting the work into this!

I've reviewed all your changes and pushed a few minor changes. The main one is to move the print-to-stdout from Python API to CLI (so that the API functions return the data to the caller instead).

The only other thing I might change is that I feel that ls-vendors/list_vendors should also honor the --os and --arch settings, including the defaults (i.e., current OS and arch). The main motivation is behavioral uniformity. Do you have any strong opinions against that? If not, I'll do it in a separate PR. If you have a use case for listing all vendors across OS/arch, we could allow something like --os=all.

@marktsuchida marktsuchida merged commit 59abae3 into cachedjdk:main Aug 9, 2024
6 checks passed
@marktsuchida marktsuchida linked an issue Aug 9, 2024 that may be closed by this pull request
@ctrueden ctrueden deleted the listings branch August 20, 2024 20:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

CLI feature: list available and installed JDKs
2 participants