Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] use graal native-image to create native images of jvm tools for performance increases in certain workflows #6893

Closed

Conversation

cosmicexplorer
Copy link
Contributor

@cosmicexplorer cosmicexplorer commented Dec 10, 2018

WIP, as it depends on:

Problem

Graal is a virtual machine with a very large variety of frontends, including JVM- and LLVM-based languages, as well as interpreted languages such as Python, JavaScript, and R. One of the many things it can do is create "native images" which are able to begin executing immediately, without any VM startup time (although the performance is limited to the capabilities of AOT compilation). In some experimentation, I found I was able to match the speed of a warm nailgunned scalafmt task on a large target set in our monorepo with an extremely naive parallelism strategy (invoking one native image process per target). Not having to maintain a warm nailgun (and the associated memory) has potential benefits on developer laptops, as well as CI machines, depending on how leaky the tool is, and the runtime environment. In an ephemeral environment of any sort (apparently kubernetes is an example of this), zero startup time may make it more scalable to lazily download JVM tools instead of maintaining state in any specific ephemeral node.

Solution

  • Add the GraalCE binary tool, which can transform a classpath into an AOT-compiled image with zero startup time.
  • Create a NativeImageExecutor for executing JVM code which creates a native image on the fly, then executes it hermetically.
  • Integrate native-image into the NailgunTaskBase implementation so that it can be selected using --execution-strategy=native-image, e.g. in pants.ini.
  • Add support for hermetic execution via native-image to the scalafmt task.

Followup

  • Use the v2 parallel product_request() API to perform parallel hermetic execution of all native-image subprocesses (ideally with an API to do this from NailgunTask).
  • Implement GraalNativeImageCreate to create native images out of any jvm_bundles, probably in a separate PR.
  • Fix the scalafmt task to only run on invalidated files, not all files in all invalidated targets, probably in a separate PR.
  • Define more clearly what workflows a graal-produced native image is generally more effective on, and document it.
  • Consider other similar tools such as OpenJDK's jaotc.

Result

Scalafmt can be invoked as a native-image, and the path is clear for other tools to do the same!

@cosmicexplorer cosmicexplorer force-pushed the graal-is-cool branch 3 times, most recently from 671fa8e to 767e2d7 Compare April 8, 2019 04:07
@cosmicexplorer cosmicexplorer changed the title [WIP] use graal to create native images of jvm tools and jvm_binaries for performance increases in certain workflows use graal native-image to create native images of jvm tools for performance increases in certain workflows Apr 8, 2019
@cosmicexplorer
Copy link
Contributor Author

This should be ready for review after removing many different parts from it. There might be a smaller surface to cut at, but I really wanted to have the end-to-end testing via scalafmt there along with this large new feature. I've made a project at https://github.com/pantsbuild/pants/projects/13 covering the larger arc of native-image work.

Copy link
Contributor

@illicitonion illicitonion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is super exciting! I have a couple of questions about the core abstractions here...

When we were adding the hermetic execution strategy, I started feeling kind of sketchy about it because in reality, each subclass needs to write specific code to handle the hermetic case. It looks like it's the same for native-image as a strategy, and that in a bunch of places we have fake implementations; if you specify --execution-strategy=native-image for a bunch of Tasks where you can, it will either do subprocess, or hermetic, or raise. Accordingly, I don't actually think --execution-strategy is a good flag to have on a common super-class. It feels like it should only be exposed by things that support it (which can use a common library class to implement it), rather than exposed by any JVM tool (where some things just don't implement some of the strategies)...

Though if we do keep these things in NailgunTaskBase, we should probably rename that to something like JvmToolTaskBase, because it's very not nailgun-specific any more.

Similarly, I'm not convinced that the JVM Executor and Runner abstractions are right to load this into; right now, for native image we simply ignore the arguments passed to the API methods; that feels like we're trying to force an implementation into an existing (but incorrect) abstraction; either we should be able to properly use the arguments to the methods, or we should not try to use the same API.

def generate_urls(self, version, host_platform):
system_id = self._SYSTEM_ID[host_platform.os_name]
archive_basename = self._ARCHIVE_BASE_FMT.format(version=version, system_id=system_id)
return [self._DIST_URL_FMT.format(version=version, base=archive_basename)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this interact with networks which can't access the public internet? Will this automatically fall back to some mirror, or is that code that needs writing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This uses the same resolution mechanism as other binary tools, in that if --no-allow-external-binary-tool-downloads is set, it will attempt to fetch it from --binaries-baseurls. The url generators are a convenience for open source users which allows someone to set e.g. --graal-version=1.0.0rc16 when that version is released in the future, without requiring any work on our part to allow that. We haven't yet upstreamed any script which syncs our pantsbuild s3 to another source, so people in other monorepos without public internet access will have to write that code for themselves.

] + list(native_image_config.extra_args)

pprinted_argv = safe_shlex_join(argv)
with temporary_dir() as tmp_dir:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You may want to specify the actual output dir as the root_dir here because it guarantees that the move you do later will be on the same device (so quick); people who have tempdirs on different partitions from their cachedir would otherwise need to incur a second disk write to do the move

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've now modified native-image execution to occur hermetically, so we avoid this entirely, but noting this for the future!


class NativeImageCreationError(Exception): pass

def produce_native_image(self, tool_classpath, main_class, native_image_config, context,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels like it could do with a unit test

text_type(root))


def digest_file_paths(paths, scheduler, root=None):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably have a unit test, as it's now a publicly exposed function

@cosmicexplorer
Copy link
Contributor Author

@illicitonion I agree with your high-level feedback and will remove native-image as an execution strategy and instead (I'm thinking) move this functionality into its own task mixin!

@cosmicexplorer cosmicexplorer changed the title use graal native-image to create native images of jvm tools for performance increases in certain workflows [WIP] use graal native-image to create native images of jvm tools for performance increases in certain workflows Apr 8, 2019
@cosmicexplorer
Copy link
Contributor Author

Marking this as WIP again until the above comments are addressed.

@cosmicexplorer
Copy link
Contributor Author

The large refactoring behind the current diff is on top of the commits from #7551 and #7552. This remains a WIP due to the remaining comments above.

./pants --no-process-execution-cleanup-local-dirs fmt.scalafmt --execution-strategy=native-image --files-per-process=2 --no-skip src/scala::

Also, remove the parallel execution stuff and make scalafmt work for real:
- had to delay initialization of a few classes until runtime for the scalafmt cli to stop making paths
relative to the native-image build dir
- made a `.native_image_config` property to specify these delayed initializations
- make native-image creation output to a workunit
- add docstrings
@Eric-Arellano
Copy link
Contributor

Closing due to being stale (pretty big merge conflict) - although GraalVM is awesome and it would be neat to see this be revived.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants