-
Notifications
You must be signed in to change notification settings - Fork 2.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Gradle: @CacheableTask
causes build/test errors / re-think caching approach
#30852
Labels
Comments
snazy
added a commit
to snazy/nessie
that referenced
this issue
Feb 3, 2023
See [Quarkus issue 30852](quarkusio/quarkus#30852), both the actual bug and the fact that the cached output of the `GradleBuild` task is huge.
snazy
added a commit
to projectnessie/nessie
that referenced
this issue
Feb 3, 2023
See [Quarkus issue 30852](quarkusio/quarkus#30852), both the actual bug and the fact that the cached output of the `GradleBuild` task is huge.
This would probably be fixed by ensuring the data is written in sorted order (although I have no checked) |
snazy
added a commit
to snazy/quarkus
that referenced
this issue
Feb 4, 2023
TODO: find a way to allow caching "large" uber-jars and native runnables (for those who want that) * Removes the huge `lib/` folder from the output-directory of the `QuarkusBuild` task to reduce the size of the cache entity in Gradle's build cache. * Introduce a new `QuarkusBuildLibs` task to produce the `lib/` folder and its content. This task's "up-to-date" checks works, but it is intentionally not cacheable (it's huge). Re-creating the contents of the `lib/` folder is rather cheap and the dependency artifacts are mangaged by the local artifact repository or via the project's build artifacts (jars generated by other modules of the same build). * Introduce a new `QuarkusBuildFinish` task to combine the output of the "`lib/`-less" `QuarkusBuild` and the output of the `QuarkusBuildLibs` tasks, so the result is the same as before this change. Other, related changes: * The `QuarkusBuild` task intentionally removes outputs for other package types than the currently configured one. This makes up-to-date checks work across multiple package types. Other notes: * The task names `quarkusLibsBuild` and `quarkusFinishBuild` are intentionally "that way around". Letting the names of these tasks begin with `quarkusBuild...` could confuse users, who use abbreviated task names on the command line (for example `./gradlew qB` is automagically expanded to `./gradlew quarkusBuild`, which would become ambiguous with `quarkusBuildLibs` and `quarkusBuildFinish`). Relates to: quarkusio#30852
snazy
added a commit
to snazy/quarkus
that referenced
this issue
Feb 4, 2023
TODO: find a way to allow caching "large" uber-jars and native runnables (for those who want that) * Removes the huge `lib/` folder from the output-directory of the `QuarkusBuild` task to reduce the size of the cache entity in Gradle's build cache. * Introduce a new `QuarkusBuildLibs` task to produce the `lib/` folder and its content. This task's "up-to-date" checks works, but it is intentionally not cacheable (it's huge). Re-creating the contents of the `lib/` folder is rather cheap and the dependency artifacts are mangaged by the local artifact repository or via the project's build artifacts (jars generated by other modules of the same build). * Introduce a new `QuarkusBuildFinish` task to combine the output of the "`lib/`-less" `QuarkusBuild` and the output of the `QuarkusBuildLibs` tasks, so the result is the same as before this change. Other, related changes: * The `QuarkusBuild` task intentionally removes outputs for other package types than the currently configured one. This makes up-to-date checks work across multiple package types. Other notes: * The task names `quarkusLibsBuild` and `quarkusFinishBuild` are intentionally "that way around". Letting the names of these tasks begin with `quarkusBuild...` could confuse users, who use abbreviated task names on the command line (for example `./gradlew qB` is automagically expanded to `./gradlew quarkusBuild`, which would become ambiguous with `quarkusBuildLibs` and `quarkusBuildFinish`). Relates to: quarkusio#30852
snazy
added a commit
to snazy/quarkus
that referenced
this issue
Feb 4, 2023
TODO: find a way to allow caching "large" uber-jars and native runnables (for those who want that) * Removes the huge `lib/` folder from the output-directory of the `QuarkusBuild` task to reduce the size of the cache entity in Gradle's build cache. * Introduce a new `QuarkusBuildLibs` task to produce the `lib/` folder and its content. This task's "up-to-date" checks works, but it is intentionally not cacheable (it's huge). Re-creating the contents of the `lib/` folder is rather cheap and the dependency artifacts are mangaged by the local artifact repository or via the project's build artifacts (jars generated by other modules of the same build). * Introduce a new `QuarkusBuildFinish` task to combine the output of the "`lib/`-less" `QuarkusBuild` and the output of the `QuarkusBuildLibs` tasks, so the result is the same as before this change. Other, related changes: * The `QuarkusBuild` task intentionally removes outputs for other package types than the currently configured one. This makes up-to-date checks work across multiple package types. Other notes: * The task names `quarkusLibsBuild` and `quarkusFinishBuild` are intentionally "that way around". Letting the names of these tasks begin with `quarkusBuild...` could confuse users, who use abbreviated task names on the command line (for example `./gradlew qB` is automagically expanded to `./gradlew quarkusBuild`, which would become ambiguous with `quarkusBuildLibs` and `quarkusBuildFinish`). Relates to: quarkusio#30852
snazy
added a commit
to snazy/quarkus
that referenced
this issue
Feb 4, 2023
TODO: find a way to allow caching "large" uber-jars and native runnables (for those who want that) * Removes the huge `lib/` folder from the output-directory of the `QuarkusBuild` task to reduce the size of the cache entity in Gradle's build cache. * Introduce a new `QuarkusBuildLibs` task to produce the `lib/` folder and its content. This task's "up-to-date" checks works, but it is intentionally not cacheable (it's huge). Re-creating the contents of the `lib/` folder is rather cheap and the dependency artifacts are mangaged by the local artifact repository or via the project's build artifacts (jars generated by other modules of the same build). * Introduce a new `QuarkusBuildFinish` task to combine the output of the "`lib/`-less" `QuarkusBuild` and the output of the `QuarkusBuildLibs` tasks, so the result is the same as before this change. Other, related changes: * The `QuarkusBuild` task intentionally removes outputs for other package types than the currently configured one. This makes up-to-date checks work across multiple package types. Other notes: * The task names `quarkusLibsBuild` and `quarkusFinishBuild` are intentionally "that way around". Letting the names of these tasks begin with `quarkusBuild...` could confuse users, who use abbreviated task names on the command line (for example `./gradlew qB` is automagically expanded to `./gradlew quarkusBuild`, which would become ambiguous with `quarkusBuildLibs` and `quarkusBuildFinish`). Relates to: quarkusio#30852
snazy
added a commit
to snazy/quarkus
that referenced
this issue
Feb 5, 2023
Currently the `QuarkusBuild` task implementation adds even large build artifacts and unmodified dependency jars to Gradle's build cache. This pollutes the Gradle build cache quite a lot and therefore lets archived build caches become unnecessary huge. This change updates the build logic to fix this behavior by not adding dependencies and large build artifacts, like uber-jar and native binary, to the Gradle build cache. The `QuarkusBuild` task has been split into three tasks: 1. a new `QuarkusBuildDependencies` task that only collects the contents for `build/quarkus-app/lib` 2. a new `QuarkusBuildApp` task that builds the Quarkus app, which was previously done by the `QuarkusBuild` task 3. the `QuarkusBuild` task now combines the outputs of the above two tasks `QuarkusBuildDependencies` (named 'quarkusDependenciesBuild`) is not cacheable, because it only collects dependencies, which come either from a repository (and are already available locally elsewhere) or are built by other Gradle tasks. This task is only executed if the configured Quarkus package type requires the "quarkus-app" directory (`fast-jar` + `native`). It's "build working directory" is `build/quarkus-build/dep`. `QuarkusBuildApp` (named `quarkusAppBuild`) builds the Quarkus application. The contents of the "quarkus-app" directory _excluding_ the `lib/` directory are cacheable, which is the default for CI environments. Non-CI environments still cache all outputs, even uber-jars and native binaries to retain the existing behavior for developers and keep build turn-around times low. CI environments can opt-in to add even huge artifacts to Gradle's build cache by explicitly setting the `cacheUberAndNativeRunners` property in the Quarkus extension to `true`. It's "build working directory" is `build/quarkus-build/app`. Since `QuarkusBuild` only combines the outputs of the above two tasks, the same "CI vs local" caching behavior as for the `QuarkusBuildApp` task applies. To make "up to date" checks (kind of) more reliable, all outputs are removed first. This means, that for example an existing uber-jar in `build/` will disappear, when the build's package type is "fast-jar". This behavior can be disabled by setting the `cleanupBuildOutput` property on the Quarkus extension to `false`. The task names `quarkusDependenciesBuild` and `quarkusAppBuild` are intentionally "that way around". Letting the names of these tasks begin with `quarkusBuild...` could confuse users, who use abbreviated task names on the command line (for example `./gradlew qB` is automagically expanded to `./gradlew quarkusBuild`, which would become ambiguous with `quarkusBuildDependencies` and `quarkusBuildApp`). Relates to: quarkusio#30852
snazy
added a commit
to snazy/quarkus
that referenced
this issue
Feb 5, 2023
Currently the `QuarkusBuild` task implementation adds even large build artifacts and unmodified dependency jars to Gradle's build cache. This pollutes the Gradle build cache quite a lot and therefore lets archived build caches become unnecessary huge. This change updates the build logic to fix this behavior by not adding dependencies and large build artifacts, like uber-jar and native binary, to the Gradle build cache. The `QuarkusBuild` task has been split into three tasks: 1. a new `QuarkusBuildDependencies` task that only collects the contents for `build/quarkus-app/lib` 2. a new `QuarkusBuildApp` task that builds the Quarkus app, which was previously done by the `QuarkusBuild` task 3. the `QuarkusBuild` task now combines the outputs of the above two tasks `QuarkusBuildDependencies` (named 'quarkusDependenciesBuild`) is not cacheable, because it only collects dependencies, which come either from a repository (and are already available locally elsewhere) or are built by other Gradle tasks. This task is only executed if the configured Quarkus package type requires the "quarkus-app" directory (`fast-jar` + `native`). It's "build working directory" is `build/quarkus-build/dep`. `QuarkusBuildApp` (named `quarkusAppBuild`) builds the Quarkus application. The contents of the "quarkus-app" directory _excluding_ the `lib/` directory are cacheable, which is the default for CI environments. Non-CI environments still cache all outputs, even uber-jars and native binaries to retain the existing behavior for developers and keep build turn-around times low. CI environments can opt-in to add even huge artifacts to Gradle's build cache by explicitly setting the `cacheUberAndNativeRunners` property in the Quarkus extension to `true`. It's "build working directory" is `build/quarkus-build/app`. Since `QuarkusBuild` only combines the outputs of the above two tasks, the same "CI vs local" caching behavior as for the `QuarkusBuildApp` task applies. To make "up to date" checks (kind of) more reliable, all outputs are removed first. This means, that for example an existing uber-jar in `build/` will disappear, when the build's package type is "fast-jar". This behavior can be disabled by setting the `cleanupBuildOutput` property on the Quarkus extension to `false`. The task names `quarkusDependenciesBuild` and `quarkusAppBuild` are intentionally "that way around". Letting the names of these tasks begin with `quarkusBuild...` could confuse users, who use abbreviated task names on the command line (for example `./gradlew qB` is automagically expanded to `./gradlew quarkusBuild`, which would become ambiguous with `quarkusBuildDependencies` and `quarkusBuildApp`). Relates to: quarkusio#30852
snazy
added a commit
to snazy/quarkus
that referenced
this issue
Feb 5, 2023
Currently the `QuarkusBuild` task implementation adds even large build artifacts and unmodified dependency jars to Gradle's build cache. This pollutes the Gradle build cache quite a lot and therefore lets archived build caches become unnecessary huge. This change updates the build logic to fix this behavior by not adding dependencies and large build artifacts, like uber-jar and native binary, to the Gradle build cache. The `QuarkusBuild` task has been split into three tasks: 1. a new `QuarkusBuildDependencies` task that only collects the contents for `build/quarkus-app/lib` 2. a new `QuarkusBuildApp` task that builds the Quarkus app, which was previously done by the `QuarkusBuild` task 3. the `QuarkusBuild` task now combines the outputs of the above two tasks `QuarkusBuildDependencies` (named 'quarkusDependenciesBuild`) is not cacheable, because it only collects dependencies, which come either from a repository (and are already available locally elsewhere) or are built by other Gradle tasks. This task is only executed if the configured Quarkus package type requires the "quarkus-app" directory (`fast-jar` + `native`). It's "build working directory" is `build/quarkus-build/dep`. `QuarkusBuildApp` (named `quarkusAppBuild`) builds the Quarkus application. The contents of the "quarkus-app" directory _excluding_ the `lib/` directory are cacheable, which is the default for CI environments. Non-CI environments still cache all outputs, even uber-jars and native binaries to retain the existing behavior for developers and keep build turn-around times low. CI environments can opt-in to add even huge artifacts to Gradle's build cache by explicitly setting the `cacheUberAndNativeRunners` property in the Quarkus extension to `true`. It's "build working directory" is `build/quarkus-build/app`. Since `QuarkusBuild` only combines the outputs of the above two tasks, the same "CI vs local" caching behavior as for the `QuarkusBuildApp` task applies. To make "up to date" checks (kind of) more reliable, all outputs are removed first. This means, that for example an existing uber-jar in `build/` will disappear, when the build's package type is "fast-jar". This behavior can be disabled by setting the `cleanupBuildOutput` property on the Quarkus extension to `false`. The task names `quarkusDependenciesBuild` and `quarkusAppBuild` are intentionally "that way around". Letting the names of these tasks begin with `quarkusBuild...` could confuse users, who use abbreviated task names on the command line (for example `./gradlew qB` is automagically expanded to `./gradlew quarkusBuild`, which would become ambiguous with `quarkusBuildDependencies` and `quarkusBuildApp`). Relates to: quarkusio#30852
snazy
added a commit
to snazy/quarkus
that referenced
this issue
Feb 5, 2023
Currently the `QuarkusBuild` task implementation adds even large build artifacts and unmodified dependency jars to Gradle's build cache. This pollutes the Gradle build cache quite a lot and therefore lets archived build caches become unnecessary huge. This change updates the build logic to fix this behavior by not adding dependencies and large build artifacts, like uber-jar and native binary, to the Gradle build cache. The `QuarkusBuild` task has been split into three tasks: 1. a new `QuarkusBuildDependencies` task that only collects the contents for `build/quarkus-app/lib` 2. a new `QuarkusBuildApp` task that to collect everything else from a Quarkus build (everything except the `build/quarkus-app/lib`) 3. the `QuarkusBuild` task now combines the outputs of the above two tasks `QuarkusBuildDependencies` (named 'quarkusDependenciesBuild`) is not cacheable, because it only collects dependencies, which come either from a repository (and are already available locally elsewhere) or are built by other Gradle tasks. This task is only executed if the configured Quarkus package type requires the "quarkus-app" directory (`fast-jar` + `native`). It's "build working directory" is `build/quarkus-build/dep`. `QuarkusBuildApp` (named `quarkusAppBuild`) collects the contents of the "quarkus-app" directory _excluding_ the `lib/` directory are cacheable, which is the default for CI environments. Non-CI environments still cache all outputs, even uber-jars and native binaries to retain the existing behavior for developers and keep build turn-around times low. CI environments can opt-in to add even huge artifacts to Gradle's build cache by explicitly setting the `cacheUberAndNativeRunners` property in the Quarkus extension to `true`. It's "build working directory" is `build/quarkus-build/app`. Since `QuarkusBuild` only combines the outputs of the above two tasks, the same "CI vs local" caching behavior as for the `QuarkusBuildApp` task applies. To make "up to date" checks (kind of) more reliable, all outputs are removed first. This means, that for example an existing uber-jar in `build/` will disappear, when the build's package type is "fast-jar". This behavior can be disabled by setting the `cleanupBuildOutput` property on the Quarkus extension to `false`. Both `QuarkusBuildDependencies` and `QuarkusBuildApp` can trigger an actual Quarkus application build. That Quarkus app build will only be triggered when needed and only once per Gradle build. The task names `quarkusDependenciesBuild` and `quarkusAppBuild` are intentionally "that way around". Letting the names of these tasks begin with `quarkusBuild...` could confuse users, who use abbreviated task names on the command line (for example `./gradlew qB` is automagically expanded to `./gradlew quarkusBuild`, which would become ambiguous with `quarkusBuildDependencies` and `quarkusBuildApp`). Relates to: quarkusio#30852
snazy
added a commit
to snazy/quarkus
that referenced
this issue
Feb 5, 2023
Currently the `QuarkusBuild` task implementation adds even large build artifacts and unmodified dependency jars to Gradle's build cache. This pollutes the Gradle build cache quite a lot and therefore lets archived build caches become unnecessary huge. This change updates the build logic to fix this behavior by not adding dependencies and large build artifacts, like uber-jar and native binary, to the Gradle build cache. The `QuarkusBuild` task has been split into three tasks: 1. a new `QuarkusBuildDependencies` task that only collects the contents for `build/quarkus-app/lib` 2. a new `QuarkusBuildApp` task that to collect everything else from a Quarkus build (everything except the `build/quarkus-app/lib`) 3. the `QuarkusBuild` task now combines the outputs of the above two tasks `QuarkusBuildDependencies` (named 'quarkusDependenciesBuild`) is not cacheable, because it only collects dependencies, which come either from a repository (and are already available locally elsewhere) or are built by other Gradle tasks. This task is only executed if the configured Quarkus package type requires the "quarkus-app" directory (`fast-jar` + `native`). It's "build working directory" is `build/quarkus-build/dep`. `QuarkusBuildApp` (named `quarkusAppBuild`) collects the contents of the "quarkus-app" directory _excluding_ the `lib/` directory are cacheable, which is the default for CI environments. Non-CI environments still cache all outputs, even uber-jars and native binaries to retain the existing behavior for developers and keep build turn-around times low. CI environments can opt-in to add even huge artifacts to Gradle's build cache by explicitly setting the `cacheUberAndNativeRunners` property in the Quarkus extension to `true`. It's "build working directory" is `build/quarkus-build/app`. Since `QuarkusBuild` only combines the outputs of the above two tasks, the same "CI vs local" caching behavior as for the `QuarkusBuildApp` task applies. To make "up to date" checks (kind of) more reliable, all outputs are removed first. This means, that for example an existing uber-jar in `build/` will disappear, when the build's package type is "fast-jar". This behavior can be disabled by setting the `cleanupBuildOutput` property on the Quarkus extension to `false`. Both `QuarkusBuildDependencies` and `QuarkusBuildApp` can trigger an actual Quarkus application build. That Quarkus app build will only be triggered when needed and only once per Gradle build. The task names `quarkusDependenciesBuild` and `quarkusAppBuild` are intentionally "that way around". Letting the names of these tasks begin with `quarkusBuild...` could confuse users, who use abbreviated task names on the command line (for example `./gradlew qB` is automagically expanded to `./gradlew quarkusBuild`, which would become ambiguous with `quarkusBuildDependencies` and `quarkusBuildApp`). Relates to: quarkusio#30852
snazy
added a commit
to snazy/quarkus
that referenced
this issue
Feb 5, 2023
Currently the `QuarkusBuild` task implementation adds even large build artifacts and unmodified dependency jars to Gradle's build cache. This pollutes the Gradle build cache quite a lot and therefore lets archived build caches become unnecessary huge. This change updates the build logic to fix this behavior by not adding dependencies and large build artifacts, like uber-jar and native binary, to the Gradle build cache. The `QuarkusBuild` task has been split into three tasks: 1. a new `QuarkusBuildDependencies` task that only collects the contents for `build/quarkus-app/lib` 2. a new `QuarkusBuildApp` task that to collect everything else from a Quarkus build (everything except the `build/quarkus-app/lib`) 3. the `QuarkusBuild` task now combines the outputs of the above two tasks `QuarkusBuildDependencies` (named 'quarkusDependenciesBuild`) is not cacheable, because it only collects dependencies, which come either from a repository (and are already available locally elsewhere) or are built by other Gradle tasks. This task is only executed if the configured Quarkus package type requires the "quarkus-app" directory (`fast-jar` + `native`). It's "build working directory" is `build/quarkus-build/dep`. `QuarkusBuildApp` (named `quarkusAppBuild`) collects the contents of the "quarkus-app" directory _excluding_ the `lib/` directory are cacheable, which is the default for CI environments. Non-CI environments still cache all outputs, even uber-jars and native binaries to retain the existing behavior for developers and keep build turn-around times low. CI environments can opt-in to add even huge artifacts to Gradle's build cache by explicitly setting the `cacheUberAndNativeRunners` property in the Quarkus extension to `true`. It's "build working directory" is `build/quarkus-build/app`. Since `QuarkusBuild` only combines the outputs of the above two tasks, the same "CI vs local" caching behavior as for the `QuarkusBuildApp` task applies. To make "up to date" checks (kind of) more reliable, all outputs are removed first. This means, that for example an existing uber-jar in `build/` will disappear, when the build's package type is "fast-jar". This behavior can be disabled by setting the `cleanupBuildOutput` property on the Quarkus extension to `false`. Both `QuarkusBuildDependencies` and `QuarkusBuildApp` can trigger an actual Quarkus application build. That Quarkus app build will only be triggered when needed and only once per Gradle build. The task names `quarkusDependenciesBuild` and `quarkusAppBuild` are intentionally "that way around". Letting the names of these tasks begin with `quarkusBuild...` could confuse users, who use abbreviated task names on the command line (for example `./gradlew qB` is automagically expanded to `./gradlew quarkusBuild`, which would become ambiguous with `quarkusBuildDependencies` and `quarkusBuildApp`). Relates to: quarkusio#30852
geoand
added a commit
that referenced
this issue
Feb 6, 2023
Ensure that quarkus-application.dat is reproducible
snazy
added a commit
to snazy/quarkus
that referenced
this issue
Feb 6, 2023
Currently the `QuarkusBuild` task implementation adds even large build artifacts and unmodified dependency jars to Gradle's build cache. This pollutes the Gradle build cache quite a lot and therefore lets archived build caches become unnecessary huge. This change updates the build logic to fix this behavior by not adding dependencies and large build artifacts, like uber-jar and native binary, to the Gradle build cache. The `QuarkusBuild` task has been split into three tasks: 1. a new `QuarkusBuildDependencies` task that only collects the contents for `build/quarkus-app/lib` 2. a new `QuarkusBuildApp` task that to collect everything else from a Quarkus build (everything except the `build/quarkus-app/lib`) 3. the `QuarkusBuild` task now combines the outputs of the above two tasks `QuarkusBuildDependencies` (named 'quarkusDependenciesBuild`) is not cacheable, because it only collects dependencies, which come either from a repository (and are already available locally elsewhere) or are built by other Gradle tasks. This task is only executed if the configured Quarkus package type requires the "quarkus-app" directory (`fast-jar` + `native`). It's "build working directory" is `build/quarkus-build/dep`. `QuarkusBuildApp` (named `quarkusAppBuild`) collects the contents of the "quarkus-app" directory _excluding_ the `lib/` directory are cacheable, which is the default for CI environments. Non-CI environments still cache all outputs, even uber-jars and native binaries to retain the existing behavior for developers and keep build turn-around times low. CI environments can opt-in to add even huge artifacts to Gradle's build cache by explicitly setting the `cacheUberAndNativeRunners` property in the Quarkus extension to `true`. It's "build working directory" is `build/quarkus-build/app`. Since `QuarkusBuild` only combines the outputs of the above two tasks, the same "CI vs local" caching behavior as for the `QuarkusBuildApp` task applies. To make "up to date" checks (kind of) more reliable, all outputs are removed first. This means, that for example an existing uber-jar in `build/` will disappear, when the build's package type is "fast-jar". This behavior can be disabled by setting the `cleanupBuildOutput` property on the Quarkus extension to `false`. Both `QuarkusBuildDependencies` and `QuarkusBuildApp` can trigger an actual Quarkus application build. That Quarkus app build will only be triggered when needed and only once per Gradle build. The task names `quarkusDependenciesBuild` and `quarkusAppBuild` are intentionally "that way around". Letting the names of these tasks begin with `quarkusBuild...` could confuse users, who use abbreviated task names on the command line (for example `./gradlew qB` is automagically expanded to `./gradlew quarkusBuild`, which would become ambiguous with `quarkusBuildDependencies` and `quarkusBuildApp`). Unless the `cacheLargeArtifacts` property on the `quarkus` extension is set to `true`, the output of/for the package type `fast-jar` minus the dependency jars is cached by Gradle's build cache (similar for `legacy-jar`). Basically everything is cacheable to allow fast(er) local development turn-around cycles. Relates to: quarkusio#30852
snazy
added a commit
to snazy/quarkus
that referenced
this issue
Feb 7, 2023
Currently the `QuarkusBuild` task implementation adds even large build artifacts and unmodified dependency jars to Gradle's build cache. This pollutes the Gradle build cache quite a lot and therefore lets archived build caches become unnecessary huge. This change updates the build logic to fix this behavior by not adding dependencies and large build artifacts, like uber-jar and native binary, to the Gradle build cache. The `QuarkusBuild` task has been split into three tasks: 1. a new `QuarkusBuildDependencies` task that only collects the contents for `build/quarkus-app/lib` 2. a new `QuarkusBuildApp` task that to collect everything else from a Quarkus build (everything except the `build/quarkus-app/lib`) 3. the `QuarkusBuild` task now combines the outputs of the above two tasks `QuarkusBuildDependencies` (named 'quarkusDependenciesBuild`) is not cacheable, because it only collects dependencies, which come either from a repository (and are already available locally elsewhere) or are built by other Gradle tasks. This task is only executed if the configured Quarkus package type requires the "quarkus-app" directory (`fast-jar` + `native`). It's "build working directory" is `build/quarkus-build/dep`. `QuarkusBuildApp` (named `quarkusAppBuild`) collects the contents of the "quarkus-app" directory _excluding_ the `lib/` directory are cacheable, which is the default for CI environments. Non-CI environments still cache all outputs, even uber-jars and native binaries to retain the existing behavior for developers and keep build turn-around times low. CI environments can opt-in to add even huge artifacts to Gradle's build cache by explicitly setting the `cacheUberAndNativeRunners` property in the Quarkus extension to `true`. It's "build working directory" is `build/quarkus-build/app`. Since `QuarkusBuild` only combines the outputs of the above two tasks, the same "CI vs local" caching behavior as for the `QuarkusBuildApp` task applies. To make "up to date" checks (kind of) more reliable, all outputs are removed first. This means, that for example an existing uber-jar in `build/` will disappear, when the build's package type is "fast-jar". This behavior can be disabled by setting the `cleanupBuildOutput` property on the Quarkus extension to `false`. Both `QuarkusBuildDependencies` and `QuarkusBuildApp` can trigger an actual Quarkus application build. That Quarkus app build will only be triggered when needed and only once per Gradle build. The task names `quarkusDependenciesBuild` and `quarkusAppBuild` are intentionally "that way around". Letting the names of these tasks begin with `quarkusBuild...` could confuse users, who use abbreviated task names on the command line (for example `./gradlew qB` is automagically expanded to `./gradlew quarkusBuild`, which would become ambiguous with `quarkusBuildDependencies` and `quarkusBuildApp`). Unless the `cacheLargeArtifacts` property on the `quarkus` extension is set to `true`, the output of/for the package type `fast-jar` minus the dependency jars is cached by Gradle's build cache (similar for `legacy-jar`). Basically everything is cacheable to allow fast(er) local development turn-around cycles. Relates to: quarkusio#30852
snazy
added a commit
to snazy/quarkus
that referenced
this issue
Feb 7, 2023
Currently the `QuarkusBuild` task implementation adds even large build artifacts and unmodified dependency jars to Gradle's build cache. This pollutes the Gradle build cache quite a lot and therefore lets archived build caches become unnecessary huge. This change updates the build logic to fix this behavior by not adding dependencies and large build artifacts, like uber-jar and native binary, to the Gradle build cache. The `QuarkusBuild` task has been split into three tasks: 1. a new `QuarkusBuildDependencies` task that only collects the contents for `build/quarkus-app/lib` 2. a new `QuarkusBuildApp` task that to collect everything else from a Quarkus build (everything except the `build/quarkus-app/lib`) 3. the `QuarkusBuild` task now combines the outputs of the above two tasks `QuarkusBuildDependencies` (named 'quarkusDependenciesBuild`) is not cacheable, because it only collects dependencies, which come either from a repository (and are already available locally elsewhere) or are built by other Gradle tasks. This task is only executed if the configured Quarkus package type requires the "quarkus-app" directory (`fast-jar` + `native`). It's "build working directory" is `build/quarkus-build/dep`. `QuarkusBuildApp` (named `quarkusAppBuild`) collects the contents of the "quarkus-app" directory _excluding_ the `lib/` directory are cacheable, which is the default for CI environments. Non-CI environments still cache all outputs, even uber-jars and native binaries to retain the existing behavior for developers and keep build turn-around times low. CI environments can opt-in to add even huge artifacts to Gradle's build cache by explicitly setting the `cacheUberAndNativeRunners` property in the Quarkus extension to `true`. It's "build working directory" is `build/quarkus-build/app`. Since `QuarkusBuild` only combines the outputs of the above two tasks, the same "CI vs local" caching behavior as for the `QuarkusBuildApp` task applies. To make "up to date" checks (kind of) more reliable, all outputs are removed first. This means, that for example an existing uber-jar in `build/` will disappear, when the build's package type is "fast-jar". This behavior can be disabled by setting the `cleanupBuildOutput` property on the Quarkus extension to `false`. Both `QuarkusBuildDependencies` and `QuarkusBuildApp` can trigger an actual Quarkus application build. That Quarkus app build will only be triggered when needed and only once per Gradle build. The task names `quarkusDependenciesBuild` and `quarkusAppBuild` are intentionally "that way around". Letting the names of these tasks begin with `quarkusBuild...` could confuse users, who use abbreviated task names on the command line (for example `./gradlew qB` is automagically expanded to `./gradlew quarkusBuild`, which would become ambiguous with `quarkusBuildDependencies` and `quarkusBuildApp`). Unless the `cacheLargeArtifacts` property on the `quarkus` extension is set to `true`, the output of/for the package type `fast-jar` minus the dependency jars is cached by Gradle's build cache (similar for `legacy-jar`). Basically everything is cacheable to allow fast(er) local development turn-around cycles. Relates to: quarkusio#30852
snazy
added a commit
to snazy/quarkus
that referenced
this issue
Feb 7, 2023
Currently the `QuarkusBuild` task implementation adds even large build artifacts and unmodified dependency jars to Gradle's build cache. This pollutes the Gradle build cache quite a lot and therefore lets archived build caches become unnecessary huge. This change updates the build logic to fix this behavior by not adding dependencies and large build artifacts, like uber-jar and native binary, to the Gradle build cache. The `QuarkusBuild` task has been split into three tasks: 1. a new `QuarkusBuildDependencies` task that only collects the contents for `build/quarkus-app/lib` 2. a new `QuarkusBuildApp` task that to collect everything else from a Quarkus build (everything except the `build/quarkus-app/lib`) 3. the `QuarkusBuild` task now combines the outputs of the above two tasks `QuarkusBuildDependencies` (named 'quarkusDependenciesBuild`) is not cacheable, because it only collects dependencies, which come either from a repository (and are already available locally elsewhere) or are built by other Gradle tasks. This task is only executed if the configured Quarkus package type requires the "quarkus-app" directory (`fast-jar` + `native`). It's "build working directory" is `build/quarkus-build/dep`. `QuarkusBuildApp` (named `quarkusAppBuild`) collects the contents of the "quarkus-app" directory _excluding_ the `lib/` directory are cacheable, which is the default for CI environments. Non-CI environments still cache all outputs, even uber-jars and native binaries to retain the existing behavior for developers and keep build turn-around times low. CI environments can opt-in to add even huge artifacts to Gradle's build cache by explicitly setting the `cacheUberAndNativeRunners` property in the Quarkus extension to `true`. It's "build working directory" is `build/quarkus-build/app`. Since `QuarkusBuild` only combines the outputs of the above two tasks, the same "CI vs local" caching behavior as for the `QuarkusBuildApp` task applies. To make "up to date" checks (kind of) more reliable, all outputs are removed first. This means, that for example an existing uber-jar in `build/` will disappear, when the build's package type is "fast-jar". This behavior can be disabled by setting the `cleanupBuildOutput` property on the Quarkus extension to `false`. Both `QuarkusBuildDependencies` and `QuarkusBuildApp` can trigger an actual Quarkus application build. That Quarkus app build will only be triggered when needed and only once per Gradle build. The task names `quarkusDependenciesBuild` and `quarkusAppBuild` are intentionally "that way around". Letting the names of these tasks begin with `quarkusBuild...` could confuse users, who use abbreviated task names on the command line (for example `./gradlew qB` is automagically expanded to `./gradlew quarkusBuild`, which would become ambiguous with `quarkusBuildDependencies` and `quarkusBuildApp`). Unless the `cacheLargeArtifacts` property on the `quarkus` extension is set to `true`, the output of/for the package type `fast-jar` minus the dependency jars is cached by Gradle's build cache (similar for `legacy-jar`). Basically everything is cacheable to allow fast(er) local development turn-around cycles. Relates to: quarkusio#30852
snazy
added a commit
to snazy/quarkus
that referenced
this issue
Feb 14, 2023
Currently the `QuarkusBuild` task implementation adds even large build artifacts and unmodified dependency jars to Gradle's build cache. This pollutes the Gradle build cache quite a lot and therefore lets archived build caches become unnecessary huge. This change updates the build logic to fix this behavior by not adding dependencies and large build artifacts, like uber-jar and native binary, to the Gradle build cache. The `QuarkusBuild` task has been split into three tasks: 1. a new `QuarkusBuildDependencies` task that only collects the contents for `build/quarkus-app/lib` 2. a new `QuarkusBuildApp` task that to collect everything else from a Quarkus build (everything except the `build/quarkus-app/lib`) 3. the `QuarkusBuild` task now combines the outputs of the above two tasks `QuarkusBuildDependencies` (named 'quarkusDependenciesBuild`) is not cacheable, because it only collects dependencies, which come either from a repository (and are already available locally elsewhere) or are built by other Gradle tasks. This task is only executed if the configured Quarkus package type requires the "quarkus-app" directory (`fast-jar` + `native`). It's "build working directory" is `build/quarkus-build/dep`. `QuarkusBuildApp` (named `quarkusAppBuild`) collects the contents of the "quarkus-app" directory _excluding_ the `lib/` directory are cacheable, which is the default for CI environments. Non-CI environments still cache all outputs, even uber-jars and native binaries to retain the existing behavior for developers and keep build turn-around times low. CI environments can opt-in to add even huge artifacts to Gradle's build cache by explicitly setting the `cacheUberAndNativeRunners` property in the Quarkus extension to `true`. It's "build working directory" is `build/quarkus-build/app`. Since `QuarkusBuild` only combines the outputs of the above two tasks, the same "CI vs local" caching behavior as for the `QuarkusBuildApp` task applies. To make "up to date" checks (kind of) more reliable, all outputs are removed first. This means, that for example an existing uber-jar in `build/` will disappear, when the build's package type is "fast-jar". This behavior can be disabled by setting the `cleanupBuildOutput` property on the Quarkus extension to `false`. Both `QuarkusBuildDependencies` and `QuarkusBuildApp` can trigger an actual Quarkus application build. That Quarkus app build will only be triggered when needed and only once per Gradle build. The task names `quarkusDependenciesBuild` and `quarkusAppBuild` are intentionally "that way around". Letting the names of these tasks begin with `quarkusBuild...` could confuse users, who use abbreviated task names on the command line (for example `./gradlew qB` is automagically expanded to `./gradlew quarkusBuild`, which would become ambiguous with `quarkusBuildDependencies` and `quarkusBuildApp`). Unless the `cacheLargeArtifacts` property on the `quarkus` extension is set to `true`, the output of/for the package type `fast-jar` minus the dependency jars is cached by Gradle's build cache (similar for `legacy-jar`). Basically everything is cacheable to allow fast(er) local development turn-around cycles. To be able to investigate which configuration settings were effectively used, there's another new task called `quarkusShowEffectiveConfig` that shows all the `quarkus.*` properties and some more information, including the loaded `application.(properties|yaml|yml)`. This task is intended to debug build issues. The task can optionally save the effecitve config properties as a properties file in the `build/` directory, if run with the command line option `./gradlew quarkusShowEffectiveConfig --save-config-properties`. Relates to: quarkusio#30852
snazy
added a commit
to snazy/quarkus
that referenced
this issue
Feb 14, 2023
Currently the `QuarkusBuild` task implementation adds even large build artifacts and unmodified dependency jars to Gradle's build cache. This pollutes the Gradle build cache quite a lot and therefore lets archived build caches become unnecessary huge. This change updates the build logic to fix this behavior by not adding dependencies and large build artifacts, like uber-jar and native binary, to the Gradle build cache. The `QuarkusBuild` task has been split into three tasks: 1. a new `QuarkusBuildDependencies` task that only collects the contents for `build/quarkus-app/lib` 2. a new `QuarkusBuildApp` task that to collect everything else from a Quarkus build (everything except the `build/quarkus-app/lib`) 3. the `QuarkusBuild` task now combines the outputs of the above two tasks `QuarkusBuildDependencies` (named 'quarkusDependenciesBuild`) is not cacheable, because it only collects dependencies, which come either from a repository (and are already available locally elsewhere) or are built by other Gradle tasks. This task is only executed if the configured Quarkus package type requires the "quarkus-app" directory (`fast-jar` + `native`). It's "build working directory" is `build/quarkus-build/dep`. `QuarkusBuildApp` (named `quarkusAppBuild`) collects the contents of the "quarkus-app" directory _excluding_ the `lib/` directory are cacheable, which is the default for CI environments. Non-CI environments still cache all outputs, even uber-jars and native binaries to retain the existing behavior for developers and keep build turn-around times low. CI environments can opt-in to add even huge artifacts to Gradle's build cache by explicitly setting the `cacheUberAndNativeRunners` property in the Quarkus extension to `true`. It's "build working directory" is `build/quarkus-build/app`. Since `QuarkusBuild` only combines the outputs of the above two tasks, the same "CI vs local" caching behavior as for the `QuarkusBuildApp` task applies. To make "up to date" checks (kind of) more reliable, all outputs are removed first. This means, that for example an existing uber-jar in `build/` will disappear, when the build's package type is "fast-jar". This behavior can be disabled by setting the `cleanupBuildOutput` property on the Quarkus extension to `false`. Both `QuarkusBuildDependencies` and `QuarkusBuildApp` can trigger an actual Quarkus application build. That Quarkus app build will only be triggered when needed and only once per Gradle build. The task names `quarkusDependenciesBuild` and `quarkusAppBuild` are intentionally "that way around". Letting the names of these tasks begin with `quarkusBuild...` could confuse users, who use abbreviated task names on the command line (for example `./gradlew qB` is automagically expanded to `./gradlew quarkusBuild`, which would become ambiguous with `quarkusBuildDependencies` and `quarkusBuildApp`). Unless the `cacheLargeArtifacts` property on the `quarkus` extension is set to `true`, the output of/for the package type `fast-jar` minus the dependency jars is cached by Gradle's build cache (similar for `legacy-jar`). Basically everything is cacheable to allow fast(er) local development turn-around cycles. To be able to investigate which configuration settings were effectively used, there's another new task called `quarkusShowEffectiveConfig` that shows all the `quarkus.*` properties and some more information, including the loaded `application.(properties|yaml|yml)`. This task is intended to debug build issues. The task can optionally save the effecitve config properties as a properties file in the `build/` directory, if run with the command line option `./gradlew quarkusShowEffectiveConfig --save-config-properties`. Relates to: quarkusio#30852
snazy
added a commit
to snazy/quarkus
that referenced
this issue
Feb 14, 2023
Currently the `QuarkusBuild` task implementation adds even large build artifacts and unmodified dependency jars to Gradle's build cache. This pollutes the Gradle build cache quite a lot and therefore lets archived build caches become unnecessary huge. This change updates the build logic to fix this behavior by not adding dependencies and large build artifacts, like uber-jar and native binary, to the Gradle build cache. The `QuarkusBuild` task has been split into three tasks: 1. a new `QuarkusBuildDependencies` task that only collects the contents for `build/quarkus-app/lib` 2. a new `QuarkusBuildApp` task that to collect everything else from a Quarkus build (everything except the `build/quarkus-app/lib`) 3. the `QuarkusBuild` task now combines the outputs of the above two tasks `QuarkusBuildDependencies` (named 'quarkusDependenciesBuild`) is not cacheable, because it only collects dependencies, which come either from a repository (and are already available locally elsewhere) or are built by other Gradle tasks. This task is only executed if the configured Quarkus package type requires the "quarkus-app" directory (`fast-jar` + `native`). It's "build working directory" is `build/quarkus-build/dep`. `QuarkusBuildApp` (named `quarkusAppBuild`) collects the contents of the "quarkus-app" directory _excluding_ the `lib/` directory are cacheable, which is the default for CI environments. Non-CI environments still cache all outputs, even uber-jars and native binaries to retain the existing behavior for developers and keep build turn-around times low. CI environments can opt-in to add even huge artifacts to Gradle's build cache by explicitly setting the `cacheUberAndNativeRunners` property in the Quarkus extension to `true`. It's "build working directory" is `build/quarkus-build/app`. Since `QuarkusBuild` only combines the outputs of the above two tasks, the same "CI vs local" caching behavior as for the `QuarkusBuildApp` task applies. To make "up to date" checks (kind of) more reliable, all outputs are removed first. This means, that for example an existing uber-jar in `build/` will disappear, when the build's package type is "fast-jar". This behavior can be disabled by setting the `cleanupBuildOutput` property on the Quarkus extension to `false`. Both `QuarkusBuildDependencies` and `QuarkusBuildApp` can trigger an actual Quarkus application build. That Quarkus app build will only be triggered when needed and only once per Gradle build. The task names `quarkusDependenciesBuild` and `quarkusAppBuild` are intentionally "that way around". Letting the names of these tasks begin with `quarkusBuild...` could confuse users, who use abbreviated task names on the command line (for example `./gradlew qB` is automagically expanded to `./gradlew quarkusBuild`, which would become ambiguous with `quarkusBuildDependencies` and `quarkusBuildApp`). Unless the `cacheLargeArtifacts` property on the `quarkus` extension is set to `true`, the output of/for the package type `fast-jar` minus the dependency jars is cached by Gradle's build cache (similar for `legacy-jar`). Basically everything is cacheable to allow fast(er) local development turn-around cycles. To be able to investigate which configuration settings were effectively used, there's another new task called `quarkusShowEffectiveConfig` that shows all the `quarkus.*` properties and some more information, including the loaded `application.(properties|yaml|yml)`. This task is intended to debug build issues. The task can optionally save the effecitve config properties as a properties file in the `build/` directory, if run with the command line option `./gradlew quarkusShowEffectiveConfig --save-config-properties`. Relates to: quarkusio#30852
snazy
added a commit
to snazy/quarkus
that referenced
this issue
Feb 14, 2023
Currently the `QuarkusBuild` task implementation adds even large build artifacts and unmodified dependency jars to Gradle's build cache. This pollutes the Gradle build cache quite a lot and therefore lets archived build caches become unnecessary huge. This change updates the build logic to fix this behavior by not adding dependencies and large build artifacts, like uber-jar and native binary, to the Gradle build cache. The `QuarkusBuild` task has been split into three tasks: 1. a new `QuarkusBuildDependencies` task that only collects the contents for `build/quarkus-app/lib` 2. a new `QuarkusBuildApp` task that to collect everything else from a Quarkus build (everything except the `build/quarkus-app/lib`) 3. the `QuarkusBuild` task now combines the outputs of the above two tasks `QuarkusBuildDependencies` (named 'quarkusDependenciesBuild`) is not cacheable, because it only collects dependencies, which come either from a repository (and are already available locally elsewhere) or are built by other Gradle tasks. This task is only executed if the configured Quarkus package type requires the "quarkus-app" directory (`fast-jar` + `native`). It's "build working directory" is `build/quarkus-build/dep`. `QuarkusBuildApp` (named `quarkusAppBuild`) collects the contents of the "quarkus-app" directory _excluding_ the `lib/` directory are cacheable, which is the default for CI environments. Non-CI environments still cache all outputs, even uber-jars and native binaries to retain the existing behavior for developers and keep build turn-around times low. CI environments can opt-in to add even huge artifacts to Gradle's build cache by explicitly setting the `cacheUberAndNativeRunners` property in the Quarkus extension to `true`. It's "build working directory" is `build/quarkus-build/app`. Since `QuarkusBuild` only combines the outputs of the above two tasks, the same "CI vs local" caching behavior as for the `QuarkusBuildApp` task applies. To make "up to date" checks (kind of) more reliable, all outputs are removed first. This means, that for example an existing uber-jar in `build/` will disappear, when the build's package type is "fast-jar". This behavior can be disabled by setting the `cleanupBuildOutput` property on the Quarkus extension to `false`. Both `QuarkusBuildDependencies` and `QuarkusBuildApp` can trigger an actual Quarkus application build. That Quarkus app build will only be triggered when needed and only once per Gradle build. The task names `quarkusDependenciesBuild` and `quarkusAppBuild` are intentionally "that way around". Letting the names of these tasks begin with `quarkusBuild...` could confuse users, who use abbreviated task names on the command line (for example `./gradlew qB` is automagically expanded to `./gradlew quarkusBuild`, which would become ambiguous with `quarkusBuildDependencies` and `quarkusBuildApp`). Unless the `cacheLargeArtifacts` property on the `quarkus` extension is set to `true`, the output of/for the package type `fast-jar` minus the dependency jars is cached by Gradle's build cache (similar for `legacy-jar`). Basically everything is cacheable to allow fast(er) local development turn-around cycles. To be able to investigate which configuration settings were effectively used, there's another new task called `quarkusShowEffectiveConfig` that shows all the `quarkus.*` properties and some more information, including the loaded `application.(properties|yaml|yml)`. This task is intended to debug build issues. The task can optionally save the effecitve config properties as a properties file in the `build/` directory, if run with the command line option `./gradlew quarkusShowEffectiveConfig --save-config-properties`. Relates to: quarkusio#30852
snazy
added a commit
to snazy/quarkus
that referenced
this issue
Feb 19, 2023
Improve Quarkus build in Gradle, proper up-to-date mechanisms, right config-source priorities. == Background / Motivation Currently the `QuarkusBuild` task implementation adds even large build artifacts and unmodified dependency jars to Gradle's build cache. This pollutes the Gradle build cache quite a lot and therefore lets archived build caches become unnecessary huge. On top of that, "relicts" from previous runs of `QuarkusBuild` using a different configuration, like a different package-type, causes multiple permutations of the cached artifacts and leads to restoring unexpected artifacts from the build cache. This large change updates the Gradle logic to have "proper" up-to-date checks and in turn a reasonable setting of what is cached and what is not cached. Background on how I discovered the "caching and up-to-date issue" with `QuarkusBuild`: Quarkus 2.16 added the `@CacheableTask` annotation to `QuarkusBuild`. This means that _everything_ that is generated by `QuarkusBuild` (and "declared" as an output: fast-jar directory, uber-jar, native-runner) is cached. After having Quarkus 2.16.x in our "main" branch for about 1/2 day, nearly the whole "10 GB budget" for cache artifacts in GitHub was occupied by Gradle build cache artifacts - the build cache artifacts grew from build to build - this was not the case without the `@CacheableTask` annotation. The current inputs of the `QuarkusBuild` task are also incomplete. Those inputs do not include for example the configured `finalName`, the settings on the Gradle project and the settings on the Quarkus extension. This leads to situations that the wrong Gradle cache artifacts could be restored (think: you ask for a fast-jar, but get an uber-jar). Another issue is that repeated runs of `quarkusBuild` against the exact same code base and dependencies leads to different outputs, the contents of the generated jars and e.g. quarkus-application.dat differ. This causes different outputs of the `QuarkusBuild` task, which cause depending tasks and projects to unnecessarily run/work again. == Initial Goal The initial goal of this effort was to make caching of `fast-jar` builds work nice in CI environments, I assume that's the most common package type used in CI environments for example for testing purposes. The idea here is to _not_ cache all the dependency jars, but leverage the mechanisms available in Gradle to "just collect" the dependencies. In other words (and simplified): for `fast-jar`, cache everthing "in `quarkus-app` except `lib/`. == Updated Quarkus Gradle build implementation There is no change for users running a `./gradlew quarkusBuild`. `QuarkusBuild` is still the task that users should execute (or depend on in their builds). To achive the goal to make especially `fast-jar` builds "nicely cacheable" in CI, two new tasks had to be introduced: * `QuarkusBuildDependencies` uses the dependency information from the Quarkus `ApplicationModel` to tell Gradle to collect those. This task works for `fast-jar` and `legacy-jar` and is a "no op" for other package types. This task is _never_ cacheable, but has _up-to-date_ checks. * `QuarkusBuildCacheableAppParts` performs a Quarkus application build and collects the _cacheable_ parts of that build. This task works for `fast-jar` and `legacy-jar` and is a "no op" for other package types. This task is cacheable. * `QuarkusBuild`, for `fast-jar` and `legacy-jar`, assembles the outputs of the above two tasks to produce the expected output. Performs a Quarkus build for other package types. See below on when this task is cacheable. The `build/quarkus-build/` directory is used to properly separate the outputs of the above three tasks: * `build/quarkus-build/gen/` receives the full output of a Quarkus build * `build/quarkus-build/app/` receives the _cacheable_ parts from `build/quarkus-build/gen/` * `build/quarkus-build/dep/` receives the dependency jars == CI vs non-CI The output of `QuarkusBuild` is, by default, cacheable in non-CI environments and _not_ cacheable in CI environments. The behavior can be explicitly overridden using the new property `cacheLargeArtifacts` property. As outlined above, caching huge artifacts is not beneficial in CI, either due to space limitations (GitHub's 10GB limit for example) or just the cost of network traffic/duration. On a developer's machine however, the cost of storing/retrieving even bigger artifacts is rather neglectible and lower than rebuilding a Quarkus application. == Configuration Before this change, the "priority" of configuration settings was (in most cases...) effectively: System properties, environment variables, `application.properties`, configurations in Gradle build scripts & Gradle project properties. This order is unintuitive and was changed to: system properties, environment variables, Gradle project properties, Quarkus build configuration, application.properties. The previous code had several places in which Quarkus related configuration settings were retrieved with sometimes different priorities of the "config sources". The whole "machinery" to get the configuration has been rewritten and encapsulated in new classes `EffectiveConfig` (effective for a Quarkus build) and `BaseConfig` (available during Gradle task configuration phase). `EffectiveConfig` is not really different from `BaseConfig, but contains the "special" settings from e.g. `ImagePush` task. The new implementation also uses SmallRye Config and especially the existing Quarkus mechanisms to pull information out of a `PackageConfig` _object_ produced from a configuration. It turned out to be easier to reason about `PackageConfig` than "raw property values". Support for `application.yaml/yml` has also been added. == Workers All calls into the "inner workings" of a Quarkus build and Quarkus code generation have been moved to separate worker processes. The code in this change to support this is pretty dead simple. The primary reason to move that work into isolated worker processes is that all the `QuarkusBootstrap` and derived pieces know nothing about Gradle, so there is no "property like" mechanism to override settings from `application.properties/yaml/yml` with those from e.g. the `QuarkusPluginExtension`. The only option would have been to modify the system properties of the Gradle build process - but that's a no-go, especially considering other tasks running in parallel (think: two `QuarkusBuild` of different projects running at the same time). It would have been relatively easy to serialize all `QuarkusBuild` actions across a Gradle build, but why - and it would prevent using "beefy machines" to run many `QuarkusBuild`s in parallel. Another option would have been to implement a rather complex (and likely very racy) mechanism to track modifications to system properties. As a conclusion, it wasn't just very simple to leverage the process isolation using the Gradle worker API, but it's also not bad. Gradle does reuse already spawned and compatible worker instances. The "trick" implemented to "prioritize" configs from Gradle project properties and Quarkus extension settings is to pass the whole config as system properties to the worker. == Logging / cache-disabled messages A bunch of hopefully useful logging has been implemented and added. With info-level loogging, the tasks emit at least some basic information about what they do and at least the package type being used. == Effective config To be able to investigate which configuration settings were effectively used, there's another new task called `quarkusShowEffectiveConfig` that shows all the `quarkus.*` properties and some more information, including the loaded `application.(properties|yaml|yml)`. This task is intended to debug build issues. The task can optionally save the effecitve config properties as a properties file in the `build/` directory, if run with the command line option `./gradlew quarkusShowEffectiveConfig --save-config-properties`. == Test changes None of the existing tests has changed, except the `BuildCOnfigurationTest` had to be adopted to reflect the updated order of config sources. Relates to: quarkusio#30852
snazy
added a commit
to snazy/quarkus
that referenced
this issue
Feb 19, 2023
Make `QuarkusBuild` not pollute Gradle's build cache Improve Quarkus build in Gradle, proper up-to-date mechanisms, right config-source priorities. Currently the `QuarkusBuild` task implementation adds even large build artifacts and unmodified dependency jars to Gradle's build cache. This pollutes the Gradle build cache quite a lot and therefore lets archived build caches become unnecessary huge. On top of that, "relicts" from previous runs of `QuarkusBuild` using a different configuration, like a different package-type, causes multiple permutations of the cached artifacts and leads to restoring unexpected artifacts from the build cache. This large change updates the Gradle logic to have "proper" up-to-date checks and in turn a reasonable setting of what is cached and what is not cached. Background on how I discovered the "caching and up-to-date issue" with `QuarkusBuild`: Quarkus 2.16 added the `@CacheableTask` annotation to `QuarkusBuild`. This means that _everything_ that is generated by `QuarkusBuild` (and "declared" as an output: fast-jar directory, uber-jar, native-runner) is cached. After having Quarkus 2.16.x in our "main" branch for about 1/2 day, nearly the whole "10 GB budget" for cache artifacts in GitHub was occupied by Gradle build cache artifacts - the build cache artifacts grew from build to build - this was not the case without the `@CacheableTask` annotation. The current inputs of the `QuarkusBuild` task are also incomplete. Those inputs do not include for example the configured `finalName`, the settings on the Gradle project and the settings on the Quarkus extension. This leads to situations that the wrong Gradle cache artifacts could be restored (think: you ask for a fast-jar, but get an uber-jar). Another issue is that repeated runs of `quarkusBuild` against the exact same code base and dependencies leads to different outputs, the contents of the generated jars and e.g. quarkus-application.dat differ. This causes different outputs of the `QuarkusBuild` task, which cause depending tasks and projects to unnecessarily run/work again. The initial goal of this effort was to make caching of `fast-jar` builds work nice in CI environments, I assume that's the most common package type used in CI environments for example for testing purposes. The idea here is to _not_ cache all the dependency jars, but leverage the mechanisms available in Gradle to "just collect" the dependencies. In other words (and simplified): for `fast-jar`, cache everthing "in `quarkus-app` except `lib/`. There is no change for users running a `./gradlew quarkusBuild`. `QuarkusBuild` is still the task that users should execute (or depend on in their builds). To achive the goal to make especially `fast-jar` builds "nicely cacheable" in CI, two new tasks had to be introduced: * `QuarkusBuildDependencies` uses the dependency information from the Quarkus `ApplicationModel` to tell Gradle to collect those. This task works for `fast-jar` and `legacy-jar` and is a "no op" for other package types. This task is _never_ cacheable, but has _up-to-date_ checks. * `QuarkusBuildCacheableAppParts` performs a Quarkus application build and collects the _cacheable_ parts of that build. This task works for `fast-jar` and `legacy-jar` and is a "no op" for other package types. This task is cacheable. * `QuarkusBuild`, for `fast-jar` and `legacy-jar`, assembles the outputs of the above two tasks to produce the expected output. Performs a Quarkus build for other package types. See below on when this task is cacheable. The `build/quarkus-build/` directory is used to properly separate the outputs of the above three tasks: * `build/quarkus-build/gen/` receives the full output of a Quarkus build * `build/quarkus-build/app/` receives the _cacheable_ parts from `build/quarkus-build/gen/` * `build/quarkus-build/dep/` receives the dependency jars The output of `QuarkusBuild` is, by default, cacheable in non-CI environments and _not_ cacheable in CI environments. The behavior can be explicitly overridden using the new property `cacheLargeArtifacts` property. As outlined above, caching huge artifacts is not beneficial in CI, either due to space limitations (GitHub's 10GB limit for example) or just the cost of network traffic/duration. On a developer's machine however, the cost of storing/retrieving even bigger artifacts is rather neglectible and lower than rebuilding a Quarkus application. Before this change, the "priority" of configuration settings was (in most cases...) effectively: System properties, environment variables, `application.properties`, configurations in Gradle build scripts & Gradle project properties. This order is unintuitive and was changed to: system properties, environment variables, Gradle project properties, Quarkus build configuration, application.properties. The previous code had several places in which Quarkus related configuration settings were retrieved with sometimes different priorities of the "config sources". The whole "machinery" to get the configuration has been rewritten and encapsulated in new classes `EffectiveConfig` (effective for a Quarkus build) and `BaseConfig` (available during Gradle task configuration phase). `EffectiveConfig` is not really different from `BaseConfig, but contains the "special" settings from e.g. `ImagePush` task. The new implementation also uses SmallRye Config and especially the existing Quarkus mechanisms to pull information out of a `PackageConfig` _object_ produced from a configuration. It turned out to be easier to reason about `PackageConfig` than "raw property values". Support for `application.yaml/yml` has also been added. All calls into the "inner workings" of a Quarkus build and Quarkus code generation have been moved to separate worker processes. The code in this change to support this is pretty dead simple. The primary reason to move that work into isolated worker processes is that all the `QuarkusBootstrap` and derived pieces know nothing about Gradle, so there is no "property like" mechanism to override settings from `application.properties/yaml/yml` with those from e.g. the `QuarkusPluginExtension`. The only option would have been to modify the system properties of the Gradle build process - but that's a no-go, especially considering other tasks running in parallel (think: two `QuarkusBuild` of different projects running at the same time). It would have been relatively easy to serialize all `QuarkusBuild` actions across a Gradle build, but why - and it would prevent using "beefy machines" to run many `QuarkusBuild`s in parallel. Another option would have been to implement a rather complex (and likely very racy) mechanism to track modifications to system properties. As a conclusion, it wasn't just very simple to leverage the process isolation using the Gradle worker API, but it's also not bad. Gradle does reuse already spawned and compatible worker instances. The "trick" implemented to "prioritize" configs from Gradle project properties and Quarkus extension settings is to pass the whole config as system properties to the worker. A bunch of hopefully useful logging has been implemented and added. With info-level loogging, the tasks emit at least some basic information about what they do and at least the package type being used. To be able to investigate which configuration settings were effectively used, there's another new task called `quarkusShowEffectiveConfig` that shows all the `quarkus.*` properties and some more information, including the loaded `application.(properties|yaml|yml)`. This task is intended to debug build issues. The task can optionally save the effecitve config properties as a properties file in the `build/` directory, if run with the command line option `./gradlew quarkusShowEffectiveConfig --save-config-properties`. None of the existing tests has changed, except the `BuildCOnfigurationTest` had to be adopted to reflect the updated order of config sources. Relates to: quarkusio#30852
snazy
added a commit
to snazy/quarkus
that referenced
this issue
Feb 19, 2023
Bumps Gradle version to 8.0.1, adopts the build scripts. No functional change. Use thread-safe `AtomicReference` instead of an object array in `QuarkusDevGradleTestBase` Stop (and forcefully stop) processes launched by `QuarkusGradleWrapperTestBase` after all tests and on shutdown (for paranoia) Add missing `UP_TO_DATE` a "success" outcome in `BuildResult` Add convenience function `BuildResult.unsuccessfulTasks()` Refactor the `BuildConfigurationTest`, it suffers from false-test-failures due to quarkusio#30852 - also refactored the test to be (hopefully) more readable, but definitely quicker and "nicer" when test failures happen Fix for removed `AbstractArchiveTask.classifier`, replace usages of the previously deprecated (and in Gradle 8 removed) `AbstractArchiveTask.classifier` with `archiveClassifier` Make the relevant Quarkus configurations not consumable, configurations that are consumable AND resolvable are a hard error since Gradle 8. Also let QuarkusPlugin _register_ (and not eagerly configure) the Quarkus configurations. Split `implementation-files` integration test project into a variant using `files()` and another using `project()` dependency
snazy
added a commit
to snazy/quarkus
that referenced
this issue
Feb 19, 2023
Make `QuarkusBuild` not pollute Gradle's build cache Improve Quarkus build in Gradle, proper up-to-date mechanisms, right config-source priorities. Currently the `QuarkusBuild` task implementation adds even large build artifacts and unmodified dependency jars to Gradle's build cache. This pollutes the Gradle build cache quite a lot and therefore lets archived build caches become unnecessary huge. On top of that, "relicts" from previous runs of `QuarkusBuild` using a different configuration, like a different package-type, causes multiple permutations of the cached artifacts and leads to restoring unexpected artifacts from the build cache. This large change updates the Gradle logic to have "proper" up-to-date checks and in turn a reasonable setting of what is cached and what is not cached. Background on how I discovered the "caching and up-to-date issue" with `QuarkusBuild`: Quarkus 2.16 added the `@CacheableTask` annotation to `QuarkusBuild`. This means that _everything_ that is generated by `QuarkusBuild` (and "declared" as an output: fast-jar directory, uber-jar, native-runner) is cached. After having Quarkus 2.16.x in our "main" branch for about 1/2 day, nearly the whole "10 GB budget" for cache artifacts in GitHub was occupied by Gradle build cache artifacts - the build cache artifacts grew from build to build - this was not the case without the `@CacheableTask` annotation. The current inputs of the `QuarkusBuild` task are also incomplete. Those inputs do not include for example the configured `finalName`, the settings on the Gradle project and the settings on the Quarkus extension. This leads to situations that the wrong Gradle cache artifacts could be restored (think: you ask for a fast-jar, but get an uber-jar). Another issue is that repeated runs of `quarkusBuild` against the exact same code base and dependencies leads to different outputs, the contents of the generated jars and e.g. quarkus-application.dat differ. This causes different outputs of the `QuarkusBuild` task, which cause depending tasks and projects to unnecessarily run/work again. The initial goal of this effort was to make caching of `fast-jar` builds work nice in CI environments, I assume that's the most common package type used in CI environments for example for testing purposes. The idea here is to _not_ cache all the dependency jars, but leverage the mechanisms available in Gradle to "just collect" the dependencies. In other words (and simplified): for `fast-jar`, cache everthing "in `quarkus-app` except `lib/`. There is no change for users running a `./gradlew quarkusBuild`. `QuarkusBuild` is still the task that users should execute (or depend on in their builds). To achive the goal to make especially `fast-jar` builds "nicely cacheable" in CI, two new tasks had to be introduced: * `QuarkusBuildDependencies` uses the dependency information from the Quarkus `ApplicationModel` to tell Gradle to collect those. This task works for `fast-jar` and `legacy-jar` and is a "no op" for other package types. This task is _never_ cacheable, but has _up-to-date_ checks. * `QuarkusBuildCacheableAppParts` performs a Quarkus application build and collects the _cacheable_ parts of that build. This task works for `fast-jar` and `legacy-jar` and is a "no op" for other package types. This task is cacheable. * `QuarkusBuild`, for `fast-jar` and `legacy-jar`, assembles the outputs of the above two tasks to produce the expected output. Performs a Quarkus build for other package types. See below on when this task is cacheable. The `build/quarkus-build/` directory is used to properly separate the outputs of the above three tasks: * `build/quarkus-build/gen/` receives the full output of a Quarkus build * `build/quarkus-build/app/` receives the _cacheable_ parts from `build/quarkus-build/gen/` * `build/quarkus-build/dep/` receives the dependency jars The output of `QuarkusBuild` is, by default, cacheable in non-CI environments and _not_ cacheable in CI environments. The behavior can be explicitly overridden using the new property `cacheLargeArtifacts` property. As outlined above, caching huge artifacts is not beneficial in CI, either due to space limitations (GitHub's 10GB limit for example) or just the cost of network traffic/duration. On a developer's machine however, the cost of storing/retrieving even bigger artifacts is rather neglectible and lower than rebuilding a Quarkus application. Before this change, the "priority" of configuration settings was (in most cases...) effectively: System properties, environment variables, `application.properties`, configurations in Gradle build scripts & Gradle project properties. This order is unintuitive and was changed to: system properties, environment variables, Gradle project properties, Quarkus build configuration, application.properties. The previous code had several places in which Quarkus related configuration settings were retrieved with sometimes different priorities of the "config sources". The whole "machinery" to get the configuration has been rewritten and encapsulated in new classes `EffectiveConfig` (effective for a Quarkus build) and `BaseConfig` (available during Gradle task configuration phase). `EffectiveConfig` is not really different from `BaseConfig, but contains the "special" settings from e.g. `ImagePush` task. The new implementation also uses SmallRye Config and especially the existing Quarkus mechanisms to pull information out of a `PackageConfig` _object_ produced from a configuration. It turned out to be easier to reason about `PackageConfig` than "raw property values". Support for `application.yaml/yml` has also been added. All calls into the "inner workings" of a Quarkus build and Quarkus code generation have been moved to separate worker processes. The code in this change to support this is pretty dead simple. The primary reason to move that work into isolated worker processes is that all the `QuarkusBootstrap` and derived pieces know nothing about Gradle, so there is no "property like" mechanism to override settings from `application.properties/yaml/yml` with those from e.g. the `QuarkusPluginExtension`. The only option would have been to modify the system properties of the Gradle build process - but that's a no-go, especially considering other tasks running in parallel (think: two `QuarkusBuild` of different projects running at the same time). It would have been relatively easy to serialize all `QuarkusBuild` actions across a Gradle build, but why - and it would prevent using "beefy machines" to run many `QuarkusBuild`s in parallel. Another option would have been to implement a rather complex (and likely very racy) mechanism to track modifications to system properties. As a conclusion, it wasn't just very simple to leverage the process isolation using the Gradle worker API, but it's also not bad. Gradle does reuse already spawned and compatible worker instances. The "trick" implemented to "prioritize" configs from Gradle project properties and Quarkus extension settings is to pass the whole config as system properties to the worker. A bunch of hopefully useful logging has been implemented and added. With info-level loogging, the tasks emit at least some basic information about what they do and at least the package type being used. To be able to investigate which configuration settings were effectively used, there's another new task called `quarkusShowEffectiveConfig` that shows all the `quarkus.*` properties and some more information, including the loaded `application.(properties|yaml|yml)`. This task is intended to debug build issues. The task can optionally save the effecitve config properties as a properties file in the `build/` directory, if run with the command line option `./gradlew quarkusShowEffectiveConfig --save-config-properties`. None of the existing tests has changed, except the `BuildCOnfigurationTest` had to be adopted to reflect the updated order of config sources. Relates to: quarkusio#30852
snazy
added a commit
to snazy/quarkus
that referenced
this issue
Feb 19, 2023
Make `QuarkusBuild` not pollute Gradle's build cache Improve Quarkus build in Gradle, proper up-to-date mechanisms, right config-source priorities. Currently the `QuarkusBuild` task implementation adds even large build artifacts and unmodified dependency jars to Gradle's build cache. This pollutes the Gradle build cache quite a lot and therefore lets archived build caches become unnecessary huge. On top of that, "relicts" from previous runs of `QuarkusBuild` using a different configuration, like a different package-type, causes multiple permutations of the cached artifacts and leads to restoring unexpected artifacts from the build cache. This large change updates the Gradle logic to have "proper" up-to-date checks and in turn a reasonable setting of what is cached and what is not cached. Background on how I discovered the "caching and up-to-date issue" with `QuarkusBuild`: Quarkus 2.16 added the `@CacheableTask` annotation to `QuarkusBuild`. This means that _everything_ that is generated by `QuarkusBuild` (and "declared" as an output: fast-jar directory, uber-jar, native-runner) is cached. After having Quarkus 2.16.x in our "main" branch for about 1/2 day, nearly the whole "10 GB budget" for cache artifacts in GitHub was occupied by Gradle build cache artifacts - the build cache artifacts grew from build to build - this was not the case without the `@CacheableTask` annotation. The current inputs of the `QuarkusBuild` task are also incomplete. Those inputs do not include for example the configured `finalName`, the settings on the Gradle project and the settings on the Quarkus extension. This leads to situations that the wrong Gradle cache artifacts could be restored (think: you ask for a fast-jar, but get an uber-jar). Another issue is that repeated runs of `quarkusBuild` against the exact same code base and dependencies leads to different outputs, the contents of the generated jars and e.g. quarkus-application.dat differ. This causes different outputs of the `QuarkusBuild` task, which cause depending tasks and projects to unnecessarily run/work again. The initial goal of this effort was to make caching of `fast-jar` builds work nice in CI environments, I assume that's the most common package type used in CI environments for example for testing purposes. The idea here is to _not_ cache all the dependency jars, but leverage the mechanisms available in Gradle to "just collect" the dependencies. In other words (and simplified): for `fast-jar`, cache everthing "in `quarkus-app` except `lib/`. There is no change for users running a `./gradlew quarkusBuild`. `QuarkusBuild` is still the task that users should execute (or depend on in their builds). To achive the goal to make especially `fast-jar` builds "nicely cacheable" in CI, two new tasks had to be introduced: * `QuarkusBuildDependencies` uses the dependency information from the Quarkus `ApplicationModel` to tell Gradle to collect those. This task works for `fast-jar` and `legacy-jar` and is a "no op" for other package types. This task is _never_ cacheable, but has _up-to-date_ checks. * `QuarkusBuildCacheableAppParts` performs a Quarkus application build and collects the _cacheable_ parts of that build. This task works for `fast-jar` and `legacy-jar` and is a "no op" for other package types. This task is cacheable. * `QuarkusBuild`, for `fast-jar` and `legacy-jar`, assembles the outputs of the above two tasks to produce the expected output. Performs a Quarkus build for other package types. See below on when this task is cacheable. The `build/quarkus-build/` directory is used to properly separate the outputs of the above three tasks: * `build/quarkus-build/gen/` receives the full output of a Quarkus build * `build/quarkus-build/app/` receives the _cacheable_ parts from `build/quarkus-build/gen/` * `build/quarkus-build/dep/` receives the dependency jars The output of `QuarkusBuild` is, by default, cacheable in non-CI environments and _not_ cacheable in CI environments. The behavior can be explicitly overridden using the new property `cacheLargeArtifacts` property. As outlined above, caching huge artifacts is not beneficial in CI, either due to space limitations (GitHub's 10GB limit for example) or just the cost of network traffic/duration. On a developer's machine however, the cost of storing/retrieving even bigger artifacts is rather neglectible and lower than rebuilding a Quarkus application. Before this change, the "priority" of configuration settings was (in most cases...) effectively: System properties, environment variables, `application.properties`, configurations in Gradle build scripts & Gradle project properties. This order is unintuitive and was changed to: system properties, environment variables, Gradle project properties, Quarkus build configuration, application.properties. The previous code had several places in which Quarkus related configuration settings were retrieved with sometimes different priorities of the "config sources". The whole "machinery" to get the configuration has been rewritten and encapsulated in new classes `EffectiveConfig` (effective for a Quarkus build) and `BaseConfig` (available during Gradle task configuration phase). `EffectiveConfig` is not really different from `BaseConfig, but contains the "special" settings from e.g. `ImagePush` task. The new implementation also uses SmallRye Config and especially the existing Quarkus mechanisms to pull information out of a `PackageConfig` _object_ produced from a configuration. It turned out to be easier to reason about `PackageConfig` than "raw property values". Support for `application.yaml/yml` has also been added. All calls into the "inner workings" of a Quarkus build and Quarkus code generation have been moved to separate worker processes. The code in this change to support this is pretty dead simple. The primary reason to move that work into isolated worker processes is that all the `QuarkusBootstrap` and derived pieces know nothing about Gradle, so there is no "property like" mechanism to override settings from `application.properties/yaml/yml` with those from e.g. the `QuarkusPluginExtension`. The only option would have been to modify the system properties of the Gradle build process - but that's a no-go, especially considering other tasks running in parallel (think: two `QuarkusBuild` of different projects running at the same time). It would have been relatively easy to serialize all `QuarkusBuild` actions across a Gradle build, but why - and it would prevent using "beefy machines" to run many `QuarkusBuild`s in parallel. Another option would have been to implement a rather complex (and likely very racy) mechanism to track modifications to system properties. As a conclusion, it wasn't just very simple to leverage the process isolation using the Gradle worker API, but it's also not bad. Gradle does reuse already spawned and compatible worker instances. The "trick" implemented to "prioritize" configs from Gradle project properties and Quarkus extension settings is to pass the whole config as system properties to the worker. A bunch of hopefully useful logging has been implemented and added. With info-level loogging, the tasks emit at least some basic information about what they do and at least the package type being used. To be able to investigate which configuration settings were effectively used, there's another new task called `quarkusShowEffectiveConfig` that shows all the `quarkus.*` properties and some more information, including the loaded `application.(properties|yaml|yml)`. This task is intended to debug build issues. The task can optionally save the effecitve config properties as a properties file in the `build/` directory, if run with the command line option `./gradlew quarkusShowEffectiveConfig --save-config-properties`. None of the existing tests has changed, except the `BuildCOnfigurationTest` had to be adopted to reflect the updated order of config sources. Relates to: quarkusio#30852
snazy
added a commit
to snazy/quarkus
that referenced
this issue
Feb 20, 2023
Make `QuarkusBuild` not pollute Gradle's build cache Improve Quarkus build in Gradle, proper up-to-date mechanisms, right config-source priorities. Currently the `QuarkusBuild` task implementation adds even large build artifacts and unmodified dependency jars to Gradle's build cache. This pollutes the Gradle build cache quite a lot and therefore lets archived build caches become unnecessary huge. On top of that, "relicts" from previous runs of `QuarkusBuild` using a different configuration, like a different package-type, causes multiple permutations of the cached artifacts and leads to restoring unexpected artifacts from the build cache. This large change updates the Gradle logic to have "proper" up-to-date checks and in turn a reasonable setting of what is cached and what is not cached. Background on how I discovered the "caching and up-to-date issue" with `QuarkusBuild`: Quarkus 2.16 added the `@CacheableTask` annotation to `QuarkusBuild`. This means that _everything_ that is generated by `QuarkusBuild` (and "declared" as an output: fast-jar directory, uber-jar, native-runner) is cached. After having Quarkus 2.16.x in our "main" branch for about 1/2 day, nearly the whole "10 GB budget" for cache artifacts in GitHub was occupied by Gradle build cache artifacts - the build cache artifacts grew from build to build - this was not the case without the `@CacheableTask` annotation. The current inputs of the `QuarkusBuild` task are also incomplete. Those inputs do not include for example the configured `finalName`, the settings on the Gradle project and the settings on the Quarkus extension. This leads to situations that the wrong Gradle cache artifacts could be restored (think: you ask for a fast-jar, but get an uber-jar). Another issue is that repeated runs of `quarkusBuild` against the exact same code base and dependencies leads to different outputs, the contents of the generated jars and e.g. quarkus-application.dat differ. This causes different outputs of the `QuarkusBuild` task, which cause depending tasks and projects to unnecessarily run/work again. The initial goal of this effort was to make caching of `fast-jar` builds work nice in CI environments, I assume that's the most common package type used in CI environments for example for testing purposes. The idea here is to _not_ cache all the dependency jars, but leverage the mechanisms available in Gradle to "just collect" the dependencies. In other words (and simplified): for `fast-jar`, cache everthing "in `quarkus-app` except `lib/`. There is no change for users running a `./gradlew quarkusBuild`. `QuarkusBuild` is still the task that users should execute (or depend on in their builds). To achive the goal to make especially `fast-jar` builds "nicely cacheable" in CI, two new tasks had to be introduced: * `QuarkusBuildDependencies` uses the dependency information from the Quarkus `ApplicationModel` to tell Gradle to collect those. This task works for `fast-jar` and `legacy-jar` and is a "no op" for other package types. This task is _never_ cacheable, but has _up-to-date_ checks. * `QuarkusBuildCacheableAppParts` performs a Quarkus application build and collects the _cacheable_ parts of that build. This task works for `fast-jar` and `legacy-jar` and is a "no op" for other package types. This task is cacheable. * `QuarkusBuild`, for `fast-jar` and `legacy-jar`, assembles the outputs of the above two tasks to produce the expected output. Performs a Quarkus build for other package types. See below on when this task is cacheable. The `build/quarkus-build/` directory is used to properly separate the outputs of the above three tasks: * `build/quarkus-build/gen/` receives the full output of a Quarkus build * `build/quarkus-build/app/` receives the _cacheable_ parts from `build/quarkus-build/gen/` * `build/quarkus-build/dep/` receives the dependency jars The output of `QuarkusBuild` is, by default, cacheable in non-CI environments and _not_ cacheable in CI environments. The behavior can be explicitly overridden using the new property `cacheLargeArtifacts` property. As outlined above, caching huge artifacts is not beneficial in CI, either due to space limitations (GitHub's 10GB limit for example) or just the cost of network traffic/duration. On a developer's machine however, the cost of storing/retrieving even bigger artifacts is rather neglectible and lower than rebuilding a Quarkus application. Before this change, the "priority" of configuration settings was (in most cases...) effectively: System properties, environment variables, `application.properties`, configurations in Gradle build scripts & Gradle project properties. This order is unintuitive and was changed to: system properties, environment variables, Gradle project properties, Quarkus build configuration, application.properties. The previous code had several places in which Quarkus related configuration settings were retrieved with sometimes different priorities of the "config sources". The whole "machinery" to get the configuration has been rewritten and encapsulated in new classes `EffectiveConfig` (effective for a Quarkus build) and `BaseConfig` (available during Gradle task configuration phase). `EffectiveConfig` is not really different from `BaseConfig, but contains the "special" settings from e.g. `ImagePush` task. The new implementation also uses SmallRye Config and especially the existing Quarkus mechanisms to pull information out of a `PackageConfig` _object_ produced from a configuration. It turned out to be easier to reason about `PackageConfig` than "raw property values". Support for `application.yaml/yml` has also been added. All calls into the "inner workings" of a Quarkus build and Quarkus code generation have been moved to separate worker processes. The code in this change to support this is pretty dead simple. The primary reason to move that work into isolated worker processes is that all the `QuarkusBootstrap` and derived pieces know nothing about Gradle, so there is no "property like" mechanism to override settings from `application.properties/yaml/yml` with those from e.g. the `QuarkusPluginExtension`. The only option would have been to modify the system properties of the Gradle build process - but that's a no-go, especially considering other tasks running in parallel (think: two `QuarkusBuild` of different projects running at the same time). It would have been relatively easy to serialize all `QuarkusBuild` actions across a Gradle build, but why - and it would prevent using "beefy machines" to run many `QuarkusBuild`s in parallel. Another option would have been to implement a rather complex (and likely very racy) mechanism to track modifications to system properties. As a conclusion, it wasn't just very simple to leverage the process isolation using the Gradle worker API, but it's also not bad. Gradle does reuse already spawned and compatible worker instances. The "trick" implemented to "prioritize" configs from Gradle project properties and Quarkus extension settings is to pass the whole config as system properties to the worker. A bunch of hopefully useful logging has been implemented and added. With info-level loogging, the tasks emit at least some basic information about what they do and at least the package type being used. To be able to investigate which configuration settings were effectively used, there's another new task called `quarkusShowEffectiveConfig` that shows all the `quarkus.*` properties and some more information, including the loaded `application.(properties|yaml|yml)`. This task is intended to debug build issues. The task can optionally save the effecitve config properties as a properties file in the `build/` directory, if run with the command line option `./gradlew quarkusShowEffectiveConfig --save-config-properties`. None of the existing tests has changed, except the `BuildCOnfigurationTest` had to be adopted to reflect the updated order of config sources. Relates to: quarkusio#30852
snazy
added a commit
to snazy/quarkus
that referenced
this issue
Feb 21, 2023
Make `QuarkusBuild` not pollute Gradle's build cache Improve Quarkus build in Gradle, proper up-to-date mechanisms, right config-source priorities. Currently the `QuarkusBuild` task implementation adds even large build artifacts and unmodified dependency jars to Gradle's build cache. This pollutes the Gradle build cache quite a lot and therefore lets archived build caches become unnecessary huge. On top of that, "relicts" from previous runs of `QuarkusBuild` using a different configuration, like a different package-type, causes multiple permutations of the cached artifacts and leads to restoring unexpected artifacts from the build cache. This large change updates the Gradle logic to have "proper" up-to-date checks and in turn a reasonable setting of what is cached and what is not cached. Background on how I discovered the "caching and up-to-date issue" with `QuarkusBuild`: Quarkus 2.16 added the `@CacheableTask` annotation to `QuarkusBuild`. This means that _everything_ that is generated by `QuarkusBuild` (and "declared" as an output: fast-jar directory, uber-jar, native-runner) is cached. After having Quarkus 2.16.x in our "main" branch for about 1/2 day, nearly the whole "10 GB budget" for cache artifacts in GitHub was occupied by Gradle build cache artifacts - the build cache artifacts grew from build to build - this was not the case without the `@CacheableTask` annotation. The current inputs of the `QuarkusBuild` task are also incomplete. Those inputs do not include for example the configured `finalName`, the settings on the Gradle project and the settings on the Quarkus extension. This leads to situations that the wrong Gradle cache artifacts could be restored (think: you ask for a fast-jar, but get an uber-jar). Another issue is that repeated runs of `quarkusBuild` against the exact same code base and dependencies leads to different outputs, the contents of the generated jars and e.g. quarkus-application.dat differ. This causes different outputs of the `QuarkusBuild` task, which cause depending tasks and projects to unnecessarily run/work again. The initial goal of this effort was to make caching of `fast-jar` builds work nice in CI environments, I assume that's the most common package type used in CI environments for example for testing purposes. The idea here is to _not_ cache all the dependency jars, but leverage the mechanisms available in Gradle to "just collect" the dependencies. In other words (and simplified): for `fast-jar`, cache everthing "in `quarkus-app` except `lib/`. There is no change for users running a `./gradlew quarkusBuild`. `QuarkusBuild` is still the task that users should execute (or depend on in their builds). To achive the goal to make especially `fast-jar` builds "nicely cacheable" in CI, two new tasks had to be introduced: * `QuarkusBuildDependencies` uses the dependency information from the Quarkus `ApplicationModel` to tell Gradle to collect those. This task works for `fast-jar` and `legacy-jar` and is a "no op" for other package types. This task is _never_ cacheable, but has _up-to-date_ checks. * `QuarkusBuildCacheableAppParts` performs a Quarkus application build and collects the _cacheable_ parts of that build. This task works for `fast-jar` and `legacy-jar` and is a "no op" for other package types. This task is cacheable. * `QuarkusBuild`, for `fast-jar` and `legacy-jar`, assembles the outputs of the above two tasks to produce the expected output. Performs a Quarkus build for other package types. See below on when this task is cacheable. The `build/quarkus-build/` directory is used to properly separate the outputs of the above three tasks: * `build/quarkus-build/gen/` receives the full output of a Quarkus build * `build/quarkus-build/app/` receives the _cacheable_ parts from `build/quarkus-build/gen/` * `build/quarkus-build/dep/` receives the dependency jars The output of `QuarkusBuild` is, by default, cacheable in non-CI environments and _not_ cacheable in CI environments. The behavior can be explicitly overridden using the new property `cacheLargeArtifacts` property. As outlined above, caching huge artifacts is not beneficial in CI, either due to space limitations (GitHub's 10GB limit for example) or just the cost of network traffic/duration. On a developer's machine however, the cost of storing/retrieving even bigger artifacts is rather neglectible and lower than rebuilding a Quarkus application. Before this change, the "priority" of configuration settings was (in most cases...) effectively: System properties, environment variables, `application.properties`, configurations in Gradle build scripts & Gradle project properties. This order is unintuitive and was changed to: system properties, environment variables, Gradle project properties, Quarkus build configuration, application.properties. The previous code had several places in which Quarkus related configuration settings were retrieved with sometimes different priorities of the "config sources". The whole "machinery" to get the configuration has been rewritten and encapsulated in new classes `EffectiveConfig` (effective for a Quarkus build) and `BaseConfig` (available during Gradle task configuration phase). `EffectiveConfig` is not really different from `BaseConfig, but contains the "special" settings from e.g. `ImagePush` task. The new implementation also uses SmallRye Config and especially the existing Quarkus mechanisms to pull information out of a `PackageConfig` _object_ produced from a configuration. It turned out to be easier to reason about `PackageConfig` than "raw property values". Support for `application.yaml/yml` has also been added. All calls into the "inner workings" of a Quarkus build and Quarkus code generation have been moved to separate worker processes. The code in this change to support this is pretty dead simple. The primary reason to move that work into isolated worker processes is that all the `QuarkusBootstrap` and derived pieces know nothing about Gradle, so there is no "property like" mechanism to override settings from `application.properties/yaml/yml` with those from e.g. the `QuarkusPluginExtension`. The only option would have been to modify the system properties of the Gradle build process - but that's a no-go, especially considering other tasks running in parallel (think: two `QuarkusBuild` of different projects running at the same time). It would have been relatively easy to serialize all `QuarkusBuild` actions across a Gradle build, but why - and it would prevent using "beefy machines" to run many `QuarkusBuild`s in parallel. Another option would have been to implement a rather complex (and likely very racy) mechanism to track modifications to system properties. As a conclusion, it wasn't just very simple to leverage the process isolation using the Gradle worker API, but it's also not bad. Gradle does reuse already spawned and compatible worker instances. The "trick" implemented to "prioritize" configs from Gradle project properties and Quarkus extension settings is to pass the whole config as system properties to the worker. A bunch of hopefully useful logging has been implemented and added. With info-level loogging, the tasks emit at least some basic information about what they do and at least the package type being used. To be able to investigate which configuration settings were effectively used, there's another new task called `quarkusShowEffectiveConfig` that shows all the `quarkus.*` properties and some more information, including the loaded `application.(properties|yaml|yml)`. This task is intended to debug build issues. The task can optionally save the effecitve config properties as a properties file in the `build/` directory, if run with the command line option `./gradlew quarkusShowEffectiveConfig --save-config-properties`. None of the existing tests has changed, except the `BuildCOnfigurationTest` had to be adopted to reflect the updated order of config sources. Relates to: quarkusio#30852
snazy
added a commit
to snazy/quarkus
that referenced
this issue
Feb 25, 2023
Make `QuarkusBuild` not pollute Gradle's build cache Improve Quarkus build in Gradle, proper up-to-date mechanisms, right config-source priorities. Currently the `QuarkusBuild` task implementation adds even large build artifacts and unmodified dependency jars to Gradle's build cache. This pollutes the Gradle build cache quite a lot and therefore lets archived build caches become unnecessary huge. On top of that, "relicts" from previous runs of `QuarkusBuild` using a different configuration, like a different package-type, causes multiple permutations of the cached artifacts and leads to restoring unexpected artifacts from the build cache. This large change updates the Gradle logic to have "proper" up-to-date checks and in turn a reasonable setting of what is cached and what is not cached. Background on how I discovered the "caching and up-to-date issue" with `QuarkusBuild`: Quarkus 2.16 added the `@CacheableTask` annotation to `QuarkusBuild`. This means that _everything_ that is generated by `QuarkusBuild` (and "declared" as an output: fast-jar directory, uber-jar, native-runner) is cached. After having Quarkus 2.16.x in our "main" branch for about 1/2 day, nearly the whole "10 GB budget" for cache artifacts in GitHub was occupied by Gradle build cache artifacts - the build cache artifacts grew from build to build - this was not the case without the `@CacheableTask` annotation. The current inputs of the `QuarkusBuild` task are also incomplete. Those inputs do not include for example the configured `finalName`, the settings on the Gradle project and the settings on the Quarkus extension. This leads to situations that the wrong Gradle cache artifacts could be restored (think: you ask for a fast-jar, but get an uber-jar). Another issue is that repeated runs of `quarkusBuild` against the exact same code base and dependencies leads to different outputs, the contents of the generated jars and e.g. quarkus-application.dat differ. This causes different outputs of the `QuarkusBuild` task, which cause depending tasks and projects to unnecessarily run/work again. The initial goal of this effort was to make caching of `fast-jar` builds work nice in CI environments, I assume that's the most common package type used in CI environments for example for testing purposes. The idea here is to _not_ cache all the dependency jars, but leverage the mechanisms available in Gradle to "just collect" the dependencies. In other words (and simplified): for `fast-jar`, cache everthing "in `quarkus-app` except `lib/`. There is no change for users running a `./gradlew quarkusBuild`. `QuarkusBuild` is still the task that users should execute (or depend on in their builds). To achive the goal to make especially `fast-jar` builds "nicely cacheable" in CI, two new tasks had to be introduced: * `QuarkusBuildDependencies` uses the dependency information from the Quarkus `ApplicationModel` to tell Gradle to collect those. This task works for `fast-jar` and `legacy-jar` and is a "no op" for other package types. This task is _never_ cacheable, but has _up-to-date_ checks. * `QuarkusBuildCacheableAppParts` performs a Quarkus application build and collects the _cacheable_ parts of that build. This task works for `fast-jar` and `legacy-jar` and is a "no op" for other package types. This task is cacheable. * `QuarkusBuild`, for `fast-jar` and `legacy-jar`, assembles the outputs of the above two tasks to produce the expected output. Performs a Quarkus build for other package types. See below on when this task is cacheable. The `build/quarkus-build/` directory is used to properly separate the outputs of the above three tasks: * `build/quarkus-build/gen/` receives the full output of a Quarkus build * `build/quarkus-build/app/` receives the _cacheable_ parts from `build/quarkus-build/gen/` * `build/quarkus-build/dep/` receives the dependency jars The output of `QuarkusBuild` is, by default, cacheable in non-CI environments and _not_ cacheable in CI environments. The behavior can be explicitly overridden using the new property `cacheLargeArtifacts` property. As outlined above, caching huge artifacts is not beneficial in CI, either due to space limitations (GitHub's 10GB limit for example) or just the cost of network traffic/duration. On a developer's machine however, the cost of storing/retrieving even bigger artifacts is rather neglectible and lower than rebuilding a Quarkus application. Before this change, the "priority" of configuration settings was (in most cases...) effectively: System properties, environment variables, `application.properties`, configurations in Gradle build scripts & Gradle project properties. This order is unintuitive and was changed to: system properties, environment variables, Gradle project properties, Quarkus build configuration, application.properties. The previous code had several places in which Quarkus related configuration settings were retrieved with sometimes different priorities of the "config sources". The whole "machinery" to get the configuration has been rewritten and encapsulated in new classes `EffectiveConfig` (effective for a Quarkus build) and `BaseConfig` (available during Gradle task configuration phase). `EffectiveConfig` is not really different from `BaseConfig, but contains the "special" settings from e.g. `ImagePush` task. The new implementation also uses SmallRye Config and especially the existing Quarkus mechanisms to pull information out of a `PackageConfig` _object_ produced from a configuration. It turned out to be easier to reason about `PackageConfig` than "raw property values". Support for `application.yaml/yml` has also been added. All calls into the "inner workings" of a Quarkus build and Quarkus code generation have been moved to separate worker processes. The code in this change to support this is pretty dead simple. The primary reason to move that work into isolated worker processes is that all the `QuarkusBootstrap` and derived pieces know nothing about Gradle, so there is no "property like" mechanism to override settings from `application.properties/yaml/yml` with those from e.g. the `QuarkusPluginExtension`. The only option would have been to modify the system properties of the Gradle build process - but that's a no-go, especially considering other tasks running in parallel (think: two `QuarkusBuild` of different projects running at the same time). It would have been relatively easy to serialize all `QuarkusBuild` actions across a Gradle build, but why - and it would prevent using "beefy machines" to run many `QuarkusBuild`s in parallel. Another option would have been to implement a rather complex (and likely very racy) mechanism to track modifications to system properties. As a conclusion, it wasn't just very simple to leverage the process isolation using the Gradle worker API, but it's also not bad. Gradle does reuse already spawned and compatible worker instances. The "trick" implemented to "prioritize" configs from Gradle project properties and Quarkus extension settings is to pass the whole config as system properties to the worker. A bunch of hopefully useful logging has been implemented and added. With info-level loogging, the tasks emit at least some basic information about what they do and at least the package type being used. To be able to investigate which configuration settings were effectively used, there's another new task called `quarkusShowEffectiveConfig` that shows all the `quarkus.*` properties and some more information, including the loaded `application.(properties|yaml|yml)`. This task is intended to debug build issues. The task can optionally save the effecitve config properties as a properties file in the `build/` directory, if run with the command line option `./gradlew quarkusShowEffectiveConfig --save-config-properties`. None of the existing tests has changed, except the `BuildCOnfigurationTest` had to be adopted to reflect the updated order of config sources. Relates to: quarkusio#30852
snazy
added a commit
to snazy/quarkus
that referenced
this issue
Feb 25, 2023
Make `QuarkusBuild` not pollute Gradle's build cache Improve Quarkus build in Gradle, proper up-to-date mechanisms, right config-source priorities. Currently the `QuarkusBuild` task implementation adds even large build artifacts and unmodified dependency jars to Gradle's build cache. This pollutes the Gradle build cache quite a lot and therefore lets archived build caches become unnecessary huge. On top of that, "relicts" from previous runs of `QuarkusBuild` using a different configuration, like a different package-type, causes multiple permutations of the cached artifacts and leads to restoring unexpected artifacts from the build cache. This large change updates the Gradle logic to have "proper" up-to-date checks and in turn a reasonable setting of what is cached and what is not cached. Background on how I discovered the "caching and up-to-date issue" with `QuarkusBuild`: Quarkus 2.16 added the `@CacheableTask` annotation to `QuarkusBuild`. This means that _everything_ that is generated by `QuarkusBuild` (and "declared" as an output: fast-jar directory, uber-jar, native-runner) is cached. After having Quarkus 2.16.x in our "main" branch for about 1/2 day, nearly the whole "10 GB budget" for cache artifacts in GitHub was occupied by Gradle build cache artifacts - the build cache artifacts grew from build to build - this was not the case without the `@CacheableTask` annotation. The current inputs of the `QuarkusBuild` task are also incomplete. Those inputs do not include for example the configured `finalName`, the settings on the Gradle project and the settings on the Quarkus extension. This leads to situations that the wrong Gradle cache artifacts could be restored (think: you ask for a fast-jar, but get an uber-jar). Another issue is that repeated runs of `quarkusBuild` against the exact same code base and dependencies leads to different outputs, the contents of the generated jars and e.g. quarkus-application.dat differ. This causes different outputs of the `QuarkusBuild` task, which cause depending tasks and projects to unnecessarily run/work again. The initial goal of this effort was to make caching of `fast-jar` builds work nice in CI environments, I assume that's the most common package type used in CI environments for example for testing purposes. The idea here is to _not_ cache all the dependency jars, but leverage the mechanisms available in Gradle to "just collect" the dependencies. In other words (and simplified): for `fast-jar`, cache everthing "in `quarkus-app` except `lib/`. There is no change for users running a `./gradlew quarkusBuild`. `QuarkusBuild` is still the task that users should execute (or depend on in their builds). To achive the goal to make especially `fast-jar` builds "nicely cacheable" in CI, two new tasks had to be introduced: * `QuarkusBuildDependencies` uses the dependency information from the Quarkus `ApplicationModel` to tell Gradle to collect those. This task works for `fast-jar` and `legacy-jar` and is a "no op" for other package types. This task is _never_ cacheable, but has _up-to-date_ checks. * `QuarkusBuildCacheableAppParts` performs a Quarkus application build and collects the _cacheable_ parts of that build. This task works for `fast-jar` and `legacy-jar` and is a "no op" for other package types. This task is cacheable. * `QuarkusBuild`, for `fast-jar` and `legacy-jar`, assembles the outputs of the above two tasks to produce the expected output. Performs a Quarkus build for other package types. See below on when this task is cacheable. The `build/quarkus-build/` directory is used to properly separate the outputs of the above three tasks: * `build/quarkus-build/gen/` receives the full output of a Quarkus build * `build/quarkus-build/app/` receives the _cacheable_ parts from `build/quarkus-build/gen/` * `build/quarkus-build/dep/` receives the dependency jars The output of `QuarkusBuild` is, by default, cacheable in non-CI environments and _not_ cacheable in CI environments. The behavior can be explicitly overridden using the new property `cacheLargeArtifacts` property. As outlined above, caching huge artifacts is not beneficial in CI, either due to space limitations (GitHub's 10GB limit for example) or just the cost of network traffic/duration. On a developer's machine however, the cost of storing/retrieving even bigger artifacts is rather neglectible and lower than rebuilding a Quarkus application. Before this change, the "priority" of configuration settings was (in most cases...) effectively: System properties, environment variables, `application.properties`, configurations in Gradle build scripts & Gradle project properties. This order is unintuitive and was changed to: system properties, environment variables, Gradle project properties, Quarkus build configuration, application.properties. The previous code had several places in which Quarkus related configuration settings were retrieved with sometimes different priorities of the "config sources". The whole "machinery" to get the configuration has been rewritten and encapsulated in new classes `EffectiveConfig` (effective for a Quarkus build) and `BaseConfig` (available during Gradle task configuration phase). `EffectiveConfig` is not really different from `BaseConfig, but contains the "special" settings from e.g. `ImagePush` task. The new implementation also uses SmallRye Config and especially the existing Quarkus mechanisms to pull information out of a `PackageConfig` _object_ produced from a configuration. It turned out to be easier to reason about `PackageConfig` than "raw property values". Support for `application.yaml/yml` has also been added. All calls into the "inner workings" of a Quarkus build and Quarkus code generation have been moved to separate worker processes. The code in this change to support this is pretty dead simple. The primary reason to move that work into isolated worker processes is that all the `QuarkusBootstrap` and derived pieces know nothing about Gradle, so there is no "property like" mechanism to override settings from `application.properties/yaml/yml` with those from e.g. the `QuarkusPluginExtension`. The only option would have been to modify the system properties of the Gradle build process - but that's a no-go, especially considering other tasks running in parallel (think: two `QuarkusBuild` of different projects running at the same time). It would have been relatively easy to serialize all `QuarkusBuild` actions across a Gradle build, but why - and it would prevent using "beefy machines" to run many `QuarkusBuild`s in parallel. Another option would have been to implement a rather complex (and likely very racy) mechanism to track modifications to system properties. As a conclusion, it wasn't just very simple to leverage the process isolation using the Gradle worker API, but it's also not bad. Gradle does reuse already spawned and compatible worker instances. The "trick" implemented to "prioritize" configs from Gradle project properties and Quarkus extension settings is to pass the whole config as system properties to the worker. A bunch of hopefully useful logging has been implemented and added. With info-level loogging, the tasks emit at least some basic information about what they do and at least the package type being used. To be able to investigate which configuration settings were effectively used, there's another new task called `quarkusShowEffectiveConfig` that shows all the `quarkus.*` properties and some more information, including the loaded `application.(properties|yaml|yml)`. This task is intended to debug build issues. The task can optionally save the effecitve config properties as a properties file in the `build/` directory, if run with the command line option `./gradlew quarkusShowEffectiveConfig --save-config-properties`. None of the existing tests has changed, except the `BuildCOnfigurationTest` had to be adopted to reflect the updated order of config sources. Relates to: quarkusio#30852
snazy
added a commit
to snazy/quarkus
that referenced
this issue
Feb 27, 2023
Make `QuarkusBuild` not pollute Gradle's build cache Improve Quarkus build in Gradle, proper up-to-date mechanisms, right config-source priorities. Currently the `QuarkusBuild` task implementation adds even large build artifacts and unmodified dependency jars to Gradle's build cache. This pollutes the Gradle build cache quite a lot and therefore lets archived build caches become unnecessary huge. On top of that, "relicts" from previous runs of `QuarkusBuild` using a different configuration, like a different package-type, causes multiple permutations of the cached artifacts and leads to restoring unexpected artifacts from the build cache. This large change updates the Gradle logic to have "proper" up-to-date checks and in turn a reasonable setting of what is cached and what is not cached. Background on how I discovered the "caching and up-to-date issue" with `QuarkusBuild`: Quarkus 2.16 added the `@CacheableTask` annotation to `QuarkusBuild`. This means that _everything_ that is generated by `QuarkusBuild` (and "declared" as an output: fast-jar directory, uber-jar, native-runner) is cached. After having Quarkus 2.16.x in our "main" branch for about 1/2 day, nearly the whole "10 GB budget" for cache artifacts in GitHub was occupied by Gradle build cache artifacts - the build cache artifacts grew from build to build - this was not the case without the `@CacheableTask` annotation. The current inputs of the `QuarkusBuild` task are also incomplete. Those inputs do not include for example the configured `finalName`, the settings on the Gradle project and the settings on the Quarkus extension. This leads to situations that the wrong Gradle cache artifacts could be restored (think: you ask for a fast-jar, but get an uber-jar). Another issue is that repeated runs of `quarkusBuild` against the exact same code base and dependencies leads to different outputs, the contents of the generated jars and e.g. quarkus-application.dat differ. This causes different outputs of the `QuarkusBuild` task, which cause depending tasks and projects to unnecessarily run/work again. The initial goal of this effort was to make caching of `fast-jar` builds work nice in CI environments, I assume that's the most common package type used in CI environments for example for testing purposes. The idea here is to _not_ cache all the dependency jars, but leverage the mechanisms available in Gradle to "just collect" the dependencies. In other words (and simplified): for `fast-jar`, cache everthing "in `quarkus-app` except `lib/`. There is no change for users running a `./gradlew quarkusBuild`. `QuarkusBuild` is still the task that users should execute (or depend on in their builds). To achive the goal to make especially `fast-jar` builds "nicely cacheable" in CI, two new tasks had to be introduced: * `QuarkusBuildDependencies` uses the dependency information from the Quarkus `ApplicationModel` to tell Gradle to collect those. This task works for `fast-jar` and `legacy-jar` and is a "no op" for other package types. This task is _never_ cacheable, but has _up-to-date_ checks. * `QuarkusBuildCacheableAppParts` performs a Quarkus application build and collects the _cacheable_ parts of that build. This task works for `fast-jar` and `legacy-jar` and is a "no op" for other package types. This task is cacheable. * `QuarkusBuild`, for `fast-jar` and `legacy-jar`, assembles the outputs of the above two tasks to produce the expected output. Performs a Quarkus build for other package types. See below on when this task is cacheable. The `build/quarkus-build/` directory is used to properly separate the outputs of the above three tasks: * `build/quarkus-build/gen/` receives the full output of a Quarkus build * `build/quarkus-build/app/` receives the _cacheable_ parts from `build/quarkus-build/gen/` * `build/quarkus-build/dep/` receives the dependency jars The output of `QuarkusBuild` is, by default, cacheable in non-CI environments and _not_ cacheable in CI environments. The behavior can be explicitly overridden using the new property `cacheLargeArtifacts` property. As outlined above, caching huge artifacts is not beneficial in CI, either due to space limitations (GitHub's 10GB limit for example) or just the cost of network traffic/duration. On a developer's machine however, the cost of storing/retrieving even bigger artifacts is rather neglectible and lower than rebuilding a Quarkus application. Before this change, the "priority" of configuration settings was (in most cases...) effectively: System properties, environment variables, `application.properties`, configurations in Gradle build scripts & Gradle project properties. This order is unintuitive and was changed to: system properties, environment variables, Gradle project properties, Quarkus build configuration, application.properties. The previous code had several places in which Quarkus related configuration settings were retrieved with sometimes different priorities of the "config sources". The whole "machinery" to get the configuration has been rewritten and encapsulated in new classes `EffectiveConfig` (effective for a Quarkus build) and `BaseConfig` (available during Gradle task configuration phase). `EffectiveConfig` is not really different from `BaseConfig, but contains the "special" settings from e.g. `ImagePush` task. The new implementation also uses SmallRye Config and especially the existing Quarkus mechanisms to pull information out of a `PackageConfig` _object_ produced from a configuration. It turned out to be easier to reason about `PackageConfig` than "raw property values". Support for `application.yaml/yml` has also been added. All calls into the "inner workings" of a Quarkus build and Quarkus code generation have been moved to separate worker processes. The code in this change to support this is pretty dead simple. The primary reason to move that work into isolated worker processes is that all the `QuarkusBootstrap` and derived pieces know nothing about Gradle, so there is no "property like" mechanism to override settings from `application.properties/yaml/yml` with those from e.g. the `QuarkusPluginExtension`. The only option would have been to modify the system properties of the Gradle build process - but that's a no-go, especially considering other tasks running in parallel (think: two `QuarkusBuild` of different projects running at the same time). It would have been relatively easy to serialize all `QuarkusBuild` actions across a Gradle build, but why - and it would prevent using "beefy machines" to run many `QuarkusBuild`s in parallel. Another option would have been to implement a rather complex (and likely very racy) mechanism to track modifications to system properties. As a conclusion, it wasn't just very simple to leverage the process isolation using the Gradle worker API, but it's also not bad. Gradle does reuse already spawned and compatible worker instances. The "trick" implemented to "prioritize" configs from Gradle project properties and Quarkus extension settings is to pass the whole config as system properties to the worker. A bunch of hopefully useful logging has been implemented and added. With info-level loogging, the tasks emit at least some basic information about what they do and at least the package type being used. To be able to investigate which configuration settings were effectively used, there's another new task called `quarkusShowEffectiveConfig` that shows all the `quarkus.*` properties and some more information, including the loaded `application.(properties|yaml|yml)`. This task is intended to debug build issues. The task can optionally save the effecitve config properties as a properties file in the `build/` directory, if run with the command line option `./gradlew quarkusShowEffectiveConfig --save-config-properties`. None of the existing tests has changed, except the `BuildCOnfigurationTest` had to be adopted to reflect the updated order of config sources. Relates to: quarkusio#30852
snazy
added a commit
to snazy/quarkus
that referenced
this issue
Feb 27, 2023
Make `QuarkusBuild` not pollute Gradle's build cache Improve Quarkus build in Gradle, proper up-to-date mechanisms, right config-source priorities. Currently the `QuarkusBuild` task implementation adds even large build artifacts and unmodified dependency jars to Gradle's build cache. This pollutes the Gradle build cache quite a lot and therefore lets archived build caches become unnecessary huge. On top of that, "relicts" from previous runs of `QuarkusBuild` using a different configuration, like a different package-type, causes multiple permutations of the cached artifacts and leads to restoring unexpected artifacts from the build cache. This large change updates the Gradle logic to have "proper" up-to-date checks and in turn a reasonable setting of what is cached and what is not cached. Background on how I discovered the "caching and up-to-date issue" with `QuarkusBuild`: Quarkus 2.16 added the `@CacheableTask` annotation to `QuarkusBuild`. This means that _everything_ that is generated by `QuarkusBuild` (and "declared" as an output: fast-jar directory, uber-jar, native-runner) is cached. After having Quarkus 2.16.x in our "main" branch for about 1/2 day, nearly the whole "10 GB budget" for cache artifacts in GitHub was occupied by Gradle build cache artifacts - the build cache artifacts grew from build to build - this was not the case without the `@CacheableTask` annotation. The current inputs of the `QuarkusBuild` task are also incomplete. Those inputs do not include for example the configured `finalName`, the settings on the Gradle project and the settings on the Quarkus extension. This leads to situations that the wrong Gradle cache artifacts could be restored (think: you ask for a fast-jar, but get an uber-jar). Another issue is that repeated runs of `quarkusBuild` against the exact same code base and dependencies leads to different outputs, the contents of the generated jars and e.g. quarkus-application.dat differ. This causes different outputs of the `QuarkusBuild` task, which cause depending tasks and projects to unnecessarily run/work again. The initial goal of this effort was to make caching of `fast-jar` builds work nice in CI environments, I assume that's the most common package type used in CI environments for example for testing purposes. The idea here is to _not_ cache all the dependency jars, but leverage the mechanisms available in Gradle to "just collect" the dependencies. In other words (and simplified): for `fast-jar`, cache everthing "in `quarkus-app` except `lib/`. There is no change for users running a `./gradlew quarkusBuild`. `QuarkusBuild` is still the task that users should execute (or depend on in their builds). To achive the goal to make especially `fast-jar` builds "nicely cacheable" in CI, two new tasks had to be introduced: * `QuarkusBuildDependencies` uses the dependency information from the Quarkus `ApplicationModel` to tell Gradle to collect those. This task works for `fast-jar` and `legacy-jar` and is a "no op" for other package types. This task is _never_ cacheable, but has _up-to-date_ checks. * `QuarkusBuildCacheableAppParts` performs a Quarkus application build and collects the _cacheable_ parts of that build. This task works for `fast-jar` and `legacy-jar` and is a "no op" for other package types. This task is cacheable. * `QuarkusBuild`, for `fast-jar` and `legacy-jar`, assembles the outputs of the above two tasks to produce the expected output. Performs a Quarkus build for other package types. See below on when this task is cacheable. The `build/quarkus-build/` directory is used to properly separate the outputs of the above three tasks: * `build/quarkus-build/gen/` receives the full output of a Quarkus build * `build/quarkus-build/app/` receives the _cacheable_ parts from `build/quarkus-build/gen/` * `build/quarkus-build/dep/` receives the dependency jars The output of `QuarkusBuild` is, by default, cacheable in non-CI environments and _not_ cacheable in CI environments. The behavior can be explicitly overridden using the new property `cacheLargeArtifacts` property. As outlined above, caching huge artifacts is not beneficial in CI, either due to space limitations (GitHub's 10GB limit for example) or just the cost of network traffic/duration. On a developer's machine however, the cost of storing/retrieving even bigger artifacts is rather neglectible and lower than rebuilding a Quarkus application. Before this change, the "priority" of configuration settings was (in most cases...) effectively: System properties, environment variables, `application.properties`, configurations in Gradle build scripts & Gradle project properties. This order is unintuitive and was changed to: system properties, environment variables, Gradle project properties, Quarkus build configuration, application.properties. The previous code had several places in which Quarkus related configuration settings were retrieved with sometimes different priorities of the "config sources". The whole "machinery" to get the configuration has been rewritten and encapsulated in new classes `EffectiveConfig` (effective for a Quarkus build) and `BaseConfig` (available during Gradle task configuration phase). `EffectiveConfig` is not really different from `BaseConfig, but contains the "special" settings from e.g. `ImagePush` task. The new implementation also uses SmallRye Config and especially the existing Quarkus mechanisms to pull information out of a `PackageConfig` _object_ produced from a configuration. It turned out to be easier to reason about `PackageConfig` than "raw property values". Support for `application.yaml/yml` has also been added. All calls into the "inner workings" of a Quarkus build and Quarkus code generation have been moved to separate worker processes. The code in this change to support this is pretty dead simple. The primary reason to move that work into isolated worker processes is that all the `QuarkusBootstrap` and derived pieces know nothing about Gradle, so there is no "property like" mechanism to override settings from `application.properties/yaml/yml` with those from e.g. the `QuarkusPluginExtension`. The only option would have been to modify the system properties of the Gradle build process - but that's a no-go, especially considering other tasks running in parallel (think: two `QuarkusBuild` of different projects running at the same time). It would have been relatively easy to serialize all `QuarkusBuild` actions across a Gradle build, but why - and it would prevent using "beefy machines" to run many `QuarkusBuild`s in parallel. Another option would have been to implement a rather complex (and likely very racy) mechanism to track modifications to system properties. As a conclusion, it wasn't just very simple to leverage the process isolation using the Gradle worker API, but it's also not bad. Gradle does reuse already spawned and compatible worker instances. The "trick" implemented to "prioritize" configs from Gradle project properties and Quarkus extension settings is to pass the whole config as system properties to the worker. A bunch of hopefully useful logging has been implemented and added. With info-level loogging, the tasks emit at least some basic information about what they do and at least the package type being used. To be able to investigate which configuration settings were effectively used, there's another new task called `quarkusShowEffectiveConfig` that shows all the `quarkus.*` properties and some more information, including the loaded `application.(properties|yaml|yml)`. This task is intended to debug build issues. The task can optionally save the effecitve config properties as a properties file in the `build/` directory, if run with the command line option `./gradlew quarkusShowEffectiveConfig --save-config-properties`. None of the existing tests has changed, except the `BuildCOnfigurationTest` had to be adopted to reflect the updated order of config sources. Relates to: quarkusio#30852
snazy
added a commit
to snazy/quarkus
that referenced
this issue
Feb 28, 2023
Make `QuarkusBuild` not pollute Gradle's build cache Improve Quarkus build in Gradle, proper up-to-date mechanisms, right config-source priorities. Currently the `QuarkusBuild` task implementation adds even large build artifacts and unmodified dependency jars to Gradle's build cache. This pollutes the Gradle build cache quite a lot and therefore lets archived build caches become unnecessary huge. On top of that, "relicts" from previous runs of `QuarkusBuild` using a different configuration, like a different package-type, causes multiple permutations of the cached artifacts and leads to restoring unexpected artifacts from the build cache. This large change updates the Gradle logic to have "proper" up-to-date checks and in turn a reasonable setting of what is cached and what is not cached. Background on how I discovered the "caching and up-to-date issue" with `QuarkusBuild`: Quarkus 2.16 added the `@CacheableTask` annotation to `QuarkusBuild`. This means that _everything_ that is generated by `QuarkusBuild` (and "declared" as an output: fast-jar directory, uber-jar, native-runner) is cached. After having Quarkus 2.16.x in our "main" branch for about 1/2 day, nearly the whole "10 GB budget" for cache artifacts in GitHub was occupied by Gradle build cache artifacts - the build cache artifacts grew from build to build - this was not the case without the `@CacheableTask` annotation. The current inputs of the `QuarkusBuild` task are also incomplete. Those inputs do not include for example the configured `finalName`, the settings on the Gradle project and the settings on the Quarkus extension. This leads to situations that the wrong Gradle cache artifacts could be restored (think: you ask for a fast-jar, but get an uber-jar). Another issue is that repeated runs of `quarkusBuild` against the exact same code base and dependencies leads to different outputs, the contents of the generated jars and e.g. quarkus-application.dat differ. This causes different outputs of the `QuarkusBuild` task, which cause depending tasks and projects to unnecessarily run/work again. The initial goal of this effort was to make caching of `fast-jar` builds work nice in CI environments, I assume that's the most common package type used in CI environments for example for testing purposes. The idea here is to _not_ cache all the dependency jars, but leverage the mechanisms available in Gradle to "just collect" the dependencies. In other words (and simplified): for `fast-jar`, cache everthing "in `quarkus-app` except `lib/`. There is no change for users running a `./gradlew quarkusBuild`. `QuarkusBuild` is still the task that users should execute (or depend on in their builds). To achive the goal to make especially `fast-jar` builds "nicely cacheable" in CI, two new tasks had to be introduced: * `QuarkusBuildDependencies` uses the dependency information from the Quarkus `ApplicationModel` to tell Gradle to collect those. This task works for `fast-jar` and `legacy-jar` and is a "no op" for other package types. This task is _never_ cacheable, but has _up-to-date_ checks. * `QuarkusBuildCacheableAppParts` performs a Quarkus application build and collects the _cacheable_ parts of that build. This task works for `fast-jar` and `legacy-jar` and is a "no op" for other package types. This task is cacheable. * `QuarkusBuild`, for `fast-jar` and `legacy-jar`, assembles the outputs of the above two tasks to produce the expected output. Performs a Quarkus build for other package types. See below on when this task is cacheable. The `build/quarkus-build/` directory is used to properly separate the outputs of the above three tasks: * `build/quarkus-build/gen/` receives the full output of a Quarkus build * `build/quarkus-build/app/` receives the _cacheable_ parts from `build/quarkus-build/gen/` * `build/quarkus-build/dep/` receives the dependency jars The output of `QuarkusBuild` is, by default, cacheable in non-CI environments and _not_ cacheable in CI environments. The behavior can be explicitly overridden using the new property `cacheLargeArtifacts` property. As outlined above, caching huge artifacts is not beneficial in CI, either due to space limitations (GitHub's 10GB limit for example) or just the cost of network traffic/duration. On a developer's machine however, the cost of storing/retrieving even bigger artifacts is rather neglectible and lower than rebuilding a Quarkus application. Before this change, the "priority" of configuration settings was (in most cases...) effectively: System properties, environment variables, `application.properties`, configurations in Gradle build scripts & Gradle project properties. This order is unintuitive and was changed to: system properties, environment variables, Gradle project properties, Quarkus build configuration, application.properties. The previous code had several places in which Quarkus related configuration settings were retrieved with sometimes different priorities of the "config sources". The whole "machinery" to get the configuration has been rewritten and encapsulated in new classes `EffectiveConfig` (effective for a Quarkus build) and `BaseConfig` (available during Gradle task configuration phase). `EffectiveConfig` is not really different from `BaseConfig, but contains the "special" settings from e.g. `ImagePush` task. The new implementation also uses SmallRye Config and especially the existing Quarkus mechanisms to pull information out of a `PackageConfig` _object_ produced from a configuration. It turned out to be easier to reason about `PackageConfig` than "raw property values". Support for `application.yaml/yml` has also been added. All calls into the "inner workings" of a Quarkus build and Quarkus code generation have been moved to separate worker processes. The code in this change to support this is pretty dead simple. The primary reason to move that work into isolated worker processes is that all the `QuarkusBootstrap` and derived pieces know nothing about Gradle, so there is no "property like" mechanism to override settings from `application.properties/yaml/yml` with those from e.g. the `QuarkusPluginExtension`. The only option would have been to modify the system properties of the Gradle build process - but that's a no-go, especially considering other tasks running in parallel (think: two `QuarkusBuild` of different projects running at the same time). It would have been relatively easy to serialize all `QuarkusBuild` actions across a Gradle build, but why - and it would prevent using "beefy machines" to run many `QuarkusBuild`s in parallel. Another option would have been to implement a rather complex (and likely very racy) mechanism to track modifications to system properties. As a conclusion, it wasn't just very simple to leverage the process isolation using the Gradle worker API, but it's also not bad. Gradle does reuse already spawned and compatible worker instances. The "trick" implemented to "prioritize" configs from Gradle project properties and Quarkus extension settings is to pass the whole config as system properties to the worker. A bunch of hopefully useful logging has been implemented and added. With info-level loogging, the tasks emit at least some basic information about what they do and at least the package type being used. To be able to investigate which configuration settings were effectively used, there's another new task called `quarkusShowEffectiveConfig` that shows all the `quarkus.*` properties and some more information, including the loaded `application.(properties|yaml|yml)`. This task is intended to debug build issues. The task can optionally save the effecitve config properties as a properties file in the `build/` directory, if run with the command line option `./gradlew quarkusShowEffectiveConfig --save-config-properties`. None of the existing tests has changed, except the `BuildCOnfigurationTest` had to be adopted to reflect the updated order of config sources. Relates to: quarkusio#30852
snazy
added a commit
to snazy/quarkus
that referenced
this issue
Mar 1, 2023
Make `QuarkusBuild` not pollute Gradle's build cache Improve Quarkus build in Gradle, proper up-to-date mechanisms, right config-source priorities. Currently the `QuarkusBuild` task implementation adds even large build artifacts and unmodified dependency jars to Gradle's build cache. This pollutes the Gradle build cache quite a lot and therefore lets archived build caches become unnecessary huge. On top of that, "relicts" from previous runs of `QuarkusBuild` using a different configuration, like a different package-type, causes multiple permutations of the cached artifacts and leads to restoring unexpected artifacts from the build cache. This large change updates the Gradle logic to have "proper" up-to-date checks and in turn a reasonable setting of what is cached and what is not cached. Background on how I discovered the "caching and up-to-date issue" with `QuarkusBuild`: Quarkus 2.16 added the `@CacheableTask` annotation to `QuarkusBuild`. This means that _everything_ that is generated by `QuarkusBuild` (and "declared" as an output: fast-jar directory, uber-jar, native-runner) is cached. After having Quarkus 2.16.x in our "main" branch for about 1/2 day, nearly the whole "10 GB budget" for cache artifacts in GitHub was occupied by Gradle build cache artifacts - the build cache artifacts grew from build to build - this was not the case without the `@CacheableTask` annotation. The current inputs of the `QuarkusBuild` task are also incomplete. Those inputs do not include for example the configured `finalName`, the settings on the Gradle project and the settings on the Quarkus extension. This leads to situations that the wrong Gradle cache artifacts could be restored (think: you ask for a fast-jar, but get an uber-jar). Another issue is that repeated runs of `quarkusBuild` against the exact same code base and dependencies leads to different outputs, the contents of the generated jars and e.g. quarkus-application.dat differ. This causes different outputs of the `QuarkusBuild` task, which cause depending tasks and projects to unnecessarily run/work again. The initial goal of this effort was to make caching of `fast-jar` builds work nice in CI environments, I assume that's the most common package type used in CI environments for example for testing purposes. The idea here is to _not_ cache all the dependency jars, but leverage the mechanisms available in Gradle to "just collect" the dependencies. In other words (and simplified): for `fast-jar`, cache everthing "in `quarkus-app` except `lib/`. There is no change for users running a `./gradlew quarkusBuild`. `QuarkusBuild` is still the task that users should execute (or depend on in their builds). To achive the goal to make especially `fast-jar` builds "nicely cacheable" in CI, two new tasks had to be introduced: * `QuarkusBuildDependencies` uses the dependency information from the Quarkus `ApplicationModel` to tell Gradle to collect those. This task works for `fast-jar` and `legacy-jar` and is a "no op" for other package types. This task is _never_ cacheable, but has _up-to-date_ checks. * `QuarkusBuildCacheableAppParts` performs a Quarkus application build and collects the _cacheable_ parts of that build. This task works for `fast-jar` and `legacy-jar` and is a "no op" for other package types. This task is cacheable. * `QuarkusBuild`, for `fast-jar` and `legacy-jar`, assembles the outputs of the above two tasks to produce the expected output. Performs a Quarkus build for other package types. See below on when this task is cacheable. The `build/quarkus-build/` directory is used to properly separate the outputs of the above three tasks: * `build/quarkus-build/gen/` receives the full output of a Quarkus build * `build/quarkus-build/app/` receives the _cacheable_ parts from `build/quarkus-build/gen/` * `build/quarkus-build/dep/` receives the dependency jars The output of `QuarkusBuild` is, by default, cacheable in non-CI environments and _not_ cacheable in CI environments. The behavior can be explicitly overridden using the new property `cacheLargeArtifacts` property. As outlined above, caching huge artifacts is not beneficial in CI, either due to space limitations (GitHub's 10GB limit for example) or just the cost of network traffic/duration. On a developer's machine however, the cost of storing/retrieving even bigger artifacts is rather neglectible and lower than rebuilding a Quarkus application. Before this change, the "priority" of configuration settings was (in most cases...) effectively: System properties, environment variables, `application.properties`, configurations in Gradle build scripts & Gradle project properties. This order is unintuitive and was changed to: system properties, environment variables, Gradle project properties, Quarkus build configuration, application.properties. The previous code had several places in which Quarkus related configuration settings were retrieved with sometimes different priorities of the "config sources". The whole "machinery" to get the configuration has been rewritten and encapsulated in new classes `EffectiveConfig` (effective for a Quarkus build) and `BaseConfig` (available during Gradle task configuration phase). `EffectiveConfig` is not really different from `BaseConfig, but contains the "special" settings from e.g. `ImagePush` task. The new implementation also uses SmallRye Config and especially the existing Quarkus mechanisms to pull information out of a `PackageConfig` _object_ produced from a configuration. It turned out to be easier to reason about `PackageConfig` than "raw property values". Support for `application.yaml/yml` has also been added. All calls into the "inner workings" of a Quarkus build and Quarkus code generation have been moved to separate worker processes. The code in this change to support this is pretty dead simple. The primary reason to move that work into isolated worker processes is that all the `QuarkusBootstrap` and derived pieces know nothing about Gradle, so there is no "property like" mechanism to override settings from `application.properties/yaml/yml` with those from e.g. the `QuarkusPluginExtension`. The only option would have been to modify the system properties of the Gradle build process - but that's a no-go, especially considering other tasks running in parallel (think: two `QuarkusBuild` of different projects running at the same time). It would have been relatively easy to serialize all `QuarkusBuild` actions across a Gradle build, but why - and it would prevent using "beefy machines" to run many `QuarkusBuild`s in parallel. Another option would have been to implement a rather complex (and likely very racy) mechanism to track modifications to system properties. As a conclusion, it wasn't just very simple to leverage the process isolation using the Gradle worker API, but it's also not bad. Gradle does reuse already spawned and compatible worker instances. The "trick" implemented to "prioritize" configs from Gradle project properties and Quarkus extension settings is to pass the whole config as system properties to the worker. A bunch of hopefully useful logging has been implemented and added. With info-level loogging, the tasks emit at least some basic information about what they do and at least the package type being used. To be able to investigate which configuration settings were effectively used, there's another new task called `quarkusShowEffectiveConfig` that shows all the `quarkus.*` properties and some more information, including the loaded `application.(properties|yaml|yml)`. This task is intended to debug build issues. The task can optionally save the effecitve config properties as a properties file in the `build/` directory, if run with the command line option `./gradlew quarkusShowEffectiveConfig --save-config-properties`. None of the existing tests has changed, except the `BuildCOnfigurationTest` had to be adopted to reflect the updated order of config sources. Relates to: quarkusio#30852
snazy
added a commit
to snazy/quarkus
that referenced
this issue
Mar 2, 2023
Make `QuarkusBuild` not pollute Gradle's build cache Improve Quarkus build in Gradle, proper up-to-date mechanisms, right config-source priorities. Currently the `QuarkusBuild` task implementation adds even large build artifacts and unmodified dependency jars to Gradle's build cache. This pollutes the Gradle build cache quite a lot and therefore lets archived build caches become unnecessary huge. On top of that, "relicts" from previous runs of `QuarkusBuild` using a different configuration, like a different package-type, causes multiple permutations of the cached artifacts and leads to restoring unexpected artifacts from the build cache. This large change updates the Gradle logic to have "proper" up-to-date checks and in turn a reasonable setting of what is cached and what is not cached. Background on how I discovered the "caching and up-to-date issue" with `QuarkusBuild`: Quarkus 2.16 added the `@CacheableTask` annotation to `QuarkusBuild`. This means that _everything_ that is generated by `QuarkusBuild` (and "declared" as an output: fast-jar directory, uber-jar, native-runner) is cached. After having Quarkus 2.16.x in our "main" branch for about 1/2 day, nearly the whole "10 GB budget" for cache artifacts in GitHub was occupied by Gradle build cache artifacts - the build cache artifacts grew from build to build - this was not the case without the `@CacheableTask` annotation. The current inputs of the `QuarkusBuild` task are also incomplete. Those inputs do not include for example the configured `finalName`, the settings on the Gradle project and the settings on the Quarkus extension. This leads to situations that the wrong Gradle cache artifacts could be restored (think: you ask for a fast-jar, but get an uber-jar). Another issue is that repeated runs of `quarkusBuild` against the exact same code base and dependencies leads to different outputs, the contents of the generated jars and e.g. quarkus-application.dat differ. This causes different outputs of the `QuarkusBuild` task, which cause depending tasks and projects to unnecessarily run/work again. The initial goal of this effort was to make caching of `fast-jar` builds work nice in CI environments, I assume that's the most common package type used in CI environments for example for testing purposes. The idea here is to _not_ cache all the dependency jars, but leverage the mechanisms available in Gradle to "just collect" the dependencies. In other words (and simplified): for `fast-jar`, cache everthing "in `quarkus-app` except `lib/`. There is no change for users running a `./gradlew quarkusBuild`. `QuarkusBuild` is still the task that users should execute (or depend on in their builds). To achive the goal to make especially `fast-jar` builds "nicely cacheable" in CI, two new tasks had to be introduced: * `QuarkusBuildDependencies` uses the dependency information from the Quarkus `ApplicationModel` to tell Gradle to collect those. This task works for `fast-jar` and `legacy-jar` and is a "no op" for other package types. This task is _never_ cacheable, but has _up-to-date_ checks. * `QuarkusBuildCacheableAppParts` performs a Quarkus application build and collects the _cacheable_ parts of that build. This task works for `fast-jar` and `legacy-jar` and is a "no op" for other package types. This task is cacheable. * `QuarkusBuild`, for `fast-jar` and `legacy-jar`, assembles the outputs of the above two tasks to produce the expected output. Performs a Quarkus build for other package types. See below on when this task is cacheable. The `build/quarkus-build/` directory is used to properly separate the outputs of the above three tasks: * `build/quarkus-build/gen/` receives the full output of a Quarkus build * `build/quarkus-build/app/` receives the _cacheable_ parts from `build/quarkus-build/gen/` * `build/quarkus-build/dep/` receives the dependency jars The output of `QuarkusBuild` is, by default, cacheable in non-CI environments and _not_ cacheable in CI environments. The behavior can be explicitly overridden using the new property `cacheLargeArtifacts` property. As outlined above, caching huge artifacts is not beneficial in CI, either due to space limitations (GitHub's 10GB limit for example) or just the cost of network traffic/duration. On a developer's machine however, the cost of storing/retrieving even bigger artifacts is rather neglectible and lower than rebuilding a Quarkus application. Before this change, the "priority" of configuration settings was (in most cases...) effectively: System properties, environment variables, `application.properties`, configurations in Gradle build scripts & Gradle project properties. This order is unintuitive and was changed to: system properties, environment variables, Gradle project properties, Quarkus build configuration, application.properties. The previous code had several places in which Quarkus related configuration settings were retrieved with sometimes different priorities of the "config sources". The whole "machinery" to get the configuration has been rewritten and encapsulated in new classes `EffectiveConfig` (effective for a Quarkus build) and `BaseConfig` (available during Gradle task configuration phase). `EffectiveConfig` is not really different from `BaseConfig, but contains the "special" settings from e.g. `ImagePush` task. The new implementation also uses SmallRye Config and especially the existing Quarkus mechanisms to pull information out of a `PackageConfig` _object_ produced from a configuration. It turned out to be easier to reason about `PackageConfig` than "raw property values". Support for `application.yaml/yml` has also been added. All calls into the "inner workings" of a Quarkus build and Quarkus code generation have been moved to separate worker processes. The code in this change to support this is pretty dead simple. The primary reason to move that work into isolated worker processes is that all the `QuarkusBootstrap` and derived pieces know nothing about Gradle, so there is no "property like" mechanism to override settings from `application.properties/yaml/yml` with those from e.g. the `QuarkusPluginExtension`. The only option would have been to modify the system properties of the Gradle build process - but that's a no-go, especially considering other tasks running in parallel (think: two `QuarkusBuild` of different projects running at the same time). It would have been relatively easy to serialize all `QuarkusBuild` actions across a Gradle build, but why - and it would prevent using "beefy machines" to run many `QuarkusBuild`s in parallel. Another option would have been to implement a rather complex (and likely very racy) mechanism to track modifications to system properties. As a conclusion, it wasn't just very simple to leverage the process isolation using the Gradle worker API, but it's also not bad. Gradle does reuse already spawned and compatible worker instances. The "trick" implemented to "prioritize" configs from Gradle project properties and Quarkus extension settings is to pass the whole config as system properties to the worker. A bunch of hopefully useful logging has been implemented and added. With info-level loogging, the tasks emit at least some basic information about what they do and at least the package type being used. To be able to investigate which configuration settings were effectively used, there's another new task called `quarkusShowEffectiveConfig` that shows all the `quarkus.*` properties and some more information, including the loaded `application.(properties|yaml|yml)`. This task is intended to debug build issues. The task can optionally save the effecitve config properties as a properties file in the `build/` directory, if run with the command line option `./gradlew quarkusShowEffectiveConfig --save-config-properties`. None of the existing tests has changed, except the `BuildCOnfigurationTest` had to be adopted to reflect the updated order of config sources. Relates to: quarkusio#30852
snazy
added a commit
to snazy/quarkus
that referenced
this issue
Mar 3, 2023
Make `QuarkusBuild` not pollute Gradle's build cache Improve Quarkus build in Gradle, proper up-to-date mechanisms, right config-source priorities. Currently the `QuarkusBuild` task implementation adds even large build artifacts and unmodified dependency jars to Gradle's build cache. This pollutes the Gradle build cache quite a lot and therefore lets archived build caches become unnecessary huge. On top of that, "relicts" from previous runs of `QuarkusBuild` using a different configuration, like a different package-type, causes multiple permutations of the cached artifacts and leads to restoring unexpected artifacts from the build cache. This large change updates the Gradle logic to have "proper" up-to-date checks and in turn a reasonable setting of what is cached and what is not cached. Background on how I discovered the "caching and up-to-date issue" with `QuarkusBuild`: Quarkus 2.16 added the `@CacheableTask` annotation to `QuarkusBuild`. This means that _everything_ that is generated by `QuarkusBuild` (and "declared" as an output: fast-jar directory, uber-jar, native-runner) is cached. After having Quarkus 2.16.x in our "main" branch for about 1/2 day, nearly the whole "10 GB budget" for cache artifacts in GitHub was occupied by Gradle build cache artifacts - the build cache artifacts grew from build to build - this was not the case without the `@CacheableTask` annotation. The current inputs of the `QuarkusBuild` task are also incomplete. Those inputs do not include for example the configured `finalName`, the settings on the Gradle project and the settings on the Quarkus extension. This leads to situations that the wrong Gradle cache artifacts could be restored (think: you ask for a fast-jar, but get an uber-jar). Another issue is that repeated runs of `quarkusBuild` against the exact same code base and dependencies leads to different outputs, the contents of the generated jars and e.g. quarkus-application.dat differ. This causes different outputs of the `QuarkusBuild` task, which cause depending tasks and projects to unnecessarily run/work again. The initial goal of this effort was to make caching of `fast-jar` builds work nice in CI environments, I assume that's the most common package type used in CI environments for example for testing purposes. The idea here is to _not_ cache all the dependency jars, but leverage the mechanisms available in Gradle to "just collect" the dependencies. In other words (and simplified): for `fast-jar`, cache everthing "in `quarkus-app` except `lib/`. There is no change for users running a `./gradlew quarkusBuild`. `QuarkusBuild` is still the task that users should execute (or depend on in their builds). To achive the goal to make especially `fast-jar` builds "nicely cacheable" in CI, two new tasks had to be introduced: * `QuarkusBuildDependencies` uses the dependency information from the Quarkus `ApplicationModel` to tell Gradle to collect those. This task works for `fast-jar` and `legacy-jar` and is a "no op" for other package types. This task is _never_ cacheable, but has _up-to-date_ checks. * `QuarkusBuildCacheableAppParts` performs a Quarkus application build and collects the _cacheable_ parts of that build. This task works for `fast-jar` and `legacy-jar` and is a "no op" for other package types. This task is cacheable. * `QuarkusBuild`, for `fast-jar` and `legacy-jar`, assembles the outputs of the above two tasks to produce the expected output. Performs a Quarkus build for other package types. See below on when this task is cacheable. The `build/quarkus-build/` directory is used to properly separate the outputs of the above three tasks: * `build/quarkus-build/gen/` receives the full output of a Quarkus build * `build/quarkus-build/app/` receives the _cacheable_ parts from `build/quarkus-build/gen/` * `build/quarkus-build/dep/` receives the dependency jars The output of `QuarkusBuild` is, by default, cacheable in non-CI environments and _not_ cacheable in CI environments. The behavior can be explicitly overridden using the new property `cacheLargeArtifacts` property. As outlined above, caching huge artifacts is not beneficial in CI, either due to space limitations (GitHub's 10GB limit for example) or just the cost of network traffic/duration. On a developer's machine however, the cost of storing/retrieving even bigger artifacts is rather neglectible and lower than rebuilding a Quarkus application. Before this change, the "priority" of configuration settings was (in most cases...) effectively: System properties, environment variables, `application.properties`, configurations in Gradle build scripts & Gradle project properties. This order is unintuitive and was changed to: system properties, environment variables, Gradle project properties, Quarkus build configuration, application.properties. The previous code had several places in which Quarkus related configuration settings were retrieved with sometimes different priorities of the "config sources". The whole "machinery" to get the configuration has been rewritten and encapsulated in new classes `EffectiveConfig` (effective for a Quarkus build) and `BaseConfig` (available during Gradle task configuration phase). `EffectiveConfig` is not really different from `BaseConfig, but contains the "special" settings from e.g. `ImagePush` task. The new implementation also uses SmallRye Config and especially the existing Quarkus mechanisms to pull information out of a `PackageConfig` _object_ produced from a configuration. It turned out to be easier to reason about `PackageConfig` than "raw property values". Support for `application.yaml/yml` has also been added. All calls into the "inner workings" of a Quarkus build and Quarkus code generation have been moved to separate worker processes. The code in this change to support this is pretty dead simple. The primary reason to move that work into isolated worker processes is that all the `QuarkusBootstrap` and derived pieces know nothing about Gradle, so there is no "property like" mechanism to override settings from `application.properties/yaml/yml` with those from e.g. the `QuarkusPluginExtension`. The only option would have been to modify the system properties of the Gradle build process - but that's a no-go, especially considering other tasks running in parallel (think: two `QuarkusBuild` of different projects running at the same time). It would have been relatively easy to serialize all `QuarkusBuild` actions across a Gradle build, but why - and it would prevent using "beefy machines" to run many `QuarkusBuild`s in parallel. Another option would have been to implement a rather complex (and likely very racy) mechanism to track modifications to system properties. As a conclusion, it wasn't just very simple to leverage the process isolation using the Gradle worker API, but it's also not bad. Gradle does reuse already spawned and compatible worker instances. The "trick" implemented to "prioritize" configs from Gradle project properties and Quarkus extension settings is to pass the whole config as system properties to the worker. A bunch of hopefully useful logging has been implemented and added. With info-level loogging, the tasks emit at least some basic information about what they do and at least the package type being used. To be able to investigate which configuration settings were effectively used, there's another new task called `quarkusShowEffectiveConfig` that shows all the `quarkus.*` properties and some more information, including the loaded `application.(properties|yaml|yml)`. This task is intended to debug build issues. The task can optionally save the effecitve config properties as a properties file in the `build/` directory, if run with the command line option `./gradlew quarkusShowEffectiveConfig --save-config-properties`. None of the existing tests has changed, except the `BuildCOnfigurationTest` had to be adopted to reflect the updated order of config sources. Relates to: quarkusio#30852
snazy
added a commit
to snazy/quarkus
that referenced
this issue
Mar 6, 2023
Make `QuarkusBuild` not pollute Gradle's build cache Improve Quarkus build in Gradle, proper up-to-date mechanisms, right config-source priorities. Currently the `QuarkusBuild` task implementation adds even large build artifacts and unmodified dependency jars to Gradle's build cache. This pollutes the Gradle build cache quite a lot and therefore lets archived build caches become unnecessary huge. On top of that, "relicts" from previous runs of `QuarkusBuild` using a different configuration, like a different package-type, causes multiple permutations of the cached artifacts and leads to restoring unexpected artifacts from the build cache. This large change updates the Gradle logic to have "proper" up-to-date checks and in turn a reasonable setting of what is cached and what is not cached. Background on how I discovered the "caching and up-to-date issue" with `QuarkusBuild`: Quarkus 2.16 added the `@CacheableTask` annotation to `QuarkusBuild`. This means that _everything_ that is generated by `QuarkusBuild` (and "declared" as an output: fast-jar directory, uber-jar, native-runner) is cached. After having Quarkus 2.16.x in our "main" branch for about 1/2 day, nearly the whole "10 GB budget" for cache artifacts in GitHub was occupied by Gradle build cache artifacts - the build cache artifacts grew from build to build - this was not the case without the `@CacheableTask` annotation. The current inputs of the `QuarkusBuild` task are also incomplete. Those inputs do not include for example the configured `finalName`, the settings on the Gradle project and the settings on the Quarkus extension. This leads to situations that the wrong Gradle cache artifacts could be restored (think: you ask for a fast-jar, but get an uber-jar). Another issue is that repeated runs of `quarkusBuild` against the exact same code base and dependencies leads to different outputs, the contents of the generated jars and e.g. quarkus-application.dat differ. This causes different outputs of the `QuarkusBuild` task, which cause depending tasks and projects to unnecessarily run/work again. The initial goal of this effort was to make caching of `fast-jar` builds work nice in CI environments, I assume that's the most common package type used in CI environments for example for testing purposes. The idea here is to _not_ cache all the dependency jars, but leverage the mechanisms available in Gradle to "just collect" the dependencies. In other words (and simplified): for `fast-jar`, cache everthing "in `quarkus-app` except `lib/`. There is no change for users running a `./gradlew quarkusBuild`. `QuarkusBuild` is still the task that users should execute (or depend on in their builds). To achive the goal to make especially `fast-jar` builds "nicely cacheable" in CI, two new tasks had to be introduced: * `QuarkusBuildDependencies` uses the dependency information from the Quarkus `ApplicationModel` to tell Gradle to collect those. This task works for `fast-jar` and `legacy-jar` and is a "no op" for other package types. This task is _never_ cacheable, but has _up-to-date_ checks. * `QuarkusBuildCacheableAppParts` performs a Quarkus application build and collects the _cacheable_ parts of that build. This task works for `fast-jar` and `legacy-jar` and is a "no op" for other package types. This task is cacheable. * `QuarkusBuild`, for `fast-jar` and `legacy-jar`, assembles the outputs of the above two tasks to produce the expected output. Performs a Quarkus build for other package types. See below on when this task is cacheable. The `build/quarkus-build/` directory is used to properly separate the outputs of the above three tasks: * `build/quarkus-build/gen/` receives the full output of a Quarkus build * `build/quarkus-build/app/` receives the _cacheable_ parts from `build/quarkus-build/gen/` * `build/quarkus-build/dep/` receives the dependency jars The output of `QuarkusBuild` is, by default, cacheable in non-CI environments and _not_ cacheable in CI environments. The behavior can be explicitly overridden using the new property `cacheLargeArtifacts` property. As outlined above, caching huge artifacts is not beneficial in CI, either due to space limitations (GitHub's 10GB limit for example) or just the cost of network traffic/duration. On a developer's machine however, the cost of storing/retrieving even bigger artifacts is rather neglectible and lower than rebuilding a Quarkus application. Before this change, the "priority" of configuration settings was (in most cases...) effectively: System properties, environment variables, `application.properties`, configurations in Gradle build scripts & Gradle project properties. This order is unintuitive and was changed to: system properties, environment variables, Gradle project properties, Quarkus build configuration, application.properties. The previous code had several places in which Quarkus related configuration settings were retrieved with sometimes different priorities of the "config sources". The whole "machinery" to get the configuration has been rewritten and encapsulated in new classes `EffectiveConfig` (effective for a Quarkus build) and `BaseConfig` (available during Gradle task configuration phase). `EffectiveConfig` is not really different from `BaseConfig, but contains the "special" settings from e.g. `ImagePush` task. The new implementation also uses SmallRye Config and especially the existing Quarkus mechanisms to pull information out of a `PackageConfig` _object_ produced from a configuration. It turned out to be easier to reason about `PackageConfig` than "raw property values". Support for `application.yaml/yml` has also been added. All calls into the "inner workings" of a Quarkus build and Quarkus code generation have been moved to separate worker processes. The code in this change to support this is pretty dead simple. The primary reason to move that work into isolated worker processes is that all the `QuarkusBootstrap` and derived pieces know nothing about Gradle, so there is no "property like" mechanism to override settings from `application.properties/yaml/yml` with those from e.g. the `QuarkusPluginExtension`. The only option would have been to modify the system properties of the Gradle build process - but that's a no-go, especially considering other tasks running in parallel (think: two `QuarkusBuild` of different projects running at the same time). It would have been relatively easy to serialize all `QuarkusBuild` actions across a Gradle build, but why - and it would prevent using "beefy machines" to run many `QuarkusBuild`s in parallel. Another option would have been to implement a rather complex (and likely very racy) mechanism to track modifications to system properties. As a conclusion, it wasn't just very simple to leverage the process isolation using the Gradle worker API, but it's also not bad. Gradle does reuse already spawned and compatible worker instances. The "trick" implemented to "prioritize" configs from Gradle project properties and Quarkus extension settings is to pass the whole config as system properties to the worker. A bunch of hopefully useful logging has been implemented and added. With info-level loogging, the tasks emit at least some basic information about what they do and at least the package type being used. To be able to investigate which configuration settings were effectively used, there's another new task called `quarkusShowEffectiveConfig` that shows all the `quarkus.*` properties and some more information, including the loaded `application.(properties|yaml|yml)`. This task is intended to debug build issues. The task can optionally save the effecitve config properties as a properties file in the `build/` directory, if run with the command line option `./gradlew quarkusShowEffectiveConfig --save-config-properties`. None of the existing tests has changed, except the `BuildCOnfigurationTest` had to be adopted to reflect the updated order of config sources. Relates to: quarkusio#30852
snazy
added a commit
to snazy/quarkus
that referenced
this issue
Mar 6, 2023
Make `QuarkusBuild` not pollute Gradle's build cache Improve Quarkus build in Gradle, proper up-to-date mechanisms, right config-source priorities. Currently the `QuarkusBuild` task implementation adds even large build artifacts and unmodified dependency jars to Gradle's build cache. This pollutes the Gradle build cache quite a lot and therefore lets archived build caches become unnecessary huge. On top of that, "relicts" from previous runs of `QuarkusBuild` using a different configuration, like a different package-type, causes multiple permutations of the cached artifacts and leads to restoring unexpected artifacts from the build cache. This large change updates the Gradle logic to have "proper" up-to-date checks and in turn a reasonable setting of what is cached and what is not cached. Background on how I discovered the "caching and up-to-date issue" with `QuarkusBuild`: Quarkus 2.16 added the `@CacheableTask` annotation to `QuarkusBuild`. This means that _everything_ that is generated by `QuarkusBuild` (and "declared" as an output: fast-jar directory, uber-jar, native-runner) is cached. After having Quarkus 2.16.x in our "main" branch for about 1/2 day, nearly the whole "10 GB budget" for cache artifacts in GitHub was occupied by Gradle build cache artifacts - the build cache artifacts grew from build to build - this was not the case without the `@CacheableTask` annotation. The current inputs of the `QuarkusBuild` task are also incomplete. Those inputs do not include for example the configured `finalName`, the settings on the Gradle project and the settings on the Quarkus extension. This leads to situations that the wrong Gradle cache artifacts could be restored (think: you ask for a fast-jar, but get an uber-jar). Another issue is that repeated runs of `quarkusBuild` against the exact same code base and dependencies leads to different outputs, the contents of the generated jars and e.g. quarkus-application.dat differ. This causes different outputs of the `QuarkusBuild` task, which cause depending tasks and projects to unnecessarily run/work again. The initial goal of this effort was to make caching of `fast-jar` builds work nice in CI environments, I assume that's the most common package type used in CI environments for example for testing purposes. The idea here is to _not_ cache all the dependency jars, but leverage the mechanisms available in Gradle to "just collect" the dependencies. In other words (and simplified): for `fast-jar`, cache everthing "in `quarkus-app` except `lib/`. There is no change for users running a `./gradlew quarkusBuild`. `QuarkusBuild` is still the task that users should execute (or depend on in their builds). To achive the goal to make especially `fast-jar` builds "nicely cacheable" in CI, two new tasks had to be introduced: * `QuarkusBuildDependencies` uses the dependency information from the Quarkus `ApplicationModel` to tell Gradle to collect those. This task works for `fast-jar` and `legacy-jar` and is a "no op" for other package types. This task is _never_ cacheable, but has _up-to-date_ checks. * `QuarkusBuildCacheableAppParts` performs a Quarkus application build and collects the _cacheable_ parts of that build. This task works for `fast-jar` and `legacy-jar` and is a "no op" for other package types. This task is cacheable. * `QuarkusBuild`, for `fast-jar` and `legacy-jar`, assembles the outputs of the above two tasks to produce the expected output. Performs a Quarkus build for other package types. See below on when this task is cacheable. The `build/quarkus-build/` directory is used to properly separate the outputs of the above three tasks: * `build/quarkus-build/gen/` receives the full output of a Quarkus build * `build/quarkus-build/app/` receives the _cacheable_ parts from `build/quarkus-build/gen/` * `build/quarkus-build/dep/` receives the dependency jars The output of `QuarkusBuild` is, by default, cacheable in non-CI environments and _not_ cacheable in CI environments. The behavior can be explicitly overridden using the new property `cacheLargeArtifacts` property. As outlined above, caching huge artifacts is not beneficial in CI, either due to space limitations (GitHub's 10GB limit for example) or just the cost of network traffic/duration. On a developer's machine however, the cost of storing/retrieving even bigger artifacts is rather neglectible and lower than rebuilding a Quarkus application. Before this change, the "priority" of configuration settings was (in most cases...) effectively: System properties, environment variables, `application.properties`, configurations in Gradle build scripts & Gradle project properties. This order is unintuitive and was changed to: system properties, environment variables, Gradle project properties, Quarkus build configuration, application.properties. The previous code had several places in which Quarkus related configuration settings were retrieved with sometimes different priorities of the "config sources". The whole "machinery" to get the configuration has been rewritten and encapsulated in new classes `EffectiveConfig` (effective for a Quarkus build) and `BaseConfig` (available during Gradle task configuration phase). `EffectiveConfig` is not really different from `BaseConfig, but contains the "special" settings from e.g. `ImagePush` task. The new implementation also uses SmallRye Config and especially the existing Quarkus mechanisms to pull information out of a `PackageConfig` _object_ produced from a configuration. It turned out to be easier to reason about `PackageConfig` than "raw property values". Support for `application.yaml/yml` has also been added. All calls into the "inner workings" of a Quarkus build and Quarkus code generation have been moved to separate worker processes. The code in this change to support this is pretty dead simple. The primary reason to move that work into isolated worker processes is that all the `QuarkusBootstrap` and derived pieces know nothing about Gradle, so there is no "property like" mechanism to override settings from `application.properties/yaml/yml` with those from e.g. the `QuarkusPluginExtension`. The only option would have been to modify the system properties of the Gradle build process - but that's a no-go, especially considering other tasks running in parallel (think: two `QuarkusBuild` of different projects running at the same time). It would have been relatively easy to serialize all `QuarkusBuild` actions across a Gradle build, but why - and it would prevent using "beefy machines" to run many `QuarkusBuild`s in parallel. Another option would have been to implement a rather complex (and likely very racy) mechanism to track modifications to system properties. As a conclusion, it wasn't just very simple to leverage the process isolation using the Gradle worker API, but it's also not bad. Gradle does reuse already spawned and compatible worker instances. The "trick" implemented to "prioritize" configs from Gradle project properties and Quarkus extension settings is to pass the whole config as system properties to the worker. A bunch of hopefully useful logging has been implemented and added. With info-level loogging, the tasks emit at least some basic information about what they do and at least the package type being used. To be able to investigate which configuration settings were effectively used, there's another new task called `quarkusShowEffectiveConfig` that shows all the `quarkus.*` properties and some more information, including the loaded `application.(properties|yaml|yml)`. This task is intended to debug build issues. The task can optionally save the effecitve config properties as a properties file in the `build/` directory, if run with the command line option `./gradlew quarkusShowEffectiveConfig --save-config-properties`. None of the existing tests has changed, except the `BuildCOnfigurationTest` had to be adopted to reflect the updated order of config sources. Relates to: quarkusio#30852
snazy
added a commit
to snazy/quarkus
that referenced
this issue
Mar 7, 2023
Make `QuarkusBuild` not pollute Gradle's build cache Improve Quarkus build in Gradle, proper up-to-date mechanisms, right config-source priorities. Currently the `QuarkusBuild` task implementation adds even large build artifacts and unmodified dependency jars to Gradle's build cache. This pollutes the Gradle build cache quite a lot and therefore lets archived build caches become unnecessary huge. On top of that, "relicts" from previous runs of `QuarkusBuild` using a different configuration, like a different package-type, causes multiple permutations of the cached artifacts and leads to restoring unexpected artifacts from the build cache. This large change updates the Gradle logic to have "proper" up-to-date checks and in turn a reasonable setting of what is cached and what is not cached. Background on how I discovered the "caching and up-to-date issue" with `QuarkusBuild`: Quarkus 2.16 added the `@CacheableTask` annotation to `QuarkusBuild`. This means that _everything_ that is generated by `QuarkusBuild` (and "declared" as an output: fast-jar directory, uber-jar, native-runner) is cached. After having Quarkus 2.16.x in our "main" branch for about 1/2 day, nearly the whole "10 GB budget" for cache artifacts in GitHub was occupied by Gradle build cache artifacts - the build cache artifacts grew from build to build - this was not the case without the `@CacheableTask` annotation. The current inputs of the `QuarkusBuild` task are also incomplete. Those inputs do not include for example the configured `finalName`, the settings on the Gradle project and the settings on the Quarkus extension. This leads to situations that the wrong Gradle cache artifacts could be restored (think: you ask for a fast-jar, but get an uber-jar). Another issue is that repeated runs of `quarkusBuild` against the exact same code base and dependencies leads to different outputs, the contents of the generated jars and e.g. quarkus-application.dat differ. This causes different outputs of the `QuarkusBuild` task, which cause depending tasks and projects to unnecessarily run/work again. The initial goal of this effort was to make caching of `fast-jar` builds work nice in CI environments, I assume that's the most common package type used in CI environments for example for testing purposes. The idea here is to _not_ cache all the dependency jars, but leverage the mechanisms available in Gradle to "just collect" the dependencies. In other words (and simplified): for `fast-jar`, cache everthing "in `quarkus-app` except `lib/`. There is no change for users running a `./gradlew quarkusBuild`. `QuarkusBuild` is still the task that users should execute (or depend on in their builds). To achive the goal to make especially `fast-jar` builds "nicely cacheable" in CI, two new tasks had to be introduced: * `QuarkusBuildDependencies` uses the dependency information from the Quarkus `ApplicationModel` to tell Gradle to collect those. This task works for `fast-jar` and `legacy-jar` and is a "no op" for other package types. This task is _never_ cacheable, but has _up-to-date_ checks. * `QuarkusBuildCacheableAppParts` performs a Quarkus application build and collects the _cacheable_ parts of that build. This task works for `fast-jar` and `legacy-jar` and is a "no op" for other package types. This task is cacheable. * `QuarkusBuild`, for `fast-jar` and `legacy-jar`, assembles the outputs of the above two tasks to produce the expected output. Performs a Quarkus build for other package types. See below on when this task is cacheable. The `build/quarkus-build/` directory is used to properly separate the outputs of the above three tasks: * `build/quarkus-build/gen/` receives the full output of a Quarkus build * `build/quarkus-build/app/` receives the _cacheable_ parts from `build/quarkus-build/gen/` * `build/quarkus-build/dep/` receives the dependency jars The output of `QuarkusBuild` is, by default, cacheable in non-CI environments and _not_ cacheable in CI environments. The behavior can be explicitly overridden using the new property `cacheLargeArtifacts` property. As outlined above, caching huge artifacts is not beneficial in CI, either due to space limitations (GitHub's 10GB limit for example) or just the cost of network traffic/duration. On a developer's machine however, the cost of storing/retrieving even bigger artifacts is rather neglectible and lower than rebuilding a Quarkus application. Before this change, the "priority" of configuration settings was (in most cases...) effectively: System properties, environment variables, `application.properties`, configurations in Gradle build scripts & Gradle project properties. This order is unintuitive and was changed to: system properties, environment variables, Gradle project properties, Quarkus build configuration, application.properties. The previous code had several places in which Quarkus related configuration settings were retrieved with sometimes different priorities of the "config sources". The whole "machinery" to get the configuration has been rewritten and encapsulated in new classes `EffectiveConfig` (effective for a Quarkus build) and `BaseConfig` (available during Gradle task configuration phase). `EffectiveConfig` is not really different from `BaseConfig, but contains the "special" settings from e.g. `ImagePush` task. The new implementation also uses SmallRye Config and especially the existing Quarkus mechanisms to pull information out of a `PackageConfig` _object_ produced from a configuration. It turned out to be easier to reason about `PackageConfig` than "raw property values". Support for `application.yaml/yml` has also been added. All calls into the "inner workings" of a Quarkus build and Quarkus code generation have been moved to separate worker processes. The code in this change to support this is pretty dead simple. The primary reason to move that work into isolated worker processes is that all the `QuarkusBootstrap` and derived pieces know nothing about Gradle, so there is no "property like" mechanism to override settings from `application.properties/yaml/yml` with those from e.g. the `QuarkusPluginExtension`. The only option would have been to modify the system properties of the Gradle build process - but that's a no-go, especially considering other tasks running in parallel (think: two `QuarkusBuild` of different projects running at the same time). It would have been relatively easy to serialize all `QuarkusBuild` actions across a Gradle build, but why - and it would prevent using "beefy machines" to run many `QuarkusBuild`s in parallel. Another option would have been to implement a rather complex (and likely very racy) mechanism to track modifications to system properties. As a conclusion, it wasn't just very simple to leverage the process isolation using the Gradle worker API, but it's also not bad. Gradle does reuse already spawned and compatible worker instances. The "trick" implemented to "prioritize" configs from Gradle project properties and Quarkus extension settings is to pass the whole config as system properties to the worker. A bunch of hopefully useful logging has been implemented and added. With info-level loogging, the tasks emit at least some basic information about what they do and at least the package type being used. To be able to investigate which configuration settings were effectively used, there's another new task called `quarkusShowEffectiveConfig` that shows all the `quarkus.*` properties and some more information, including the loaded `application.(properties|yaml|yml)`. This task is intended to debug build issues. The task can optionally save the effecitve config properties as a properties file in the `build/` directory, if run with the command line option `./gradlew quarkusShowEffectiveConfig --save-config-properties`. None of the existing tests has changed, except the `BuildCOnfigurationTest` had to be adopted to reflect the updated order of config sources. Relates to: quarkusio#30852
snazy
added a commit
to snazy/quarkus
that referenced
this issue
Mar 8, 2023
Make `QuarkusBuild` not pollute Gradle's build cache Improve Quarkus build in Gradle, proper up-to-date mechanisms, right config-source priorities. Currently the `QuarkusBuild` task implementation adds even large build artifacts and unmodified dependency jars to Gradle's build cache. This pollutes the Gradle build cache quite a lot and therefore lets archived build caches become unnecessary huge. On top of that, "relicts" from previous runs of `QuarkusBuild` using a different configuration, like a different package-type, causes multiple permutations of the cached artifacts and leads to restoring unexpected artifacts from the build cache. This large change updates the Gradle logic to have "proper" up-to-date checks and in turn a reasonable setting of what is cached and what is not cached. Background on how I discovered the "caching and up-to-date issue" with `QuarkusBuild`: Quarkus 2.16 added the `@CacheableTask` annotation to `QuarkusBuild`. This means that _everything_ that is generated by `QuarkusBuild` (and "declared" as an output: fast-jar directory, uber-jar, native-runner) is cached. After having Quarkus 2.16.x in our "main" branch for about 1/2 day, nearly the whole "10 GB budget" for cache artifacts in GitHub was occupied by Gradle build cache artifacts - the build cache artifacts grew from build to build - this was not the case without the `@CacheableTask` annotation. The current inputs of the `QuarkusBuild` task are also incomplete. Those inputs do not include for example the configured `finalName`, the settings on the Gradle project and the settings on the Quarkus extension. This leads to situations that the wrong Gradle cache artifacts could be restored (think: you ask for a fast-jar, but get an uber-jar). Another issue is that repeated runs of `quarkusBuild` against the exact same code base and dependencies leads to different outputs, the contents of the generated jars and e.g. quarkus-application.dat differ. This causes different outputs of the `QuarkusBuild` task, which cause depending tasks and projects to unnecessarily run/work again. The initial goal of this effort was to make caching of `fast-jar` builds work nice in CI environments, I assume that's the most common package type used in CI environments for example for testing purposes. The idea here is to _not_ cache all the dependency jars, but leverage the mechanisms available in Gradle to "just collect" the dependencies. In other words (and simplified): for `fast-jar`, cache everthing "in `quarkus-app` except `lib/`. There is no change for users running a `./gradlew quarkusBuild`. `QuarkusBuild` is still the task that users should execute (or depend on in their builds). To achive the goal to make especially `fast-jar` builds "nicely cacheable" in CI, two new tasks had to be introduced: * `QuarkusBuildDependencies` uses the dependency information from the Quarkus `ApplicationModel` to tell Gradle to collect those. This task works for `fast-jar` and `legacy-jar` and is a "no op" for other package types. This task is _never_ cacheable, but has _up-to-date_ checks. * `QuarkusBuildCacheableAppParts` performs a Quarkus application build and collects the _cacheable_ parts of that build. This task works for `fast-jar` and `legacy-jar` and is a "no op" for other package types. This task is cacheable. * `QuarkusBuild`, for `fast-jar` and `legacy-jar`, assembles the outputs of the above two tasks to produce the expected output. Performs a Quarkus build for other package types. See below on when this task is cacheable. The `build/quarkus-build/` directory is used to properly separate the outputs of the above three tasks: * `build/quarkus-build/gen/` receives the full output of a Quarkus build * `build/quarkus-build/app/` receives the _cacheable_ parts from `build/quarkus-build/gen/` * `build/quarkus-build/dep/` receives the dependency jars The output of `QuarkusBuild` is, by default, cacheable in non-CI environments and _not_ cacheable in CI environments. The behavior can be explicitly overridden using the new property `cacheLargeArtifacts` property. As outlined above, caching huge artifacts is not beneficial in CI, either due to space limitations (GitHub's 10GB limit for example) or just the cost of network traffic/duration. On a developer's machine however, the cost of storing/retrieving even bigger artifacts is rather neglectible and lower than rebuilding a Quarkus application. Before this change, the "priority" of configuration settings was (in most cases...) effectively: System properties, environment variables, `application.properties`, configurations in Gradle build scripts & Gradle project properties. This order is unintuitive and was changed to: system properties, environment variables, Gradle project properties, Quarkus build configuration, application.properties. The previous code had several places in which Quarkus related configuration settings were retrieved with sometimes different priorities of the "config sources". The whole "machinery" to get the configuration has been rewritten and encapsulated in new classes `EffectiveConfig` (effective for a Quarkus build) and `BaseConfig` (available during Gradle task configuration phase). `EffectiveConfig` is not really different from `BaseConfig, but contains the "special" settings from e.g. `ImagePush` task. The new implementation also uses SmallRye Config and especially the existing Quarkus mechanisms to pull information out of a `PackageConfig` _object_ produced from a configuration. It turned out to be easier to reason about `PackageConfig` than "raw property values". Support for `application.yaml/yml` has also been added. All calls into the "inner workings" of a Quarkus build and Quarkus code generation have been moved to separate worker processes. The code in this change to support this is pretty dead simple. The primary reason to move that work into isolated worker processes is that all the `QuarkusBootstrap` and derived pieces know nothing about Gradle, so there is no "property like" mechanism to override settings from `application.properties/yaml/yml` with those from e.g. the `QuarkusPluginExtension`. The only option would have been to modify the system properties of the Gradle build process - but that's a no-go, especially considering other tasks running in parallel (think: two `QuarkusBuild` of different projects running at the same time). It would have been relatively easy to serialize all `QuarkusBuild` actions across a Gradle build, but why - and it would prevent using "beefy machines" to run many `QuarkusBuild`s in parallel. Another option would have been to implement a rather complex (and likely very racy) mechanism to track modifications to system properties. As a conclusion, it wasn't just very simple to leverage the process isolation using the Gradle worker API, but it's also not bad. Gradle does reuse already spawned and compatible worker instances. The "trick" implemented to "prioritize" configs from Gradle project properties and Quarkus extension settings is to pass the whole config as system properties to the worker. A bunch of hopefully useful logging has been implemented and added. With info-level loogging, the tasks emit at least some basic information about what they do and at least the package type being used. To be able to investigate which configuration settings were effectively used, there's another new task called `quarkusShowEffectiveConfig` that shows all the `quarkus.*` properties and some more information, including the loaded `application.(properties|yaml|yml)`. This task is intended to debug build issues. The task can optionally save the effecitve config properties as a properties file in the `build/` directory, if run with the command line option `./gradlew quarkusShowEffectiveConfig --save-config-properties`. None of the existing tests has changed, except the `BuildCOnfigurationTest` had to be adopted to reflect the updated order of config sources. Relates to: quarkusio#30852
snazy
added a commit
to snazy/quarkus
that referenced
this issue
Mar 9, 2023
Make `QuarkusBuild` not pollute Gradle's build cache Improve Quarkus build in Gradle, proper up-to-date mechanisms, right config-source priorities. Currently the `QuarkusBuild` task implementation adds even large build artifacts and unmodified dependency jars to Gradle's build cache. This pollutes the Gradle build cache quite a lot and therefore lets archived build caches become unnecessary huge. On top of that, "relicts" from previous runs of `QuarkusBuild` using a different configuration, like a different package-type, causes multiple permutations of the cached artifacts and leads to restoring unexpected artifacts from the build cache. This large change updates the Gradle logic to have "proper" up-to-date checks and in turn a reasonable setting of what is cached and what is not cached. Background on how I discovered the "caching and up-to-date issue" with `QuarkusBuild`: Quarkus 2.16 added the `@CacheableTask` annotation to `QuarkusBuild`. This means that _everything_ that is generated by `QuarkusBuild` (and "declared" as an output: fast-jar directory, uber-jar, native-runner) is cached. After having Quarkus 2.16.x in our "main" branch for about 1/2 day, nearly the whole "10 GB budget" for cache artifacts in GitHub was occupied by Gradle build cache artifacts - the build cache artifacts grew from build to build - this was not the case without the `@CacheableTask` annotation. The current inputs of the `QuarkusBuild` task are also incomplete. Those inputs do not include for example the configured `finalName`, the settings on the Gradle project and the settings on the Quarkus extension. This leads to situations that the wrong Gradle cache artifacts could be restored (think: you ask for a fast-jar, but get an uber-jar). Another issue is that repeated runs of `quarkusBuild` against the exact same code base and dependencies leads to different outputs, the contents of the generated jars and e.g. quarkus-application.dat differ. This causes different outputs of the `QuarkusBuild` task, which cause depending tasks and projects to unnecessarily run/work again. The initial goal of this effort was to make caching of `fast-jar` builds work nice in CI environments, I assume that's the most common package type used in CI environments for example for testing purposes. The idea here is to _not_ cache all the dependency jars, but leverage the mechanisms available in Gradle to "just collect" the dependencies. In other words (and simplified): for `fast-jar`, cache everthing "in `quarkus-app` except `lib/`. There is no change for users running a `./gradlew quarkusBuild`. `QuarkusBuild` is still the task that users should execute (or depend on in their builds). To achive the goal to make especially `fast-jar` builds "nicely cacheable" in CI, two new tasks had to be introduced: * `QuarkusBuildDependencies` uses the dependency information from the Quarkus `ApplicationModel` to tell Gradle to collect those. This task works for `fast-jar` and `legacy-jar` and is a "no op" for other package types. This task is _never_ cacheable, but has _up-to-date_ checks. * `QuarkusBuildCacheableAppParts` performs a Quarkus application build and collects the _cacheable_ parts of that build. This task works for `fast-jar` and `legacy-jar` and is a "no op" for other package types. This task is cacheable. * `QuarkusBuild`, for `fast-jar` and `legacy-jar`, assembles the outputs of the above two tasks to produce the expected output. Performs a Quarkus build for other package types. See below on when this task is cacheable. The `build/quarkus-build/` directory is used to properly separate the outputs of the above three tasks: * `build/quarkus-build/gen/` receives the full output of a Quarkus build * `build/quarkus-build/app/` receives the _cacheable_ parts from `build/quarkus-build/gen/` * `build/quarkus-build/dep/` receives the dependency jars The output of `QuarkusBuild` is, by default, cacheable in non-CI environments and _not_ cacheable in CI environments. The behavior can be explicitly overridden using the new property `cacheLargeArtifacts` property. As outlined above, caching huge artifacts is not beneficial in CI, either due to space limitations (GitHub's 10GB limit for example) or just the cost of network traffic/duration. On a developer's machine however, the cost of storing/retrieving even bigger artifacts is rather neglectible and lower than rebuilding a Quarkus application. Before this change, the "priority" of configuration settings was (in most cases...) effectively: System properties, environment variables, `application.properties`, configurations in Gradle build scripts & Gradle project properties. This order is unintuitive and was changed to: system properties, environment variables, Gradle project properties, Quarkus build configuration, application.properties. The previous code had several places in which Quarkus related configuration settings were retrieved with sometimes different priorities of the "config sources". The whole "machinery" to get the configuration has been rewritten and encapsulated in new classes `EffectiveConfig` (effective for a Quarkus build) and `BaseConfig` (available during Gradle task configuration phase). `EffectiveConfig` is not really different from `BaseConfig, but contains the "special" settings from e.g. `ImagePush` task. The new implementation also uses SmallRye Config and especially the existing Quarkus mechanisms to pull information out of a `PackageConfig` _object_ produced from a configuration. It turned out to be easier to reason about `PackageConfig` than "raw property values". Support for `application.yaml/yml` has also been added. All calls into the "inner workings" of a Quarkus build and Quarkus code generation have been moved to separate worker processes. The code in this change to support this is pretty dead simple. The primary reason to move that work into isolated worker processes is that all the `QuarkusBootstrap` and derived pieces know nothing about Gradle, so there is no "property like" mechanism to override settings from `application.properties/yaml/yml` with those from e.g. the `QuarkusPluginExtension`. The only option would have been to modify the system properties of the Gradle build process - but that's a no-go, especially considering other tasks running in parallel (think: two `QuarkusBuild` of different projects running at the same time). It would have been relatively easy to serialize all `QuarkusBuild` actions across a Gradle build, but why - and it would prevent using "beefy machines" to run many `QuarkusBuild`s in parallel. Another option would have been to implement a rather complex (and likely very racy) mechanism to track modifications to system properties. As a conclusion, it wasn't just very simple to leverage the process isolation using the Gradle worker API, but it's also not bad. Gradle does reuse already spawned and compatible worker instances. The "trick" implemented to "prioritize" configs from Gradle project properties and Quarkus extension settings is to pass the whole config as system properties to the worker. A bunch of hopefully useful logging has been implemented and added. With info-level loogging, the tasks emit at least some basic information about what they do and at least the package type being used. To be able to investigate which configuration settings were effectively used, there's another new task called `quarkusShowEffectiveConfig` that shows all the `quarkus.*` properties and some more information, including the loaded `application.(properties|yaml|yml)`. This task is intended to debug build issues. The task can optionally save the effecitve config properties as a properties file in the `build/` directory, if run with the command line option `./gradlew quarkusShowEffectiveConfig --save-config-properties`. None of the existing tests has changed, except the `BuildCOnfigurationTest` had to be adopted to reflect the updated order of config sources. Relates to: quarkusio#30852
snazy
added a commit
to snazy/quarkus
that referenced
this issue
Mar 13, 2023
Make `QuarkusBuild` not pollute Gradle's build cache Improve Quarkus build in Gradle, proper up-to-date mechanisms, right config-source priorities. Currently the `QuarkusBuild` task implementation adds even large build artifacts and unmodified dependency jars to Gradle's build cache. This pollutes the Gradle build cache quite a lot and therefore lets archived build caches become unnecessary huge. On top of that, "relicts" from previous runs of `QuarkusBuild` using a different configuration, like a different package-type, causes multiple permutations of the cached artifacts and leads to restoring unexpected artifacts from the build cache. This large change updates the Gradle logic to have "proper" up-to-date checks and in turn a reasonable setting of what is cached and what is not cached. Background on how I discovered the "caching and up-to-date issue" with `QuarkusBuild`: Quarkus 2.16 added the `@CacheableTask` annotation to `QuarkusBuild`. This means that _everything_ that is generated by `QuarkusBuild` (and "declared" as an output: fast-jar directory, uber-jar, native-runner) is cached. After having Quarkus 2.16.x in our "main" branch for about 1/2 day, nearly the whole "10 GB budget" for cache artifacts in GitHub was occupied by Gradle build cache artifacts - the build cache artifacts grew from build to build - this was not the case without the `@CacheableTask` annotation. The current inputs of the `QuarkusBuild` task are also incomplete. Those inputs do not include for example the configured `finalName`, the settings on the Gradle project and the settings on the Quarkus extension. This leads to situations that the wrong Gradle cache artifacts could be restored (think: you ask for a fast-jar, but get an uber-jar). Another issue is that repeated runs of `quarkusBuild` against the exact same code base and dependencies leads to different outputs, the contents of the generated jars and e.g. quarkus-application.dat differ. This causes different outputs of the `QuarkusBuild` task, which cause depending tasks and projects to unnecessarily run/work again. The initial goal of this effort was to make caching of `fast-jar` builds work nice in CI environments, I assume that's the most common package type used in CI environments for example for testing purposes. The idea here is to _not_ cache all the dependency jars, but leverage the mechanisms available in Gradle to "just collect" the dependencies. In other words (and simplified): for `fast-jar`, cache everthing "in `quarkus-app` except `lib/`. There is no change for users running a `./gradlew quarkusBuild`. `QuarkusBuild` is still the task that users should execute (or depend on in their builds). To achive the goal to make especially `fast-jar` builds "nicely cacheable" in CI, two new tasks had to be introduced: * `QuarkusBuildDependencies` uses the dependency information from the Quarkus `ApplicationModel` to tell Gradle to collect those. This task works for `fast-jar` and `legacy-jar` and is a "no op" for other package types. This task is _never_ cacheable, but has _up-to-date_ checks. * `QuarkusBuildCacheableAppParts` performs a Quarkus application build and collects the _cacheable_ parts of that build. This task works for `fast-jar` and `legacy-jar` and is a "no op" for other package types. This task is cacheable. * `QuarkusBuild`, for `fast-jar` and `legacy-jar`, assembles the outputs of the above two tasks to produce the expected output. Performs a Quarkus build for other package types. See below on when this task is cacheable. The `build/quarkus-build/` directory is used to properly separate the outputs of the above three tasks: * `build/quarkus-build/gen/` receives the full output of a Quarkus build * `build/quarkus-build/app/` receives the _cacheable_ parts from `build/quarkus-build/gen/` * `build/quarkus-build/dep/` receives the dependency jars The output of `QuarkusBuild` is, by default, cacheable in non-CI environments and _not_ cacheable in CI environments. The behavior can be explicitly overridden using the new property `cacheLargeArtifacts` property. As outlined above, caching huge artifacts is not beneficial in CI, either due to space limitations (GitHub's 10GB limit for example) or just the cost of network traffic/duration. On a developer's machine however, the cost of storing/retrieving even bigger artifacts is rather neglectible and lower than rebuilding a Quarkus application. Before this change, the "priority" of configuration settings was (in most cases...) effectively: System properties, environment variables, `application.properties`, configurations in Gradle build scripts & Gradle project properties. This order is unintuitive and was changed to: system properties, environment variables, Gradle project properties, Quarkus build configuration, application.properties. The previous code had several places in which Quarkus related configuration settings were retrieved with sometimes different priorities of the "config sources". The whole "machinery" to get the configuration has been rewritten and encapsulated in new classes `EffectiveConfig` (effective for a Quarkus build) and `BaseConfig` (available during Gradle task configuration phase). `EffectiveConfig` is not really different from `BaseConfig, but contains the "special" settings from e.g. `ImagePush` task. The new implementation also uses SmallRye Config and especially the existing Quarkus mechanisms to pull information out of a `PackageConfig` _object_ produced from a configuration. It turned out to be easier to reason about `PackageConfig` than "raw property values". Support for `application.yaml/yml` has also been added. All calls into the "inner workings" of a Quarkus build and Quarkus code generation have been moved to separate worker processes. The code in this change to support this is pretty dead simple. The primary reason to move that work into isolated worker processes is that all the `QuarkusBootstrap` and derived pieces know nothing about Gradle, so there is no "property like" mechanism to override settings from `application.properties/yaml/yml` with those from e.g. the `QuarkusPluginExtension`. The only option would have been to modify the system properties of the Gradle build process - but that's a no-go, especially considering other tasks running in parallel (think: two `QuarkusBuild` of different projects running at the same time). It would have been relatively easy to serialize all `QuarkusBuild` actions across a Gradle build, but why - and it would prevent using "beefy machines" to run many `QuarkusBuild`s in parallel. Another option would have been to implement a rather complex (and likely very racy) mechanism to track modifications to system properties. As a conclusion, it wasn't just very simple to leverage the process isolation using the Gradle worker API, but it's also not bad. Gradle does reuse already spawned and compatible worker instances. The "trick" implemented to "prioritize" configs from Gradle project properties and Quarkus extension settings is to pass the whole config as system properties to the worker. A bunch of hopefully useful logging has been implemented and added. With info-level loogging, the tasks emit at least some basic information about what they do and at least the package type being used. To be able to investigate which configuration settings were effectively used, there's another new task called `quarkusShowEffectiveConfig` that shows all the `quarkus.*` properties and some more information, including the loaded `application.(properties|yaml|yml)`. This task is intended to debug build issues. The task can optionally save the effecitve config properties as a properties file in the `build/` directory, if run with the command line option `./gradlew quarkusShowEffectiveConfig --save-config-properties`. None of the existing tests has changed, except the `BuildCOnfigurationTest` had to be adopted to reflect the updated order of config sources. Relates to: quarkusio#30852
snazy
added a commit
to snazy/quarkus
that referenced
this issue
Mar 16, 2023
Make `QuarkusBuild` not pollute Gradle's build cache Improve Quarkus build in Gradle, proper up-to-date mechanisms, right config-source priorities. Currently the `QuarkusBuild` task implementation adds even large build artifacts and unmodified dependency jars to Gradle's build cache. This pollutes the Gradle build cache quite a lot and therefore lets archived build caches become unnecessary huge. On top of that, "relicts" from previous runs of `QuarkusBuild` using a different configuration, like a different package-type, causes multiple permutations of the cached artifacts and leads to restoring unexpected artifacts from the build cache. This large change updates the Gradle logic to have "proper" up-to-date checks and in turn a reasonable setting of what is cached and what is not cached. Background on how I discovered the "caching and up-to-date issue" with `QuarkusBuild`: Quarkus 2.16 added the `@CacheableTask` annotation to `QuarkusBuild`. This means that _everything_ that is generated by `QuarkusBuild` (and "declared" as an output: fast-jar directory, uber-jar, native-runner) is cached. After having Quarkus 2.16.x in our "main" branch for about 1/2 day, nearly the whole "10 GB budget" for cache artifacts in GitHub was occupied by Gradle build cache artifacts - the build cache artifacts grew from build to build - this was not the case without the `@CacheableTask` annotation. The current inputs of the `QuarkusBuild` task are also incomplete. Those inputs do not include for example the configured `finalName`, the settings on the Gradle project and the settings on the Quarkus extension. This leads to situations that the wrong Gradle cache artifacts could be restored (think: you ask for a fast-jar, but get an uber-jar). Another issue is that repeated runs of `quarkusBuild` against the exact same code base and dependencies leads to different outputs, the contents of the generated jars and e.g. quarkus-application.dat differ. This causes different outputs of the `QuarkusBuild` task, which cause depending tasks and projects to unnecessarily run/work again. The initial goal of this effort was to make caching of `fast-jar` builds work nice in CI environments, I assume that's the most common package type used in CI environments for example for testing purposes. The idea here is to _not_ cache all the dependency jars, but leverage the mechanisms available in Gradle to "just collect" the dependencies. In other words (and simplified): for `fast-jar`, cache everthing "in `quarkus-app` except `lib/`. There is no change for users running a `./gradlew quarkusBuild`. `QuarkusBuild` is still the task that users should execute (or depend on in their builds). To achive the goal to make especially `fast-jar` builds "nicely cacheable" in CI, two new tasks had to be introduced: * `QuarkusBuildDependencies` uses the dependency information from the Quarkus `ApplicationModel` to tell Gradle to collect those. This task works for `fast-jar` and `legacy-jar` and is a "no op" for other package types. This task is _never_ cacheable, but has _up-to-date_ checks. * `QuarkusBuildCacheableAppParts` performs a Quarkus application build and collects the _cacheable_ parts of that build. This task works for `fast-jar` and `legacy-jar` and is a "no op" for other package types. This task is cacheable. * `QuarkusBuild`, for `fast-jar` and `legacy-jar`, assembles the outputs of the above two tasks to produce the expected output. Performs a Quarkus build for other package types. See below on when this task is cacheable. The `build/quarkus-build/` directory is used to properly separate the outputs of the above three tasks: * `build/quarkus-build/gen/` receives the full output of a Quarkus build * `build/quarkus-build/app/` receives the _cacheable_ parts from `build/quarkus-build/gen/` * `build/quarkus-build/dep/` receives the dependency jars The output of `QuarkusBuild` is, by default, cacheable in non-CI environments and _not_ cacheable in CI environments. The behavior can be explicitly overridden using the new property `cacheLargeArtifacts` property. As outlined above, caching huge artifacts is not beneficial in CI, either due to space limitations (GitHub's 10GB limit for example) or just the cost of network traffic/duration. On a developer's machine however, the cost of storing/retrieving even bigger artifacts is rather neglectible and lower than rebuilding a Quarkus application. Before this change, the "priority" of configuration settings was (in most cases...) effectively: System properties, environment variables, `application.properties`, configurations in Gradle build scripts & Gradle project properties. This order is unintuitive and was changed to: system properties, environment variables, Gradle project properties, Quarkus build configuration, application.properties. The previous code had several places in which Quarkus related configuration settings were retrieved with sometimes different priorities of the "config sources". The whole "machinery" to get the configuration has been rewritten and encapsulated in new classes `EffectiveConfig` (effective for a Quarkus build) and `BaseConfig` (available during Gradle task configuration phase). `EffectiveConfig` is not really different from `BaseConfig, but contains the "special" settings from e.g. `ImagePush` task. The new implementation also uses SmallRye Config and especially the existing Quarkus mechanisms to pull information out of a `PackageConfig` _object_ produced from a configuration. It turned out to be easier to reason about `PackageConfig` than "raw property values". Support for `application.yaml/yml` has also been added. All calls into the "inner workings" of a Quarkus build and Quarkus code generation have been moved to separate worker processes. The code in this change to support this is pretty dead simple. The primary reason to move that work into isolated worker processes is that all the `QuarkusBootstrap` and derived pieces know nothing about Gradle, so there is no "property like" mechanism to override settings from `application.properties/yaml/yml` with those from e.g. the `QuarkusPluginExtension`. The only option would have been to modify the system properties of the Gradle build process - but that's a no-go, especially considering other tasks running in parallel (think: two `QuarkusBuild` of different projects running at the same time). It would have been relatively easy to serialize all `QuarkusBuild` actions across a Gradle build, but why - and it would prevent using "beefy machines" to run many `QuarkusBuild`s in parallel. Another option would have been to implement a rather complex (and likely very racy) mechanism to track modifications to system properties. As a conclusion, it wasn't just very simple to leverage the process isolation using the Gradle worker API, but it's also not bad. Gradle does reuse already spawned and compatible worker instances. The "trick" implemented to "prioritize" configs from Gradle project properties and Quarkus extension settings is to pass the whole config as system properties to the worker. A bunch of hopefully useful logging has been implemented and added. With info-level loogging, the tasks emit at least some basic information about what they do and at least the package type being used. To be able to investigate which configuration settings were effectively used, there's another new task called `quarkusShowEffectiveConfig` that shows all the `quarkus.*` properties and some more information, including the loaded `application.(properties|yaml|yml)`. This task is intended to debug build issues. The task can optionally save the effecitve config properties as a properties file in the `build/` directory, if run with the command line option `./gradlew quarkusShowEffectiveConfig --save-config-properties`. None of the existing tests has changed, except the `BuildCOnfigurationTest` had to be adopted to reflect the updated order of config sources. Relates to: quarkusio#30852
snazy
added a commit
to snazy/quarkus
that referenced
this issue
Mar 20, 2023
Improve Quarkus build in Gradle, proper up-to-date mechanisms, right config-source priorities. Currently the `QuarkusBuild` task implementation adds even large build artifacts and unmodified dependency jars to Gradle's build cache. This pollutes the Gradle build cache quite a lot and therefore lets archived build caches become unnecessary huge. On top of that, "relicts" from previous runs of `QuarkusBuild` using a different configuration, like a different package-type, causes multiple permutations of the cached artifacts and leads to restoring unexpected artifacts from the build cache. This large change updates the Gradle logic to have "proper" up-to-date checks and in turn a reasonable setting of what is cached and what is not cached. Background on how I discovered the "caching and up-to-date issue" with `QuarkusBuild`: Quarkus 2.16 added the `@CacheableTask` annotation to `QuarkusBuild`. This means that _everything_ that is generated by `QuarkusBuild` (and "declared" as an output: fast-jar directory, uber-jar, native-runner) is cached. After having Quarkus 2.16.x in our "main" branch for about 1/2 day, nearly the whole "10 GB budget" for cache artifacts in GitHub was occupied by Gradle build cache artifacts - the build cache artifacts grew from build to build - this was not the case without the `@CacheableTask` annotation. The current inputs of the `QuarkusBuild` task are also incomplete. Those inputs do not include for example the configured `finalName`, the settings on the Gradle project and the settings on the Quarkus extension. This leads to situations that the wrong Gradle cache artifacts could be restored (think: you ask for a fast-jar, but get an uber-jar). Another issue is that repeated runs of `quarkusBuild` against the exact same code base and dependencies leads to different outputs, the contents of the generated jars and e.g. quarkus-application.dat differ. This causes different outputs of the `QuarkusBuild` task, which cause depending tasks and projects to unnecessarily run/work again. The initial goal of this effort was to make caching of `fast-jar` builds work nice in CI environments, I assume that's the most common package type used in CI environments for example for testing purposes. The idea here is to _not_ cache all the dependency jars, but leverage the mechanisms available in Gradle to "just collect" the dependencies. In other words (and simplified): for `fast-jar`, cache everthing "in `quarkus-app` except `lib/`. There is no change for users running a `./gradlew quarkusBuild`. `QuarkusBuild` is still the task that users should execute (or depend on in their builds). To achive the goal to make especially `fast-jar` builds "nicely cacheable" in CI, two new tasks had to be introduced: * `QuarkusBuildDependencies` uses the dependency information from the Quarkus `ApplicationModel` to tell Gradle to collect those. This task works for `fast-jar` and `legacy-jar` and is a "no op" for other package types. This task is _never_ cacheable, but has _up-to-date_ checks. * `QuarkusBuildCacheableAppParts` performs a Quarkus application build and collects the _cacheable_ parts of that build. This task works for `fast-jar` and `legacy-jar` and is a "no op" for other package types. This task is cacheable. * `QuarkusBuild`, for `fast-jar` and `legacy-jar`, assembles the outputs of the above two tasks to produce the expected output. Performs a Quarkus build for other package types. See below on when this task is cacheable. The `build/quarkus-build/` directory is used to properly separate the outputs of the above three tasks: * `build/quarkus-build/gen/` receives the full output of a Quarkus build * `build/quarkus-build/app/` receives the _cacheable_ parts from `build/quarkus-build/gen/` * `build/quarkus-build/dep/` receives the dependency jars The output of `QuarkusBuild` is, by default, cacheable in non-CI environments and _not_ cacheable in CI environments. The behavior can be explicitly overridden using the new property `cacheLargeArtifacts` property. As outlined above, caching huge artifacts is not beneficial in CI, either due to space limitations (GitHub's 10GB limit for example) or just the cost of network traffic/duration. On a developer's machine however, the cost of storing/retrieving even bigger artifacts is rather neglectible and lower than rebuilding a Quarkus application. Before this change, the "priority" of configuration settings was (in most cases...) effectively: System properties, environment variables, `application.properties`, configurations in Gradle build scripts & Gradle project properties. This order is unintuitive and was changed to: system properties, environment variables, Gradle project properties, Quarkus build configuration, application.properties. The previous code had several places in which Quarkus related configuration settings were retrieved with sometimes different priorities of the "config sources". The whole "machinery" to get the configuration has been rewritten and encapsulated in new classes `EffectiveConfig` (effective for a Quarkus build) and `BaseConfig` (available during Gradle task configuration phase). `EffectiveConfig` is not really different from `BaseConfig, but contains the "special" settings from e.g. `ImagePush` task. The new implementation also uses SmallRye Config and especially the existing Quarkus mechanisms to pull information out of a `PackageConfig` _object_ produced from a configuration. It turned out to be easier to reason about `PackageConfig` than "raw property values". Support for `application.yaml/yml` has also been added. All calls into the "inner workings" of a Quarkus build and Quarkus code generation have been moved to separate worker processes. The code in this change to support this is pretty dead simple. The primary reason to move that work into isolated worker processes is that all the `QuarkusBootstrap` and derived pieces know nothing about Gradle, so there is no "property like" mechanism to override settings from `application.properties/yaml/yml` with those from e.g. the `QuarkusPluginExtension`. The only option would have been to modify the system properties of the Gradle build process - but that's a no-go, especially considering other tasks running in parallel (think: two `QuarkusBuild` of different projects running at the same time). It would have been relatively easy to serialize all `QuarkusBuild` actions across a Gradle build, but why - and it would prevent using "beefy machines" to run many `QuarkusBuild`s in parallel. Another option would have been to implement a rather complex (and likely very racy) mechanism to track modifications to system properties. As a conclusion, it wasn't just very simple to leverage the process isolation using the Gradle worker API, but it's also not bad. Gradle does reuse already spawned and compatible worker instances. The "trick" implemented to "prioritize" configs from Gradle project properties and Quarkus extension settings is to pass the whole config as system properties to the worker. A bunch of hopefully useful logging has been implemented and added. With info-level loogging, the tasks emit at least some basic information about what they do and at least the package type being used. To be able to investigate which configuration settings were effectively used, there's another new task called `quarkusShowEffectiveConfig` that shows all the `quarkus.*` properties and some more information, including the loaded `application.(properties|yaml|yml)`. This task is intended to debug build issues. The task can optionally save the effecitve config properties as a properties file in the `build/` directory, if run with the command line option `./gradlew quarkusShowEffectiveConfig --save-config-properties`. None of the existing tests has changed, except the `BuildCOnfigurationTest` had to be adopted to reflect the updated order of config sources. Relates to: quarkusio#30852
snazy
added a commit
to snazy/quarkus
that referenced
this issue
Mar 24, 2023
Improve Quarkus build in Gradle, proper up-to-date mechanisms, right config-source priorities. Currently the `QuarkusBuild` task implementation adds even large build artifacts and unmodified dependency jars to Gradle's build cache. This pollutes the Gradle build cache quite a lot and therefore lets archived build caches become unnecessary huge. On top of that, "relicts" from previous runs of `QuarkusBuild` using a different configuration, like a different package-type, causes multiple permutations of the cached artifacts and leads to restoring unexpected artifacts from the build cache. This large change updates the Gradle logic to have "proper" up-to-date checks and in turn a reasonable setting of what is cached and what is not cached. Background on how I discovered the "caching and up-to-date issue" with `QuarkusBuild`: Quarkus 2.16 added the `@CacheableTask` annotation to `QuarkusBuild`. This means that _everything_ that is generated by `QuarkusBuild` (and "declared" as an output: fast-jar directory, uber-jar, native-runner) is cached. After having Quarkus 2.16.x in our "main" branch for about 1/2 day, nearly the whole "10 GB budget" for cache artifacts in GitHub was occupied by Gradle build cache artifacts - the build cache artifacts grew from build to build - this was not the case without the `@CacheableTask` annotation. The current inputs of the `QuarkusBuild` task are also incomplete. Those inputs do not include for example the configured `finalName`, the settings on the Gradle project and the settings on the Quarkus extension. This leads to situations that the wrong Gradle cache artifacts could be restored (think: you ask for a fast-jar, but get an uber-jar). Another issue is that repeated runs of `quarkusBuild` against the exact same code base and dependencies leads to different outputs, the contents of the generated jars and e.g. quarkus-application.dat differ. This causes different outputs of the `QuarkusBuild` task, which cause depending tasks and projects to unnecessarily run/work again. The initial goal of this effort was to make caching of `fast-jar` builds work nice in CI environments, I assume that's the most common package type used in CI environments for example for testing purposes. The idea here is to _not_ cache all the dependency jars, but leverage the mechanisms available in Gradle to "just collect" the dependencies. In other words (and simplified): for `fast-jar`, cache everthing "in `quarkus-app` except `lib/`. There is no change for users running a `./gradlew quarkusBuild`. `QuarkusBuild` is still the task that users should execute (or depend on in their builds). To achive the goal to make especially `fast-jar` builds "nicely cacheable" in CI, two new tasks had to be introduced: * `QuarkusBuildDependencies` uses the dependency information from the Quarkus `ApplicationModel` to tell Gradle to collect those. This task works for `fast-jar` and `legacy-jar` and is a "no op" for other package types. This task is _never_ cacheable, but has _up-to-date_ checks. * `QuarkusBuildCacheableAppParts` performs a Quarkus application build and collects the _cacheable_ parts of that build. This task works for `fast-jar` and `legacy-jar` and is a "no op" for other package types. This task is cacheable. * `QuarkusBuild`, for `fast-jar` and `legacy-jar`, assembles the outputs of the above two tasks to produce the expected output. Performs a Quarkus build for other package types. See below on when this task is cacheable. The `build/quarkus-build/` directory is used to properly separate the outputs of the above three tasks: * `build/quarkus-build/gen/` receives the full output of a Quarkus build * `build/quarkus-build/app/` receives the _cacheable_ parts from `build/quarkus-build/gen/` * `build/quarkus-build/dep/` receives the dependency jars The output of `QuarkusBuild` is, by default, cacheable in non-CI environments and _not_ cacheable in CI environments. The behavior can be explicitly overridden using the new property `cacheLargeArtifacts` property. As outlined above, caching huge artifacts is not beneficial in CI, either due to space limitations (GitHub's 10GB limit for example) or just the cost of network traffic/duration. On a developer's machine however, the cost of storing/retrieving even bigger artifacts is rather neglectible and lower than rebuilding a Quarkus application. Before this change, the "priority" of configuration settings was (in most cases...) effectively: System properties, environment variables, `application.properties`, configurations in Gradle build scripts & Gradle project properties. This order is unintuitive and was changed to: system properties, environment variables, Gradle project properties, Quarkus build configuration, application.properties. The previous code had several places in which Quarkus related configuration settings were retrieved with sometimes different priorities of the "config sources". The whole "machinery" to get the configuration has been rewritten and encapsulated in new classes `EffectiveConfig` (effective for a Quarkus build) and `BaseConfig` (available during Gradle task configuration phase). `EffectiveConfig` is not really different from `BaseConfig, but contains the "special" settings from e.g. `ImagePush` task. The new implementation also uses SmallRye Config and especially the existing Quarkus mechanisms to pull information out of a `PackageConfig` _object_ produced from a configuration. It turned out to be easier to reason about `PackageConfig` than "raw property values". Support for `application.yaml/yml` has also been added. All calls into the "inner workings" of a Quarkus build and Quarkus code generation have been moved to separate worker processes. The code in this change to support this is pretty dead simple. The primary reason to move that work into isolated worker processes is that all the `QuarkusBootstrap` and derived pieces know nothing about Gradle, so there is no "property like" mechanism to override settings from `application.properties/yaml/yml` with those from e.g. the `QuarkusPluginExtension`. The only option would have been to modify the system properties of the Gradle build process - but that's a no-go, especially considering other tasks running in parallel (think: two `QuarkusBuild` of different projects running at the same time). It would have been relatively easy to serialize all `QuarkusBuild` actions across a Gradle build, but why - and it would prevent using "beefy machines" to run many `QuarkusBuild`s in parallel. Another option would have been to implement a rather complex (and likely very racy) mechanism to track modifications to system properties. As a conclusion, it wasn't just very simple to leverage the process isolation using the Gradle worker API, but it's also not bad. Gradle does reuse already spawned and compatible worker instances. The "trick" implemented to "prioritize" configs from Gradle project properties and Quarkus extension settings is to pass the whole config as system properties to the worker. A bunch of hopefully useful logging has been implemented and added. With info-level loogging, the tasks emit at least some basic information about what they do and at least the package type being used. To be able to investigate which configuration settings were effectively used, there's another new task called `quarkusShowEffectiveConfig` that shows all the `quarkus.*` properties and some more information, including the loaded `application.(properties|yaml|yml)`. This task is intended to debug build issues. The task can optionally save the effecitve config properties as a properties file in the `build/` directory, if run with the command line option `./gradlew quarkusShowEffectiveConfig --save-config-properties`. None of the existing tests has changed, except the `BuildCOnfigurationTest` had to be adopted to reflect the updated order of config sources. Relates to: quarkusio#30852
snazy
added a commit
to snazy/quarkus
that referenced
this issue
Mar 27, 2023
Improve Quarkus build in Gradle, proper up-to-date mechanisms, right config-source priorities. Currently the `QuarkusBuild` task implementation adds even large build artifacts and unmodified dependency jars to Gradle's build cache. This pollutes the Gradle build cache quite a lot and therefore lets archived build caches become unnecessary huge. On top of that, "relicts" from previous runs of `QuarkusBuild` using a different configuration, like a different package-type, causes multiple permutations of the cached artifacts and leads to restoring unexpected artifacts from the build cache. This large change updates the Gradle logic to have "proper" up-to-date checks and in turn a reasonable setting of what is cached and what is not cached. Background on how I discovered the "caching and up-to-date issue" with `QuarkusBuild`: Quarkus 2.16 added the `@CacheableTask` annotation to `QuarkusBuild`. This means that _everything_ that is generated by `QuarkusBuild` (and "declared" as an output: fast-jar directory, uber-jar, native-runner) is cached. After having Quarkus 2.16.x in our "main" branch for about 1/2 day, nearly the whole "10 GB budget" for cache artifacts in GitHub was occupied by Gradle build cache artifacts - the build cache artifacts grew from build to build - this was not the case without the `@CacheableTask` annotation. The current inputs of the `QuarkusBuild` task are also incomplete. Those inputs do not include for example the configured `finalName`, the settings on the Gradle project and the settings on the Quarkus extension. This leads to situations that the wrong Gradle cache artifacts could be restored (think: you ask for a fast-jar, but get an uber-jar). Another issue is that repeated runs of `quarkusBuild` against the exact same code base and dependencies leads to different outputs, the contents of the generated jars and e.g. quarkus-application.dat differ. This causes different outputs of the `QuarkusBuild` task, which cause depending tasks and projects to unnecessarily run/work again. The initial goal of this effort was to make caching of `fast-jar` builds work nice in CI environments, I assume that's the most common package type used in CI environments for example for testing purposes. The idea here is to _not_ cache all the dependency jars, but leverage the mechanisms available in Gradle to "just collect" the dependencies. In other words (and simplified): for `fast-jar`, cache everthing "in `quarkus-app` except `lib/`. There is no change for users running a `./gradlew quarkusBuild`. `QuarkusBuild` is still the task that users should execute (or depend on in their builds). To achive the goal to make especially `fast-jar` builds "nicely cacheable" in CI, two new tasks had to be introduced: * `QuarkusBuildDependencies` uses the dependency information from the Quarkus `ApplicationModel` to tell Gradle to collect those. This task works for `fast-jar` and `legacy-jar` and is a "no op" for other package types. This task is _never_ cacheable, but has _up-to-date_ checks. * `QuarkusBuildCacheableAppParts` performs a Quarkus application build and collects the _cacheable_ parts of that build. This task works for `fast-jar` and `legacy-jar` and is a "no op" for other package types. This task is cacheable. * `QuarkusBuild`, for `fast-jar` and `legacy-jar`, assembles the outputs of the above two tasks to produce the expected output. Performs a Quarkus build for other package types. See below on when this task is cacheable. The `build/quarkus-build/` directory is used to properly separate the outputs of the above three tasks: * `build/quarkus-build/gen/` receives the full output of a Quarkus build * `build/quarkus-build/app/` receives the _cacheable_ parts from `build/quarkus-build/gen/` * `build/quarkus-build/dep/` receives the dependency jars The output of `QuarkusBuild` is, by default, cacheable in non-CI environments and _not_ cacheable in CI environments. The behavior can be explicitly overridden using the new property `cacheLargeArtifacts` property. As outlined above, caching huge artifacts is not beneficial in CI, either due to space limitations (GitHub's 10GB limit for example) or just the cost of network traffic/duration. On a developer's machine however, the cost of storing/retrieving even bigger artifacts is rather neglectible and lower than rebuilding a Quarkus application. Before this change, the "priority" of configuration settings was (in most cases...) effectively: System properties, environment variables, `application.properties`, configurations in Gradle build scripts & Gradle project properties. This order is unintuitive and was changed to: system properties, environment variables, Gradle project properties, Quarkus build configuration, application.properties. The previous code had several places in which Quarkus related configuration settings were retrieved with sometimes different priorities of the "config sources". The whole "machinery" to get the configuration has been rewritten and encapsulated in new classes `EffectiveConfig` (effective for a Quarkus build) and `BaseConfig` (available during Gradle task configuration phase). `EffectiveConfig` is not really different from `BaseConfig, but contains the "special" settings from e.g. `ImagePush` task. The new implementation also uses SmallRye Config and especially the existing Quarkus mechanisms to pull information out of a `PackageConfig` _object_ produced from a configuration. It turned out to be easier to reason about `PackageConfig` than "raw property values". Support for `application.yaml/yml` has also been added. All calls into the "inner workings" of a Quarkus build and Quarkus code generation have been moved to separate worker processes. The code in this change to support this is pretty dead simple. The primary reason to move that work into isolated worker processes is that all the `QuarkusBootstrap` and derived pieces know nothing about Gradle, so there is no "property like" mechanism to override settings from `application.properties/yaml/yml` with those from e.g. the `QuarkusPluginExtension`. The only option would have been to modify the system properties of the Gradle build process - but that's a no-go, especially considering other tasks running in parallel (think: two `QuarkusBuild` of different projects running at the same time). It would have been relatively easy to serialize all `QuarkusBuild` actions across a Gradle build, but why - and it would prevent using "beefy machines" to run many `QuarkusBuild`s in parallel. Another option would have been to implement a rather complex (and likely very racy) mechanism to track modifications to system properties. As a conclusion, it wasn't just very simple to leverage the process isolation using the Gradle worker API, but it's also not bad. Gradle does reuse already spawned and compatible worker instances. The "trick" implemented to "prioritize" configs from Gradle project properties and Quarkus extension settings is to pass the whole config as system properties to the worker. A bunch of hopefully useful logging has been implemented and added. With info-level loogging, the tasks emit at least some basic information about what they do and at least the package type being used. To be able to investigate which configuration settings were effectively used, there's another new task called `quarkusShowEffectiveConfig` that shows all the `quarkus.*` properties and some more information, including the loaded `application.(properties|yaml|yml)`. This task is intended to debug build issues. The task can optionally save the effecitve config properties as a properties file in the `build/` directory, if run with the command line option `./gradlew quarkusShowEffectiveConfig --save-config-properties`. None of the existing tests has changed, except the `BuildCOnfigurationTest` had to be adopted to reflect the updated order of config sources. Relates to: quarkusio#30852
snazy
added a commit
to snazy/quarkus
that referenced
this issue
Mar 27, 2023
Improve Quarkus build in Gradle, proper up-to-date mechanisms, right config-source priorities. Currently the `QuarkusBuild` task implementation adds even large build artifacts and unmodified dependency jars to Gradle's build cache. This pollutes the Gradle build cache quite a lot and therefore lets archived build caches become unnecessary huge. On top of that, "relicts" from previous runs of `QuarkusBuild` using a different configuration, like a different package-type, causes multiple permutations of the cached artifacts and leads to restoring unexpected artifacts from the build cache. This large change updates the Gradle logic to have "proper" up-to-date checks and in turn a reasonable setting of what is cached and what is not cached. Background on how I discovered the "caching and up-to-date issue" with `QuarkusBuild`: Quarkus 2.16 added the `@CacheableTask` annotation to `QuarkusBuild`. This means that _everything_ that is generated by `QuarkusBuild` (and "declared" as an output: fast-jar directory, uber-jar, native-runner) is cached. After having Quarkus 2.16.x in our "main" branch for about 1/2 day, nearly the whole "10 GB budget" for cache artifacts in GitHub was occupied by Gradle build cache artifacts - the build cache artifacts grew from build to build - this was not the case without the `@CacheableTask` annotation. The current inputs of the `QuarkusBuild` task are also incomplete. Those inputs do not include for example the configured `finalName`, the settings on the Gradle project and the settings on the Quarkus extension. This leads to situations that the wrong Gradle cache artifacts could be restored (think: you ask for a fast-jar, but get an uber-jar). Another issue is that repeated runs of `quarkusBuild` against the exact same code base and dependencies leads to different outputs, the contents of the generated jars and e.g. quarkus-application.dat differ. This causes different outputs of the `QuarkusBuild` task, which cause depending tasks and projects to unnecessarily run/work again. The initial goal of this effort was to make caching of `fast-jar` builds work nice in CI environments, I assume that's the most common package type used in CI environments for example for testing purposes. The idea here is to _not_ cache all the dependency jars, but leverage the mechanisms available in Gradle to "just collect" the dependencies. In other words (and simplified): for `fast-jar`, cache everthing "in `quarkus-app` except `lib/`. There is no change for users running a `./gradlew quarkusBuild`. `QuarkusBuild` is still the task that users should execute (or depend on in their builds). To achive the goal to make especially `fast-jar` builds "nicely cacheable" in CI, two new tasks had to be introduced: * `QuarkusBuildDependencies` uses the dependency information from the Quarkus `ApplicationModel` to tell Gradle to collect those. This task works for `fast-jar` and `legacy-jar` and is a "no op" for other package types. This task is _never_ cacheable, but has _up-to-date_ checks. * `QuarkusBuildCacheableAppParts` performs a Quarkus application build and collects the _cacheable_ parts of that build. This task works for `fast-jar` and `legacy-jar` and is a "no op" for other package types. This task is cacheable. * `QuarkusBuild`, for `fast-jar` and `legacy-jar`, assembles the outputs of the above two tasks to produce the expected output. Performs a Quarkus build for other package types. See below on when this task is cacheable. The `build/quarkus-build/` directory is used to properly separate the outputs of the above three tasks: * `build/quarkus-build/gen/` receives the full output of a Quarkus build * `build/quarkus-build/app/` receives the _cacheable_ parts from `build/quarkus-build/gen/` * `build/quarkus-build/dep/` receives the dependency jars The output of `QuarkusBuild` is, by default, cacheable in non-CI environments and _not_ cacheable in CI environments. The behavior can be explicitly overridden using the new property `cacheLargeArtifacts` property. As outlined above, caching huge artifacts is not beneficial in CI, either due to space limitations (GitHub's 10GB limit for example) or just the cost of network traffic/duration. On a developer's machine however, the cost of storing/retrieving even bigger artifacts is rather neglectible and lower than rebuilding a Quarkus application. Before this change, the "priority" of configuration settings was (in most cases...) effectively: System properties, environment variables, `application.properties`, configurations in Gradle build scripts & Gradle project properties. This order is unintuitive and was changed to: system properties, environment variables, Gradle project properties, Quarkus build configuration, application.properties. The previous code had several places in which Quarkus related configuration settings were retrieved with sometimes different priorities of the "config sources". The whole "machinery" to get the configuration has been rewritten and encapsulated in new classes `EffectiveConfig` (effective for a Quarkus build) and `BaseConfig` (available during Gradle task configuration phase). `EffectiveConfig` is not really different from `BaseConfig, but contains the "special" settings from e.g. `ImagePush` task. The new implementation also uses SmallRye Config and especially the existing Quarkus mechanisms to pull information out of a `PackageConfig` _object_ produced from a configuration. It turned out to be easier to reason about `PackageConfig` than "raw property values". Support for `application.yaml/yml` has also been added. All calls into the "inner workings" of a Quarkus build and Quarkus code generation have been moved to separate worker processes. The code in this change to support this is pretty dead simple. The primary reason to move that work into isolated worker processes is that all the `QuarkusBootstrap` and derived pieces know nothing about Gradle, so there is no "property like" mechanism to override settings from `application.properties/yaml/yml` with those from e.g. the `QuarkusPluginExtension`. The only option would have been to modify the system properties of the Gradle build process - but that's a no-go, especially considering other tasks running in parallel (think: two `QuarkusBuild` of different projects running at the same time). It would have been relatively easy to serialize all `QuarkusBuild` actions across a Gradle build, but why - and it would prevent using "beefy machines" to run many `QuarkusBuild`s in parallel. Another option would have been to implement a rather complex (and likely very racy) mechanism to track modifications to system properties. As a conclusion, it wasn't just very simple to leverage the process isolation using the Gradle worker API, but it's also not bad. Gradle does reuse already spawned and compatible worker instances. The "trick" implemented to "prioritize" configs from Gradle project properties and Quarkus extension settings is to pass the whole config as system properties to the worker. A bunch of hopefully useful logging has been implemented and added. With info-level loogging, the tasks emit at least some basic information about what they do and at least the package type being used. To be able to investigate which configuration settings were effectively used, there's another new task called `quarkusShowEffectiveConfig` that shows all the `quarkus.*` properties and some more information, including the loaded `application.(properties|yaml|yml)`. This task is intended to debug build issues. The task can optionally save the effecitve config properties as a properties file in the `build/` directory, if run with the command line option `./gradlew quarkusShowEffectiveConfig --save-config-properties`. None of the existing tests has changed, except the `BuildCOnfigurationTest` had to be adopted to reflect the updated order of config sources. Relates to: quarkusio#30852
snazy
added a commit
to snazy/quarkus
that referenced
this issue
Mar 28, 2023
Improve Quarkus build in Gradle, proper up-to-date mechanisms, right config-source priorities. Currently the `QuarkusBuild` task implementation adds even large build artifacts and unmodified dependency jars to Gradle's build cache. This pollutes the Gradle build cache quite a lot and therefore lets archived build caches become unnecessary huge. On top of that, "relicts" from previous runs of `QuarkusBuild` using a different configuration, like a different package-type, causes multiple permutations of the cached artifacts and leads to restoring unexpected artifacts from the build cache. This large change updates the Gradle logic to have "proper" up-to-date checks and in turn a reasonable setting of what is cached and what is not cached. Background on how I discovered the "caching and up-to-date issue" with `QuarkusBuild`: Quarkus 2.16 added the `@CacheableTask` annotation to `QuarkusBuild`. This means that _everything_ that is generated by `QuarkusBuild` (and "declared" as an output: fast-jar directory, uber-jar, native-runner) is cached. After having Quarkus 2.16.x in our "main" branch for about 1/2 day, nearly the whole "10 GB budget" for cache artifacts in GitHub was occupied by Gradle build cache artifacts - the build cache artifacts grew from build to build - this was not the case without the `@CacheableTask` annotation. The current inputs of the `QuarkusBuild` task are also incomplete. Those inputs do not include for example the configured `finalName`, the settings on the Gradle project and the settings on the Quarkus extension. This leads to situations that the wrong Gradle cache artifacts could be restored (think: you ask for a fast-jar, but get an uber-jar). Another issue is that repeated runs of `quarkusBuild` against the exact same code base and dependencies leads to different outputs, the contents of the generated jars and e.g. quarkus-application.dat differ. This causes different outputs of the `QuarkusBuild` task, which cause depending tasks and projects to unnecessarily run/work again. The initial goal of this effort was to make caching of `fast-jar` builds work nice in CI environments, I assume that's the most common package type used in CI environments for example for testing purposes. The idea here is to _not_ cache all the dependency jars, but leverage the mechanisms available in Gradle to "just collect" the dependencies. In other words (and simplified): for `fast-jar`, cache everthing "in `quarkus-app` except `lib/`. There is no change for users running a `./gradlew quarkusBuild`. `QuarkusBuild` is still the task that users should execute (or depend on in their builds). To achive the goal to make especially `fast-jar` builds "nicely cacheable" in CI, two new tasks had to be introduced: * `QuarkusBuildDependencies` uses the dependency information from the Quarkus `ApplicationModel` to tell Gradle to collect those. This task works for `fast-jar` and `legacy-jar` and is a "no op" for other package types. This task is _never_ cacheable, but has _up-to-date_ checks. * `QuarkusBuildCacheableAppParts` performs a Quarkus application build and collects the _cacheable_ parts of that build. This task works for `fast-jar` and `legacy-jar` and is a "no op" for other package types. This task is cacheable. * `QuarkusBuild`, for `fast-jar` and `legacy-jar`, assembles the outputs of the above two tasks to produce the expected output. Performs a Quarkus build for other package types. See below on when this task is cacheable. The `build/quarkus-build/` directory is used to properly separate the outputs of the above three tasks: * `build/quarkus-build/gen/` receives the full output of a Quarkus build * `build/quarkus-build/app/` receives the _cacheable_ parts from `build/quarkus-build/gen/` * `build/quarkus-build/dep/` receives the dependency jars The output of `QuarkusBuild` is, by default, cacheable in non-CI environments and _not_ cacheable in CI environments. The behavior can be explicitly overridden using the new property `cacheLargeArtifacts` property. As outlined above, caching huge artifacts is not beneficial in CI, either due to space limitations (GitHub's 10GB limit for example) or just the cost of network traffic/duration. On a developer's machine however, the cost of storing/retrieving even bigger artifacts is rather neglectible and lower than rebuilding a Quarkus application. Before this change, the "priority" of configuration settings was (in most cases...) effectively: System properties, environment variables, `application.properties`, configurations in Gradle build scripts & Gradle project properties. This order is unintuitive and was changed to: system properties, environment variables, Gradle project properties, Quarkus build configuration, application.properties. The previous code had several places in which Quarkus related configuration settings were retrieved with sometimes different priorities of the "config sources". The whole "machinery" to get the configuration has been rewritten and encapsulated in new classes `EffectiveConfig` (effective for a Quarkus build) and `BaseConfig` (available during Gradle task configuration phase). `EffectiveConfig` is not really different from `BaseConfig, but contains the "special" settings from e.g. `ImagePush` task. The new implementation also uses SmallRye Config and especially the existing Quarkus mechanisms to pull information out of a `PackageConfig` _object_ produced from a configuration. It turned out to be easier to reason about `PackageConfig` than "raw property values". Support for `application.yaml/yml` has also been added. All calls into the "inner workings" of a Quarkus build and Quarkus code generation have been moved to separate worker processes. The code in this change to support this is pretty dead simple. The primary reason to move that work into isolated worker processes is that all the `QuarkusBootstrap` and derived pieces know nothing about Gradle, so there is no "property like" mechanism to override settings from `application.properties/yaml/yml` with those from e.g. the `QuarkusPluginExtension`. The only option would have been to modify the system properties of the Gradle build process - but that's a no-go, especially considering other tasks running in parallel (think: two `QuarkusBuild` of different projects running at the same time). It would have been relatively easy to serialize all `QuarkusBuild` actions across a Gradle build, but why - and it would prevent using "beefy machines" to run many `QuarkusBuild`s in parallel. Another option would have been to implement a rather complex (and likely very racy) mechanism to track modifications to system properties. As a conclusion, it wasn't just very simple to leverage the process isolation using the Gradle worker API, but it's also not bad. Gradle does reuse already spawned and compatible worker instances. The "trick" implemented to "prioritize" configs from Gradle project properties and Quarkus extension settings is to pass the whole config as system properties to the worker. A bunch of hopefully useful logging has been implemented and added. With info-level loogging, the tasks emit at least some basic information about what they do and at least the package type being used. To be able to investigate which configuration settings were effectively used, there's another new task called `quarkusShowEffectiveConfig` that shows all the `quarkus.*` properties and some more information, including the loaded `application.(properties|yaml|yml)`. This task is intended to debug build issues. The task can optionally save the effecitve config properties as a properties file in the `build/` directory, if run with the command line option `./gradlew quarkusShowEffectiveConfig --save-config-properties`. None of the existing tests has changed, except the `BuildCOnfigurationTest` had to be adopted to reflect the updated order of config sources. Relates to: quarkusio#30852
snazy
added a commit
to snazy/quarkus
that referenced
this issue
Mar 28, 2023
Improve Quarkus build in Gradle, proper up-to-date mechanisms, right config-source priorities. Currently the `QuarkusBuild` task implementation adds even large build artifacts and unmodified dependency jars to Gradle's build cache. This pollutes the Gradle build cache quite a lot and therefore lets archived build caches become unnecessary huge. On top of that, "relicts" from previous runs of `QuarkusBuild` using a different configuration, like a different package-type, causes multiple permutations of the cached artifacts and leads to restoring unexpected artifacts from the build cache. This large change updates the Gradle logic to have "proper" up-to-date checks and in turn a reasonable setting of what is cached and what is not cached. Background on how I discovered the "caching and up-to-date issue" with `QuarkusBuild`: Quarkus 2.16 added the `@CacheableTask` annotation to `QuarkusBuild`. This means that _everything_ that is generated by `QuarkusBuild` (and "declared" as an output: fast-jar directory, uber-jar, native-runner) is cached. After having Quarkus 2.16.x in our "main" branch for about 1/2 day, nearly the whole "10 GB budget" for cache artifacts in GitHub was occupied by Gradle build cache artifacts - the build cache artifacts grew from build to build - this was not the case without the `@CacheableTask` annotation. The current inputs of the `QuarkusBuild` task are also incomplete. Those inputs do not include for example the configured `finalName`, the settings on the Gradle project and the settings on the Quarkus extension. This leads to situations that the wrong Gradle cache artifacts could be restored (think: you ask for a fast-jar, but get an uber-jar). Another issue is that repeated runs of `quarkusBuild` against the exact same code base and dependencies leads to different outputs, the contents of the generated jars and e.g. quarkus-application.dat differ. This causes different outputs of the `QuarkusBuild` task, which cause depending tasks and projects to unnecessarily run/work again. The initial goal of this effort was to make caching of `fast-jar` builds work nice in CI environments, I assume that's the most common package type used in CI environments for example for testing purposes. The idea here is to _not_ cache all the dependency jars, but leverage the mechanisms available in Gradle to "just collect" the dependencies. In other words (and simplified): for `fast-jar`, cache everthing "in `quarkus-app` except `lib/`. There is no change for users running a `./gradlew quarkusBuild`. `QuarkusBuild` is still the task that users should execute (or depend on in their builds). To achive the goal to make especially `fast-jar` builds "nicely cacheable" in CI, two new tasks had to be introduced: * `QuarkusBuildDependencies` uses the dependency information from the Quarkus `ApplicationModel` to tell Gradle to collect those. This task works for `fast-jar` and `legacy-jar` and is a "no op" for other package types. This task is _never_ cacheable, but has _up-to-date_ checks. * `QuarkusBuildCacheableAppParts` performs a Quarkus application build and collects the _cacheable_ parts of that build. This task works for `fast-jar` and `legacy-jar` and is a "no op" for other package types. This task is cacheable. * `QuarkusBuild`, for `fast-jar` and `legacy-jar`, assembles the outputs of the above two tasks to produce the expected output. Performs a Quarkus build for other package types. See below on when this task is cacheable. The `build/quarkus-build/` directory is used to properly separate the outputs of the above three tasks: * `build/quarkus-build/gen/` receives the full output of a Quarkus build * `build/quarkus-build/app/` receives the _cacheable_ parts from `build/quarkus-build/gen/` * `build/quarkus-build/dep/` receives the dependency jars The output of `QuarkusBuild` is, by default, cacheable in non-CI environments and _not_ cacheable in CI environments. The behavior can be explicitly overridden using the new property `cacheLargeArtifacts` property. As outlined above, caching huge artifacts is not beneficial in CI, either due to space limitations (GitHub's 10GB limit for example) or just the cost of network traffic/duration. On a developer's machine however, the cost of storing/retrieving even bigger artifacts is rather neglectible and lower than rebuilding a Quarkus application. Before this change, the "priority" of configuration settings was (in most cases...) effectively: System properties, environment variables, `application.properties`, configurations in Gradle build scripts & Gradle project properties. This order is unintuitive and was changed to: system properties, environment variables, Gradle project properties, Quarkus build configuration, application.properties. The previous code had several places in which Quarkus related configuration settings were retrieved with sometimes different priorities of the "config sources". The whole "machinery" to get the configuration has been rewritten and encapsulated in new classes `EffectiveConfig` (effective for a Quarkus build) and `BaseConfig` (available during Gradle task configuration phase). `EffectiveConfig` is not really different from `BaseConfig, but contains the "special" settings from e.g. `ImagePush` task. The new implementation also uses SmallRye Config and especially the existing Quarkus mechanisms to pull information out of a `PackageConfig` _object_ produced from a configuration. It turned out to be easier to reason about `PackageConfig` than "raw property values". Support for `application.yaml/yml` has also been added. All calls into the "inner workings" of a Quarkus build and Quarkus code generation have been moved to separate worker processes. The code in this change to support this is pretty dead simple. The primary reason to move that work into isolated worker processes is that all the `QuarkusBootstrap` and derived pieces know nothing about Gradle, so there is no "property like" mechanism to override settings from `application.properties/yaml/yml` with those from e.g. the `QuarkusPluginExtension`. The only option would have been to modify the system properties of the Gradle build process - but that's a no-go, especially considering other tasks running in parallel (think: two `QuarkusBuild` of different projects running at the same time). It would have been relatively easy to serialize all `QuarkusBuild` actions across a Gradle build, but why - and it would prevent using "beefy machines" to run many `QuarkusBuild`s in parallel. Another option would have been to implement a rather complex (and likely very racy) mechanism to track modifications to system properties. As a conclusion, it wasn't just very simple to leverage the process isolation using the Gradle worker API, but it's also not bad. Gradle does reuse already spawned and compatible worker instances. The "trick" implemented to "prioritize" configs from Gradle project properties and Quarkus extension settings is to pass the whole config as system properties to the worker. A bunch of hopefully useful logging has been implemented and added. With info-level loogging, the tasks emit at least some basic information about what they do and at least the package type being used. To be able to investigate which configuration settings were effectively used, there's another new task called `quarkusShowEffectiveConfig` that shows all the `quarkus.*` properties and some more information, including the loaded `application.(properties|yaml|yml)`. This task is intended to debug build issues. The task can optionally save the effecitve config properties as a properties file in the `build/` directory, if run with the command line option `./gradlew quarkusShowEffectiveConfig --save-config-properties`. None of the existing tests has changed, except the `BuildCOnfigurationTest` had to be adopted to reflect the updated order of config sources. Relates to: quarkusio#30852
holly-cummins
pushed a commit
to holly-cummins/quarkus
that referenced
this issue
Apr 20, 2023
Improve Quarkus build in Gradle, proper up-to-date mechanisms, right config-source priorities. Currently the `QuarkusBuild` task implementation adds even large build artifacts and unmodified dependency jars to Gradle's build cache. This pollutes the Gradle build cache quite a lot and therefore lets archived build caches become unnecessary huge. On top of that, "relicts" from previous runs of `QuarkusBuild` using a different configuration, like a different package-type, causes multiple permutations of the cached artifacts and leads to restoring unexpected artifacts from the build cache. This large change updates the Gradle logic to have "proper" up-to-date checks and in turn a reasonable setting of what is cached and what is not cached. Background on how I discovered the "caching and up-to-date issue" with `QuarkusBuild`: Quarkus 2.16 added the `@CacheableTask` annotation to `QuarkusBuild`. This means that _everything_ that is generated by `QuarkusBuild` (and "declared" as an output: fast-jar directory, uber-jar, native-runner) is cached. After having Quarkus 2.16.x in our "main" branch for about 1/2 day, nearly the whole "10 GB budget" for cache artifacts in GitHub was occupied by Gradle build cache artifacts - the build cache artifacts grew from build to build - this was not the case without the `@CacheableTask` annotation. The current inputs of the `QuarkusBuild` task are also incomplete. Those inputs do not include for example the configured `finalName`, the settings on the Gradle project and the settings on the Quarkus extension. This leads to situations that the wrong Gradle cache artifacts could be restored (think: you ask for a fast-jar, but get an uber-jar). Another issue is that repeated runs of `quarkusBuild` against the exact same code base and dependencies leads to different outputs, the contents of the generated jars and e.g. quarkus-application.dat differ. This causes different outputs of the `QuarkusBuild` task, which cause depending tasks and projects to unnecessarily run/work again. The initial goal of this effort was to make caching of `fast-jar` builds work nice in CI environments, I assume that's the most common package type used in CI environments for example for testing purposes. The idea here is to _not_ cache all the dependency jars, but leverage the mechanisms available in Gradle to "just collect" the dependencies. In other words (and simplified): for `fast-jar`, cache everthing "in `quarkus-app` except `lib/`. There is no change for users running a `./gradlew quarkusBuild`. `QuarkusBuild` is still the task that users should execute (or depend on in their builds). To achive the goal to make especially `fast-jar` builds "nicely cacheable" in CI, two new tasks had to be introduced: * `QuarkusBuildDependencies` uses the dependency information from the Quarkus `ApplicationModel` to tell Gradle to collect those. This task works for `fast-jar` and `legacy-jar` and is a "no op" for other package types. This task is _never_ cacheable, but has _up-to-date_ checks. * `QuarkusBuildCacheableAppParts` performs a Quarkus application build and collects the _cacheable_ parts of that build. This task works for `fast-jar` and `legacy-jar` and is a "no op" for other package types. This task is cacheable. * `QuarkusBuild`, for `fast-jar` and `legacy-jar`, assembles the outputs of the above two tasks to produce the expected output. Performs a Quarkus build for other package types. See below on when this task is cacheable. The `build/quarkus-build/` directory is used to properly separate the outputs of the above three tasks: * `build/quarkus-build/gen/` receives the full output of a Quarkus build * `build/quarkus-build/app/` receives the _cacheable_ parts from `build/quarkus-build/gen/` * `build/quarkus-build/dep/` receives the dependency jars The output of `QuarkusBuild` is, by default, cacheable in non-CI environments and _not_ cacheable in CI environments. The behavior can be explicitly overridden using the new property `cacheLargeArtifacts` property. As outlined above, caching huge artifacts is not beneficial in CI, either due to space limitations (GitHub's 10GB limit for example) or just the cost of network traffic/duration. On a developer's machine however, the cost of storing/retrieving even bigger artifacts is rather neglectible and lower than rebuilding a Quarkus application. Before this change, the "priority" of configuration settings was (in most cases...) effectively: System properties, environment variables, `application.properties`, configurations in Gradle build scripts & Gradle project properties. This order is unintuitive and was changed to: system properties, environment variables, Gradle project properties, Quarkus build configuration, application.properties. The previous code had several places in which Quarkus related configuration settings were retrieved with sometimes different priorities of the "config sources". The whole "machinery" to get the configuration has been rewritten and encapsulated in new classes `EffectiveConfig` (effective for a Quarkus build) and `BaseConfig` (available during Gradle task configuration phase). `EffectiveConfig` is not really different from `BaseConfig, but contains the "special" settings from e.g. `ImagePush` task. The new implementation also uses SmallRye Config and especially the existing Quarkus mechanisms to pull information out of a `PackageConfig` _object_ produced from a configuration. It turned out to be easier to reason about `PackageConfig` than "raw property values". Support for `application.yaml/yml` has also been added. All calls into the "inner workings" of a Quarkus build and Quarkus code generation have been moved to separate worker processes. The code in this change to support this is pretty dead simple. The primary reason to move that work into isolated worker processes is that all the `QuarkusBootstrap` and derived pieces know nothing about Gradle, so there is no "property like" mechanism to override settings from `application.properties/yaml/yml` with those from e.g. the `QuarkusPluginExtension`. The only option would have been to modify the system properties of the Gradle build process - but that's a no-go, especially considering other tasks running in parallel (think: two `QuarkusBuild` of different projects running at the same time). It would have been relatively easy to serialize all `QuarkusBuild` actions across a Gradle build, but why - and it would prevent using "beefy machines" to run many `QuarkusBuild`s in parallel. Another option would have been to implement a rather complex (and likely very racy) mechanism to track modifications to system properties. As a conclusion, it wasn't just very simple to leverage the process isolation using the Gradle worker API, but it's also not bad. Gradle does reuse already spawned and compatible worker instances. The "trick" implemented to "prioritize" configs from Gradle project properties and Quarkus extension settings is to pass the whole config as system properties to the worker. A bunch of hopefully useful logging has been implemented and added. With info-level loogging, the tasks emit at least some basic information about what they do and at least the package type being used. To be able to investigate which configuration settings were effectively used, there's another new task called `quarkusShowEffectiveConfig` that shows all the `quarkus.*` properties and some more information, including the loaded `application.(properties|yaml|yml)`. This task is intended to debug build issues. The task can optionally save the effecitve config properties as a properties file in the `build/` directory, if run with the command line option `./gradlew quarkusShowEffectiveConfig --save-config-properties`. None of the existing tests has changed, except the `BuildCOnfigurationTest` had to be adopted to reflect the updated order of config sources. Relates to: quarkusio#30852
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Describe the bug
This commit introduced a serious regression. We pretty much run into CI failures, like in this CI run (Gradle build scan).
It might be that the cached artifact of a native-build is messed up with a build for a fast-jar - or something else?
The relevant configuration of the
QuarkusBuild
task is here.We made
GradleBuild
caching with 2.15.3.Final before, which seemed to work.While it is generally nice that the
QuarkusBuild
task is cacheable, the size of the cached artifact(s) is really huge, because it contains literally all dependencies. In our case, a cachedQuarkusBuild
execution ("standard" runner jar) occupies about 130M, where 129M are just dependencies inbuild/quarkus-app/lib/
, which are available from elsewhere.The Gradle cache keeps recent and not just the most recent output of a cacheable task, so the above 129M are cached over and over again, letting the Gradle cache "explode" just due to the cacheable output of the
QuarkusBuild
task - in numbers: the build-cache stored by the GitHub actions is currently about 2G big, grown from a few-10M.IMO the
QuarkusBuild
task should be cacheable, but definitely not include dependencies that are available from elsewhere.Running the
QuarkusBuild
("fast jar" or even "uber jar") during every CI run is not that much of an issue. The real issue is that if other projects/tasks depend on the output ofQuarkusBuild
, those are not able to cache or "pass" their "up to date" checks, because the output of a quarkus build is not reproducible. Means, even if the inputs of theQuarkusBuild
task are the same, it produces a different output every time.quarkus-application.dat
is different every time, even if the inputs are exactly the same(We did not notice this issue earlier due to #30453)
Expected behavior
No response
Actual behavior
Stack trace:
How to Reproduce?
No response
Output of
uname -a
orver
No response
Output of
java -version
No response
GraalVM version (if different from Java)
No response
Quarkus version or git rev
2.16.1.Final
Build tool (ie. output of
mvnw --version
orgradlew --version
)No response
Additional information
No response
The text was updated successfully, but these errors were encountered: