proposal on caching #412
Conversation
I like it so far! My suggestion would be to finish it by having a proposal on how one would do caching for each main technology currently supported (i.e. Go, Java, TypeScript, Python).
Makes sense to me. I will start looking at node/npm. Happy to have volunteers for all techs! The way I am testing this is locally against one of our projects. I wrote a little helper script for this available at https://gist.github.com/henrjk/4265bc1b4f6fe3f7cd9027b0bcc284f9
Volunteering for Go.
Volunteering for Python.
With commit 4dc6da4 I added a first draft to enable TypeScript caching. Some of this should be familiar to @gerardcl. The script will require rsync, which would need to be added to the image. I also noted that git is not available and so it uses … FYI @henninggross, @netzartist
docs/proposals/caching.adoc
== Con

* Tools such as, for TypeScript, https://github.com/pnpm/pnpm[pnpm] and https://github.com/yarnpkg/berry[yarn berry], support deduplication, allowing artifacts to be referenced from a central location without packaging them again. pnpm uses hard links to make this efficient. However, hard linking is not supported across PVCs and thus could not be used with the proposed design as is. See also the related discussion at https://github.com/opendevstack/ods-pipeline/discussions/411[node typescript security and caching - npm or yarn? #411].
To enable exploring the newer npm techs using hard linking, I am now in favor of changing the proposal so that `pipeline-cache-dir` and `global-cache-parent` would always be on the same PVC. However, the checkout dir might still not be on the same PVC, and perhaps not even be a PVC at all.
docs/proposals/caching.adoc
== Background

Currently there is a single PVC per project which is shared by all build pipelines.
As a consequence only a single build at a time can run for a project.
I will add this clarification to the proposal so we don't lose it when resolving the conversation.
docs/proposals/caching.adoc
With sufficient caching space available:

* A global cache directory whose parent directory is indicated by parameter `global-cache-parent`. Cleanup will only spare directories within it, so build tasks must keep their files in a subdirectory, for example in `$(global-cache-parent)/go-modules/`.
Why do we need the parameter `global-cache-parent`? Do you think some build tools will not be configurable to point to a directory of our choice?
docs/proposals/caching.adoc
* A pipeline cache directory as indicated by parameter `pipeline-cache-dir`.

In the initial implementation the parameters will be set to `/.cache/` and `/.cache-p/<pipeline-name>/` if space permits, and be empty if there is not sufficient space available.
Why `/.cache-p/<pipeline-name>/`? Does that mean you want caching within a branch to be on by default?
Pipeline caching should only be enabled when the task declares how much free space it requires for a new pipeline via a parameter `pipeline-caching-mb` greater than 0.
There should probably also be a way to not use caching for a specific commit, perhaps by having `[build-no-cache]` as part of a commit message.
The pipeline cache should be left undisturbed from the previous build of the same pipeline, except in cases where cache space is reclaimed. Not sure this answers your first question.
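For illustration, a minimal sketch of how the commit-message opt-out could be detected in `ods-start`, assuming git is available there; the variable names are hypothetical:

```sh
# Hypothetical sketch: disable the pipeline cache when the latest commit
# message contains the [build-no-cache] marker.
commit_msg=$(git log -1 --pretty=%B)
if echo "$commit_msg" | grep -qF '[build-no-cache]'; then
  echo "Pipeline caching disabled via commit message."
  pipeline_cache_dir=""   # an empty parameter value signals 'no caching' to build tasks
fi
```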
docs/proposals/caching.adoc
Files in the cache locations would persist between builds unless they are removed by the cleanup mechanism.

The build scripts are adjusted to take advantage of the cache parameters above. In addition, each build task will support new parameters `global-caching-mb` and `pipeline-caching-mb`. These default to 0 to indicate the task is not using caching. Otherwise they specify the minimum required cache space in MB.
What is the reasoning behind two separate parameters? Why do we need both?
As now mentioned in the comment above, one purpose of these parameters is to enable caching by explicitly stating how much cache space the task will require in the respective cache.
For the pipeline cache this also allows for a good cleanup strategy, as the space is not shared with other pipelines.
For the global cache the situation is more complex, as it would be shared by all pipelines. It would grow over time, and cleanup might from time to time wipe out the cache (e.g. `rm -rf /.cache/go`) when the disk space on the caching PVC is no longer able to fulfill the demand. However, to do a good job the cleaner would actually need to understand which tasks use which global caches, which the proposal as is does not make explicit.
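For illustration, a minimal sketch of the kind of check such a cleaner could perform, assuming the caching PVC is mounted at a known path and the declared `global-caching-mb` values of all tasks have been summed; all names here are hypothetical:

```sh
# Hypothetical sketch: wipe the global cache when the declared space
# requirements can no longer be satisfied on the caching volume.
cache_mount="/workspace/cache"   # assumed mount point of the caching PVC
required_mb=2048                 # assumed sum of declared global-caching-mb values
avail_mb=$(df -Pm "$cache_mount" | awk 'NR==2 {print $4}')
if [ "$avail_mb" -lt "$required_mb" ]; then
  echo "WARN: only ${avail_mb}MB free but ${required_mb}MB requested; clearing global cache"
  rm -rf "${cache_mount}/.cache/"*   # coarse cleanup as described above (e.g. removes /.cache/go)
fi
```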
docs/proposals/caching.adoc
Before cache cleanup, `ods-start` cleans up the prior PVC checkout dir, which means everything except `/.cache/<some-dir>` or `/.cache-p/<some-dir>`. In particular, to prevent having files at the top level, regular files or links directly below `/.cache/` or `/.cache-p/` are deleted.

`ods-start` then cleans up the global and pipeline cache separately to fulfill the declared requirements of all build tasks in the ods.yaml. If disk space remains too low in a particular cache area, a warning message is logged and the respective parameter `global-cache-parent` and/or `pipeline-cache-dir` will be empty.
> cleans up separately the global and pipeline cache

I think while reading through this I am actually confused about what the difference is and why we need it. Could you clarify your proposal regarding this?
The pipeline cache enables reusing installed dependencies in subsequent builds, similar to what happens on your local machine.
For example, `npm install` will install its dependencies in `node_modules`. A subsequent build will verify that the installs are consistent with the latest `package*.json` version (which is nearly instantaneous). Before installing a new dependency, npm will always download it to a local cache first. This cache is by default at `~/.npm`. One could reconfigure this cache to be in the global cache, which would speed up the build; however, the performance gain probably does not outweigh the extra space requirement in my opinion. See also #411 (comment) for some impressions of performance gains.
The global cache, in contrast, is shared by multiple build pipelines and thus is not meant to share installations of dependencies, but only the dependencies themselves.
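For illustration, a minimal sketch of what relocating npm's download cache into the global cache could look like in a build script; the environment variable holding the global cache location is an assumption:

```sh
# Hypothetical sketch: point npm's download cache at the shared global cache
# instead of the default ~/.npm.
if [ -n "${GLOBAL_CACHE_PARENT:-}" ]; then
  npm ci --cache "${GLOBAL_CACHE_PARENT}/npm"
else
  npm ci
fi
```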
Michael and I were talking directly about some of the details before the weekend.
This would indeed simplify things a bit, in particular the build script would not have to copy sources from the checkout dir to the pipeline cache dir. Also, there would no longer be a need to distinguish these directories and we could simply call it the workspace directory :-) If one enables caching for a workspace by setting a … @michaelsauter is this a good summary? If it makes sense to you I would work on an updated proposal based on this.
@henrjk Yes, I think it does, but I would like to add the general view I shared to provide some background. I am not too excited to add a "workspace cache" at all into the main build tasks. In my view, starting from a clean slate is one of the "features" of CI and I personally would not encourage usage of a "workspace cache" in my projects. I believe we could also use the problem as a trigger to explore completely different approaches to avoid long build times, such as using a bundler like esbundle, or using Yarn as you suggested. For other languages / build tools I think only a global cache is needed, if at all (as Nexus should be fast already!).
I agree with your concerns and call to action! At the moment the workspace cache would not improve FE builds by much. Nonetheless I would like to pursue a workspace cache proposal a bit further to learn whether we can provide a workspace cache, and:
Can we support multi-build repos better by skipping builds where nothing changed? Do we need a workspace cache to enable this? For example, the idea that emerged after my initial proposal was to utilize the workspace cache to skip build tasks if their work-dir did not change as per git commit. When looking at the details, the build task would still be invoked to copy prior reports and bits into the docker context so that the image build could still continue. This is for the case where we have a FE and BE in the same repo and the FE packed JS files are copied into the final docker build. Of course this increases complexity in the build scripts, and perhaps one could factor out the uploading of reports and skip this entirely in the further pipeline. Or, as mentioned above, implement this without a workspace cache.
I updated the proposal for increased clarity. It still contains both workspace caching and global caching. I believe things are a bit simpler now, also in terms of build script complexity, but there are areas where I am not sure it can be implemented as proposed. Also the Con section of caching.adoc is now somewhat better informed. I will investigate what it would take to skip builds in multi-build repos where sources did not change without using a workspace cache.
@henrjk Like the overhaul of the proposal! I guess I am still having trouble understanding the actual benefit of a "workspace cache" over using a global cache, within the context of the original problem statement of "caching dependencies". I do not know much about Python/NPM, so what I'm going to say may sound stupid - sorry in advance. I am basically approaching this coming from my - hopefully somewhat comparable - Ruby/Bundler experience. For Bundler, it is common in CI systems to place the bundle not into the workspace, but in a directory which is cached. I did some googling and it seems that … So - sorry if this doesn't make sense, and then I'll stop bringing it up 😄 However if it does make sense I propose to split this up into two suggestions, where at least the global cache seems way clearer and easier to achieve to me. Also note that my current understanding is that …
I also see the point of splitting and starting with the global cache option, since it will also work for Python by caching and using the venv on that cache (same as Bundler) -> we just need to ensure we give the option to load the venv from the given cache if present. A problem we need to address first of all is to ensure we do not have many pipelines using the same cache (related to the race condition already under discussion).
Corrected: earlier I said --index where I meant --prefix. @michaelsauter Thanks for doubting!
In the workspace cache proposal the script actually adjusts to use …
Answering this could be its own blog series ;-) With node and npm, if you have two apps depending on the same version of package foo, each app may resolve foo's own package dependencies differently depending on additional version constraints, and thus one cannot share the `node_modules` as created by `npm install` or `npm ci`. Using … To not overshare this it would need to be a branch cache and thus essentially be a workspace cache. Using a … There are however alternatives to npm, namely pnpm and yarn, for which a true global shared cache is probably worthwhile. The benchmarks from pnpm compare the various package managers. The code behind it is at commandsMap.js
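To illustrate the pnpm alternative mentioned above, a hedged sketch of pointing pnpm's content-addressable store at a shared location (the environment variable name is an assumption); note that pnpm only hard-links when the store and `node_modules` live on the same filesystem, which ties back to the PVC discussion:

```sh
# Hypothetical sketch: keep pnpm's content-addressable store on the shared
# cache volume; packages are hard-linked into node_modules when possible.
pnpm config set store-dir "${GLOBAL_CACHE_PARENT}/pnpm-store"
pnpm install --frozen-lockfile
```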
Deno has an interesting concept: you can skip NPM-style dependency management entirely by importing desired versions of dependencies via URLs into the Deno runtime, and exporting them in a custom lockfile to be used for caching.
The inventor of Node.js created Deno after reflecting a lot on the architectural issues of NPM.
docs/proposals/caching.adoc
=== Global cache

Only build tasks for which a global cache provides a significant performance gain will support global caching. At the moment the primary candidate is the Go language. Nexus does not support Go and we have no similar Go dependency artifact manager in place.
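For illustration, a minimal sketch of how a Go build script could use the proposed global cache, assuming the cache parent is handed to the script as an environment variable (the variable name is an assumption):

```sh
# Hypothetical sketch: keep downloaded Go modules in the shared global cache
# so subsequent builds can reuse them instead of re-downloading.
if [ -n "${GLOBAL_CACHE_PARENT:-}" ]; then
  export GOMODCACHE="${GLOBAL_CACHE_PARENT}/go-modules"
fi
go build ./...
```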
Gradle will also benefit, mainly because it always downloads the wrapper, unless the URL that points to the Gradle wrapper is updated to point to Nexus (opendevstack/ods-core#919). However, to avoid this problem, I added to the gradle task an instruction to download the Gradle wrapper so that it lands in the Gradle cache folder:
`gradle wrapper --gradle-distribution-sha256-sum ${GRADLE_WRAPPER_DOWNLOAD_SHA256} && ./gradlew -version && \`
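Along the same lines, a hedged sketch of how a Gradle build could reuse the global cache, using the same hypothetical environment variable as above; `GRADLE_USER_HOME` is where Gradle keeps both wrapper distributions and its dependency cache:

```sh
# Hypothetical sketch: keep Gradle's wrapper distributions and dependency
# cache on the shared cache volume between builds.
if [ -n "${GLOBAL_CACHE_PARENT:-}" ]; then
  export GRADLE_USER_HOME="${GLOBAL_CACHE_PARENT}/gradle"
fi
./gradlew build
```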
I made some comments to share some thoughts, but I don't see any blockers. I think the concept is ready and we could give it a try. It looks good to me!
Just updated the proposal to focus on global caching only. I also aimed to undo my changes that showcased the workspace cache. The TypeScript build file is intended to be the same as in master now.
I am very happy with where this is now 👍. Thanks heaps for all the work you put into this! Marking this as "approved", even though I guess we would not merge the PR as-is but maybe just the cache proposal design doc?
docs/proposals/caching.adoc
The following new parameter is introduced to build tasks supporting dependency caching:

* `cache-deps`: a boolean defaulting to `false`. Only if set to `true` will the task use dependency caching.
I'd prefer `cache-dependencies`.
I will change this as you propose.
In addition to the build task parameter, build tasks also receive the following:

* File `.ods/cache-deps-parent-dir` contains an absolute path without trailing '/' to an existing directory (no whitespace or newlines). If file `.ods/cache-deps-parent-dir` does not exist, the task must not use dependency caching. This is used to switch off dependency caching dynamically, for example with a special tag in a git commit message.
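For illustration, a minimal sketch of how a build script could consume this file as described; the cache subdirectory name is an assumption:

```sh
# Hypothetical sketch: enable dependency caching only when ods-start has
# written the cache-deps-parent-dir file.
if [ -f .ods/cache-deps-parent-dir ]; then
  cache_parent=$(cat .ods/cache-deps-parent-dir)
  export GOMODCACHE="${cache_parent}/go-modules"   # example for a Go build task
else
  echo "Dependency caching is disabled for this build."
fi
```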
I would propose to start without disabling via commit message. Do we need this file then at all?
Strictly speaking we wouldn't need it. However without it the caching locations would need to be hardcoded, which I would rather not have.
Instead, if disk space issues arise one should manually:

* Increase the PVC space of the associated repository, or
* Recreate the PVC (TODO: should this be done automatically, for example on a randomized bi-weekly schedule?)
I'd propose to keep it manual for now
Makes sense to me as well. A first version should do it this way.
docs/proposals/caching.adoc
The following cache locations on the PVC are used:

- `+/.c/deps/<technology-name>/+` for dependency caches of a particular build technology. The technology-name would be defined by the build script and unknown to `ods-start`.
I'd prefer e.g. `/.ods-cache/dependencies/<technology-name>/`
I find this a bit lengthy. How about `/.ods-cache/deps/<technology-name>/`?
Not sure. If we were to merge the proposals as well, the discussions etc. around them would still be missing. So perhaps it is better to have a link to the proposal PR in the changelog. I would still aim to remove the other unrelated historic changes, so that only the latest adoc is kept (with final additions). I could start work on this soon (unless somebody else is keen to) and work on a PR which likely should also include one consumer. I'd choose Go for this.
This also removes changes intended to showcase workspace caching for TypeScript.
This has now been implemented with PR #460.
This proposal aims to address issue #147
Looking forward to your feedback!