Build Cache Task #9190
Thanks for raising this issue @kapilt. We are actively looking into a build caching solution for Azure Pipelines; we have done some initial design work and hope to start work on it in the new year.
Azure Pipelines offers many advantages over AppVeyor, the biggest being a maximum of 10 parallel builds for open-source projects, which drastically reduces build completion time. A pipeline takes about the same time to build on the two CI infrastructures; however, the caching feature is currently only available on AppVeyor, meaning that on Azure Pipelines the compressed environment is downloaded for each build, consuming about 10 minutes. This means that we'll save at least 10 minutes per pipeline compared to AppVeyor once the feature is added to Azure Pipelines: microsoft/azure-pipelines-tasks#9190
Thanks for all the feedback and interest. Since @mitchdenny’s last post, we’ve completed an initial design for Pipeline Caching, and we’re starting the implementation now. You can see how it’ll work and can leave comments in the PR. We look forward to your feedback and are excited to have you try it out once it’s ready.
I suggest adding native support for vcpkg, or at least an easy way to make vcpkg work. Vcpkg builds are a perfect candidate for the build cache. Try vcpkg install cgal :-) For some reason the Ubuntu images have vcpkg installed but the Windows ones don't. Anyone from MSFT know why?
Adding @chrisrpatterson to comment on having vcpkg pre-baked on the Windows agent. For caching, we see native builds as a scenario we want to improve. The initial steps will be building blocks, and we'll tackle serving individual ecosystems as we go along. We've already had conversations with the vcpkg team :)
This is what I have built for our pipeline; it uses an NPM Artifact feed. Bit of a hack, but it works for now.
# Yaml Spec: https://aka.ms/yaml
# AzureDevOps Pipeline Caching
#
# Right now there is no out of box solution.
# see: https://github.com/Microsoft/azure-pipelines-tasks/issues/9190
#
# This home grown solution uses an npm registry as a central cache
parameters:
# The name of artifact that you wish to retrieve from cache.
# This becomes the NPM package name, therefore it must be
# a valid npm package name.
name: ''
# A command to generate a hash key.
# If the hash does not exist nothing will be downloaded.
# By default we hash the provided gencmd.
hashcmd: 'echo "${GENCMD}" | sha1sum'
# A command to execute in the event of a cache miss
gencmd: 'echo "Failed to provide gencmd" && exit 1;'
# The directory to extract the cache into.
# This can be any folder, if it does not exist we will create it.
directory: $(Build.ArtifactStagingDirectory)
# NPM Registry to connect to for storing cache artifacts
npmRegistry: $(build.cache.npm.registry)
npmToken: $(build.cache.npm.token)
steps:
- bash: |
set -Eeuo pipefail;
OLD_CWD="$PWD";
function installRc {
echo "Installing build cache .npmrc config file";
if [ -f ~/.npmrc ];
then
mv ~/.npmrc ~/.npmrc.bk;
fi;
echo "always-auth=true" >> ~/.npmrc;
echo "registry=https://${NPM_REGISTRY}/" >> ~/.npmrc;
echo "//${NPM_REGISTRY}/:_authToken=${NPM_TOKEN}" >> ~/.npmrc;
}
function restoreRc {
echo "Restoring original .npmrc config file";
if [ -f ~/.npmrc.bk ];
then
mv ~/.npmrc.bk ~/.npmrc;
else
rm -f ~/.npmrc;
fi;
cd $OLD_CWD;
rm -rf /tmp/cacheprep;
}
trap restoreRc EXIT;
installRc;
GENCMD="${{parameters.gencmd}}";
HASH=$(${{parameters.hashcmd}});
HASH=${HASH::-3}  # strip the trailing "  -" that sha1sum appends when reading stdin
echo "Calculated hash key - $HASH";
BEARER=$(cat ~/.npmrc | grep -m 1 _authToken | sed -e 's/.*_authToken=//g');
REGISTRY=$(cat ~/.npmrc | grep -m 1 _authToken | sed -e 's/:_authToken=.*//g');
RESPONSE=$(curl -s -H "Authorization: Bearer $BEARER" https:${REGISTRY}${{parameters.name}});
if [[ $RESPONSE != *"$HASH"* ]];
then
echo "Cache miss, running gencmd";
restoreRc;
eval "${{parameters.gencmd}}";
installRc;
rm -rf /tmp/cacheprep;
mkdir -p /tmp/cacheprep;
tar -czf /tmp/cacheprep/data.tar.gz -C ${{parameters.directory}} .;
cd /tmp/cacheprep;
echo "{\"name\":\"${{parameters.name}}\",\"version\":\"0.0.0-$HASH\"}" > package.json;
cat package.json;
npm --registry https:${REGISTRY} publish;
exit 0;
fi
echo "Ensure target directory exists - ${{parameters.directory}}";
if [ ! -d ${{parameters.directory}} ];
then
mkdir -p ${{parameters.directory}};
fi;
cd ${{parameters.directory}};
echo "Downloading cached artifacts";
curl -L -O -H "Authorization: Bearer $BEARER" https:${REGISTRY}${{parameters.name}}/-/${{parameters.name}}-0.0.0-$HASH.tgz;
tar -xzf ${{parameters.name}}-0.0.0-${HASH}.tgz;
rm -f ${{parameters.name}}-0.0.0-${HASH}.tgz;
tar -xzf ./package/data.tar.gz;
rm -rf ./package;
env:
NPM_REGISTRY: ${{parameters.npmRegistry}}
NPM_TOKEN: ${{parameters.npmToken}}
displayName: Pipeline cache (${{parameters.name}})
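For reference, a pipeline could consume a step template like the one above as follows. This is a hypothetical sketch: it assumes the template is saved as build-cache.yml in the repo root, and the parameter values (name, directory, gencmd) are illustrative.

```yaml
# azure-pipelines.yml (hypothetical consumer of the template above)
steps:
- template: build-cache.yml
  parameters:
    name: my-build-deps                       # becomes the npm package name
    directory: $(Build.SourcesDirectory)/deps # extracted here on a cache hit
    gencmd: './scripts/fetch-deps.sh'         # runs only on a cache miss
    npmRegistry: $(build.cache.npm.registry)
    npmToken: $(build.cache.npm.token)
```

Because the default hashcmd hashes the gencmd string itself, the cache is invalidated whenever the generation command changes.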
You might find that Universal Packages gives you better performance here (the protocol that Universal Packages uses will be what our build cache uses).
I did start out by trying to use Universal Packages, but the performance was not amazing. Not necessarily due to the underlying protocols it uses, but due to the fact that you always have to install ArtifactTool.exe. If you follow the instructions from the Feed Connection page, you need the vsts CLI installed, which then goes and downloads ArtifactTool.exe. If you use the Universal Packages "task" it also has to install ArtifactTool.exe, and I imagine the Azure CLI is no different. I guess if this additional executable was baked into the VM images it would speed things up a lot. I tried looking for docs/specs on an HTTP API for Universal Packages but couldn't find anything useful. So that's when I changed over to NPM: the registry spec is openly known, and no additional tools need to be downloaded to use it. To download a cached artifact, the above pipeline step normally runs in sub-second territory.
For Pipeline Caching, the tooling will end up being baked into the agent, so there's nothing to download. I am interested in how large your dependencies are that they are coming down sub-second. How many files on disk, total volume on disk, size of archive/package?
This is the one that runs sub-second, so admittedly it's not a massive payload to download at only 8 MB. But when I was using Universal Packages, it was faster to just build this artifact (a Go binary) than to cache it. Anyway, hopefully the built-in cache mechanism being built at the moment will be nice and performant.
Closing this issue as Azure Pipelines now has pipeline caching.
@mitchdenny yet this says it's WIP? https://github.com/microsoft/azure-pipelines-yaml/blob/master/design/pipeline-caching.md
Yeah - that was a design note; the ** Work In Progress ** marker wasn't removed before it was merged. Here is where the feature is described right now: https://devblogs.microsoft.com/devops/caching-and-faster-artifacts-in-azure-pipelines/ I believe there is still more work to do, but it is in a usable state right now.
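For anyone landing here later, the built-in feature is exposed as a Cache task. A minimal usage sketch follows; the key and path values are illustrative, and the syntax was still evolving at the time this issue was closed.

```yaml
steps:
- task: Cache@2
  inputs:
    # Cache key built from the ecosystem, the agent OS, and the lockfile,
    # so the cache is invalidated when dependencies change.
    key: 'npm | "$(Agent.OS)" | package-lock.json'
    path: $(Pipeline.Workspace)/.npm
  displayName: Cache npm packages
- script: npm ci --cache $(Pipeline.Workspace)/.npm
  displayName: Install dependencies
```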
Is there a place to leave feedback? I have some... thoughts
|
It's very common for npm, pip, and others to use a build cache to speed up installations and time to first test execution. On many hosted CI platforms this is accomplished via a plugin that stores/retrieves cache artifacts from a cloud provider's object storage. Ideally an Azure Pipelines integration would work the same way, so it works regardless of hosted/BYOA execution runtimes.
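The hash-keyed store/retrieve pattern described above can be sketched in plain shell. This is a minimal illustration, using a local directory as a stand-in for cloud object storage; all paths and the lockfile contents are made up for the example.

```shell
#!/bin/sh
# Sketch of a hash-keyed build cache. On a miss, install dependencies and
# upload the archive; on a hit, just extract the archive.
set -eu

CACHE_DIR=$(mktemp -d)   # stand-in for an object-storage bucket
WORK_DIR=$(mktemp -d)    # stand-in for the build workspace

# Key the cache on the dependency lockfile, so the key changes whenever
# the declared dependencies change.
printf 'lodash@4.17.21\n' > "$WORK_DIR/package-lock.json"
KEY=$(sha1sum "$WORK_DIR/package-lock.json" | cut -d' ' -f1)

if [ -f "$CACHE_DIR/$KEY.tar.gz" ]; then
  echo "cache hit: $KEY"
  tar -xzf "$CACHE_DIR/$KEY.tar.gz" -C "$WORK_DIR"
else
  echo "cache miss: $KEY"
  # Stand-in for the real dependency install (e.g. npm ci / pip install).
  mkdir -p "$WORK_DIR/node_modules"
  touch "$WORK_DIR/node_modules/.installed"
  tar -czf "$CACHE_DIR/$KEY.tar.gz" -C "$WORK_DIR" node_modules
fi
```

A real integration would replace the local directory with object-storage get/put calls, but the control flow (compute key, probe, miss-generate-upload, hit-download-extract) is the same one the NPM-registry hack above implements.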
There have been numerous requests over the years for the same.
https://visualstudio.uservoice.com/forums/330519-azure-devops-formerly-visual-studio-team-services/suggestions/32044321-improve-hosted-build-agent-performance-with-build
https://feedback.azure.com/forums/169382-cache/suggestions/35604928-support-caching-in-azure-pipelines
#7254
#553