
Error: Failed to synchronize cache for repo 'updates' #11452

Closed
csrwng opened this issue Oct 19, 2016 · 35 comments
Labels: area/tests, kind/test-flake, lifecycle/stale, priority/P2

Comments

@csrwng (Contributor) commented Oct 19, 2016

Parent issue for discussion: #8571

====

Not sure if this is the same as the yum failures, but opening just in case it's different:

https://ci.openshift.redhat.com/jenkins/job/test_pull_requests_origin_networking/277/

csrwng added the priority/P2, area/tests, and kind/test-flake labels on Oct 19, 2016
@csrwng (Contributor, Author) commented Oct 19, 2016

@stevekuznetsov fyi

@marun (Contributor) commented Oct 21, 2016

I've been seeing this failure regularly when I attempt to dnf -y update fedora24 images.

@marun (Contributor) commented Oct 21, 2016

I've added the failure cause 'dnf update failure' to Jenkins.

@stevekuznetsov (Contributor)

This is either an internet connectivity issue or a mirror issue... @tdawson are we mirroring @updates internally somewhere we can use?

@tdawson (Member) commented Oct 21, 2016

I believe this is a Fedora-only issue. We are not mirroring any plain Fedora repos, only EPEL.

@stevekuznetsov (Contributor)

> I believe this is a Fedora-only issue

Yes, this is true.

As these types of issues proliferate, I think we need a better strategy for interacting with mirrors in general. Can we reduce the yum traffic? When we're building the DIND images, do we really need to be doing the installs every time? Why can't we just layer the code/variant bits on top and update the base image with OS dependencies once a week or so? @marun
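
A minimal sketch of the layering being suggested here, assuming a hypothetical base image name and an illustrative package list (these are not the actual origin Dockerfiles):

    # Dockerfile for a weekly rebuilt base image carrying the OS
    # dependencies; "openshift/dind-base" and the packages are illustrative.
    FROM fedora:24
    RUN dnf -y update && \
        dnf -y install iproute iptables-services openvswitch && \
        dnf clean all

    # Dockerfile for the per-PR image (a separate file): it only layers
    # freshly built binaries on top, so the hot path generates no
    # dnf/mirror traffic at all.
    FROM openshift/dind-base
    COPY _output/local/bin/linux/amd64/ /usr/local/bin/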

@marun (Contributor) commented Oct 21, 2016

@stevekuznetsov I don't think this is a connectivity issue. In addition to these CI failures, I've seen the same error building Fedora images locally or via the Docker Hub. Something is up with the Fedora repos.

@stevekuznetsov (Contributor)

I understand -- if we don't build them, we don't have the issue. What are we gaining by re-installing the dependencies in every build?

@marun (Contributor) commented Oct 21, 2016

I think it's a good idea to build regularly to ensure we catch problems before they impact too many people on the networking team, but that doesn't have to be with every PR. I think a good strategy would be to bake the dind images into the AMI and then rely on the extended and post-merge jobs to catch build-related regressions.

@marun (Contributor) commented Oct 21, 2016

Related PR: #9622

@bparees (Contributor) commented Apr 28, 2017

Not clear from @stevekuznetsov's comment above if what I just hit is this flake or not:

https://ci.openshift.redhat.com/jenkins/job/test_pull_request_origin_extended_networking_minimal/1529

Step 2 : RUN dnf -y update && dnf -y install bind-utils findutils hostname iproute iputils less procps-ng tar which bridge-utils ethtool iptables-services openvswitch && dnf clean all
 ---> Running in a3eed2d6d478
Error: Failed to synchronize cache for repo 'updates'
The command '/bin/sh -c dnf -y update && dnf -y install bind-utils findutils hostname iproute iputils less procps-ng tar which bridge-utils ethtool iptables-services openvswitch && dnf clean all' returned a non-zero code: 1
[ERROR] PID 16368: hack/dind-cluster.sh:276: `${DOCKER_CMD} build -t "${image_name}" .` exited with status 1.
[INFO] 		Stack Trace: 
[INFO] 		  1: hack/dind-cluster.sh:276: `${DOCKER_CMD} build -t "${image_name}" .`
[INFO] 		  2: hack/dind-cluster.sh:267: build-image
[INFO] 		  3: hack/dind-cluster.sh:372: build-images
[INFO]   Exiting with code 1.
[ERROR] PID 906: test/extended/networking.sh:350: `${CLUSTER_CMD} build-images` exited with status 1.
[INFO] 		Stack Trace: 
[INFO] 		  1: test/extended/networking.sh:350: `${CLUSTER_CMD} build-images`
[INFO]   Exiting with code 1.
/data/src/github.com/openshift/origin/hack/lib/log/system.sh: line 31: 16363 Terminated              sar -A -o "${binary_logfile}" 1 86400 > /dev/null 2> "${stderr_logfile}"
[ERROR] PID 853: test/extended/networking-minimal.sh:6: `NETWORKING_E2E_MINIMAL=1 "${OS_ROOT}/test/extended/networking.sh"` exited with status 1.
[INFO] 		Stack Trace: 
[INFO] 		  1: test/extended/networking-minimal.sh:6: `NETWORKING_E2E_MINIMAL=1 "${OS_ROOT}/test/extended/networking.sh"`
[INFO]   Exiting with code 1.
make: *** [test-extended] Error 1
++ export status=FAILURE
++ status=FAILURE
+ set +o xtrace
########## FINISHED STAGE: FAILURE: RUN EXTENDED TESTS ##########

@bparees (Contributor) commented Apr 28, 2017

But I think it's this one. It's not a specific package-not-found issue, just a general dnf connection issue.

@stevekuznetsov (Contributor)

The previous error was "06:44:44 Error: No packages marked for removal.", which is why I pointed him at a different issue. The logs you posted show the dnf issue that is really being tracked here.

@deads2k (Contributor) commented Apr 28, 2017

@stevekuznetsov (Contributor)

Sure looks like it. Must have been a roll-out of a new version to @updates recently. Until we serve RPMs from our mirrors and our mirrors only (@gmontero) inside and outside of container builds, we'll continue seeing this forever. Or we could try to change the yum backend to be more graceful here.
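
For reference, a sketch of what serving RPMs from internal mirrors only could look like at the repo-definition level; the mirror URL is a placeholder, and skip_if_unavailable is a standard dnf repo option rather than anything this repo is known to set:

    # /etc/yum.repos.d/fedora-updates.repo (sketch; the URL is a placeholder)
    [updates]
    name=Fedora $releasever - $basearch - Updates (internal mirror)
    baseurl=http://mirror.internal.example.com/fedora/updates/$releasever/$basearch/
    enabled=1
    gpgcheck=1
    # Soften transient failures instead of failing the whole transaction:
    skip_if_unavailable=1

Global retry and timeout behavior can additionally be tuned in the [main] section of /etc/dnf/dnf.conf.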

@danwinship (Contributor)

It's interesting that the failure seems to always happen when building the second image: openshift/dind builds successfully (including doing a "dnf update"), but then openshift/dind-node fails. Maybe if we drop the "dnf clean all" from the openshift/dind Dockerfile, this bug will magically go away?

@stevekuznetsov (Contributor)

I don't know enough about the environment to say for certain but I would be surprised if the caches or other dnf data were actually interacting between the two builds. @smarterclayton would you expect that sort of cross-pollination to be possible?

@danwinship (Contributor)

openshift/dind-node is built "FROM openshift/dind", so its "RUN dnf -y update" runs against whatever state the openshift/dind build left the dnf caches in. Obviously this shouldn't be a problem, but if there were a bug in dnf's regenerate-caches-from-scratch code, it wouldn't get seen much in normal operation (people don't normally run "dnf clean all"), which might explain why we see this problem all the time while ordinary Fedora users don't.
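
A simplified sketch of the image chain being described here (package lists trimmed; these are not the exact origin Dockerfiles):

    # Dockerfile for openshift/dind (simplified): the trailing
    # "dnf clean all" wipes the metadata caches from this layer.
    FROM fedora:24
    RUN dnf -y update && dnf -y install iproute openvswitch && dnf clean all

    # Dockerfile for openshift/dind-node (simplified): its "dnf -y update"
    # therefore has to regenerate the caches from scratch, which is the
    # code path suspected above.
    FROM openshift/dind
    RUN dnf -y update && dnf -y install bind-utils iptables-services && dnf clean all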

@stevekuznetsov (Contributor)

Ah, I see what you mean. Was it there just to reduce the size of the image? Seems reasonable to remove it.

@soltysh (Contributor) commented Apr 28, 2017

There's no cache in place; what we usually do in all our images is clean the cache after installation, see here.

@soltysh (Contributor) commented Apr 28, 2017

OK, never mind, I just noticed the PR removing that 🤦‍♂️

@bparees (Contributor) commented May 4, 2017

https://ci.openshift.redhat.com/jenkins/job/merge_pull_request_origin/548/consoleFull#82220402358b6e51eb7608a5981914356

Error: Failed to synchronize cache for repo 'updates'
The command '/bin/sh -c dnf -y update && dnf -y install docker glibc-langpack-en iptables openssh-clients openssh-server' returned a non-zero code: 1
[ERROR] PID 16703: hack/dind-cluster.sh:276: `${DOCKER_CMD} build -t "${image_name}" .` exited with status 1.
[INFO] 		Stack Trace: 
[INFO] 		  1: hack/dind-cluster.sh:276: `${DOCKER_CMD} build -t "${image_name}" .`
[INFO] 		  2: hack/dind-cluster.sh:266: build-image
[INFO] 		  3: hack/dind-cluster.sh:372: build-images
[INFO]   Exiting with code 1.
[ERROR] PID 1183: test/extended/networking.sh:350: `${CLUSTER_CMD} build-images` exited with status 1.
[INFO] 		Stack Trace: 
[INFO] 		  1: test/extended/networking.sh:350: `${CLUSTER_CMD} build-images`
[INFO]   Exiting with code 1.
/data/src/github.com/openshift/origin/hack/lib/log/system.sh: line 31: 16698 Terminated              sar -A -o "${binary_logfile}" 1 86400 > /dev/null 2> "${stderr_logfile}"
[ERROR] PID 1130: test/extended/networking-minimal.sh:6: `NETWORKING_E2E_MINIMAL=1 "${OS_ROOT}/test/extended/networking.sh"` exited with status 1.
[INFO] 		Stack Trace: 
[INFO] 		  1: test/extended/networking-minimal.sh:6: `NETWORKING_E2E_MINIMAL=1 "${OS_ROOT}/test/extended/networking.sh"`
[INFO]   Exiting with code 1.
make: *** [test-extended] Error 1

@levysantanna commented May 17, 2017

Same issue here:

Error: Failed to synchronize cache for repo 'fedora'
error: build error: The command '/bin/sh -c dnf update -y --releasever=25' returned a non-zero code: 1

Strangely, it works in Docker on my desktop:

Step 5 : RUN dnf update -y --releasever=25
 ---> Running in 1c14b7622cac
Last metadata expiration check: 0:00:46 ago on Wed May 17 14:42:19 2017.
Dependencies resolved.
================================================================================
 Package                       Arch     Version                 Repository
                                                                           Size
================================================================================
Upgrading:
 audit-libs                    x86_64   2.7.6-1.fc25            updates   107 k
 ca-certificates               noarch   2017.2.14-1.0.fc25      updates   477 k
 coreutils                     x86_64   8.25-17.fc25            updates   1.1 M
 coreutils-common              x86_64   8.25-17.fc25            updates   1.9 M

@stevekuznetsov (Contributor)

@levysantanna this is a transient failure; there's no reason to expect you'd be able to reproduce it.
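
For anyone hitting this in their own builds, a generic workaround for transient repo failures is to retry the dnf step with a backoff; this is only a sketch, not something the origin scripts do:

    # Retry "dnf -y update" a few times before giving up.
    ok=0
    for attempt in 1 2 3; do
        if dnf -y update; then
            ok=1
            break
        fi
        echo "dnf update failed (attempt ${attempt}); retrying in 30s..." >&2
        sleep 30
    done
    [ "${ok}" -eq 1 ] || exit 1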

@zopyx commented Jul 6, 2017

This is not a transient failure. I am seeing this error today on one dev machine with one Docker image, but not on a different machine with the same Docker image... this behavior appears weird.

@danwinship (Contributor)

The error appears to happen when there's some specific sort of problem on one of the fedora mirrors, which will then cause every "yum update" that hits that mirror to fail until eventually the mirror resyncs with the masters and fixes things. (If you look through the past instances of the flake, it tends to happen in bursts; it will happen 5 or 10 times in one day, and then not at all for a few weeks or months.)

If you reliably see it on one machine and not on another (at a given time), it's just because of DNS caching: one of them has resolved "mirrors.fedoraproject.org" to the mirror that has the problem, and the other has resolved it to a different mirror. If you can actually figure out which mirror is having the problems, then filing a bug against the Fedora infrastructure and/or emailing the maintainer of that mirror might help them figure out exactly what causes this...
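
A sketch of how one might pin down which mirror a given machine resolved to; the metalink URL below is the standard Fedora one, with f25 used as an example release:

    # Rebuild the metadata verbosely; dnf prints the repo/mirror URLs it
    # tries as it goes (exact output varies by dnf version).
    dnf -v makecache 2>&1 | grep -iE 'mirror|baseurl'

    # Or fetch the metalink directly to see which mirrors
    # mirrors.fedoraproject.org is currently handing out:
    curl -s 'https://mirrors.fedoraproject.org/metalink?repo=updates-released-f25&arch=x86_64' \
        | grep -o 'http[^"<]*repomd.xml' | head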

@openshift-bot (Contributor)

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

openshift-ci-robot added the lifecycle/stale label on Feb 13, 2018
@stevekuznetsov (Contributor)

We've pruned out dependencies on @updates and @epel, so this should be fixed.

/close

@nicklasring

Still having this issue: it works fine on my desktop, but I get "Error: Failed to synchronize cache for repo 'updates'" on the server.
