yum flakes: operation too slow #9203

Closed
stevekuznetsov opened this issue Jun 7, 2016 · 10 comments
Labels: area/infrastructure, kind/test-flake, priority/P2

Comments

@stevekuznetsov
Contributor

Seeing failures on AWS building OpenShift images:

Jun 07 11:22:34 --- openshift/origin-haproxy-router ---
Jun 07 11:22:34 --> FROM openshift/origin
Jun 07 11:22:35 --> RUN yum -y install haproxy &&     mkdir -p /var/lib/haproxy/router/{certs,cacerts} &&     mkdir -p /var/lib/haproxy/{conf,run,bin,log} &&     touch /var/lib/haproxy/conf/{{os_http_be,os_edge_http_be,os_tcp_be,os_sni_passthrough,os_reencrypt,os_edge_http_expose,os_edge_http_redirect}.map,haproxy.config} &&     chmod -R 777 /var &&     yum clean all &&     setcap 'cap_net_bind_service=ep' /usr/sbin/haproxy
Jun 07 11:22:35 Loaded plugins: fastestmirror, ovl
Jun 07 11:23:06 Could not retrieve mirrorlist http://mirrorlist.centos.org/?release=7&arch=x86_64&repo=os&infra=stock error was
Jun 07 11:23:06 12: Timeout on http://mirrorlist.centos.org/?release=7&arch=x86_64&repo=os&infra=stock: (28, 'Operation too slow. Less than 1000 bytes/sec transferred the last 30 seconds')
Jun 07 11:23:06 
Jun 07 11:23:06 
Jun 07 11:23:06  One of the configured repositories failed (Unknown),
Jun 07 11:23:06  and yum doesn't have enough cached data to continue. At this point the only
Jun 07 11:23:06  safe thing yum can do is fail. There are a few ways to work "fix" this:
Jun 07 11:23:06 
Jun 07 11:23:06      1. Contact the upstream for the repository and get them to fix the problem.
Jun 07 11:23:06 
Jun 07 11:23:06      2. Reconfigure the baseurl/etc. for the repository, to point to a working
Jun 07 11:23:06         upstream. This is most often useful if you are using a newer
Jun 07 11:23:06         distribution release than is supported by the repository (and the
Jun 07 11:23:06         packages for the previous distribution release still work).
Jun 07 11:23:06 
Jun 07 11:23:06      3. Disable the repository, so yum won't use it by default. Yum will then
Jun 07 11:23:06         just ignore the repository until you permanently enable it again or use
Jun 07 11:23:06         --enablerepo for temporary usage:
Jun 07 11:23:06 
Jun 07 11:23:06             yum-config-manager --disable <repoid>
Jun 07 11:23:06 
Jun 07 11:23:06      4. Configure the failing repository to be skipped, if it is unavailable.
Jun 07 11:23:06         Note that yum will try to contact the repo. when it runs most commands,
Jun 07 11:23:06         so will have to try and fail each time (and thus. yum will be be much
Jun 07 11:23:06         slower). If it is a very temporary problem though, this is often a nice
Jun 07 11:23:06         compromise:
Jun 07 11:23:06 
Jun 07 11:23:06             yum-config-manager --save --setopt=<repoid>.skip_if_unavailable=true
Jun 07 11:23:06 
Jun 07 11:23:06 Cannot find a valid baseurl for repo: base/7/x86_64
Jun 07 11:23:41 error: running '/bin/sh -c cd "/var/lib/origin" && yum -y install haproxy &&     mkdir -p /var/lib/haproxy/router/{certs,cacerts} &&     mkdir -p /var/lib/haproxy/{conf,run,bin,log} &&     touch /var/lib/haproxy/conf/{{os_http_be,os_edge_http_be,os_tcp_be,os_sni_passthrough,os_reencrypt,os_edge_http_expose,os_edge_http_redirect}.map,haproxy.config} &&     chmod -R 777 /var &&     yum clean all &&     setcap 'cap_net_bind_service=ep' /usr/sbin/haproxy' failed with exit code 1
Jun 07 11:23:41 !!! Error in hack/build-images.sh:65
Jun 07 11:23:41     '"${oc}" ex dockerbuild $2 $1' exited with status 1
Jun 07 11:23:41 Call stack:
Jun 07 11:23:41     1: hack/build-images.sh:65 build(...)
Jun 07 11:23:41     2: hack/build-images.sh:99 image(...)
Jun 07 11:23:41     3: hack/build-images.sh:113 main(...)
Jun 07 11:23:41 Exiting with status 1
Jun 07 11:23:41 make: *** [release] Error 1

/cc @danmcp
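
One possible mitigation for a transient mirror timeout like the one in the log above is to retry the failing command a few times before giving up. The sketch below is only an illustration: the `retry` function, its argument order, and the attempt and delay values are hypothetical and are not part of hack/build-images.sh or the image Dockerfiles.

```shell
# Hypothetical helper (not from the origin repository): run a command up to
# a fixed number of times, sleeping between failed attempts.
# Usage: retry <max_attempts> <delay_seconds> <command> [args...]
retry() {
  local attempts="$1"
  local delay="$2"
  shift 2
  local n=1
  # Keep re-running the command until it succeeds or attempts run out.
  until "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      echo "command failed after ${n} attempts: $*" >&2
      return 1
    fi
    echo "attempt ${n} failed, retrying in ${delay}s: $*" >&2
    sleep "$delay"
    n=$((n + 1))
  done
}

# Example with assumed values: retry 3 10 yum -y install haproxy
```

A retry only hides the flake when the mirror recovers quickly; it does not address the upstream mirrorlist problem itself.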

@stevekuznetsov
Contributor Author

@wshearn saw this 17/100 times in some tests this morning, so it is still happening fairly frequently.

@soltysh
Contributor

soltysh commented Jun 14, 2016

Seen earlier in #9021.

@stevekuznetsov
Contributor Author

The underlying issue is https://bugs.centos.org/view.php?id=10685, as diagnosed in #9230.

@stevekuznetsov
Contributor Author

Fixed upstream

@soltysh
Contributor

soltysh commented Jun 29, 2016

Unfortunately seen again in #9597: https://ci.openshift.redhat.com/jenkins/job/test_pr_origin/5548/

@marun
Contributor

marun commented Jun 29, 2016

@stevekuznetsov
Contributor Author

@soltysh @marun in your logs I cannot see any mention of a timeout ... I think the error you are seeing is new. Please create a new issue for the errors and file a Jenkins Failure Cause Management item to point to the new issue. The error I see in both of your logs is:

Cannot retrieve metalink for repository: epel/x86_64. Please verify its path and try again
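
To make the distinction between the two failure modes concrete, a log could be checked for the two messages quoted in this thread. This is a rough sketch: the function name `classify_yum_failure` and the labels it prints are made up for illustration, and only the two matched strings come from the logs above.

```shell
# Hypothetical helper: print a label for the kind of yum failure found in a
# log file. Only the two matched message strings come from this thread.
# Usage: classify_yum_failure <path-to-log>
classify_yum_failure() {
  local log="$1"
  if grep -qF "Operation too slow" "$log"; then
    # The mirrorlist timeout reported in this issue.
    echo "timeout"
  elif grep -qF "Cannot retrieve metalink for repository" "$log"; then
    # The separate EPEL metalink error seen in the later logs.
    echo "metalink"
  else
    echo "unknown"
  fi
}
```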

@marun
Contributor

marun commented Jun 29, 2016

@stevekuznetsov sorry for the confusion, #8846 looks to match though.

@soltysh
Contributor

soltysh commented Jun 30, 2016

Created #9642.

@soltysh
Contributor

soltysh commented Jul 28, 2016

Seen this happening in #9240. Full log: https://ci.openshift.redhat.com/jenkins/job/test_pull_requests_origin_integration/4109/console. I'm hoping it is intermittent.
