Spurious failure with stalled macOS builds (specifically the check i686-apple-darwin job) #44221

Closed
kennytm opened this issue Aug 31, 2017 · 4 comments

Comments

@kennytm
Member

kennytm commented Aug 31, 2017

First seen on 2017 Aug 31st in:

Symptom:

The job stopped midway and produced no output for 30 minutes, with no apparent reason. The end of the log looks like:

[00:40:25]    Compiling rustc_llvm v0.0.0 (file:///Users/travis/build/rust-lang/rust/src/librustc_llvm)
[00:40:31]    Compiling flate2 v0.2.19
[00:40:34]    Compiling rustc_errors v0.0.0 (file:///Users/travis/build/rust-lang/rust/src/librustc_errors)



No output has been received in the last 30m0s, this potentially indicates a stalled build or something wrong with the build itself.
Check the details on how to adjust your build configuration on: https://docs.travis-ci.com/user/common-build-problems/#Build-times-out-because-no-output-was-received

The build has been terminated

Travis and MacStadium both reported green status on 2017 Aug 31st. Not sure whether this is an unreported upstream issue or something else.
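For anyone triaging similar logs, here is a minimal sketch of how the stall could be confirmed from a saved log. It assumes the `[HH:MM:SS]` prefixes are elapsed-time stamps and takes the 30-minute limit from the message above. The function names are hypothetical, and this is not Travis CI's actual implementation.

```python
import re
from datetime import timedelta

# Matches the "[HH:MM:SS]" prefix seen in the log excerpt (assumed to be elapsed time).
STAMP = re.compile(r"^\[(\d{2}):(\d{2}):(\d{2})\]")


def last_output_time(log_lines):
    """Return the stamp of the last stamped line as a timedelta, or None."""
    last = None
    for line in log_lines:
        match = STAMP.match(line)
        if match:
            hours, minutes, seconds = map(int, match.groups())
            last = timedelta(hours=hours, minutes=minutes, seconds=seconds)
    return last


def is_stalled(log_lines, now, limit=timedelta(minutes=30)):
    """True if the last stamped output is at least `limit` older than `now`."""
    last = last_output_time(log_lines)
    return last is not None and now - last >= limit
```

With the excerpt above, the last stamp is 00:40:34, so under these assumptions the job would count as stalled from 01:10:34 onward.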

@ssbarnea

ssbarnea commented Sep 1, 2017

Well, I was hit by the same issue using brew upgrade .... failed on rust 1.20 ... seems like a recurring problem with rust.

This was referenced Sep 1, 2017
@alexcrichton
Member

I talked with Travis yesterday and got:

both your mac pros are healthy, one has two VMs and the other has three
both our networking and SAN stuff is hitting some limits, so it may be one of those, unfortunately.

which makes me think it's probably upstream issues and we don't really have a way to work around :(

@kennytm
Member Author

kennytm commented Sep 5, 2017

Not sure if Travis CI is aware of this, but the problem is very likely in the connection to the log collector or RabbitMQ, not the VMs themselves (yeah not something we can fix).

I'm currently monitoring https://travis-ci.org/rust-lang/rust/jobs/272020017. Logs are written out if you click "Follow log" in the web interface, but when you open the page again the logs are all gone.

[Screenshot: 2017-09-05 20:30:20]


🤔 The failing jobs are always the check jobs, not the dist jobs. The difference between them is xcode8.2 vs xcode7.3. I wonder if this can be fixed by upgrading the check jobs to xcode8.3 or xcode9. (I don't think we want to downgrade to xcode7.3.)

bors added a commit that referenced this issue Sep 5, 2017
[WIP] [DO NOT MERGE] Change the Travis CI macOS Xcode version to 9.0

Just to experiment if it can make #44221 appear less often.
bors added a commit that referenced this issue Sep 6, 2017
Upgrade the Travis CI macOS Xcode version to 8.3

Just to experiment if it can make #44221 appear less often.
@kennytm
Member Author

kennytm commented Sep 6, 2017

Incident notice: https://www.traviscistatus.com/incidents/2f0443bbphld

We’re investigating an increased rate of internal restarts of macOS builds, resulting in longer boot times for both public and private repositories. This has resulted in an increased backlog for macOS builds at travis-ci.org

Sep 6, 09:00 UTC


Edit: Upstream acknowledged the logging issue:

In addition to longer boot times, users are experiencing an increase in errored builds due to log timeouts when running macOS builds. We are investigating networking issues and will update again as soon as we know more.

Sep 06, 2017 - 10:45 UTC
