outerloop-nightly is broken because "native test components" is unavailable #85263

Closed

kunalspathak opened this issue Apr 24, 2023 · 9 comments

@kunalspathak (Member)
It seems that we skip uploading the "native test components" artifact, and because of that all the test jobs in outerloop-nightly fail: they try to download those artifacts and don't find them. The conditions for skipping the upload seem legitimate, so I'm not sure whether we should also skip downloading them under the same condition check (a sketch of what that gating might look like follows the screenshots below).

```yaml
- ${{ if and(ne(parameters.compilerName, 'gcc'), ne(parameters.testGroup, ''), ne(parameters.testGroup, 'clrTools'), ne(parameters.disableClrTest, true)) }}:
  # Publish test native components for consumption by test execution.
  - ${{ if and(ne(parameters.isOfficialBuild, true), eq(parameters.pgoType, '')) }}:
    - template: /eng/pipelines/common/upload-artifact-step.yml
      parameters:
        rootFolder: $(nativeTestArtifactRootFolderPath)
        includeRootFolder: false
        archiveType: $(archiveType)
        tarCompression: $(tarCompression)
        archiveExtension: $(archiveExtension)
        artifactName: $(nativeTestArtifactName)
        displayName: 'native test components'
```

runtime pipeline

[screenshot]

vs.

outerloop-nightly pipeline

[screenshot]

and the download job for native test components keeps failing.

[screenshot]
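
A minimal sketch of what gating the download on the same conditions might look like, assuming a download step template that mirrors the upload one (the template path and parameter names below are assumptions for illustration, not a verbatim copy from the repo):

```yaml
# Hypothetical: skip downloading the native test components under the same
# conditions that skip publishing them, so test jobs don't request an
# artifact that was never uploaded.
- ${{ if and(ne(parameters.isOfficialBuild, true), eq(parameters.pgoType, '')) }}:
  - template: /eng/pipelines/common/download-artifact-step.yml  # assumed path
    parameters:
      unpackFolder: $(nativeTestArtifactRootFolderPath)
      artifactFileName: '$(nativeTestArtifactName)$(archiveExtension)'
      artifactName: $(nativeTestArtifactName)
      displayName: 'native test components'
```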

@dotnet-issue-labeler bot added label: needs-area-label (Apr 24, 2023)
@ghost added label: untriaged (Apr 24, 2023)
@kunalspathak added label: area-Infrastructure; removed labels: untriaged, needs-area-label (Apr 24, 2023)
@ghost commented Apr 24, 2023

Tagging subscribers to this area: @dotnet/runtime-infrastructure
See info in area-owners.md if you want to be subscribed.

@kunalspathak added label: blocking-outerloop (Apr 24, 2023)
@trylek (Member) commented Apr 24, 2023

It seems to me that the problem is that we're not setting the testGroup parameter when calling build-coreclr-and-libraries-job, so the condition ne(parameters.testGroup, '') doesn't hold. Other than that, I'm somewhat surprised that we have two different flavors of coreclr outerloop runs:

https://dev.azure.com/dnceng-public/public/_build?definitionId=135&_a=summary

vs.

https://dev.azure.com/dnceng-public/public/_build?definitionId=108&_a=summary

In particular, I believe that in our Monday backlog sync-ups JeffSchw looks at the latter, so if the intent was to improve outerloop coverage by implementing some checked vs. release build variations, it would seem logical to me to add them to the existing outerloop run instead of creating a new one.

By the way, even though it doesn't seem to be the root problem here, the build-coreclr-and-libraries-job pipeline template doesn't pass the isOfficialBuild parameter on to build-job, so the value passed to the template at line 31 is silently dropped on the floor.
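
A minimal sketch of that pass-through fix, with the surrounding parameters elided (treat the layout as illustrative rather than a verbatim diff):

```yaml
# In build-coreclr-and-libraries-job.yml, forward the parameter to build-job
# instead of silently dropping it:
- template: /eng/pipelines/coreclr/templates/build-job.yml
  parameters:
    # ... existing parameters ...
    isOfficialBuild: ${{ parameters.isOfficialBuild }}
```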

@kunalspathak (Member, Author)

> In particular, I believe that in our Monday backlog sync-ups JeffSchw looks at the latter, so if the intent was to improve outerloop coverage by implementing some checked vs. release build variations, it would seem logical to me to add them to the existing outerloop run instead of creating a new one.

Agree.

@trylek (Member) commented Apr 24, 2023

I have looked at the GitHub history, and it looks like the coreclr-release-outerloop-nightly pipeline was either created or ported as part of our repo merge in early 2020. I must admit I have no detailed memory of the pipeline, despite apparently having contributed one or two fixes to it. While I believe it should be possible to fix the pipeline by adding the testGroup jobParameter when calling build-coreclr-and-libraries-job, longer term I would find it preferable to fold this pipeline into the runtime-coreclr outerloop pipeline, provided the various infra stakeholders agree (unless there's some crucial counterargument I'm missing right now).
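
A rough sketch of that testGroup fix, assuming the nightly pipeline invokes the job template through the common platform-matrix wrapper (the invoking file and the other jobParameters are elided and illustrative):

```yaml
# In the outerloop-nightly pipeline definition (illustrative invocation):
- template: /eng/pipelines/common/platform-matrix.yml
  parameters:
    jobTemplate: /eng/pipelines/common/build-coreclr-and-libraries-job.yml  # path illustrative
    # ...
    jobParameters:
      # A non-empty testGroup makes ne(parameters.testGroup, '') hold, so the
      # "native test components" artifact is zipped and published again.
      testGroup: outerloop
```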

@trylek (Member) commented Apr 24, 2023

I have triggered an experimental run with my proposed fix for the pipeline:

https://dev.azure.com/dnceng-public/public/_build/results?buildId=250522&view=logs&j=af986455-df3a-5ed6-b067-9144b6c8d0c6

It's only just started, but I do see the "Zip / publish native test components" steps in the CoreCLR build jobs. I can send it out as a PR if the test finishes fine; we can then have a follow-up discussion about potentially consolidating it with the other outerloop pipeline.

@kunalspathak (Member, Author)

> I can send it out as a PR if the test finishes fine

Sounds good to me.

@trylek (Member) commented Apr 24, 2023

OK, so it's far from green (I haven't found any previous green run of the pipeline), but at least it's now actually sending the tests to Helix, and only a handful of them are failing; a few test execution jobs have passed. I'm going to go ahead and publish the PR now as general goodness. Some of the test failures look genuinely interesting in terms of test coverage; we may have been missing them previously because the runtime-coreclr outerloop pipeline uses checked builds.

@kunalspathak (Member, Author)

> OK, so it's far from green

@TIHan FYI. Please consider checking the unique failures and opening issues as appropriate.

@hoyosjs removed label: blocking-outerloop (Apr 25, 2023)
@ghost locked the issue as resolved and limited conversation to collaborators (Jul 5, 2023)