Setup OSX.1100.ARM64.* helix queues #47919

Merged (9 commits) on Mar 2, 2021

sdmaclea
Contributor

@sdmaclea sdmaclea commented Feb 5, 2021

/cc @mangod9

@sdmaclea sdmaclea added this to the 6.0.0 milestone Feb 5, 2021
@sdmaclea sdmaclea self-assigned this Feb 5, 2021
@ghost

ghost commented Feb 5, 2021

Tagging subscribers to this area: @ViktorHofer
See info in area-owners.md if you want to be subscribed.

Issue Details

/cc @mangod9

Author: sdmaclea
Assignees: sdmaclea
Labels:

area-Infrastructure

Milestone: 6.0.0

@sdmaclea
Contributor Author

sdmaclea commented Feb 5, 2021

The current version should enable osx-arm64 tests for CI runs. Is there a way to trigger them before merging? Or should this be done in an infra branch?

@safern
Member

safern commented Feb 5, 2021

The current version should enable osx-arm64 tests for CI runs. Is there a way to trigger them before merging? Or should this be done in an infra branch?

You can queue a manual build from your branch.

@sdmaclea
Contributor Author

sdmaclea commented Feb 5, 2021

Helix M1 queues don't have any machines yet

OK I'll hold off testing till the queues are populated.

However, in the runtime pipeline we can't merge with any failures, because we track that pipeline's pass rate.

I can disable the unstable tests before merging.

@mangod9
Member

mangod9 commented Feb 23, 2021

Hey @sdmaclea, the queues should be hydrated now, could you please try running tests on them?

@steveisok
Member

@mangod9
Member

mangod9 commented Feb 24, 2021

Thanks @steveisok for the info. Is anyone looking into the python script issues? I do see a mention of script issues being worked on here: https://github.com/dotnet/core-eng/issues/11937#issuecomment-780183832. @parose1 is this something you are still looking into?

@parose1

parose1 commented Feb 24, 2021

@mangod9 The python issue I was looking into was blocking the helix script from completing due to a conflict with the python version homebrew installs. This prevented the machines from being added to helix. The solution was to just uninstall the homebrew python and let the helix script install it instead, which allowed the script to complete.
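
One way to see which python3 a machine would pick up before running a setup script is to resolve the first python3 on PATH and check whether it lives under a Homebrew prefix. The following is a minimal sketch and not part of the helix script. It assumes Homebrew's default prefixes (/opt/homebrew on Apple Silicon, /usr/local on Intel), and the function names are hypothetical.

```python
import os
import shutil
from typing import Optional

# Assumption: default Homebrew locations. A custom Homebrew prefix would
# not be detected by this heuristic.
HOMEBREW_PREFIXES = ("/opt/homebrew/", "/usr/local/Cellar/", "/usr/local/opt/")


def looks_like_homebrew(resolved_path: str) -> bool:
    """Heuristic: True if a fully resolved path sits under a Homebrew prefix."""
    return resolved_path.startswith(HOMEBREW_PREFIXES)


def python3_on_path() -> Optional[str]:
    """Return the symlink-resolved path of the first python3 on PATH, or None."""
    found = shutil.which("python3")
    return os.path.realpath(found) if found else None


if __name__ == "__main__":
    path = python3_on_path()
    if path is None:
        print("python3 not found on PATH")
    else:
        origin = "Homebrew" if looks_like_homebrew(path) else "not Homebrew"
        print(f"{path}: {origin} (heuristic)")
```

Resolving symlinks first matters because /usr/local/bin/python3 can point either into a Homebrew Cellar or into another installation.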

@sdmaclea
Contributor Author

/azp run runtime-coreclr jitstress

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@sdmaclea
Contributor Author

@parose1 Can you provide more info? What version of python is installed on the machines now? The initial plan was to have the universal binary version of python 3.9.1 installed. Is that still installed and on the default PATH?

@parose1

parose1 commented Feb 24, 2021

@sdmaclea Running "python3 --version" I see python 3.9.1 is installed on the machines. The only difference is that the helix script installs it instead of homebrew. @ulisesh made the suggestion to uninstall the homebrew version and let the helix script install to get me unblocked, so I could add these into the helix queue, so he may know more on the reasoning as to why the homebrew version was causing issues with the helix script.

@ulisesh
Contributor

ulisesh commented Feb 24, 2021

@parose1 Can you provide more info? What version of python is installed on the machines now? The initial plan was to have the universal binary version of python 3.9.1 installed. Is that still installed and on the default PATH?

The default python in the PATH has Intel binaries. We had problems using Python with universal binaries because some packages (like cryptography) don't support M1.

@sdmaclea can you please help us understand how having Intel based Python affects your scenario?

Talking about https://helixre107v0xdeko0k025g8.blob.core.windows.net/dotnet-runtime-refs-pull-47864-merge-39d501d2b0984d9ea6/System.Buffers.Tests/console.fba4d13d.log?sv=2019-07-07&se=2021-03-15T18%3A41%3A56Z&sr=c&sp=rl&sig=Vl6KbD0L0CdGQOe5CJ%2F42cXOHHO8n090lOIm9ZVaVAw%3D it looks like the test reporter is having problems running on Python 3.9, and the recommendation here is to upgrade setuptools. @parose1 could you run python3 -m pip install --upgrade setuptools on all the machines?
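
Whether the python3 on PATH runs as an Intel or an arm64 process can be checked from inside the interpreter. The following is a small sketch and not something used in this PR. It relies on the standard library's platform.machine(), which on macOS is expected to report the architecture the interpreter process is running as (so an Intel build under Rosetta should report x86_64). The helper name classify_machine is hypothetical.

```python
import platform


def classify_machine(machine: str) -> str:
    """Map a platform.machine() string to a coarse architecture label."""
    normalized = machine.lower()
    if normalized in ("arm64", "aarch64"):
        return "arm64"
    if normalized in ("x86_64", "amd64"):
        return "x64"
    return "unknown"


if __name__ == "__main__":
    machine = platform.machine()
    print(f"interpreter architecture: {machine} ({classify_machine(machine)})")
```

Running this with the default python3 on a queue machine would show which slice is actually in use, which is the question raised above about Intel versus universal binaries.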

@parose1

parose1 commented Feb 24, 2021

Talked with @ulisesh on Teams and changed the command a bit to "sudo -H python3 -m pip install --no-user --upgrade setuptools".

I completed running that on all the machines in the osx.1100.arm64.open and osx.1100.arm64 helix queues. Let me know if you need anything else, thanks.
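
After an upgrade like this, the installed setuptools version can be confirmed per machine. A minimal sketch, assuming Python 3.8 or later (where importlib.metadata is in the standard library). The helper name installed_version is hypothetical, and the thread does not say which setuptools version is required.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("setuptools")
    print(f"setuptools: {version if version else 'not installed'}")
```

The script must be run with the same python3 that the test reporter uses, otherwise it reports on a different environment.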

@ulisesh
Contributor

ulisesh commented Feb 24, 2021

/azp run runtime-coreclr jitstress

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@sdmaclea
Contributor Author

/azp run runtime-coreclr outerloop

@azure-pipelines

Azure Pipelines failed to run 1 pipeline(s).

@sdmaclea
Contributor Author

/azp list

@sdmaclea
Contributor Author

/azp run runtime-coreclr jitstress

@sdmaclea
Contributor Author

/azp run runtime-coreclr jitstress

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@sdmaclea
Contributor Author

/azp run runtime-coreclr outerloop

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@sdmaclea
Contributor Author

/azp run runtime-coreclr jitstressregs

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@mangod9
Member

mangod9 commented Feb 26, 2021

Are tests actually running on osx-arm64? I am noticing this failure in the jitstress runs:

Microsoft.DotNet.XUnitConsoleRunner v2.5.0 (64-bit .NET 6.0.0-preview.2.21117.5)
  Discovering: tracing.eventcounter.XUnitWrapper
  Discovered:  tracing.eventcounter.XUnitWrapper
  Starting:    tracing.eventcounter.XUnitWrapper
Unhandled exception. System.NullReferenceException: Object reference not set to an instance of an object.
   at System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1.AsyncStateMachineBox`1.ExecutionContextCallback(Object s)
   at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
--- End of stack trace from previous location ---
   at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
   at System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1.AsyncStateMachineBox`1.MoveNext(Thread threadPoolThread)
   at System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1.AsyncStateMachineBox`1.MoveNext()
   at System.Threading.Tasks.AwaitTaskContinuation.<>c.<.cctor>b__17_0(Object state)
   at System.Threading.Tasks.AwaitTaskContinuation.RunCallback(ContextCallback callback, Object state, Task& currentTask)
--- End of stack trace from previous location ---
   at System.Threading.Tasks.Task.<>c.<ThrowAsync>b__140_1(Object state)
   at System.Threading.QueueUserWorkItemCallbackDefaultContext.Execute()
   at System.Threading.ThreadPoolWorkQueue.Dispatch()
   at System.Threading.PortableThreadPool.WorkerThread.WorkerThreadStart()
   at System.Threading.Thread.StartCallback()

@sdmaclea
Contributor Author

Yes. That test suite had run in earlier runs. It looks like a spurious failure in that leg of that particular run.

If you remove the failed+aborted filter in the test tab, you can see that all the tests ran and passed.

I looked at outerloop and some of the other jitstress legs/runs and they look fine.

I think this is close to ready to merge.

@hoyosjs @safern Any objections?

@sdmaclea
Contributor Author

I wanted to trigger a libraries-outer fullMatrix run on this before merging. The one triggered through azp run doesn't trigger the osx-arm64 legs, because it is not full matrix.

The ilasm legs have a similar issue, since the PR trigger limits them to PRs that change ilasm/ildasm source code.

@hoyosjs
Member

hoyosjs commented Feb 26, 2021

I'll take a look at this later.

@sdmaclea
Contributor Author

sdmaclea commented Feb 26, 2021

Triggered libraries outerloop here https://dev.azure.com/dnceng/public/_build/results?buildId=1014515&view=results

This one looks pretty good. There is one macOS failure. It could easily be spurious. x64 also has one failure, presumably a different one.

@sdmaclea
Contributor Author

sdmaclea commented Feb 26, 2021

I triggered the ilasm pipeline here https://dev.azure.com/dnceng/public/_build/results?buildId=1014527&view=results

This one also looks good for Apple Silicon.

@sdmaclea
Contributor Author

sdmaclea commented Feb 26, 2021

The coreclr-release-outerloop-nightly has been broken on all other platforms for a while. I didn't retrigger it. I can fix it in a separate PR (it is even possible I broke it 9 months ago...).

I didn't retrigger the various other jitstress runs. The earlier runs were very similar to each other.

Base automatically changed from master to main March 1, 2021 09:07
@sdmaclea
Contributor Author

sdmaclea commented Mar 1, 2021

I plan to merge this today.

@ViktorHofer
Member

@safern can you please take a look at the yml changes in particular?

Member

@safern safern left a comment

It seems like libraries tests are only going to be run for outerloop... do we want to add a libraries test run for this on innerloop libraries tests that run only when IsFullMatrix == true?

@@ -85,6 +85,11 @@ jobs:
# Limiting interp runs as we don't need as much coverage.
- Debian.9.Amd64.Open

# OSX arm64
- ${{ if eq(parameters.platform, 'OSX_arm64') }}:
- ${{ if eq(parameters.jobParameters.isFullMatrix, true) }}:
Member

I know we discussed this earlier, but I believe we should remove this condition for libraries and then instead condition the test runs in the yml entry point?
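
The effect of the nested conditions in the hunk above can be modeled outside the pipeline. The following is a sketch in Python rather than the actual template logic. The queue name is an assumption taken from the PR title and the queue mentioned earlier in the thread (the hunk as shown does not include the queue line), the function name is hypothetical, and the exact string comparison is a simplification of the template expression.

```python
from typing import List

# Assumption: the queue added under the isFullMatrix condition. The diff
# excerpt does not show it, so this name is illustrative only.
OSX_ARM64_QUEUE = "OSX.1100.ARM64.Open"


def select_queues(platform: str, is_full_matrix: bool) -> List[str]:
    """Model the two nested template conditions for the OSX arm64 platform."""
    queues: List[str] = []
    if platform == "OSX_arm64":      # outer: eq(parameters.platform, 'OSX_arm64')
        if is_full_matrix:           # inner: eq(parameters.jobParameters.isFullMatrix, true)
            queues.append(OSX_ARM64_QUEUE)
    return queues


if __name__ == "__main__":
    for full in (True, False):
        print(f"isFullMatrix={full}: {select_queues('OSX_arm64', full)}")
```

With the inner condition in place, a run that is not full matrix gets no queue for this platform. Removing that condition and gating the test runs at the yml entry point, as suggested above, would move the decision out of the queue setup.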

@sdmaclea
Copy link
Contributor Author

sdmaclea commented Mar 2, 2021

I plan several other PRs to gradually increase coverage. I am going to merge this so we begin getting test coverage. I'll put up a PR for library tests tomorrow.

@sdmaclea sdmaclea merged commit b5c809f into dotnet:main Mar 2, 2021
@ghost ghost locked as resolved and limited conversation to collaborators Apr 1, 2021