update fedora helix to 34 #483

Merged
merged 5 commits into from
Jul 27, 2021

Conversation

wfurt
Member

@wfurt wfurt commented Jul 26, 2021

This is similar to the existing image, with the following exceptions:

  1. The base OS is updated to Fedora 34.
    Contributes to Drop support for Fedora 32 core#6431
    (when I worked on add libmsquic to one helix image #474, I did not realize that 32 was out of support).

  2. We still don't have a package flow for the Linux msquic package.
    Until we do, I updated the image to simply rebuild the current dotnet/msquic and consume the output directly.
    This will allow us to stay on top of fixes for MsQuic.
    Contributes to [QUIC] update msquic package runtime#55746

cc: @CarnaViire @karelz

@dotnet-issue-labeler

I couldn't figure out the best area label to add to this PR. If you have write-permissions please help me learn by adding exactly one area label.

Member

@mthalman mthalman left a comment

It looks like there is a build failure that needs to be investigated too.

src/fedora/34/amd64/Dockerfile (outdated)
src/fedora/34/helix/amd64/Dockerfile (outdated)
src/fedora/34/helix/amd64/Dockerfile (outdated)
@wfurt
Member Author

wfurt commented Jul 26, 2021

I see the OpenSSL warning in my local test runs, @mthalman, but the image is created successfully. Is there some extra check in the CI runs? The failure is in OpenSSL.

@mthalman
Member

Oh, I see. I thought it was just failing the Docker build but I see that it succeeds. It looks like the AzDO pipeline itself is interpreting the error output as a failure for some reason. That's strange. I'll have to dig into it.

@wfurt
Member Author

wfurt commented Jul 26, 2021

Thanks. I can hide the error if needed. Should I give it a try?
This step is hopefully temporary until we get the msquic package flowing (blocked on signing and publishing at the moment).

@mthalman
Member

Thanks. I can hide the error if needed. Should I give it a try?

Yeah, let's try that first.

@mthalman
Member

I see your changes fixed the issue but the build for Fedora Rawhide is failing as well. This looks to be an external issue.

Step 1/4 : FROM registry.fedoraproject.org/fedora:rawhide
 ---> 887689ee223e
Step 2/4 : RUN dnf install -y         clang         cmake         findutils         gdb         glibc-langpack-en         lldb-devel         llvm-devel         make         python         which     && dnf clean all
 ---> Running in 8ade5d70dc00
Fedora rawhide openh264 (From Cisco) - x86_64   0.0  B/s |   0  B     00:00    
Errors during downloading metadata for repository 'fedora-cisco-openh264':
  - Curl error (6): Couldn't resolve host name for https://mirrors.fedoraproject.org/metalink?repo=fedora-cisco-openh264-rawhide&arch=x86_64 [getaddrinfo() thread failed to start]
Error: Failed to download metadata for repo 'fedora-cisco-openh264': Cannot prepare internal mirrorlist: Curl error (6): Couldn't resolve host name for https://mirrors.fedoraproject.org/metalink?repo=fedora-cisco-openh264-rawhide&arch=x86_64 [getaddrinfo() thread failed to start]
Fedora - Rawhide - Developmental packages for t 0.0  B/s |   0  B     00:00    
Errors during downloading metadata for repository 'rawhide':
  - Curl error (6): Couldn't resolve host name for https://mirrors.fedoraproject.org/metalink?repo=rawhide&arch=x86_64&countme=1 [getaddrinfo() thread failed to start]
  - Curl error (6): Couldn't resolve host name for https://mirrors.fedoraproject.org/metalink?repo=rawhide&arch=x86_64 [getaddrinfo() thread failed to start]
Error: Failed to download metadata for repo 'rawhide': Cannot prepare internal mirrorlist: Curl error (6): Couldn't resolve host name for https://mirrors.fedoraproject.org/metalink?repo=rawhide&arch=x86_64 [getaddrinfo() thread failed to start]

@omajid - Any ideas why this is happening? Looks to be doing a pretty basic dnf install.

@wfurt
Member Author

wfurt commented Jul 26, 2021

Yes, the problem was passing -ci to the embedded build. With that, msbuild would emit something like

##vso[task.complete result=Failed;]msbuild execution failed.

(and other ##vso tags)
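To illustrate why a successful Docker build could still be marked as failed: Azure Pipelines treats lines of the form `##vso[...]` in a step's output as logging commands, so a nested msbuild that prints them can change the result of the outer step. A minimal sketch for spotting or dropping such lines in captured output follows. The helper names are hypothetical, and matching only lines that begin with `##vso[` (after optional whitespace) is an assumption of this sketch, not a statement about how the agent matches them.

```python
import re

# Azure Pipelines logging commands have the shape "##vso[area.action key=value;]message".
# Assumption: only lines that start with the marker (after optional whitespace) matter here.
VSO_COMMAND = re.compile(r"^\s*##vso\[[^\]]*\]")


def find_logging_commands(output: str) -> list[str]:
    """Return the lines of `output` that look like ##vso logging commands."""
    return [line for line in output.splitlines() if VSO_COMMAND.match(line)]


def strip_logging_commands(output: str) -> str:
    """Return `output` with every ##vso logging-command line removed."""
    return "\n".join(
        line for line in output.splitlines() if not VSO_COMMAND.match(line)
    )
```

Running `find_logging_commands` over a local build log is one way to check whether an embedded build would emit such commands before the image build reaches CI.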

I would expect that Rawhide would not be impacted by my change, unless I'm missing something.
"Couldn't resolve host name" looks suspicious, and I wonder whether it is related to the name server it points to. But I don't know why that would be any different for this particular image.

@wfurt
Member Author

wfurt commented Jul 26, 2021

BTW, I'm not sure we even need the fedora-cisco-openh264 repository. We don't seem to do any media stuff.

@wfurt
Member Author

wfurt commented Jul 26, 2021

It may be some infrastructure issue. It builds for me locally:

pwsh build.ps1 -DockerfilePath  *fedora/rawhide*
...

Complete!
22 files removed
Removing intermediate container 630ae006f21e
 ---> 6e35c355d162
Successfully built 6e35c355d162
Successfully tagged mcr.microsoft.com/dotnet-buildtools/prereqs:fedora-rawhide-20210726211709-031e7d2
-- EXECUTION ELAPSED TIME: 00:09:40.8047626

IMAGES BUILT
------------
mcr.microsoft.com/dotnet-buildtools/prereqs:fedora-rawhide-20210726211709-031e7d2

@mthalman
Member

Interesting. The build agent can't even ping mirrors.fedoraproject.org.

Script contents:
ping -c 1 mirrors.fedoraproject.org
========================== Starting Command Output ===========================
/usr/bin/bash --noprofile --norc /home/vsts/work/_temp/74aee205-b228-411b-95e1-2743194a2990.sh
PING wildcard.fedoraproject.org (152.19.134.198) 56(84) bytes of data.

--- wildcard.fedoraproject.org ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms
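The two logs show different symptoms: inside the container dnf could not resolve the host name at all, while on the agent ping did resolve it (to 152.19.134.198) and then lost every packet. ICMP is sometimes filtered on hosted machines, so packet loss from ping alone may not prove the host is unreachable. A minimal sketch that separates a name-resolution failure from a TCP connectivity failure is below. The function names, the default port 443 and the timeout are assumptions chosen for illustration.

```python
import socket


def can_resolve(host: str, port: int = 443) -> bool:
    """Return True if the local resolver yields at least one address for `host`."""
    try:
        return len(socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)) > 0
    except socket.gaierror:
        return False


def can_connect(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def diagnose(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Classify reachability as 'dns-failure', 'unreachable' or 'ok'."""
    if not can_resolve(host, port):
        return "dns-failure"
    if not can_connect(host, port, timeout):
        return "unreachable"
    return "ok"
```

Calling `diagnose("mirrors.fedoraproject.org")` both on the build agent and inside the container would show whether the two environments fail at the same stage.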

@mthalman
Member

@wfurt - For now, go ahead and comment out this section of the manifest to prevent Rawhide from building:

{
  "platforms": [{
    "dockerfile": "src/fedora/rawhide/amd64",
    "os": "linux",
    "osVersion": "fedora-rawhide",
    "tags": {
      "fedora-rawhide-$(System:TimeStamp)-$(System:DockerfileGitCommitSha)": {}
    }
  }]
},
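The suggestion here is to edit the manifest by hand. As an alternative for local experiments only, the same exclusion can be sketched as a filter over already-parsed image entries shaped like the fragment above. This is not what the PR did. The function name is hypothetical, and the assumption that entries are plain dictionaries with a "platforms" list comes only from the fragment shown.

```python
def drop_platforms(images: list[dict], dockerfile_prefix: str) -> list[dict]:
    """Return a copy of `images` without platforms whose "dockerfile" path
    starts with `dockerfile_prefix`; images left with no platforms are dropped."""
    kept = []
    for image in images:
        platforms = [
            platform
            for platform in image.get("platforms", [])
            if not platform.get("dockerfile", "").startswith(dockerfile_prefix)
        ]
        if platforms:
            # Copy the entry so the caller's data is not modified in place.
            kept.append({**image, "platforms": platforms})
    return kept
```

For example, `drop_platforms(images, "src/fedora/rawhide")` would leave every entry that does not build from a Rawhide Dockerfile.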

@mthalman
Member

I've logged #484 for the Rawhide build issue.

libuuid-devel \
lttng-ust-devel \
openssl-devel \
uuid-devel \
Member

@omajid omajid Jul 26, 2021

It's surprising that we need both libuuid-devel and uuid-devel. Is that intentional?

Member Author

I don't know. This is a straight copy from the existing 32 image. Also, I don't know whether we actually use it to build CoreCLR and others. It feels like we could get away with just the Helix and runtime dependencies, but I did not want to fiddle with everything.
I really wanted an updated Quic. The upgrade to Fedora 34 is opportunistic.


# This is temporary until we have flow to packages.microsoft.com

RUN dnf install -y dotnet gem lttng-tools perl-FindBin rpmdevtools ruby-devel && \
Member

FYI: this pulls down the Fedora build of .NET. Since you are adding the Microsoft packages repo to this container, I am not sure if it's intentional.

Member Author

That is OK. msquic only needs a global dotnet executable to install and run clog. I'm hoping this section will go away in a few weeks (e.g. before 6.0 ships).

Member

Okay. Now that I think about it, what it pulls down might be non-deterministic. If it pulls down a mix of packages (some from Fedora, others from Microsoft) that might not work too well. You might want to verify that the SDK that gets installed into this container is working. See https://docs.microsoft.com/en-us/dotnet/core/install/linux-package-mixup for more details/fixes.

Member Author

Seems to be fine. dotnet is removed immediately after the build, on line 32, so it should not interfere with Helix or anything else. Using the OS package seems most natural to me. I would do the same for PowerShell, but I could not find a package directly from Fedora.

@omajid
Member

omajid commented Jul 26, 2021

@omajid - Any ideas why this is happening? Looks to be doing a pretty basic dnf install.

Yeah, the TLDR is glibc is broken in Fedora 35 for some containerized environments.

https://bugzilla.redhat.com/show_bug.cgi?id=1985499 goes into a little bit more detail, but still misses the larger context. From what I can tell, glibc took a change to support certain new hardware security features (Intel CET). When such a new glibc runs in a containerized environment, it needs additional fixes in the container engine itself (the engine running the container that contains the updated glibc); otherwise glibc breaks, breaking everything else in the stack too.

The fix needs to go into docker/moby/containerd and friends.

@mthalman
Member

@omajid - Any ideas why this is happening? Looks to be doing a pretty basic dnf install.

Yeah, the TLDR is glibc is broken in Fedora 35 for some containerized environments.

https://bugzilla.redhat.com/show_bug.cgi?id=1985499 goes into a little bit more detail, but still misses the larger context. From what I can tell, glibc took a change to support certain new hardware security features (Intel CET). When such a new glibc runs in a containerized environment, it needs additional fixes in the container engine itself (the engine running the container that contains the updated glibc); otherwise glibc breaks, breaking everything else in the stack too.

The fix needs to go into docker/moby/containerd and friends.

But that doesn't explain why the build agent is unable to even ping mirrors.fedoraproject.org.

@mthalman
Member

/azp run dotnet-buildtools-prereqs-docker-fedora

@azure-pipelines

Pull request contains merge conflicts.

@wfurt wfurt merged commit bb45865 into dotnet:main Jul 27, 2021
@mthalman
Member

Even though the build failed due to infrastructure issues, the images were successfully published. The tags are:

mcr.microsoft.com/dotnet-buildtools/prereqs:fedora-34-20210727172939-5883dc8
mcr.microsoft.com/dotnet-buildtools/prereqs:fedora-34-helix-20210727172939-607b2b2
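These tags follow the shape declared in the manifest earlier in this conversation: a name, then `$(System:TimeStamp)`, then `$(System:DockerfileGitCommitSha)`. A minimal sketch for splitting such a reference into its parts is below. The function name is hypothetical, and reading the timestamp as a 14-digit `YYYYMMDDHHMMSS` value is an assumption inferred from the tags above rather than from any documented format.

```python
import re
from datetime import datetime

# Assumed tag shape: <name>-<14-digit timestamp>-<abbreviated commit sha>
TAG_PATTERN = re.compile(
    r"^(?P<name>.+)-(?P<timestamp>\d{14})-(?P<sha>[0-9a-f]{7,40})$"
)


def parse_image_reference(reference: str) -> tuple[str, str, datetime, str]:
    """Split 'repository:tag' into (repository, name, build time, commit sha)."""
    repository, _, tag = reference.rpartition(":")
    match = TAG_PATTERN.match(tag)
    if match is None:
        raise ValueError(f"unrecognized tag format: {tag!r}")
    built = datetime.strptime(match["timestamp"], "%Y%m%d%H%M%S")
    return repository, match["name"], built, match["sha"]
```

Under those assumptions, the second tag above parses to the name `fedora-34-helix`, a build time of 2021-07-27 17:29:39 and the commit `607b2b2`.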

@wfurt
Member Author

wfurt commented Jul 27, 2021

Thanks for the update, @mthalman. I'll give them a try.
