DnsGetHostEntry_LocalHost_ReturnsFqdnAndLoopbackIPs failed in CI #34317

Closed
jaredpar opened this issue Mar 31, 2020 · 15 comments · Fixed by #35399
Labels: area-System.Net; disabled-test (The test is disabled in source code against the issue); os-linux (Linux OS, any supported distro); test-run-core (Test failures in .NET Core test runs)

Comments

@jaredpar
Member

Console Log Summary

    System.Net.NameResolution.Tests.GetHostEntryTest.DnsGetHostEntry_LocalHost_ReturnsFqdnAndLoopbackIPs(mode: 0) [FAIL]
      Assert.All() Failure: 2 out of 4 items in the collection did not pass.
      [3]: Item: fe80::20d:3aff:fe5b:2be9%2
           Xunit.Sdk.TrueException: Not a loopback address: fe80::20d:3aff:fe5b:2be9%2
           Expected: True
           Actual:   False
              at Xunit.Assert.True(Nullable`1 condition, String userMessage) in C:\Dev\xunit\xunit\src\xunit.assert\Asserts\BooleanAsserts.cs:line 95
              at Xunit.Assert.True(Boolean condition, String userMessage) in C:\Dev\xunit\xunit\src\xunit.assert\Asserts\BooleanAsserts.cs:line 83
              at System.Net.NameResolution.Tests.GetHostEntryTest.<>c.<DnsGetHostEntry_LocalHost_ReturnsFqdnAndLoopbackIPs>b__13_0(IPAddress addr) in /_/src/libraries/System.Net.NameResolution/tests/FunctionalTests/GetHostEntryTest.cs:line 205
              at Xunit.Assert.All[T](IEnumerable`1 collection, Action`1 action) in C:\Dev\xunit\xunit\src\xunit.assert\Asserts\CollectionAsserts.cs:line 36
      [2]: Item: 10.0.0.22
           Xunit.Sdk.TrueException: Not a loopback address: 10.0.0.22
           Expected: True
           Actual:   False
              at Xunit.Assert.True(Nullable`1 condition, String userMessage) in C:\Dev\xunit\xunit\src\xunit.assert\Asserts\BooleanAsserts.cs:line 95
              at Xunit.Assert.True(Boolean condition, String userMessage) in C:\Dev\xunit\xunit\src\xunit.assert\Asserts\BooleanAsserts.cs:line 83
              at System.Net.NameResolution.Tests.GetHostEntryTest.<>c.<DnsGetHostEntry_LocalHost_ReturnsFqdnAndLoopbackIPs>b__13_0(IPAddress addr) in /_/src/libraries/System.Net.NameResolution/tests/FunctionalTests/GetHostEntryTest.cs:line 205
              at Xunit.Assert.All[T](IEnumerable`1 collection, Action`1 action) in C:\Dev\xunit\xunit\src\xunit.assert\Asserts\CollectionAsserts.cs:line 36
      Stack Trace:
        /_/src/libraries/System.Net.NameResolution/tests/FunctionalTests/GetHostEntryTest.cs(205,0): at System.Net.NameResolution.Tests.GetHostEntryTest.DnsGetHostEntry_LocalHost_ReturnsFqdnAndLoopbackIPs(Int32 mode)
        --- End of stack trace from previous location ---
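
The per-item check behind this failure can be reproduced outside the test suite. The following is a minimal Python sketch, not the test's actual C# code: `non_loopback` is a hypothetical helper name, the two failing addresses are copied from the log above, and the two loopback entries are assumed because the log does not show the items that passed.

```python
import ipaddress


def non_loopback(addresses):
    """Return the addresses that are not loopback, mirroring a per-item loopback assertion."""
    failures = []
    for text in addresses:
        # Drop an IPv6 zone suffix such as "%2" before parsing.
        addr = ipaddress.ip_address(text.split("%", 1)[0])
        if not addr.is_loopback:
            failures.append(text)
    return failures


# The last two items come from the log; the first two are assumed stand-ins
# for the entries that passed the assertion.
resolved = ["127.0.0.1", "::1", "10.0.0.22", "fe80::20d:3aff:fe5b:2be9%2"]
print(non_loopback(resolved))
```

Run as written, this prints the two non-loopback entries, which are the same two items the assertion rejected: a private IPv4 address and a link-local IPv6 address.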

Builds

| Build | Pull Request | Test Failure Count |
| --- | --- | --- |
| #580674 | Rolling | 1 |

Configurations

  • netcoreapp5.0-Linux-Release-x64-CoreCLR_release-SLES.12.Amd64.Open

Helix Logs

| Build | Pull Request | Console | Core | Test Results | Run Client |
| --- | --- | --- | --- | --- | --- |
| #580674 | Rolling | console.log | | testResults.xml | run_client.py |

I've only seen one failure so far, but it's suspicious that this showed up in CI.

@Dotnet-GitSync-Bot Dotnet-GitSync-Bot added area-System.Net untriaged New issue has not been triaged by the area owner labels Mar 31, 2020
@jaredpar
Member Author

jaredpar commented Apr 2, 2020

This is now regularly blocking CI @karelz

Builds

| Build | Pull Request | Test Failure Count | Date |
| --- | --- | --- | --- |
| #584646 | Rolling | 1 | 2020/4/1 |
| #584991 | Rolling | 1 | 2020/4/1 |
| #585184 | Rolling | 2 | 2020/4/2 |

Configurations

  • netcoreapp5.0-Linux-Release-x64-CoreCLR_release-SLES.12.Amd64.Open
  • netcoreapp5.0-Linux-Release-x64-Mono_release-SLES.12.Amd64.Open

Helix Logs

| Build | Pull Request | Console | Core | Test Results | Run Client |
| --- | --- | --- | --- | --- | --- |
| #584646 | Rolling | console.log | | testResults.xml | run_client.py |
| #584991 | Rolling | console.log | | testResults.xml | run_client.py |
| #585184 | Rolling | console.log | | testResults.xml | run_client.py |
| #585184 | Rolling | console.log | | testResults.xml | run_client.py |

runfo tests -d runtime -c 100 -pr -n System.Net.NameResolution.Functional.Tests -m

@jaredpar jaredpar added the blocking-clean-ci Blocking PR or rolling runs of 'runtime' or 'runtime-extra-platforms' label Apr 2, 2020
@karelz
Member

karelz commented Apr 3, 2020

@MihaZupan can you please take a look at this new failure?
Kusto database mining shows the first and only failure 2 days ago (3/31). Disabling the test to unblock CI may be the best first response.
cc @alnikola

@karelz karelz added this to the 5.0 milestone Apr 3, 2020
@karelz karelz added test-run-core Test failures in .NET Core test runs os-linux Linux OS (any supported distro) and removed untriaged New issue has not been triaged by the area owner labels Apr 3, 2020
@jaredpar
Member Author

jaredpar commented Apr 3, 2020

Builds

| Build | Pull Request | Test Failure Count |
| --- | --- | --- |
| #585813 | Rolling | 1 |
| #586664 | Rolling | 1 |
| #587083 | Rolling | 1 |

Configurations

  • netcoreapp5.0-Linux-Release-x64-CoreCLR_release-SLES.12.Amd64.Open
  • netcoreapp5.0-Linux-Release-x64-Mono_release-SLES.12.Amd64.Open

Helix Logs

| Build | Pull Request | Console | Core | Test Results | Run Client |
| --- | --- | --- | --- | --- | --- |
| #585813 | Rolling | console.log | | testResults.xml | run_client.py |
| #586664 | Rolling | console.log | | testResults.xml | run_client.py |
| #587083 | Rolling | console.log | | testResults.xml | run_client.py |

@jaredpar
Member Author

jaredpar commented Apr 3, 2020

This is causing roughly a 3% failure rate at this point. Do we have an ETA for when this will be disabled?

@karelz
Member

karelz commented Apr 3, 2020

Sorry about that, I thought @MihaZupan had a chance to do it overnight.
PR is up - see #34527

@MihaZupan
Member

From the console logs here, I am seeing the following tests failing:

DnsGetHostEntry_LocalHost_ReturnsFqdnAndLoopbackIPs
DnsObsoleteGetHostByName_EmptyString_ReturnsHostName
DnsObsoleteBeginEndGetHostByName_EmptyString_ReturnsHostName
Dns_GetHostEntry_HostString_Ok
Dns_GetHostEntryAsync_HostString_Ok

Which looks like the 4 mentioned in #1488 + now DnsGetHostEntry_LocalHost_ReturnsFqdnAndLoopbackIPs

@davidsh davidsh added the disabled-test The test is disabled in source code against the issue label Apr 4, 2020
@jkotas jkotas removed the blocking-clean-ci Blocking PR or rolling runs of 'runtime' or 'runtime-extra-platforms' label Apr 5, 2020
@wfurt
Member

wfurt commented Apr 7, 2020

This may be a misconfigured machine. Also, SLES12 uses systemd, and that resolver can synthesize a response. I think we should collect more system info on failure. If we have helper diag functions, they may be helpful for other DNS test failures.
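
As an illustration of the kind of diagnostic helper suggested here, the following is a rough Python sketch rather than anything that exists in the test suite. The function name `collect_dns_diagnostics`, the dictionary layout, and the choice to read `/etc/hosts` are all assumptions made for the example.

```python
import socket


def collect_dns_diagnostics(names=("localhost",)):
    """Gather host-name and resolver details worth logging when a DNS test fails."""
    info = {"hostname": socket.gethostname(), "resolved": {}, "hosts_file": None}

    # Resolve the machine's own name plus any extra names of interest.
    for name in (info["hostname"], *names):
        try:
            results = socket.getaddrinfo(name, None)
            # Each result's sockaddr starts with the address string.
            info["resolved"][name] = sorted({r[4][0] for r in results})
        except (OSError, UnicodeError) as exc:
            info["resolved"][name] = f"lookup failed: {exc}"

    # The hosts file often explains unexpected answers for local names.
    try:
        with open("/etc/hosts", encoding="utf-8") as f:
            info["hosts_file"] = f.read()
    except OSError as exc:
        info["hosts_file"] = f"unreadable: {exc}"

    return info


if __name__ == "__main__":
    for key, value in collect_dns_diagnostics().items():
        print(f"{key}: {value}")
```

Dumping this output alongside a failing assertion would show at a glance whether the agent's own name and `localhost` resolve to the same set of addresses.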

@alnikola alnikola assigned alnikola and unassigned MihaZupan Apr 23, 2020
@alnikola
Contributor

@wfurt I think you are right because the test always fails when it's run on an sles.12.amd64.open agent with machine name 'localhost' which looks weird.
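
A guard for the condition described here could look like the sketch below. It is only an illustration in Python: `agent_name_is_suspicious` is a hypothetical name, and treating names under `.localhost` as suspicious is an extra assumption based on that name being reserved for loopback use.

```python
import socket


def agent_name_is_suspicious(hostname):
    """Flag a machine name that collides with the reserved loopback name."""
    # Normalize case and drop surrounding whitespace and a trailing root dot.
    name = hostname.strip().rstrip(".").lower()
    return name == "localhost" or name.endswith(".localhost")


if __name__ == "__main__":
    # Report whether the current machine would trip the check.
    print(agent_name_is_suspicious(socket.gethostname()))
```

A test run could log a warning, or skip host-name-dependent assertions, when this returns true for the current agent.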

@alnikola
Contributor

The failures were caused by a Helix infra issue which was resolved yesterday.

@karelz
Member

karelz commented Apr 23, 2020

@alnikola did we re-enable the tests?
Was it the misconfiguration with localhost?

@karelz
Member

karelz commented Apr 23, 2020

Reopening to re-enable the tests ...

[ActiveIssue("https://github.com/dotnet/runtime/issues/34317")]

@karelz karelz reopened this Apr 23, 2020
@jaredpar
Member Author

@alnikola

The failures were caused by a Helix infra issue which was resolved yesterday.

What Helix issue was this? I'm looking for an issue in core-eng or arcade to link to. If they didn't create one for this problem, we should push them to do so.

Issues are the primary way we track reliability between our services. If there is a bug in Helix, Azure, etc ... that impacted our reliability we should push to make sure that there is an issue tracking that. Always feel free to include me in the convo to help with this if needed.

@alnikola
Contributor

The issue was closed without enabling the affected test by mistake. Will do it shortly.

alnikola added a commit that referenced this issue Apr 24, 2020
The test is re-enabled because the failures were caused by a Helix infra issue (a misconfigured agent), which was fixed a couple of days ago.

Fixes #34317
@jaredpar
Member Author

@alnikola

Where is the Helix issue that describes the bug that they fixed? That is what I'm interested in. If they're not filing bugs then it's issues we're not tracking. That means we can't track improvements.

@alnikola
Contributor

alnikola commented Apr 27, 2020

@jaredpar I reported the issue with a strange agent name ('localhost') to the engineering services team, and they said it's a known issue which has already been fixed. So, I don't have a link. Will ping you offline for the details.

@ghost ghost locked as resolved and limited conversation to collaborators Dec 9, 2020