Tests failures: System.Net.NameResolution.Tests / DnsObsolete* & Dns_GetHostEntry* #1488

Open
ghost opened this issue Sep 30, 2017 · 21 comments
Labels
area-System.Net disabled-test The test is disabled in source code against the issue os-linux Linux OS (any supported distro) os-mac-os-x macOS aka OSX test-run-core Test failures in .NET Core test runs
Milestone

Comments

@ghost

ghost commented Sep 30, 2017

Type of failures

Affected tests:

  • System.Net.NameResolution.Tests.GetHostByNameTest (both always fail together)
    • DnsObsoleteGetHostByName_EmptyString_ReturnsHostName
    • DnsObsoleteBeginEndGetHostByName_EmptyString_ReturnsHostName
  • System.Net.NameResolution.Tests.GetHostEntryTest (both always fail together)
    • Dns_GetHostEntry_HostString_Ok
    • Dns_GetHostEntryAsync_HostString_Ok

Failure text:

  • OSX: Device not configured
  • Linux: No such device or address
System.Net.Internals.SocketExceptionFactory+ExtendedSocketException : Device not configured
OR
System.Net.Internals.SocketExceptionFactory+ExtendedSocketException : No such device or address

at System.Net.Dns.InternalGetHostByName(String hostName, Boolean includeIPv6) in /Users/buildagent/agent/_work/34/s/corefx/src/System.Net.NameResolution/src/System/Net/DNS.cs:line 46
at System.Net.Dns.GetHostByName(String hostName) in /Users/buildagent/agent/_work/34/s/corefx/src/System.Net.NameResolution/src/System/Net/DNS.cs:line 41
at System.Net.NameResolution.Tests.GetHostByNameTest.DnsObsoleteGetHostByName_EmptyString_ReturnsHostName() in /Users/buildagent/agent/_work/34/s/corefx/src/System.Net.NameResolution/tests/FunctionalTests/GetHostByNameTest.cs:line 108

Note:

  • Dns_GetHostEntryAsync_HostString_Ok almost always fails with the same error reported twice, wrapped in System.AggregateException.
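
When the async variant fails, the interesting error sits inside the System.AggregateException rather than at the top level. A minimal sketch of pulling the socket errors back out for triage follows; `AggregateUnwrap` and `SocketErrors` are hypothetical names used only for illustration, not part of System.Net or the test suite.

```csharp
using System;
using System.Collections.Generic;
using System.Net.Sockets;

// Hypothetical helper for triage, not part of the product or the test suite.
public static class AggregateUnwrap
{
    // Flattens nested AggregateExceptions and returns every SocketException found,
    // so the underlying error code (for example HostNotFound) can be inspected.
    public static IReadOnlyList<SocketException> SocketErrors(AggregateException ex)
    {
        var result = new List<SocketException>();
        foreach (Exception inner in ex.Flatten().InnerExceptions)
        {
            if (inner is SocketException socketException)
            {
                result.Add(socketException);
            }
        }
        return result;
    }
}
```

If the same SocketErrorCode shows up twice in the returned list, that would match the "same error twice" observation above.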

Older failure on OSX (prior to 2017/9/15)

Assert.Contains() Failure
Not found: DCI-Mac-Build-068.local
In value:  dci-mac-build-068.local
at System.Net.NameResolution.Tests.GetHostByNameTest.DnsObsoleteGetHostByName_EmptyString_ReturnsHostName() in /Users/buildagent/agent/_work/30/s/corefx/src/System.Net.NameResolution/tests/FunctionalTests/GetHostByNameTest.cs:line 108
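
The two values in this older failure differ only in letter case (DCI-Mac-Build-068.local vs dci-mac-build-068.local). DNS host names compare case-insensitively, so one way to avoid this class of failure is an ordinal, case-insensitive comparison. The sketch below is an illustration under that assumption, not the fix that was actually applied; `HostNameComparison` and `ContainsHostName` are hypothetical names.

```csharp
using System;

// Hypothetical helper, not part of System.Net or the corefx test suite.
public static class HostNameComparison
{
    // Returns true when 'expected' occurs in 'actual', ignoring case,
    // because DNS host names are case-insensitive.
    public static bool ContainsHostName(string actual, string expected)
    {
        if (actual == null || expected == null)
        {
            return false;
        }

        return actual.IndexOf(expected, StringComparison.OrdinalIgnoreCase) >= 0;
    }
}
```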

History of failures

Each pair of tests always fails or passes together. Maybe a bad machine setup?

| Day | Build | OS | Test |
|---|---|---|---|
| 9/6 | 20170906.01 | OSX10.12 | DnsObsolete* |
| 9/10 | 20170910.01 | Ubuntu14.04 | DnsObsolete* & Dns_GetHostEntry* |
| 9/13 | 20170913.02 | OSX10.12 | DnsObsolete* |
| 9/15 | 20170915.01 | OSX10.12 | DnsObsolete* |
| 9/20 | 20170920.03 | Debian90 | DnsObsolete* & Dns_GetHostEntry* |
| 9/30 | 20170930.01 | OSX10.12 | DnsObsolete* & Dns_GetHostEntry* |
| 10/2 | 20171002.01 | Debian87 | DnsObsolete* & Dns_GetHostEntry* |
| 10/21 | 20171021.01 | Debian87 | DnsObsolete* & Dns_GetHostEntry* |
| 10/26 | 20171026.01 | Suse42.2 | DnsObsolete* & Dns_GetHostEntry* |
| 11/13 | 20171113.03 | OSX10.12 | DnsObsolete* & Dns_GetHostEntry* |
| 11/16 | 20171116.51 | SLES12 | DnsObsolete* & Dns_GetHostEntry* |
| 11/26 | 20171126.02 | Suse42.2 | DnsObsolete* & Dns_GetHostEntry* |

Similar failures to dotnet/corefx#20245, but at different times (the runs probably landed on a different worker machine).

Runfo Tracking Issue: dnsobsoletegethostbyname_emptystring_returnshostname

Build Definition Kind Run Name

Build Result Summary

| Day Hit Count | Week Hit Count | Month Hit Count |
|---|---|---|
| 0 | 0 | 0 |

@Sunny-pu

Sunny-pu commented Oct 26, 2017

[EDIT] Test failure included in history in top post.

@karelz changed the title from 'Tests under: System.Net.NameResolution.Functional.Tests failed with "System.Net.Internals.SocketExceptionFactory+ExtendedSocketException : Device not configured"' to 'Tests failures: System.Net.NameResolution.Tests / DnsObsolete* & Dns_GetHostEntry*' on Dec 28, 2017
@karelz
Member

karelz commented Dec 28, 2017

Updated test failure history in top post.

@benaadams
Member

Seen in dotnet/corefx#35267

Debian.8.Amd64.Open-x64-Release

Message :
System.Net.Internals.SocketExceptionFactory+ExtendedSocketException : Name or service not known
Stack Trace :
   at System.Net.Dns.InternalGetHostByName(String hostName) in /__w/1/s/src/System.Net.NameResolution/src/System/Net/DNS.cs:line 68
   at System.Net.Dns.GetHostByName(String hostName) in /__w/1/s/src/System.Net.NameResolution/src/System/Net/DNS.cs:line 41
   at System.Net.NameResolution.Tests.GetHostByNameTest.DnsObsoleteGetHostByName_EmptyString_ReturnsHostName() in /__w/1/s/src/System.Net.NameResolution/tests/FunctionalTests/GetHostByNameTest.cs:line 108

@AriNuer

AriNuer commented Jun 18, 2019

Same test failed with similar exception:

Failed test:
System.Net.NameResolution.Tests.GetHostByNameTest/DnsObsoleteBeginEndGetHostByName_EmptyString_ReturnsHostName

Message:

System.Net.Internals.SocketExceptionFactory+ExtendedSocketException : Name or service not known

Stack Trace :

   at System.Net.Dns.InternalGetHostByName(String hostName) in /_/src/System.Net.NameResolution/src/System/Net/DNS.cs:line 68
   at System.Net.Dns.ResolveCallback(Object context) in /_/src/System.Net.NameResolution/src/System/Net/DNS.cs:line 218
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw(Exception source) in /_/src/System.Private.CoreLib/shared/System/Runtime/ExceptionServices/ExceptionDispatchInfo.cs:line 69
   at System.Net.Dns.HostResolutionEndHelper(IAsyncResult asyncResult) in /_/src/System.Net.NameResolution/src/System/Net/DNS.cs:line 361
   at System.Net.Dns.EndGetHostByName(IAsyncResult asyncResult) in /_/src/System.Net.NameResolution/src/System/Net/DNS.cs:line 383
   at System.Net.NameResolution.Tests.GetHostByNameTest.DnsObsoleteBeginEndGetHostByName_EmptyString_ReturnsHostName() in /_/src/System.Net.NameResolution/tests/FunctionalTests/GetHostByNameTest.cs:line 119

Build: -20190617.81(Master)
Failing configurations:

  • RedHat.6.Amd64.Open-x64-Release
  • Debian.9.Amd64.Open-x64-Release

Details:
https://mc.dot.net/#/user/dotnet-bot/pr~2Fdotnet~2Fcorefx~2Frefs~2Fheads~2Fmaster/test~2Ffunctional~2Fcli~2Finnerloop~2F/20190617.81/workItem/System.Net.NameResolution.Functional.Tests/analysis/xunit/System.Net.NameResolution.Tests.GetHostByNameTest~2FDnsObsoleteBeginEndGetHostByName_EmptyString_ReturnsHostName

@karelz
Member

karelz commented Oct 2, 2019

Triage: We should find out why DNS resolution sometimes does not work -- we know it is not cached on Linux/Mac, but why does it fail so often in CI? Is it network flakiness? Or a busy machine?

Plan: Add production diagnostics to DNS, then do runs with those changes to catch the failure. The diagnostics will also be useful to customers in production.
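
As a rough illustration of what such diagnostics could capture (elapsed time, outcome, socket error code), here is a minimal sketch that wraps a resolver call. It is not the diagnostics that were added to the product; `DnsDiagnostics`, `ResolveWithDiagnostics`, and the log message format are assumptions made for this example.

```csharp
using System;
using System.Diagnostics;
using System.Net;
using System.Net.Sockets;

// Hypothetical wrapper for illustration only; not the product's DNS diagnostics.
public static class DnsDiagnostics
{
    // Calls 'resolver' (for example Dns.GetHostEntry) and reports timing and
    // outcome through 'log'. A SocketException is logged and then rethrown.
    public static IPHostEntry ResolveWithDiagnostics(
        string hostName,
        Func<string, IPHostEntry> resolver,
        Action<string> log)
    {
        Stopwatch stopwatch = Stopwatch.StartNew();
        try
        {
            IPHostEntry entry = resolver(hostName);
            int addressCount = entry.AddressList?.Length ?? 0;
            log($"Resolved '{hostName}' in {stopwatch.ElapsedMilliseconds} ms, {addressCount} address(es)");
            return entry;
        }
        catch (SocketException ex)
        {
            log($"Failed to resolve '{hostName}' after {stopwatch.ElapsedMilliseconds} ms: {ex.SocketErrorCode} ({ex.Message})");
            throw;
        }
    }
}
```

A call such as `DnsDiagnostics.ResolveWithDiagnostics("", Dns.GetHostEntry, Console.WriteLine)` would exercise the same empty-string path as the failing tests. Comparing the elapsed time of failures against successes might help separate a slow or busy machine from a resolver that answers quickly with an error.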

@karelz karelz transferred this issue from dotnet/corefx Jan 9, 2020
@Dotnet-GitSync-Bot Dotnet-GitSync-Bot added area-System.Net untriaged New issue has not been triaged by the area owner labels Jan 9, 2020
@karelz karelz added os-linux Linux OS (any supported distro) os-mac-os-x macOS aka OSX test-run-core Test failures in .NET Core test runs and removed untriaged New issue has not been triaged by the area owner labels Jan 9, 2020
@karelz karelz added this to the 5.0 milestone Jan 9, 2020
@wfurt wfurt added the disabled-test The test is disabled in source code against the issue label Jan 28, 2020
@wfurt
Member

wfurt commented Jan 28, 2020

No failures on Linux for 90 days. Some tests are still disabled on OSX and ARM64 (tracked by #27622).

@karelz
Member

karelz commented Jan 29, 2020

@wfurt do we have any tests disabled against this specific bug? If not, let's close it.

@wfurt
Member

wfurt commented Jan 29, 2020

We have four tests disabled at this moment for OSX and the same for ARM64 (tracked in #27622).

I suspect most of the failures are not product bugs and are caused by the test environment.
AFAIK we have changed our macOS infrastructure, and ARM64 support has matured.
How about moving the tests to Outerloop? That would allow us to collect some stability data without impacting everyone's PRs. It also seems appropriate since the tests depend on an external service.

cc: @dotnet/ncl in case somebody has a better suggestion.

@ViktorHofer
Member

A bunch of NameResolution tests failed again (including the ones mentioned in this issue): https://dnceng.visualstudio.com/public/_build/results?buildId=525372&view=ms.vss-test-web.build-test-results-tab&runId=16621996&resultId=137716&paneView=debug

@wfurt @davish suggestions for mitigation?

@davidsh
Contributor

davidsh commented Feb 18, 2020

@wfurt @davish suggestions for mitigation?

Name resolution in a CI environment will most likely always be flaky.

System.Net.Internals.SocketExceptionFactory+ExtendedSocketException : Name or service not known

Perhaps we should consider the approach taken in #32501 and run these tests less regularly, i.e.:

[Trait(XunitConstants.Category, XunitConstants.IgnoreForCI)] // DNS is flaky

FWIW, our DNS APIs have been relatively stable from a product perspective, and I don't see a lot of risk in running these tests less often.

@wfurt
Member

wfurt commented Feb 18, 2020

We should work towards stabilizing core services (like DNS) in CI.

@ViktorHofer
Member

What about leveraging the RetryHelper for DNS tests which are inherently flaky?

@davidsh
Contributor

davidsh commented Feb 18, 2020

We should work towards stabilizing core services (like DNS) in CI.

What about leveraging the RetryHelper for DNS tests which are inherently flaky?

Do we understand whether the problem is intermittent (i.e. retrying tests will help) or due to a misconfiguration of some kind in CI (in which case retrying tests will not help)?

@ViktorHofer
Member

Do we understand whether the problem is intermittent (i.e. retrying tests will help) or due to a misconfiguration of some kind in CI (in which case retrying tests will not help)?

I assumed it's intermittent but how would we find that out?

@davidsh
Contributor

davidsh commented Feb 18, 2020

I assumed it's intermittent but how would we find that out?

Let's add some RetryHelper logic to some of the tests. Then run it in CI. If we see a significant improvement in test reliability, then we will know that the problem is intermittent.
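
To make the experiment concrete, here is a minimal, standalone retry sketch. It is not the repo's RetryHelper and does not claim to match its API; `SimpleRetry`, its parameters, and the linear backoff are assumptions made for this example. It returns the attempt number that succeeded, which is the signal needed here: success on attempt 2 or later points to an intermittent problem, while exhausting all attempts points to misconfiguration.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for illustration; not the RetryHelper used in dotnet/runtime.
public static class SimpleRetry
{
    // Runs 'test' up to 'maxAttempts' times with a linearly growing delay
    // between attempts. Returns the 1-based attempt that succeeded; if every
    // attempt fails, the exception from the last attempt propagates.
    public static int Execute(Action test, int maxAttempts = 3, int delayMilliseconds = 100)
    {
        if (test == null) throw new ArgumentNullException(nameof(test));
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException(nameof(maxAttempts));

        for (int attempt = 1; ; attempt++)
        {
            try
            {
                test();
                return attempt;
            }
            catch (Exception) when (attempt < maxAttempts)
            {
                // Swallow the failure and back off before the next attempt.
                Thread.Sleep(delayMilliseconds * attempt);
            }
        }
    }
}
```

A DNS test body could be wrapped as `SimpleRetry.Execute(() => Dns.GetHostEntry(""))`, logging the returned attempt number so CI results show how often a retry was actually needed.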

@Anipik Anipik added the blocking-clean-ci Blocking PR or rolling runs of 'runtime' or 'runtime-extra-platforms' label Feb 25, 2021
@ViktorHofer
Member

Tests were disabled on SLES with #48759.

@antonfirsov antonfirsov removed their assignment Mar 9, 2021
@ViktorHofer ViktorHofer removed the blocking-clean-ci Blocking PR or rolling runs of 'runtime' or 'runtime-extra-platforms' label Mar 15, 2021

MichalStrehovsky added a commit to MichalStrehovsky/runtime that referenced this issue Dec 9, 2021
This is a set of changes required to enable control flow guard enforcement within the process. Control flow guard is a security mitigation feature that validates that indirect calls only land on addresses that are valid targets of indirect calls. It has two parts: identifying valid targets of indirect calls within the process, and checking whether target of indirect call is valid before dispatching to it.

This implements annotations and enforcement within the unmanaged parts of the NativeAOT runtime and the annotation-only part for the managed code. Enforcement will follow later.

Three kinds of changes:

* A new version of Runtime.lib that enables `/guard:cf` flag. This is in addition to the existing libraries since we don't want code to pay the perf penalty if CFG is not enabled.
* Annotating methods as valid CFG targets in the AOT compiler and object file writer.
* MSBuild support for new `<ControlFlowGuard>Guard</ControlFlowGuard>` property that enables all of this (passes a switch to the AOT compiler, selects the guarded version of runtime libraries to link with, and passes a switch to link.exe to enable CFG for the process).