Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

System.NullReferenceException in System.Net.Http.WinHttpRequestCallback.RequestCallback #25494

Closed
serialseb opened this issue Mar 16, 2018 · 12 comments · Fixed by dotnet/corefx#28367

Comments

@serialseb
Copy link

Couldn't find any issue that seemed to cover the problem. I'm running kestrel in tests, and writing a reverse proxy component.

At an unknown point in running the tests, I get the following exception:

System.NullReferenceException: Object reference not set to an instance of an object.
   at System.Net.Http.WinHttpRequestCallback.RequestCallback(IntPtr handle, WinHttpRequestState state, UInt32 internetStatus, IntPtr statusInformation, UInt32 statusInformationLength)
   at System.Net.Http.WinHttpRequestCallback.WinHttpCallback(IntPtr handle, IntPtr context, UInt32 internetStatus, IntPtr statusInformation, UInt32 statusInformationLength)
   at Interop.WinHttp.WinHttpCloseHandle(IntPtr handle)
   at Interop.WinHttp.SafeWinHttpHandle.ReleaseHandle()
   at System.Runtime.InteropServices.SafeHandle.InternalDispose()
   at System.Net.Http.WinHttpRequestState.ClearSendRequestState()
   at System.Net.Http.WinHttpHandler.<StartRequest>d__105.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
   at System.Threading.ThreadPoolWorkQueue.Dispatch()

Repro:

It looks like the same stacktrace as in #24641

Note builds and tests succeed on core 2 on libcurl-based http client.

I tried not disposing request/response messages from the client, no change. I'm refactoring the code at the moment, so maybe the problem will go away, but i'm pretty confused.

@karelz
Copy link
Member

karelz commented Mar 17, 2018

Do you have reliable repro? Can you isolate it and share it with us?

@serialseb
Copy link
Author

serialseb commented Mar 17, 2018

The test above is failing reliably on all windows platforms, but not during the test itself. It simply causes the test run to fail with the above unhandled exception, but it's very isolated to that test. As the exception happens when coming back from a threadpool, it would seem, i have no understanding of where it's triggered or why.

It requires two kestrel servers, a from and a to, with calls from one to the other, making a request and reading the response from the first to the second server, and running on windows.

Things that have no impact: the destination http server timing out or not, a cancellation token being passed along or not, disposing HttpRequestMessage / HttpResponseMessage, HttpClient, kestrel servers or not.

I've already spent half a day trying to play with various combinations, so I'm already quite out of my budget trying to diagnose the problem further for now.

@serialseb
Copy link
Author

Ok, i've worked a bit more on this, and got as small a repro as i could figure out.

https://github.com/openrasta/openrasta-core/blob/winhttpHandler/src/Tests/Plugins.ReverseProxy/corefx_repro.cs

Leads to the same stack trace / exception / failure https://ci.appveyor.com/project/OpenRasta/openrasta-core/build/2.6.0-preview.1.1311+winhttpHandler

@karelz
Copy link
Member

karelz commented Mar 17, 2018

@serialseb I am not sure how to run the repro. Also it depends on Kestrel. Would it be possible to take it out into standalone console app? (less components in play, clearer repro)

@serialseb
Copy link
Author

@karelz I don't understand what you're asking for.

I think that ought to be enough for a repro?

@davidsh
Copy link
Contributor

davidsh commented Mar 20, 2018

@serialseb Thx for the information above including the callstack. After looking thru the callstack and the WinHttpHandler source code, I believe the NullReferenceException is happening from trying to de-reference the state object. There appears to be some race conditions where the state object has been unpinned and freed: Thus it is coming back as null from the FromIntPtr(context) call.

https://github.com/dotnet/corefx/blob/master/src/System.Net.Http.WinHttpHandler/src/System/Net/Http/WinHttpRequestCallback.cs#L45

We currently have an Assert after this line of code. But it probably needs to be an if (state == null) check instead.

Regarding your statement here:

You can run it, copy it outside if you wish, it's an xunit test.

Can you describe the exact steps you are running Xunit tests? Is this in Visual Studio or on the command line? Thx.

@davidsh davidsh self-assigned this Mar 20, 2018
@davidsh
Copy link
Contributor

davidsh commented Mar 20, 2018

Also, we (CoreFx) are not familar with building openrasta-core repo. That's why we usually ask for a smaller repro that just involves CoreFx code and possibly ASP.NET Core code.

Is it possible to trim this repro down to just ASP.NET Core and .NET Core components. I.e. can you create a standalone repro using Visual Studio and ASP.NET Core templates?
image

Otherwise, what are the detailed instructions for forking the openrasta-core repo and building everything including your specific repro tests in your 'winhttphandler' branch?

@davidsh
Copy link
Contributor

davidsh commented Mar 20, 2018

I spent a little time and was able to reconstruct a standalone repro using just ASP.NET Core 2.0 and .NET Core 2.0.

I've attached the full Visual Studio solution here.
AspNetCore2b.zip

I was able to reproduce the NullReferenceException with the same call stack you got above.

Thanks for reporting this and we will investigate the bug.

@davidsh
Copy link
Contributor

davidsh commented Mar 21, 2018

I was able to root cause this problem.

There are several different issues here. The first is that there is an error handling bug in WinHttpHandler. That bug is causing the NullReferenceException which is being thrown on a threadpool thread and crashing the app.

The second problem is that the root error is because your proxy scenario is attempting to do a GET request with "Transfer-Encoding: chunked" and a request body. The RFC for HTTP doesn't define what happens when a GET request has a request body. Some implementations will immediately throw an error (such as .NET Framework HttpWebRequest/HttpClient).

.NET Core HTTP stack allows this. But due to a bug in the underlying HTTP stack for .NET Core on Windows (WinHTTP), it is returning an error if you try to do a GET request and use chunked encoding. It would work with Content-Length semantics.

Your repro scenario is actually not choosing either. The code here is simply creating a StreamContent object to send. And the HTTP stack defaults to using chunked encoding.

There are a few workarounds that can prevent the error in the first place.

  1. You can wait until .NET Core 2.1 which will use a different HTTP stack (SocketsHttpHandler) by default. That HTTP stack which allow doing a GET request with request body.

  2. You can force 'Content-Length' semantics for the GET by buffering in the request body before you do the .SendAsync(). Simply add a line of code like this:

await requestMessage.Content.LoadIntoBufferAsync();
  1. You can change your proxy scenario code to avoid sending any request body with GET verbs. Most likely there isn't any actual body content anyway coming from the ASP.NET HttpContext object. So, you can modify your code here and skip setting the .Content property if the verb is a GET:

https://github.com/openrasta/openrasta-core/blob/master/src/Tests/Plugins.ReverseProxy/corefx_repro.cs#L78

In the meantime, we will work to fix the error handling bug in the underlying WinHttpHandler that causes the NullReferenceException. However, once that is fixed, the HttpClient.SendAsync() will still throw an exception but it will be a normal HttpRequestException if you try to do a GET with a request body and use chunked encoding. So, I suggest one of the workarounds above for your proxy scenario.

davidsh referenced this issue in dotnet/corefx Mar 22, 2018
This PR addresses some error handling problems in WinHttpHandler as well as
improves the error messages generated for WinHttpException.

This was root caused to a bug in WinHttp where it can't do a GET request with a
request body using chunked encoding. However, this exception was masked due to
an error handling bug caused by an inconsistency in how WinHttp associates
context values for callbacks during the WinHttpSendRequest API call. This PR now
fixes the error handling around synchronous errors being returned from
WinHttpSendRequest. The root WinHttp bug itself can't be fixed. So, doing a GET request
will still throw an exception, but it will be a catchable HttpRequestException.

Another bug was discovered while investigating #26278. WinHttpCloseHandle should only be
called once and should never race between threads. This is normally protected when
the request handle is closed via SafeHandle Dispose(). But there was a place in the
callback method where we called WinHttpCloseHandle directly against the raw handle.

Finally, the error messages for WinHttpExceptions have been improved to show the
error number and the probable WinHttp API call that generated the error. This will
save diagnostic time.

I also moved the WinHttpException source code back from Common. It was originally shared
between WinHttpHandler and ClientWebSocket. But then ClientWebSocket moved away from
WinHttp implementation. So, it makes more sense for this class to be under WinHttpHandler.

Example:

Original WinHttpException message: "The parameter is incorrect"
New WinHttpException message: "Error 87 calling WinHttpSendRequest, 'The parameter is incorrect'."

Closes #28156
Contributes to #26278
@serialseb
Copy link
Author

The stream would be empty in this scenario, are you saying streamcontent would trigger TE on request and then encode 0 causing the bug? I can detect a CL of 0 and null the content and à TE on request?

@davidsh
Copy link
Contributor

davidsh commented Mar 23, 2018

are you saying streamcontent would trigger TE on reques

Yes. When you create a StreamContent object, there is no default Content-Length created. When this request is about to be sent there is also no TE as well. CL is attempted first by asking the HttpContent object (StreamContent) to calculate its size via the TryComputeLength() method on the StreamContent object. However, this particular StreamContent was created from an HttpContext object and the 'CanSeek' property is false in this case. That means we can't compute the Content-Length property from the stream. So, thus TE is used. And since the verb is a GET, it will cause the WinHTTP bug.

If you can detect a CL of 0 by looking at the actual headers from the HttpContext object, then you can avoid creating a StreamContent object in the first place. Then the GET request would be a normal GET with no CL nor TE request headers.

@serialseb
Copy link
Author

serialseb commented Mar 23, 2018

Awesome I owe you a bunch of beers, gonna go back and fix that :)

@msftgits msftgits transferred this issue from dotnet/corefx Jan 31, 2020
@msftgits msftgits added this to the 2.1.0 milestone Jan 31, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Dec 17, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants