Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Connections from the pool are not reused when using async methods in parallel #18

Closed
ajcvickers opened this issue Oct 30, 2017 · 46 comments
Assignees
Milestone

Comments

@ajcvickers
Copy link

Moved from dotnet/efcore#10169 reported by @wertzui

Whem I'm using async methods (like FirstOrDefaultAsync) to get a result from the Database inside a Parallel.For, the connections are not correctly reused from the connection pool.

The connections will rise to the configured Max Pool Size and after some time (around 1-2 minutes on my machine using localdb), exceptions will be thrown.

Using either async methods and a normal for-loop or non-async methods an a Parallel.For-loop, the connections are properly reused and I can observe that a normal for-loop uses 1 connection and a Parallel.For-loop uses 4 connections which corresponds to MaxDegreeOfParallelism.

Exception message:
System.InvalidOperationException occurred
  HResult=0x80131509
  Message=Timeout expired.  The timeout period elapsed prior to obtaining a connection from the pool.  This may have occurred because all pooled connections were in use and max pool size was reached.
Stack trace:
   at System.Data.Common.ADP.ExceptionWithStackTrace(Exception e)
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.EntityFrameworkCore.Storage.RelationalConnection.<OpenAsync>d__31.MoveNext()
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.EntityFrameworkCore.Query.Internal.AsyncQueryingEnumerable.AsyncEnumerator.<BufferlessMoveNext>d__9.MoveNext()
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.EntityFrameworkCore.Storage.ExecutionStrategy.<ExecuteImplementationAsync>d__33`2.MoveNext()
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at Microsoft.EntityFrameworkCore.Storage.ExecutionStrategy.<ExecuteImplementationAsync>d__33`2.MoveNext()
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.EntityFrameworkCore.Query.Internal.AsyncQueryingEnumerable.AsyncEnumerator.<MoveNext>d__8.MoveNext()
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.EntityFrameworkCore.Query.Internal.AsyncLinqOperatorProvider.SelectAsyncEnumerable`2.SelectAsyncEnumerator.<MoveNext>d__4.MoveNext()
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.EntityFrameworkCore.Query.Internal.AsyncLinqOperatorProvider.<_FirstOrDefault>d__82`1.MoveNext()
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.EntityFrameworkCore.Query.Internal.TaskResultAsyncEnumerable`1.Enumerator.<MoveNext>d__3.MoveNext()
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.EntityFrameworkCore.Query.Internal.AsyncLinqOperatorProvider.SelectAsyncEnumerable`2.SelectAsyncEnumerator.<MoveNext>d__4.MoveNext()
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.EntityFrameworkCore.Query.Internal.AsyncLinqOperatorProvider.ExceptionInterceptor`1.EnumeratorExceptionInterceptor.<MoveNext>d__5.MoveNext()
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Runtime.CompilerServices.ConfiguredTaskAwaitable`1.ConfiguredTaskAwaiter.GetResult()
   at EfPoolingTest.UnitTest1.<>c.<NormalContextParallel>b__4_0(Int32 i) in C:\Users\xxx\documents\visual studio 2017\Projects\EfPoolingTest\EfPoolingTest\UnitTest1.cs:line 70
   at System.Threading.Tasks.Parallel.<>c__DisplayClass19_0`1.<ForWorker>b__1(RangeWorker& currentWorker, Int32 timeout, Boolean& replicationDelegateYieldedBeforeCompletion)

Steps to reproduce

  • Clone https://github.com/wertzui/EfPoolingTest and open the project
  • Run the "ViewConnections.sql" script to see that there are no connections open (beside the 2 used for the script)
  • Run the test "NormalContextParallelAsyncConfigureAwaitFalse" (I suggest running in Debug mode, so you can see the output)
  • Run the "ViewConnections.sql" script multiple times to see the connections increase.
  • After some time you will also get an exception
  • Run the other tests and observe the connection count

Further technical details

EF Core version: 2.0.0 (also tested with 1.1.3)
Database Provider: Microsoft.EntityFrameworkCore.SqlServer/localdb
Operating system: Windows 10 FCU
IDE: Visual Studio 2017 15.4.1

@ajcvickers
Copy link
Author

I investigated the repro code (linked above) and EF is correctly closing the connections, but it looks like they don't make it back into the pool. This seems like a SQLClient issue.

@geleems
Copy link

geleems commented Feb 23, 2018

Sql Client Isolated repro

static void Main(string[] args)
        {
            Program p = new Program();
            p.Test();
        }

        public void Test()
        {
            Parallel.For(1, count + 1, i =>
            {
                if (i % 1000 == 0) Debug.WriteLine(i);

                DoSomething(i).ConfigureAwait(false).GetAwaiter().GetResult();
            });
        }

        public async Task<int> DoSomething(int id)
        {
            using (SqlConnection connection = new SqlConnection("Server=(localdb)\\mssqllocaldb;Initial Catalog=EfCorePoolingTest;Integrated Security=true;"))
            {
                await connection.OpenAsync();
                using (SqlCommand command = new SqlCommand($"select * from MyTable where Id={id}", connection))
                {
                    using (SqlDataReader reader = await command.ExecuteReaderAsync())
                    {
                        while (reader.Read())
                        {
                            for (int i = 0; i < reader.FieldCount; i++)
                            {
                                reader.GetValue(i);
                            }
                        }
                    }
                }
            }
            return 0;
        }

Script for checking sleeping connections on server

SELECT ConnectionStatus = CASE WHEN dec.most_recent_sql_handle = 0x0
        THEN 'Unused'
        ELSE 'Used'
        END
    , CASE WHEN des.status = 'Sleeping'
        THEN 'sleeping'
        ELSE 'Not Sleeping'
        END
    , ConnectionCount = COUNT(1)
    , des.login_name
FROM sys.dm_exec_connections dec
    INNER JOIN sys.dm_exec_sessions des
        ON dec.session_id = des.session_id
GROUP BY CASE WHEN des.status = 'Sleeping'
        THEN 'sleeping'
        ELSE 'Not Sleeping'
        END
    , CASE WHEN dec.most_recent_sql_handle = 0x0
        THEN 'Unused'
        ELSE 'Used'
        END
    , des.login_name

@saurabh500
Copy link
Contributor

This problem is present on the Framework as well.

@saurabh500
Copy link
Contributor

saurabh500 commented Feb 26, 2018

I figured that the connections from the connection pool are being reused. The pool reaches its max connection limit quickly because of async opens and parallel execution. However with Connection.Close() or Dispose() the connections are not making their way back into the pool quickly enough. As a result the connection pool hits a limit of max connections.
Further investigation needed to understand what in Close() is preventing the connections to get back on the pool fast enough.

@saurabh500
Copy link
Contributor

cc @afsanehr

@saurabh500 saurabh500 assigned AfsanehR-zz and unassigned saurabh500 Feb 26, 2018
@geleems geleems self-assigned this Mar 15, 2018
@AfsanehR-zz
Copy link

AfsanehR-zz commented Apr 11, 2018

Hello @wertzui, while we are investigating on this issue, would you mind trying to set the Timeout value on the connection string to a higher number? This should for now resolve your issue with the exceptions you are receiving. I tried with Timeout=300 and did not hit any exceptions. Thanks!

@ajcvickers
Copy link
Author

@afsanehr You will need to mention @wertzui, not me. I just moved the issue from the EF repo.

@AfsanehR-zz
Copy link

@ajcvickers corrected, thanks!

@frodgim
Copy link

frodgim commented Mar 12, 2019

How does this issue going? Has it already been resolved in any version of .net core? Please, let me know if there is any progress about it

@eem-dev
Copy link

eem-dev commented Mar 25, 2019

We are also very interest in this issue status. Is there any new about it?

We are experiencing this issue in our Web API wrote with ASP.NET Core 2.2, connecting in a SQL Server 2016 database, running in AWS on Linux environment.

After trying the workaround suggested in another thread here, we've noticed a very small improvement, but the error continues.

The workaround tried was this:

export COMPlus_ThreadPool_ForceMinWorkerThreads=200;
export COMPlus_ThreadPool_ForceMaxWorkerThreads=10000;

Somebody know if there are new about this error?

@hiral-cloud-evangelist
Copy link

I am facing same issue.. when trying to run 1000 request per second.

System.InvalidOperationException: Timeout expired. The timeout period elapsed prior to obtaining a connection from the pool. This may have occurred because all pooled connections were in use and max pool size was reached.
at System.Data.Common.ADP.ExceptionWithStackTrace(Exception e)
--- End of stack trace from previous location where exception was thrown ---
at Microsoft.EntityFrameworkCore.Storage.RelationalConnection.OpenDbConnectionAsync(Boolean errorsExpected, CancellationToken cancellationToken)

@Durkee
Copy link

Durkee commented May 9, 2019

++ (Team of 5)

@divega
Copy link

divega commented May 15, 2019

As recently announced in the .NET Blog, focus on new SqlClient features an improvements is moving to the new Microsoft.Data.SqlClient package. For this reason, we are moving this issue to the new repo at https://github.com/dotnet/SqlClient. We will still use https://github.com/dotnet/corefx to track issues on other providers like System.Data.Odbc and System.Data.OleDB, and general ADO.NET and .NET data access issues.

@divega divega transferred this issue from dotnet/corefx May 15, 2019
@divega divega added this to the 1.0.0 milestone May 17, 2019
@Wraith2
Copy link
Contributor

Wraith2 commented May 18, 2019

Does everyone who encounters this problem have something similar to the line from @geleems example above:

DoSomething(i).ConfigureAwait(false).GetAwaiter().GetResult();

doing a sync-to-async transition using GetAwaiter() ? Profiling this example shows me a huge amount of wait time at this location and it has nothing to with the database work that's going on it's just waiting for a signal and keeping a connection out of the pool while it waits. If so I may have a solution but I need to verify whether it's equivalent.

@Wraith2
Copy link
Contributor

Wraith2 commented May 19, 2019

There isn't a fault in the SqlClient, it's releasing the connections back into the pool as soon as it's able to. What is happening is thread starvation.

The parallel operation attempt to maintain a number of operations running at the same time. Each of those operations does a async to sync transition using GetAwaiter().GetResult() and in doing so causes the thread making that call to block. So you pick up a thread from the threadpool and then use it to issue a wait on a ManualResetEventSlim which needs to be signalled from another thread, that other thread is also from the threadpool. Eventually you end up with all threadpool threads being busy waiting and no more threads be available to scheduler the completers. The threadpool detects the load and will slowly add more threads and each time it adds some threads they'll run for a while possibly unblocking some previous actions until the system once again reaches a steady state of being blocked. In the extreme long term the threadpool thread count will exceed the number of sql pooled connections and you'll end up unable to get a connection to work with.

Graphically in the profiler you see this:
timeline
Where some blue patches of work are done and then everything ends up in grey wait state. Then a new thread gets spawned which completed some work allowing other threads to unblock and do a little work and then it all settled back into waiting again quickly. You can see that the new threads are added every second or so in this example which is the threadpool throttling new thread creation as it's supposed to.

In those grey waiting periods the call stacks are like this:
calltreePNG
Everything is just waiting for async work to complete by using a MRES blocking wait.

So how do you fix it? Glib answer: don't do this. Conceptual answer don't transition from sync to async, the worlds don't mix nicely but this might not be practical. Practically, don't use parallel sync waits to implement this behaviour. How you work around this depends entirely on the features of your problem. You could manually batch up async tasks and then do a Task.WaitAll(taskArray).GetAwaiter().GetResult() in a loop until you're done with however much input you've got to handle. use the GetAwaiter().GetResult() calls as little as possible and as high up in the call chain as you can. If your input is small you could just start a new detached task for each input and then wait on them all to finish with Task.WaitAll.

My question is why would you do this in the first place? You've got a set of input values that you want to operate on in parallel. That's exactly what SQL server is good at. Why not feed the whole lot of input into a stored procedure as a range or a tvp list and then issue a single command to do them all.

@roji
Copy link
Member

roji commented May 20, 2019

As @Wraith2 explained above, never do sync over async.

@divega
Copy link

divega commented May 20, 2019

My question is why would you do this in the first place? You've got a set of input values that you want to operate on in parallel. That's exactly what SQL server is good at. Why not feed the whole lot of input into a stored procedure as a range or a tvp list and then issue a single command to do them all.

My read is that the unit tests in the original repro were attempting to mimic what would happen in a Web application that receives a large number of identical requests that execute queries to be resolved.

But I don't disagree that calling GetAwaiter().GetResult() for every iteration is a root cause of thread starvation. In fact, the NormalContextAsync unit test that doesn't do this seems to reach steady state with only 4 connections.

Anyway, we received a couple of reports that describe similar symptoms, and I have asked for repros.

@wertzui
Copy link

wertzui commented May 24, 2019

My read is that the unit tests in the original repro were attempting to mimic what would happen in a Web application that receives a large number of identical requests that execute queries to be resolved.

That is correct. The original problem came from an ASP.Net Core MVC Web Application that received a lot of requests at the same time. To reproduce the same behavior I used Parallel.For.

@roji
Copy link
Member

roji commented May 24, 2019

@wertzui can you confirm that in the original web app synchronous blocking was also taking place (e.g. GetResult())? If so then this problem really is expected.

@Wraith2
Copy link
Contributor

Wraith2 commented Aug 2, 2019

I'd love to know how. It's be really nice to be able to see the code for this open source library.

@cheenamalhotra
Copy link
Member

It was a regression caused in the past, the fix has been verified.
Yes, we will be open source soon! :)

@David-Engel
Copy link
Contributor

@Wraith2 I think this change was the actual fix:
https://github.com/dotnet/corefx/pull/33660/files#diff-d5be75d14c8a1bc3bbd0ea1aa75e2ff5R1728
When I looked into #129, which is similar, the above change fixed the issue in M.D.SqlClient.

@cheenamalhotra
Copy link
Member

Preview release v1.0.19221.1 has been published containing fix. Closing issue.

@cheenamalhotra
Copy link
Member

cheenamalhotra commented Sep 4, 2019

Hi @rcarneironet

Please open a new issue and provide us with below details:

  • Driver version in use for Microsoft.Data.SqlClient (Try latest if not done already)
  • Try sequential connections to confirm Pooling works with Async Connection
  • Isolated standalone Repro App to repro concurrent connection issues
  • Have you considered locking for thread safety?

@elmojo
Copy link

elmojo commented Sep 26, 2019

I am experiencing this issue using EntityFrameworkCore 2.2.4 which is referencing EntityFrameworkCore.SqlServer 2.2.0 via AspNetCore.App 2.2.0. EntityFrameworkCore.SqlServer is referencing System.Data.SqlClient 4.6.1.

What do I need to do in order to get the implemented fix?

Also, what is the difference between Microsoft.Data.SqlClient and System.Data.SqlClient?

@cheenamalhotra
Copy link
Member

Hi @elmojo

EntityFrameworkCore.SqlServer 3.0.0 references Microsoft.Data.SqlClient, so upgrading EF Core to 3.0.0 should bring in the fix for this issue.

Microsoft.Data.SqlClient (M.D.S) is a union of System.Data.SqlClient (S.D.S) from .NET Framework and .NET Core (CoreFx) with namespace changes and new features added. Our aim is to provide smooth transition for end users from S.D.S library. More info here.

@elmojo
Copy link

elmojo commented Sep 27, 2019

Visual Studio will not allow me to upgrade to Microsoft.EntityFrameworkCore 3.0.0.

NU1608: Detected package version outside of dependency constraint: Microsoft.AspNetCore.App 2.2.0 requires Microsoft.EntityFrameworkCore (>= 2.2.0 && < 2.3.0) but version Microsoft.EntityFrameworkCore 3.0.0 was resolved.
NU1202: Package Microsoft.EntityFrameworkCore 3.0.0 is not compatible with netcoreapp2.2 (.NETCoreApp,Version=v2.2). Package Microsoft.EntityFrameworkCore 3.0.0 supports: netstandard2.1 (.NETStandard,Version=v2.1)

Microsoft.EntityFrameworkCore.SqlServer 2.2.0 is referenced by Microsoft.AspNetCore.App 2.2.0. There appear to be newer versions of Microsoft.AspNetCore.App (2.2.7 is the latest) but Visual Studio will not allow me to upgrade to them. It says they are blocked by my project. However, those newer versions do not reference Microsoft.EntityFrameworkCore 3.0.0 anyway.

Upgrading to .Net Core 3.0 is not an option as we are not yet using Visual Studio 2019.

Any other ideas on how to get this fix?

@roji
Copy link
Member

roji commented Sep 27, 2019

EF Core 3.0 requires .NET Core 3.0 (or more precisely .NET Standard 2.1) and will not run on older versions. At this point, the only way to use EF Core with Microsoft.Data.SqlClient is to use version 3.0, so you'll have to upgrade to get the fix for this bug.

@elmojo
Copy link

elmojo commented Sep 27, 2019

Well isn't that just great. Upgrading to Core 3.0 is not an option as our organization does not have VS 2019 approved as of yet.

Was this an introduced bug? My code worked previously (I believe on Core 2.1).

Any workarounds? I've tried increasing the connection timeout and the max pool size and neither work.

@Wraith2
Copy link
Contributor

Wraith2 commented Sep 27, 2019

How about not using Parallel?

@elmojo
Copy link

elmojo commented Sep 27, 2019

Yeah, that's what I ended up doing. I rewrote the batch program to send the requests to the webapi in series as opposed to parallel.

@cheenamalhotra
Copy link
Member

Hi @elmojo

We're considering to back port the fix to System.Data.SqlClient for compatibility with older version of .NET Core, will update you when the fix gets merged.

@MTrachsel
Copy link

Task.WaitAll(taskArray).GetAwaiter().GetResult() in a loop until you're done with however much input you've got to handle. use the GetAwaiter().GetResult() calls as little as possible and as high up in the call chain as you can. If your input is small you could just start a new detached task for each input and then wait on them all to finish with Task.WaitAll.

I thought .GetResult is also blocking?

@yukiwongky
Copy link
Contributor

@MTrachsel yes GetResult is blocking as mentioned in the discussion above.

@eglauko
Copy link

eglauko commented Nov 21, 2019

Hi @elmojo

We're considering to back port the fix to System.Data.SqlClient for compatibility with older version of .NET Core, will update you when the fix gets merged.

@cheenamalhotra It was fixed and released for System.Data.SqlClient ?

@cheenamalhotra
Copy link
Member

cheenamalhotra commented Nov 21, 2019

Hi @eglauko

It was in my list of things but I was wondering if fixing for 2.2 would be needed as .NET Core 2.2 is reaching End of Life in December (a couple of weeks left).

Is it possible to upgrade System.Data.SqlClient to v4.7.0 by specifying explicit package reference and continue using with your application?

System.Data.SqlClient v4.7.0 contains fix for this issue.

@eglauko
Copy link

eglauko commented Nov 22, 2019

Hello @cheenamalhotra, thanks for your attention.
Of course I can reference the System.Data.SqlClient package.
We planned to migrate to .Net Core 3.1 early next year, so, in our case, referencing the package would be enough.

@Zvikawein
Copy link

Hi,
I'm using a .Net Framework 4.6.1 and have the same issue.
How can I replace the usage of System.Data under the framework with the fixed System.Data.SqlClient?
When I reference it to my project it seems like VS is not recognizing it and keeps going to the original implementation.
For example if I go to implementation of SqlCommand, it takes me to System.Data under GAC and not the referenced dll.
Thanks.

@cheenamalhotra
Copy link
Member

Hi @Zvikawein

The issue shouldn't occur in .NET Framework though. Have you tried switching to Microsoft.Data.SqlClient? If you can reproduce issue with Microsoft.Data.SqlClient in .NET Framework, please open a new ticket with repro provided.

@scenerytickets
Copy link

Hello,everyone.I am using MySql,i have some troubble.I need help. error message: Timeout expired. The timeout period elapsed prior to obtaining a connection from the pool. This may have occurred because all pooled connections were in use and max pool size was reached.Sometimes, error message:Unable to connect to any of the specified MySQL hosts.Sometimes,error message:Fatal error encountered during command execution.
Environment: Linux、 MysSql、 .net core 2.1.5、 MySql.Data.EntityFrameworkCore 8.0.15.

@scenerytickets
Copy link

@cheenamalhotra , tks

@ErikEJ
Copy link
Contributor

ErikEJ commented Jul 4, 2020

@scenerytickets this is the repo for the SQL Server .NET client, please ask MySql support for help.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests