Dart hangs randomly on Windows #26400

Closed
azenla opened this issue May 4, 2016 · 21 comments
Labels
area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends.

Comments

@azenla
Contributor

azenla commented May 4, 2016

We have had really serious issues with Dart (1.15.0 and 1.16.0) on Windows.
On Windows 10, we are not having issues, and our HTTP server has been running for a whole week so far.
On Windows 7 and Windows Server 2012 R2 Datacenter, where it runs as a Windows Service, the Dart VM hangs every day. No errors are ever reported.

I filed this separately from #25582, as that seems to be more development-tools related.

@mezoni

mezoni commented May 4, 2016

Same problem here. The Atom.io plugin called dartlang almost always stops analyzing the code.
As far as I know, the plugin uses the Dart analysis server, which is where the problem shows up.
I don't think the bug is in the analysis server itself; rather, the server simply stops working after some time. I can't describe it in more detail because I don't understand what is happening.

@mezoni

mezoni commented May 4, 2016

What if the Dart developers compiled the Dart VM with debug symbols, ran it under some tools, and, when it "hangs", attached a .NET debugger to see where it is looping?

P.S.
I like debuggers!
My favorites were these very cool tools:

  • Turbo Debugger
  • SoftIce
  • WinIce

The interactive disassembler IDA Pro was also one of my coolest and most useful tools!
...but this was a long time ago.

As the saying goes, "a debugger in your hands", and the problem will be solved in the blink of an eye.

@azenla
Contributor Author

azenla commented May 4, 2016

@mezoni That's probably the easiest way to figure it out.

@floitschG floitschG added the area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. label May 4, 2016
@zengyun261

+1
I don't know how to solve the problem.

OS: Windows Server 2012, Windows 10
Dart version: 1.14, 1.15, 1.16...

@azenla
Contributor Author

azenla commented May 15, 2016

@zengyun261 Is that a debugging session with a stacktrace when Dart froze?

@zengyun261

runtime/vm/os_thread_win.cc: void OSThread::Join(ThreadJoinId id)

Because the thread handle has already been closed, the thread ID can be reused once the thread exits, possibly by a thread in another process. OpenThread then returns a handle to that unrelated thread, and WaitForSingleObject never returns.
By Google Translate :(
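
To make the ID recycling hazard concrete, here is a minimal toy model in Python. It does not call the Win32 API and is not the Dart VM's code: `ToyKernel`, `ToyThread`, `open_by_id`, and `would_block` are hypothetical names, and the "lowest free ID is reused" policy is an assumption chosen only to make the example deterministic.

```python
import itertools


class ToyThread:
    """Stand-in for a kernel thread object; it only tracks whether it has exited."""

    def __init__(self, name):
        self.name = name
        self.exited = False


class ToyKernel:
    """Toy kernel that recycles the lowest free thread ID (an assumption for determinism)."""

    def __init__(self):
        self._live = {}  # thread ID -> ToyThread that currently owns the ID

    def create(self, name):
        tid = next(i for i in itertools.count(1) if i not in self._live)
        self._live[tid] = ToyThread(name)
        return tid

    def exit(self, tid):
        # After this call the ID is free and may be handed to a new thread.
        self._live.pop(tid).exited = True

    def open_by_id(self, tid):
        """Loose analogue of opening a thread by ID: returns whoever owns the ID right now."""
        return self._live.get(tid)


def would_block(thread):
    """A wait on a thread returns only once that thread has exited."""
    return thread is not None and not thread.exited


kernel = ToyKernel()
worker_id = kernel.create("worker")
retained = kernel.open_by_id(worker_id)    # reference taken while the worker is alive
kernel.exit(worker_id)
stranger_id = kernel.create("unrelated")   # in this model it reuses the worker's old ID

late_lookup = kernel.open_by_id(worker_id)  # lookup by ID after the worker exited
print(late_lookup.name, would_block(late_lookup))  # unrelated True
print(retained.name, would_block(retained))        # worker False
```

In this model, waiting on the reference that was taken while the worker was alive returns immediately, whereas looking the thread up by ID after it exited ends up waiting on an unrelated thread.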

@zanderso
Member

zanderso commented May 15, 2016

@zengyun261 Many thanks for tracking this down. There are actually two problems here. First, there is the thread ID recycling problem. Second, a worker thread shouldn't add itself to the idle list until after reaping exited threads. I will take a look at these.

@kaendfinger If you could still verify that you aren't hitting a different problem, that would be helpful. Thanks!

@zanderso zanderso self-assigned this May 15, 2016
@azenla
Contributor Author

azenla commented May 15, 2016

@zanderso Will do.

zanderso added a commit that referenced this issue May 17, 2016
Also:
- Reaps exited threads in the thread pool before putting
a thread on the idle list so that a new arriving task
isn't blocked on a supposedly idle thread in the middle
of a join.
- Stops trying to join eventhandler threads on
Windows. Now that we're using the correct exit() call,
we probably don't have to worry about exit code pollution,
so joining the threads is unnecessary.

related #26400

R=asiva@google.com, iposva@google.com

Review URL: https://codereview.chromium.org/1978153002 .
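
The ordering issue described in the commit message can be sketched with a small Python toy. This is not the VM's thread pool: `task_starts_promptly`, the `mailbox` list, and the `join_done` event (which stands in for a slow join of an exited thread) are all hypothetical names made up for illustration.

```python
import threading


def task_starts_promptly(reap_first, timeout=0.5):
    """Runs one toy scenario; True if a newly submitted task starts within `timeout` seconds."""
    lock = threading.Lock()
    idle = []                      # mailboxes of workers advertised as idle
    ready = threading.Event()      # the worker has begun winding down
    join_done = threading.Event()  # stands in for a slow join of an exited thread
    started = threading.Event()    # the newly submitted task has begun running

    def worker():
        mailbox = []               # tasks handed to this worker

        def list_as_idle():
            with lock:
                idle.append(mailbox)

        if reap_first:
            ready.set()
            join_done.wait()       # reap exited threads first...
            list_as_idle()         # ...and only then advertise as idle
        else:
            list_as_idle()         # advertise as idle...
            ready.set()
            join_done.wait()       # ...while still stuck in the join
        for task in mailbox:
            task()

    def submit(task):
        with lock:
            mailbox = idle.pop() if idle else None
            if mailbox is not None:
                mailbox.append(task)   # hand the task to the supposedly idle worker
        if mailbox is not None:
            return None
        extra = threading.Thread(target=task)  # nobody idle: start a fresh worker
        extra.start()
        return extra

    w = threading.Thread(target=worker)
    w.start()
    ready.wait()
    extra = submit(started.set)
    prompt = started.wait(timeout)  # False if the task is stranded behind the join
    join_done.set()                 # let the pending join finish
    w.join()
    if extra is not None:
        extra.join()
    return prompt


print(task_starts_promptly(reap_first=False))  # False
print(task_starts_promptly(reap_first=True))   # True
```

With the old ordering, the task is handed to a worker that is still inside the join and does not start until the join completes. With reaping done first, no worker is listed as idle during the join, so the toy pool starts a fresh thread and the task runs right away.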
@zanderso
Member

Update: The above change should fix @zengyun261's hang. @zengyun261, if you're able to verify the fix, that would be very helpful! @kaendfinger, we think it's likely that you were hitting the same problem, so it might be worthwhile seeing if it's gone after the change. This change will be in the next dev release. You can also grab a recent bleeding-edge build, e.g. from here: https://gsdview.appspot.com/dart-archive/channels/be/raw/138232/sdk/dartsdk-windows-x64-release.zip

@azenla
Contributor Author

azenla commented May 18, 2016

@zanderso Sounds good, we will do some testing and verify the issue is no longer present.

@julemand101
Contributor

This seems to have fixed the random hangs I had on my Windows 8 machine. My WebSocket server has now been running for over 24 hours (before the fix, it would hang after a few hours). I will let the server run over the weekend and see if it is still running on Monday.

@azenla
Contributor Author

azenla commented May 20, 2016

@zanderso Things are looking better for us as well.

@zanderso
Member

Great! I filed an issue to get the fix merged into the stable channel.

@mit-mit
Member

mit-mit commented May 23, 2016

The next stable release is roughly a week from now -- can the fix in stable wait until then?

whesse pushed a commit that referenced this issue May 24, 2016
@mit-mit
Member

mit-mit commented May 25, 2016

Closing this since, as per the above, the fix is in. It will be available in 1.17 stable, scheduled for next week.

@mit-mit mit-mit closed this as completed May 25, 2016
@mit-mit mit-mit added this to the 1.17 milestone May 25, 2016
@zanderso
Member

Reopening to wait for customers to try the 1.16 patch release when it is available.

@zanderso zanderso reopened this May 25, 2016
@zanderso zanderso removed this from the 1.17 milestone May 25, 2016
@mit-mit
Member

mit-mit commented May 25, 2016

@zanderso per my comment two days ago, we are not expecting to do a 1.16 stable patch release, but rather to release this in the first 1.17 stable build, scheduled for next week. Does that not work?

@zanderso
Member

@mit-mit no one responded to your question in the affirmative, so I assumed we were still doing a patch release. I don't believe this change was cherry-picked into the dev branch, so I don't believe it will show up in 1.17 stable unless we patch it into 1.16 stable. @whesse please confirm that you are still working on a 1.16 patch release?

@zanderso
Member

1.16.1 stable has been released with this fix. @kaendfinger if you could give it a shot that'd be great. Thanks!

@mit-mit mit-mit closed this as completed Jun 1, 2016
@zanderso zanderso reopened this Jun 1, 2016
@zanderso
Member

zanderso commented Jun 1, 2016

@mit-mit Thanks for your concern about this issue. I will close it myself after @kaendfinger verifies that the 1.16.1 release does not exhibit the problem.

@zanderso
Member

zanderso commented Jun 8, 2016

@kaendfinger reported offline that there have been no hangs since switching to 1.16.1, so I will close.

@zanderso zanderso closed this as completed Jun 8, 2016
7 participants