Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refrigerate our threads for later reuse #15424

Merged
merged 23 commits into from
Jul 19, 2023

Conversation

zadjii-msft
Copy link
Member

@zadjii-msft zadjii-msft commented May 24, 2023

This is my proposed solution to #15384.

Basically, the issue is that we cannot ever close a DesktopWindowXamlSource ("DWXS"). If we do, then any other thread that tries to access XAML metadata will explode, which happens frequently. A DWXS is inextricably linked to an HWND. That means we have to not only reuse DWXS's, but the HWNDs themselves. XAML also isn't agile, so we've got to keep the thread that the DWXS was started on alive as well.

To do this, we're going to introduce the ability to "refrigerate" and "reheat" window threads.

  • A window thread is "hot" if it's actively got a window, and is pumping window messages, and generally, is a normal thing.
  • When a window is closed, we need to "refrigerate" it's WindowThread and IslandWindow. WindowEmperor will take care of tracking the threads that are refrigerated.
  • When a new window is requested, the Emperor first try to "reheat"/"microwave" a refrigerated thread. When a thread gets reheated, we'll create a new AppHost (and TerminalWindow/Page), and we'll use the existing IslandWindow for that instance.

The metaphor is obviously ridiculous, but you get it so who cares.

In this way, we'll keep all the windows we've ever created around in memory, for later reuse. This means that the leak goes from (~12MB x number of windows closed) to (~12MB x maximum number of simultaneously open Terminal windows). It's still not good.

We won't do this on Windows 11, because the bug that is the fundamental premise of this issue is fixed already in the OS.

I'm not 100% confident in this yet.

  • There's still a d3d leak of some sort on exit in debug builds. (maybe D2D DEBUG ERROR - Memory leaks detected #15306 related)
    • havent seen this in a while. Must have been a issue in an earlier revision.
  • I need to validate more on Windows 11
    • BAD: Closing the last tab on Windows 11 doesn't close the window
    • BAD: Closing a window on Windows 11 doesn't close the window - it just closes the one tab item and keeps on choochin'
    • BAD: Close last tab, open new one, attempt to close window - ALL windows go *poof*. Cause of course. No break into post-mortem either.
  • more comments
  • maybe a diagram
  • Restoring windows is at the wrong place entirely? I once reopened the Terminal with two persisted windows, and it created one at 0,0
  • Remaining code TODO!s: 0 (?)
  • "warm-reloading" useTabsInTitlebar (change while terminal is running after closing a window, open a new one) REALLY doesn't work. Obviously restores the last kind of window. Yike.

is all about #15384

closes #15410 along the way. Might fork that fix off.

@github-actions

This comment has been minimized.

@zadjii-msft zadjii-msft marked this pull request as draft May 24, 2023 21:50
@zadjii-msft zadjii-msft changed the title Manually close the ContentDialog in teardown Refrigerate our threads for later reuse May 24, 2023
if (auto self{ weakThis.lock() })
{
auto lockedWindows{ self->_windows.lock() };
lockedWindows->push_back(window);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This implements LIFO apparently; I suppose that is more efficient with LRU paging than FIFO.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm gonna be totally honest - I didn't think that much about this. I figured that with the frequency that windows are opened/closed, this is highly unlikely to be anywhere near a hot path.

@microsoft-github-policy-service microsoft-github-policy-service bot added the Issue-Bug It either shouldn't be doing this or needs an investigation. label May 25, 2023
@github-actions

This comment has been minimized.

DHowett pushed a commit that referenced this pull request May 26, 2023
…5451)

This is a resurrection of #5629. As it so happens, this crash-on-exit
was _not_ specific to my laptop. It's a bug in the XAML platform
somewhere, only on Windows 10.

In #14843, we moved this leak into `becomeMonarch`. Turns out, we don't
just need this leak for the monarch process, but for all of them.

It's not a real "leak", because ultimately, our `App` lives for the
entire lifetime of our process, and then gets cleaned up when we do. But
`dtor`ing the `App` - that's apparently a no-no.

Was originally in #15424, but I'm pulling it out for a super-hotfix
release.


Closes #15410

MSFT:35761869 looks like it was closed as no repro many moons ago. This
should close out our hits there (firmly **40% of the crashes we've
gotten on 1.18**)
DHowett pushed a commit that referenced this pull request May 26, 2023
…5451)

This is a resurrection of #5629. As it so happens, this crash-on-exit
was _not_ specific to my laptop. It's a bug in the XAML platform
somewhere, only on Windows 10.

In #14843, we moved this leak into `becomeMonarch`. Turns out, we don't
just need this leak for the monarch process, but for all of them.

It's not a real "leak", because ultimately, our `App` lives for the
entire lifetime of our process, and then gets cleaned up when we do. But
`dtor`ing the `App` - that's apparently a no-no.

Was originally in #15424, but I'm pulling it out for a super-hotfix
release.

Closes #15410

MSFT:35761869 looks like it was closed as no repro many moons ago. This
should close out our hits there (firmly **40% of the crashes we've
gotten on 1.18**)

(cherry picked from commit aa8ed8c)
Service-Card-Id: 89332890
Service-Version: 1.18
@zadjii-msft zadjii-msft marked this pull request as ready for review May 30, 2023 15:59
@github-actions

This comment has been minimized.

src/cascadia/WindowsTerminal/WindowEmperor.cpp Outdated Show resolved Hide resolved
src/cascadia/WindowsTerminal/WindowEmperor.cpp Outdated Show resolved Hide resolved
src/cascadia/WindowsTerminal/IslandWindow.cpp Outdated Show resolved Hide resolved
std::move(_warmWindow));

// raise the signal to unblock KeepWarm and start the window message loop again.
_microwaveBuzzer.notify_one();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just remember that C++20 added std::binary_semaphore and we could use it in place of a conditional variable. Instead of notify_one() it'd be release() and instead of wait() it'd be acquire().

fridge->push_back(window);
}
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I forgot to comment this:
I think we should see if we can move the things that both WindowEmperor and WindowThread need access to into its own little struct. This way we would resolve the circular dependency between the two. Instead of "A <-> B" it'd be "A -> C & B -> C". It would allow us to move this code into WindowThread in some way and likely make this a bit easier to understand. (I struggled to understand how the Refrigerate/Microwaving flow works.)

Co-authored-by: Leonard Hecker <lhecker@microsoft.com>
Copy link
Member

@DHowett DHowett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

8/11

src/cascadia/WindowsTerminal/IslandWindow.cpp Outdated Show resolved Hide resolved
src/cascadia/WindowsTerminal/NonClientIslandWindow.cpp Outdated Show resolved Hide resolved
src/cascadia/WindowsTerminal/NonClientIslandWindow.cpp Outdated Show resolved Hide resolved
src/cascadia/WindowsTerminal/NonClientIslandWindow.h Outdated Show resolved Hide resolved
src/cascadia/WindowsTerminal/AppHost.cpp Show resolved Hide resolved
src/cascadia/WindowsTerminal/AppHost.cpp Outdated Show resolved Hide resolved
{
_window = std::make_unique<NonClientIslandWindow>(_windowLogic.GetRequestedTheme());
_window = std::move(window);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK so this was my fear. We always pop off whatever kind of window we're given, and we never make sure it's the right kind. That means that on Windows 11 if I change "Show Tabs in Titlebar" new windows will get it right (no refrigeration) and on Windows 10 it may get it wrong (it will microwave the wrong kind of window).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On a scale of 1-blocking, where do we rate this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would call it blocking for this PR.

Copy link
Member Author

@zadjii-msft zadjii-msft Jul 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

brainstorm

  • keep two drawers in the fridge, one for each type of window. Evaluate what's needed, and get one of those specifically.
    • seems insane
  • Totally leak the fridge when that setting changes
    • will make sure we make the right kind of window
    • seems insane
    • leaks a bunch of memory when the setting changes
  • Find some way to exorcise a DWXS out of a window when we change that setting
    • pretty sure the DWXS is HARD pinned to the HWND
    • yes it's HARD linked to the HWND IDesktopWindowXamlSourceNative::AttachToWindow:

      Make sure that your code calls the AttachToWindow method only once per DesktopWindowXamlSource object. Calling this method more than once for a DesktopWindowXamlSource object could result in a memory leak

    • ... goddard, options
    • Maybe we can create a new NCIW from a IW? and steal it's hwnd?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't we just pin the setting for the entire duration the application is running? That's how the setting is documented anyways right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

man I'd love that. That would sure be the easiest way.

Copy link
Member Author

@zadjii-msft zadjii-msft Jul 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Er, after discussion, pinning probably isn't the best, cause warm-reloading useTabsInTitlebar does work on Windows 11 (and 10) today and would break if we did that

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After discussing the impact with Mike, I'm choosing to not consider this blocking any longer.

It impacts the following scenario:

  1. Be on Win 10
  2. Open two windows
  3. Close one
  4. Change the "tabs in titlebar" setting
  5. Open a new one

src/cascadia/WindowsTerminal/AppHost.cpp Outdated Show resolved Hide resolved
src/cascadia/WindowsTerminal/AppHost.cpp Outdated Show resolved Hide resolved
@microsoft-github-policy-service microsoft-github-policy-service bot added the Needs-Author-Feedback The original author of the issue/PR needs to come back and respond to something label Jul 13, 2023
@microsoft-github-policy-service microsoft-github-policy-service bot removed the Needs-Author-Feedback The original author of the issue/PR needs to come back and respond to something label Jul 17, 2023
@DHowett
Copy link
Member

DHowett commented Jul 18, 2023

There are more unchecked TODOs in the PR body -- should we drive them down before merging?

@zadjii-msft
Copy link
Member Author

There are more unchecked TODOs in the PR body -- should we drive them down before merging?

I should more accurately update the PR body 😉

@zadjii-msft
Copy link
Member Author

Leaving notes here, because there's not another thread for "hot reload showTabsInTitlebar "

Notes that folks don't need to care about

If we could include NCIW.h in IW.cpp,

unique_ptr<IslandWindow> IslandWindow::ConvertToOther(requestedTheme)
{
    return std::move(std::make_unique<NonClientIslandWindow>(this, requestedTheme));
}

IslandWindow::IslandWindow(std::unique_ptr<IslandWindow>& oldWindow) noexcept :
    IslandWindow{ requestedTheme }
{
    _interopWindowHandle = std::move(oldWindow->_interopWindowHandle);
    _source = std::move(oldWindow->_source);
    // oldWindow is dead now. It should be freed by the caller.
}
NonClientIslandWindow::NonClientIslandWindow(std::unique_ptr<IslandWindow>& oldWindow, const ElementTheme& requestedTheme) noexcept :
    NonClientIslandWindow{ requestedTheme }
{
    _interopWindowHandle = std::move(oldWindow->_interopWindowHandle);
    _source = std::move(oldWindow->_source);
}

unique_ptr<IslandWindow> NonClientIslandWindow::ConvertToOther(requestedTheme) override
{
    return std::move(std::make_unique<IslandWindow>(this));
}

and then in apphost,

_window = std::move(_window.ConvertToOther(requestedTheme));

GAH but you STILL dont' know if you need to convert at all!

Like, you don't know if _useNonClientArea changed between that window and what we want to make, and we can't introspect that


hot reloading this is hard

Copy link
Member

@DHowett DHowett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm cool with this. Let's drive down the rest of those checkboxes.

Thanks for handling this, Mike!

src/cascadia/WindowsTerminal/AppHost.cpp Show resolved Hide resolved
src/cascadia/WindowsTerminal/WindowEmperor.cpp Outdated Show resolved Hide resolved
@zadjii-msft zadjii-msft enabled auto-merge (squash) July 19, 2023 15:18
@zadjii-msft zadjii-msft merged commit 7010626 into main Jul 19, 2023
14 of 15 checks passed
@zadjii-msft zadjii-msft deleted the dev/migrie/b/refrigerate-threads-for-later branch July 19, 2023 15:52
DHowett pushed a commit that referenced this pull request Sep 22, 2023
This is my proposed solution to #15384.

Basically, the issue is that we cannot ever close a
`DesktopWindowXamlSource` ("DWXS"). If we do, then any other thread that
tries to access XAML metadata will explode, which happens frequently. A
DWXS is inextricably linked to an HWND. That means we have to not only
reuse DWXS's, but the HWNDs themselves. XAML also isn't agile, so we've
got to keep the `thread` that the DWXS was started on alive as well.

To do this, we're going to introduce the ability to "refrigerate" and
"reheat" window threads.
* A window thread is "**hot**" if it's actively got a window, and is
pumping window messages, and generally, is a normal thing.
* When a window is closed, we need to "**refrigerate**" it's
`WindowThread` and `IslandWindow`. `WindowEmperor` will take care of
tracking the threads that are refrigerated.
* When a new window is requested, the Emperor first try to
"**reheat**"/"**microwave**" a refrigerated thread. When a thread gets
reheated, we'll create a new AppHost (and `TerminalWindow`/`Page`), and
we'll use the _existing_ `IslandWindow` for that instance.

<sub>The metaphor is obviously ridiculous, but _you get it_ so who
cares.</sub>

In this way, we'll keep all the windows we've ever created around in
memory, for later reuse. This means that the leak goes from (~12MB x
number of windows closed) to (~12MB x maximum number of simultaneously
open Terminal windows). It's still not good.

We won't do this on Windows 11, because the bug that is the fundamental
premise of this issue is fixed already in the OS.

I'm not 100% confident in this yet.
* [x] There's still a d3d leak of some sort on exit in debug builds.
(maybe #15306 related)
* havent seen this in a while. Must have been a issue in an earlier
revision.
* [x] I need to validate more on Windows 11
* [x] **BAD**: Closing the last tab on Windows 11 doesn't close the
window
* [x] **BAD**: Closing a window on Windows 11 doesn't close the window -
it just closes the one tab item and keeps on choochin'
* [x] **BAD**: Close last tab, open new one, attempt to close window -
ALL windows go \*poof\*. Cause of course. No break into post-mortem
either.
* [x] more comments
* [ ] maybe a diagram
* [x] Restoring windows is at the wrong place entirely? I once reopened
the Terminal with two persisted windows, and it created one at 0,0
* [x] Remaining code TODO!s: 0 (?)
* [ ] "warm-reloading" `useTabsInTitlebar` (change while terminal is
running after closing a window, open a new one) REALLY doesn't work.
Obviously restores the last kind of window. Yike.

is all about #15384

closes #15410 along the way. Might fork that fix off.

(cherry picked from commit 7010626)
Service-Card-Id: 89927087
Service-Version: 1.18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Issue-Bug It either shouldn't be doing this or needs an investigation.
Projects
Development

Successfully merging this pull request may close these issues.

Additional instances of WT crash on exit
4 participants