Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce text layout CPU usage when DWrite analysis is not needed #2959

Merged
merged 2 commits into from
Oct 17, 2019

Conversation

skyline75489
Copy link
Collaborator

Summary of the Pull Request

It takes many CPU cycles to calculate text layout & font fallback even it's not really necessary. This may seem a little aggresive but it indeed dramatically boost performance.

References

#806

PR Checklist

  • CLA signed. If not, go over here and sign the CLA
  • Tests added/passed
  • Requires documentation to be updated
  • I've discussed this with core contributors already. If not checked, I'm ready to accept this work might be rejected in favor of a different grand plan.

Detailed Description of the Pull Request / Additional comments

With the help of #2937 and this PR, vim performance is way more smooth than before.

@skyline75489 skyline75489 changed the title Reduce text layou CPU usage when run is whitespace and alphanum only Reduce text layout CPU usage when run is whitespace and alphanum only Sep 29, 2019
@skyline75489 skyline75489 changed the title Reduce text layout CPU usage when run is whitespace and alphanum only Reduce text layout CPU usage when run is whitespace or ASCII only Sep 30, 2019
@DHowett-MSFT
Copy link
Contributor

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@zadjii-msft zadjii-msft added Area-Rendering Text rendering, emoji, complex glyph & font-fallback issues Area-Performance Performance-related issue labels Oct 1, 2019
@skyline75489 skyline75489 force-pushed the fix/textLayoutPerf branch 2 times, most recently from 67b047d to e566a4a Compare October 2, 2019 12:24
@skyline75489
Copy link
Collaborator Author

Seems something wrong with the format. But I can't really tell what it is.

@zadjii-msft
Copy link
Member

@skyline75489 It's probably the all_of call, if I'm spitballing. Running tools\runformat in the root of the repo should fix it for you though :)

@skyline75489
Copy link
Collaborator Author

@zadjii-msft Thanks! I'll give it a try.

This PR is crucial almost the entire "make cmatrix great again" movement. Would love to see it merged.

Copy link
Member

@zadjii-msft zadjii-msft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks reasonable to me, but I don't think I want this in w/o @miniksa's sign off on it, since he knows the renderer best.

// Perform our custom font fallback analyzer that mimics the pattern of the real analyzers.
RETURN_IF_FAILED(_AnalyzeFontFallback(this, 0, textLength));
const bool whiteSpacesOnly = std::all_of(
_text.begin(), _text.end(), [](wchar_t c) noexcept {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity, can we just pass &iswspace as the function? (my c++17-fu is not strong so this is mostly an educational question).

(and the same below)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeal it could be &iswspace. It's just initially the condition here is a little more than a simple function. So I left it here.

I'd also love to here from some one who knows this more. I'm afraid some of the optimization could be to aggresive. So no need to rush it.

@miniksa
Copy link
Member

miniksa commented Oct 11, 2019

  1. For the whitespace part:
  • Using the CRT function iswspace is uncomfortable to me because it is doing a locale detection in a way that is completely different and potentially out of sync from whatever locale settings we're using in the rest of DirectWrite.
  • We honestly shouldn't be dispatching draw requests for whitespace at all. This is the fault of
    FAIL_FAST_IF_FAILED(InvalidateAll());
    which I put in there to always force full frame drawing until we could just get things to work. Restore differential drawing to DirectWrite renderer #778 to restore differential drawing would remove that line and eliminate a grand majority of whitespace drawing through the existing dirty region tracking.
  1. For the ascii only part:
  • Again, not a fan of using the CRT functions.
  • If this method's really expensive, we can probably avoid it another way. I'm pretty sure _analyzer has a GetTextComplexity() method that would probably indicate to us whether or not we need to go into the whole expensive calculations or not. For that matter, the complexity function might be able to gate all of these calls.

I would prefer to always lean on further portions of the DirectWrite framework to analyze the text, determine complexity, or figure out which regions we can discard before calling an expensive function. I really don't want to mix in other technologies like the CRT here. If we don't know the right way to do something in DWrite, I'd rather lean on our internal contacts on those teams to help us figure it out than try to reverse it with a mix of other technologies.

Overall, though, I think this particular CPU concern is just majorly an artifact of #778 being incomplete.

@skyline75489
Copy link
Collaborator Author

@miniksa I agree with you on that #778 is the ultimate solution for overall performance. However, I'd argue that this optimization is still needed even #778 is complete. There are still many situations that requires entire screen redrawing(e.g. vim, cat, and less). Differential drawing won't optimize away the CPU burden with all those DWrite analysis.

I'm not a DWrite expert and I chose CRT on this one. Seems GetTextComplexity() also works great on this one.

@skyline75489 skyline75489 changed the title Reduce text layout CPU usage when run is whitespace or ASCII only Reduce text layout CPU usage when DWrite analysis is not needed Oct 12, 2019
@skyline75489
Copy link
Collaborator Author

The concern I have on this, is what will GetTextComplexity behave under more sophiscated condition. For example, different locale and font with ligatures support. If we can trust GetTextComplexity on this, I think it's safe to use it for a short cut.

@skyline75489
Copy link
Collaborator Author

I found a catch. If I open cmd.exe the Chinese character is not correctly rendered.

@skyline75489
Copy link
Collaborator Author

I added the check of text read length to eliminate the issue with cmd.exe.

Copy link
Member

@miniksa miniksa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm good with this now. Thanks!

Copy link
Contributor

@DHowett-MSFT DHowett-MSFT left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't realize that we backed out our complexity analysis. Yeah, this is awesome. Thank you!

@DHowett-MSFT DHowett-MSFT merged commit 6f7ad99 into microsoft:master Oct 17, 2019
@skyline75489 skyline75489 deleted the fix/textLayoutPerf branch October 21, 2019 07:01
@ghost
Copy link

ghost commented Oct 23, 2019

🎉Windows Terminal Preview v0.6.2951.0 has been released which incorporates this pull request.:tada:

Handy links:

@madig
Copy link

madig commented Oct 25, 2019

Hrm, I remember someone saying somewhere that Safari uses a similar trick to save on layouting and that it can lead to tricky edge-cases with font features. A terminal is probably less likely to deal with OpenType Layout data in fonts, so it might be okay 🤔

ghost pushed a commit that referenced this pull request Jun 1, 2020
## Summary of the Pull Request

As the title suggests, this PR will make CustomTextLayout skip glyph shaping analysis when the entire text is detected as simple.

## References

My main reference is [DirectX Factor - Who’s Afraid of Glyph Runs?](https://docs.microsoft.com/en-us/archive/msdn-magazine/2013/november/directx-factor-who%e2%80%99s-afraid-of-glyph-runs)

And also #2959

## PR Checklist
* [x] Closes @skyline75489's continuous drive for perf gainz. 
* [x] CLA signed. 
* [x] Manual tests.
* [x] Nah on docs.
* [x] Discussed with core contributors in this PR.

## Detailed Description of the Pull Request / Additional comments

This can be seen as a followup of #2959. The idea is the same: make use of simple text (which makes up I think 95% of all terminal text) as much as possible.

The performance boost is huge. Cacafire is actually on fire this time (and remember I'm using 4K!). The frame rate is also boosted since more CPU time can be used for actual drawing.

Before:

![图片](https://user-images.githubusercontent.com/4710575/82913277-b21c3c00-9fa0-11ea-8785-a14b347bbcbd.png)

After:

![图片](https://user-images.githubusercontent.com/4710575/82912969-4afe8780-9fa0-11ea-8795-92617dde822f.png)

## Validation Steps Performed

Manually validated.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area-Performance Performance-related issue Area-Rendering Text rendering, emoji, complex glyph & font-fallback issues
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants