Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce InsertAttrRuns CPU usage #2937

Merged
merged 1 commit into from
Oct 18, 2019

Conversation

skyline75489
Copy link
Collaborator

Summary of the Pull Request

Reduces CPU usage in ATTR_ROW::InsertAttrRuns.

References

#806

PR Checklist

  • Closes Cursor movement in Vim/Neovim is slower in Terminal #806
  • CLA signed. If not, go over here and sign the CLA
  • Tests added/passed
  • Requires documentation to be updated
  • I've discussed this with core contributors already. If not checked, I'm ready to accept this work might be rejected in favor of a different grand plan.

Detailed Description of the Pull Request / Additional comments

I myself have experienced the slow cursor movement described in #806 . I found one hotspot is in ATTR_ROW::InsertAttrRuns. It is called many times during screen redrawing. All those vector initializing and copying consumes lots of CPU cycles.

The idea is simple: exit as quickly as possible. The original author has already done this optimization when _list.size() is 1 or 2. I took a little further and aggresive way.

I added quick exit on these following condition

  • Insert single color that is the same as the left color:
AAAAABBBBBBBCCC
     ^
AAAAAABBBBBBCCC
  • Insert single color that is the same as surrouding ones
AAAAABBBBBBBCCC
       ^
AAAAABBBBBBBCCC

I didn't change the documentation cuz I wanna hear some feedback from the team before moving on.

Validation Steps Performed

Open .vimrc with vim and see for youself. It's not really "buttery smooth" but it it whole lot better. The CPU usage is dropped by about 30%.

if (iStart == left->GetLength() && NewAttr == left->GetAttributes())
size_t lowerBound = 0;
size_t upperBound = 0;
for (int i = 0; i < _list.size(); i++)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

std::list::size() return size_t, so this should be:

for (size_t i = 0; i < _list.size(); i++)

@DHowett-MSFT
Copy link
Contributor

Hey there, thanks for working on this! The engineer who knows the most about this (read: the original author), @miniksa, is unfortunately going to be away for a couple business days. Just wanted to let you know that this is on our radar, but feedback may be slow.

@skyline75489
Copy link
Collaborator Author

@DHowett-MSFT Thanks for the heads up. I'll just dig a liittle deeper into this. Cheers for a better terminal 🍺

if (iStart == left->GetLength() && NewAttr == left->GetAttributes())
size_t lowerBound = 0;
size_t upperBound = 0;
for (auto i = 0; i < _list.size(); i++)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use of auto is good, elsewhere, but not in this case. To illustrate the point:

auto i = 0;  // deduces to int
auto i = 0u; // deduces to unsigned int 

The return type of std::list::size() is std::size_t. On x86 builds this is defined as unsigned int, but on x64 builds this is defined as unsigned long long int. So, when building x64, there is a difference of signed vs unsigned and 32-bit integer vs 64-bit integer.

When using auto (deducing to int) this is comparing an int with and unsigned long long int. If the size of the list is > MAX_INT (unlikely to be the case in this particular scenario) this will become an infinite loop.

To prevent unwanted surprises, this is best rewritten as:

for (size_t i = 0; i < _list.size(); i++)

or

for (std::size_t i = 0; i < _list.size(); i++)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I came from a Java/C# background and didn't realize this until now.

@skyline75489 skyline75489 changed the title Reduce InsertAttrRuns cpu hotspot Reduce InsertAttrRuns CPU usage Sep 29, 2019
@DHowett-MSFT
Copy link
Contributor

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@zadjii-msft zadjii-msft added Area-Output Related to output processing (inserting text into buffer, retrieving buffer text, etc.) Area-Performance Performance-related issue Product-Conhost For issues in the Console codebase labels Oct 1, 2019
@DHowett-MSFT
Copy link
Contributor

You will probably need to merge in master to get the CI pipeline to work again. Sorry about that!

@miniksa
Copy link
Member

miniksa commented Oct 11, 2019

@skyline75489, I love the idea and adding the two new cases.

I'd just want you to:

  1. Update the comments in and around the method to very verbosely describe how the method is working and what the cases that it is looking for are
  2. Add tests for your new cases to AttrRowTests in the Host.Tests.Unit library.

Thanks!

@skyline75489
Copy link
Collaborator Author

@miniksa Glad to hear it from you.

I've checked AttrRowTests and I think that this PR is already covered by TestInsertAttrRunsSingle, if I'm not mistaken. This PR does not break the tests, though. And I'm not sure what I could do now with the tests.

else if (_list.size() == 2 && newAttrs.at(0).GetLength() == 1)
// .. otherwise if we internally have a list of 2 or more and we're about to insert a single color
// it's possible that we just walk left-to-right through the row and find a quick exit.
else if (iStart > 0 && iStart == iEnd)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The condition here is, I think, the same as the previous one. But the performance is a lot better because we don't have to access those vector things. TBH I'm surprised by the slowness of vector, even though all you need is just the size().

@miniksa Is there anything that could possibly be broken by the change?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we stick to the old way, we would be using (_list.size() >= 2 && newAttrs.at(0).GetLength() == 1), just to be clear.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's fine as long as it works. I can't think of anything offhand that would affect the behavior here, so as long as the tests pass and you're satisfied with it... we'll take it and see what happens.

@skyline75489
Copy link
Collaborator Author

Come to think about it, this PR doesn't really add new cases. It simply broaden the original optimization to make it work when _list.size() is larger than 2. What used to work will still work. What used to be slower is now faster.

@skyline75489
Copy link
Collaborator Author

Houston, we got a problem. Can someone call /azp to rerun this?

@miniksa
Copy link
Member

miniksa commented Oct 18, 2019

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@miniksa
Copy link
Member

miniksa commented Oct 18, 2019

If that x64 one doesn't get unstuck with this run, we'll force override it when we have two sign-offs.

@miniksa miniksa added this to the Terminal-1910 milestone Oct 18, 2019
else if (_list.size() == 2 && newAttrs.at(0).GetLength() == 1)
// .. otherwise if we internally have a list of 2 or more and we're about to insert a single color
// it's possible that we just walk left-to-right through the row and find a quick exit.
else if (iStart > 0 && iStart == iEnd)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's fine as long as it works. I can't think of anything offhand that would affect the behavior here, so as long as the tests pass and you're satisfied with it... we'll take it and see what happens.

@carlos-zamora carlos-zamora merged commit 60b94a4 into microsoft:master Oct 18, 2019
@ghost
Copy link

ghost commented Oct 23, 2019

🎉Windows Terminal Preview v0.6.2951.0 has been released which incorporates this pull request.:tada:

Handy links:

ghost pushed a commit that referenced this pull request Apr 29, 2020
<!-- Enter a brief description/summary of your PR here. What does it fix/what does it change/how was it tested (even manually, if necessary)? -->
## Summary of the Pull Request

In tonight's episode of "I wanna my CPU back", we'll see a quite familiar face whose name is Mr.AttrRow.

<!-- Other than the issue solved, is this relevant to any other issues/existing PRs? --> 
## References

#2937

<!-- Please review the items on the PR checklist before submitting-->
## PR Checklist
* [ ] Closes #xxx
* [ ] CLA signed. If not, go over [here](https://cla.opensource.microsoft.com/microsoft/Terminal) and sign the CLA
* [ ] Tests added/passed
* [ ] Requires documentation to be updated
* [ ] I've discussed this with core contributors already. If not checked, I'm ready to accept this work might be rejected in favor of a different grand plan. Issue number where discussion took place: #xxx

<!-- Provide a more detailed description of the PR, other things fixed or any additional comments/features here -->
## Detailed Description of the Pull Request / Additional comments

I knew this is possible a long time ago. Just didn't got the chance to actually implement this. I understand that you guys are busy preparing the v1.0 release. So if this is a bad time, this can wait.

<!-- Describe how you validated the behavior. Add automated tests wherever possible, but list manual validation steps taken as well -->
## Validation Steps Performed

Manually tested. If all goes well, nothing will be broken.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area-Output Related to output processing (inserting text into buffer, retrieving buffer text, etc.) Area-Performance Performance-related issue Product-Conhost For issues in the Console codebase
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Cursor movement in Vim/Neovim is slower in Terminal
6 participants