Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Made loop bound checking in hardware intrinsics more efficient #994

Closed
wants to merge 4 commits into from

Conversation

cTxplorer
Copy link

@cTxplorer cTxplorer commented Sep 22, 2018

Changed style to make loop bound checking more efficient.

Closes #835

@dnfclas
Copy link

dnfclas commented Sep 22, 2018

CLA assistant check
All CLA requirements met.

Copy link
Member

@codemzs codemzs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have benchmark numbers with and without this change?

@cTxplorer
Copy link
Author

@codemzs No, I don't have benchmark numbers.

@danmoseley
Copy link
Member

@cTxplorer thanks for doing this.

It may be that the JIT cannot currently even figure out that pEnd is a constant during the loop, so this does not save any work. It would probably be better to extract out something like var pEndMinus8 = pEnd - 8 and then you can do while (pDstCurrent <= pEndMinus8).

That may be worth doing but benchmarking will say for sure. To benchmark you need to build for .NET Core 3.0 since this codepath is not used otherwise. Hopefully @eerhardt can help me with this.

After that instructions to run the benchmarks are here although again there will be some flag required to make sure it runs on .NET Core 3.0

Thanks for your patience (this is probably more complex than you expected)

cc @tannergooding

@danmoseley
Copy link
Member

Instructions for building on 3.0 are here: #1032

@shauheen
Copy link
Contributor

Thanks @cTxplorer for looking at this, do you have any plans to update this PR?

@cTxplorer
Copy link
Author

@shauheen Ready to help :)

@tannergooding
Copy link
Member

@cTxplorer, the merge conflicts need to be resolved and the same feedback made in #1177 applies here (which is a smaller, but related PR).

  • The primary feedback is that pDstEnd - <const> needs to be pulled out into a local to ensure that it isn't done once per loop iteration.

@danmoseley
Copy link
Member

@cTxplorer please also share benchmark results, as in #1177.

@danmoseley
Copy link
Member

@cTxplorer could you clarify the purpose of the rename (81401e9) ? It seems that was intended for a separate PR, associated with some other issue. (And if renamed, probably the file should be too). We would like to keep each PR for a single purpose to keep things clear. Thanks

@danmoseley
Copy link
Member

cc @jwood803

@tannergooding
Copy link
Member

@cTxplorer, going to close this for now since we haven't had any update in over 10 days.

If you would like to continue working on this, feel free to open a new PR that resolves the conflicts and we will be happy to help push it through.

@ghost ghost locked as resolved and limited conversation to collaborators Mar 28, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants