Add support for memory alignment #5931

Open
JeffCyr opened this issue May 24, 2016 · 12 comments
Labels: area-GC-coreclr, design-discussion, enhancement
Milestone: Future

@JeffCyr
Contributor

JeffCyr commented May 24, 2016

Some optimizations are not available to managed code in .NET because there is currently no way to enforce a memory alignment greater than the pointer size:

I have no idea if this is easy or hard in the current coreclr design, but it would be nice to have a MemoryAlignmentAttribute that could specify alignment minimally on class type and possibly on any class/struct/field.

My motivation for this feature is to implement an UnfairSemaphore (#2383) that isn't randomly inefficient on x86 when its 64-bit state crosses a cache line boundary.

I have created a gist to isolate the consequences of unaligned Interlocked:
https://gist.github.com/JeffCyr/9e162f440e30b567507cc95b6ba5a4a4

On my machine, an unaligned Interlocked operation can be 61x slower.
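A minimal sketch of the condition described above, assuming 64-byte cache lines (typical for current x86/x64 CPUs, but an assumption here). `CrossesCacheLine` and `CacheLineSize` are hypothetical names for illustration; they are not taken from the gist.

```csharp
using System;

// Assumption: 64-byte cache lines. Real hardware may differ.
const ulong CacheLineSize = 64;

// True when the byte range [address, address + size) spans two cache lines.
bool CrossesCacheLine(ulong address, int size) =>
    address / CacheLineSize != (address + (ulong)size - 1) / CacheLineSize;

// With 4-byte object alignment, an 8-byte field can start at any multiple of 4.
for (ulong address = 48; address <= 64; address += 4)
{
    Console.WriteLine($"address {address}: crosses = {CrossesCacheLine(address, 8)}");
}
```

Of the addresses tried, only 60 places the 8-byte value across a line boundary. An 8-byte-aligned address can never do so, which is why a stronger alignment guarantee would remove the slow case.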

category:proposal
theme:alignment
skill-level:expert
cost:large
impact:medium

@tannergooding
Member

@JeffCyr, what about [StructLayout(LayoutKind.Sequential, Pack = 16)] (types are in the System.Runtime.InteropServices namespace)?

@JeffCyr
Contributor Author

JeffCyr commented May 24, 2016

@tannergooding The Pack parameter won't affect the base address of the object.

@tannergooding
Member

I created a proposal on the CoreFX side here: https://github.com/dotnet/corefx/issues/22790

@JeffCyr
Contributor Author

JeffCyr commented Oct 11, 2017

It has been mentioned that this feature would require major changes to implement in the GC.

Do you think it would be worth having a global App.Config setting to force 8-byte alignment of all ref types for x86 processes running on an x64 OS?

This should be a lot simpler to implement, and it would resolve the unpredictable performance of Int64 in x86 processes (e.g. #4811).

@tannergooding
Member

As I understand it, @Maoni0 and @swgillespie are the GC people to tag on these issues.

@Maoni0
Member

Maoni0 commented Oct 11, 2017

if you want all objects to have a different alignment, that's trivial to implement: we have an Align function that enforces the alignment and is called everywhere the size of an object is calculated. it does introduce a perf penalty, though, as the alignment is no longer a const. if you want the alignment to be a property of a type (which is what FEATURE_STRUCTALIGN implements), that's certainly much more work (the implementation of FEATURE_STRUCTALIGN is incomplete right now), and it also has a perf penalty, as already pointed out on the other thread.

there needs to be a cost-benefit analysis.
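To make the Align step concrete, here is a minimal sketch of power-of-two size rounding. The names `AlignConst`, `AlignUp` and `DefaultAlignment` are hypothetical, and this is not the actual CoreCLR GC code. It only illustrates the difference between an alignment known at compile time and one supplied at run time.

```csharp
using System;

// Assumption: 4-byte (pointer-size) alignment for a 32-bit process.
const int DefaultAlignment = 4;

// Alignment is a compile-time constant, so the mask can be folded by the compiler.
int AlignConst(int size) => (size + (DefaultAlignment - 1)) & ~(DefaultAlignment - 1);

// Alignment is a run-time value (must be a power of two), so the mask is computed on each call.
int AlignUp(int size, int alignment) => (size + (alignment - 1)) & ~(alignment - 1);

Console.WriteLine(AlignConst(13));  // 13 rounded up to a multiple of 4
Console.WriteLine(AlignUp(20, 4));  // already a multiple of 4
Console.WriteLine(AlignUp(20, 8));  // padded to the next multiple of 8
```

With 8-byte alignment, a 20-byte object grows to 24 bytes while a 24-byte object is unchanged, which is where the per-object memory cost discussed in this thread comes from.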

@JeffCyr
Contributor Author

JeffCyr commented Oct 11, 2017

What about just changing the x86 alignment to 8 bytes instead of 4 bytes? The memory increase should be marginal, no? And since x86 processors don't really exist anymore, all x86 apps could perform better if the alignment matched the processor architecture.

@hanblee
Contributor

hanblee commented Oct 13, 2017

> What about just changing the x86 alignment to 8 bytes instead of 4 bytes?

This seems like overkill for what you are trying to achieve, and I don't think the memory increase would be marginal. Moreover, this would not help with the "Cache line alignment optimizations" goal listed above.

> And since x86 processors don't really exist anymore, all x86 apps could perform better if the alignment matched the processor architecture.

I don't follow this statement. For best performance, the recommendation is to align data on natural alignment boundaries.

@JeffCyr
Contributor Author

JeffCyr commented Oct 13, 2017

@hanblee

> This seems like overkill for what you are trying to achieve, and I don't think the memory increase would be marginal.

I don't see how changing to 8-byte alignment on x86 could increase memory usage significantly. The worst case is +4 bytes per object, so 4 MB per million objects.

> I don't follow this statement. For best performance, the recommendation is to align data on natural alignment boundaries.

I meant that nowadays, all x86 applications run on an x64 CPU. So if every object's base address is 8-byte aligned, all 8-byte types are guaranteed to be 8-byte aligned, matching the natural alignment of the underlying x64 CPU.

Anyway, you're right that this proposal doesn't address the original issue; this conversation could be continued in another issue.

@Maoni0
Member

Maoni0 commented Oct 13, 2017

> I don't see how changing to 8-byte alignment on x86 could increase memory usage significantly. The worst case is +4 bytes per object, so 4 MB per million objects.

the average size of objects on x86, according to analysis we did, was about 35 bytes, so a 4-byte increase is >10%. that is significant.

@Drawaes
Contributor

Drawaes commented Oct 17, 2017

Worst case, if it's 4-byte aligned and the law of large numbers kicks in, then it's 2 bytes on average, so ~5.7%, still possibly significant...
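The two estimates above can be reproduced with a short calculation. This is a sketch only: the 35-byte average is the figure quoted in the thread, and the 2-byte average padding assumes that object sizes are equally likely to be 0 or 4 bytes short of a multiple of 8. `OverheadPercent` is a hypothetical helper name.

```csharp
using System;

// Average x86 object size quoted in the thread, in bytes.
const double AverageObjectSize = 35.0;

// Relative growth, in percent, when each object gains extraBytes of padding.
double OverheadPercent(double extraBytes) => extraBytes / AverageObjectSize * 100.0;

// A size that is a multiple of 4 needs either 0 or 4 bytes to reach a multiple of 8.
// Assumption: both cases are equally likely, so the average padding is 2 bytes.
double averagePadding = (0 + 4) / 2.0;

Console.WriteLine($"worst case: {OverheadPercent(4):F1}%");
Console.WriteLine($"average:    {OverheadPercent(averagePadding):F1}%");
```

The worst case works out to roughly 11.4% and the average case to roughly 5.7%, matching the ">10%" and "~5.7%" figures in the comments.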

@msftgits transferred this issue from dotnet/coreclr Jan 30, 2020
@msftgits added this to the Future milestone Jan 30, 2020
@BruceForstall added the JitUntriaged label Oct 28, 2020
@kunalspathak added the area-GC-coreclr label and removed the area-CodeGen-coreclr and JitUntriaged labels Dec 22, 2022
@ghost

ghost commented Dec 22, 2022

Tagging subscribers to this area: @dotnet/gc
See info in area-owners.md if you want to be subscribed.
