A mechanism for specifying alignment on a field or struct should be supported. #22990

tannergooding · 2017-07-31T18:10:54Z

Rationale

In certain high performance or specialized data structures/algorithms, it is desirable to enforce an alignment for structs, fields, or locals.

Today, CoreFX provides several specialized data structures that either receive special alignment handling from the runtime (System.Numerics.Vector) or carry their own specialized padding (dotnet/corefx#22724).

As such, the framework/runtime should provide a mechanism for enforcing a specified alignment for structs and fields. Locals should also be included if that is feasible (I'm not sure whether that is readily possible today, given that attributes cannot be specified on locals).

Additional Thoughts

It might be worthwhile to additionally expose this on the existing StructLayoutAttribute as an Alignment property.

An alignment of 0 should be treated as "Automatic" (the current behavior of letting the runtime decide alignment).

A mechanism for aligning to the cache would be ideal (dotnet/corefx#22724 (comment)). This could perhaps be a special value that would otherwise be invalid (such as Alignment=-1). Other special alignments could also be allowed in a similar manner.

If a field specifies an alignment less than that of its struct type, it should be aligned to the alignment of that type. For example, if you specify Alignment=8 on a Vector4 field (Vector4 has an Alignment=16), the field should be treated as Alignment=16.
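
As a rough illustration of the two rules above (0 means "Automatic", and a field never ends up less aligned than its type), here is a minimal sketch in C. The name `effective_alignment` and its parameters are hypothetical, and this is not how the runtime computes layout today.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: resolve the alignment a field would receive.
 * `requested` is the value from the proposed Alignment property and
 * `natural` is the alignment the field's type already requires. */
size_t effective_alignment(size_t requested, size_t natural)
{
    /* The type's own alignment is assumed to be a non-zero power of two. */
    assert(natural != 0 && (natural & (natural - 1)) == 0);

    /* 0 means "Automatic": keep whatever the runtime would choose. */
    if (requested == 0)
        return natural;

    /* A request below the type's alignment is raised to the type's alignment. */
    return requested > natural ? requested : natural;
}
```

With these assumptions, requesting 8 on a type that needs 16 yields 16, while requesting 32 on the same type yields 32.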

[Design Decision] If a struct specifies an alignment less than that of its first field it should either:
A. Align the struct as specified and add the appropriate padding so that the first field is also aligned as specified
-or-
B. Align the struct as per the requirements of the first field

[EDIT] Make reference to the PR a link by @karelz


JonHanna · 2017-07-31T18:19:47Z

> This could perhaps be a special value that would otherwise be invalid (such as Alignment=-1). Other special alignments could also be allowed in a similar manner.

Perhaps -4 to mean 4 octets from the end of the cache line?

tannergooding · 2017-08-02T16:44:35Z

This probably deserves/requires input from some runtime folks as well, given that they have the best understanding of how determining alignment works today.

@karelz, do you know who should be tagged?

tannergooding · 2017-09-30T05:42:39Z

Going to tag @jkotas, @fiigii, and @mellinoe right now.

This will be very useful for ensuring the backing data structures are properly aligned when they are used in combination with the Hardware Intrinsics feature.

jkotas · 2017-10-01T16:02:32Z

There are two aspects of this:

1. Controlling field offset alignment within type
2. Controlling alignment when the storage for the type is allocated (on GC heap, on stack, ...)

Is this issue about 1, 2 or both?

tannergooding · 2017-10-01T16:10:02Z

The second (controlling alignment for the entire type).

If that is provided, the first can be achieved by aligning the whole type and using the appropriate field offset attributes on the individual members.

tannergooding · 2017-10-01T16:14:22Z

Although it does somewhat extend into both when dealing with types that have an alignment but are also members of another type. I mentioned some scenarios where this may come up in the original comment.

jkotas · 2017-10-01T17:43:13Z

The second is a GC feature. It is a non-pay-for-play GC feature at its core: the GC would need to look at the alignment of every object in various situations (which slows it down everywhere), but only a few situations benefit. It would need to be prototyped, and we would need to be convinced that it is a good tradeoff to make. There was some work done on this earlier; the code is under FEATURE_STRUCTALIGN ifdefs.

tannergooding · 2017-10-02T03:08:34Z

@jkotas, thanks for the reference (going to take a look through this when I have some time)!

I'm guessing the issue isn't in the first allocation, since that can be considered "trivial". That is, you just need to allocate, at most, Size + (Alignment - 1) bytes and return the first address with the correct alignment.
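
The over-allocation described above can be sketched in C roughly as follows. This is only an illustration of the arithmetic, not GC code; `alloc_aligned` and `raw_out` are hypothetical names, and the sketch assumes a power-of-two alignment and ignores overflow of `size + (alignment - 1)`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocate Size + (Alignment - 1) bytes and return the
 * first address inside the block that has the requested alignment.
 * The unadjusted pointer is returned through raw_out because that is the
 * pointer that must later be passed to free(). */
void *alloc_aligned(size_t size, size_t alignment, void **raw_out)
{
    /* Assumption: alignment is a non-zero power of two. */
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);

    void *raw = malloc(size + (alignment - 1));
    *raw_out = raw;
    if (raw == NULL)
        return NULL;

    /* Round the address up to the next multiple of alignment. */
    uintptr_t addr = (uintptr_t)raw;
    uintptr_t aligned = (addr + (alignment - 1)) & ~(uintptr_t)(alignment - 1);
    return (void *)aligned;
}
```

The caller works with the returned pointer and releases the block with `free(raw)`; at most `alignment - 1` bytes at the front are wasted.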

So, I think the hardest part for the GC probably comes into play when the heap is compacted or when objects are otherwise moved, since alignment limits where an object can be moved to. I'm wondering, however, if this can be done without bringing in too much cost.

I would think (possibly naively) that the GC would set a flag indicating whether an object is aligned (or maybe keep a separate tree containing these objects, or something similar). Most objects are not expected to be aligned, so they don't need anything else done. The few objects that are aligned need to be relocated to an address that is still aligned. The destination can be any free block between Size and Size + (Alignment - 1) bytes in length (Size is enough when the block starts at a perfectly aligned address, and Size + (Alignment - 1) covers a block with "worst case" alignment).

jkotas · 2017-10-02T04:17:57Z

That is the 10,000 ft view of how this may work. You can tell from the 37 FEATURE_STRUCTALIGN ifdefs left over in the GC from a previous attempt to implement this that it is not exactly trivial to implement. Also, I would expect that the implementation itself is not where most of the work would be; most of the work would be in both functional and performance testing.

tannergooding · 2017-10-20T06:55:12Z

I put a bit of thought into why the feature is requested (feel free to correct me if you disagree)...

On modern computers, unaligned reads/writes are (generally speaking) as fast as aligned reads/writes. The exception to this is when the load/store crosses a cache-line boundary (or worse, a page boundary).

Looking at the "Intel Optimization Manual", a load/store that crosses a cache-line boundary can take ~4.5x more cycles on modern CPUs and more on older (this is assuming I didn't miss a section that says something different for even newer processors).
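
To make the boundary condition concrete, here is a small C sketch that tests whether an access of a given size straddles a boundary. `crosses_boundary` is a hypothetical helper; passing 64 as the cache-line size or 4096 as the page size is an assumption about the target machine, not something the check discovers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns true when the byte range
 * [addr, addr + size) spans more than one `boundary`-sized block,
 * i.e. the first and last byte fall into different blocks. */
bool crosses_boundary(uintptr_t addr, size_t size, size_t boundary)
{
    assert(boundary != 0);
    if (size == 0)
        return false; /* an empty access touches nothing */

    uintptr_t first_block = addr / boundary;
    uintptr_t last_block = (addr + size - 1) / boundary;
    return first_block != last_block;
}
```

For example, a 16-byte load at address 48 stays inside one 64-byte line, while the same load at address 56 touches two lines. A 16-byte load from a 16-byte aligned address can never cross a 64-byte line, which is the guarantee the alignments listed below are meant to provide.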

The most commonly used alignments will likely be:

- 16 (SIMD128; SSE/SSE2)
- 32 (SIMD256; AVX, AVX2)
- 64 (SIMD512; AVX512)
- Cache Line Size (also beneficial for concurrent code; although I didn't touch on that in depth)
- Page Size (also beneficial for large blocks of memory, such as file reads; although I didn't touch on that in depth)

Other alignments (those between cache line size and page size), as far as I can tell, do not provide any real performance benefit. This is because there is no register which can read the data all at once and because it won't provide any additional guarantees of not crossing a cache-line or page boundary.

If getting the GC to support custom aligned types is hard (and we are not likely to get this feature any time soon), is there a reasonable workaround for the near or long term? For example:

- Providing a 'high-performance' API for allocating aligned blocks of memory not tracked by the GC
- Providing a set of 'high-performance' APIs for manual memory management (allocating/freeing/zeroing/copying heaps/pages/blocks/etc)
  - There are some issues with the existing memory management functions in the Marshal class, some of which are probably fixable

On the other hand, has any consideration been given to supporting custom aligned types, but with certain limitations? For example:

- custom alignment is supported, but only for specific sizes
- custom alignment is supported, but only for arrays
  - I can't, at least this late at night, think of any real world use-cases for heap allocated single-objects that require specific alignment, all the use cases that come to mind involve arrays and multiple reads/writes
  - For single value-type objects (if required), stack respected custom alignment would likely work, even if heap respected custom-alignment didn't exist
  - Only having stack respected custom alignment won't work for large arrays, since that can easily cause a stack overflow

vermorel · 2019-02-12T11:28:23Z

> custom alignment is supported, but only for arrays.

Yes, arrays are our only use case. Actually, we would not even need all arrays; gaining control over byte[] arrays alone would already be sufficient, thanks to MemoryMarshal.Cast.

saucecontrol · 2019-02-12T20:07:32Z

If https://github.com/dotnet/coreclr/issues/19936 is implemented, you'll at least be able to roll your own aligned buffers with the knowledge the GC will never move them.

benaadams · 2019-02-12T20:11:57Z

My interest would be for CMPXCHG16b with an object reference + tag type struct.

vermorel · 2019-02-13T09:26:44Z

@saucecontrol A memory mapped file will already give you aligned buffers. However, it's an IDisposable object to deal with. To make aligned memory convenient, we need support from the GC.

saucecontrol · 2019-02-13T18:45:45Z

The issue I linked is specifically about adding GC support. It doesn't handle the alignment, but it solves the problem of the GC potentially moving something after you've found an aligned section to work with.

There's also https://github.com/dotnet/corefx/issues/31787, which addresses aligned allocation of arrays.

Unknown6656 · 2024-12-18T08:27:14Z

What is the current state of this proposal?

tannergooding · 2024-12-18T15:04:18Z

Closed/inactive. It would require substantial GC work to support and is low priority.

Data is already naturally aligned according to the primitive elements it contains, and for things like arrays and large data processing there are often better ways to deal with the data (which you'll often have as part of your core algorithm anyway), such as opportunistic alignment.

If you'd like to manually manage memory, convenience APIs such as NativeMemory.AlignedAlloc exist, and most BCL APIs take Span, which allows you to transparently work with managed or native memory.
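
One common reading of "opportunistic alignment" is: handle a few leading elements one at a time until the pointer reaches the desired boundary, run the main loop on aligned data, then finish the tail. The C sketch below follows that reading. It is an assumption about what is meant, not code from the BCL, and the names `elements_until_aligned` and `sum_opportunistic` are hypothetical. It relies on the array already being naturally aligned for `float`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: how many leading floats must be processed one at a
 * time before `p` reaches an `alignment`-byte boundary (capped at count). */
size_t elements_until_aligned(const float *p, size_t count, size_t alignment)
{
    /* Assumptions: power-of-two alignment that is a multiple of the element
     * size, and a pointer that is already naturally aligned for float. */
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    assert(alignment % sizeof(float) == 0);
    assert(((uintptr_t)p % sizeof(float)) == 0);

    uintptr_t misalign = (uintptr_t)p & (alignment - 1);
    if (misalign == 0)
        return 0;
    size_t head = (alignment - misalign) / sizeof(float);
    return head < count ? head : count;
}

/* Sums `count` floats: scalar head, 16-byte aligned body, scalar tail. */
float sum_opportunistic(const float *data, size_t count)
{
    size_t i = 0;
    float total = 0.0f;

    /* Head: step one element at a time until data + i is 16-byte aligned. */
    size_t head = elements_until_aligned(data, count, 16);
    for (; i < head; i++)
        total += data[i];

    /* Body: each iteration reads one 16-byte aligned group of four floats,
     * which is where a SIMD load would go in a vectorized version. */
    for (; i + 4 <= count; i += 4)
        total += (data[i] + data[i + 1]) + (data[i + 2] + data[i + 3]);

    /* Tail: whatever is left over. */
    for (; i < count; i++)
        total += data[i];

    return total;
}
```

No special allocation is needed for this; the cost is a short scalar prologue of at most three elements for 16-byte alignment.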

msftgits transferred this issue from dotnet/corefx Jan 31, 2020

msftgits added this to the 5.0 milestone Jan 31, 2020

maryamariyan added the untriaged label Feb 23, 2020

jeffschwMSFT removed the untriaged label Feb 24, 2020

AaronRobinsonMSFT modified the milestones: 5.0, Future May 14, 2020

ddobrev mentioned this issue Nov 7, 2020

A structure with a virtual table and a double field incorrectly marshalled with a sequential layout #44378

Closed

jkoritzinsky added this to AppModel Jul 28, 2023

dotnet-policy-service bot added the backlog-cleanup-candidate and no-recent-activity labels Nov 13, 2024

dotnet-policy-service bot removed this from the Future milestone Nov 27, 2024

dotnet-policy-service bot closed this as completed Nov 27, 2024

dotnet-policy-service bot removed the no-recent-activity and backlog-cleanup-candidate labels Dec 18, 2024

github-actions bot locked and limited conversation to collaborators Jan 18, 2025
