Reader: support union types #275
Comments
The general idea is as follows. Within each overlap set, we look to see if any of the field types are or contain GC references. If not, we remove all of the fields from the original collection and replace them with a single int8[] field that has the same extent and a null handle. If any field type is or contains a GC reference, we recursively expand any structs containing GC references until we have only non-GC types and GC references. We verify that the GC references all line up and aren't overlapped by other fields. We then split the union range into sub-ranges around each set of GC references, and report int8[] fields for the non-GC parts and object* for the GC parts. (Note that all this is done mainly so the LLVM type accurately describes the location of the GC references.) All these new fields are given null handles, and the original set of overlapping fields is removed from the field collection. Later on when looking up an LLVM type index by field handle via the |
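As a rough illustration of the grouping step, here is a minimal sketch of how fields could be collected into overlap sets by sorting on offset and merging byte ranges that intersect. The names `Field`, `OverlapSet`, and `findOverlapSets` are hypothetical and are not taken from the actual reader; the real implementation works with field handles and LLVM types rather than bare offset/size pairs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical field record: byte offset and size within the enclosing type.
struct Field {
  uint32_t Offset;
  uint32_t Size;
};

// A run of fields whose byte ranges overlap, covering [Begin, End).
struct OverlapSet {
  uint32_t Begin;
  uint32_t End;
  std::vector<Field> Members;
};

// Groups fields into maximal runs of overlapping byte ranges and returns
// only the runs with more than one member (the "union" parts of the type).
std::vector<OverlapSet> findOverlapSets(std::vector<Field> Fields) {
  std::sort(Fields.begin(), Fields.end(),
            [](const Field &A, const Field &B) { return A.Offset < B.Offset; });

  std::vector<OverlapSet> Runs;
  for (const Field &F : Fields) {
    assert(F.Size > 0 && "this sketch does not handle zero-sized fields");
    uint32_t FieldEnd = F.Offset + F.Size;
    if (!Runs.empty() && F.Offset < Runs.back().End) {
      // The field starts inside the current run, so it joins that run.
      Runs.back().End = std::max(Runs.back().End, FieldEnd);
      Runs.back().Members.push_back(F);
    } else {
      // The field starts at or past the end of the current run.
      OverlapSet Run{F.Offset, FieldEnd, {}};
      Run.Members.push_back(F);
      Runs.push_back(std::move(Run));
    }
  }

  // Discard runs holding a single field; those are ordinary fields.
  std::vector<OverlapSet> Result;
  for (OverlapSet &Run : Runs)
    if (Run.Members.size() > 1)
      Result.push_back(std::move(Run));
  return Result;
}
```

Under these assumptions, three fields at (offset 0, size 16), (offset 0, size 8), and (offset 8, size 4) form one overlap set covering bytes 0 through 15, while a further field at offset 16 stays outside it.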
Closes dotnet#275. Types with explicit layout may have fields that overlap one another (aka union types). Before this change we'd fail to compile any method that referenced one of these union types. We now model them as best we can using LLVM types.

LLVM doesn't provide a strong way to describe unions. Instead we generally provide a byte array that covers the extent of the union as a representative placeholder. That means for some fields there is no exact LLVM type counterpart. Downstream consumers cope with this as follows: for any field within the extent of a set of overlapping fields, we omit that field handle from the `FieldIndexMap`, so that when a ldfld or similar goes to find out what LLVM type index to use, it comes up empty-handed. We already had a fall-back path here to simply use the EE-provided field offset when this happens, and so now that path kicks in for accesses to overlapping fields, and subsequent pointer casts then fix up the types properly.

However things are not quite that simple. We also want the LLVM type to fully describe the location of any GC references within the type, and it is valid for GC references to appear in one of these overlap sets. GC references must fully overlap one another and can't be safely overlapped with non-GC data (see ECMA-335, II.10.7). This means the extents of the GC references in a union partition the union into ranges of either non-GC data or GC references. We report the former using the byte arrays, and the latter as object references.
Thus for instance a type `C` like

```
[StructLayout(LayoutKind.Explicit)]
public struct A
{
    [FieldOffset(0)] public string Name;
    [FieldOffset(8)] public int Size;
}

[StructLayout(LayoutKind.Explicit)]
public struct C
{
    [FieldOffset(0)] public A X;
    [FieldOffset(0)] public string Name;
    [FieldOffset(8)] public int Size;
}
```

would on x64 be described as

```
type <{ %System.Object addrspace(1)*, [8 x i8] }>
```

This special handling for GC references is strictly only necessary for value types (since GC reference locations for value types on the stack must be reported at GC safe points) but for uniformity we do it for all types. We also continue to double-check value type GC locations against the GC pointer info provided by the EE.
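To make the partitioning concrete, the following is a minimal sketch of splitting a union's extent into GC and non-GC pieces. It is not the reader's actual code: `Piece` and `splitUnion` are hypothetical names, the GC reference offsets are assumed to have been collected already by expanding nested structs, and only the "GC references line up" check is modeled (overlap of a GC reference by non-GC fields is not checked here).

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical description of one piece of a flattened union.
struct Piece {
  uint32_t Offset;
  uint32_t Size;
  bool IsGcRef; // true: reported as an object reference; false: raw bytes
};

// Splits the union extent [Begin, End) around the given GC reference
// offsets. Each reference is assumed to occupy PointerSize bytes.
std::vector<Piece> splitUnion(uint32_t Begin, uint32_t End,
                              std::vector<uint32_t> GcOffsets,
                              uint32_t PointerSize) {
  std::sort(GcOffsets.begin(), GcOffsets.end());
  // References that fully overlap one another collapse into a single slot.
  GcOffsets.erase(std::unique(GcOffsets.begin(), GcOffsets.end()),
                  GcOffsets.end());

  std::vector<Piece> Pieces;
  uint32_t Cursor = Begin;
  for (uint32_t Off : GcOffsets) {
    // A reference must not start inside the previous one (partial overlap)
    // and must fit entirely within the union.
    assert(Off >= Cursor && Off + PointerSize <= End);
    if (Off > Cursor)
      Pieces.push_back({Cursor, Off - Cursor, false}); // bytes before the ref
    Pieces.push_back({Off, PointerSize, true});
    Cursor = Off + PointerSize;
  }
  if (Cursor < End)
    Pieces.push_back({Cursor, End - Cursor, false}); // trailing bytes
  return Pieces;
}
```

For the type `C` above, calling `splitUnion(0, 16, {0, 0}, 8)` (one offset for `X.Name`, one for `Name`) yields a GC reference at offset 0 followed by 8 raw bytes, which corresponds to the `<{ %System.Object addrspace(1)*, [8 x i8] }>` shape shown. With no GC offsets at all, the result is a single byte-array piece covering the whole extent.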
Union types aren't common but can be created with explicit layout. Right now we bail out on any method that refers to a union.
See this TODO.