Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: support an enum constraint on generic type parameters #262

Closed
gafter opened this issue Feb 5, 2015 · 26 comments
Closed

Proposal: support an enum constraint on generic type parameters #262

gafter opened this issue Feb 5, 2015 · 26 comments

Comments

@gafter
Copy link
Member

gafter commented Feb 5, 2015

Allow the constraints on a type parameter to include enum. Such a constrained type parameter could be instantiated with an enum type or with another type parameter that is so constrained.

This probably needs more design work.

@mikedn
Copy link

mikedn commented Feb 5, 2015

For reference: long, too long unfortunately, discussion from codeplex: http://roslyn.codeplex.com/discussions/572464

@HaloFour
Copy link

HaloFour commented Feb 5, 2015

Ultimately what we could distill from that is that thread is the following:

  1. The CLR supports three forms of enum constraints:
    1. System.Enum by itself which allows the generic type argument to be either System.Enum or a specific enum type
    2. System.Enum and class, which forces the generic type to be only System.Enum
    3. System.Enum and struct, which forces the generic type to be a specific enum type
  2. Supporting operators like & or | would not be possible given that the IL generated would have to be differently depending on the underlying type of the enum.
  3. F# supports an enum constraint that also constrains the underlying type to a specific integral type which would allow this functionality but it is not supported by the CLR.
type Class9<'T when 'T : enum<uint32>> =

In my opinion out of the three potential CLR constraint combinations 1.iii. is the one that would be the most useful. I don't know how one would write a generic method or type accepting System.Enum in place of a real enum type and what use that would be. Of course 1.ii. just seems completely useless.

Unfortunately, without support for 2.i. the use cases for this feature in general are mostly for the purposes of reflection. Generally improved versions of existing CLR functions, such as parsing the enum from a string, which is something that I have written myself to be both faster and more capable (e.g. can parse a flags value from a comma- or pipe-separated string.)

Given that these scenarios largely involve reflection that does pose a small problem if compiling to .NET Native as some of the existing methods of Enum are not friendly with .NET Native, specifically Enum.GetValues(Type) as internally it attempts to create a generic array. There are other less efficient ways to get to the values, such as calling Enum.GetNames(Type) and enumerating over that using Enum.Parse(Type, string).

@gafter
Copy link
Member Author

gafter commented Mar 2, 2015

@HaloFour
Copy link

I was toying with .NET Native in Visual Studio 2015 CTP6 and it seems that the issue with reflection and enums are no longer a problem as Type.GetEnumValues() no longer throws due to the call to Array.UnsafeCreateInstance.

@Ultrahead
Copy link

Another reference: http://roslyn.codeplex.com/discussions/543871

@Ultrahead
Copy link

Will this feature be included in C# 7?

@gafter
Copy link
Member Author

gafter commented Apr 1, 2016

It isn't on any of our published plans for C# 7.

@Ultrahead
Copy link

So, the Ready label means that work on this feature can start in any moment, right?

@gafter
Copy link
Member Author

gafter commented Apr 7, 2016

Yes, design work on this feature can start whenever the language design team feels it is their highest priority.

@OndrejPetrzilka
Copy link

Supporting operators like & or | would not be possible given that the IL generated would have to be differently depending on the underlying type of the enum.

Are you sure? I'm currently using this and it works perfectly for all kinds of enums. Not sure that it covers all cases tho.

    // Methods  
    .method public hidebysig static 
        bool IsSet<valuetype .ctor ([mscorlib]System.Enum) T> (
            !!T 'value',
            !!T 'flags'
        ) cil managed aggressiveinlining
    {
        .custom instance void [mscorlib]System.Runtime.CompilerServices.ExtensionAttribute::.ctor() = (01 00 00 00)
        .maxstack 8
        IL_0000: ldarg.0
        IL_0001: ldarg.1
        IL_0002: and
        IL_0003: ldarg.1
        IL_0004: ceq
        IL_0006: ret
    } // end of method EnumExtensions::IsSet

@OndrejPetrzilka
Copy link

OndrejPetrzilka commented Nov 4, 2016

Hmm, so it's not valid IL to for example load ulongand return it as ShortEnum? Because this is what I'm also doing and it's working good. Am I correct that FromUInt64 method below just takes the right amount of bits from ulong and 'discards' the rest?

Why does arithmetic require knowing the type, because of signed overflows or negative numbers?

    .method public hidebysig static 
        !!T FromUInt64<valuetype .ctor ([mscorlib]System.Enum) T> (
            uint64 'value'
        ) cil managed aggressiveinlining
    {
        .maxstack 8
        IL_0000: ldarg.0
        IL_0001: ret
    } // end of method EnumExtensions::FromUInt64
    .method public hidebysig static 
        uint64 ToUInt64<valuetype .ctor ([mscorlib]System.Enum) T> (
            !!T 'value'
        ) cil managed aggressiveinlining
    {
        .maxstack 8
        IL_0000: ldarg.0
        IL_0001: conv.u8 // I can do this here, I know what to return
        IL_0002: ret
    } // end of method EnumExtensions::ToUInt64

For me, it would be incredibly helpful to have just the constraint available for C#, then I would be able to create "any" necessary methods in C# by using these two IL methods above. Without the constraint I can use these two methods, but I cannot create any other method in C#.

Further, if these two methods would be directly available (or explicit ulong cast operators to/from System.Enum) I wouldn't have to use any IL at all.

So yes, the constraint have value even when arithmetic operations are not available.
It has high value when ulong cast operators/methods would be available.

@TylerBrinkley
Copy link

I apologize for the shameless plug but I believe this may remove most needs for adding the enum constraint. I've just released version 1.0.0 of Enums.NET, a high-performance type-safe .NET enum utility library. I've added every enum operation that I thought could be applied to all enums or flag enums.

Enums.NET available on NuGet and is compatible with .NET Framework 2.0+ as well as .NET Standard 1.0+. I'd greatly appreciate any feedback on it. Thanks.

@iam3yal
Copy link

iam3yal commented Nov 5, 2016

@TylerBrinkley There's a major difference between a library and a language feature, some people cannot use 3rd-party libraries, sometimes it just looks awkward and detached from the code and so expressing it through the language result in a more legible code but otherwise, good job. :)

@PathogenDavid
Copy link
Contributor

PathogenDavid commented Nov 5, 2016

@OndrejPetrzilka
For the sake of clarity, I did not write the IL in my comment. That is coming from CSC.

Am I correct that FromUInt64 method below just takes the right amount of bits from ulong and 'discards' the rest?

I am fairly certain this code will cause some strange behaviors when the value returned is used if the value didn't fit in the enum's underlying type.

I don't have a chance to test it right now, but I am fairly certain the following code will return false with FromUInt64 using your implementation because the value on the stack is never truncated.

static bool TestOverflowedEnumValueIsTruncated()
{
    ulong ulongvalue = 0xFFFF0001; // Value is too big for a ShortEnum (underlying type ushort), value is 0xFFFF0000 | ShortEnum.BitZero without truncating
    ShortEnum shortEnumValue = FromUInt64<ShortEnum>(ulongvalue);
    return shortEnumValue == ShortEnum.BitZero; // We'd expect this to return true if the value was properly truncated
}

With optimizations turned on, the following IL is generated:

.method private hidebysig static bool  TestOverflowedEnumValueIsTruncated() cil managed
{
  // Code size       15 (0xf)
  .maxstack  8
  IL_0000:  ldc.i4     0xffff0001
  IL_0005:  conv.u8
  IL_0006:  call       valuetype EnumBitwiseMath.ShortEnum EnumBitwiseMath.Program::FromUInt64(uint64)
  IL_000b:  ldc.i4.1
  IL_000c:  ceq
  IL_000e:  ret
} // end of method Program::TestOverflowedEnumValueIsTruncated

Notice that there is no conv.u2 between the call to FromUInt64 and the ceq.

This means that ceq compares the result of FromUInt64 0xffff0001 (since no conv.u2 was performed) and 0x00000001 (ShortEnum.BitZero), which will result in a false result on the stack, which is not what we'd expect

This is because CSC expects the values of enums on the stack to always be truncated to the underlying type. This is cheaper than truncating it all of the time when it is used.

For instance, look at a (non-generic) implementation of FromUInt64 (return (ShortEnum)x;) as generated by CSC:

.method private hidebysig static valuetype EnumBitwiseMath.ShortEnum 
        FromUInt64(uint64 x) cil managed
{
  // Code size       3 (0x3)
  .maxstack  8
  IL_0000:  ldarg.0
  IL_0001:  conv.u2
  IL_0002:  ret
} // end of method Program::FromUInt64

Notice, the conversion happens immediately. It doesn't put the non-truncated version of x onto the stack like your implementation does.


Nevermind, I went braindead when I wrote the bit below.

Additionally, you wrote this in your ToUInt64:

IL_0001: conv.u8 // I can do this here, I know what to return

This is just another version of the same problem. This conversion will need to be different depending on the underlying type. Your code actually wouldn't work for ShortEnum since it is 16 bits.

To clarify: If I add a value to my ShortEnum BitFifteen = 1 << 15 and call your function as ToUInt64(ShortEnum.BitFifteen | ShortEnum.BitZero) I will get 0x0001 when I expected 0x8001


Why does arithmetic require knowing the type, because of signed overflows or negative numbers?

Overflows in general, yeah.

TL;DR: Your code doesn't cover all the edge cases.

@OndrejPetrzilka
Copy link

OndrejPetrzilka commented Nov 5, 2016

@PathogenDavid
TL;DR: Cases you mentioned work, are there any other edge cases?

EDIT: I'm wrong here, I talk about conv.u8, but actually used conv.i8 during tests

I just wrote a unit test for the case you mentioned and it passes (x86/x64, release/debug). My idea was that it should work, because return value is passed by value, thus it copies only 16 bits from ulong when returning the value.

I have no idea why it works on release with aggressive inlining, maybe the method is inlined by JIT, but it's clever enough to detect that it should still take just the lower bits? Or it's stored in 64-bit register anyway and when doing operations on it (comparison to ShortEnum.BitZero), it uses instruction which works on lowest 16 bits? (I have no understanding of x86/x64 assembler).

[Flags]
enum ShortEnum : short
{
    None = 0,
    BitZero = 1,
    BitFifteen = unchecked((short)(1 << 15)),
}

[TestMethod]
public void TestLargeCast()
{
    // Value is too big for a ShortEnum (underlying type ushort), value is 0xFFFF0000 | ShortEnum.BitZero without truncating
    ulong ulongvalue = 0xFFFF0001;
    ShortEnum shortEnumValue = EnumExtensions.FromUInt64<ShortEnum>(ulongvalue);

    // We'd expect this to return true if the value was properly truncated
    Assert.IsTrue(shortEnumValue == ShortEnum.BitZero);
    // Another test to make sure it's same as direct cast
    Assert.IsTrue(shortEnumValue == (ShortEnum)ulongvalue);
}

IL_0001: conv.u8 // I can do this here, I know what to return

This is just another version of the same problem. This conversion will need to be different depending on the underlying type. Your code actually wouldn't work for ShortEnum since it is 16 bits.

I think this use is okay, converting any numeric value on stack to ulong should be fine, no matter what. Here's another test which passes too. It proves that in case you mentioned, return value is same as returned by direct cast. It's not 0x8001 though, I don't know why it should, but as far as I'm concerned, important thing is that it casts same as direct cast.

[TestMethod]
public void TestCastNegative()
{
    ShortEnum value = ShortEnum.BitFifteen | ShortEnum.BitZero;
    ulong normalCast = unchecked((ulong)value);
    ulong toUint64Cast = value.ToUInt64();

    Assert.IsTrue(normalCast == toUint64Cast);
}

@PathogenDavid
Copy link
Contributor

@OndrejPetrzilka
I went completely braindead when I wrote the comment about the conv.u8, sorry about that. (I read "convert to unsigned 8-bit number" instead of 8-byte.)

As for everything else...

My idea was that it should work, because return value is passed by value, thus it copies only 16 bits from ulong when returning the value.

Good point, I'm not sure how CLR handles this. I would imagine it just uses EAX for the return value, but maybe it is smart enough to truncate it for you if it thinks/knows it needs to be. However, the fact that my release-mode FromUInt64 includes the conv.u2 implies to me that it doesn't. (Otherwise, CSC is adding an unnecessary instruction.)

Or it's stored in 64-bit register anyway and when doing operations on it (comparison to ShortEnum.BitZero), it uses instruction which works on lowest 16 bits?

I would bet this is what is happening. I bet the JITter generates a 16-bit compare for the comparison since it knows ShoreEnum is only 16-bits. (Glanced at CodeGen::genCompareInt - looks like it does.)

However, I'd consider this getting into undefined JIT behavior. I'd be worried about how this would behave on other architectures (which may or may not support 16-bit compares) or more naive JIT implementations (that just use word-size-compares everywhere.)

@mikedn
Copy link

mikedn commented Nov 5, 2016

Hmm, so it's not valid IL to for example load ulong and return it as ShortEnum?

Such IL is unverifiable. It's not clear if it's also invalid, the ECMA spec doesn't say nothing about this. It happens so that the JIT compiler inserts a cast automatically in this case.

IL_0001: conv.u8 // I can do this here, I know what to return

Yes, you can do that but it's not 100% correct. If the underlying type of the enum is signed then you need to use conv.i8 and not conv.u8. This means that a negative value such as -1 is converted to 0xFFFFFFFF instead of 0xFFFFFFFFFFFFFFFF.

Here's another test which passes too. It proves that in case you mentioned, return value is same as returned by direct cast. It's not 0x8001 though, I don't know why it should, but as far as I'm concerned, important thing is that it casts same as direct cast.

That test should fail. normalCast is 0xFFFFFFFFFFFF8001 and toUint64Cast is 0xFFFF8001.

I have no idea why it works on release with aggressive inlining, maybe the method is inlined by JIT, but it's clever enough to detect that it should still take just the lower bits?

I'm not sure what you mean by "works on release with aggressive inlining". Aggressive inlining has no effect in this case, the inliner always rejects FromUInt64 with reason = return type mismatch.

@OndrejPetrzilka
Copy link

OndrejPetrzilka commented Nov 6, 2016

@mikedn @PathogenDavid

ToUInt64

Sorry for confusion, you were indeed right, conv.u8 causes problems with negative values. The tests fail with conv.u8, but all tests passes when I change it to conv.i8 (tested positive and negative short, ushort, long, ulong)

Does conv.i8 cover all edge all cases?

FromUInt64

Aggressive inlining has no effect in this case, the inliner always rejects FromUInt64 with reason = return type mismatch.

I had no idea that it works this way.

Is this IL better for conversion from ulong to enum?

    .method public hidebysig static 
        !!T FromUInt64<valuetype .ctor ([mscorlib]System.Enum) T> (
            uint64 'value'
        ) cil managed aggressiveinlining
    {
        .locals (!!T V_0)
        IL_0001: ldloca.s 0 // Load addr of local
        IL_0000: ldarga.s 0  // Load addr of argument
        IL_0002: cpobj !!T // Treat argument as T and copy sizeof(T) number of bits
        IL_0003: ldloc.0  // Load and return local
        IL_0004: ret
    }

@mikedn
Copy link

mikedn commented Nov 6, 2016

Does conv.i8 cover all edge all cases?

I don't think so. Try an enum having underlying type uint and value 0x80000000. You'll likely end up with 0x0000000080000000 and 0xFFFFFFFF80000000.

Is this IL better for conversion from ulong to enum?

Depends on what you mean by "better":

  • this does inline so in this regard it is better
  • the code is still unverifiable/incorrect: The behavior is unspecified if the type of the location referenced by src is not assignable-to (§I.8.7.3) the type of the location referenced by dest.
  • Taking the address of a variable tends to block various optimizations. In this case the value ends up being stored to memory and then reloaded.

@OndrejPetrzilka
Copy link

OndrejPetrzilka commented Nov 6, 2016

@mikedn

I don't think so. Try an enum having underlying type uint and value 0x80000000. You'll likely end up with 0x0000000080000000 and 0xFFFFFFFF80000000.

You're right, this test fails

Depends on what you mean by "better":

I was concerned about second point (verifiable/correct). How's that different from making union in C# (explicit struct layout with field offsets), because I believe that it's verifiable/correct when it contains no reference types. In worst case (performance wise), I can implement similar approach in IL as well to make it verifiable. Or union wouldn't work in IL, because it does not allow generic type in it?

Lets say I don't care about cast to ulong, I want to get the bits to perform bit operation on them. So ulong GetBits(TEnum) and TEnum SetBits(ulong) would be enough (it would use conv.u8)

In such case I would get "wrong" number, but I'd be able to apply bit operations correctly and cast it back to enum (if we figure out verifiable way). Bit operations would be: And, Or, Xor, Not, Ceq. Since there's no Add/Sub, I don't have to care about overflow and conversions.

Or we can skip GetBits/SetBits completely and just implement bit operations in type-independent way. Because I think this is perfectly valid IL:

.method public hidebysig static 
    !!T Clear<[mscorlib]System.Enum) T> (!!T 'value', !!T 'flags') cil managed aggressiveinlining
{
    IL_0000: ldarg.0
    IL_0001: ldarg.1
    IL_0002: not  // Valid but not verifiable: "Expected I, I4, or I8 on the stack."
    IL_0003: and
    IL_0004: ret
} // end of method EnumExtensions::Clear

I just though it would be nice to have a way of getting/setting all the bits at once as ulong (even when it wouldn't represent correct negative number, just the bits).

@OndrejPetrzilka
Copy link

OndrejPetrzilka commented Nov 6, 2016

    .method public hidebysig static 
        !!T FromUInt64<valuetype .ctor ([mscorlib]System.Enum) T> (
            uint64 'value'
        ) cil managed aggressiveinlining
    {
        .locals (!!T V_0)
        IL_0001: ldloca.s 0 // Load addr of local
        IL_0000: ldarga.s 0  // Load addr of argument
        IL_0002: cpobj !!T // Treat argument as T and copy sizeof(T) number of bits
        IL_0003: ldloc.0  // Load and return local
        IL_0004: ret
    }

Verifiability:

The tracked types of the destination (dest) and source (src) values shall both be managed pointers (&) to values whose types we denote destType and srcType, respectively. Finally, srcType shall be assignment-compatible with typeTok, and typeTok shall be assignment-compatible with destType. In the case of an Enum, its type is that of the underlying, or base, type of the Enum.

Based on above, it says it's valid when types are assignment-compatible. destType and typeTok are same types (local has the type of T, as well as argument of cpobj). So it boils down to: is srcType assignment-compatible with typeTok. Therefore: Is ulong assignment-compatible with short? (for example...because in case of enum, underlying type is used). I'll try to run it through PEVerify.exe later.

EDIT: it's not verified
EnumExtensions::FromUInt64[T]][offset 0x00000004][found Long][expected (unboxed) 'T'] Unexpected type on the stack.
...but it doesn't mean it's invalid IL

@mikedn
Copy link

mikedn commented Nov 6, 2016

Or union wouldn't work in IL, because it does not allow generic type in it?

Yep, you can't have a generic type that has explicit layout:

II.10.1.2 Type layout attributes
...
explicit: The layout of the fields is explicitly provided (§II.10.7). However, a generic type shall not have explicit layout.
...

@OmarTawfik
Copy link
Contributor

Assigning back to champion, as feature is now complete.

@gafter
Copy link
Member Author

gafter commented May 8, 2018

Addressed as part of C# 7.3, as described in dotnet/csharplang#104

For VB, I am not aware of any outstanding issue.

@reduckted
Copy link

@gafter This VB issue was added recently: dotnet/vblang#306

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests