Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

COM: non-optional output parameters inconsistent when passing NULL #35855

Closed
weltkante opened this issue May 5, 2020 · 16 comments
Closed

COM: non-optional output parameters inconsistent when passing NULL #35855

weltkante opened this issue May 5, 2020 · 16 comments

Comments

@weltkante
Copy link
Contributor

weltkante commented May 5, 2020

Scenario

Native Code calls HRESULT hr = p->GetValue(NULL); for a non-optional output parameter and observes the HRESULT.

Depending on how the managed COM interface was declared different things happen:

  • result GetValue()
    • always executes code and discards the result, returning S_OK
  • void GetValue(out result)
    • if result is an interface, code is not executed, E_POINTER is returned
    • if result is a primitive type, code is executed, writing the output variable causes NRE, which is then translated to E_POINTER
  • [PreserveSig] int GetValue(out result)
    • if result is an interface, code is not executed, E_POINTER is returned
    • if result is a primitive type, code is executed, writing the output variable causes NRE, which is then translated to E_POINTER
  • [PreserveSig] HRESULT GetValue(out result) with HRESULT being an enum or struct
    • if result is an interface, code is not executed, S_OK is returned
    • if result is a primitive type, code is executed, writing the output variable causes NRE, which is then ignored, returning S_OK anyways

Observed behavior

  • Variant (4) returns S_OK even though no code was executed or an exception was thrown

Expected behavior

  • Variant (4) should behave more like (2) or (3) and return E_POINTER instead of S_OK
    • do not claim S_OK when no code was executed
    • do not claim S_OK when an exception was thrown and swallowed

Impact

This is not a regression (.NET Core and Desktop Framework have the same behavior) but strongly impacts the porting efforts for WinForms, in particular writing unit tests covering failure cases while at the same time using strongly typed HRESULT return values.

  • WPF has already been using Variant (4) on Desktop and is still using it in Core
  • WinForms was using Variant (3) on Desktop and has been moved to (4) while porting to Core in order to make use of strong typing (extension methods etc.)
  • HRESULT structs or enums are likely also used in third party code

WPF is currently using HRESULT structs while WinForms is using HRESULT enums. There was discussion about moving to structs like WPF does, but as far as this issue is concerned it doesn't seem to make a difference.

The primary questions are whether

  • this is a bug and
  • can it be fixed or does it need to remain broken for compatibility forever?

As far as I can evaluate it APIs implemented in .NET via variant (4) are inherently broken in the failure case and everyone just hopes nobody calls them with invalid arguments.

Reproduction

Scenario implemented over at https://github.com/weltkante/ComInteropIssue with a reg-free-com ATL project and corresponding Core/Desktop managed console projects. (PS: I was a bit in a hurry so the interfaces are dual not IUnknown, if thats a problem I can update the project.)

INativeAPI GetInstance()                      -> HR=00000000 Called=True  Exception=none
int GetNumber()                               -> HR=00000000 Called=True  Exception=none
void GetInstance(out INativeAPI)              -> HR=80004003 Called=False Exception=none
void GetNumber(out int)                       -> HR=80004003 Called=True  Exception=NullReferenceException
int GetInstance(out INativeAPI)               -> HR=80004003 Called=False Exception=none
int GetNumber(out int)                        -> HR=80004003 Called=True  Exception=NullReferenceException
hresult-enum GetInstance(out INativeAPI)      -> HR=00000000 Called=False Exception=none
hresult-enum GetNumber(out int)               -> HR=00000000 Called=True  Exception=NullReferenceException
hresult-struct GetInstance(out INativeAPI)    -> HR=00000000 Called=False Exception=none
hresult-struct GetNumber(out int)             -> HR=00000000 Called=True  Exception=NullReferenceException

Original discussion here: dotnet/winforms#1932 (comment)
/cc @AaronRobinsonMSFT @JeremyKuhne @hughbe

@AaronRobinsonMSFT
Copy link
Member

@weltkante Thanks for the repro. We will take a look at this and investigate the root cause.

/cc @jkoritzinsky @elinor-fung

@hughbe
Copy link
Contributor

hughbe commented May 5, 2020

I must say it does seem unusual that changing the return value to an enumeration/struct would change the behaviour.

I do agree also that in the case of 1) I would expect E_POINTER. I would suspect that this bug(feature?) is especially annoying because callers usually only check If the HR succeeded rather than if the value was set, which goes against documentation of many APIs. That said, it probably is too breaking to accept.

To clarify with 1) this doesn’t depend on the return type at all does it?

@AaronRobinsonMSFT
Copy link
Member

TL;DR I looked into this and it is not something I am inclined to fix. It would require reworking COM interop stub generation in a non-trivial way. Ironically the problem area is the referenced issue on why we don't like altering these code paths - dotnet/coreclr#23955.

The problem is an accumulation of several issues - all of which reinforce why the next phase of COM interop needs to be out of the runtime. The failure occurs prior to making the COM call - which makes sense. We dereference the null value upon entering the stub in the marshal argument code path. What this means is we never get to the dispatch section of the stub and therefore never execute the unmarshal section for return values. This is where the return value type comes into play. Since the struct HRESULT and enum HRESULT patterns are technically invalid according to standard ABIs (structs don't get returned via registers and enums aren't blittable) they require unmarshalling. However, since the unmarshalling path never occurs because we bailed out too soon, we are left with default initialized locals and return that in the eax register for the HRESULT. int32 locals are default initialized to 0 which is S_OK.

Below is an example of an IL stub for the enum HRESULT type. The struct HRESULT is similar.

/*( 1)*/ ldind.i1 is where we dereference null. Reenter at IL_004c and then exit the catch block at /*( 1)*/ leave IL_0093 and subsequently return default initialized local 5.

// instance valuetype CoreApp.HRESULT_Enum(class CoreApp.INativeAPI_4&)
// instance int32(native int)
// Code size	150 (0x0096)
.maxstack 5 
.locals (int32,object,native int,int32,int32,int32)
// Marshal {
         /*( 0)*/ ldc.i4.0         
         /*( 1)*/ stloc.0          
IL_0002: /*( 0)*/ nop             // argument {  
         /*( 0)*/ ldarg.1          
         /*( 1)*/ ldind.i1         
         /*( 1)*/ pop              
         /*( 0)*/ ldc.i4.1         
         /*( 1)*/ stloc.0          
         /*( 0)*/ nop             // } argument 
         /*( 0)*/ nop             // return {  
         /*( 0)*/ nop             // } return 
// } Marshal 
// CallMethod {
         /*( 0)*/ ldarg.0          
         /*( 1)*/ ldloca.s        0x1 
         /*( 2)*/ call            native int [System.Private.CoreLib] System.StubHelpers.StubHelpers::GetStubContext() 
         /*( 3)*/ calli           instance CoreApp.HRESULT_Enum /* MT: 0x00007FFA76148410 */(CoreApp.INativeAPI_4 /* MT: 0x00007FFA761475F8 */&) 
// } CallMethod 
// UnmarshalReturn {
         /*( 2)*/ nop             // return {  
         /*( 2)*/ stloc.3          
         /*( 1)*/ ldloc.3          
         /*( 2)*/ stloc.s         0x4 
         /*( 1)*/ ldloc.s         0x4 
         /*( 2)*/ nop             // } return 
         /*( 2)*/ stloc.s         0x5 
// } UnmarshalReturn 
// Unmarshal {
         /*( 1)*/ nop             // argument {  
         /*( 1)*/ ldloc.1          
         /*( 2)*/ ldtoken         CoreApp.INativeAPI_4 
         /*( 3)*/ call            native int [System.Private.CoreLib] System.RuntimeTypeHandle::GetValueInternal(valuetype System.RuntimeTypeHandle) 
         /*( 3)*/ ldc.i4.0         
         /*( 4)*/ conv.i           
         /*( 4)*/ ldc.i4.8         
         /*( 5)*/ call            native int [System.Private.CoreLib] System.StubHelpers.InterfaceMarshaler::ConvertToNative(object,native int,native int,int32) 
         /*( 2)*/ stloc.2          
         /*( 1)*/ ldarg.1          
         /*( 2)*/ ldloc.2          
         /*( 3)*/ stind.i          
         /*( 1)*/ ldc.i4          0x40000001 
         /*( 2)*/ stloc.0          
         /*( 1)*/ nop             // } argument 
         /*( 1)*/ ldc.i4          0x7fffffff 
         /*( 2)*/ stloc.0          
         /*( 1)*/ leave           IL_007f 
IL_004c:
// } Unmarshal 
// ExceptionCleanup {
IL_004c: /*( 1)*/ ldloc.0          
         /*( 2)*/ ldc.i4          0x7ffffffe 
         /*( 3)*/ bgt             IL_007e 
         /*( 1)*/ ldloc.0          
         /*( 2)*/ ldc.i4          0x40000000 
         /*( 3)*/ ble             IL_006e 
         /*( 1)*/ ldloc.2          
         /*( 2)*/ brfalse         IL_006e 
         /*( 1)*/ ldloc.2          
         /*( 2)*/ call            void [System.Private.CoreLib] System.StubHelpers.InterfaceMarshaler::ClearNative(native int) 
IL_006e: /*( 1)*/ ldc.i4.0         
         /*( 2)*/ conv.i           
         /*( 2)*/ stloc.2          
         /*( 1)*/ ldarg.1          
         /*( 2)*/ brfalse         IL_007a 
         /*( 1)*/ ldarg.1          
         /*( 2)*/ ldloc.2          
         /*( 3)*/ stind.i          
IL_007a: /*( 1)*/ ldc.i4.0         
         /*( 2)*/ conv.i4          
         /*( 2)*/ stloc.s         0x4 
IL_007e:
// } ExceptionCleanup 
// Cleanup {
         /*( 1)*/ endfinally       
IL_007f:
// } Cleanup 
// ExceptionHandler {
         /*( 1)*/ leave           IL_0093 
IL_0084: /*( 1)*/ call            int32 [System.Private.CoreLib] System.Runtime.InteropServices.Marshal::GetHRForException(class System.Exception) 
         /*( 2)*/ pop              
         /*( 1)*/ ldc.i4.0         
         /*( 2)*/ conv.i4          
         /*( 2)*/ stloc.s         0x5 
         /*( 1)*/ leave           IL_0093 
IL_0093: /*( 1)*/ ldloc.s         0x5 
         /*( 2)*/ ret              
// } ExceptionHandler 
.try IL_0000 to IL_0084 catch handler IL_0084 to IL_0093
.try IL_0002 to IL_004c finally handler IL_004c to IL_007f

@AaronRobinsonMSFT AaronRobinsonMSFT removed the untriaged New issue has not been triaged by the area owner label May 6, 2020
@weltkante
Copy link
Contributor Author

As I understand it that means HRESULT enum/struct are never working correctly in failure cases because the unmarshal of the HRESULT is inside the try-section of the exception handler, not outside or in the catch-section. In other words passing NULL for out parameters or any other thrown exceptions won't be converted to a negative HRESULT like observed, and these aren't edge cases either.

/cc @JeremyKuhne @RussKie to figure out what that means for WinForms

TLDR: [PreserveSig] with strongly typed return values must never throw and callers must never pass NULL to non-optional out/ref parameters. This includes not unit-testing these cases.

Unfortunate how strongly typed HRESULT is just an accident that looked like it worked and now everyone is using it. Is there any outlook on "the next phase of COM" ?

@AaronRobinsonMSFT
Copy link
Member

As I understand it that means HRESULT enum/struct are never working correctly in failure cases

I think that is too strong of a statement. There are two things going on here because this is an interop boundary and as such is an attempt to reconcile differing semantic behavior.

  1. If a method is set with PreserveSig then it doesn't also throw exceptions. That is the implied expectation because it will provide an error code instead of an exception. Throwing an exception in this case should be considered what the C++ standard calls "unspecified behavior".

  2. Not all .NET constructs have corresponding COM equivalents (i.e. ref/out). These have strict semantic rules that can't be described in a COM API, outside of SAL annotations which aren't compilation failures. Instead of using those keywords, defining the interface with pointer types will enable the desired testing.

The above isn't an attempt to excuse the current state. I agree that it would be nice to fix. My intent is to say that any interop boundary is sensitive and often under-specified. The exhaustive testing that is going on in WinForms is going to continue to find these kind of subtle inconsistencies.

to figure out what that means for WinForms

I personally don't like the enum HRESULT/struct HRESULT patterns as I find it to be excessive boiler plate with little practical value. In my experience, there is only one reason to fiddle with PreserveSig - the API returns S_FALSE and that is important. If that isn't the case, then I would just let the runtime do the work to translate the HRESULT. I would be interested to hear other situations for why it is used.

Is there any outlook on "the next phase of COM" ?

The beginning is the introduction of the ComWrappers API. An example exposing IDispatch can be found here. This approach would also imply using pointer types as suggested above.

@AaronRobinsonMSFT
Copy link
Member

@jkoritzinsky for FYI about compat.

@weltkante
Copy link
Contributor Author

Throwing an exception in this case should be considered what the C++ standard calls "unspecified behavior".

I don't think thats a good idea, I have yet to see anyone wrap every COM interface implementation in try/catch blocks, as far as I'm aware everyone relies on the CLR marshalling layer to translate unexpected exceptions to HRESULTs.

With the move to .NET Core and everyone rewriting their interop layers (WinForms, VS Designer etc.) the strongly typed HRESULTs probably are only going to increase, as far as I'm aware its the preferred practice for the teams, so any stray exception will translate directly into a bug.

I would be interested to hear other situations for why it is used.

People are using it for strong typing, it makes C# interop code more pleasant to write and read. Having a HRESULT struct/enum where the debugger shows you what the number means is also a huge productivity aid.

I'm doing a lot of COM for various reasons (mentioned some over here). Setting aside WinForms contributions I use PreserveSig wherever failure codes are part of "normal operation", which happens often enough. I don't like throwing/catching exceptions during normal operation, it makes debugging a huge pain.

The beginning is the introduction of the ComWrappers API.

Yeah I've seen that, but I have a hard time understanding the approach. The sample (thanks for the link, wasn't aware of it) seems to be about creating a CCW right? I'll have to study it closer. Any info on how to handle RCWs?

My own usecases often focus around writing a C# interface and (if its not for win32 interop) deriving a TLB off of that, so if I can extend that TLB generation step to also generate ComWrapper source that'd certainly be interesting. Maybe as part of the new Roslyn source generators.

Marshaling between STA/MTA and across process boundaries is also a very important usecase, has this been considered in the ComWrapper API design? I've never written low level marshaling code, always relied on CLR or ATL magic making it happen (in combination with TLBs), so I don't know if it needs special support.

Sorry for asking all this here, but as a heavy COM user I'm very interested in where COM interop is going. If you want this discussion happen somewhere else I can ask again in a different context, just point me there.

@AaronRobinsonMSFT
Copy link
Member

Any info on how to handle RCWs?

The only example is the test tree:

protected override object CreateObject(IntPtr externalComObject, CreateObjectFlags flag)
{
var iid = typeof(ITrackerObject).GUID;
IntPtr iTrackerComObject;
int hr = Marshal.QueryInterface(externalComObject, ref iid, out iTrackerComObject);
Assert.AreEqual(0, hr);
return new ITrackerObjectWrapper(iTrackerComObject);
}

public class ITrackerObjectWrapper : ITrackerObject
{
private struct ITrackerObjectWrapperVtbl
{
public IntPtr QueryInterface;
public _AddRef AddRef;
public _Release Release;
public _AddObjectRef AddObjectRef;
public _DropObjectRef DropObjectRef;
}
private delegate int _AddRef(IntPtr This);
private delegate int _Release(IntPtr This);
private delegate int _AddObjectRef(IntPtr This, IntPtr obj, out int id);
private delegate int _DropObjectRef(IntPtr This, int id);
private readonly IntPtr instance;
private readonly ITrackerObjectWrapperVtbl vtable;
public ITrackerObjectWrapper(IntPtr instance)
{
var inst = Marshal.PtrToStructure<VtblPtr>(instance);
this.vtable = Marshal.PtrToStructure<ITrackerObjectWrapperVtbl>(inst.Vtbl);
this.instance = instance;
}

Maybe as part of the new Roslyn source generators.

Yes. That is one of the options we are considering for COM interop.

Marshaling between STA/MTA and across process boundaries is also a very important usecase, has this been considered in the ComWrapper API design? I've never written low level marshaling code, always relied on CLR or ATL magic making it happen (in combination with TLBs), so I don't know if it needs special support.

That is something that the runtime should not know about and the ComWrappers API doesn't handle any marshalling at all. It is up to the implementer of ComWrappers to make the decision on how to handle that. You are correct in that it is hard. When and if we actually provide a new COM interop solution using ComWrappers that would of course be included. The as yet unveiled CsWinRT tool was required to handle this.

It is entirely possible the current COM support is all that we ever provide. A possible replacement would be simply a consistent IUnknown type of solution that leverages the ABI, but avoids all other COM semantics. That decision is still TBD. The support full COM interop - STA/MTA/NTA, aggregation, TLB, proxy/stub, OutOfProc, InProc, et al - is something we are paying attention to, but honestly our future focus is on AOT and xplat. This means that replacing/fixing COM interop is lower priority. The IUnknown ABI on the other hand is very interesting and could be something that we could implement in a xplat way using ComWrappers. External tools like CsWinRT represent where we see the next generation of real COM interop going.

@AaronRobinsonMSFT
Copy link
Member

I am going to close this issue since we won't be fixing this due to the compat concerns. Unfortunately, it represents the same semantics already present in .NET Framework and therefore the bar is just too high to fix something that the community has lived with for such a long time.

@weltkante If you have questions about how we see COM interop moving forward feel free to file that issue. It is something I would like to hear from the community on. The ComWrappers is the first step in enabling the community to drive these decisions and where our focus should be.

@JeremyKuhne
Copy link
Member

In my experience, there is only one reason to fiddle with PreserveSig - the API returns S_FALSE and that is important.
...
I would be interested to hear other situations for why it is used.

Needing to look at the result is one you already mentioned. Another one is perf- if we want custom error handling we don't have to introduce a try .. catch that we'll never (or seldom) rethrow the original exception from.

Another huge one is when we implement the interfaces in managed. We are required to return specific HRESULTS on APIs for various states.

In regards to strongly typing, that helps significantly with correct usage. With structs in particular we can create custom operators which "force" people to do conversions correctly.

As to how we deal with this currently I don't think we should ever use out on COM interfaces (or callbacks) that we implement in managed code. That is my recommendation for WinForms. Things go terribly wrong even if you are aren't using [PreserveSig] with value types which I beat myself up on when I was cleaning up the stream wrapper for System.Drawing.

@hughbe
Copy link
Contributor

hughbe commented May 6, 2020

As to how we deal with this currently I don't think we should ever use out on COM interfaces (or callbacks) that we implement in managed code. That is my recommendation for WinForms. Things go terribly wrong even if you are aren't using [PreserveSig] with value types which I beat myself up on when I was cleaning up the stream wrapper for System.Drawing.

So would you advise IntPtr* pointers and using Marshal.GetComInterfaceForObject. I can happily do this :)

Edit: what is the perf like for Marshal.GetComInterfaceForObject? Any ways to improve it? Could we use the more modern APIs? Is this what the marshaller does internally when we return interfaces?

@AaronRobinsonMSFT
Copy link
Member

Another one is perf- if we want custom error handling we don't have to introduce a try .. catch that we'll never (or seldom) rethrow the original exception from.

@JeremyKuhne I have heard this argument a lot, but never observed any hard numbers. We already have multiple catch blocks in the generated IL stub. I would interested in how much gain there is by having 1 less.

Another huge one is when we implement the interfaces in managed. We are required to return specific HRESULTS on APIs for various states.

If you throw a COMException and provide the error code that is what is returned as an HRESULT. Alternatively throwing the correct exception should properly be translated - see below.

In regards to strongly typing, that helps significantly with correct usage. With structs in particular we can create custom operators which "force" people to do conversions correctly.

These are ints... wrap them in a ComException or rethrow them as an inner. In most cases well known exceptions will be properly thrown (i.e. E_INVALIDARG will properly translate to ArgumentExcepton).

As to how we deal with this currently I don't think we should ever use out on COM interfaces (or callbacks) that we implement in managed code. That is my recommendation for WinForms.

+1

So would you advise IntPtr* pointers and using Marshal.GetComInterfaceForObject. I can happily do this :)

It is possible. You could also do Foo* if that is more convenient.

what is the perf like for Marshal.GetComInterfaceForObject?
Is this what the marshaller does internally when we return interfaces?

Basically, yes.

@hughbe
Copy link
Contributor

hughbe commented May 6, 2020

It is possible. You could also do Foo* if that is more convenient.

Not sure you can use pointers to interfaces sadly :))

@AaronRobinsonMSFT
Copy link
Member

@hughbe Oh sorry. Yes you are right I believe, I am so used to fighting these issues with value types.

@RussKie
Copy link
Member

RussKie commented May 7, 2020

This topic is over my head, is there anything specific we want to add to our wiki/guidelines?

@AaronRobinsonMSFT
Copy link
Member

As to how we deal with this currently I don't think we should ever use out on COM interfaces (or callbacks) that we implement in managed code. That is my recommendation for WinForms. Things go terribly wrong even if you are aren't using [PreserveSig] with value types which I beat myself up on when I was cleaning up the stream wrapper for System.Drawing.

@RussKie From my perspective I think the key statement/guidance is from @JeremyKuhne above.

@ghost ghost locked as resolved and limited conversation to collaborators Dec 9, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

6 participants