Add support for native dependencies #714

Closed
lilith opened this issue Mar 20, 2015 · 11 comments

@lilith

lilith commented Mar 20, 2015

A lot has been written about this (challenging) topic; so I'll link to what I've found (and please add more links in the comments).

I've been using an HTTPS download-during-boot approach for ImageResizer, but that is slow, unreliable, and annoying. I tried to create an example of how to build the ideal native/managed hybrid project and failed, then started a project to hot-fix the problem at runtime and hit another series of roadblocks.

I've identified a few invalid assumptions that seem responsible for the current state of things.

Some of these are somewhat comical considering how easy it is to parse binaries for the major platforms and determine runtime compatibility.

  1. It is bad to assume all managed DLLs are AnyCPU. x86-only and x64-only binaries continue to have an important place.
    a) C++/CLI is still important, but only targets Win32 and x64.
    b) Binding generation tools like CppSharp cannot produce AnyCPU C#; the structure layouts (and other details) are calculated based on a pointer-width assumption.
    c) There is also manually written C# that uses unsafe code or performs P/Invokes without using IntPtr in all the right places, and is therefore x86-only or x64-only.
  2. It is bad to assume that precompiled native binaries are appropriate for every operating system distribution. Source compilation needs to be first-class if we want to target a large number of Linux or BSD distributions; we can't precompile for everything.
  3. It is bad to assume that binaries are small enough that all the permutations can go in the same zipped nuget package. OpenCV base libraries could easily exceed 2 gigabytes if you did this. You'd also exhaust your server's disk space, and likely your bandwidth allotment, and requests would time out.
  4. It is bad to assume that a Windows machine is capable of compiling native code.
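
To illustrate how little work the header parsing mentioned above involves, here is a minimal Python sketch that reads the target architecture from a Windows PE file. The function name `pe_machine` and the `PE_MACHINES` table are hypothetical names chosen for this example, and the table covers only a few machine types. The sketch reads only the COFF `Machine` field; telling an AnyCPU managed assembly apart from an x86-only one would additionally require reading the CLR header flags, which this sketch does not attempt.

```python
import struct

# A small subset of the IMAGE_FILE_MACHINE_* values from the PE/COFF format.
PE_MACHINES = {
    0x014C: "x86",
    0x8664: "x64",
    0x0200: "IA64",
    0x01C4: "ARM",
    0xAA64: "ARM64",
}


def pe_machine(data: bytes) -> str:
    """Return the architecture recorded in a PE file's COFF header."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("not a PE file: missing MZ header")
    # The 4-byte little-endian field at 0x3C points to the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("not a PE file: missing PE signature")
    if len(data) < pe_offset + 6:
        raise ValueError("truncated COFF header")
    # The 2-byte Machine field immediately follows the signature.
    (machine,) = struct.unpack_from("<H", data, pe_offset + 4)
    return PE_MACHINES.get(machine, f"unknown(0x{machine:04x})")
```

Reading the first few hundred bytes of a file and passing them to `pe_machine` is usually enough, since the PE signature normally sits close to the start of the file.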

Removing these assumptions, what new requirements are we left with?

  1. Nuget packages need to be able to reference other nuget packages - conditionally - based on architecture and operating system. This likely means that we need conditions in the lockfile, too - use version X of package Y on platform Z, etc. At build time, the right versions are copied.
  2. Nuget packages need to be able to describe native binaries (or rather, arbitrary files) and provide multiple versions for each target platform. For small binaries, a common pattern is likely to combine these two approaches: provide x86/x64 binaries for Windows, and a conditional reference for other platforms. For larger binaries, it's likely that the 'main' package will be empty and simply list conditional, platform-specific packages as dependencies.
  3. Build time. This is a bit harder. What needs to happen?
    a) We need to gather the referenced files (nuget references, mind you) and verify that the output folder does not contain any files with conflicting names. If there are conflicting names, the output folder version MUST be deleted, so that we can AssemblyResolve or LoadLibrary the correct version. We then copy each of the files to an appropriate subfolder of the output folder (or, if AnyCPU, the output folder itself). Since Visual Studio is blind to compatibility (by choice, one must assume), we may end up fighting with the build process a bit. Perhaps disabling Copy Local? Another nice sanity check would be to simply parse the binary headers of everything in the output folder and ensure they are all able to run on a common environment.
  4. Run time. This is where developers have to do a little magic: handle the AssemblyResolve event for non-AnyCPU managed dlls, and call LoadLibrary (or a platform alternative) for the native dependencies. Given a standardized convention, we can make this co-operation possible without .NET runtime changes (although not optimal, since Assembly.Load won't use the default security context). It would be much better if .NET would modify the search path based on platform, as it does for culture. It would also be great if ASP.NET would apply some intelligence or header parsing to assemblies before globbing them all into memory. It's not hard, and only requires between 1 and 3 I/O reads of a few hundred bytes.
  5. Tooling. Tooling needs to understand that there are non-.NET dependencies involved. Unit test runners, in particular, are known for leaving behind native dependencies when they copy (or shadow copy) assemblies for testing. These will need to understand the (transitive) dependencies. Perhaps we should emit some kind of manifest? While we can piggy-back on .NET assemblies to document dependencies (via resource manifests or regular assembly attributes), those .NET assemblies would need to document the final transitive set of native dependencies, which might be difficult to achieve.
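
The build-time step (3a) could be sketched roughly as follows in Python. This is only an illustration of the idea, not a proposal for how paket should implement it: the function `plan_output`, its arguments, and the subfolder names used in the usage note are all hypothetical. It computes where each referenced file should be copied and which stale root-level files would shadow a platform-specific file and therefore need deleting. It deliberately ignores the case where the same file name is referenced both as AnyCPU and as platform-specific.

```python
from pathlib import PurePosixPath


def plan_output(references, existing_root_files):
    """Plan copies and deletions for one build output folder.

    references: iterable of (file_name, target) pairs, where target is ""
    for AnyCPU files or a relative subfolder such as "x86/winish".
    existing_root_files: file names already present in the output root.
    Returns (copies, deletions): copies maps destination path to file name;
    deletions lists root-level files that shadow a platform-specific file.
    """
    existing = set(existing_root_files)
    copies = {}
    deletions = set()
    for name, target in references:
        # AnyCPU files go to the root, everything else to its subfolder.
        dest = str(PurePosixPath(target) / name) if target else name
        if dest in copies:
            raise ValueError(f"two references map to {dest}")
        copies[dest] = name
        if target and name in existing:
            # A root copy would be loaded before the subfolder copy.
            deletions.add(name)
    return copies, sorted(deletions)
```

For example, with a reference list containing `("native.dll", "x86/winish")` and an output folder that already holds a `native.dll` in its root, the plan would copy the file to `x86/winish/native.dll` and mark the root copy for deletion.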

So, I guess

  • Step 1: Establish (a) plat/arch conditions, and (b) target strings we can use as subfolders.
  • Step 2: Describe required changes to the nuget package specification
  • Step 3: Implement handling in paket for nuget conditions and native file manifests, through to lockfile.
  • Step 4: Implement paket support for triggering bash/.bat build scripts (perhaps requiring user authorization on a per-commit-id basis, if applied to a git reference)? CMake is the most likely build tool, but if we require Git, we can require Bash.
  • Step 5: Implement build-time support for copying and manifest generation.
  • Step 6: Submit PRs to major runners and web frameworks to handle manifests and assembly loading properly.
  • Step 7: Make .NET into a real platform, where C interop is practical, so we can play with the big kids.
@lilith
Author

lilith commented Mar 20, 2015

Conditions:

  • pointerSize=32|64. Let's say that our MSIL makes an assumption about pointer size, but not platform.
  • endianess="little|big" Sometimes this matters more than architecture, and ARM can switch between endianness modes.
  • architecture=x86|x64|IA64|Alpha|MIPS|HPPA|PowerPC|SPARC32|s390|s390x - Should probably support everything Mono does
  • os="posix|winish|linux|osx|win7|win8|win10" I'm not sure how best to divide Windows (or Linux) operating systems into groups, or to number them for easy inequality testing, but we probably want to establish sane identifiers that correspond to common build/API compatibility targets.

Given that a fallback mode (building from source) is likely to be popular, we want to make it easy to ensure that only 1 reference from a conditional set is chosen. We should probably group them within another element or provide an id that prevents duplicates.
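
As a rough illustration of how such conditions and groups might be evaluated, here is a small Python sketch. Everything in it is an assumption made for the example: the function names `matches` and `choose_reference`, the representation of a condition as a dict of `|`-separated alternatives, and the idea that a platform advertises a set of identifiers per key (so that a Linux machine can satisfy both `linux` and `posix`). An entry with an empty condition plays the role of the build-from-source fallback.

```python
def matches(condition, platform):
    """True if every key in condition has an alternative the platform offers.

    condition: dict such as {"os": "linux|osx", "pointerSize": "64"}.
    platform: dict of sets, e.g. {"os": {"linux", "posix"}}.
    """
    return all(
        not platform.get(key, set()).isdisjoint(alternatives.split("|"))
        for key, alternatives in condition.items()
    )


def choose_reference(group, platform):
    """Return the first package in the group whose condition matches.

    group: ordered list of (package, condition) pairs. Because only the
    first match is returned, at most one reference per group is chosen.
    """
    for package, condition in group:
        if matches(condition, platform):
            return package
    return None
```

Ordering the group from most specific to least specific, with the unconditional fallback last, gives the "exactly one reference per set" behaviour described above.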

@lilith
Author

lilith commented Mar 20, 2015

Target strings need to be as generic as their restrictions permit.

/ - root is AnyCPU
/32b/ - managed, assumes 32-bit pointer, otherwise portable
/64b/ - managed, assumes 64-bit pointer, otherwise portable
/x86/winish/ - 32-bit, requires Windows APIs.

Pointer size, architecture, and endianness are combined into the first string. Pointer size and endianness are only included if the architecture string doesn't make them redundant. I.e., we would see /ARM-little/ and /ARM-big/, but not /x86-little/.
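
A minimal Python sketch of how these target strings could be composed is shown below. The function `target_string` and the `FIXED_ENDIAN` set are hypothetical; in particular, which architectures have a fixed byte order is an assumption here and only x86 and x64 are listed. Pointer size is only emitted when no architecture is given, which is a simplification of the redundancy rule.

```python
# Assumption: architectures whose byte order never varies, so the
# endianness suffix would be redundant.
FIXED_ENDIAN = {"x86", "x64"}


def target_string(architecture=None, pointer_size=None, endianness=None, os=None):
    """Compose a target subfolder such as "/x86/winish/" or "/ARM-big/"."""
    if architecture is None:
        # No architecture: either fully portable or pointer-size bound.
        first = f"{pointer_size}b" if pointer_size else ""
    elif architecture in FIXED_ENDIAN or endianness is None:
        first = architecture
    else:
        first = f"{architecture}-{endianness}"
    parts = [part for part in (first, os) if part]
    return "/" + "".join(part + "/" for part in parts)
```

With these rules, `target_string()` yields `/`, `target_string(pointer_size=32)` yields `/32b/`, and `target_string("x86", endianness="little", os="winish")` yields `/x86/winish/`.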

@yishaigalatzer

The approach is quite similar to the ref/lib/rid approach of NuGet 3 packages and runtime.json file. See this link - http://docs.nuget.org/Create/uwp-create.

We are in a process of thinking about a gen2 of this layout and hope to share the idea not that long out once we put some more time into it.

@lilith
Author

lilith commented Oct 16, 2015

Is there a list of RIDs?

> We are in a process of thinking about a gen2 of this layout and hope to share the idea not that long out once we put some more time into it.

Could this be documented prior to implementation, so there can be a community feedback stage? The current design doesn't seem to take many of the points above into account.

@yishaigalatzer

RIDs are defined by the following nuget package - https://www.nuget.org/packages/Microsoft.NETCore.Platforms

And you can define your own as well.

@lilith
Author

lilith commented Oct 16, 2015

@yishaigalatzer

Yes

@ctaggart
Contributor

I think it would be good for Paket to look at the runtime.json files in the nuget packages to get a list of additional transitive dependencies for each platform.

In this example:
http://blog.ctaggart.com/2015/10/minimal-coreclr-console-app.html
Paket could look at the System.Console's runtime.json to see that it depends on runtime.win7.System.Console.

{
  "runtimes": {
    "win7": {
      "System.Console": {
        "runtime.win7.System.Console": "4.0.0-beta-23409"
      }
    },
    "unix": {
      "System.Console": {
        "runtime.unix.System.Console": "4.0.0-beta-23409"
      }
    }
  }
}
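
To make the suggestion concrete, here is a minimal Python sketch of reading such a runtime.json and extracting the extra dependencies for one package on one runtime. The function name `runtime_dependencies` is hypothetical, and the lookup is an exact match on the runtime identifier only; a real resolver would presumably also have to consider compatible or fallback RIDs, which is not attempted here.

```python
import json


def runtime_dependencies(runtime_json_text, rid, package_id):
    """Return {dependency id: version} for package_id on the given RID.

    Returns an empty dict when the runtime or the package is not listed.
    """
    document = json.loads(runtime_json_text)
    per_runtime = document.get("runtimes", {}).get(rid, {})
    return dict(per_runtime.get(package_id, {}))
```

Applied to the file above with `rid="win7"` and `package_id="System.Console"`, this returns a single entry mapping `runtime.win7.System.Console` to `4.0.0-beta-23409`.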

@forki
Member

forki commented Oct 23, 2015

#736

@forki
Member

forki commented Apr 6, 2016

We now have basic support for native dependencies.

@forki forki closed this as completed Apr 6, 2016
@NightOwl888

Is there any documentation on how to build NuGet packages to support native dependencies?

5 participants