A dynamic link library (DLL) initialization routine failed #18

Closed
danielcrabtree opened this issue Jun 11, 2017 · 26 comments

Comments

@danielcrabtree

When I install the 5.4.6 NuGet packages for RocksDbSharp and RocksDbNative and create a small sample application, I get a TypeInitializationException that seems to be caused by a Win32Exception, specifically "A dynamic link library (DLL) initialization routine failed".

I've tried replacing the native libraries obtained via NuGet with those in native-6e05979.zip from the v5.4.6 release, same result.

I've tried cloning the entire repository and running SimpleExampleLowLevel, same result.

I also note that there's a bug in download-native.cmd and download-native.sh. Both should point at https://github.com/warrenfalk/rocksdb-sharp/releases/download/v5.4.6/native-6e05979.zip, but each points at a different, incorrect URL that returns a 404.

@warrenfalk
Owner

The download issue is fixed. I had just forgotten to push the latest commit. Go ahead and pull. I am seeing if I can reproduce your nuget issue.

@warrenfalk
Owner

I can't reproduce it. It would seem there is something in your environment. What framework are you targeting? What OS version? And is your executable 64 bit?

@danielcrabtree
Author

In the NuGet test I was targeting .NET Framework 4.7 and 64 bit on Windows 10.

I get the same error when pulling RocksDbSharp source, running download-native.sh, compiling, and then running the examples in that. That uses .NET Framework 4.5 and 64 bit.

Is there anything I can do to get more information regarding the cause?

@warrenfalk
Owner

I fear that the cause is within the native binary. You could try to build the native binary yourself (there is a script and instructions somewhere in there, but it's a finicky procedure).

You could also research the "A dynamic link library (DLL) initialization routine failed" error specifically. (And by the way, what file and line is the innermost exception raised from when you run in the IDE?)

You can also try using some older versions of the native binaries (download from an older release). They won't necessarily fail unless you try to use a feature/function that they don't have. You should at least get further. If that succeeds and you can tell me which worked - it might help.

There was a recent change contributed to the native binary to start statically linking the msvc runtime, but I would expect that to reduce issues like this, not create them. Still if you go back a couple of versions (to 5.2.1), you'll get one that still has the msvc runtime as a dependency.

I'm still researching to see if there's any other likely possibility. I am also on Windows 10, so it wouldn't seem to be that.

@danielcrabtree
Author

Reverting to the native binaries on 5.2.1 fixes the issue, as does reverting to the native binaries on 5.3.4.1. So the problem seems to be with the native binary in 5.4.6.

The source of the exception in C# is the throw new NativeLoadException on line 404 of AutoNativeImport.cs. It loops through all the candidate paths. When it finds the right path, it gets the "A dynamic link library (DLL) initialization routine failed" exception and then continues searching the other paths. Finally, it throws the exception on line 404.

When I try to run with a debugger attached, it crashes the process and so I can't get anything useful from that.

@warrenfalk
Owner

Thanks for this info. It is possible that something changed in the rocksdb project in that version, then.

But I also can't rule out that something has changed in my build environment between those versions. So I am going to try a build without the static linking, if you want to try that, and maybe rebuild the previous version of the native dll to see if that makes a difference. This is hard to troubleshoot without knowing how to reproduce it myself.

@warrenfalk
Owner

Here's a 5.4.6 dll that is built without statically linking the vc runtime. I suspect this could have something to do with your issue (but still wouldn't know exactly why). If you can, see if this version still throws that error (rename the dll to rocksdb.dll so it is found by the managed library)

https://github.com/warrenfalk/rocksdb-sharp/releases/download/v5.4.6/rocksdb-5.4.6-nonstatic-msvc.dll.zip

@danielcrabtree
Author

I get the same error with rocksdb-5.4.6-nonstatic-msvc.dll on Windows 10.

I also tested everything on a different Windows 10 computer (although it has almost identical hardware and configuration) and got identical results.

I also tested on another computer running Windows Server 2008 R2 and get the following:
v5.3.4.1 native-98f8d47.zip - Works Fine
v5.4.6 native-6e05979.zip - Unknown Error 0xc000001d
rocksdb-5.4.6-nonstatic-msvc.dll - MSVCP140.dll is missing

Maybe Unknown Error 0xc000001d sheds more light on the issue, or maybe it's just another way of saying "A dynamic link library (DLL) initialization routine failed".

Is it worth installing the MSVC runtime and testing rocksdb-5.4.6-nonstatic-msvc.dll again on Windows Server 2008 R2?

@danielcrabtree
Author

I tested on more systems and finally found a Windows 10 one where all 3 versions work correctly. But that doesn't really help explain the problem.

I also found that WerFault.exe writes a report after the crash. This contained the following info on Windows 10:
Sig[6].Name=Exception Code
Sig[6].Value=c000001d
Sig[7].Name=Exception Offset
Sig[7].Value=000000000017e382

So it seems that Unknown Error 0xc000001d on Windows Server 2008 R2 is equivalent to "A dynamic link library (DLL) initialization routine failed" on Windows 10.
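
For reference, the exception code in that report can be decoded mechanically. The sketch below parses `Sig[n].Name` / `Sig[n].Value` pairs like the ones quoted above and looks the code up in a small table. The table is deliberately incomplete: it holds only two NTSTATUS values, 0xC000001D (STATUS_ILLEGAL_INSTRUCTION) and 0xC0000142 (STATUS_DLL_INIT_FAILED). The function names are made up for illustration and are not part of any Windows tooling.

```python
import re

# Small, deliberately incomplete table of NTSTATUS codes (values from ntstatus.h).
NTSTATUS_NAMES = {
    0xC000001D: "STATUS_ILLEGAL_INSTRUCTION",
    0xC0000142: "STATUS_DLL_INIT_FAILED",
}

_SIG_LINE = re.compile(r"^Sig\[(\d+)\]\.(Name|Value)=(.*)$")


def parse_wer_signature(report_text):
    """Return {name: value} built from Sig[n].Name / Sig[n].Value line pairs."""
    names, values = {}, {}
    for line in report_text.splitlines():
        match = _SIG_LINE.match(line.strip())
        if not match:
            continue
        index, kind, text = match.groups()
        (names if kind == "Name" else values)[index] = text
    return {names[i]: values[i] for i in names if i in values}


def describe_exception_code(hex_code):
    """Map a hex string such as 'c000001d' to a symbolic name, if known."""
    code = int(hex_code, 16)
    return NTSTATUS_NAMES.get(code, "unknown (0x%08X)" % code)


report = """Sig[6].Name=Exception Code
Sig[6].Value=c000001d
Sig[7].Name=Exception Offset
Sig[7].Value=000000000017e382"""

signature = parse_wer_signature(report)
print(describe_exception_code(signature["Exception Code"]))
```

An illegal-instruction fault raised while the DLL is initializing would be consistent with both symptoms reported here, though the report alone does not say which instruction was involved.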

@danielcrabtree
Author

I've been looking through the commits on RocksDB and noticed this facebook/rocksdb@11c5d47.

The compilation instructions include this note:

By default the binary we produce is optimized for the platform you're compiling on (-march=native or the equivalent). SSE4.2 will thus be enabled automatically if your CPU supports it. To print a warning if your CPU does not support SSE4.2, build with USE_SSE=1 make static_lib or, if using CMake, cmake -DFORCE_SSE42=ON. If you want to build a portable binary, add PORTABLE=1 before your make commands, like this: PORTABLE=1 make static_lib.

It's possible that this is the underlying problem.

@warrenfalk
Owner

That may very well be it. The problem is that those instructions are only correct for the non-windows build, and there isn't necessarily a ready-made translation to the windows build. But I will see if I can build with PORTABLE=1. Do you know if your CPU supports SSE4.2?

@danielcrabtree
Author

According to Wikipedia, it supports SSE4.2, so it shouldn't be the issue, but I can't see what else it could be. I also wonder what else PORTABLE=1 might change.

I managed to build rocksdb.dll using your build script, but unfortunately that gives me the same exception. I haven't been able to figure out where to put PORTABLE=1 either.

@warrenfalk
Owner

I just built it with PORTABLE=1. Looking at the commit you referenced, I can see that the CMake build was indeed also updated for PORTABLE=1 and it should go after the cmake command that generates the project files (example below). I don't have any way to verify that this changed anything in the resulting binary. You can find the portable version I just built, here: https://github.com/warrenfalk/rocksdb-sharp/releases/download/v5.4.6/rocksdb-5.4.6-portable.dll.zip

cmake -G "Visual Studio 14 2015 Win64" -DPORTABLE=1 -DOPTDBG=1 -DGFLAGS=0 -DSNAPPY=1 ..

Thanks for all your assistance so far. I really thought this might be it. If this doesn't work, and you are able to get the build script working, you might be able to do a git bisect to find the offending rocksdb commit. But that would take a while.

@danielcrabtree
Author

rocksdb-5.4.6-portable.dll didn't fix the issue.

I've gone through many many builds and found the problematic commit. facebook/rocksdb@8a8c967

+  if (DEFINED AVX2)
+    set(USE_AVX2 ${AVX2})
+  else ()
+    set(USE_AVX2 1)
+  endif ()

This enables AVX2 unless AVX2 is otherwise defined.

The solution is to add -DAVX2=0 to the cmake line:
cmake -G "Visual Studio 15 2017 Win64" -DOPTDBG=1 -DGFLAGS=0 -DSNAPPY=1 -DAVX2=0

The problem is that the solution changes from one version to the next. In 5.4.6, https://github.com/facebook/rocksdb/blob/v5.4.6/CMakeLists.txt, you need to add -DWITH_AVX2=0. I've tested this and it works.

And it has changed again in the current master, https://github.com/facebook/rocksdb/blob/95b0e89b5de72c9572a4dbfd188414de67a2b521/CMakeLists.txt, which uses -DPORTABLE=1. I haven't tested this.

Another issue is that this will hurt performance on newer machines that support AVX2. In my case I need to disable it during development and then enable it for production where it's supported.
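
To decide per machine whether an AVX2 build is safe to deploy, the CPU's advertised features can be checked up front. The following is a minimal sketch that assumes a Linux-style `/proc/cpuinfo` listing (where the features appear on a `flags` line as names such as `sse4_2` and `avx2`). It does not cover Windows, which needs a different source for the same information, and the helper names are hypothetical.

```python
def cpu_flags(cpuinfo_text):
    """Collect the feature flags from /proc/cpuinfo-style text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            flags.update(value.split())
    return flags


def missing_features(cpuinfo_text, required=("sse4_2", "avx2")):
    """Return the required features that the CPU does not advertise."""
    available = cpu_flags(cpuinfo_text)
    return [name for name in required if name not in available]


# Made-up sample describing a CPU with SSE4.2 but without AVX2.
sample = "processor\t: 0\nflags\t\t: fpu sse sse2 sse4_1 sse4_2 avx\n"
print(missing_features(sample))
```

On a real Linux machine the same function can be fed the contents of `/proc/cpuinfo`; an empty result would suggest the AVX2 build is usable there.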

@warrenfalk
Owner

This makes sense. I don't have a non-AVX2 system readily available, so that's why I can't reproduce it.

This is great work. I'm so grateful for your willingness to assist. Many thanks again for tracking this down.

So it's true: we would rather not disable AVX2 support by default, but as non-AVX2 processors are still fairly common (any 1st-gen Intel Core iX series, I believe, which I was still running a year ago), that may yet be the best approach.

I am open to suggestions as to how to best handle the situation.

Keep in mind that using the RocksDbNative nuget package is already somewhat suboptimal. It has to package the native binary for basically every conceivable platform (so far only 3, but there are definitely more that could be targeted, each one making the package that much more dead weight on the other platforms). And the rocksdb native library, at 4-6 MB, as you've probably noticed, isn't tiny; it also grows every release.

Really the way this should work is that you install the native rocksdb library on the native platform. The managed code is already written to support this. On Linux, this would be through the package manager. On Windows, you'd just need to put the dll in the path and name it correctly ("rocksdb-5.4.6.dll") and it will be picked up. This allows you to install the correct version on the correct machines. This is the way native library systems were designed. But it's worth noting that almost all packaging systems for Windows, Linux, and Mac do not conventionally install native libraries optimized for the target machine either. Almost always, what gets installed is a native library optimized for the least common denominator they support.

(Note: this can also be important if you have multiple small independent processes from separate executables running on the same machine. If each has loaded a rocksdb native dll from its own directory, then each of these will be loaded into memory. In contrast, if you'd installed the rocksdb native dll instead, then it's loaded once).

So I am leaning towards just turning off AVX2 by default and letting optimization be a matter of building it yourself and installing the optimized version. There's only one issue remaining, which is that the dll is preferentially loaded from the application's directory, so if you do reference the RocksDbNative package, then the machine-installed version will be ignored. I can think about how best to allow differing behavior.
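
The load order described here can be illustrated with a small sketch. It is not the code in AutoNativeImport.cs. It only assumes what this thread states: a packaged `rocksdb.dll` in the application directory is tried before a versioned, machine-installed `rocksdb-5.4.6.dll` found on the path, and a failed candidate does not stop the search. The function names, the directory names, and the fake loader are hypothetical.

```python
import os


def candidate_paths(app_dir, search_path, version="5.4.6"):
    """List where a loader might look, application directory first."""
    names = ["rocksdb.dll", "rocksdb-%s.dll" % version]
    dirs = [app_dir] + [d for d in search_path.split(os.pathsep) if d]
    return [os.path.join(d, n) for d in dirs for n in names]


def load_first(paths, loader):
    """Try each path in order; return (path, handle) or raise with all errors."""
    errors = []
    for path in paths:
        try:
            return path, loader(path)
        except OSError as exc:
            # Keep going: a later candidate may still load.
            errors.append("%s: %s" % (path, exc))
    raise OSError("no usable native library:\n" + "\n".join(errors))


# Stand-in for a real loader such as ctypes.CDLL: only one file "exists".
installed = os.path.join("shared", "rocksdb-5.4.6.dll")


def fake_loader(path):
    if path != installed:
        raise OSError("not found")
    return "handle"


paths = candidate_paths("app", os.pathsep.join(["shared"]))
found, handle = load_first(paths, fake_loader)
print(found)
```

Because every failure is recorded and the search continues, an initialization failure in the correct file is only reported once all other candidates have also failed, which matches the behavior reported earlier in this thread.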

@danielcrabtree
Author

I think the overriding behavior is great, it allows:

  1. Machine-installed version + no package: benefit from sharing between processes
  2. No machine-installed version + package: simple to deploy + different versions of RocksDB for different processes
  3. Machine-installed version + package: some processes can share machine-installed version + other processes can override and use different versions of RocksDB from package

Leaving AVX2 off by default is sensible: by default it should work, not necessarily be optimized. This makes it easier to get started.

Would it be reasonable to have multiple native packages? E.g. RocksDbNative-Win, RocksDbNative-Mac, RocksDbNative-Linux. That would allow people to add just the packages they need, allowing them to avoid the dead-weight. With this approach you could even have different packages for some of the important options, e.g. RocksDbNative-Win-AVX2.

This adds the work of building all these packages, but once it is automated and you have the build environment setup, it shouldn't be too much trouble. It also saves users from having to compile it themselves, which is very much as you said, finicky.

@alexvaluyskiy

SQLitePCL.raw is a good example of how to distribute a native library: https://github.com/ericsink/SQLitePCL.raw
Users can choose to use the machine-installed version, the package, or an OS-specific package.

What do you think about distributing RocksDB on Chocolatey, like SQLite did? https://chocolatey.org/packages/SQLite

@warrenfalk
Owner

For a managed binary, the build process is designed to allow building once and running anywhere. All the tooling will generally just assume that the output will be platform independent.

This is in contrast to a native binary which has to be rebuilt for each target platform and has tooling to support that.

Trying to use NuGet to pull platform-specific binaries into a platform-agnostic build environment generally works against the design of NuGet and the build tools. If I commit my csproj to source control with a dependency on the specific environment of my development machine, then my project will be unusable by any other developers developing on or targeting a different environment.

In other words, it makes no sense to build a managed project multiple times all with exactly the same output where the only difference is which native dll is copied to the output folder. The choice of which native binary to copy is a deployment task, not a build task.

There is a lot more about this discussed at aspnet/dnx#402 and several other places. The problem still has no great solution.

My guiding principle is basically to make it as close to plug-and-play as possible while still allowing for custom overriding behavior which is why RocksDbNative exists at all, and I think the current solution already does both of these.

Based on that guiding principle, my plan is to re-release RocksDbNative with AVX2 turned off in order to make it more portable.

@jesuslpm

jesuslpm commented Jun 14, 2017

Same problem here: "A dynamic link library (DLL) initialization routine failed" on my old laptop at home. However, it works fine on my new laptop at work. I applaud your plans; I think portability is more important than a little performance boost for your specific platform. Something that just works is better. If you really need that little performance improvement, go ahead and build your own native library.

Also, simplicity is a winner, neat and shiny, but if you build a native library for each platform, simplicity goes away. Users would ask: Which native library should I choose? What the hell is AVX2? What does it have to do with RocksDb? Isn't it related to floating point vectors? Very confusing for newcomers, I believe.

You are doing a great job bringing rocksdb to the .net ecosystem keeping it simple stupid to start with.

@warrenfalk
Owner

Please try the native library at https://github.com/warrenfalk/rocksdb-sharp/releases/download/v5.4.6.1/native-6e05979.zip and report back whether this resolves the exception on the problem machines (I don't have one to test on).

I built this with -DPORTABLE=1 and -DAVX2=0. If this works, I'll publish a new RocksDbNative nuget package.

@danielcrabtree
Author

That one doesn't work. The option you need when building against 5.4.6 is -DWITH_AVX2=0.

-DAVX2=0 works on earlier commits, but was changed to -DWITH_AVX2=0 by 5.4.6. It appears to change again in the current master to -DPORTABLE=1.
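
Since the flag spelling depends on the source tree, a build script could pick it from a lookup. The sketch below records only what was reported in this thread: the keys are informal labels rather than exact version ranges, and the `master` entry was never tested here. The remaining cmake arguments are copied from the command lines quoted earlier, and the function name is made up.

```python
# Flag spellings as reported in this thread. The labels are informal, not exact
# version ranges, and the "master" entry was reported as untested.
DISABLE_AVX2_FLAG = {
    "before-5.4.6": "-DAVX2=0",
    "5.4.6": "-DWITH_AVX2=0",
    "master": "-DPORTABLE=1",
}


def cmake_command(tree, generator="Visual Studio 14 2015 Win64"):
    """Build a cmake argument list that turns AVX2 off for the given tree."""
    try:
        flag = DISABLE_AVX2_FLAG[tree]
    except KeyError:
        raise ValueError("unknown source tree label: %r" % tree)
    return ["cmake", "-G", generator, "-DOPTDBG=1", "-DGFLAGS=0",
            "-DSNAPPY=1", flag, ".."]


print(cmake_command("5.4.6"))
```

Returning an argument list rather than a single string avoids quoting problems with the generator name when the command is passed to something like `subprocess.run`.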

@jesuslpm

jesuslpm commented Jun 16, 2017

It doesn't work, same error

@warrenfalk
Owner

Yep, my bad. In my race to get that last build out before becoming unavailable for a couple days, I missed the part of @danielcrabtree's message noting it had changed from AVX2 to WITH_AVX2.

Rebuilt again; get it here: https://github.com/warrenfalk/rocksdb-sharp/releases/download/v5.4.6.2/native-6e05979.zip

@jesuslpm

It works! thank you!

@danielcrabtree
Author

v5.4.6.2 works.

@warrenfalk
Owner

Thanks @danielcrabtree and @jesuslpm, RocksDbNative package updated.

Akamig pushed a commit to planetarium/rocksdb-sharp that referenced this issue Jul 1, 2022
Fixed support for open as secondary with column families