Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LinuxLiveDataReader.GetVersionInfo does not produce the correct answer #306

Closed
loop-evgeny opened this issue Nov 11, 2019 · 21 comments · Fixed by #321
Closed

LinuxLiveDataReader.GetVersionInfo does not produce the correct answer #306

loop-evgeny opened this issue Nov 11, 2019 · 21 comments · Fixed by #321

Comments

@loop-evgeny
Copy link

ClrMd 1.1.46104 (also reproduces on latest master), .NET Core 2.1 on Linux x64 (Ubuntu 18.04)

Unhandled Exception: System.ArgumentOutOfRangeException: Version's parameters must be greater than or equal to zero.
Parameter name: revision
   at System.Version..ctor(Int32 major, Int32 minor, Int32 build, Int32 revision)
   at Microsoft.Diagnostics.Runtime.Linux.LinuxLiveDataReader.ParseForVersion(String s)
   at Microsoft.Diagnostics.Runtime.Linux.LinuxLiveDataReader.GetVersionInfo(UInt64 addr, VersionInfo& version)
   at Microsoft.Diagnostics.Runtime.ModuleInfo.get_Version()
   at Microsoft.Diagnostics.Runtime.DataTargetImpl.InitVersions()
   at Microsoft.Diagnostics.Runtime.DataTargetImpl.get_ClrVersions()
   at (my code)

This happens when using a self-contained deployment and the directory name is a GUID, separated with dashes, like "909B593B-4E9B-4294-AC28-FD5A369FA776".

Looking at what's happening in LinuxLiveDataReader.GetVersionInfo, it's trying to parse the directory name containing .so files as a version (and apparently accepts any hexadecimal numbers, too!) but this obviously doesn't work for self-contained deployments: the directory name may be a valid version, but not the CLR version.

@loop-evgeny
Copy link
Author

It seems to somehow work for most instances of my application (all of them running from directories named as GUIDs), but not this one. Maybe it has to do with the order of LinuxLiveDataReader._memoryMapEntries.

@leculver
Copy link
Contributor

Yeah that's pretty sketchy code. I should have reviewed that one closer when I merged it.

I'll replace that with just a Regex after I confirm there's not a better way to get the version of .Net Core.

@loop-evgeny
Copy link
Author

It's a problem I've run into myself and I've been reading it from the .deps.json file for self-contained deployments. Here's the code I use to get the version of the running .NET Core app, which I think can be easily modified to work with arbitrary application paths:

        private const string FrameworkPackageName = "Microsoft.NETCore.App";

        private static Version GetNetCoreRuntimeVersion()
        {
            var version = Environment.Version;

            // In .NET Core earlier than than 3.0 Environment.Version returns this hardcoded version: https://github.com/dotnet/coreclr/pull/22664/files
            if (version != new Version(4, 0, 30319, 42000))
                return version;

            // ... so try to get the real version from the assembly path (yes, this is the only way: https://github.com/dotnet/corefx/issues/28132)
            // Of course, this does not work for self-contained deployments.

            var assembly = typeof(System.Runtime.GCSettings).Assembly;
            var assemblyPath = assembly.CodeBase.Split(new[] {Path.PathSeparator, Path.AltDirectorySeparatorChar}, StringSplitOptions.RemoveEmptyEntries);

            string versionString;

            int netCoreAppIndex = Array.IndexOf(assemblyPath, FrameworkPackageName);
            if (netCoreAppIndex >= 0 && netCoreAppIndex < assemblyPath.Length - 1)
            {
                versionString = assemblyPath[netCoreAppIndex + 1];
            }
            else
            {
                try
                {
                    versionString = GetNetCoreRuntimeVersionFromSelfContainedDeployment(); // Probably a self-contained deployment - try reading .deps.json
                }
                catch (Exception ex)
                {
                    Trace.WriteLine("Failed to get .NET Core version for self-contained deployment: " + ex);
                    return null;
                }
            }

            if (versionString == null)
                return null;

            Version.TryParse(versionString, out version); // version will be null if not parsed successfully
            return version;
        }

        private static string GetNetCoreRuntimeVersionFromSelfContainedDeployment()
        {
            // A huge hack based on https://stackoverflow.com/a/55720218/1536933
            // .NET Core apps published as "self-contained" have a .deps.json file like this
            // {
            // ...
            //  "targets": {
            //    ".NETCoreApp,Version=vX.Y/PLATFORM": {
            //      ...
            //      "Microsoft.NETCore.App/X.Y.Z"
            // where X.Y.Z is the .NET Core runtime version we're after!

            const string frameworkPropertyPrefix = FrameworkPackageName + "/";

            var entryAssembly = Assembly.GetEntryAssembly();
            if (entryAssembly == null)
                return null;

            var depsFilePath = Path.ChangeExtension(entryAssembly.Location, ".deps.json");
            if (!File.Exists(depsFilePath))
                return null;

            using (var fileReader = new StreamReader(depsFilePath))
            using (var jsonReader = new JsonTextReader(fileReader))
            {
                var depsJson = JObject.Load(jsonReader);
                
                var targets = depsJson.Property("targets")?.Value as JObject;
                if (targets == null)
                    return null;

                foreach (var target in targets.Properties())
                {
                    if (target.Value is JObject targetObject)
                    {
                        foreach (var targetProperty in targetObject.Properties())
                        {
                            if (targetProperty.Name.StartsWith(frameworkPropertyPrefix, StringComparison.Ordinal))
                                return targetProperty.Name.Substring(frameworkPropertyPrefix.Length);
                        }
                    }
                }

                return null;
            }
        }

@leculver
Copy link
Contributor

I think we should just stop trying to parse the version on Linux, at least for now. On Windows we embed this information near the beginning of the dll. On linux this lives at a really deep offset. (For example search libcoreclr.so for the ascii string "@(#)", which is where the version is. In my local test environment the offset was 0x0079f480 into the binary.

I'm not sure of a faster way to look this information up other than searching the binary for it, but maybe there's a pointer I could look for.

I'm also not sure this string would be placed into a mini coredump if generated through dotnet-dump. This would mean even if we fixed the live data target to search for and read it, we would get inconsistent results in a coredump case where libcoreclr.so wasn't available locally.

I will probably circle back to this in the future.

@loop-evgeny
Copy link
Author

Isn't that going to break something else? Was the version not used for anything?

@leculver
Copy link
Contributor

Inside the library, it was used to make windows symbol server requests for mscordaccore_[version/arch].dll but we don't use that archiving mechanism for Linux. I double checked that we don't use that version for anything in .Net Core (but we do on desktop) and tested it.

I have two main thoughts here:

  1. I'd rather report no answer (version of 0.0.0.0) all of the time rather than report the correct answer some of the time.
  2. I should have left this issue open to track fixing LinuxLiveDataReaderin the future.

@leculver leculver reopened this Nov 21, 2019
@leculver leculver changed the title "Version's parameters must be greater than or equal to zero" exception attaching to linux self-contained deployment process LinuxLiveDataReader.GetVersionInfo does not produce the correct answer Nov 21, 2019
@nxtn
Copy link
Contributor

nxtn commented Nov 21, 2019

I'm not sure of a faster way to look this information up other than searching the binary for it, but maybe there's a pointer I could look for.

Searching for the version string seems viable. One "@(#)Version" string is located in the "nativeStringResourceTable_mscorrc_debug" table with an offset of about 2k bytes. The string resource table is in the library exports.

@leculver
Copy link
Contributor

@NextTurn Assuming that table goes into Mini dumps, yep that could be the fix here.

The underlying problem is consistent behavior. If you do dotnet-dump -type Mini, does that string table make it into the dump file? If it does, we are in business. If it doesn't then we'll have inconsistent behavior between live targets, "Heap" coredumps, and "Mini" coredumps. @mikem8361 might know the answer, but if not I can run a test of it.

Inconsistent behavior isn't the end of the world here as long as we don't report the wrong version, but I'd like to understand where we are before diving into this. (Also I'm working on getting the codebase ready for v2, so this issue is taking a back seat for a bit. I'm happy with returning 0.0.0.0 for now.)

@nxtn
Copy link
Contributor

nxtn commented Nov 21, 2019

Verified the string goes into "Heap"/"Mini" dumps from dotnet-dump.

@mikem8361
Copy link
Member

dotnet-dump/SOSHosting has C# code that searches for the @(#)Version for the "eeversion" SOS command. This code is only used for libcoreclr.so.

https://github.com/dotnet/diagnostics/blob/7fef83c075965a10e25d3dec551ea6cca4ca8c83/src/SOS/SOS.Hosting/SOSHost.cs#L667
https://github.com/dotnet/diagnostics/blob/7fef83c075965a10e25d3dec551ea6cca4ca8c83/src/SOS/SOS.Hosting/SOSHost.cs#L764

@leculver
Copy link
Contributor

leculver commented Dec 4, 2019

@NextTurn Fixed this one in PR #349. Thanks!

@leculver leculver closed this as completed Dec 4, 2019
@nxtn
Copy link
Contributor

nxtn commented Dec 5, 2019

https://github.com/dotnet/diagnostics/blob/7fef83c075965a10e25d3dec551ea6cca4ca8c83/src/SOS/SOS.Hosting/SOSHost.cs#L768

@mikem8361 FYI the use of ModuleInfo.FileSize here is incorrect: the file size on disk is often smaller than the image span size in memory. It happens to work for libcoreclr.so.

Test results of LinuxLiveDataReader.GetVersionInfo using ModuleInfo.FileSize, corresponding to those in #349:

/usr/share/dotnet/shared/Microsoft.NETCore.App/3.0.1/System.Native.so
v0.0.0.00
/usr/share/dotnet/shared/Microsoft.NETCore.App/3.0.1/libclrjit.so
v4.700.19.47502
/usr/share/dotnet/shared/Microsoft.NETCore.App/3.0.1/libcoreclr.so
v4.700.19.47502
/usr/share/dotnet/shared/Microsoft.NETCore.App/3.0.1/libhostpolicy.so
v0.0.0.00
/usr/share/dotnet/host/fxr/3.0.1/libhostfxr.so
v0.0.0.00

@weltkante

This comment has been minimized.

@loop-evgeny
Copy link
Author

What would be the first version to contain this fix and when is it expected to be released, please?

@nxtn
Copy link
Contributor

nxtn commented Jan 24, 2020

1.1.61812 seems to contain this fix.

@loop-evgeny
Copy link
Author

loop-evgeny commented Jan 24, 2020

Thanks, I upgraded to that version and a process that caused the previous exception now throws a different one:

Unhandled Exception: System.IO.IOException: I/O error occurred.
   at Microsoft.Diagnostics.Runtime.Linux.Reader.Read[T](Int64 position)
   at Microsoft.Diagnostics.Runtime.Linux.ElfFile..ctor(Reader reader, Int64 position, Boolean isVirtual)
   at Microsoft.Diagnostics.Runtime.Linux.LinuxLiveDataReader.GetVersionInfo(UInt64 baseAddress, VersionInfo& version)
   at Microsoft.Diagnostics.Runtime.ModuleInfo.get_Version()
   at Microsoft.Diagnostics.Runtime.DataTargetImpl.InitVersions()
   at Microsoft.Diagnostics.Runtime.DataTargetImpl.get_ClrVersions()

Not a very informative error. :)

@leculver
Copy link
Contributor

Hmm, interesting. I think the only place that would fail is here:

ElfHeaderCommon common = reader.Read<ElfHeaderCommon>(position);

Unfortunately all of the linux core dump reading code is really, really hard to debug. It's on my todo list to clean up, but I haven't gotten to it yet. I think I'm going to try to tackle that today (but I can't promise I'll wrap that up today).

@nxtn
Copy link
Contributor

nxtn commented Jan 25, 2020

Opened #528

@loop-evgeny
Copy link
Author

loop-evgeny commented Feb 13, 2020

Should I create a new issue about the new error? (Since this one is closed, but there is still a problem.)

@nxtn
Copy link
Contributor

nxtn commented Feb 13, 2020

Didn't #528 fix the new problem?

@loop-evgeny
Copy link
Author

Just tested it now and yes, it did, thank you. (I wasn't following that PR.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants