NativeAOT-LLVM: Merge from NativeAOT and make EETypePtr intrinsic #606

Closed
wants to merge 265 commits into from

Conversation

yowl

@yowl yowl commented Jan 28, 2021

This PR replaces #583 as that fast-forwarded the commits, losing the history. Makes EETypePtrOf<T> an intrinsic as per #577

stephentoub and others added 30 commits January 11, 2021 16:00
…xtensions (#46819)

* Avoid OidLookup.ToOid call per extension in X509Certificate2.Extensions

`X509Certificate2.Extensions` invokes `CertificatePal.Extensions`, which in turn creates the extensions collection, invoking `new Oid(string)` for each.  This in turn calls `OidLookup.ToOid` in order to gather the friendly name, even though in many situations, no one actually cares about the friendly name.  We can instead call `new Oid(string, null)`, which makes the friendly name lazily initialized on first use, saving the `OidLookup.ToOid` call when it's not needed, and significantly reducing the time to call Extensions (in particular when the predefined OID lookup tables don't contain the OID for an extension and when it can't be found on lookup).

* Allow OidLookup.ToOid to cache failure

ToOid has a cache, but it only caches successful results.  If ToOid fails to find the relevant OID, nothing is cached, which makes the failure path very expensive, as every ToOid call for that OID takes the slow path.  This lets it be cached.
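
The failure-caching idea can be sketched as follows. This is a minimal illustration in Python, not the actual `OidLookup` code: `OidCache`, `slow_lookup`, `slow_calls`, and the `_MISSING` sentinel are hypothetical names chosen for the example.

```python
from typing import Callable, Dict, Optional

# Sentinel that distinguishes "looked up and not found" from "never looked up".
_MISSING = object()


class OidCache:
    """Hypothetical lookup cache that remembers failed lookups as well as hits."""

    def __init__(self, slow_lookup: Callable[[str], Optional[str]]) -> None:
        self._slow_lookup = slow_lookup
        self._cache: Dict[str, object] = {}
        self.slow_calls = 0  # how many times the expensive path ran

    def friendly_name(self, oid: str) -> Optional[str]:
        cached = self._cache.get(oid)
        if cached is _MISSING:
            return None  # cached failure: skip the slow path entirely
        if cached is not None:
            return cached  # cached success
        self.slow_calls += 1
        result = self._slow_lookup(oid)
        # Store the sentinel on failure so the next call is cheap too.
        self._cache[oid] = _MISSING if result is None else result
        return result
```

With a cache like this, repeated lookups of an unknown OID pay for the slow path only once.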
* remove TSNC_WaitUntilGCFinished, no one checks for it

* remove TS_BlockGCForSO, the scenario can no longer happen.

* unused flag related to user-requested thread suspension
Mono on netcore doesn't support COM wrappers that System.Speech depends
on. Disable System.Speech.Tests on Mono.
…46780)

* Quiet the output from emsdk_env.sh to make the build less verbose.

* Quiet mono-cil-strip as well.
* Resolve ILLink warnings in System.Reflection.DispatchProxy

I also took this chance to remove unnecessary GetTypeInfo() calls, since some of them needed to be removed.

Contributes to #45623

* Update annotations for PR feedback.

Remove DynamicallyAccessedMemberTypes.All from the interface and base class types. If unused members are trimmed, this won't cause problems with DispatchProxy.

* Add a trimming test for DispatchProxy since we are suppressing ILLinker warnings.
* Fix handling of module override token in signature parser. When the override is present, a new SignatureDecoder is created and used as the decoder for the final signature with the fixup kind (and module override flag which is stored in the upper bit of the fixup kind byte) already parsed. This causes the remainder of the signature to be parsed as a full R2R signature which is now missing the fixup type.
* Instead of creating a new decoder when a module override is present, set up the initial decoder's metadata reader in the constructor by detecting the module override up front.
* Fix `TryLocateNativeReadyToRunHeader` to swallow `BadImageFormatException` and return true / false whether the image has a native R2R header (for composite images).
* Running R2RDump on IL assemblies with no R2R now emits an error message that the assembly is not R2R instead of an unhelpful error about some RVA offset conversion failing.
* GT_NEG optimization for multiplication and division

* Distribute negation over parenthetical multiplication or division.

* Removing duplicate logic that I had put in accidentally.

* Check overflow and other conditions before performing morph

* Resolved merge conflict and cleanup morph.cpp

* Formatting morph.cpp

* Returning tree after performing smpop again to fix flags

* Formatting

* Added check for optimizations, formatting.

* Using gtIsActiveCSE_Candidate instead of fgGlobalMorph

* Update src/coreclr/jit/morph.cpp

Co-authored-by: Sergey Andreenko <seandree@microsoft.com>

* Formatting

* delete formatting changes.

* Add a test.

* Change the conditions a bit.

* Better names for the tests.

Co-authored-by: Sergey Andreenko <seandree@microsoft.com>
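
To illustrate the rewrite described in the commits above, here is a minimal Python sketch that turns `-(x * C)` into `x * (-C)` and `-(x / C)` into `x / (-C)`. It is not the code in `morph.cpp`: the tuple-based tree shape, the function name `distribute_neg`, and the exact guard conditions are assumptions made for the example.

```python
INT32_MIN = -(2 ** 31)


def distribute_neg(tree):
    """Rewrite ('neg', (op, x, const)) as (op, x, -const) for op in {'mul', 'div'}.

    Trees are tuples: ('neg', child) or ('mul' | 'div', left, right).
    Leaves are variable names (str) or integer constants (int).
    The tree is returned unchanged when the rewrite does not apply.
    """
    if not (isinstance(tree, tuple) and tree[0] == "neg"):
        return tree
    child = tree[1]
    if not (isinstance(child, tuple) and child[0] in ("mul", "div")):
        return tree
    op, left, right = child
    # Only fold the negation into a constant right operand, and skip the one
    # constant whose negation would overflow a 32-bit integer.
    if not isinstance(right, int) or right == INT32_MIN:
        return tree
    # Assumed guard: leave x / 0 and x / 1 alone, since x / -1 can overflow
    # for the minimum integer and x / 0 faults either way.
    if op == "div" and right in (0, 1):
        return tree
    return (op, left, -right)
```

The rewritten tree has one node fewer, because the negation is absorbed into the constant.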
* Detect inner loop and add 10 bytes of padding at the beginning

* generate nop in previous blocks

* TODO: figure out if anything needs to be done in optCanonicalizeLoop

* Add COMPlus_JitAlignLoopMinBlockWeight and COMPlus_JitAlignLoopMaxCodeSize

- Add 2 variables to control which loops get aligned
- Moved padding after the conditional/unconditional jump of previous block

* Reuse AlignLoops flag for dynamic loop alignment

* Detect back edge and count no. of instructions before doing loop alignment

* fix bugs

* propagate the basic block flag

* Switch from instrCount to codeSize

* JitAlignLoopWith32BPadding

* Add emitLoopAlign32Bytes()

* wip

* Add logic to avoid emitting nop if not needed

* fix a condition

* Several things:

- Replaced JitAlignLoopWith32BPadding with JitAlignLoopBoundary
- Added JitAlignLoopForJcc
- Added logging of boundary and point where instruction splitting happens
- Add logic to take into consideration JCC.

* Added JitAlignLoopAdaptive algorithm

* wip

* revert emitarm64.cpp changes

* fix errors during merge

* fix build errors

* refactoring and cleanup

* refactoring and build errors fix

* jit format

* one more build error

* Add emitLoopAlignAdjustments()

* Update emitLoopAlignAdjustments to just include loopSize calc

* Remove #ifdef ADAPTIVE_LOOP_ALIGNMENT

* Code cleanup

* minor fixes

* Fix issues:
- Make sure all `align` instructions for non-adaptive fall under same IG
- Convert some variables to `unsigned short`
- Fixed the maxPadding amount for adaptive alignment calculation

* Other fixes

* Remove align_loops flag from coreclr

* Review feedback

- Do not align loop if it has call
- Created `emitSetLoopBackEdge()` to isolate `emitCurIG` inside emitter class
- Created `emitOutputAlign()` to move the align instruction output logic
- Renamed emitVariableeLoopAlign() to emitLongLoopAlign()
- Created `optIdentifyLoopsForAlignment()` to identify loops that need alignment
- Added comments at various places

* jit format

* Add FEATURE_LOOP_ALIGN

* remove special case for align

* Do not propagate BBF_LOOP_ALIGN in certain cases

* Introduce instrDescAlign and emitLastAlignedIgNum

* Several changes:

- Perform accurate padding size before outputting align instruction
- During outputting, just double check if the padding needed matches to what was calculated.
- If at any time, instruction sizes are over-estimated before the last align instruction,
  then compensate them by adding NOP.
- As part of above step, do not perform encoding "VEX prefix shortening" if there is align
  instruction in future.
- Fix edge cases where, because of loop cloning or the resolution phase of the register allocator, the
  loops are marked such that they cover loops that are already marked for alignment. Fix by
  resetting their IGF_LOOP_ALIGN flag.
- During loop size calculation, if the last IG also has `align` flag, then do not take into account
  the align instruction's size because they are reserved for the next loop.

* jit format

* fix issue related to needLabel

* align memory correctly in superpmi

* Few more fixes:

- emitOffsAdj accounts for any mis-prediction of jump sizes. If we compensate for that mis-prediction, take off that adjustment.
- Record the lastAlignIG only for valid non-zero align instructions

* minor JITDUMP messages

* Review comments

* missing check

* Mark the last align IG the one that has non-zero padding

* More review comments

* Propagate BBF_LOOP_ALIGN for compacting blocks

* Handle ALIGN_LOOP flag for loops that are unrolled

* jit format

* Loop size up to last back-edge instead of first back-edge

* Take loop weight in consideration

* remove align flag if loop is no longer valid

* Adjust loop block weight to 4 instead of 8

* missing space after rebase

* fix the enum values after rebase

* review feedback

* Add missing #ifdef DEBUG
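
The core arithmetic behind the loop-alignment commits above can be sketched in a few lines of Python. This is only an illustration, not the JIT's emitter logic: the function names are hypothetical, and the thresholds are plain parameters standing in for settings such as `JitAlignLoopBoundary`, `JitAlignLoopMaxCodeSize`, and `JitAlignLoopMinBlockWeight`.

```python
def padding_for_alignment(offset: int, boundary: int = 32) -> int:
    """Bytes of NOP padding needed so code at `offset` starts on a `boundary`-byte boundary."""
    if boundary <= 0 or boundary & (boundary - 1):
        raise ValueError("boundary must be a positive power of two")
    # For a power-of-two boundary this equals (-offset) mod boundary.
    return (-offset) & (boundary - 1)


def should_align(loop_size: int, block_weight: int, has_call: bool,
                 max_code_size: int, min_block_weight: int) -> bool:
    """Decide whether a loop is worth aligning (assumed, simplified policy)."""
    if has_call:
        return False  # the commits above skip loops that contain a call
    if block_weight < min_block_weight:
        return False  # loop is not hot enough
    return loop_size <= max_code_size  # large loops gain little from alignment
```

For example, a loop head at offset 40 needs 24 bytes of padding to land on the next 32-byte boundary at offset 64.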
* fallback to GetAddrInfoW if GetAddrInfoExW fails

* Update src/libraries/System.Net.NameResolution/src/System/Net/Dns.cs

Co-authored-by: Cory Nelson <phrosty@gmail.com>

* Update src/libraries/System.Net.NameResolution/src/System/Net/NameResolutionPal.Windows.cs

Co-authored-by: Jan Kotas <jkotas@microsoft.com>

* feedback from review

* update fakes

* attempt to fix telemetry

* remove forced telemetry

* fix spelling

* improve duration precision

* task refactor

* update fakes again

* rename function

Co-authored-by: Cory Nelson <phrosty@gmail.com>
Co-authored-by: Jan Kotas <jkotas@microsoft.com>
Fix two main issues that trigger races related to ref counting on
EventPipeThread.

The first was an incorrect mapping of the hash map add method in the CoreClr
shim. The intent of this method was add_replace, but the CoreClr shim ended up
with an Add implementation. That could result in unreferenced threads being
added to thread_sequence_number_map, which on free would release the ref count,
creating an imbalance and causing EventPipeThreads to be terminated too soon.

There is also a theoretical race (I don't know if it has been hit) that
manifests when we end up adding threads to thread_sequence_number_map
instead of replacing an existing, already ref-counted item. If we add,
we hit the same race, so I added a fix that detects this case and
correctly takes a reference on the thread. This potential
issue exists in the C++ version of the EventPipe library as well.

The other issue triggers another race related to ref-counted threads.
The streaming thread in the C library used ep_thread_get_or_create to get a
reference to itself. This creates an EventPipeThread
instance in TLS, but it also adds the thread to the global thread list.
Since the streaming thread doesn’t write any further events, it
never gets a thread session state, so it keeps its original
ref count. This opens up a race between the TLS destructor decrementing
the reference count and another thread disabling a session. The
thread that disables the session gets a copy of all running threads.
If the thread running the TLS destructor is the first to bring the
ref count down to 0, but the other thread beats it to the
locks protecting the list, the thread with refcount == 0 will still
be in the list copy, so the other thread ends up with a
potentially freed object in its copy of the thread list.

I made two fixes for this issue. First, don’t use ep_thread_get_or_create
from the streaming thread, since it is possible to get hold of the
runtime-specific thread object directly. That eliminates the race caused by
a thread that is added to EventPipe but doesn’t write events, and so has no
thread session state holding an additional ref count. Second, since
the race is still there due to the split between the ref count and the lock
protecting the list, I also made a change in the C library, moving the
register/unregister on the list into ep_thread_get_or_create (register)
and the TLS destructor (unregister). The list now tracks the live
threads currently using the EventPipe library, while the EventPipeThread
object is still ref counted, so it can still outlive the
lifetime of the physical thread.

Doing this split closes the potential race condition (and would even fix the
scenario if we kept the streaming thread in the global thread list).
This hypothetical race condition exists in the C++ library as well, but might
have been mitigated by ref counting the thread session state and making sure
the TLS destructor won’t race with a thread doing disable, since the thread
session state ref count will be held past the part of disable that could race.
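
The add-versus-replace imbalance described in this commit message can be modeled with a small Python sketch. It is a simplified illustration, not the EventPipe implementation: `RefCounted`, `SequenceNumberMap`, and `add_or_replace` are hypothetical names, and the real map's key and value types are not shown in the text above.

```python
class RefCounted:
    """Minimal stand-in for a ref-counted thread object."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.ref_count = 0

    def addref(self) -> None:
        self.ref_count += 1

    def release(self) -> None:
        if self.ref_count <= 0:
            raise RuntimeError(f"over-release of {self.name}")
        self.ref_count -= 1


class SequenceNumberMap:
    """Map from thread to sequence number that owns one reference per entry."""

    def __init__(self) -> None:
        self._entries = {}

    def add_or_replace(self, thread: RefCounted, sequence_number: int) -> None:
        # Take a reference only when the entry is new. Replacing an existing
        # entry must not touch the count, or free() would no longer balance.
        if thread not in self._entries:
            thread.addref()
        self._entries[thread] = sequence_number

    def free(self) -> None:
        # Release exactly one reference per entry.
        for thread in self._entries:
            thread.release()
        self._entries.clear()
```

Each entry accounts for exactly one reference, so `free()` returns every thread to the count it had before it was first inserted, regardless of how many times its sequence number was updated.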
* Update dependencies from https://github.com/mono/linker build 20210111.1

Microsoft.NET.ILLink.Tasks
 From Version 6.0.0-alpha.1.21057.1 -> To Version 6.0.0-alpha.1.21061.1

* Update dependencies from https://github.com/mono/linker build 20210111.2

Microsoft.NET.ILLink.Tasks
 From Version 6.0.0-alpha.1.21057.1 -> To Version 6.0.0-alpha.1.21061.2

* Update dependencies from https://github.com/mono/linker build 20210111.3

Microsoft.NET.ILLink.Tasks
 From Version 6.0.0-alpha.1.21057.1 -> To Version 6.0.0-alpha.1.21061.3

* Update dependencies from https://github.com/mono/linker build 20210111.4

Microsoft.NET.ILLink.Tasks
 From Version 6.0.0-alpha.1.21057.1 -> To Version 6.0.0-alpha.1.21061.4

* Update dependencies from https://github.com/mono/linker build 20210111.5

Microsoft.NET.ILLink.Tasks
 From Version 6.0.0-alpha.1.21057.1 -> To Version 6.0.0-alpha.1.21061.5

Co-authored-by: dotnet-maestro[bot] <dotnet-maestro[bot]@users.noreply.github.com>
…804)

* Update dependencies from https://github.com/dotnet/xharness build 20210111.2

Microsoft.DotNet.XHarness.CLI , Microsoft.DotNet.XHarness.TestRunners.Xunit
 From Version 1.0.0-prerelease.20630.1 -> To Version 1.0.0-prerelease.21061.2

* Update dependencies from https://github.com/dotnet/runtime build 20210110.3

Microsoft.NETCore.ILAsm , runtime.native.System.IO.Ports , Microsoft.NETCore.DotNetHost , Microsoft.NET.Sdk.IL , Microsoft.NETCore.DotNetHostPolicy , System.Runtime.CompilerServices.Unsafe , System.Text.Json
 From Version 6.0.0-alpha.1.21053.3 -> To Version 6.0.0-alpha.1.21060.3

* Update dependencies from https://github.com/dotnet/xharness build 20210111.3

Microsoft.DotNet.XHarness.CLI , Microsoft.DotNet.XHarness.TestRunners.Xunit
 From Version 1.0.0-prerelease.20630.1 -> To Version 1.0.0-prerelease.21061.3

Co-authored-by: dotnet-maestro[bot] <dotnet-maestro[bot]@users.noreply.github.com>
* Resolve ILLink warnings in System.Reflection.TypeExtensions

Contributes to #45623

* Update API Compat baseline for Type.GetMembers() attribute change.
…sions (#46814)

Co-authored-by: Mitchell Hwang <mitchell.hwang@microsoft.com>
* Fix PKT

* Fix SAPI errors

* Disable test as appropriate

* enable more tests to fail to get messages

* reenable tests

* Update src/libraries/System.Speech/tests/SpeechRecognizerTests.cs

Co-authored-by: Stephen Toub <stoub@microsoft.com>

Co-authored-by: Stephen Toub <stoub@microsoft.com>
* Fix Ilasm Round Trip script by adding retry logic
Fix bash syntax
Update bash syntax to use while true

* Fix bash syntax error missing then

* bash - increment ilasm_count
* Adding support for Math/MathF.SinCos

* Fix the handling of sincos on __APPLE__

* Adjusting the SinCos internal call to work on all platforms

* Mark the public SinCos as intrinsic and ensure it is handled in gentree

* Dropping the NI_System_Math_SinCos intrinsic as it requires more extensive changes

* Removing unnecessary casts from the COMSingle and COMDouble SinCos calls
* Add RequiresAssemblyFilesAttribute

* Apply suggestions from code review

Co-authored-by: Eric Erhardt <eric.erhardt@microsoft.com>
…46772)

Refactor `WasmApp.targets` to:

1. Move duplicated properties, targets from the various wasm projects to a common location
2. Split `WasmApp.targets` into:
     a. `WasmApp.InTree.props`, and `WasmApp.props`
     b. `WasmApp.InTree.targets`, and `WasmApp.targets`

- The props/targets split makes this similar to a SDK
- `WasmApp.{props, targets}` - have the core bits to build a wasm app, but don't have anything specific to the `dotnet/runtime` tree
- `WasmApp.InTree.{props, targets}` - have bits to facilitate local builds in `dotnet/runtime`, eg. to use a local build of the runtime pack

This simplifies project files a lot. And sets things up for being able to have a wasm SDK, and be able to use them to build wasm apps outside the tree.

Note: In case of the *InTree* files, they depend on runtime specific properties being set, like `$(ArtifactsBinDir)` which are set in inherited `Directory.Build*` files, which tend to override properties even if they are already set. Due to this, in some cases we have to import `*InTree*` files after the inherited ones, and set some properties before, and some after importing `Directory.Build.*`.

* [wasm] WasmAppBuilder: add new input for extra files to deploy

.. to the bundle. This eliminates the need to hack it in with
aftertargets.

* [wasm] samples: Extract the common bits to Directory.Build.*

- Also, introduce WasmApp.props, to have some of the property defaults
  that aren't specific to the samples

* [wasm] samples: no need to add cs files manually

* [wasm] Extract common bits from wasm FunctionalTests to Directory.Build*

* [wasm] Add WasmApp.InTree.{props,targets}

The *InTree* files have the properties/targets required to build within
the tree. The main `WasmApp.{props,targets}` should be on the way to
becoming useful outside the tree.

* [wasm] Update debugger-tests to use the InTree targets

* [wasm] Add  to debugger tests, consistent with other make targets

* [wasm] use TrickRuntimePackLocation target only if applicable

* [wasm] add EMSDK_PATH default value for wasm projects

* [wasm] Move RebuildWasmAppBuilder to InTree.targets

* [wasm] Simplify functionalTests project files

* [wasm] Fix tests in src/tests run with WasmTestRunner

* little cleanup

* Address review feedback from @mdh1418

* add comment

* [wasm] update build/README.md

* Add comment for InTree* files

* [wasm] sample - move some non-debug properties to D.B.props

* [wasm] Move CopySampleAppToHelix* to InTree.targets
* Implement managed side of CLong/CULong/NFloat.

* Make CLong, CULong, and NFloat intrinsically handled correctly by the JIT.

* Add framework tests for CLong, CULong, and NFloat.

* Add interop test of CLong to validate calling convention semantics.

* Update CULong.cs

* Fix implicit conversions.

* Fix overflow and equality test failures.

* Fix formatting.

* Fix formatting and add function header.

* Add doc comments.

* Don't throw on float out of range. Rename tests.

* Rewrite EqualsTest implementations to be more straightforward.

* Fix NFloat tests.

* Use .Equals instead of ==

* Use ToString directly instead of hard coding the expected value.

* Update the emitted assembly stub's thiscall retbuf support for x86 to account for the new native exchange types.

* Add sizeof tests.

* Add test with struct containing CLong.

* Disable ThisCallTest on Mono due to #46820

* validate type name.
…ternalParent` (#46875)

throws `System.PlatformNotSupportedException : Cannot wait on monitors on this runtime.`

Partially fixes dotnet/runtime#46768
geoffkizer and others added 27 commits January 22, 2021 12:43
move existing remote server tests in HttpClientHandlerTest.cs into a new file, HttpClientHandlerTest.RemoteServer.cs

Co-authored-by: Geoffrey Kizer <geoffrek@windows.microsoft.com>
…2.1 (#47332)

[master] Update dependencies from mono/linker
* Delete custom FastRandom from ThreadPool

- Consolidates 128/256-bit variants of Xoshiro** into the same class name
- Uses that from ThreadPool instead of its custom xorshift-based algorithm

Throughput stays the same (~2.5ns per next random value) and incurs just one additional object allocation per thread pool thread.
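
For readers unfamiliar with the generator family named above, here is a minimal Python sketch of the 128-bit variant, xoshiro128\*\*, following its published reference description. It is not the runtime's implementation: the class name, the seeding interface, and the use of Python integers masked to 32 bits are assumptions made for the example.

```python
MASK32 = 0xFFFFFFFF


def _rotl32(x: int, k: int) -> int:
    """Rotate a 32-bit value left by k bits."""
    return ((x << k) | (x >> (32 - k))) & MASK32


class Xoshiro128StarStar:
    """Sketch of xoshiro128** with four 32-bit state words."""

    def __init__(self, s0: int, s1: int, s2: int, s3: int) -> None:
        self.s = [s0 & MASK32, s1 & MASK32, s2 & MASK32, s3 & MASK32]
        if not any(self.s):
            raise ValueError("state must not be all zero")

    def next_u32(self) -> int:
        s = self.s
        # Output scrambler: multiply, rotate, multiply.
        result = (_rotl32((s[1] * 5) & MASK32, 7) * 9) & MASK32
        t = (s[1] << 9) & MASK32
        # Linear state transition.
        s[2] ^= s[0]
        s[3] ^= s[1]
        s[1] ^= s[2]
        s[0] ^= s[3]
        s[2] ^= t
        s[3] = _rotl32(s[3], 11)
        return result
```

The generator keeps only four words of state per instance, which is consistent with the small per-thread allocation cost mentioned above.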

* Consolidate TARGET_64/32BIT DefineConstants
* Remove excess allocations in Uri.ReCreateParts

* Fix Compression offset

* Revert VSB optimizations

* Use noEscape.Length

* Simplify TryGetUnicodeEquivalent

* Remove ValueStringBuilderExtensions

* Index into chars instead of dest

* Remove unreachable code block

* Use StackallocThreshold constant

* Add comments about why MemoryMarshal is used to recreate the span
- address difference I introduced in dotnet/aspnetcore#29511 and more
    - conflict detected in dotnet/aspnetcore#29520
- also s|aspnet/aspnetcore|dotnet/aspnetcore|
* Remove dead code around x86 delegate interop.

* Simplify GetUMEntryThunk.
* Fold casts of constants in the importer

* Do not use a separate local for the cast operand

* Do not try to access the operation that has been folded

* Condition the call to CheckDivideByConstOptimized on success of the folding
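
As a rough illustration of what folding a cast of a constant means, the sketch below evaluates an unchecked integer cast at "compile time". It is written in Python and is not the importer's code: `fold_cast` is a hypothetical helper, and overflow-checked casts are deliberately left out.

```python
def fold_cast(value: int, bits: int, signed: bool) -> int:
    """Fold an unchecked integer cast of a constant to a `bits`-wide type."""
    if bits <= 0:
        raise ValueError("bits must be positive")
    mask = (1 << bits) - 1
    truncated = value & mask  # keep only the low `bits` bits
    if signed and truncated >> (bits - 1):
        truncated -= 1 << bits  # reinterpret the top bit as the sign
    return truncated
```

For instance, casting the constant 300 to a signed 8-bit type folds to 44, so no cast node needs to survive into later phases.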
…NativeAOT-LLVM

# Conflicts:
#	eng/Subsets.props
#	src/coreclr/CMakeLists.txt
#	src/coreclr/nativeaot/System.Private.CoreLib/src/System.Private.CoreLib.csproj
#	src/coreclr/tools/aot/ILCompiler/ILCompiler.csproj
#	src/libraries/Native/Unix/System.Globalization.Native/CMakeLists.txt
[master] Update dependencies from mono/linker
* Fixed compilation for libicu 68.

* Fixed missing include.

Co-authored-by: Jan Kotas <jkotas@microsoft.com>
* Optimize Interlocked.Exchange and Interlocked.CompareExchange for IntPtr

* Address feedback
If the generic method was metadata-enabled, it would be emitted as a `QualifiedMethod`, not `MemberReference`.

Co-authored-by: vargaz <vargaz@users.noreply.github.com>
Use the non-Emit codepath if `IsDynamicCodeSupported` is false.
@yowl

yowl commented Jan 29, 2021

this is a mess :-(

@yowl yowl closed this Jan 29, 2021
yowl pushed a commit to yowl/runtimelab that referenced this pull request Mar 2, 2023
* Support Arm64 "constructed" constants in SuperPMI asm diffs

SuperPMI asm diffs tries to ignore constants that can change between
multiple replays, such as addresses that the replay engine must generate
and not simply hand back from the collected data.

Often, addresses have associated relocations generated during replay.
SuperPMI can use these relocations to adjust the constants to allow
two replays to match. However, there are cases on Arm64 where an address
both doesn't report a relocation and is "constructed" using multiple
`mov`/`movk` instructions.

One case is the `allocPgoInstrumentationBySchema()`
API which returns a pointer to a PGO data buffer. An address within this
buffer is constructed via a sequence such as:
```
mov     x0, #63408
movk    x0, #23602, lsl #16
movk    x0, #606, lsl #32
```

When SuperPMI replays this API, it constructs a new buffer and returns that
pointer, which is used to construct various actual addresses that are
generated as "constructed" constants, shown above.

This change "de-constructs" the constants and looks them up in the replay
address map. If base and diff match the mapped constants, there is no asm diff.

* Fix 32-bit build

I don't think we fully support 64-bit replay on 32-bit host, but this
fix at least makes it possible for this case.

* Support more general mov/movk sequence

Allow JIT1 and JIT2 to have a different sequence of
mov/movk[/movk[/movk]] that map to the same address in the
address map. That is, the replay constant might require a different
set of instructions (e.g., if a `movk` is missing because its constant
is zero).
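
The "de-construction" step can be sketched in Python as follows. This is an illustration of the idea only, not SuperPMI's code: the regular expression, the function names `reconstruct_constant` and `same_address`, and the textual instruction format are assumptions for the example.

```python
import re
from typing import List

# Matches "mov x0, #imm" or "movk x0, #imm, lsl #shift" (decimal immediates).
_MOV_RE = re.compile(
    r"^\s*(mov|movk)\s+(\w+),\s*#(\d+)(?:,\s*lsl\s+#(\d+))?\s*$", re.IGNORECASE
)


def reconstruct_constant(lines: List[str]) -> int:
    """Rebuild the 64-bit constant produced by a mov/movk[/movk[/movk]] sequence."""
    if not lines:
        raise ValueError("empty instruction sequence")
    value = 0
    reg = None
    for index, line in enumerate(lines):
        match = _MOV_RE.match(line)
        if match is None:
            raise ValueError(f"not a mov/movk instruction: {line!r}")
        op = match.group(1).lower()
        shift = int(match.group(4) or 0)
        imm = int(match.group(3))
        if imm > 0xFFFF or shift not in (0, 16, 32, 48):
            raise ValueError(f"invalid immediate or shift: {line!r}")
        if index == 0:
            if op != "mov":
                raise ValueError("sequence must start with mov")
            reg = match.group(2)
            value = imm << shift  # mov clears all other bits
        else:
            if op != "movk" or match.group(2) != reg:
                raise ValueError("expected movk on the same register")
            # movk replaces one 16-bit field and keeps the rest.
            value = (value & ~(0xFFFF << shift)) | (imm << shift)
    return value & 0xFFFFFFFFFFFFFFFF


def same_address(seq_a: List[str], seq_b: List[str]) -> bool:
    """Two sequences match if they build the same constant, even with different lengths."""
    return reconstruct_constant(seq_a) == reconstruct_constant(seq_b)
```

Comparing reconstructed values rather than instruction text is what lets a sequence that omits a zero `movk` match one that includes it; a real diff tool would then look the value up in the replay address map, which this sketch omits.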