
[interp] Intrinsify IndexOfOrLessThan #40705

Closed · wants to merge 1 commit

Conversation

@BrzVlad (Member) commented Aug 12, 2020

Speeds up json deserialization by 5-10%

Following discussion from #39733

@ghost commented Aug 12, 2020

Tagging subscribers to this area: @BrzVlad
See info in area-owners.md if you want to be subscribed.

@@ -1895,6 +1895,11 @@ interp_handle_intrinsics (TransformData *td, MonoMethod *target_method, MonoClas
!strncmp ("System.Runtime.Intrinsics", klass_name_space, 25) &&
!strcmp (tm, "get_IsSupported")) {
*op = MINT_LDC_I4_0;
} else if (!strcmp (m_class_get_image (target_method->klass)->assembly_name, "System.Text.Json") &&
!strcmp (klass_name_space, "System.Text.Json") &&
!strcmp (klass_name, "JsonReaderHelper")) {
@BrzVlad (Member, Author) commented on the diff:

not sure if this check is rigorous enough

@jkotas (Member) commented Aug 12, 2020

The strategy of intrinsifying random methods in up-stack libraries can quickly result in an unmaintainable codebase.

Is it really that important to speed up json parsing by 5-10% with the interpreter?

I think we are striking a wrong balance between performance and maintainability here.

@EgorBo (Member) commented Aug 12, 2020

> The strategy of intrinsifying random methods in up-stack libraries can quickly result in an unmaintainable codebase.
>
> Is it really that important to speed up json parsing by 5-10% with the interpreter?
>
> I think we are striking a wrong balance between performance and maintainability here.

I'd also suggest providing simple for-loop implementations for the various hand-unrolled routines in the BCL; e.g. https://github.com/dotnet/runtime/blob/master/src/libraries/System.Private.CoreLib/src/System/SpanHelpers.Byte.cs#L98-L186 can be implemented in 3 lines of code: smaller size and, as far as I understand, better interp performance.
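
To make the suggestion concrete, here is a minimal scalar sketch of the kind of search this PR intrinsifies; the class name, method name, and signature are illustrative assumptions, not the actual SpanHelpers or JsonReaderHelper code:

```csharp
using System;

// Illustration only: a plain-loop version of an "index of value0, value1, or anything
// less than lessThan" search, the kind of routine that SpanHelpers.Byte.cs hand-unrolls
// and vectorizes. A simple loop like this keeps the IL small and cheap to interpret.
internal static class ScalarSearchSketch
{
    internal static int IndexOfOrLessThan(ReadOnlySpan<byte> span, byte value0, byte value1, byte lessThan)
    {
        for (int i = 0; i < span.Length; i++)
        {
            byte b = span[i];
            if (b == value0 || b == value1 || b < lessThan)
                return i;
        }
        return -1; // no match
    }
}
```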

@jkotas (Member) commented Aug 12, 2020

Would it be better to provide the small, simpler implementations in C#? We have prior art for this - look for *.SizeOpt.cs under libraries.
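
For readers unfamiliar with that prior art: a *.SizeOpt.cs file keeps a straightforward fallback implementation in a separate source file that the build can compile in place of the vectorized one for size-constrained targets. The snippet below is only a rough sketch of that shape; the file name and method body are assumptions, not the actual repository code:

```csharp
// SpanHelpers.Byte.SizeOpt.cs (hypothetical file name): a simple, loop-based IndexOf
// that a size- or interpreter-focused build could include instead of the unrolled,
// vectorized implementation in SpanHelpers.Byte.cs.
using System.Runtime.CompilerServices;

internal static partial class SpanHelpers
{
    internal static int IndexOf(ref byte searchSpace, byte value, int length)
    {
        // Plain linear scan: no unrolling and no SIMD, so the IL stays small and
        // cheap for the interpreter to execute.
        for (int i = 0; i < length; i++)
        {
            if (Unsafe.Add(ref searchSpace, i) == value)
                return i;
        }
        return -1;
    }
}
```

Which file gets compiled would be decided at build time (as it is for the existing *.SizeOpt.cs files), so the vectorized path stays untouched for JIT/AOT targets.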

@BrzVlad (Member, Author) commented Aug 12, 2020

Here is my overall opinion on interp/bcl interaction.

  • I don't like adding intrinsics. At the end of the day they are just duplicated code and a hack.
  • I'm fine with a few intrinsics here and there, especially on low-level BCL structures like Span. This commit is definitely a little out of line since it is intrinsifying a method that is not even in corlib.
  • I assume that most of the few intrinsics added of late will be removed at some point and we will AOT the code in question instead.
  • I don't have strong feelings about this PR either way. I think, however, that with this change we are gaining some performance on a critical wasm path with very little work.
  • I don't plan on adding any more intrinsics. There aren't really other obvious candidates for the blazor workloads that matter (rendering and json serialization).
  • I don't think it is a good idea to make BCL changes to facilitate interpreter performance yet, because the interpreter's set of supported optimizations is not yet mature. We might just waste time and make certain BCL paths ugly in order to enable optimizations that could become useless in a year.
  • I think that once we squeeze everything we can out of interp performance, then we can make a more informed decision about what kind of BCL changes would make it more interpreter friendly.

This is more of a theoretical perspective. Given that in practice we might want some perf gains sooner, I'm fine with some temporary, uglier changes here and there.

@jkotas (Member) commented Aug 12, 2020

> we are gaining some performance on a critical wasm path with very little work.

Do we have a list of the critical end-to-end wasm scenarios somewhere? How much is this moving the needle on these scenarios?

@lewing (Member) commented Aug 12, 2020

Beyond general performance, improving JSON serialization/deserialization speed was identified as a key metric for the wasm net5.0 release. Adding @danroth27 for more details.

@danroth27 (Member):

> Do we have a list of the critical end-to-end wasm scenarios somewhere?

Yes, for .NET 5 we are focused on optimizing Blazor component rendering, particularly high density UI like you would find in a grid component, and JSON serialization/deserialization. @SteveSandersonMS and @pranavkm can point you to the specific benchmark implementations.

@steveharter (Member) left a comment:

Thanks!

@pranavkm (Contributor):

@jkotas (Member) commented Aug 13, 2020

Do we have goals we are trying to hit with these benchmarks for .NET 5? What is the expected impact of this change on these specific benchmarks?

@SamMonoRT (Member):

@pranavkm @jkotas - any concerns with merging this while we answer the question of specific benchmarks and potential improvements?

@SamMonoRT (Member):

Also waiting for the mono/mono mirror CI results.

@danroth27 (Member):

> Do we have goals we are trying to hit with these benchmarks for .NET 5?

@jkotas Yes, the goal for JSON serialization in .NET 5 is to double the performance. We've pretty much hit that goal at this point, but given that when we started this work JSON serialization was over an order of magnitude slower on WebAssembly than on .NET Core, there is still plenty of room for improvement.

@jkotas (Member) commented Aug 14, 2020

> any concerns with merging this while we answer the question of specific benchmarks and potential improvements?

I do have concerns with merging this. This is an architecturally unsound change.

Given that we have hit our performance goals for .NET 5, I think this should be closed. We will get more performance improvements in .NET 6 from AOT that won't require this change and that will deliver orders of magnitude more improvement than this change.

@danroth27 (Member):

> I do have concerns with merging this. This is an architecturally unsound change.
> Given that we have hit our performance goals for .NET 5, I think this should be closed.

Even though we've currently hit our perf goals for .NET 5, that doesn't mean that additional perf gains in this area aren't valuable. Could we take this change and the perf gain now for .NET 5, and then file a tracking issue to revert it as part of doing the AOT work in .NET 6? It seems like that would deliver the most value to customers and would address the long-term architecture concern.

@jkotas (Member) commented Aug 14, 2020

I do not think that this improvement is worth this hack. These hacks tend to take on a life of their own, cause pain, and are hard to get rid of.

Wasm in .NET 5 is going to be incredibly slow for anything compute intensive. This hack is not going to change what kind of wasm-targeting applications are feasible in .NET 5.

@benaadams (Member):

> Would it be better to provide the small, simpler implementations in C#? We have prior art for this - look for *.SizeOpt.cs under libraries.

Would something like this work? cdbd955 (from #40877)

@steveharter (Member):

> Would it be better to provide the small, simpler implementations in C#? We have prior art for this - look for *.SizeOpt.cs under libraries.
>
> Would something like this work? cdbd955 (from #40877)

Note that the original PR #39733 added a small bit of C#, but it was removed in the last commit of that PR on the assumption that Mono intrinsics would be less controversial.

@steveharter (Member):

> Do we have a list of the critical end-to-end wasm scenarios somewhere?
>
> Yes, for .NET 5 we are focused on optimizing Blazor component rendering, particularly high-density UI like you would find in a grid component, and JSON serialization/deserialization. @SteveSandersonMS and @pranavkm can point you to the specific benchmark implementations.

I also added this doc (as a Runtime Discussion) that goes over the serialization performance.

@ericstj (Member) commented Aug 18, 2020

@jkotas what if these things had behavior similar to the crossgen'ed assemblies in the shared framework? I'm not too familiar with the mechanics of how the Mono interpreter intrinsics work, but it seems like a similar problem, and we're OK with the lifecycle/architectural characteristics of crossgen/ngen.

@jkotas (Member) commented Aug 19, 2020

JIT/crossgen/ngen does not special case random private methods from up-stack assemblies. All methods that are special-cased by the codegen as intrinsics should live in CoreLib.

For up-stack assemblies, I am ok with special-casing code patterns (i.e. not special-casing private methods with specific names), so that the optimization kicks in for any method that contains the specific pattern. A lot of codegen optimizations are like this.

Also, System.Text.Json is not the only (json) parser out there. If we were to special-case its specific private implementation details by name in the codegen or interpreter, it would create an unfair playing field for the larger ecosystem.
