Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Uri] A lightweight alternative for System.Uri #34873

Closed
Rodrigo-Andrade opened this issue Apr 12, 2020 · 14 comments
Closed

[Uri] A lightweight alternative for System.Uri #34873

Rodrigo-Andrade opened this issue Apr 12, 2020 · 14 comments
Labels
api-suggestion Early API idea and discussion, it is NOT ready for implementation area-System.Net tenet-performance Performance related issue
Milestone

Comments

@Rodrigo-Andrade
Copy link

As the SocketsHttpHandler gets faster, many of my hot paths are now in System.Uri class.

There is a lot of parsing in any case of paths and/or QueryString parameters that varies on each request. System.Uri will always validate the entire string, not to mention that a string must be allocated each time. Would be nice to have an alternative, be forsake System.Uri for a new type in HttpRequestMessage, or provide "unsafe" apis for System.Uri, so it would not validate every part.

@Dotnet-GitSync-Bot Dotnet-GitSync-Bot added area-System.Net untriaged New issue has not been triaged by the area owner labels Apr 12, 2020
@ghost
Copy link

ghost commented Apr 12, 2020

Tagging @dotnet/ncl as an area owner. If you would like to be tagged for a label, please notify danmosemsft.

@davidsh
Copy link
Contributor

davidsh commented Apr 12, 2020

@MihaZupan

@MihaZupan
Copy link
Member

MihaZupan commented Apr 13, 2020

System.Uri will always validate the entire string

It tries not to - it will lazily parse more if/when some properties are actually read. One big problem with this rn is that if the Uri contains just a single non-ascii char/escaped sequence it will give up completely and do a full parse every time. Changing this would be beneficial, though not trivial.

Before considering a new type, we should invest into perf improvement opportunities in the Uri implementation. Work that improves Uri for net5 like #34864 #32552 #34774 #34794 #32025 #32694 #31860 will improve throughput and allocations (especially if you deal with Uris containing any non-ascii chars).

More considerations such as #34618 should remove all overhead in cases when making multiple requests to the same REST API path. This should be very common given how many APIs are using POST, where the Uri can be reused.

#32606 and #32606 (comment) for improving perf when working with query strings.

UriBuilder is another type that can be significantly improved, and there are more improvements that could be made in Uri itself.

It's possible we could expose an API to fetch properties into a Span (suggested in #22903 (comment)) to avoid some overhead. It is now feasible to implement such an API, after changes like #34864.

That said, I wouldn't mind a high-perf type that only deals with Uris relevant to HttpClient/SocketsHttpHandler. We should, however, invest into perf improvements in the current Uri implementation first.

many of my hot paths are now in System.Uri class

Are you dealing with Uris that contain non-ascii chars? If you've measured perf here, are you seeing a lot of CPU spent or just a ton of allocations? If it's a lot of CPU time you are seeing, could you share a sample Uri that you are dealing with.

@Rodrigo-Andrade
Copy link
Author

Rodrigo-Andrade commented Apr 13, 2020

I don't have examples of the most expensive uris, but i think i have an snapshot of a trace from our Api Gateway (runs on dotnetcore), it sees all uris from all microservices.

image

This is the use breakdown from that app, in this case is net5 preview2 on windows.

I'll try to check if non-ascii chars is a common case.

@karelz karelz added api-suggestion Early API idea and discussion, it is NOT ready for implementation needs more info tenet-performance Performance related issue and removed untriaged New issue has not been triaged by the area owner labels Apr 13, 2020
@karelz karelz changed the title A lightweight alternative for System.Uri [Uri] A lightweight alternative for System.Uri May 6, 2020
@MihaZupan MihaZupan added this to the Future milestone Jul 13, 2020
@ghost ghost added the no-recent-activity label Dec 7, 2020
@dotnet dotnet deleted a comment Dec 7, 2020
@ghost ghost removed the no-recent-activity label Dec 7, 2020
@karelz
Copy link
Member

karelz commented Jan 15, 2021

@Rodrigo-Andrade how much CPU % does the whole Uri class take from your workload?
I wonder what you're doing with the Uris -- for example, it did not pop in our YARP reverse proxy scenarios, nor have we seen it in our partner workloads which got to us ...

@Rodrigo-Andrade
Copy link
Author

It improved a lot since the creation of this issue (my god it was 8 months ago, feels like yesterday, crazy times),
In terms of cpu time, things are much better now (i think some stuff got into de 5.0 release?)
Now the main problem is allocations:
image
It's the number 1 allocator on this workload.
The main code to create those uris is
new Uri($"{endpoint.Scheme}://{endpoint.Host}{GetAdjustedPath()}{GetAdjustedQueryString()}")

@MihaZupan
Copy link
Member

How are the Uris being used - only getting passed to HttpClient, or are you accessing other properties?
If calling GetComponents, what parameters are you using?

Yes, quite a lot of Uri improvements went into the 5.0 release.
Excess allocations in ReCreateParts will also be greatly reduced for the 6.0 release.

@Rodrigo-Andrade
Copy link
Author

Only getting passed to HttpClient.

@Rodrigo-Andrade
Copy link
Author

Just an update, the new ctor: Uri(string uriString, in UriCreationOptions creationOptions), solved most of my overhead, now the cpu time is in the host validation.
I am curious as to what prompted this new ctor that was introduced so late in the dev cycle...

@MihaZupan
Copy link
Member

MihaZupan commented Nov 9, 2021

Have you tried benchmarking your scenario on 6.0 without using the new ctor? There were significant improvements to ReCreateParts allocations in 6.0 (#34864).

There is some more context behind the API in #59099. The key reasons behind it being #52628 and #58057.
The aim behind the DangerousUseRawPathAndQuery flag was not performance, as you have to take care of all the formatting yourself anyway.

@karelz
Copy link
Member

karelz commented Nov 9, 2021

I wonder if we can close the issue then. Looks like the major concern has been addressed.

@Rodrigo-Andrade
Copy link
Author

I think you may, but this will eventually come back i think.

@karelz
Copy link
Member

karelz commented Nov 11, 2021

Sure, if it comes back with reasonable scenarios and impact on customers/apps, we can always revisit.
At this moment, this is a single request in 1.5 years.

Perhaps our iterations on perf will eventually show it up in traces and that might make the case.

@karelz karelz closed this as completed Nov 11, 2021
@karelz karelz modified the milestones: Future, 7.0.0 Nov 11, 2021
@MihaZupan
Copy link
Member

@Rodrigo-Andrade if you are seeing a performance improvement from using DangerousUseRawPathAndQuery, this means that Uri would be doing some transformations to your input. What kind of logic are you using to encode the Uri now?

@ghost ghost locked as resolved and limited conversation to collaborators Dec 12, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
api-suggestion Early API idea and discussion, it is NOT ready for implementation area-System.Net tenet-performance Performance related issue
Projects
None yet
Development

No branches or pull requests

5 participants