[Uri] Uri.IsWellFormedUriString() returns false for a URL which is correct #21626
Do you know what the behavior is on .NET Framework? In general, .NET Core behaves like .NET Framework. |
@davidsh Indeed, I just checked and get the same behavior on .NET 4.5.2. However, this doesn't explain why the function returns false for this URL. |
Thx for confirming .NET Framework behavior. This will have to be investigated to see why this is returning false. |
I just found whenever a '%' appears in a url, Uri.IsWellFormedUriString() will return false. Hope this can be a starting point to investigate this issue. |
This still reproduces if you shorten the URI to As far as I can tell, this happens because the URI contains both the character "é" ( If you don't encode the comma (i.e. I have no idea if this behavior is correct. |
As @svick said, I managed to overcome this issue by decoding the URL:
string decodedUrl = HttpUtility.UrlDecode(url);
Uri.IsWellFormedUriString(decodedUrl, UriKind.RelativeOrAbsolute); |
My best guess here is that somewhere in the code we are checking the string for encoded non-reserved characters, and that check incorrectly considers commas to be unreserved. This should be a fairly simple issue to address for someone that wants to learn more about URI, so I'll mark this as up for grabs. If it lasts too long without getting picked up, I'll go ahead and fix it. |
I'm seeing this behaviour in a .NetFramework 4.5 project also. |
@hades200082 we are not tracking .NET Framework bugs in CoreFX repo. |
In .NET Core 2.1 I am also encountering what looks to be the same bug, or a very similar one:
var uri = @"https://maps.googleapis.com/maps/api/geocode/json?address=%2C%2CMontr%C3%A9al%2CQuebec%2CCanada&sensor=false";
Uri.IsWellFormedUriString(uri, UriKind.Absolute); // returns false, although the URI above is valid
However, if I leave the URI unencoded, it passes the IsWellFormedUriString check:
var uri = @"https://maps.googleapis.com/maps/api/geocode/json?address=,,Montréal,Quebec,Canada&sensor=false";
Uri.IsWellFormedUriString(uri, UriKind.Absolute); // returns true |
@nicholasb90 can you please create minimal repro? (as in "shortest problematic Uri possible") |
cc @wtgodbe |
I suspect the issue is related to combining encoded characters that require one encode value and characters that require multiple encode values. For example, 学 encodes to %E5%AD%A6 while [ encodes to %5B. Here are some examples:
public class UriTests
{
[Fact] // Fails
public void IsWellFormedUriString_ReturnTrue_GivenEncodedQueryStringWithCommaAndAccentCharacter()
{
var uri = @"http://g.c/j?a=%2C%C3%A9"; //encoded characters in query: ,é
Assert.True(Uri.IsWellFormedUriString(uri, UriKind.Absolute));
}
[Fact] // Passes
public void IsWellFormedUriString_ReturnTrue_GivenEncodedQueryStringWithComma()
{
var uri = @"http://g.c/j?a=%2C"; //encoded characters in query: ,
Assert.True(Uri.IsWellFormedUriString(uri, UriKind.Absolute));
}
[Fact] // Passes
public void IsWellFormedUriString_ReturnTrue_GivenEncodedQueryStringWithAccentCharacter()
{
var uri = @"http://g.c/j?a=%C3%A9"; //encoded characters in query: é
Assert.True(Uri.IsWellFormedUriString(uri, UriKind.Absolute));
}
[Fact] // Fails
public void IsWellFormedUriString_ReturnTrue_GivenEncodedQueryStringWithOpenBracketAndDoubleByteCharacter()
{
var uri = @"http://g.c/j?a=%E5%AD%A6%5B"; //encoded characters in query: 学[
Assert.True(Uri.IsWellFormedUriString(uri, UriKind.Absolute));
}
[Fact] // Passes
public void IsWellFormedUriString_ReturnTrue_GivenEncodedQueryStringWithOpenBracket()
{
var uri = @"http://g.c/j?a=%5B"; //encoded characters in query: [
Assert.True(Uri.IsWellFormedUriString(uri, UriKind.Absolute));
}
[Fact] // Passes
public void IsWellFormedUriString_ReturnTrue_GivenEncodedQueryStringWithDoubleByteCharacter()
{
var uri = @"http://g.c/j?a=%E5%AD%A6"; //encoded characters in query: 学
Assert.True(Uri.IsWellFormedUriString(uri, UriKind.Absolute));
}
} |
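To make the "one encode value versus multiple encode values" distinction concrete, here is a minimal sketch that prints how many UTF-8 bytes (and therefore how many %XX triplets) each character from the tests above needs. The class and method names (`EncodingLengths`, `Utf8ByteCount`) are made up for illustration; only `Uri.EscapeDataString` and `Encoding.UTF8.GetByteCount` are real framework calls.

```csharp
using System;
using System.Text;

public static class EncodingLengths
{
    // Hypothetical helper: number of UTF-8 bytes in s, which equals the
    // number of %XX triplets produced when s is fully percent-encoded.
    public static int Utf8ByteCount(string s) => Encoding.UTF8.GetByteCount(s);

    public static void Main()
    {
        // "\u00E9" is é and "\u5B66" is 学; escapes are used so the source
        // file's own encoding cannot change the result.
        foreach (var s in new[] { ",", "[", "\u00E9", "\u5B66" })
        {
            Console.WriteLine($"{s} -> {Uri.EscapeDataString(s)} ({Utf8ByteCount(s)} byte(s))");
        }
    }
}
```

The comma and the open bracket each take a single triplet (%2C, %5B), é takes two (%C3%A9) and 学 takes three (%E5%AD%A6), so the failing tests are exactly the ones that mix a single-byte escape with a multi-byte one.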
Try enabling IDN and IRI parsing in your App.config by adding this to your configuration section, to ensure correct handling of international character sets:
<uri>
<idn enabled="All"/>
<iriParsing enabled="true"/>
</uri>
After doing this, create a decoded version of your URL to avoid complications between encoded and decoded URLs:
string decodedURL = HttpUtility.UrlDecode(yourURLString);
Now you can check like this:
if (Uri.IsWellFormedUriString(yourURLString, UriKind.Absolute) || Uri.IsWellFormedUriString(decodedURL, UriKind.Absolute))
Maybe this is not a perfect solution, but it is the closest I could get to making this work as reliably as possible. By the way, I'm using .NET Framework 4.5.2, but I guess it should also work with lower versions. |
I just ran into this one. You probably have plenty of examples, but just to further confirm @nicholasb90's hypothesis:
Assert.True(Uri.IsWellFormedUriString("http://myhost.com/%26", UriKind.Absolute)); // pass
Assert.True(Uri.IsWellFormedUriString("http://myhost.com/%C3%A9", UriKind.Absolute)); // pass
Assert.True(Uri.IsWellFormedUriString("http://myhost.com/%26%C3%A9", UriKind.Absolute)); // fail
Is this a recommended work-around, i.e. using |
Triage: This will be breaking change - we will have to document it at minimum. |
I've had similar issues. In my case the method IsWellFormedUriString failed if it contained %2D instead of hyphen character (-) |
@FaizulHussain can you please update your reply with the code (e.g. like in https://github.com/dotnet/corefx/issues/19630#issuecomment-529069574)? It will be harder to miss in future. |
I just ran into this issue at work and can add that using non-ascii characters like Å or ตั together with any of the RFC 3986 section 2.2 Reserved Characters fails, ! * ' ( ) ; : @ & = + $ , / ? # [ ]. |
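A quick way to check this report on a given runtime is to pair each reserved character with a non-ASCII one and print what `Uri.IsWellFormedUriString` says. The sketch below is only an illustration: the class name `ReservedCharMatrix`, the `BuildCandidates` helper, and the host `example.com` are assumptions, not taken from the thread, and the results depend on the .NET version, so no expected output is given.

```csharp
using System;
using System.Collections.Generic;

public static class ReservedCharMatrix
{
    // RFC 3986 section 2.2 reserved characters, as listed in the comment above.
    public const string Reserved = "!*'();:@&=+$,/?#[]";

    // Hypothetical helper: builds one absolute URI per reserved character,
    // with the percent-encoded reserved character followed by the
    // percent-encoded non-ASCII text in the query string.
    public static IEnumerable<string> BuildCandidates(string nonAscii)
    {
        string encodedNonAscii = Uri.EscapeDataString(nonAscii);
        foreach (char c in Reserved)
        {
            // Uri.HexEscape turns a single character (<= 0xFF) into %XX.
            yield return "http://example.com/?a=" + Uri.HexEscape(c) + encodedNonAscii;
        }
    }

    public static void Main()
    {
        // "\u00C5" is Å, one of the characters mentioned in the report.
        foreach (string uri in BuildCandidates("\u00C5"))
        {
            Console.WriteLine($"{uri} -> {Uri.IsWellFormedUriString(uri, UriKind.Absolute)}");
        }
    }
}
```

Any line that prints False on an affected runtime is a candidate for a minimal repro.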
`Uri.IsWellFormedUriString()` doesn't return the expected result for specific urls, removed until the DotNet team actually resolves it ( dotnet/runtime#21626 )
We also just encountered this issue. And I just noticed this is up-for-grabs. A friend and I could be interested in doing this. @karelz everything good for a PR? And on the documentation part, whose responsibility would be that? The pr author, or you guys? I imagine the latter? |
For anyone that also encountered this issue on .Net Core, this may be of help: we tried every solution proposed out there (including the ones mentioned in this issue) and none worked for us. We thought of calling other language's libraries via interop, using regex, and other things, but ideally we wanted to stick to the .net implementation. Thus, we came up with a workaround for the time being (while we don't fix the underlying bug):
This won't work for cases where the offending character combinations are present in the user info or host (such as |
It has a total of 13 customer reports (1 was offline) -- only 2 are upvotes of the top post (1 is the original post). Can I ask everyone who has hit it to please upvote the top post? It will help us prioritize. Moving it to .NET 7.0 as it has a rather large impact. cc @MihaZupan |
This URI is also failing on .NET 6.0 @karelz .NET 6.0 was just released... having to wait for .NET 7 for a critical bug that should have been fixed 4 years ago is a bit ridiculous. |
When initializing a DataServiceContext there was a mandatory URI check relying on .NETs Uri.IsWellFormedUriString. Sadly there is an active issue on the .NET side that incorrectly flags URIs like "http://192.168.0.1:1234/Instance/ODataV4/Company('123-Customer Place Süd-Ost')/" as being invalid. The progress is tracked here: dotnet/runtime#21626 To give the user a simple option to prevent this issue from hindering further development with the library I added a simple bool to ignore this error.
Uri.IsWellFormedUriString() reports a false negative when mixing characters like Å or ตั together with any of the RFC 3986 section 2.2 Reserved Characters, ! * ' ( ) ; : @ & = + $ , / ? # [ ]. This change adds a (failing) unit test for this bug. Tests dotnet#21626
Closing in favour of #72632 |
I have a C# (.Net Core 1.1) app that needs to check if a URL is valid. I used the Uri.IsWellFormedUriString() which works pretty well but have a doubt about this one below which returns false. It seems to me that the URL is perfectly valid?
I used the very same URL with the PHP function below which says the URL is correctly formatted:
If I refer to the RFC3986 it seems this URL is correct. Am I missing something here?