Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unable to send request URL path with dot-segments (without resolving) #8657

Open
jacob-pro opened this issue Jan 20, 2025 · 10 comments
Open
Labels
bug Bug in existing code

Comments

@jacob-pro
Copy link

jacob-pro commented Jan 20, 2025

It doesn't seem to be possible to send a URL with raw / unresolved dot-segments.

The problem happens even when passing a Java String or Java URL. How to reproduce:

@Test
void testUrl() throws Exception {
    ServerSocket serverSocket = new ServerSocket(0, 50);
    try (val executor = Executors.newVirtualThreadPerTaskExecutor()) {
        executor.submit(() -> {
            try {
                val socket = serverSocket.accept();
                try (var is = socket.getInputStream(); var os = socket.getOutputStream()) {
                    BufferedReader reader = new BufferedReader(new InputStreamReader(is));
                    var firstLine = reader.readLine(); // e.g. 'GET /http/foo HTTP/1.1'
                    System.out.println("Request Line: " + firstLine);
                    socket.close();
                }
            } catch (IOException e) {
                // ignored
            }
        });
        String url = "http://localhost:" + serverSocket.getLocalPort() + "/abc/../123";
        Request request = new Request.Builder().url(url).build();
        OkHttpClient client = new OkHttpClient().newBuilder().build();
        try (var response = client.newCall(request).execute()) {
            System.out.println(response);
        }
    }
}

Outputs:

Request Line: GET /123 HTTP/1.1

Expected:

Request Line: GET /abc/../123 HTTP/1.1

Similar problems:

  1. OkHttpClient automatically replaces a \ with a / in the URL
  2. OkHttpClient automatically escapes non-standard URL characters such as |<>"

An example reason this could be a problem is if you are using untrusted / external input to build a URL then you don't want to risk allowing access to an unexpected web-server directory via .. or an unexpected \ to / conversion.

Is there anyway to disable this URL normalisation behaviour?

@jacob-pro jacob-pro added the bug Bug in existing code label Jan 20, 2025
@yschimke
Copy link
Collaborator

OkHttp tries strongly to follow the specifications and do the right thing in all cases.

But there are some extra methods like addEncodedPathSegment that allow you to avoid this helpful and correct behaviour when needed.

@yschimke
Copy link
Collaborator

From the tests, it may still be doing more changes than you want

https://github.com/square/okhttp/blob/okhttp_4x/okhttp/src/test/java/okhttp3/HttpUrlTest.java

So if this is not enough then I'm not sure we will introduce APIs to use (what we would consider based on specs) less correct paths.

@jacob-pro
Copy link
Author

jacob-pro commented Jan 20, 2025

Thanks for replying 👍

It doesn't seem to creating the URL objects that are doing the transformations, for example:

New example below

So it must be something happening somewhere inside the OkHttpClient?

Would you be able to point me in the direction of where this is implemented? Maybe then I can understand what transformations you do and how to work around them.

@jacob-pro
Copy link
Author

not sure we will introduce APIs to use (what we would consider based on specs) less correct paths.

Yeah I fully appreciate that you want to implement the spec correctly, but maybe you don't want to permit sending these characters at all if they are not valid URLs.

My concern is mainly around the automatic transformations that you are doing and the vulnerabilities that could be introduced as a result when dealing with URLs sent between different libraries/servers/implementations.

Jetty for example will by default raise exceptions for some of these invalid URLs (although this can be disabled and allowed to pass-through)

https://github.com/jetty/jetty.project/blob/c3f88bafb4e393f23204dc14dc57b042e84debc7/jetty-core/jetty-http/src/main/java/org/eclipse/jetty/http/UriCompliance.java#L42-L49

@yschimke
Copy link
Collaborator

It doesn't seem to creating the URL objects that are doing the transformations, for example:

This example is using ... instead of ..

@yschimke
Copy link
Collaborator

The tests refer to https://datatracker.ietf.org/doc/html/rfc3986#section-5.4 which seems pretty clear for parsing.

I believe if we don't do this, things like caching won't work correctly. So this isn't likely to change.

@jacob-pro
Copy link
Author

jacob-pro commented Jan 21, 2025

This example is using ... instead of ..

Sorry! Yes you are right, it is the HttpUrl object that modifies them:

@Test
void testUrl() throws Exception {
    var okHttpUrl = HttpUrl.get("https://example.com/abc/../123");
    assertThat(okHttpUrl.toString()).isEqualTo("https://example.com/123");

    var okHttpBuilder = HttpUrl.get("https://example.com").newBuilder().addEncodedPathSegments("abc/../123").build();
    assertThat(okHttpBuilder.toString()).isEqualTo("https://example.com/123");

    val javaUrl = new URL("https://example.com/abc/../123");
    assertThat(javaUrl.toString()).isEqualTo("https://example.com/abc/../123");

    Request okHttpRequest = new Request.Builder()
        .url("https://example.com/abc/../123")  // String input
        .build();
    assertThat(okHttpRequest.url().toString()).isEqualTo("https://example.com/123");
}

@yschimke
Copy link
Collaborator

@swankjesse thoughts?

@jacob-pro
Copy link
Author

jacob-pro commented Jan 21, 2025

What also seems odd is that the addEncodedPathSegments still decodes the double dots:

// No encoding
var url1 = HttpUrl.get("https://example.com/").newBuilder().addEncodedPathSegments("foo/../bar").build();
assertThat(url1.toString()).isEqualTo("https://example.com/bar");

// Encoded once
var url2 = HttpUrl.get("https://example.com/").newBuilder().addEncodedPathSegments("foo/%2e%2e/bar").build();
assertThat(url2.toString()).isEqualTo("https://example.com/bar");

// Encoded twice
var url3 = HttpUrl.get("https://example.com/").newBuilder().addEncodedPathSegments("foo/%252e%252e/bar").build();
assertThat(url3.toString()).isEqualTo("https://example.com/foo/%252e%252e/bar");

@jacob-pro
Copy link
Author

Interestingly it is possible to do this:

            HttpUrl mockUrl = spy(HttpUrl.get(url));
            when(mockUrl.encodedPath()).thenReturn("/abc/../123");
            Request request = new Request.Builder().url(mockUrl).build();

resulting in:

Request Line: GET /abc/../123 HTTP/1.1

which is used here when sending the request:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Bug in existing code
Projects
None yet
Development

No branches or pull requests

2 participants