Add support for obtaining path parameter information from HttpServletRequest #67

Open
glassfishrobot opened this issue Mar 11, 2013 · 6 comments

URI paths can contain parameters, independent of the query string. Parameters belong to the path segment within which they appear and to the URI as a whole. Parameters are separated from the segment and from each other with a semicolon (;), and parameter values are separated from each other with a comma (,). For example, consider the following URL:

http://www.example.org/foo;x=1;y=2/bar;a=3,4;y=5

In this example, x is 1 and y is 2 for the /foo segment, while a is [3, 4] and y is 5 for the /bar segment. For the entire URL, a appears once and has 2 values, x appears once and has 1 value, and y appears twice with 1 value in each occurrence.
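To make the per-URL view concrete, here is a minimal sketch that parses the example path into a map of parameter name to one list of values per occurrence. The class and method names are hypothetical and not part of any servlet API, and percent-decoding is deliberately left out to keep the sketch short.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of the servlet API.
final class PathParameterSketch {

    /**
     * Parses a raw (still percent-encoded) path. Each map value holds one
     * list of strings per segment in which the parameter appears, in URI order.
     */
    static Map<String, List<List<String>>> parse(String rawPath) {
        Map<String, List<List<String>>> all = new LinkedHashMap<>();
        for (String segment : rawPath.split("/")) {
            String[] parts = segment.split(";");
            // parts[0] is the segment itself; the rest are name=value,value pairs
            for (int i = 1; i < parts.length; i++) {
                String[] pair = parts[i].split("=", 2);
                String values = pair.length == 2 ? pair[1] : "";
                // a limit of -1 keeps empty values instead of dropping them
                all.computeIfAbsent(pair[0], k -> new ArrayList<>())
                   .add(Arrays.asList(values.split(",", -1)));
            }
        }
        return all;
    }

    public static void main(String[] args) {
        // prints {x=[[1]], y=[[2], [5]], a=[[3, 4]]}
        System.out.println(parse("/foo;x=1;y=2/bar;a=3,4;y=5"));
    }
}
```

Here y maps to two entries because it appears in two segments, while a maps to a single entry holding two values.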

The servlet spec already recognizes path parameters, though it does not actually provide an interface for extracting them. As an example of this, if application /foo is deployed at example.org and it has a Servlet mapped to /bar, the aforementioned URL will match that context and servlet. A call to HttpServletRequest#getContextPath() will return /foo, not /foo;x=1;y=2, and a call to HttpServletRequest#getServletPath() will return /bar, not /bar;a=3,4;y=5.
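The matching behavior described above amounts to ignoring everything from a semicolon up to the next slash. The following is a rough sketch of that idea only, using a hypothetical helper name; it does not claim to show how any particular container implements its mapping.

```java
// Hypothetical helper for illustration only; not how any container is known to work.
final class PathStripSketch {

    /** Removes the ";name=value" portions of every segment in a raw path. */
    static String stripPathParameters(String rawPath) {
        StringBuilder out = new StringBuilder();
        boolean inParameters = false;
        for (char c : rawPath.toCharArray()) {
            if (c == '/') {
                inParameters = false;   // a new segment starts
            } else if (c == ';') {
                inParameters = true;    // the rest of this segment is parameters
            }
            if (!inParameters) {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints /foo/bar
        System.out.println(stripPathParameters("/foo;x=1;y=2/bar;a=3,4;y=5"));
    }
}
```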

This suggestion is to add two methods to HttpServletRequest:

...
    /**
     * Returns all of the path (matrix) parameters that appear in the request URI. The keys in the
     * map are the parameter names. The map values are lists of entries. If a parameter appears
     * in one path segment, there will be one value in the list, and that value may be one or more
     * strings. If a parameter appears in multiple path segments, there will be a value in the list
     * for each path segment, in the order the path segments appear in the URI. Each value may
     * be one or more strings.
     * <p>
     * Path parameters are separated from their segments and the ... [explanation from above]
     *
     * @return the parameters present in all path segments in the URI.
     */
    Map<String, List<String[]>> getPathParameters(); // could be Map<String, List<List<String>>> instead ...
    /**
     * Returns a list of all path segments in the request URI. Path segments are separated by the
     * forward slash (/). The path segments returned by this method will include the context
     * path and the Servlet path.
     *
     * @return a list of all the path segments in the request URI, in the order they appear.
     */
    List<PathSegment> getPathSegments();
...

A call to either getPathParameters or getPathSegments results in the processing and caching of all path parameters. This is independent of the processing and caching of request parameters (getParameter, getParameterNames, etc.). The processing of path parameters should not trigger the processing of request parameters, and vice versa. If it is easier or more efficient, the container may process path parameters when it decodes the URI. Note that the URI must be split on its delimiters before it is decoded, and the resulting parameter names and values should then be decoded individually.

(Importantly, if I call getPathParameters or getPathSegments within a filter, it should not block while POST parameters or multipart data (unrelated) are processed.)
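One way to picture the "process once, cache, and never touch the request body" requirement is a lazy holder that depends only on the raw URI string. This is a sketch under that assumption; the class name and the parser function are hypothetical and stand in for whatever a container would do internally.

```java
import java.util.function.Function;

// Hypothetical holder for illustration only. It sees only the raw URI,
// so it cannot trigger the parsing of POST parameters or multipart data.
final class LazyPathInfoSketch<T> {
    private final String rawUri;
    private final Function<String, T> parser;
    private T cached;
    private boolean parsed;

    LazyPathInfoSketch(String rawUri, Function<String, T> parser) {
        this.rawUri = rawUri;
        this.parser = parser;
    }

    /** Parses on the first call and returns the cached result afterwards. */
    synchronized T get() {
        if (!parsed) {
            cached = parser.apply(rawUri);
            parsed = true;
        }
        return cached;
    }
}
```

Both getPathParameters and getPathSegments could then read from the same cached result, so whichever is called first pays the parsing cost.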

The new javax.servlet.http.PathSegment interface is modeled on the javax.ws.rs.core.PathSegment interface, which exists for the same purpose:

package javax.servlet.http;

public interface PathSegment
{
    /**
     * Returns the path for this specific segment, including the leading forward slash (/).
     *
     * @return the path for this segment.
     */
    String getPath();

    /**
     * Returns the path (matrix) parameters that appear in this segment. The keys in
     * the map are the parameter names. The values are all of the values assigned to
     * the corresponding parameters. A parameter may have one or more values.
     * <p>
     * Path parameters are separated from their segments and the ... [explanation from above]
     */
    Map<String, String[]> getParameters(); // could be Map<String, List<String>> instead
}

There is currently a workaround for this, though it has its disadvantages. Parameters could simply be processed as needed by the application using its own or third-party code. Or a filter could be written to process parameters and add them to the request as a request attribute. The key problem with both of these approaches is that the container knows what character encoding was used for the URI, but the application does not. It would be more accurate and reliable for the container to perform the parameter processing.
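The encoding concern can be shown with a short sketch: the same percent-encoded bytes decode to different strings depending on the charset the application guesses. The class name and the sample value are made up for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demonstration for illustration only.
final class UriCharsetSketch {

    static String decode(String raw, String charset) {
        try {
            return URLDecoder.decode(raw, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String raw = "caf%C3%A9"; // the two bytes 0xC3 0xA9
        // Decoded as UTF-8, the two bytes form one character (e with acute accent).
        System.out.println(decode(raw, "UTF-8"));
        // Decoded as ISO-8859-1, each byte becomes its own character.
        System.out.println(decode(raw, "ISO-8859-1"));
    }
}
```

An application that hard-codes one charset produces the wrong value whenever the URI was actually encoded with another. Separately, URLDecoder implements form decoding and turns "+" into a space, which may not be the desired behavior for path content.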

For more information, I have included parts of a sample filter below that I created for use in my application. Some of the code (namely the POJOs) is omitted and must be inferred.

...
    @Override
    public void doFilter(ServletRequest request, ServletResponse response,
         FilterChain chain) throws IOException, ServletException
    {
        String[] paths = ((HttpServletRequest)request).getRequestURI()
                .substring(1).split("/");
        PathInfo info = new PathInfo();

        for(String path : paths)
        {
            String[] parts = path.split(";");
            PathSegment segment = new PathSegment();
            segment.path = parts[0];
            for(int i = 1; i < parts.length; i++)
            {
                String[] p = parts[i].split("=", 2);
                String key = decode(p[0]);
                if(p.length == 2)
                    segment.parameters.put(key, decode(p[1].split(",", -1)));
                else
                    segment.parameters.put(key, new String[] {""});
                if(!info.parameters.containsKey(key))
                    info.parameters.put(key, new ArrayList<>());
                info.parameters.get(key).add(segment.parameters.get(key));
            }
            info.segments.add(segment);
        }

        request.setAttribute("com.wrox.pathInfo", info);

        chain.doFilter(request, response);
    }

    private String decode(String original)
    {
        try {
            return URLDecoder.decode(original, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new RuntimeException(e); // not possible
        }
    }

    private String[] decode(String[] original)
    {
        String[] newValues = new String[original.length];
        for(int i = 0; i < original.length; i++)
        {
            try {
                newValues[i] = URLDecoder.decode(original[i], "UTF-8");
            } catch (UnsupportedEncodingException e) {
                throw new RuntimeException(e); // not possible
            }
        }
        return newValues;
    }
...

Estimate 30 minutes to add the relevant methods/interfaces and 2.5 hours to update the spec doc.

Environment

n/a

@glassfishrobot Commented
Reported by beamerblvd

@glassfishrobot Commented
beamerblvd said:
Also, I believe the spec should specify the following important notes:

  • Containers should preserve empty parameter values. So, if a parameter is given as x=1,,2,3,,,4, (with a trailing comma), the resulting values should be ["1", "", "2", "3", "", "", "4", ""]. If x=, then the resulting values should be [""].
  • Users should be warned that browsers do not recognize or interpret path parameters, and as a result they can interfere with cookies. If a cookie is set to path /foo, requests to /foo/bar will include the cookie but requests to /foo;a=1/bar will not include the cookie. However, requests to /foo/bar;a=1 will include the cookie, since the path parameters are not interfering with the cookie path in this case.
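Both notes can be checked with a small sketch. The value splitting relies on the negative limit of String.split, which keeps trailing empty strings. The cookie check follows the path-match rule of RFC 6265 section 5.1.4 as I understand it; actual browser behavior may differ in details. The class and method names are hypothetical.

```java
import java.util.Arrays;

// Hypothetical helpers for illustration only.
final class SpecNotesSketch {

    /** Splits a raw parameter value on commas, preserving empty values. */
    static String[] splitValues(String rawValues) {
        return rawValues.split(",", -1);
    }

    /** Cookie path-match in the style of RFC 6265 section 5.1.4. */
    static boolean cookiePathMatches(String cookiePath, String requestPath) {
        if (requestPath.equals(cookiePath)) {
            return true;
        }
        if (!requestPath.startsWith(cookiePath)) {
            return false;
        }
        // The prefix must end at a segment boundary.
        return cookiePath.endsWith("/")
                || requestPath.charAt(cookiePath.length()) == '/';
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitValues("1,,2,3,,,4,"))); // [1, , 2, 3, , , 4, ]
        System.out.println(cookiePathMatches("/foo", "/foo/bar"));       // true
        System.out.println(cookiePathMatches("/foo", "/foo;a=1/bar"));   // false
        System.out.println(cookiePathMatches("/foo", "/foo/bar;a=1"));   // true
    }
}
```

Calling split(",") without the limit would silently drop the trailing empty value, which is why the explicit -1 matters here.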

@glassfishrobot Commented
beamerblvd said:
If this can make it in Servlet 3.1, great. If not, no big deal. There is a workaround, so it is not crucial that this be in 3.1.

@glassfishrobot Commented
rstoyanchev said:
While the above understanding of path parameters is correct, note that it represents one of several styles of path parameters. RFC 3986 (section 3.3) is relatively vague and leaves a lot of room:

For example, the semicolon (";") and equals ("=") reserved characters are
often used to delimit parameters and parameter values applicable to
that segment.  The comma (",") reserved character is often used for
similar purposes.  For example, one URI producer might use a segment
such as "name;v=1.1" to indicate a reference to version 1.1 of
"name", whereas another might use a segment such as "name,1.1" to
indicate the same.

This probably reflects the fact that a few different styles of path parameters have evolved over time in the absence of a very precise definition. In addition to the above examples, here is one other example from the StackExchange API where a path segment contains a ";" separated list of ids (the ";" in this case is merely a separator):

http://api.stackoverflow.com/1.1/usage/methods/comments-by-ids

@glassfishrobot Commented
This issue was imported from java.net JIRA SERVLET_SPEC-67

@glassfishrobot glassfishrobot self-assigned this Jun 6, 2018
@gregw gregw added Enhancement New feature or request and removed Component: Misc labels Jan 18, 2020