URL scheme characters admitted by DECODE-URL more restrictive than those admitted by TRANSCODE #1327

rebolbot · 2009-11-08T10:48:47Z

Submitted by: meijeru

DECODE-URL parses the scheme part of a URL (the part before the :) with the following charset:
A-Z a-z 0-9 + - . (this is in accordance with RFC 1738).
It then applies TO-LIT-WORD, which rules out a scheme starting with a digit, + or -, although RFC 1738 seems to allow these.
TRANSCODE (i.e. the lexical scan) admits the following characters before the characteristic : of a URL! literal:

In initial position: A-Z a-z ! & = ? * . ^ _ ` | ~ (note the absence of digits and + -).

In subsequent positions: anything from ! to ~ except [ ] { } ( ) " / :
Thus TRANSCODE is much more permissive than either RFC 1738 or DECODE-URL.
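
To make the comparison concrete, here is a minimal Python sketch that models the two rule sets exactly as they are described above. It is not Rebol code and is not derived from the Rebol sources; the function names `decode_url_accepts` and `transcode_accepts` are hypothetical, and the TO-LIT-WORD step is modelled only as "no initial digit, + or -", which is all this report states about it.

```python
import string

# Character sets transcribed from the issue text above (assumption: the
# description is complete; nothing here is read from the Rebol sources).
DECODE_URL_SCHEME = set(string.ascii_letters + string.digits + "+-.")

TRANSCODE_FIRST = set(string.ascii_letters) | set("!&=?*.^_`|~")
TRANSCODE_REST = {chr(c) for c in range(ord("!"), ord("~") + 1)} - set('[]{}()"/:')


def decode_url_accepts(scheme):
    """Model of DECODE-URL as described: charset check, then TO-LIT-WORD."""
    if not scheme or not set(scheme) <= DECODE_URL_SCHEME:
        return False
    # Per the report, TO-LIT-WORD rules out an initial digit, + or -.
    return scheme[0] not in set(string.digits + "+-")


def transcode_accepts(scheme):
    """Model of the lexical scan for the part before the : of a URL! literal."""
    if not scheme:
        return False
    return scheme[0] in TRANSCODE_FIRST and set(scheme[1:]) <= TRANSCODE_REST


if __name__ == "__main__":
    for s in ["http", "a+b", "a_b", "a;b", "~x!", "3d", "+x"]:
        print(f"{s!r:8} DECODE-URL: {decode_url_accepts(s)!s:5}  TRANSCODE: {transcode_accepts(s)}")
```

Under this model, schemes such as `a_b`, `a;b` or `~x!` pass the lexical scan but fail the DECODE-URL charset, which is the mismatch reported here.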

The restrictions mentioned would merit documenting, I think.

CC - Data [ Version: alpha 94 Type: Issue Platform: All Category: Datatype Reproduce: Always Fixed-in: none ]

IngoHohmann mentioned this issue Feb 5, 2020

Semicolons should be legal in URL #2381

Open

Siskin-Bot mentioned this issue Feb 15, 2020

URL scheme characters admitted by DECODE-URL more restrictive than those admitted by TRANSCODE Oldes/Rebol-issues#1327

Open
