This repository has been archived by the owner on Apr 22, 2023. It is now read-only.

punycode: encoder contains bugs #2072

Closed
bnoordhuis opened this issue Nov 11, 2011 · 2 comments

@bnoordhuis
Member

Not all test cases from http://tools.ietf.org/html/rfc3492#section-7.1 pass. Tests added in bnoordhuis/node@1f217ee.

FAIL: expected "ihqwcrb4cv8a8dqg056pqjye", got "ihqwcrb2cv8a8dqg056pqjye"
FAIL: expected "他们为什么不说中文", got "诵为斈不他们中什么"
FAIL: expected "ihqwctvzc91f659drss3x8bo0yb", got "ihqwctvzcv8e659drss3x8bo0yb"
FAIL: expected "他們爲什麽不說中文", got "ꁈ他谵什朒不倠中玼"
FAIL: expected "b1abfaaepdrnnbgefbaDotcwatmq2g4l", got "b1abfaaepdrnnbgefbadotcwatmq2g4l"
FAIL: expected "TisaohkhngthchnitingVit-kjcr8268qyxafd2f1b9g", got "TisaohkhngthchnitingVit-kjcr8268qyxafd2f1b3g"
FAIL: expected "TạisaohọkhôngthểchỉnóitiếngViệt", got "TạisaohkhôngtọhểchỉnóitiếngViệt"
FAIL: expected "3B-ww4c5e180e575a65lsy2b", got "3B-ww4c5e708d575a65lsy2b"
FAIL: expected "3年B組金八先生", got "廿3総鉜B八疪先"
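For reference, the expected strings above can be reproduced with a small standalone encoder. What follows is a minimal sketch of the encoding procedure from RFC 3492 section 6.3, not the code from Node's punycode module or from any commit mentioned in this thread; the names `encode`, `adapt`, and `digitToChar` are chosen for illustration only, and overflow checks are omitted.

```javascript
// Bootstring parameters for Punycode (RFC 3492, section 5).
const BASE = 36, TMIN = 1, TMAX = 26, SKEW = 38, DAMP = 700;
const INITIAL_BIAS = 72, INITIAL_N = 128;

// Map a digit value 0..35 to a character: a-z for 0..25, 0-9 for 26..35.
function digitToChar(d) {
  return String.fromCharCode(d < 26 ? 97 + d : 22 + d);
}

// Bias adaptation (RFC 3492, section 6.1).
function adapt(delta, numPoints, firstTime) {
  delta = Math.floor(delta / (firstTime ? DAMP : 2));
  delta += Math.floor(delta / numPoints);
  let k = 0;
  while (delta > Math.floor(((BASE - TMIN) * TMAX) / 2)) {
    delta = Math.floor(delta / (BASE - TMIN));
    k += BASE;
  }
  return k + Math.floor(((BASE - TMIN + 1) * delta) / (delta + SKEW));
}

// Encode a Unicode string as Punycode (no "xn--" prefix, no overflow checks).
function encode(input) {
  // Iterate by code point so that surrogate pairs count as one symbol.
  const codePoints = Array.from(input, (ch) => ch.codePointAt(0));
  const basic = codePoints.filter((cp) => cp < 0x80);

  // Basic (ASCII) code points are copied literally, then a delimiter.
  let output = String.fromCharCode(...basic);
  const b = basic.length;
  let h = b;
  if (b > 0) output += '-';

  let n = INITIAL_N, delta = 0, bias = INITIAL_BIAS;
  while (h < codePoints.length) {
    // Smallest code point not yet handled.
    const m = Math.min(...codePoints.filter((cp) => cp >= n));
    delta += (m - n) * (h + 1);
    n = m;
    for (const cp of codePoints) {
      if (cp < n) delta++;
      if (cp === n) {
        // Emit delta as a generalized variable-length integer.
        let q = delta;
        for (let k = BASE; ; k += BASE) {
          const t = k <= bias ? TMIN : k >= bias + TMAX ? TMAX : k - bias;
          if (q < t) break;
          output += digitToChar(t + ((q - t) % (BASE - t)));
          q = Math.floor((q - t) / (BASE - t));
        }
        output += digitToChar(q);
        bias = adapt(delta, h + 1, h === b);
        delta = 0;
        h++;
      }
    }
    delta++;
    n++;
  }
  return output;
}

console.log(encode('他们为什么不说中文'));
console.log(encode('3年B組金八先生'));
```

Comparing the printed values against the "expected" column above gives a quick check that is independent of the implementation under test.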

cc @mathiasbynens

@ghost assigned bnoordhuis Nov 11, 2011
@mathiasbynens

I have a working implementation that passes all unit tests except the Russian (Cyrillic) example string. I suspect that’s a typo in the RFC.

https://github.com/bestiejs/punycode.js

To run the unit tests in Node.js, clone the repository, `cd` into it, and then run `node tests/tests.js`.

$ node tests/tests.js 
 PASS - Punycode.utf16.decode
 PASS - Punycode.utf16.encode
 PASS - Punycode.decode
 FAIL - Punycode.encode
    PASS | EQ | ASCII string that breaks the existing rules for host-name labels | 
    PASS | EQ | ok | 
    PASS | EQ | ok | 
    PASS | EQ | ok | 
    PASS | EQ | ok | 
    PASS | EQ | ok | 
    PASS | EQ | ok | 
    PASS | EQ | ok | 
    PASS | EQ | Vietnamese | 
    PASS | EQ | Spanish | 
    FAIL | EQ | Russian (Cyrillic) | Expected: b1abfaaepdrnnbgefbaDotcwatmq2g4l, Actual: b1abfaaepdrnnbgefbadotcwatmq2g4l
    PASS | EQ | Korean (Hangul syllables) | 
    PASS | EQ | Japanese (kanji and hiragana) | 
    PASS | EQ | Hindi (Devanagari) | 
    PASS | EQ | Hebrew | 
    PASS | EQ | Czech | 
    PASS | EQ | Chinese (traditional) | 
    PASS | EQ | Chinese (simplified) | 
    PASS | EQ | Arabic (Egyptian) | 
    PASS | EQ | long string with both ASCII and non-ASCII characters | 
    PASS | EQ | mix of ASCII and non-ASCII characters | 
    PASS | EQ | multiple non-ASCII characters | 
    PASS | EQ | a single non-ASCII character | 
    PASS | EQ | a single basic code point | 
 PASS - Punycode.toUnicode
 PASS - Punycode.toASCII
----------------------------------------
    PASS: 63  FAIL: 1  TOTAL: 64
    Finished in 20 milliseconds.
----------------------------------------
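The single remaining failure differs from the expected string only in the case of one letter (`D` versus `d`). RFC 3492 section 5 requires decoders to accept Punycode digits in both uppercase and lowercase, so one way to inspect such a mismatch is a case-insensitive comparison. The helper below is a hypothetical sketch, not part of punycode.js or its test suite; note that it would also ignore case differences in the literal ASCII part before the delimiter, which may not be desirable in a real test.

```javascript
// Hypothetical helper: compare two ASCII Punycode strings, ignoring case.
function equalIgnoringAsciiCase(a, b) {
  return a.toLowerCase() === b.toLowerCase();
}

// Strings taken from the failing Russian (Cyrillic) test above.
const expected = 'b1abfaaepdrnnbgefbaDotcwatmq2g4l';
const actual = 'b1abfaaepdrnnbgefbadotcwatmq2g4l';

console.log(expected === actual);                      // false
console.log(equalIgnoringAsciiCase(expected, actual)); // true
```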

@bnoordhuis
Member Author

Fixed in 326b2cb. Thanks, Mathias.
