Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

v6.x backport of 'latin1' encoding #8022

Merged
merged 4 commits into from
Aug 10, 2016
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -145,7 +145,7 @@ ADDONS_BINDING_GYPS := \

ADDONS_BINDING_SOURCES := \
$(filter-out test/addons/??_*/*.cc, $(wildcard test/addons/*/*.cc)) \
$(filter-out test/addons/??_*/*.h, $(wildcard test/addons/*/*.h)) \
$(filter-out test/addons/??_*/*.h, $(wildcard test/addons/*/*.h))

# Implicitly depends on $(NODE_EXE), see the build-addons rule for rationale.
# Depends on node-gyp package.json so that build-addons is (re)executed when
Expand Down
15 changes: 12 additions & 3 deletions doc/api/buffer.md
Original file line number Diff line number Diff line change
Expand Up @@ -165,12 +165,21 @@ The character encodings currently supported by Node.js include:
this encoding will also correctly accept "URL and Filename Safe Alphabet" as
specified in [RFC 4648, Section 5].

* `'binary'` - A way of encoding the buffer into a one-byte (`latin-1`)
encoded string. The string `'latin-1'` is not supported. Instead, pass
`'binary'` to use `'latin-1'` encoding.
* `'latin1'` - A way of encoding the buffer into a one-byte encoded string
(as defined by the IANA in [RFC1345](https://tools.ietf.org/html/rfc1345),
page 63, to be the Latin-1 supplement block and C0/C1 control codes).

* `'binary'` - Alias for `latin1`.

* `'hex'` - Encode each byte as two hexadecimal characters.

_Note_: Today's browsers follow the [WHATWG
spec](https://encoding.spec.whatwg.org/) that aliases both `latin1` and
`iso-8859-1` to `win-1252`. Meaning, while doing something like `http.get()`,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

quick note, cc1318b contains documentation changes for these entries. not sure if it's worth while to also back port those.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if this PR lands the commits from #7784 should land (mostly) cleanly and will be included automatically then? cc1318b definitely does, so I’d expect it to be picked up. I can add that one here if you think it makes reviewing easier?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm not worried about adding them in this PR. i don't even feel particularly strong about landing the doc changes. just thought it be useful to bring it up.

if the returned charset is one of those listed in the WHATWG spec it's possible
that the server actually returned `win-1252` encoded data, and using `latin1`
encoding may incorrectly decode the graphical characters.

## Buffers and TypedArray

Buffers are also `Uint8Array` TypedArray instances. However, there are subtle
Expand Down
68 changes: 34 additions & 34 deletions doc/api/crypto.md
Original file line number Diff line number Diff line change
Expand Up @@ -160,7 +160,7 @@ console.log(encrypted);
### cipher.final([output_encoding])

Returns any remaining enciphered contents. If `output_encoding`
parameter is one of `'binary'`, `'base64'` or `'hex'`, a string is returned.
parameter is one of `'latin1'`, `'base64'` or `'hex'`, a string is returned.
If an `output_encoding` is not provided, a [`Buffer`][] is returned.

Once the `cipher.final()` method has been called, the `Cipher` object can no
Expand Down Expand Up @@ -198,13 +198,13 @@ The `cipher.setAutoPadding()` method must be called before [`cipher.final()`][].
### cipher.update(data[, input_encoding][, output_encoding])

Updates the cipher with `data`. If the `input_encoding` argument is given,
it's value must be one of `'utf8'`, `'ascii'`, or `'binary'` and the `data`
it's value must be one of `'utf8'`, `'ascii'`, or `'latin1'` and the `data`
argument is a string using the specified encoding. If the `input_encoding`
argument is not given, `data` must be a [`Buffer`][]. If `data` is a
[`Buffer`][] then `input_encoding` is ignored.

The `output_encoding` specifies the output format of the enciphered
data, and can be `'binary'`, `'base64'` or `'hex'`. If the `output_encoding`
data, and can be `'latin1'`, `'base64'` or `'hex'`. If the `output_encoding`
is specified, a string using the specified encoding is returned. If no
`output_encoding` is provided, a [`Buffer`][] is returned.

Expand Down Expand Up @@ -277,7 +277,7 @@ console.log(decrypted);
### decipher.final([output_encoding])

Returns any remaining deciphered contents. If `output_encoding`
parameter is one of `'binary'`, `'base64'` or `'hex'`, a string is returned.
parameter is one of `'latin1'`, `'base64'` or `'hex'`, a string is returned.
If an `output_encoding` is not provided, a [`Buffer`][] is returned.

Once the `decipher.final()` method has been called, the `Decipher` object can
Expand Down Expand Up @@ -313,13 +313,13 @@ The `decipher.setAutoPadding()` method must be called before
### decipher.update(data[, input_encoding][, output_encoding])

Updates the decipher with `data`. If the `input_encoding` argument is given,
it's value must be one of `'binary'`, `'base64'`, or `'hex'` and the `data`
it's value must be one of `'latin1'`, `'base64'`, or `'hex'` and the `data`
argument is a string using the specified encoding. If the `input_encoding`
argument is not given, `data` must be a [`Buffer`][]. If `data` is a
[`Buffer`][] then `input_encoding` is ignored.

The `output_encoding` specifies the output format of the enciphered
data, and can be `'binary'`, `'ascii'` or `'utf8'`. If the `output_encoding`
data, and can be `'latin1'`, `'ascii'` or `'utf8'`. If the `output_encoding`
is specified, a string using the specified encoding is returned. If no
`output_encoding` is provided, a [`Buffer`][] is returned.

Expand Down Expand Up @@ -361,7 +361,7 @@ Computes the shared secret using `other_public_key` as the other
party's public key and returns the computed shared secret. The supplied
key is interpreted using the specified `input_encoding`, and secret is
encoded using specified `output_encoding`. Encodings can be
`'binary'`, `'hex'`, or `'base64'`. If the `input_encoding` is not
`'latin1'`, `'hex'`, or `'base64'`. If the `input_encoding` is not
provided, `other_public_key` is expected to be a [`Buffer`][].

If `output_encoding` is given a string is returned; otherwise, a
Expand All @@ -371,45 +371,45 @@ If `output_encoding` is given a string is returned; otherwise, a

Generates private and public Diffie-Hellman key values, and returns
the public key in the specified `encoding`. This key should be
transferred to the other party. Encoding can be `'binary'`, `'hex'`,
transferred to the other party. Encoding can be `'latin1'`, `'hex'`,
or `'base64'`. If `encoding` is provided a string is returned; otherwise a
[`Buffer`][] is returned.

### diffieHellman.getGenerator([encoding])

Returns the Diffie-Hellman generator in the specified `encoding`, which can
be `'binary'`, `'hex'`, or `'base64'`. If `encoding` is provided a string is
be `'latin1'`, `'hex'`, or `'base64'`. If `encoding` is provided a string is
returned; otherwise a [`Buffer`][] is returned.

### diffieHellman.getPrime([encoding])

Returns the Diffie-Hellman prime in the specified `encoding`, which can
be `'binary'`, `'hex'`, or `'base64'`. If `encoding` is provided a string is
be `'latin1'`, `'hex'`, or `'base64'`. If `encoding` is provided a string is
returned; otherwise a [`Buffer`][] is returned.

### diffieHellman.getPrivateKey([encoding])

Returns the Diffie-Hellman private key in the specified `encoding`,
which can be `'binary'`, `'hex'`, or `'base64'`. If `encoding` is provided a
which can be `'latin1'`, `'hex'`, or `'base64'`. If `encoding` is provided a
string is returned; otherwise a [`Buffer`][] is returned.

### diffieHellman.getPublicKey([encoding])

Returns the Diffie-Hellman public key in the specified `encoding`, which
can be `'binary'`, `'hex'`, or `'base64'`. If `encoding` is provided a
can be `'latin1'`, `'hex'`, or `'base64'`. If `encoding` is provided a
string is returned; otherwise a [`Buffer`][] is returned.

### diffieHellman.setPrivateKey(private_key[, encoding])

Sets the Diffie-Hellman private key. If the `encoding` argument is provided
and is either `'binary'`, `'hex'`, or `'base64'`, `private_key` is expected
and is either `'latin1'`, `'hex'`, or `'base64'`, `private_key` is expected
to be a string. If no `encoding` is provided, `private_key` is expected
to be a [`Buffer`][].

### diffieHellman.setPublicKey(public_key[, encoding])

Sets the Diffie-Hellman public key. If the `encoding` argument is provided
and is either `'binary'`, `'hex'` or `'base64'`, `public_key` is expected
and is either `'latin1'`, `'hex'` or `'base64'`, `public_key` is expected
to be a string. If no `encoding` is provided, `public_key` is expected
to be a [`Buffer`][].

Expand Down Expand Up @@ -460,7 +460,7 @@ Computes the shared secret using `other_public_key` as the other
party's public key and returns the computed shared secret. The supplied
key is interpreted using specified `input_encoding`, and the returned secret
is encoded using the specified `output_encoding`. Encodings can be
`'binary'`, `'hex'`, or `'base64'`. If the `input_encoding` is not
`'latin1'`, `'hex'`, or `'base64'`. If the `input_encoding` is not
provided, `other_public_key` is expected to be a [`Buffer`][].

If `output_encoding` is given a string will be returned; otherwise a
Expand All @@ -476,14 +476,14 @@ The `format` arguments specifies point encoding and can be `'compressed'`,
`'uncompressed'`, or `'hybrid'`. If `format` is not specified, the point will
be returned in `'uncompressed'` format.

The `encoding` argument can be `'binary'`, `'hex'`, or `'base64'`. If
The `encoding` argument can be `'latin1'`, `'hex'`, or `'base64'`. If
`encoding` is provided a string is returned; otherwise a [`Buffer`][]
is returned.

### ecdh.getPrivateKey([encoding])

Returns the EC Diffie-Hellman private key in the specified `encoding`,
which can be `'binary'`, `'hex'`, or `'base64'`. If `encoding` is provided
which can be `'latin1'`, `'hex'`, or `'base64'`. If `encoding` is provided
a string is returned; otherwise a [`Buffer`][] is returned.

### ecdh.getPublicKey([encoding[, format]])
Expand All @@ -495,13 +495,13 @@ The `format` argument specifies point encoding and can be `'compressed'`,
`'uncompressed'`, or `'hybrid'`. If `format` is not specified the point will be
returned in `'uncompressed'` format.

The `encoding` argument can be `'binary'`, `'hex'`, or `'base64'`. If
The `encoding` argument can be `'latin1'`, `'hex'`, or `'base64'`. If
`encoding` is specified, a string is returned; otherwise a [`Buffer`][] is
returned.

### ecdh.setPrivateKey(private_key[, encoding])

Sets the EC Diffie-Hellman private key. The `encoding` can be `'binary'`,
Sets the EC Diffie-Hellman private key. The `encoding` can be `'latin1'`,
`'hex'` or `'base64'`. If `encoding` is provided, `private_key` is expected
to be a string; otherwise `private_key` is expected to be a [`Buffer`][]. If
`private_key` is not valid for the curve specified when the `ECDH` object was
Expand All @@ -512,7 +512,7 @@ public point (key) is also generated and set in the ECDH object.

Stability: 0 - Deprecated

Sets the EC Diffie-Hellman public key. Key encoding can be `'binary'`,
Sets the EC Diffie-Hellman public key. Key encoding can be `'latin1'`,
`'hex'` or `'base64'`. If `encoding` is provided `public_key` is expected to
be a string; otherwise a [`Buffer`][] is expected.

Expand Down Expand Up @@ -604,7 +604,7 @@ console.log(hash.digest('hex'));
### hash.digest([encoding])

Calculates the digest of all of the data passed to be hashed (using the
[`hash.update()`][] method). The `encoding` can be `'hex'`, `'binary'` or
[`hash.update()`][] method). The `encoding` can be `'hex'`, `'latin1'` or
`'base64'`. If `encoding` is provided a string will be returned; otherwise
a [`Buffer`][] is returned.

Expand All @@ -615,7 +615,7 @@ called. Multiple calls will cause an error to be thrown.

Updates the hash content with the given `data`, the encoding of which
is given in `input_encoding` and can be `'utf8'`, `'ascii'` or
`'binary'`. If `encoding` is not provided, and the `data` is a string, an
`'latin1'`. If `encoding` is not provided, and the `data` is a string, an
encoding of `'utf8'` is enforced. If `data` is a [`Buffer`][] then
`input_encoding` is ignored.

Expand Down Expand Up @@ -678,7 +678,7 @@ console.log(hmac.digest('hex'));
### hmac.digest([encoding])

Calculates the HMAC digest of all of the data passed using [`hmac.update()`][].
The `encoding` can be `'hex'`, `'binary'` or `'base64'`. If `encoding` is
The `encoding` can be `'hex'`, `'latin1'` or `'base64'`. If `encoding` is
provided a string is returned; otherwise a [`Buffer`][] is returned;

The `Hmac` object can not be used again after `hmac.digest()` has been
Expand All @@ -688,7 +688,7 @@ called. Multiple calls to `hmac.digest()` will result in an error being thrown.

Updates the `Hmac` content with the given `data`, the encoding of which
is given in `input_encoding` and can be `'utf8'`, `'ascii'` or
`'binary'`. If `encoding` is not provided, and the `data` is a string, an
`'latin1'`. If `encoding` is not provided, and the `data` is a string, an
encoding of `'utf8'` is enforced. If `data` is a [`Buffer`][] then
`input_encoding` is ignored.

Expand Down Expand Up @@ -768,7 +768,7 @@ object, it is interpreted as a hash containing two properties:
* `key` : {String} - PEM encoded private key
* `passphrase` : {String} - passphrase for the private key

The `output_format` can specify one of `'binary'`, `'hex'` or `'base64'`. If
The `output_format` can specify one of `'latin1'`, `'hex'` or `'base64'`. If
`output_format` is provided a string is returned; otherwise a [`Buffer`][] is
returned.

Expand All @@ -779,7 +779,7 @@ called. Multiple calls to `sign.sign()` will result in an error being thrown.

Updates the `Sign` content with the given `data`, the encoding of which
is given in `input_encoding` and can be `'utf8'`, `'ascii'` or
`'binary'`. If `encoding` is not provided, and the `data` is a string, an
`'latin1'`. If `encoding` is not provided, and the `data` is a string, an
encoding of `'utf8'` is enforced. If `data` is a [`Buffer`][] then
`input_encoding` is ignored.

Expand Down Expand Up @@ -831,7 +831,7 @@ console.log(verify.verify(public_key, signature));

Updates the `Verify` content with the given `data`, the encoding of which
is given in `input_encoding` and can be `'utf8'`, `'ascii'` or
`'binary'`. If `encoding` is not provided, and the `data` is a string, an
`'latin1'`. If `encoding` is not provided, and the `data` is a string, an
encoding of `'utf8'` is enforced. If `data` is a [`Buffer`][] then
`input_encoding` is ignored.

Expand All @@ -843,7 +843,7 @@ Verifies the provided data using the given `object` and `signature`.
The `object` argument is a string containing a PEM encoded object, which can be
one an RSA public key, a DSA public key, or an X.509 certificate.
The `signature` argument is the previously calculated signature for the data, in
the `signature_format` which can be `'binary'`, `'hex'` or `'base64'`.
the `signature_format` which can be `'latin1'`, `'hex'` or `'base64'`.
If a `signature_format` is specified, the `signature` is expected to be a
string; otherwise `signature` is expected to be a [`Buffer`][].

Expand All @@ -869,7 +869,7 @@ or [buffers][`Buffer`]. The default value is `'buffer'`, which makes methods
default to [`Buffer`][] objects.

The `crypto.DEFAULT_ENCODING` mechanism is provided for backwards compatibility
with legacy programs that expect `'binary'` to be the default encoding.
with legacy programs that expect `'latin1'` to be the default encoding.

New applications should expect the default to be `'buffer'`. This property may
become deprecated in a future Node.js release.
Expand All @@ -889,7 +889,7 @@ recent OpenSSL releases, `openssl list-cipher-algorithms` will display the
available cipher algorithms.

The `password` is used to derive the cipher key and initialization vector (IV).
The value must be either a `'binary'` encoded string or a [`Buffer`][].
The value must be either a `'latin1'` encoded string or a [`Buffer`][].

The implementation of `crypto.createCipher()` derives keys using the OpenSSL
function [`EVP_BytesToKey`][] with the digest algorithm set to MD5, one
Expand All @@ -913,7 +913,7 @@ recent OpenSSL releases, `openssl list-cipher-algorithms` will display the
available cipher algorithms.

The `key` is the raw key used by the `algorithm` and `iv` is an
[initialization vector][]. Both arguments must be `'binary'` encoded strings or
[initialization vector][]. Both arguments must be `'latin1'` encoded strings or
[buffers][`Buffer`].

### crypto.createCredentials(details)
Expand Down Expand Up @@ -968,7 +968,7 @@ recent OpenSSL releases, `openssl list-cipher-algorithms` will display the
available cipher algorithms.

The `key` is the raw key used by the `algorithm` and `iv` is an
[initialization vector][]. Both arguments must be `'binary'` encoded strings or
[initialization vector][]. Both arguments must be `'latin1'` encoded strings or
[buffers][`Buffer`].

### crypto.createDiffieHellman(prime[, prime_encoding][, generator][, generator_encoding])
Expand All @@ -979,7 +979,7 @@ optional specific `generator`.
The `generator` argument can be a number, string, or [`Buffer`][]. If
`generator` is not specified, the value `2` is used.

The `prime_encoding` and `generator_encoding` arguments can be `'binary'`,
The `prime_encoding` and `generator_encoding` arguments can be `'latin1'`,
`'hex'`, or `'base64'`.

If `prime_encoding` is specified, `prime` is expected to be a string; otherwise
Expand Down Expand Up @@ -1345,7 +1345,7 @@ unified Stream API, and before there were [`Buffer`][] objects for handling
binary data. As such, the many of the `crypto` defined classes have methods not
typically found on other Node.js classes that implement the [streams][stream]
API (e.g. `update()`, `final()`, or `digest()`). Also, many methods accepted
and returned `'binary'` encoded strings by default rather than Buffers. This
and returned `'latin1'` encoded strings by default rather than Buffers. This
default was changed after Node.js v0.8 to use [`Buffer`][] objects by default
instead.

Expand Down
10 changes: 5 additions & 5 deletions lib/_http_outgoing.js
Original file line number Diff line number Diff line change
Expand Up @@ -130,7 +130,7 @@ OutgoingMessage.prototype._send = function(data, encoding, callback) {
data = this._header + data;
} else {
this.output.unshift(this._header);
this.outputEncodings.unshift('binary');
this.outputEncodings.unshift('latin1');
this.outputCallbacks.unshift(null);
this.outputSize += this._header.length;
if (typeof this._onPendingData === 'function')
Expand Down Expand Up @@ -453,7 +453,7 @@ OutgoingMessage.prototype.write = function(chunk, encoding, callback) {
if (typeof chunk === 'string' &&
encoding !== 'hex' &&
encoding !== 'base64' &&
encoding !== 'binary') {
encoding !== 'latin1') {
len = Buffer.byteLength(chunk, encoding);
chunk = len.toString(16) + CRLF + chunk + CRLF;
ret = this._send(chunk, encoding, callback);
Expand All @@ -468,7 +468,7 @@ OutgoingMessage.prototype.write = function(chunk, encoding, callback) {
this.connection.cork();
process.nextTick(connectionCorkNT, this.connection);
}
this._send(len.toString(16), 'binary', null);
this._send(len.toString(16), 'latin1', null);
this._send(crlf_buf, null, null);
this._send(chunk, encoding, null);
ret = this._send(crlf_buf, null, callback);
Expand Down Expand Up @@ -581,10 +581,10 @@ OutgoingMessage.prototype.end = function(data, encoding, callback) {
};

if (this._hasBody && this.chunkedEncoding) {
ret = this._send('0\r\n' + this._trailer + '\r\n', 'binary', finish);
ret = this._send('0\r\n' + this._trailer + '\r\n', 'latin1', finish);
} else {
// Force a flush, HACK.
ret = this._send('', 'binary', finish);
ret = this._send('', 'latin1', finish);
}

if (this.connection && data)
Expand Down
2 changes: 1 addition & 1 deletion lib/_tls_wrap.js
Original file line number Diff line number Diff line change
Expand Up @@ -608,7 +608,7 @@ TLSSocket.prototype.setServername = function(name) {

TLSSocket.prototype.setSession = function(session) {
if (typeof session === 'string')
session = Buffer.from(session, 'binary');
session = Buffer.from(session, 'latin1');
this._handle.setSession(session);
};

Expand Down
Loading