proposal: spec: export uncased identifiers like 日本語 #16033

robpike · 2016-06-10T13:23:53Z

The current export rule says, in brief, a variable is exported if its first character is an upper-case letter. This means that there is no way to have an exported identifier that is a word in a non-alphabetic language such as Japanese or Chinese.

I propose we specify the rule the other way around:

A variable is not exported if its first character is a lower-case letter or underscore.

This means that a name like 日本語 would be exported; to avoid exporting it, call it _日本語.
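To make the difference concrete, here is a minimal sketch of the two rules as predicates on an identifier's first character. The function names `exportedCurrent` and `exportedProposed` are hypothetical and only illustrate the wording above; this is not how the compiler implements the check.

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

// exportedCurrent sketches the current rule: a name is exported
// only if its first character is an upper-case letter.
// (Hypothetical helper, for illustration only.)
func exportedCurrent(name string) bool {
	r, _ := utf8.DecodeRuneInString(name)
	return unicode.IsUpper(r)
}

// exportedProposed sketches the proposed rule: a name is NOT exported
// if its first character is a lower-case letter or an underscore;
// everything else is exported.
// (Hypothetical helper, for illustration only.)
func exportedProposed(name string) bool {
	if name == "" {
		return false
	}
	r, _ := utf8.DecodeRuneInString(name)
	return r != '_' && !unicode.IsLower(r)
}

func main() {
	for _, name := range []string{"Name", "name", "_name", "日本語", "_日本語"} {
		fmt.Printf("%s: current=%t proposed=%t\n",
			name, exportedCurrent(name), exportedProposed(name))
	}
}
```

The two predicates agree on cased names such as `Name` and `name`. They differ only for names whose first character has no case: 日本語 is unexported under the current rule and exported under the proposed one, while _日本語 stays unexported under both.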

The change is pretty minor and will have almost no effect on existing programs I know of, but will cause programs using Han identifiers to export previously unexported identifiers. I don't know of any, but the point that it is impossible to have an exported non-alphabetic identifier has been made many times, always (to me at least) by people who speak alphabetic languages.

Still, this would fix that problem simply, if it needs fixing.

Moved to #20706:
On a related note, some writing systems (Devanagari is one, see #5167) require combining characters. The current identifier rules forbid combining characters; perhaps that should be relaxed, although that will require a canonicalization rule for combining characters. Unicode does have a definition for identifiers (http://unicode.org/reports/tr31/); perhaps Go should use it. Note that the addition of combining characters, allied with the export proposal above, would make it possible to export Devanagari identifiers.

SamWhited · 2016-06-10T19:20:42Z

> On a related note, some writing systems (Devanagari is one, see #5167) require combining characters. The current identifier rules forbid combining characters; perhaps that should be relaxed, although that will require a canonicalization rule for combining characters.

I started (and then abandoned) work on a draft RFC for a PRECIS profile for programming language identifiers a while back. Specifically, it used the NFC canonicalization rule for combining characters and a width mapping rule to ensure that full-width and half-width characters were mapped to their decomposition mappings. On some machines a keystroke may output a full-width character, while on another machine the same keystroke outputs the half-width version; this would presumably be very confusing for users in locales that use full-width characters heavily, such as many East Asian locales.

I'd love to see a PRECIS profile used for all Go identifiers in the future, and would be happy to pick back up the RFC work and help the community standardize on a set of rules.

EDIT: Related Rust issue: rust-lang/rust#28979

rsc · 2017-06-17T17:44:46Z

Duplicate of #5763, which has richer discussion. Closing this one.

robpike added the LanguageChange (Suggested changes to the Go language) and v2 (An incompatible library change) labels Jun 10, 2016

quentinmit added this to the Proposal milestone Jul 29, 2016

rsc modified the milestones: Unplanned, Proposal Nov 21, 2016

gopherbot added the Proposal label Mar 20, 2017

SamWhited mentioned this issue Apr 11, 2017

Tracking issue for non-ASCII identifiers (feature "non_ascii_idents") rust-lang/rust#28979

Closed

rsc changed the title ~~proposal: adjustments to identifiers and export rules~~ proposal: spec: export uncased identifiers like 日本語 Jun 16, 2017

rsc mentioned this issue Jun 16, 2017

proposal: spec: allow combining characters in identifiers #20706

Open

rsc closed this as completed Jun 17, 2017

golang locked and limited conversation to collaborators Jun 17, 2018

gopherbot added the FrozenDueToAge label Jun 17, 2018
