FR-020-014 5.3 [lex.charset] Replace "translation character set" by "Unicode" P2749 #422

wg21bot · 2022-10-23T19:07:57Z

C++23 introduces the term "translation character set" to designate Unicode scalar values.
This new term is C++ specific and has no benefit over the terms scalar value or codepoints (both can be used interchangeably as surrogates are not permitted after phase 1 of translation).
Because other terms exist, and because making characters up for non-assigned codepoints doesn't match any possible definition of the term "character", we would like the term "translation character set" replaced by "Unicode" and "elements of the translation character set" replaced by "codepoint" or "scalar value". In places in [lex] where the term "character" is used to mean "codepoint", it should be replaced by "codepoint".
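
For readers less familiar with the Unicode terms used above, the distinction can be sketched in a few lines of C++. This is only an illustration: the helper names `is_code_point`, `is_surrogate`, and `is_scalar_value` are hypothetical and appear neither in the standard nor in any paper referenced here. It assumes the usual Unicode definitions: code points span U+0000 to U+10FFFF, and scalar values are the code points outside the surrogate range U+D800 to U+DFFF.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; not standard wording.

// A Unicode code point is any value in the range U+0000..U+10FFFF.
constexpr bool is_code_point(std::uint32_t v) { return v <= 0x10FFFF; }

// Surrogate code points occupy U+D800..U+DFFF.
constexpr bool is_surrogate(std::uint32_t v) { return v >= 0xD800 && v <= 0xDFFF; }

// A Unicode scalar value is a code point that is not a surrogate.
constexpr bool is_scalar_value(std::uint32_t v) {
    return is_code_point(v) && !is_surrogate(v);
}

static_assert(is_scalar_value(0x41));     // U+0041 LATIN CAPITAL LETTER A
static_assert(is_code_point(0xD800));     // a surrogate is still a code point...
static_assert(!is_scalar_value(0xD800));  // ...but it is not a scalar value
static_assert(is_scalar_value(0x0378));   // unassigned as of Unicode 15.0, yet a scalar value
static_assert(!is_code_point(0x110000));  // beyond the Unicode code space
```

Once surrogates are excluded, as the comment says they are after phase 1 of translation, the two predicates accept the same set of values, which is why the comment treats the two terms as interchangeable in that context. The U+0378 case shows the point about unassigned codepoints: such a value is a scalar value even though no character is assigned to it.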

tahonermann · 2023-01-07T15:44:26Z

SG16 discussed this issue during its 2022-12-14 telecon. The following poll was taken:

Poll 1.1: Encourage further work on expressing the semantics of C++ lexing in terms
of the terminology defined in the Unicode Standard.
- Attendees: 6

  | SF | F | N | A | SA |
  |----|---|---|---|----|
  | 4  | 1 | 0 | 1 | 0  |
- Strong consensus.

We still lack proposed wording to address this issue. I am retaining the SG16 label pending a wording proposal. If such a proposal does not materialize or we are unable to review it in time, then this NB comment will need to be resolved as having no consensus for a change.

tahonermann · 2023-01-08T05:56:53Z

A draft paper, D2749R0 (Down with "character"), is available to address this issue, but it has not yet been published in a mailing.

tahonermann · 2023-01-26T17:25:46Z

SG16 reviewed a draft of what will become P2749R0 (Down with "character") during its 2023-01-25 telecon but has not yet polled forwarding it. That paper seeks to resolve this NB comment. SG16 will continue its review on 2023-02-01. I'm retaining the SG16 label for now.

tahonermann · 2023-02-05T04:02:54Z

SG16 continued its review of what will become P2749R0 (Down with "character") during its 2023-02-01 telecon. The following polls were taken.

Poll 1.1: D2749R0 "Down with 'character'" should be included in the IS only if the updates to whitespace specification described in P2348 "Whitespaces Wording Revamp" are also included.
- Attendees: 7

  | SF | F | N | A | SA |
  |----|---|---|---|----|
  | 0  | 4 | 1 | 0 | 1  |
- Weak consensus in favor.

Poll 1.2: Forward D2749R0 "Down with 'character'", revised as discussed, to CWG for C++23 as the recommended resolution of ballot comment FR-020-014.
- Attendees: 7

  | SF | F | N | A | SA |
  |----|---|---|---|----|
  | 1  | 1 | 3 | 2 | 0  |
- No consensus.

Poll 1.3: Recommend rejection of ballot comment FR-020-014 as no consensus for change.
- Attendees: 7

  | F | N | A |
  |---|---|---|
  | 4 | 1 | 1 |
- Consensus in favor.

Though there was no consensus to forward the paper for C++23, there is strong support to continue work on the paper for a later standard; the rejection at this point has to do with a desire for further review and a desire to expand the scope of the paper to avoid introducing inconsistencies in core wording.

I'm removing the SG16 label; this NB comment is ready for CWG review.

jensmaurer · 2023-02-07T05:18:25Z

Rejected. There was no consensus for a change at this time.

wg21bot added the CWG Core label Oct 23, 2022

wg21bot added this to the CD C++23 milestone Oct 23, 2022

tahonermann added the SG16 Unicode label Oct 25, 2022

jensmaurer changed the title ~~FR 5.3 [lex.charset] Replace "translation character set" by "Unicode"~~ FR-020-014 5.3 [lex.charset] Replace "translation character set" by "Unicode" Nov 3, 2022

jensmaurer changed the title ~~FR-020-014 5.3 [lex.charset] Replace "translation character set" by "Unicode"~~ FR-020-014 5.3 [lex.charset] Replace "translation character set" by "Unicode" P2749 Jan 21, 2023

jensmaurer added the needs-paper label Jan 21, 2023

tahonermann removed the SG16 Unicode label Feb 5, 2023

jensmaurer added the rejected (No consensus for a change.) label Feb 7, 2023

jensmaurer closed this as not planned Feb 7, 2023
