Comparison operators #702
https://quick-bench.com/q/1_xA8G_jXci_yeOKt0WgCc6eGN4 seems to get to 1.3x and 1.4x. ;]
But it's still a "bad" test -- most users aren't conjuring booleans, they're branching. Most of the extra slowness is materializing the `0` and `1` value here. It'll be harder to measure the difference when used in real-world code. But it's still a difference, and so it may be something we have to pay attention to here long-term.
The calculation I'm using here is:
((correct operation benchmark time) - (no-op benchmark time)) /
((converting operation benchmark time) - (no-op benchmark time))
Using that calculation, I get 2.0x and 2.5x for the two operations in your revised approach rather than the 1.3x and 1.4x as reported by quick-bench. Even though these are similar to the numbers I had before, I've switched to your quick-bench link because I think your branchless implementation seems preferable in general. I previously reported 2.8x and 1.8x, but I'm now realizing that the 1.8x was cheating because it involved a branch that happened to be always taken, that skipped half of the comparison. And while the 2.8x involved a branch that was taken half the time, that was in predictable batches of 500 occurrences branching each way.
I've also added results for a case where we branch on the result of the computation, where the performance delta does seem smaller -- and the delta disappears entirely for one of the two operations; perhaps we're getting one of the two tests for free somehow. I expect that's still not really representative, but perhaps it's closer to being so.
Doh! Good catch on the no-op baseline.
Probably spent too much time on this, but I've got a re-worked benchmark that I like a bit better:
https://godbolt.org/z/xj3n6PKfb
You can't run it on quick-bench.com because it needs Abseil, but it actually uses a controllable distribution of random values. As pasted it should use non-negative values that fit into an i64.
The result is, at least on my Skylake VM, identical performance for the branchless version and the converting version. The branching version behaves a bit differently. Note that these nearly never find "true" for the equality test, so that biases things a bit here.
If you shift the dataset a bit to, for example, include negative integers, then it skews very differently: the non-converting cases are actually much faster. Why? Because they can short-circuit by only looking at one input when that input is negative! But we shouldn't expect this to be representative of anything, which is why I skewed the distribution back to make the negative test essentially always fail -- if that test ever succeeds, then the converting case will be slower.
It also seems to show much tighter behavior with floating point, but I've not spent much time analyzing that.
Happy to iterate a bit here and get a benchmark that we can actually include in the repository if useful (and it seems like it might be) to ensure we're doing a good job of setting expectations here.
But regardless, this shouldn't block the proposal. Feel free to merge.
Oh, to compile the above locally on Linux, I think the following will work: