Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

added floating point/int summary #49

Merged
merged 3 commits into from
Dec 20, 2018
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions reference/src/SUMMARY.md
Original file line number Diff line number Diff line change
Expand Up @@ -15,5 +15,6 @@
- [Uninitialized memory](./active_discussion/uninitialized_memory.md)
- [Data representation](./representation.md)
- [Structs and tuples](./representation/structs-and-tuples.md)
- [Integers and Floating Points] (./integers-floatingpoint.md)
- [Optimizations](./optimizations.md)
- [Optimizing immutable memory](./optimizations/immutable_memory.md)
47 changes: 47 additions & 0 deletions reference/src/representation/integers-floatingpoint.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,47 @@
# Representation of Boolean, Floating Point, and Integral Types
This chapter represents the consensus from issue [#9]. It documents the memory layout and considerations for `bool`, `usize`, `isize`, floating point types, and integral types.
[#9]: https://github.com/rust-rfcs/unsafe-code-guidelines/issues/9

## Overview
These are all scalar types, representing a single value. These types have no representation variants (no #[repr(C)] or #[repr(Rust)]. Their size is fixed and well-defined across FFI boundaries and map to their corresponding integral types in the C ABI.
- `bool`: 1 byte
- any `bool` can be cast into an integer, taking on the values 1 (true) or 0 (false)
- `usize`, `isize`: pointer-sized unsigned/signed integer type
- `u8` .. `u128`, `i8` .. `i128`
- {8, 16, 32, 64, 128}-bit unsigned integer
- {8, 16, 32, 64, 128}-bit signed integer
- `f32`, `f64`
- IEEE floats
- 32-bit or 64-bit
- `char`
- C++ char: equivalent to either `i8`/`u8`
- Rust char: 32-bit
- not ABI compatible
- represents [Unicode scalar value](http://www.unicode.org/glossary/#unicode_scalar_value)

##`usize`/`isize`
Types `usize` and `isize` are committed to having the same size as a native pointer on the platform. The representation of `usize` determines the following:
- how much a pointer of a certain type can be offseted,
- the maximum size of Rust objects (because size_of/size_of_val return `usize`),
- the maximum number of elements in an array ([T; N: usize]),
- `usize`/`isize` in C FFI are compatible with C's `uintptr_t` / `intptr_t`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This means that usize/isize have the same size and alignment as uintptr_t and intptr_t, it might be worth it to spell this out.


The maximum size of any single value must fit within `isize` to [ensure that pointer diff is representable](https://github.com/rust-rfcs/unsafe-code-guidelines/pull/5#discussion_r212703192).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Above the second point states that "the maximum size of Rust objects" is determined by usize, but here it is stated that "The maximum size of any single value must fit within isize".

It is a bit unclear to me what the difference between value and object is here. We probably want to use consistent terminology here but sadly #40 is not merged yet.


`usize` and C’s `unsized` are *not* equivalent.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this statement mean ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Presumably that on most 64-bit platforms, unsized is 32 bits. Worth spelling out though.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is unsized? My best guess it's that it's a typo of unsigned (aka unsigned int).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I grabbed this from #9 (comment)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nikomatsakis that looks like a typo.


## Booleans
The `bool` type has size 1 byte, but is not specified to be C-compatible. This [discussion](https://github.com/rust-lang/rust/pull/46176#issuecomment-359593446) has more details. For Rust to support a platform, the platform’s C must have the boolean type be a byte, where `true = 1` and `false = 0`.
Copy link
Contributor

@gnzlbg gnzlbg Nov 27, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought the consensus on that PR was a bit different.

IIUC we had consensus that:

  • bool has a size of 1 (and an alignment of 1 ?)
  • true as i8 == 1 and false as i8 == 0
  • Iff the C platform has a _Bool, this _Bool has a size of 1 (and an alignment of 1?), and the integer representation of _Bool is 0 == (int)false and 1 == (int)true:
    • then (note: currently, this is the case for all platforms that Rust supports):
      • bool is usable on C FFI
      • bool has the same layout as _Bool (size and alignment)
    • otherwise:
      • bool is not a proper ctype (not C FFI safe, etc. - probably should raise improper_ctype warning). This does not mean that it is impossible to do C FFI with the platform. It only means that using bool on C FFI might not do what you expect (e.g. it might work if there is a char on the C side, but probably won't work if there is a _Bool).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The team decision in the linked comment treats the possibility of a C platform where _Bool is weird in the opposite fashion. Quoting:

bool has the same representation as the platform's _Bool type.

on every platform we currently support, this means [...]

It's not clear to me whether the third possibility of simply not supporting such weird C platforms (in the same way we don't support CHAR_BITS != 8) was explicitly considered, but I really don't see how what you wrote here can be interpreted into this decision.

Copy link
Contributor

@gnzlbg gnzlbg Nov 27, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not clear to me whether the third possibility of simply not supporting such weird C platforms (in the same way we don't support CHAR_BITS != 8) was explicitly considered,

I think it was tangentially addressed here: rust-lang/rust#46176 (comment)

The argument was that we don't want people to create and use a c_bool out of portability concerns for platforms we might never support. I don't think this means that it is impossible to do C FFI with such a platform, so requiring _Bool to satisfy our constraints might be too strong.

We had a tangential discussion (here: #46 and #9) and on Zulip and IIRC the consensus was that it would just mean that bool would not do what you intend it to do when used on C FFI in a platform where _Bool does not satisfy our constraints. In such a platform, warning about an improper_ctype on such platforms could be a potential way forward.

IIRC @briansmith argued in that PR (here: #46) that they often use Rust C FFI to interface with binary objects that were not produced by C, and that requiring this from the C platform was too strong of a requirement. Even if the C platform _Bool doesn't satisfy our constraints, they can write assembly whose ABI does, and they'd like to be able to use bool in such platforms to interface with that kind of code. If this would raise an improper_ctypes warning, that's something they could allow().

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remember all these discussions, but at the end of the day we have an explicit T-lang T-compiler decision (_Bool == Rust bool, so it always does the right thing in C FFI), so I don't know how you can claim there's consensus for something incompatible with that decision.

Copy link
Contributor

@gnzlbg gnzlbg Nov 27, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so I don't know how you can claim there's consensus for something incompatible with that decision.

If this should exclusively stick to that particular discussion and ignore all discussions that have happened since, then this document should probably only say something like:

Rust bool has the same layout as _Bool.

Note: on all platforms that Rust currently supports, sizeof(_Bool) == 1, 1 == (int)true, and 0 == (int)false.

Note: if the platform does not have a C ABI, or this C ABI does not have a _Bool, the layout of bool as well as the result of casting a bool to an integer are unspecified.

The moment we say something like "bool has a size of 1 byte" we are making bool different from C _Bool.

The discussion that has happened here since pretty much concluded that bool's size as well as the result of casting a bool to an integer are both guaranteeing, and that it is also worth guaranteeing that C FFI with platforms where _Bool does not exactly match bool is possible.

Copy link
Contributor

@gnzlbg gnzlbg Nov 27, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That issue was closed in favor of rust-lang/rust#46156 , which pretty much just documented that size_of bool == 1.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then either there is a contradiction somewhere and T-lang T-compiler need to clear it up, or the docs modified in rust-lang/rust#46156 are only intended to cover the platforms we currently support, while the UCG documents take a broader view.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is worth discussing in the next meeting.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Argh, there is something of a contradiction, isn't there? Still, I think the intent was pretty clearly documented in the comments on rust-lang/rust#46176, and it's more that the docs in #46156 are incomplete and should be clarified. I don't recall, did we discuss this in our last meeting?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Following meeting consensus, this should be updated as described in #53 (comment), right?


For full ABI compatibility details, see [Gankro’s post] (https://gankro.github.io/blah/rust-layouts-and-abis/#the-layoutsabis-of-builtins).
## Relationship to C integer hierarchy
C integers:
- char: at least 8 bits
- short: at least 16 bits (also at least a char)
- int: at least a short (intended to be a native integer size)
- long: at least 32 bits (also at least an int)
- long long: at least 64 bits (also at least a long)
The C integer types specify a minimum size, but not the exact size. For this reason, Rust integer types are not necessarily compatible with the “corresponding” C integer type. Instead, use the corresponding fixed size data types (e.g. `i64` in Rust would correspond to `int64_t` in C).

## Controversies
None