Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deprecate Read::initializer in favor of ptr::freeze #58363

Closed
wants to merge 11 commits into from
6 changes: 6 additions & 0 deletions src/libcore/intrinsics.rs
Original file line number Diff line number Diff line change
Expand Up @@ -1523,4 +1523,10 @@ extern "rust-intrinsic" {
/// Emits a `!nontemporal` store according to LLVM (see their docs).
/// Probably will never become stable.
pub fn nontemporal_store<T>(ptr: *mut T, val: T);

/// Freezes undefined data.
///
/// Exposed as [`std::ptr::freeze`](../../std/ptr/fn.freeze.html).
#[cfg(not(stage0))]
pub fn freeze<T>(ptr: *mut T, count: usize);
sfackler marked this conversation as resolved.
Show resolved Hide resolved
}
67 changes: 67 additions & 0 deletions src/libcore/ptr.rs
Original file line number Diff line number Diff line change
Expand Up @@ -946,6 +946,73 @@ pub unsafe fn write_volatile<T>(dst: *mut T, src: T) {
intrinsics::volatile_store(dst, src);
}

/// Freezes `count * size_of::<T>()` bytes of memory, converting uninitialized data into
/// arbitrary but fixed data.
sfackler marked this conversation as resolved.
Show resolved Hide resolved
///
/// Uninitialized memory has undefined contents, and interaction with those contents
Copy link
Member

@RalfJung RalfJung Feb 16, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In #58468, I use terminology like "uninitialized memory does not have a fixed value/content", to try and distinguish it from memory that contains unknown but fixed data. Do you think it would make sense to also use such terminology here, to explain why interactions can cause UB?

/// can easily cause undefined behavior. Freezing the memory avoids those issues by
/// converting the memory to an initialized state without actually needing to write to
/// all of it. This function has no effect on memory which is already initialized.
///
/// While this function does not actually physically write to memory, it acts as if it
/// does. For example, calling this method on data which is concurrently accessible
/// elsewhere is undefined behavior just as it would be to use `ptr::write`.
///
/// # Warning
///
/// Take care when using this function as the uninitialized memory being frozen can
/// contain bits and pieces of previously discarded data, including sensitive
/// information such as cryptographic keys.
///
/// # Safety
///
/// Behavior is undefined if any of the following conditions are violated:
///
/// * `dst` must be [valid] for writes.
///
/// * Every bit representation of `T` must be a valid value.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is that a precondition to freeze? freeze does not assert validity of anything.

freeze on a *mut bool is not UB. Just dereferencing the pointer later is.

Copy link
Member Author

@sfackler sfackler Feb 16, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought that what you meant in this comment: #58363 (comment). If the validity requirements are less strict I can move this to be a more general warning about use of the function rather than a hard UB constraint.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The phrasing here seems congruent with the behavior of freeze::<*mut bool>(...) tho. The value is the pointer (*mut bool), not the pointee (bool).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The validity requirements come in once you actually operate on or a make a copy of a value at some type. So the following is UB:

let x = MaybeUninit::<bool>::uninitialized();
ptr::freeze(x.as_mut_ptr());
let x = x.into_initialized(); // UB

But it's not the freeze that is UB, it's the into_initialized!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@RalfJung Oh... you're talking about *mut bool as the type of the first argument, not T == *mut bool... i.e. ptr::freeze<bool>(x.as_mut_ptr(): *mut bool) rather than ptr::freeze<*mut bool>(??: *mut *mut bool). As you have phrased it with x.into_initialized() being UB, this seems more of a post-condition?

///
/// Note that even if `T` has size `0`, the pointer must be non-NULL and properly aligned.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// Note that even if `T` has size `0`, the pointer must be non-NULL and properly aligned.
/// Note that even if `size_of::<T>() == 0`, the pointer must be non-NULL and properly aligned.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that this formulation is used consistently throughout this file, so I'd prefer that if it gets changed that happens in a separate PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah; that's a good idea; I'll see if I can remember to make such a PR... :)

///
/// [valid]: ../ptr/index.html#safety
///
/// # Examples
///
/// ```ignore (io-is-in-std)
/// use std::io::{self, Read};
/// use std::mem;
/// use std::ptr;
///
/// pub fn read_le_u32<R>(reader: &mut R) -> io::Result<u32>
/// where
/// R: Read,
/// {
/// unsafe {
/// // We're passing this buffer to an arbitrary reader and aren't
/// // guaranteed they won't read from it, so freeze to avoid UB.
/// let mut buf: [u8; 4] = mem::uninitialized();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a FIXME somewhere about porting this to MaybeUninit?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(This is still open, from what I can see)

/// ptr::freeze(buf.as_mut_ptr(), buf.len());
/// reader.read_exact(&mut buf)?;
///
/// Ok(u32::from_le_bytes(buf))
/// }
/// }
/// ```
#[inline]
#[unstable(feature = "ptr_freeze", issue = "0")]
#[cfg(not(stage0))]
pub unsafe fn freeze<T>(dst: *mut T, count: usize) {
intrinsics::freeze(dst, count)
}
///
#[inline]
#[unstable(feature = "ptr_freeze", issue = "0")]
#[cfg(stage0)]
pub unsafe fn freeze<T>(dst: *mut T, count: usize) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this the right interface? It's currently a bit weird in that we don't actually use the count. It could alternatively just take the pointer, and say that it freezes all memory reachable through it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, should this be unsafe in the first place? Since it's not actually modifying any of the pointed-to data, does it matter if it's valid or not?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since it's already "basically stable", I wonder if this should take &mut [T] and move to std::mem?

I'd naively think that it could be safe and probably should be, but I'm not an expert!

We also briefly discussed maybe only taking u8 for now? I'm not sure how useful this would be beyond u8 and other POD types

Copy link
Member

@RalfJung RalfJung Feb 11, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it should take raw pointers so people don't have to create references to uninitialized data.

count being unused just comes from the fact that LLVM does not support "real" freeze, but I think this is a much better interface than "reachable from".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we change this to T: ?Sized so that you can pass a slice in? Then the count parameter would no longer be necessary.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK there's currently no way to form a *mut [T] without going through a reference first.

let _ = count;
asm!("" :: "r"(dst) : "memory" : "volatile");
}

#[lang = "const_ptr"]
impl<T: ?Sized> *const T {
/// Returns `true` if the pointer is null.
Expand Down
20 changes: 19 additions & 1 deletion src/librustc_codegen_llvm/intrinsic.rs
Original file line number Diff line number Diff line change
Expand Up @@ -16,7 +16,8 @@ use rustc::ty::{self, Ty};
use rustc::ty::layout::{self, LayoutOf, HasTyCtxt, Primitive};
use rustc_codegen_ssa::common::{IntPredicate, TypeKind};
use rustc::hir;
use syntax::ast::{self, FloatTy};
use std::ffi::CStr;
use syntax::ast::{self, FloatTy, AsmDialect};
use syntax::symbol::Symbol;
use builder::Builder;
use value::Value;
Expand Down Expand Up @@ -695,6 +696,23 @@ impl IntrinsicCallMethods<'tcx> for Builder<'a, 'll, 'tcx> {
return;
}

"freeze" => {
let dst = args[0].immediate();
let r = self.inline_asm_call(
CStr::from_bytes_with_nul(b"\0").unwrap(),
CStr::from_bytes_with_nul(b"r,~{memory}\0").unwrap(),
&[dst],
self.type_void(),
true,
false,
AsmDialect::Att,
);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could optimize this when count is a constant to avoid the memory clobber by casting dst to a [u8; count] and using asm!("" : "=*m" (dst) : "0" (dst)).

A memory clobber can be expensive since it forces the compiler to reload the values of all values that are potentially globally visible.

This optimization requires count to be properly const-propagated through inlining. Can we do this?

Copy link
Contributor

@hanna-kruppe hanna-kruppe Feb 13, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are inlining and (limited) constant propagation passes for MIR but the former is not enabled by default and it's not clear when it will be.

I can't find any indication that your proposal will actually result in improved codegen (see this test case). Pointee types in LLVM IR are meaningless and slated for removal (if only someone would push that refactoring over the finish line), so I see no reason why the LLVM IR pointee type should have bearing on how many bytes the inline asm can write through the pointer.

Now, if there was a way to indicate that the inline asm guarantees it only accesses a certain range of bytes around the address that's passed in, that would help, but AFAIK there's no such thing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I tried this out in C: https://godbolt.org/z/Fbz1o5

It seems that GCC interprets "m" constraints very precisely, while it seems that LLVM just treats all memory constraints (even read-only memory inputs) as a general memory clobber.

I suppose that in this case we can just stick to the existing formulation since there is no advantage in optimizing it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it seems like this is just going to be inherently a bit imprecise until LLVM lands freeze support on its end.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I cannot review this part, as I have no knowledge of these assembly annotation.

@Amanieu could you confirm that this tells the compiler that the assembly block may read and/or write all memory reachable from dst, such that it can neither optimize away preceding writes nor following loads?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you confirm that this tells the compiler that the assembly block may read and/or write all memory reachable from dst, such that it can neither optimize away preceding writes nor following loads?

@comex put it succinctly here: rust-lang/rfcs#2360 (comment)

memory makes LLVM treats this as readnone = readonly = false, which means that this can read and write to all memory that's not private. This includes all globals, heap memory, etc. and AFAICT definitely includes all memory reachable from dst.

if r.is_none() {
bug!("broken freeze ASM");
}
return;
}

_ => bug!("unknown intrinsic '{}'", name),
};

Expand Down
12 changes: 12 additions & 0 deletions src/librustc_typeck/check/intrinsic.rs
Original file line number Diff line number Diff line change
Expand Up @@ -379,6 +379,18 @@ pub fn check_intrinsic_type<'a, 'tcx>(tcx: TyCtxt<'a, 'tcx, 'tcx>,
(1, vec![ tcx.mk_mut_ptr(param(0)), param(0) ], tcx.mk_unit())
}

"freeze" => {
(1,
vec![
tcx.mk_ptr(ty::TypeAndMut {
ty: param(0),
mutbl: hir::MutMutable
}),
tcx.types.usize,
],
tcx.mk_unit())
}

ref other => {
struct_span_err!(tcx.sess, it.span, E0093,
"unrecognized intrinsic function: `{}`",
Expand Down
6 changes: 5 additions & 1 deletion src/libstd/fs.rs
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,9 @@

use fmt;
use ffi::OsString;
use io::{self, SeekFrom, Seek, Read, Initializer, Write};
use io::{self, SeekFrom, Seek, Read, Write};
#[allow(deprecated)]
use io::Initializer;
use path::{Path, PathBuf};
use sys::fs as fs_imp;
use sys_common::{AsInnerMut, FromInner, AsInner, IntoInner};
Expand Down Expand Up @@ -600,6 +602,7 @@ impl Read for File {
}

#[inline]
#[allow(deprecated)]
unsafe fn initializer(&self) -> Initializer {
Initializer::nop()
}
Expand All @@ -624,6 +627,7 @@ impl<'a> Read for &'a File {
}

#[inline]
#[allow(deprecated)]
unsafe fn initializer(&self) -> Initializer {
Initializer::nop()
}
Expand Down
8 changes: 6 additions & 2 deletions src/libstd/io/buffered.rs
Original file line number Diff line number Diff line change
Expand Up @@ -5,8 +5,11 @@ use io::prelude::*;
use cmp;
use error;
use fmt;
use io::{self, Initializer, DEFAULT_BUF_SIZE, Error, ErrorKind, SeekFrom};
use io::{self, DEFAULT_BUF_SIZE, Error, ErrorKind, SeekFrom};
#[allow(deprecated)]
use io::Initializer;
use memchr;
use ptr;

/// The `BufReader` struct adds buffering to any reader.
///
Expand Down Expand Up @@ -92,7 +95,7 @@ impl<R: Read> BufReader<R> {
unsafe {
let mut buffer = Vec::with_capacity(cap);
buffer.set_len(cap);
inner.initializer().initialize(&mut buffer);
ptr::freeze(buffer.as_mut_ptr(), cap);
BufReader {
inner,
buf: buffer.into_boxed_slice(),
Expand Down Expand Up @@ -236,6 +239,7 @@ impl<R: Read> Read for BufReader<R> {
}

// we can't skip unconditionally because of the large buffer case in read.
#[allow(deprecated)]
unsafe fn initializer(&self) -> Initializer {
self.inner.initializer()
}
Expand Down
5 changes: 4 additions & 1 deletion src/libstd/io/cursor.rs
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,9 @@ use io::prelude::*;

use core::convert::TryInto;
use cmp;
use io::{self, Initializer, SeekFrom, Error, ErrorKind};
use io::{self, SeekFrom, Error, ErrorKind};
#[allow(deprecated)]
use io::Initializer;

/// A `Cursor` wraps an in-memory buffer and provides it with a
/// [`Seek`] implementation.
Expand Down Expand Up @@ -229,6 +231,7 @@ impl<T> Read for Cursor<T> where T: AsRef<[u8]> {
}

#[inline]
#[allow(deprecated)]
unsafe fn initializer(&self) -> Initializer {
Initializer::nop()
}
Expand Down
7 changes: 6 additions & 1 deletion src/libstd/io/impls.rs
Original file line number Diff line number Diff line change
@@ -1,5 +1,7 @@
use cmp;
use io::{self, SeekFrom, Read, Initializer, Write, Seek, BufRead, Error, ErrorKind};
use io::{self, SeekFrom, Read, Write, Seek, BufRead, Error, ErrorKind};
#[allow(deprecated)]
use io::Initializer;
use fmt;
use mem;

Expand All @@ -14,6 +16,7 @@ impl<'a, R: Read + ?Sized> Read for &'a mut R {
}

#[inline]
#[allow(deprecated)]
unsafe fn initializer(&self) -> Initializer {
(**self).initializer()
}
Expand Down Expand Up @@ -83,6 +86,7 @@ impl<R: Read + ?Sized> Read for Box<R> {
}

#[inline]
#[allow(deprecated)]
unsafe fn initializer(&self) -> Initializer {
(**self).initializer()
}
Expand Down Expand Up @@ -172,6 +176,7 @@ impl<'a> Read for &'a [u8] {
}

#[inline]
#[allow(deprecated)]
unsafe fn initializer(&self) -> Initializer {
Initializer::nop()
}
Expand Down
21 changes: 19 additions & 2 deletions src/libstd/io/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -367,7 +367,7 @@ fn read_to_end_with_reservation<R: Read + ?Sized>(r: &mut R,
g.buf.reserve(reservation_size);
let capacity = g.buf.capacity();
g.buf.set_len(capacity);
r.initializer().initialize(&mut g.buf[g.len..]);
ptr::freeze(g.buf.as_mut_ptr().add(g.len), g.buf.len() - g.len);
}
}

Expand Down Expand Up @@ -543,6 +543,8 @@ pub trait Read {
/// [`Initializer::nop()`]: ../../std/io/struct.Initializer.html#method.nop
/// [`Initializer`]: ../../std/io/struct.Initializer.html
#[unstable(feature = "read_initializer", issue = "42788")]
#[rustc_deprecated(since = "1.33.0", reason = "use std::ptr::freeze instead")]
#[allow(deprecated)]
#[inline]
unsafe fn initializer(&self) -> Initializer {
Initializer::zeroing()
Expand Down Expand Up @@ -869,12 +871,22 @@ pub trait Read {

/// A type used to conditionally initialize buffers passed to `Read` methods.
#[unstable(feature = "read_initializer", issue = "42788")]
#[derive(Debug)]
#[rustc_deprecated(since = "1.33.0", reason = "use std::ptr::freeze instead")]
pub struct Initializer(bool);

#[allow(deprecated)]
#[unstable(feature = "read_initializer", issue = "42788")]
impl fmt::Debug for Initializer {
fn fmt(&self, fmt: &mut fmt::Formatter<'_>) -> fmt::Result {
fmt.debug_tuple("Initializer").field(&self.0).finish()
}
}

#[allow(deprecated)]
impl Initializer {
/// Returns a new `Initializer` which will zero out buffers.
#[unstable(feature = "read_initializer", issue = "42788")]
#[rustc_deprecated(since = "1.33.0", reason = "use std::ptr::freeze instead")]
#[inline]
pub fn zeroing() -> Initializer {
Initializer(true)
Expand All @@ -889,20 +901,23 @@ impl Initializer {
/// the method accurately reflects the number of bytes that have been
/// written to the head of the buffer.
#[unstable(feature = "read_initializer", issue = "42788")]
#[rustc_deprecated(since = "1.33.0", reason = "use std::ptr::freeze instead")]
#[inline]
pub unsafe fn nop() -> Initializer {
Initializer(false)
}

/// Indicates if a buffer should be initialized.
#[unstable(feature = "read_initializer", issue = "42788")]
#[rustc_deprecated(since = "1.33.0", reason = "use std::ptr::freeze instead")]
#[inline]
pub fn should_initialize(&self) -> bool {
self.0
}

/// Initializes a buffer if necessary.
#[unstable(feature = "read_initializer", issue = "42788")]
#[rustc_deprecated(since = "1.33.0", reason = "use std::ptr::freeze instead")]
#[inline]
pub fn initialize(&self, buf: &mut [u8]) {
if self.should_initialize() {
Expand Down Expand Up @@ -1698,6 +1713,7 @@ impl<T: Read, U: Read> Read for Chain<T, U> {
self.second.read(buf)
}

#[allow(deprecated)]
unsafe fn initializer(&self) -> Initializer {
let initializer = self.first.initializer();
if initializer.should_initialize() {
Expand Down Expand Up @@ -1895,6 +1911,7 @@ impl<T: Read> Read for Take<T> {
Ok(n)
}

#[allow(deprecated)]
unsafe fn initializer(&self) -> Initializer {
self.inner.initializer()
}
Expand Down
7 changes: 6 additions & 1 deletion src/libstd/io/stdio.rs
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,9 @@ use io::prelude::*;
use cell::RefCell;
use fmt;
use io::lazy::Lazy;
use io::{self, Initializer, BufReader, LineWriter};
use io::{self, BufReader, LineWriter};
#[allow(deprecated)]
use io::Initializer;
use sync::{Arc, Mutex, MutexGuard};
use sys::stdio;
use sys_common::remutex::{ReentrantMutex, ReentrantMutexGuard};
Expand Down Expand Up @@ -67,6 +69,7 @@ impl Read for StdinRaw {
fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> { self.0.read(buf) }

#[inline]
#[allow(deprecated)]
unsafe fn initializer(&self) -> Initializer {
Initializer::nop()
}
Expand Down Expand Up @@ -297,6 +300,7 @@ impl Read for Stdin {
self.lock().read(buf)
}
#[inline]
#[allow(deprecated)]
unsafe fn initializer(&self) -> Initializer {
Initializer::nop()
}
Expand All @@ -317,6 +321,7 @@ impl<'a> Read for StdinLock<'a> {
self.inner.read(buf)
}
#[inline]
#[allow(deprecated)]
unsafe fn initializer(&self) -> Initializer {
Initializer::nop()
}
Expand Down
Loading