Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make CStr an extern type #64021

Closed
wants to merge 9 commits into from
Closed
106 changes: 52 additions & 54 deletions src/libstd/ffi/c_str.rs
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,7 @@ use crate::borrow::{Cow, Borrow};
use crate::cmp::Ordering;
use crate::error::Error;
use crate::fmt::{self, Write};
use crate::hash::{Hasher, Hash};
use crate::io;
use crate::mem;
use crate::memchr;
Expand Down Expand Up @@ -134,10 +135,9 @@ pub struct CString {
/// in each pair are borrowed references; the latter are owned
/// strings.
///
/// Note that this structure is **not** `repr(C)` and is not recommended to be
/// placed in the signatures of FFI functions. Instead, safe wrappers of FFI
/// functions may leverage the unsafe [`from_ptr`] constructor to provide a safe
/// interface to other consumers.
/// Note that this structure is `repr(C)` and is can be
oli-obk marked this conversation as resolved.
Show resolved Hide resolved
/// placed in the signatures of FFI functions. A `&CStr` is equivalent to `*const c_char`,
/// but offers a richer API.
oli-obk marked this conversation as resolved.
Show resolved Hide resolved
///
/// # Examples
///
Expand Down Expand Up @@ -193,20 +193,21 @@ pub struct CString {
/// [`String`]: ../string/struct.String.html
/// [`CString`]: struct.CString.html
/// [`from_ptr`]: #method.from_ptr
#[derive(Hash)]
#[stable(feature = "rust1", since = "1.0.0")]
// FIXME:
// `fn from` in `impl From<&CStr> for Box<CStr>` current implementation relies
// on `CStr` being layout-compatible with `[u8]`.
// When attribute privacy is implemented, `CStr` should be annotated as `#[repr(transparent)]`.
// Anyway, `CStr` representation and layout are considered implementation detail, are
// not documented and must not be relied upon.
pub struct CStr {
// FIXME: this should not be represented with a DST slice but rather with
// just a raw `c_char` along with some form of marker to make
// this an unsized type. Essentially `sizeof(&CStr)` should be the
// same as `sizeof(&c_char)` but `CStr` should be an unsized type.
inner: [c_char]
pub struct CStr(CStrExtern);

extern "C" {
// HACK: workaround for https://github.com/rust-lang/rust/issues/46665
type CStrExtern;
}

#[stable(feature = "rust1", since = "1.0.0")]
impl Hash for CStr {
fn hash<H: Hasher>(&self, state: &mut H) {
// We also hash a trailing zero to make `CStr` and `CString` produce the same hash
// for the same string.
self.to_bytes_with_nul().hash(state);
}
oli-obk marked this conversation as resolved.
Show resolved Hide resolved
}

/// An error indicating that an interior nul byte was found.
Expand Down Expand Up @@ -598,7 +599,11 @@ impl CString {
/// ```
#[stable(feature = "into_boxed_c_str", since = "1.20.0")]
pub fn into_boxed_c_str(self) -> Box<CStr> {
unsafe { Box::from_raw(Box::into_raw(self.into_inner()) as *mut CStr) }
let ptr: *mut [u8] = Box::into_raw(self.into_inner());
// SAFETY: Casting away the length information is fine as `CStr`'s length computation works
// by counting the elements up to the first null. Since this `CString` must be null
// terminated, the `CStr` can properly compute its length
unsafe { Box::from_raw(ptr as *mut CStr) }
}

/// Bypass "move out of struct which implements [`Drop`] trait" restriction.
Expand Down Expand Up @@ -933,22 +938,16 @@ impl fmt::Display for IntoStringError {

impl CStr {
/// Wraps a raw C string with a safe C string wrapper.
/// This operation is a zero cost conversion.
///
/// This function will wrap the provided `ptr` with a `CStr` wrapper, which
/// allows inspection and interoperation of non-owned C strings. This method
/// is unsafe for a number of reasons:
///
/// * There is no guarantee to the validity of `ptr`.
/// * The returned lifetime is not guaranteed to be the actual lifetime of
/// `ptr`.
/// * There is no guarantee that the memory pointed to by `ptr` contains a
/// valid nul terminator byte at the end of the string.
/// * It is not guaranteed that the memory pointed by `ptr` won't change
/// before the `CStr` has been destroyed.
///
/// > **Note**: This operation is intended to be a 0-cost cast but it is
/// > currently implemented with an up-front calculation of the length of
/// > the string. This is not guaranteed to always be the case.
/// * The pointer must be non-null,
/// * The returned lifetime must be actual lifetime of the memory that `ptr` points to,
/// * The memory pointed to by `ptr` must have a nul terminator byte within the allocation,
/// * The memory pointed by `ptr` must not be modified before the `CStr` has been destroyed.
///
/// # Examples
///
Expand All @@ -969,9 +968,7 @@ impl CStr {
/// ```
#[stable(feature = "rust1", since = "1.0.0")]
oli-obk marked this conversation as resolved.
Show resolved Hide resolved
pub unsafe fn from_ptr<'a>(ptr: *const c_char) -> &'a CStr {
let len = sys::strlen(ptr);
let ptr = ptr as *const u8;
CStr::from_bytes_with_nul_unchecked(slice::from_raw_parts(ptr, len as usize + 1))
&*(ptr as *const CStr)
}

/// Creates a C string wrapper from a byte slice.
Expand Down Expand Up @@ -1024,7 +1021,7 @@ impl CStr {
///
/// This function will cast the provided `bytes` to a `CStr` wrapper without
/// performing any sanity checks. The provided slice **must** be nul-terminated
/// and not contain any interior nul bytes.
/// and its length will be truncated to its first interior nul (if any).
///
/// # Examples
///
Expand Down Expand Up @@ -1094,17 +1091,17 @@ impl CStr {
#[inline]
#[stable(feature = "rust1", since = "1.0.0")]
pub const fn as_ptr(&self) -> *const c_char {
self.inner.as_ptr()
self as *const CStr as *const c_char
}

/// Converts this C string to a byte slice.
///
/// The returned slice will **not** contain the trailing nul terminator that this C
/// string has.
///
/// > **Note**: This method is currently implemented as a constant-time
/// > cast, but it is planned to alter its definition in the future to
/// > perform the length calculation whenever this method is called.
/// > **Note**: This method performs the length calculation whenever this method is called.
/// > Calculating the length of the C string requires iterating through all bytes of the
/// > C string and can thus be an expensive operation.
///
/// # Examples
///
Expand All @@ -1126,9 +1123,9 @@ impl CStr {
/// This function is the equivalent of [`to_bytes`] except that it will retain
/// the trailing nul terminator instead of chopping it off.
///
/// > **Note**: This method is currently implemented as a 0-cost cast, but
/// > it is planned to alter its definition in the future to perform the
/// > length calculation whenever this method is called.
/// > **Note**: This method performs the length calculation whenever this method is called.
/// > Calculating the length of the C string requires iterating through all bytes of the
/// > C string and can thus be an expensive operation.
///
/// [`to_bytes`]: #method.to_bytes
///
Expand All @@ -1143,7 +1140,13 @@ impl CStr {
#[inline]
#[stable(feature = "rust1", since = "1.0.0")]
pub fn to_bytes_with_nul(&self) -> &[u8] {
unsafe { &*(&self.inner as *const [c_char] as *const [u8]) }
// SAFETY: safe references to `CStr` can only exist if they point to a memory location
// that is terminated by a `\0`.
Copy link
Contributor

@abonander abonander Sep 4, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason the original code in from_ptr didn't assert that the pointer was not null? I'm guessing it either caused miscompilation or pessmizations.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know. I'm guessing an oversight?

unsafe {
let len = sys::strlen(self as *const CStr as *const c_char);
let ptr = self as *const CStr as *const u8;
slice::from_raw_parts(ptr, len as usize + 1)
}
}

/// Yields a [`&str`] slice if the `CStr` contains valid UTF-8.
Expand All @@ -1152,10 +1155,8 @@ impl CStr {
/// function will return the corresponding [`&str`] slice. Otherwise,
/// it will return an error with details of where UTF-8 validation failed.
///
/// > **Note**: This method is currently implemented to check for validity
/// > after a constant-time cast, but it is planned to alter its definition
/// > in the future to perform the length calculation in addition to the
/// > UTF-8 check whenever this method is called.
/// > **Note**: This method performs the length calculation in
/// > addition to the UTF-8 check whenever this method is called.
///
/// [`&str`]: ../primitive.str.html
///
Expand All @@ -1169,9 +1170,7 @@ impl CStr {
/// ```
#[stable(feature = "cstr_to_str", since = "1.4.0")]
pub fn to_str(&self) -> Result<&str, str::Utf8Error> {
// N.B., when `CStr` is changed to perform the length check in `.to_bytes()`
// instead of in `from_ptr()`, it may be worth considering if this should
// be rewritten to do the UTF-8 check inline with the length calculation
// FIXME: this should be rewritten to do the UTF-8 check inline with the length calculation
// instead of doing it afterwards.
str::from_utf8(self.to_bytes())
}
Expand All @@ -1185,10 +1184,8 @@ impl CStr {
/// [`U+FFFD REPLACEMENT CHARACTER`][U+FFFD] and return a
/// [`Cow`]`::`[`Owned`]`(`[`String`]`)` with the result.
///
/// > **Note**: This method is currently implemented to check for validity
/// > after a constant-time cast, but it is planned to alter its definition
/// > in the future to perform the length calculation in addition to the
/// > UTF-8 check whenever this method is called.
/// > **Note**: This method performs the length calculation in
/// > addition to the UTF-8 check whenever this method is called.
///
/// [`Cow`]: ../borrow/enum.Cow.html
/// [`Borrowed`]: ../borrow/enum.Cow.html#variant.Borrowed
Expand Down Expand Up @@ -1244,8 +1241,9 @@ impl CStr {
/// ```
#[stable(feature = "into_boxed_c_str", since = "1.20.0")]
pub fn into_c_string(self: Box<CStr>) -> CString {
let raw = Box::into_raw(self) as *mut [u8];
CString { inner: unsafe { Box::from_raw(raw) } }
unsafe {
oli-obk marked this conversation as resolved.
Show resolved Hide resolved
CString::from_raw(Box::into_raw(self) as *mut CStr as *mut c_char)
}
}
}

Expand Down
1 change: 1 addition & 0 deletions src/libstd/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -263,6 +263,7 @@
#![feature(exact_size_is_empty)]
#![feature(exhaustive_patterns)]
#![feature(external_doc)]
#![feature(extern_types)]
#![feature(fn_traits)]
#![feature(format_args_nl)]
#![feature(generator_trait)]
Expand Down