Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve safety & usability of |size_t| and |ssize_t|. #28096

Closed
wants to merge 4 commits into from

Conversation

briansmith
Copy link
Contributor

I mostly just want to get feedback on this change before I do any of the additional work that should be done:

  • Fix any compiler errors caused by changing the definitions of these types.
  • Remove any now-redundant (and potentially unsafe) casting in the repository.

@rust-highfive
Copy link
Collaborator

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @nikomatsakis (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. The way Github handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

@bluss
Copy link
Member

bluss commented Aug 30, 2015

I think your rationale for defining size_t as usize where possible is pretty good. Anyone know why it uses u32/u64 right now?

I wonder if this is pulling too much C-standard lawyering into Rust, a place where it does not really belong. Rust already establishes some fundamentals which mean that not every theoretical C platform can be supported. I believe these are examples:

  • Rust assumes the type u8 exists, basically that the byte is 8-bit, and further that types u16, u32, u64 exist.
  • Rust assumes all (thin) pointer types are the same size.

@eefriedman
Copy link
Contributor

I think the reason size_t isn't currently defined this way is damage from the time when isize was named "int". Essentially, the same reason that isize/usize currently trigger the improper_ctypes lint. (I think there was some discussion of this on my improper_ctypes rewrite PR.)

We can do whatever we want internally, but the libc crate on crates.io might have compatibility concerns.

@briansmith
Copy link
Contributor Author

The improper_ctypes rewrite you are referring to is #26583, I think.

@briansmith
Copy link
Contributor Author

The improper_ctypes lint contains this code:

            ty::TyInt(ast::TyIs) => {
                FfiUnsafe("found Rust type `isize` in foreign module, while \
                          `libc::c_int` or `libc::c_long` should be used")
            }
            ty::TyUint(ast::TyUs) => {
                FfiUnsafe("found Rust type `usize` in foreign module, while \
                          `libc::c_uint` or `libc::c_ulong` should be used")
            }

it seems like this would need to be changed so that it warns only when size_t isn't an alias for usize and/or when ssize_t isn't an alias for isize. However, we don't support any such platforms currently so perhaps I can just remove this chunk?

@eefriedman
Copy link
Contributor

Yes, it should be fine to just remove that chunk.

It is safer to alias |size_t|/|ssize_t| as |usize|/|size_t| on
platforms where they are the same size, in order to reduce
unneccessary and hazardous casting.

In theory, there are platforms where |usize| and |isize| are not the
same as |size_t| and |ssize_t|, and in that case we **should** have
this warning, but we don't currently support any such platforms, so we
don't bother adding any logic to deal with that situation yet.
@alexcrichton
Copy link
Member

Hm I share @eefriedman's concerns about the liblibc crate on crates.io as well, this looks like it's got good potential to be a sizable breaking change. Overall I'm a bit on the fence about this change. In the rare case that usize != size_t then we're basically pushing the breakage from runtime to compile time (which seems good), but on the other hand it now seems that if that's the case then libc basically can't be right, which seems unfortunate.

I think the breakage point of libc on crates.io may push me more towards the status quo for now, though.

@briansmith
Copy link
Contributor Author

it now seems that if that's the case then libc basically can't be right, which seems unfortunate.

What do you mean "can't be right?" If/when such a platform is supported, libc will need to be adapted. Such an adaptation would probably involve the introduction of helper function to do the possibly-truncating casting in a safe and convenient way. But, that was the case already. This change just makes it more obvious. And, I have a feeling that such a platform may never be added.

I think the breakage point of libc on crates.io may push me more towards the status quo for now, though

I think that this is unlikely to break a lot of stuff, because I imagine very little code is written assuming that libc::size_t is u32 or that libc::ssize_t is i32. I've got this repo building with these changes now locally on Linux, and that's a good sign as far as the compatibility impact.

@alexcrichton
Copy link
Member

@briansmith oh hm yeah that's a good point, we just wouldn't define size_t = usize on all platforms, we'd just say something like size_t = u32 on those obscure platforms.

Out of curiosity, did you encounter any breakage when building the compiler itself? Lots of casts look like they were removed but I doubt they were necessary to remove.

Some good test cases may also be building both Cargo and Servo with a modified libc (e.g. using a Cargo path override), it'd certainly give me quite a bit of confidence in pushing this forward :)

I didn't try to find every such cast. I concentrated on the files that
had now-unnecessary casts from |usize| to |size_t|.
@bors
Copy link
Contributor

bors commented Sep 3, 2015

☔ The latest upstream changes (presumably #28138) made this pull request unmergeable. Please resolve the merge conflicts.

@alexcrichton alexcrichton added the T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. label Sep 3, 2015
@alexcrichton
Copy link
Member

Tagging with T-libs to ensure that this at least comes up during triage, but cc @rust-lang/libs as well

@alexcrichton
Copy link
Member

Ok, the libs team talked about this RFC in triage today, and we are in general in favor of this approach. In that sense, we'd "like to merge" this RFC, but we're also considering the breakage in terms of external crates and such.

The good news is that I'm working on a revamp of the liblibc crate to live outside this repository (but get imported in). This is going to be a breaking change to the liblibc library no matter what, which is already going to be signaled through semver, so that would be a great opportunity to include changes like this. Overall that should assuage my worries about breakage as it's communicated through semver and we can announce it broadly.

Along these lines, here's what I think we should do about this PR currently:

  • The change to size_t and ssize_t may want to be backed out for now. We could apply it locally in the repository and hold off on publishing it, but it may be best to just hold out waiting for the liblibc revamp.
  • The change to the FFI lint seems like it should land ASAP. If we start liberally using usize and isize in FFI bindings we certainly shouldn't be receiving warnings about them!

@briansmith would you be ok cutting this PR back to just modify the FFI lint? It's something we'd probably consider backporting to beta as well, and I can be sure to include the change to size_t and ssize_t in the reorganization of liblibc. How's that sound?

@briansmith
Copy link
Contributor Author

I agree it makes sense to change the FFI lint early. I will split that out into a separate commit.

However, I don't think that this is actually a breaking change in practice. For all supported platforms, AFAICT, ssize_t is already equivalent to isize and size_t is already equivalent to usize, except for the need for a cast. That is, this change makes some casts unnecessary but it is very unlikely to cause more casts to be needed. The only case where it would break stuff is where somebody is passing an i32/u32 or i64/u32 to a function that takes an ssize_t/size_t, which seems likely to be rare.

In particular, when I built the rust compiler (the rust repo), there were ZERO failures building the compiler and libraries. The only failures right now (IIRC) are due to the fact that I don't know how to modify the tests for the lint, because I don't know how to format the comments that indicate the expected error text.

Like I said, though, I will split out the changes to the lint so that it can land before the change for actually changing the definitions of ssize_t/size_t. In fact, it would be nice if the changes to the lint can be uplifted to beta as otherwise code that needs to work on stable would have to wait two releases before removing the casts.

@alexcrichton
Copy link
Member

Yeah I'm not sure that the breakage will be all that large, but I could imagine that some crates may be passing a size_t to a function that expects a u64 indirectly, however. For example lots of types in liblibc are defined to u64 and size_t will silently coerce to any one of them. Regardless, though, the way to evaluate this would be to schedule a crater run, but unfortunately that can't be done easily for external crates.io crates, and with an upcoming breaking change to liblibc it's easiest to just throw it in there with the rest of the changes.

@alexcrichton
Copy link
Member

ping @briansmith, are you still ok paring this back to just the lint changes?

@alexcrichton
Copy link
Member

I'm gonna close this in favor of #28779 and rust-lang/rfcs#1291, thanks for the PR regardless though @briansmith!

@briansmith
Copy link
Contributor Author

Why did you close this when the main point of the PR was not addressed? This should be reopened.

@sfackler
Copy link
Member

sfackler commented Oct 7, 2015

The main point of the PR is addressed in the linked RFC: https://github.com/rust-lang/rfcs/pull/1291/files#diff-0c0c1a31105548fdfea87a54035357e7R131

@briansmith
Copy link
Contributor Author

My mistake. Sorry!

On Tue, Oct 6, 2015 at 9:41 PM, Steven Fackler notifications@github.com
wrote:

The main point of the PR is addressed in the linked RFC:
https://github.com/rust-lang/rfcs/pull/1291/files#diff-0c0c1a31105548fdfea87a54035357e7R131


Reply to this email directly or view it on GitHub
#28096 (comment).

https://briansmith.org/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants