Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add lossless debug implementation for unix OsStrs #46798

Merged
merged 1 commit into from
Dec 18, 2017

Conversation

Diggsey
Copy link
Contributor

@Diggsey Diggsey commented Dec 18, 2017

Fixes #22766

Invalid utf8 byte sequences are replaced with \xFF style escape codes, while valid utf8 goes through the normal Debug implementation.

This is necessarily different from the windows Debug implementation, which uses \u{xxxx} style escape sequences for unpaired surrogates, but both implementations are consistent in that they are both lossless, and display invalid sequences in the way most similar to existing language syntax.

r? @dtolnay

Copy link
Member

@dtolnay dtolnay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@dtolnay
Copy link
Member

dtolnay commented Dec 18, 2017

@bors r+

@bors
Copy link
Contributor

bors commented Dec 18, 2017

📌 Commit 4c5e070 has been approved by dtolnay

@bors
Copy link
Contributor

bors commented Dec 18, 2017

⌛ Testing commit 4c5e070709686d388281454e1a1384748f5afb01 with merge 286b7b6457b0b923cbc60d0c37b8c814554b6060...

@dtolnay
Copy link
Member

dtolnay commented Dec 18, 2017

The new file is going to need a license header.

[00:04:12] tidy error: /checkout/src/libstd/sys_common/bytestring.rs: incorrect license
[00:04:14] some tidy checks failed

@bors r-

@bors
Copy link
Contributor

bors commented Dec 18, 2017

💔 Test failed - status-appveyor

@Diggsey
Copy link
Contributor Author

Diggsey commented Dec 18, 2017

Oops!

// option. This file may not be copied, modified, or distributed
// except according to those terms.

#![allow(dead_code)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this still required?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, on windows builds the function is never called.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it work if we do #[cfg(any(unix, target_os = "redox", target_arch = "wasm32"))] pub mod bytestring instead? I already see some of that for pub mod gnu and pub mod net. That way we continue to catch accidental dead code in this file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so - see src\tools\tidy\src\pal.rs

It looks like the goal is to avoid having platform specific code in sys_common and net is a temporary exception. The closest existing file to this is wtf8, which also uses #[allow(dead_code)].

@dtolnay
Copy link
Member

dtolnay commented Dec 18, 2017

@bors r+

@bors
Copy link
Contributor

bors commented Dec 18, 2017

📌 Commit 8fac7d9 has been approved by dtolnay

@bors
Copy link
Contributor

bors commented Dec 18, 2017

⌛ Testing commit 8fac7d9 with merge a3a7203...

bors added a commit that referenced this pull request Dec 18, 2017
Add lossless debug implementation for unix OsStrs

Fixes #22766

Invalid utf8 byte sequences are replaced with `\xFF` style escape codes, while valid utf8 goes through the normal `Debug` implementation.

This is necessarily different from the windows Debug implementation, which uses `\u{xxxx}` style escape sequences for unpaired surrogates, but both implementations are consistent in that they are both lossless, and display invalid sequences in the way most similar to existing language syntax.

r? @dtolnay
@dtolnay dtolnay added the T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. label Dec 18, 2017
@bors
Copy link
Contributor

bors commented Dec 18, 2017

☀️ Test successful - status-appveyor, status-travis
Approved by: dtolnay
Pushing a3a7203 to master...

@bors bors merged commit 8fac7d9 into rust-lang:master Dec 18, 2017
@Diggsey Diggsey deleted the debug-osstr branch December 27, 2017 01:33
JohnTitor added a commit to JohnTitor/rust that referenced this pull request Jul 2, 2021
Merge `sys_common::bytestring` back into `os_str_bytes`

`bytestring` contains code for correctly debug formatting a byte slice (`[u8]`). This functionality is and has historically only been used to provide the debug formatting of byte-based os-strings (on unix etc.).

Having this functionality in the separate `bytestring` module was useful in the past to reduce duplication, as [when it was added](rust-lang#46798) `os_str_bytes` was still split into `sys::{unix, redox, wasi, etc.}::os_str`. However, now that is no longer the case, there is not much reason for the `bytestring` functionality to be separate from `os_str_bytes`; I don't think it is very likely that another part of std will need to handle formatting byte strings that are not os-strings in the future (everything should be `utf8`). This is why this PR merges the functionality of `bytestring` directly into the debug implementation in `os_str_bytes`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants