-
Notifications
You must be signed in to change notification settings - Fork 12.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update Windows Argument Parsing #87580
Conversation
r? @dtolnay (rust-highfive has picked a reviewer for you, use r? to override) |
r? @rust-lang/libs |
/// | ||
/// This function was tested for equivalence to the C/C++ parsing rules using an | ||
/// extensive test suite available at | ||
/// <https://github.com/ChrisDenton/winarg/tree/std>. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would it be possible to integrate some or all of this into our testsuite?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, more extensive testing is slow because of the overhead of starting a process but perhaps some more limited examples can be tested. However, I wasn't sure about compiling and running a C++ within the testsuite. Is that ok?
PR looks great @ChrisDenton. I caught a couple of typos you might want to fix but other than that I'm happy to merge this when you're ready. @bors delegate+ |
✌️ @ChrisDenton can now approve this pull request |
One follow up, since this changes runtime behavior it would help to have a short overview we can add to the release notes written up on how this changes the runtime behavior and why we feel this breakage is justified. |
Good point. I'm running local tests now and maybe I can figure out a summary in the meantime. I'm not very good a being concise but this is the gist of it: Most of the parsing differences are minor and only really affect weird edge-cases. The big change is escaping The main reason for updating the parsing rules is that it provides more consistent behaviour across modern (since 2008) utilities. This matters both for the user and for shells. E.g. powershell takes a string provided by the user, parses it according to powershell rules (expanding variables, etc), then produces a command line which it passes to the program. It expects the application to parse this command line in a standard way. So both powershell and the application should fully agree on how many arguments there are and what those arguments are. However, the new double quote escaping rules are mostly for users. This is of particular concern when using the command prompt (aka |
That looks great, I've gone ahead and added a link to the overview comment in the PR description. Thank you again. |
I'd also add the small note that since Microsoft were willing to make the changes to their C/C++ argument parsing, they were presumably confident that it wouldn't be too disruptive. Of course that doesn't rule out the possibility of this change breaking something in the Rust ecosystem. |
66b574e
to
fbe5691
Compare
@bors r+ |
📌 Commit fbe56912f601c300d9b34b5b4f953742440c5b51 has been approved by |
I just realized I should have FCPed this since it's a change to the stable API even though it's not adding anything new. In this case it's probably okay since we already discussed this a bit and the change seems pretty subtle and straightforward but I'm gonna cc @rust-lang/libs-api just to make sure everyone has a chance to see it. |
Ah, right. And I think I should have used r= there? Sorry, I'm a bit fuzzy about Bors commands. |
Nonono, you didn't do anything wrong. I shouldn't have delegated bors before the FCP. This is all me, not you, and not a big deal either way. |
⌛ Testing commit fbe56912f601c300d9b34b5b4f953742440c5b51 with merge f20948d35517bbc3b7f7c79fe7df1ccf2d4e3d85... |
💥 Test timed out |
The final comment period, with a disposition to merge, as per the review above, is now complete. As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed. The RFC will be merged soon. |
@bors r+ |
📌 Commit e26dda5 has been approved by |
…-ou-se Update Windows Argument Parsing Fixes rust-lang#44650 The Windows command line is passed to applications [as a single string](https://docs.microsoft.com/en-us/archive/blogs/larryosterman/the-windows-command-line-is-just-a-string) which the application then parses to get a list of arguments. The standard rules (as used by C/C++) for parsing the command line have slightly changed over the years, most recently in 2008 which added new escaping rules. This PR implements the new rules as [described on MSDN](https://docs.microsoft.com/en-us/cpp/cpp/main-function-command-line-args?view=msvc-160#parsing-c-command-line-arguments) and [further detailed here](https://daviddeley.com/autohotkey/parameters/parameters.htm#WIN). It has been tested against the behaviour of C++ by calling a C++ program that outputs its raw command line and the contents of `argv`. See [my repo](https://github.com/ChrisDenton/winarg/tree/std) if anyone wants to reproduce my work. For an overview of how this PR changes argument parsing behavior and why we feel it is warranted see rust-lang#87580 (comment). For some examples see: rust-lang#87580 (comment)
…-ou-se Update Windows Argument Parsing Fixes rust-lang#44650 The Windows command line is passed to applications [as a single string](https://docs.microsoft.com/en-us/archive/blogs/larryosterman/the-windows-command-line-is-just-a-string) which the application then parses to get a list of arguments. The standard rules (as used by C/C++) for parsing the command line have slightly changed over the years, most recently in 2008 which added new escaping rules. This PR implements the new rules as [described on MSDN](https://docs.microsoft.com/en-us/cpp/cpp/main-function-command-line-args?view=msvc-160#parsing-c-command-line-arguments) and [further detailed here](https://daviddeley.com/autohotkey/parameters/parameters.htm#WIN). It has been tested against the behaviour of C++ by calling a C++ program that outputs its raw command line and the contents of `argv`. See [my repo](https://github.com/ChrisDenton/winarg/tree/std) if anyone wants to reproduce my work. For an overview of how this PR changes argument parsing behavior and why we feel it is warranted see rust-lang#87580 (comment). For some examples see: rust-lang#87580 (comment)
☀️ Test successful - checks-actions |
Pkgsrc changes: * Remove one now-longer-applicable patch, adjust a few others * Bump bootstrap requirements to 1.55.0. Upstream changes: Version 1.56.0 (2021-10-21) ======================== Language -------- - [The 2021 Edition is now stable.][rust#88100] See [the edition guide][rust-2021-edition-guide] for more details. - [The pattern in `binding @ pattern` can now also introduce new bindings.] [rust#85305] - [Union field access is permitted in `const fn`.][rust#85769] [rust-2021-edition-guide]: https://doc.rust-lang.org/nightly/edition-guide/rust-2021/index.html Compiler -------- - [Upgrade to LLVM 13.][rust#87570] - [Support memory, address, and thread sanitizers on aarch64-unknown-freebsd.][rust#88023] - [Allow specifying a deployment target version for all iOS targets][rust#87699] - [Warnings can be forced on with `--force-warn`.][rust#87472] This feature is primarily intended for usage by `cargo fix`, rather than end users. - [Promote `aarch64-apple-ios-sim` to Tier 2\*.][rust#87760] - [Add `powerpc-unknown-freebsd` at Tier 3\*.][rust#87370] - [Add `riscv32imc-esp-espidf` at Tier 3\*.][rust#87666] \* Refer to Rust's [platform support page][platform-support-doc] for more information on Rust's tiered platform support. Libraries --------- - [Allow writing of incomplete UTF-8 sequences via stdout/stderr on Windows.] [rust#83342] The Windows console still requires valid Unicode, but this change allows splitting a UTF-8 character across multiple write calls. This allows, for instance, programs that just read and write data buffers (e.g. copying a file to stdout) without regard for Unicode or character boundaries. - [Prefer `AtomicU{64,128}` over Mutex for Instant backsliding protection.] [rust#83093] For this use case, atomics scale much better under contention. - [Implement `Extend<(A, B)>` for `(Extend<A>, Extend<B>)`][rust#85835] - [impl Default, Copy, Clone for std::io::Sink and std::io::Empty][rust#86744] - [`impl From<[(K, V); N]>` for all collections.][rust#84111] - [Remove `P: Unpin` bound on impl Future for Pin.][rust#81363] - [Treat invalid environment variable names as non-existent.][rust#86183] Previously, the environment functions would panic if given a variable name with an internal null character or equal sign (`=`). Now, these functions will just treat such names as non-existent variables, since the OS cannot represent the existence of a variable with such a name. Stabilised APIs --------------- - [`std::os::unix::fs::chroot`] - [`UnsafeCell::raw_get`] - [`BufWriter::into_parts`] - [`core::panic::{UnwindSafe, RefUnwindSafe, AssertUnwindSafe}`] These APIs were previously stable in `std`, but are now also available in `core`. - [`Vec::shrink_to`] - [`String::shrink_to`] - [`OsString::shrink_to`] - [`PathBuf::shrink_to`] - [`BinaryHeap::shrink_to`] - [`VecDeque::shrink_to`] - [`HashMap::shrink_to`] - [`HashSet::shrink_to`] These APIs are now usable in const contexts: - [`std::mem::transmute`] - [`[T]::first`][`slice::first`] - [`[T]::split_first`][`slice::split_first`] - [`[T]::last`][`slice::last`] - [`[T]::split_last`][`slice::split_last`] Cargo ----- - [Cargo supports specifying a minimum supported Rust version in Cargo.toml.] [`rust-version`] This has no effect at present on dependency version selection. We encourage crates to specify their minimum supported Rust version, and we encourage CI systems that support Rust code to include a crate's specified minimum version in the text matrix for that crate by default. Compatibility notes ------------------- - [Update to new argument parsing rules on Windows.][rust#87580] This adjusts Rust's standard library to match the behavior of the standard libraries for C/C++. The rules have changed slightly over time, and this PR brings us to the latest set of rules (changed in 2008). - [Disallow the aapcs calling convention on aarch64][rust#88399] This was already not supported by LLVM; this change surfaces this lack of support with a better error message. - [Make `SEMICOLON_IN_EXPRESSIONS_FROM_MACROS` warn by default][rust#87385] - [Warn when an escaped newline skips multiple lines.][rust#87671] - [Calls to `libc::getpid` / `std::process::id` from `Command::pre_exec` may return different values on glibc <= 2.24.][rust#81825] Rust now invokes the `clone3` system call directly, when available, to use new functionality available via that system call. Older versions of glibc cache the result of `getpid`, and only update that cache when calling glibc's clone/fork functions, so a direct system call bypasses that cache update. glibc 2.25 and newer no longer cache `getpid` for exactly this reason. Internal changes ---------------- These changes provide no direct user facing benefits, but represent significant improvements to the internals and overall performance of rustc and related tools. - [LLVM is compiled with PGO in published x86_64-unknown-linux-gnu artifacts.][rust#88069] This improves the performance of most Rust builds. - [Unify representation of macros in internal data structures.][rust#88019] This change fixes a host of bugs with the handling of macros by the compiler, as well as rustdoc. [`std::os::unix::fs::chroot`]: https://doc.rust-lang.org/stable/std/os/unix/fs/fn.chroot.html [`Iterator::intersperse`]: https://doc.rust-lang.org/stable/std/iter/trait.Iterator.html#method.intersperse [`Iterator::intersperse_with`]: https://doc.rust-lang.org/stable/std/iter/trait.Iterator.html#method.intersperse [`UnsafeCell::raw_get`]: https://doc.rust-lang.org/stable/std/cell/struct.UnsafeCell.html#method.raw_get [`BufWriter::into_parts`]: https://doc.rust-lang.org/stable/std/io/struct.BufWriter.html#method.into_parts [`core::panic::{UnwindSafe, RefUnwindSafe, AssertUnwindSafe}`]: rust-lang/rust#84662 [`Vec::shrink_to`]: https://doc.rust-lang.org/stable/std/vec/struct.Vec.html#method.shrink_to [`String::shrink_to`]: https://doc.rust-lang.org/stable/std/string/struct.String.html#method.shrink_to [`OsString::shrink_to`]: https://doc.rust-lang.org/stable/std/ffi/struct.OsString.html#method.shrink_to [`PathBuf::shrink_to`]: https://doc.rust-lang.org/stable/std/path/struct.PathBuf.html#method.shrink_to [`BinaryHeap::shrink_to`]: https://doc.rust-lang.org/stable/std/collections/struct.BinaryHeap.html#method.shrink_to [`VecDeque::shrink_to`]: https://doc.rust-lang.org/stable/std/collections/struct.VecDeque.html#method.shrink_to [`HashMap::shrink_to`]: https://doc.rust-lang.org/stable/std/collections/hash_map/struct.HashMap.html#method.shrink_to [`HashSet::shrink_to`]: https://doc.rust-lang.org/stable/std/collections/hash_set/struct.HashSet.html#method.shrink_to [`std::mem::transmute`]: https://doc.rust-lang.org/stable/std/mem/fn.transmute.html [`slice::first`]: https://doc.rust-lang.org/stable/std/primitive.slice.html#method.first [`slice::split_first`]: https://doc.rust-lang.org/stable/std/primitive.slice.html#method.split_first [`slice::last`]: https://doc.rust-lang.org/stable/std/primitive.slice.html#method.last [`slice::split_last`]: https://doc.rust-lang.org/stable/std/primitive.slice.html#method.split_last [`rust-version`]: https://doc.rust-lang.org/nightly/cargo/reference/manifest.html#the-rust-version-field [rust#87671]: rust-lang/rust#87671 [rust#86183]: rust-lang/rust#86183 [rust#87385]: rust-lang/rust#87385 [rust#88100]: rust-lang/rust#88100 [rust#86860]: rust-lang/rust#86860 [rust#84039]: rust-lang/rust#84039 [rust#86492]: rust-lang/rust#86492 [rust#88363]: rust-lang/rust#88363 [rust#85305]: rust-lang/rust#85305 [rust#87832]: rust-lang/rust#87832 [rust#88069]: rust-lang/rust#88069 [rust#87472]: rust-lang/rust#87472 [rust#87699]: rust-lang/rust#87699 [rust#87570]: rust-lang/rust#87570 [rust#88023]: rust-lang/rust#88023 [rust#87760]: rust-lang/rust#87760 [rust#87370]: rust-lang/rust#87370 [rust#87580]: rust-lang/rust#87580 [rust#83342]: rust-lang/rust#83342 [rust#83093]: rust-lang/rust#83093 [rust#88177]: rust-lang/rust#88177 [rust#88548]: rust-lang/rust#88548 [rust#88551]: rust-lang/rust#88551 [rust#88299]: rust-lang/rust#88299 [rust#88220]: rust-lang/rust#88220 [rust#85835]: rust-lang/rust#85835 [rust#86879]: rust-lang/rust#86879 [rust#86744]: rust-lang/rust#86744 [rust#84662]: rust-lang/rust#84662 [rust#86593]: rust-lang/rust#86593 [rust#81050]: rust-lang/rust#81050 [rust#81363]: rust-lang/rust#81363 [rust#84111]: rust-lang/rust#84111 [rust#85769]: rust-lang/rust#85769 (comment) [rust#88490]: rust-lang/rust#88490 [rust#88269]: rust-lang/rust#88269 [rust#84176]: rust-lang/rust#84176 [rust#88399]: rust-lang/rust#88399 [rust#88227]: rust-lang/rust#88227 [rust#88200]: rust-lang/rust#88200 [rust#82776]: rust-lang/rust#82776 [rust#88077]: rust-lang/rust#88077 [rust#87728]: rust-lang/rust#87728 [rust#87050]: rust-lang/rust#87050 [rust#87619]: rust-lang/rust#87619 [rust#81825]: rust-lang/rust#81825 (comment) [rust#88019]: rust-lang/rust#88019 [rust#87666]: rust-lang/rust#87666
Pkgsrc changes: * Bump bootstrap kit version to 1.55.0. * Adjust patches as needed, some no longer apply (so removed) * Update checksum adjustments. * Avoid rust-llvm on SunOS * Optionally build docs * Remove reference to closed/old PR#54621 Upstream changes: Version 1.56.1 (2021-11-01) =========================== - New lints to detect the presence of bidirectional-override Unicode codepoints in the compiled source code ([CVE-2021-42574]) [CVE-2021-42574]: https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-42574 Version 1.56.0 (2021-10-21) ======================== Language -------- - [The 2021 Edition is now stable.][rust#88100] See [the edition guide][rust-2021-edition-guide] for more details. - [The pattern in `binding @ pattern` can now also introduce new bindings.] [rust#85305] - [Union field access is permitted in `const fn`.][rust#85769] [rust-2021-edition-guide]: https://doc.rust-lang.org/nightly/edition-guide/rust-2021/index.html Compiler -------- - [Upgrade to LLVM 13.][rust#87570] - [Support memory, address, and thread sanitizers on aarch64-unknown-freebsd.] [rust#88023] - [Allow specifying a deployment target version for all iOS targets][rust#87699] - [Warnings can be forced on with `--force-warn`.][rust#87472] This feature is primarily intended for usage by `cargo fix`, rather than end users. - [Promote `aarch64-apple-ios-sim` to Tier 2\*.][rust#87760] - [Add `powerpc-unknown-freebsd` at Tier 3\*.][rust#87370] - [Add `riscv32imc-esp-espidf` at Tier 3\*.][rust#87666] \* Refer to Rust's [platform support page][platform-support-doc] for more information on Rust's tiered platform support. Libraries --------- - [Allow writing of incomplete UTF-8 sequences via stdout/stderr on Windows.] [rust#83342] The Windows console still requires valid Unicode, but this change allows splitting a UTF-8 character across multiple write calls. This allows, for instance, programs that just read and write data buffers (e.g. copying a file to stdout) without regard for Unicode or character boundaries. - [Prefer `AtomicU{64,128}` over Mutex for Instant backsliding protection.] [rust#83093] For this use case, atomics scale much better under contention. - [Implement `Extend<(A, B)>` for `(Extend<A>, Extend<B>)`][rust#85835] - [impl Default, Copy, Clone for std::io::Sink and std::io::Empty][rust#86744] - [`impl From<[(K, V); N]>` for all collections.][rust#84111] - [Remove `P: Unpin` bound on impl Future for Pin.][rust#81363] - [Treat invalid environment variable names as non-existent.][rust#86183] Previously, the environment functions would panic if given a variable name with an internal null character or equal sign (`=`). Now, these functions will just treat such names as non-existent variables, since the OS cannot represent the existence of a variable with such a name. Stabilised APIs --------------- - [`std::os::unix::fs::chroot`] - [`UnsafeCell::raw_get`] - [`BufWriter::into_parts`] - [`core::panic::{UnwindSafe, RefUnwindSafe, AssertUnwindSafe}`] These APIs were previously stable in `std`, but are now also available in `core`. - [`Vec::shrink_to`] - [`String::shrink_to`] - [`OsString::shrink_to`] - [`PathBuf::shrink_to`] - [`BinaryHeap::shrink_to`] - [`VecDeque::shrink_to`] - [`HashMap::shrink_to`] - [`HashSet::shrink_to`] These APIs are now usable in const contexts: - [`std::mem::transmute`] - [`[T]::first`][`slice::first`] - [`[T]::split_first`][`slice::split_first`] - [`[T]::last`][`slice::last`] - [`[T]::split_last`][`slice::split_last`] Cargo ----- - [Cargo supports specifying a minimum supported Rust version in Cargo.toml.] [`rust-version`] This has no effect at present on dependency version selection. We encourage crates to specify their minimum supported Rust version, and we encourage CI systems that support Rust code to include a crate's specified minimum version in the text matrix for that crate by default. Compatibility notes ------------------- - [Update to new argument parsing rules on Windows.][rust#87580] This adjusts Rust's standard library to match the behavior of the standard libraries for C/C++. The rules have changed slightly over time, and this PR brings us to the latest set of rules (changed in 2008). - [Disallow the aapcs calling convention on aarch64][rust#88399] This was already not supported by LLVM; this change surfaces this lack of support with a better error message. - [Make `SEMICOLON_IN_EXPRESSIONS_FROM_MACROS` warn by default][rust#87385] - [Warn when an escaped newline skips multiple lines.][rust#87671] - [Calls to `libc::getpid` / `std::process::id` from `Command::pre_exec` may return different values on glibc <= 2.24.][rust#81825] Rust now invokes the `clone3` system call directly, when available, to use new functionality available via that system call. Older versions of glibc cache the result of `getpid`, and only update that cache when calling glibc's clone/fork functions, so a direct system call bypasses that cache update. glibc 2.25 and newer no longer cache `getpid` for exactly this reason. Internal changes ---------------- These changes provide no direct user facing benefits, but represent significant improvements to the internals and overall performance of rustc and related tools. - [LLVM is compiled with PGO in published x86_64-unknown-linux-gnu artifacts.] [rust#88069] This improves the performance of most Rust builds. - [Unify representation of macros in internal data structures.][rust#88019] This change fixes a host of bugs with the handling of macros by the compiler, as well as rustdoc. [`std::os::unix::fs::chroot`]: https://doc.rust-lang.org/stable/std/os/unix/fs/fn.chroot.html [`Iterator::intersperse`]: https://doc.rust-lang.org/stable/std/iter/trait.Iterator.html#method.intersperse [`Iterator::intersperse_with`]: https://doc.rust-lang.org/stable/std/iter/trait.Iterator.html#method.intersperse [`UnsafeCell::raw_get`]: https://doc.rust-lang.org/stable/std/cell/struct.UnsafeCell.html#method.raw_get [`BufWriter::into_parts`]: https://doc.rust-lang.org/stable/std/io/struct.BufWriter.html#method.into_parts [`core::panic::{UnwindSafe, RefUnwindSafe, AssertUnwindSafe}`]: rust-lang/rust#84662 [`Vec::shrink_to`]: https://doc.rust-lang.org/stable/std/vec/struct.Vec.html#method.shrink_to [`String::shrink_to`]: https://doc.rust-lang.org/stable/std/string/struct.String.html#method.shrink_to [`OsString::shrink_to`]: https://doc.rust-lang.org/stable/std/ffi/struct.OsString.html#method.shrink_to [`PathBuf::shrink_to`]: https://doc.rust-lang.org/stable/std/path/struct.PathBuf.html#method.shrink_to [`BinaryHeap::shrink_to`]: https://doc.rust-lang.org/stable/std/collections/struct.BinaryHeap.html#method.shrink_to [`VecDeque::shrink_to`]: https://doc.rust-lang.org/stable/std/collections/struct.VecDeque.html#method.shrink_to [`HashMap::shrink_to`]: https://doc.rust-lang.org/stable/std/collections/hash_map/struct.HashMap.html#method.shrink_to [`HashSet::shrink_to`]: https://doc.rust-lang.org/stable/std/collections/hash_set/struct.HashSet.html#method.shrink_to [`std::mem::transmute`]: https://doc.rust-lang.org/stable/std/mem/fn.transmute.html [`slice::first`]: https://doc.rust-lang.org/stable/std/primitive.slice.html#method.first [`slice::split_first`]: https://doc.rust-lang.org/stable/std/primitive.slice.html#method.split_first [`slice::last`]: https://doc.rust-lang.org/stable/std/primitive.slice.html#method.last [`slice::split_last`]: https://doc.rust-lang.org/stable/std/primitive.slice.html#method.split_last [`rust-version`]: https://doc.rust-lang.org/nightly/cargo/reference/manifest.html#the-rust-version-field [rust#87671]: rust-lang/rust#87671 [rust#86183]: rust-lang/rust#86183 [rust#87385]: rust-lang/rust#87385 [rust#88100]: rust-lang/rust#88100 [rust#86860]: rust-lang/rust#86860 [rust#84039]: rust-lang/rust#84039 [rust#86492]: rust-lang/rust#86492 [rust#88363]: rust-lang/rust#88363 [rust#85305]: rust-lang/rust#85305 [rust#87832]: rust-lang/rust#87832 [rust#88069]: rust-lang/rust#88069 [rust#87472]: rust-lang/rust#87472 [rust#87699]: rust-lang/rust#87699 [rust#87570]: rust-lang/rust#87570 [rust#88023]: rust-lang/rust#88023 [rust#87760]: rust-lang/rust#87760 [rust#87370]: rust-lang/rust#87370 [rust#87580]: rust-lang/rust#87580 [rust#83342]: rust-lang/rust#83342 [rust#83093]: rust-lang/rust#83093 [rust#88177]: rust-lang/rust#88177 [rust#88548]: rust-lang/rust#88548 [rust#88551]: rust-lang/rust#88551 [rust#88299]: rust-lang/rust#88299 [rust#88220]: rust-lang/rust#88220 [rust#85835]: rust-lang/rust#85835 [rust#86879]: rust-lang/rust#86879 [rust#86744]: rust-lang/rust#86744 [rust#84662]: rust-lang/rust#84662 [rust#86593]: rust-lang/rust#86593 [rust#81050]: rust-lang/rust#81050 [rust#81363]: rust-lang/rust#81363 [rust#84111]: rust-lang/rust#84111 [rust#85769]: rust-lang/rust#85769 (comment) [rust#88490]: rust-lang/rust#88490 [rust#88269]: rust-lang/rust#88269 [rust#84176]: rust-lang/rust#84176 [rust#88399]: rust-lang/rust#88399 [rust#88227]: rust-lang/rust#88227 [rust#88200]: rust-lang/rust#88200 [rust#82776]: rust-lang/rust#82776 [rust#88077]: rust-lang/rust#88077 [rust#87728]: rust-lang/rust#87728 [rust#87050]: rust-lang/rust#87050 [rust#87619]: rust-lang/rust#87619 [rust#81825]: rust-lang/rust#81825 (comment) [rust#88019]: rust-lang/rust#88019 [rust#87666]: rust-lang/rust#87666 Version 1.55.0 (2021-09-09) ============================ Language -------- - [You can now write open "from" range patterns (`X..`), which will start at `X` and will end at the maximum value of the integer.][83918] - [You can now explicitly import the prelude of different editions through `std::prelude` (e.g. `use std::prelude::rust_2021::*;`).][86294] Compiler -------- - [Added tier 3\* support for `powerpc64le-unknown-freebsd`.][83572] \* Refer to Rust's [platform support page][platform-support-doc] for more information on Rust's tiered platform support. Libraries --------- - [Updated std's float parsing to use the Eisel-Lemire algorithm.][86761] These improvements should in general provide faster string parsing of floats, no longer reject certain valid floating point values, and reduce the produced code size for non-stripped artifacts. - [`string::Drain` now implements `AsRef<str>` and `AsRef<[u8]>`.][86858] Stabilised APIs --------------- - [`Bound::cloned`] - [`Drain::as_str`] - [`IntoInnerError::into_error`] - [`IntoInnerError::into_parts`] - [`MaybeUninit::assume_init_mut`] - [`MaybeUninit::assume_init_ref`] - [`MaybeUninit::write`] - [`array::map`] - [`ops::ControlFlow`] - [`x86::_bittest`] - [`x86::_bittestandcomplement`] - [`x86::_bittestandreset`] - [`x86::_bittestandset`] - [`x86_64::_bittest64`] - [`x86_64::_bittestandcomplement64`] - [`x86_64::_bittestandreset64`] - [`x86_64::_bittestandset64`] The following previously stable functions are now `const`. - [`str::from_utf8_unchecked`] Cargo ----- - [Cargo will now deduplicate compiler diagnostics to the terminal when invoking rustc in parallel such as when using `cargo test`.][cargo/9675] - [The package definition in `cargo metadata` now includes the `"default_run"` field from the manifest.][cargo/9550] - [Added `cargo d` as an alias for `cargo doc`.][cargo/9680] - [Added `{lib}` as formatting option for `cargo tree` to print the `"lib_name"` of packages.][cargo/9663] Rustdoc ------- - [Added "Go to item on exact match" search option.][85876] - [The "Implementors" section on traits no longer shows redundant method definitions.][85970] - [Trait implementations are toggled open by default.][86260] This should make the implementations more searchable by tools like `CTRL+F` in your browser. - [Intra-doc links should now correctly resolve associated items (e.g. methods) through type aliases.][86334] - [Traits which are marked with `#[doc(hidden)]` will no longer appear in the "Trait Implementations" section.][86513] Compatibility Notes ------------------- - [std functions that return an `io::Error` will no longer use the `ErrorKind::Other` variant.][85746] This is to better reflect that these kinds of errors could be categorised [into newer more specific `ErrorKind` variants][79965], and that they do not represent a user error. - [Using environment variable names with `process::Command` on Windows now behaves as expected.][85270] Previously using envionment variables with `Command` would cause them to be ASCII-uppercased. - [Rustdoc will now warn on using rustdoc lints that aren't prefixed with `rustdoc::`][86849] [86849]: rust-lang/rust#86849 [86513]: rust-lang/rust#86513 [86334]: rust-lang/rust#86334 [86260]: rust-lang/rust#86260 [85970]: rust-lang/rust#85970 [85876]: rust-lang/rust#85876 [83572]: rust-lang/rust#83572 [86294]: rust-lang/rust#86294 [86858]: rust-lang/rust#86858 [86761]: rust-lang/rust#86761 [85769]: rust-lang/rust#85769 [85746]: rust-lang/rust#85746 [85305]: rust-lang/rust#85305 [85270]: rust-lang/rust#85270 [84111]: rust-lang/rust#84111 [83918]: rust-lang/rust#83918 [79965]: rust-lang/rust#79965 [87370]: rust-lang/rust#87370 [87298]: rust-lang/rust#87298 [cargo/9663]: rust-lang/cargo#9663 [cargo/9675]: rust-lang/cargo#9675 [cargo/9550]: rust-lang/cargo#9550 [cargo/9680]: rust-lang/cargo#9680 [cargo/9663]: rust-lang/cargo#9663 [`array::map`]: https://doc.rust-lang.org/stable/std/primitive.array.html#method.map [`Bound::cloned`]: https://doc.rust-lang.org/stable/std/ops/enum.Bound.html#method.cloned [`Drain::as_str`]: https://doc.rust-lang.org/stable/std/string/struct.Drain.html#method.as_str [`IntoInnerError::into_error`]: https://doc.rust-lang.org/stable/std/io/struct.IntoInnerError.html#method.into_error [`IntoInnerError::into_parts`]: https://doc.rust-lang.org/stable/std/io/struct.IntoInnerError.html#method.into_parts [`MaybeUninit::assume_init_mut`]: https://doc.rust-lang.org/stable/std/mem/union.MaybeUninit.html#method.assume_init_mut [`MaybeUninit::assume_init_ref`]: https://doc.rust-lang.org/stable/std/mem/union.MaybeUninit.html#method.assume_init_ref [`MaybeUninit::write`]: https://doc.rust-lang.org/stable/std/mem/union.MaybeUninit.html#method.write [`Seek::rewind`]: https://doc.rust-lang.org/stable/std/io/trait.Seek.html#method.rewind [`ops::ControlFlow`]: https://doc.rust-lang.org/stable/std/ops/enum.ControlFlow.html [`str::from_utf8_unchecked`]: https://doc.rust-lang.org/stable/std/str/fn.from_utf8_unchecked.html [`x86::_bittest`]: https://doc.rust-lang.org/stable/core/arch/x86/fn._bittest.html [`x86::_bittestandcomplement`]: https://doc.rust-lang.org/stable/core/arch/x86/fn._bittestandcomplement.html [`x86::_bittestandreset`]: https://doc.rust-lang.org/stable/core/arch/x86/fn._bittestandreset.html [`x86::_bittestandset`]: https://doc.rust-lang.org/stable/core/arch/x86/fn._bittestandset.html [`x86_64::_bittest64`]: https://doc.rust-lang.org/stable/core/arch/x86_64/fn._bittest64.html [`x86_64::_bittestandcomplement64`]: https://doc.rust-lang.org/stable/core/arch/x86_64/fn._bittestandcomplement64.html [`x86_64::_bittestandreset64`]: https://doc.rust-lang.org/stable/core/arch/x86_64/fn._bittestandreset64.html [`x86_64::_bittestandset64`]: https://doc.rust-lang.org/stable/core/arch/x86_64/fn._bittestandset64.html
…ToArgvW On Windows, the command line arguments of a program are a single WTF-16 encoded string and its up to the program to split it into an array of strings. In C/C++, the entry point of the C runtime takes care of splitting the command line and passing argc/argv to the main function. ziglang#18309 updated ArgIteratorWindows to match the behavior of CommandLineToArgvW, but it turns out that CommandLineToArgvW's behavior does not match the behavior of the C runtime post-2008. In 2008, the C runtime argv splitting changed how it handles consecutive double quotes within a quoted argument (it's now considered an escaped quote, e.g. `"foo""bar"` post-2008 would get parsed into `foo"bar`). This commit makes ArgIteratorWindows match the behavior of the post-2008 C runtime, and adds a standalone test that verifies the behavior matches both the MSVC and MinGW argv splitting exactly in all cases (it checks that randomly generated command line strings get split the same way). The motivation here is roughly the same as when the same change was made in Rust (rust-lang/rust#87580), that is (paraphrased): - Consistent behavior between Zig and modern C/C++ programs - Allows users to escape double quotes in a way that can be more straightforward Additionally, the suggested mitigation for BatBadBut (https://flatt.tech/research/posts/batbadbut-you-cant-securely-execute-commands-on-windows/) relies on the post-2008 argv splitting behavior for roundtripping of the arguments given to `cmd.exe`. Note: it's not necessary for the suggested mitigation to work, but it is necessary for the suggested escaping to be parsed back into the intended argv by ArgIteratorWindows.
…ToArgvW On Windows, the command line arguments of a program are a single WTF-16 encoded string and it's up to the program to split it into an array of strings. In C/C++, the entry point of the C runtime takes care of splitting the command line and passing argc/argv to the main function. ziglang#18309 updated ArgIteratorWindows to match the behavior of CommandLineToArgvW, but it turns out that CommandLineToArgvW's behavior does not match the behavior of the C runtime post-2008. In 2008, the C runtime argv splitting changed how it handles consecutive double quotes within a quoted argument (it's now considered an escaped quote, e.g. `"foo""bar"` post-2008 would get parsed into `foo"bar`), and the rules around argv[0] were also changed. This commit makes ArgIteratorWindows match the behavior of the post-2008 C runtime, and adds a standalone test that verifies the behavior matches both the MSVC and MinGW argv splitting exactly in all cases (it checks that randomly generated command line strings get split the same way). The motivation here is roughly the same as when the same change was made in Rust (rust-lang/rust#87580), that is (paraphrased): - Consistent behavior between Zig and modern C/C++ programs - Allows users to escape double quotes in a way that can be more straightforward Additionally, the suggested mitigation for BatBadBut (https://flatt.tech/research/posts/batbadbut-you-cant-securely-execute-commands-on-windows/) relies on the post-2008 argv splitting behavior for roundtripping of the arguments given to `cmd.exe`. Note: it's not necessary for the suggested mitigation to work, but it is necessary for the suggested escaping to be parsed back into the intended argv by ArgIteratorWindows after being run through a `.bat` file.
…ToArgvW On Windows, the command line arguments of a program are a single WTF-16 encoded string and it's up to the program to split it into an array of strings. In C/C++, the entry point of the C runtime takes care of splitting the command line and passing argc/argv to the main function. ziglang#18309 updated ArgIteratorWindows to match the behavior of CommandLineToArgvW, but it turns out that CommandLineToArgvW's behavior does not match the behavior of the C runtime post-2008. In 2008, the C runtime argv splitting changed how it handles consecutive double quotes within a quoted argument (it's now considered an escaped quote, e.g. `"foo""bar"` post-2008 would get parsed into `foo"bar`), and the rules around argv[0] were also changed. This commit makes ArgIteratorWindows match the behavior of the post-2008 C runtime, and adds a standalone test that verifies the behavior matches both the MSVC and MinGW argv splitting exactly in all cases (it checks that randomly generated command line strings get split the same way). The motivation here is roughly the same as when the same change was made in Rust (rust-lang/rust#87580), that is (paraphrased): - Consistent behavior between Zig and modern C/C++ programs - Allows users to escape double quotes in a way that can be more straightforward Additionally, the suggested mitigation for BatBadBut (https://flatt.tech/research/posts/batbadbut-you-cant-securely-execute-commands-on-windows/) relies on the post-2008 argv splitting behavior for roundtripping of the arguments given to `cmd.exe`. Note: it's not necessary for the suggested mitigation to work, but it is necessary for the suggested escaping to be parsed back into the intended argv by ArgIteratorWindows after being run through a `.bat` file.
…ToArgvW On Windows, the command line arguments of a program are a single WTF-16 encoded string and it's up to the program to split it into an array of strings. In C/C++, the entry point of the C runtime takes care of splitting the command line and passing argc/argv to the main function. ziglang#18309 updated ArgIteratorWindows to match the behavior of CommandLineToArgvW, but it turns out that CommandLineToArgvW's behavior does not match the behavior of the C runtime post-2008. In 2008, the C runtime argv splitting changed how it handles consecutive double quotes within a quoted argument (it's now considered an escaped quote, e.g. `"foo""bar"` post-2008 would get parsed into `foo"bar`), and the rules around argv[0] were also changed. This commit makes ArgIteratorWindows match the behavior of the post-2008 C runtime, and adds a standalone test that verifies the behavior matches both the MSVC and MinGW argv splitting exactly in all cases (it checks that randomly generated command line strings get split the same way). The motivation here is roughly the same as when the same change was made in Rust (rust-lang/rust#87580), that is (paraphrased): - Consistent behavior between Zig and modern C/C++ programs - Allows users to escape double quotes in a way that can be more straightforward Additionally, the suggested mitigation for BatBadBut (https://flatt.tech/research/posts/batbadbut-you-cant-securely-execute-commands-on-windows/) relies on the post-2008 argv splitting behavior for roundtripping of the arguments given to `cmd.exe`. Note: it's not necessary for the suggested mitigation to work, but it is necessary for the suggested escaping to be parsed back into the intended argv by ArgIteratorWindows after being run through a `.bat` file.
…ToArgvW On Windows, the command line arguments of a program are a single WTF-16 encoded string and it's up to the program to split it into an array of strings. In C/C++, the entry point of the C runtime takes care of splitting the command line and passing argc/argv to the main function. ziglang#18309 updated ArgIteratorWindows to match the behavior of CommandLineToArgvW, but it turns out that CommandLineToArgvW's behavior does not match the behavior of the C runtime post-2008. In 2008, the C runtime argv splitting changed how it handles consecutive double quotes within a quoted argument (it's now considered an escaped quote, e.g. `"foo""bar"` post-2008 would get parsed into `foo"bar`), and the rules around argv[0] were also changed. This commit makes ArgIteratorWindows match the behavior of the post-2008 C runtime, and adds a standalone test that verifies the behavior matches both the MSVC and MinGW argv splitting exactly in all cases (it checks that randomly generated command line strings get split the same way). The motivation here is roughly the same as when the same change was made in Rust (rust-lang/rust#87580), that is (paraphrased): - Consistent behavior between Zig and modern C/C++ programs - Allows users to escape double quotes in a way that can be more straightforward Additionally, the suggested mitigation for BatBadBut (https://flatt.tech/research/posts/batbadbut-you-cant-securely-execute-commands-on-windows/) relies on the post-2008 argv splitting behavior for roundtripping of the arguments given to `cmd.exe`. Note: it's not necessary for the suggested mitigation to work, but it is necessary for the suggested escaping to be parsed back into the intended argv by ArgIteratorWindows after being run through a `.bat` file.
…ToArgvW On Windows, the command line arguments of a program are a single WTF-16 encoded string and it's up to the program to split it into an array of strings. In C/C++, the entry point of the C runtime takes care of splitting the command line and passing argc/argv to the main function. ziglang#18309 updated ArgIteratorWindows to match the behavior of CommandLineToArgvW, but it turns out that CommandLineToArgvW's behavior does not match the behavior of the C runtime post-2008. In 2008, the C runtime argv splitting changed how it handles consecutive double quotes within a quoted argument (it's now considered an escaped quote, e.g. `"foo""bar"` post-2008 would get parsed into `foo"bar`), and the rules around argv[0] were also changed. This commit makes ArgIteratorWindows match the behavior of the post-2008 C runtime, and adds a standalone test that verifies the behavior matches both the MSVC and MinGW argv splitting exactly in all cases (it checks that randomly generated command line strings get split the same way). The motivation here is roughly the same as when the same change was made in Rust (rust-lang/rust#87580), that is (paraphrased): - Consistent behavior between Zig and modern C/C++ programs - Allows users to escape double quotes in a way that can be more straightforward Additionally, the suggested mitigation for BatBadBut (https://flatt.tech/research/posts/batbadbut-you-cant-securely-execute-commands-on-windows/) relies on the post-2008 argv splitting behavior for roundtripping of the arguments given to `cmd.exe`. Note: it's not necessary for the suggested mitigation to work, but it is necessary for the suggested escaping to be parsed back into the intended argv by ArgIteratorWindows after being run through a `.bat` file.
Fixes #44650
The Windows command line is passed to applications as a single string which the application then parses to get a list of arguments. The standard rules (as used by C/C++) for parsing the command line have slightly changed over the years, most recently in 2008 which added new escaping rules.
This PR implements the new rules as described on MSDN and further detailed here. It has been tested against the behaviour of C++ by calling a C++ program that outputs its raw command line and the contents of
argv
. See my repo if anyone wants to reproduce my work.For an overview of how this PR changes argument parsing behavior and why we feel it is warranted see #87580 (comment).
For some examples see: #87580 (comment)