-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Several completion/globbing tests SIGSEGV on i386 #10944
Comments
Okay, how exactly are you printing that? Where are you putting the printlns, what are you printing? I don't believe there's a lot of recursion here? I added a FLOGF to WildCardExpander::expand, and I get a bunch of output there, but if I add another FLOGF to expand::stage_wildcards, you can see that they alternate - stage_wildcards runs expand once per directory in $PATH. With the directory/command created, running PATH="$path/test6.tmp2.\(paren\).dir:$PATH" target/release/fish --no-config --debug complete -c 'complete -C"test6"' I get output like:
Both built as release and as debug. Here's my patch: diff --git a/src/expand.rs b/src/expand.rs
index 99b418b97..14d461367 100644
--- a/src/expand.rs
+++ b/src/expand.rs
@@ -15,6 +15,7 @@ use crate::common::{
use crate::complete::{CompleteFlags, Completion, CompletionList, CompletionReceiver};
use crate::env::{EnvVar, Environment};
use crate::exec::exec_subshell_for_expand;
+use crate::FLOGF;
use crate::future_feature_flags::{feature_test, FeatureFlag};
use crate::history::{history_session_id, History};
use crate::operation_context::OperationContext;
@@ -1476,7 +1477,14 @@ impl<'a, 'b, 'c> Expander<'a, 'b, 'c> {
result = ExpandResult::new(ExpandResultCode::wildcard_no_match);
let mut expanded_recv = out.subreceiver();
+ let mut edstr = WString::new();
+ for ed in &effective_working_dirs {
+ edstr.push_utfstr(L!(":"));
+ edstr.push_utfstr(&ed);
+ }
+ FLOGF!(complete, "working_dirs: '%ls'", edstr);
for effective_working_dir in effective_working_dirs {
+ FLOGF!(complete, "stage_expand: '%ls', '%ls'", path_to_expand, effective_working_dir);
let expand_res = wildcard_expand_string(
&path_to_expand,
&effective_working_dir,
diff --git a/src/wildcard.rs b/src/wildcard.rs
index a84754ac0..93ce3f96f 100644
--- a/src/wildcard.rs
+++ b/src/wildcard.rs
@@ -10,6 +10,7 @@ use crate::common::{
UnescapeStringStyle, WILDCARD_RESERVED_BASE, WSL,
};
use crate::complete::{CompleteFlags, Completion, CompletionReceiver, PROG_COMPLETE_SEP};
+use crate::FLOGF;
use crate::expand::ExpandFlags;
use crate::fallback::wcscasecmp;
use crate::future_feature_flags::feature_test;
@@ -510,6 +511,7 @@ mod expander {
return;
}
+ FLOGF!(complete, "expand: '%ls', '%ls', '%ls'", base_dir, wc, effective_prefix);
// Get the current segment and compute interesting properties about it.
let (wc_segment, wc_remainder) = if let Some(next_slash) = wc.find_char('/') {
let (seg, rem) = wc.split_at(next_slash); |
would be great if we could find an i386 system where we can reproduce this. It would be great if we could get a stacktrace with line numbers; since it only happens on a release build maybe add a print statement before every line to narrow it down. It indeed does not look like infinite recursion (speak up if it is). It also seems unlikely that we're overflowing the stack here. |
At entrypoint: diff --git a/src/wildcard.rs b/src/wildcard.rs
index a84754ac..28fa8382 100644
--- a/src/wildcard.rs
+++ b/src/wildcard.rs
@@ -506,6 +506,7 @@ mod expander {
/// This is usually the same thing as the original wildcard, but for fuzzy matching, we
/// expand intermediate segments. effective_prefix is always either empty, or ends with a slash
pub fn expand(&mut self, base_dir: &wstr, wc: &wstr, effective_prefix: &wstr) {
+ eprintln!("::wildcard::expander::WildCardExpander::expand(base_dir = {base_dir:?}, wc = {wc:?}, effective_prefix = {effective_prefix:?})");
if self.interrupted_or_overflowed() {
return;
}
Indeed. Then it's not a recursion issue, but something else.
i386 is mostly "emulated" on amd64 platforms in Debian these days, including on build daemons and porter boxes. You can create an i386 chroot with With that, I'm also starting to speculate that the emulation might be at fault.. just a guess.
Doing the experiment ™ right now. |
::wildcard::expander::WildCardExpander::expand
probably recurses too much
So uh, the experiment had some interesting results. Running the tests as scripts (
Doing as krobelus suggested, sprinkling line number Lines 729 to 736 in 94dfe1b
With let Some(dev_inode) = entry.dev_inode() else {
eprintln!("INT11");
continue;
};
eprintln!("INT12");
if !self.visited_files.insert(dev_inode) {
eprintln!("INT13");
// Symlink loop! This directory was already visited, so skip it.
continue;
}
eprintln!("INT14"); It SIGSEGV'd right after printing "INT12". That is, Now here's the interesting part: with Now I do want to get my hands on a real i386 machine like krobelus suggested. Again, this might be emulation or otherwise downstream shenanigans, so I don't want to divert too much of your time before really needed. |
Okay, I created a debian chroot - with debootstrap, because that's packaged on archlinux:
installed the required packages and rust via rustup - because this is stable and it would have rust 1.63. And... other than a few permission tests failing because I've not bothered to set up a non-root user in the chroot everything passes. I do not get segfaults. |
Maybe tools like valgrind or AddressSanitizer will flag a problem. Chances are they don't but it should be fairly easy to try
If that doesn't work, I'd probably minimize the crashing program. It is highly likely that it's possible to reduce this to a small, single-file program with basically no dependencies. Reducing the input program step-by-step should be a reliable way to narrow down the problem. |
@nc7s can you give more detail about the environment, OS and hardware here? Is this Debian Experimental? Using the system rustc/cargo? The RelWithDebInfo builds should have line number debug info, but it appears they may not. |
zanchey:
Added in the topic. With krobelus' suggestion, valgrind shows (after the "INT12" mark):
I'll be reducing the program as krobelus suggested. |
Thanks faho for taking the time to check on a different distro. I did the similar, but on Fedora. Debian shipped rustc continues to produce failing results, while rustup downloaded does not. (Though only for the one discussed above; other tests fail for different reasons, incl. excessive output and timeout.) I'll redirect this to our rustc maintainers. Thanks for the time and help. Meanwhile, a fun aside: |
Yeah, that smells like a miscompilation. I'm going to close this, assuming it's in Debian's rustc - there are some patches related to x86 in there. |
It's not rustc; I've compiled on unstable with the system rustc and it passes the tests fine, but the build with the Debian patches applies segfaults. |
where are the Debian patches? (i.e. the ones used in this build from the other thread) |
How did you build it? Are "the Debian patches" those in fish's debian/patches/? |
https://salsa.debian.org/debian/fish/-/tree/master/debian/patches |
I've installed a new Debian unstable tree, unpacked the distro package with It's not the patches; the tests pass ok with those all applied but with the |
OK, I managed to get a build with symbols working. The crash is in Line 729 in b896193
Leaving me very suspicious of https://salsa.debian.org/rust-team/debcargo-conf/-/blob/master/src/libc/debian/patches/time64-10-stat.patch and https://salsa.debian.org/rust-team/debcargo-conf/-/blob/master/src/libc/debian/patches/time64-18-fix-stat-i386.patch |
If we are to raise suspicion over Debian packaged (and patched) dependencies, it would be literally searching for a needle in a haystack.. As for that crash, I've too made it there, after the "printing to sink" trick mentioned above. The patches you mentioned do seem relevant at the surface as they touch the "stat" part, which |
Nice catch - time64-18-fix-stat-i386.patch contributed to everything except complete.fish, and that is fixed if libc has all patches stripped (I didn't bother trying one by one). The maintainer is investigating. Thanks for your sharp senses ;) |
Debian i386 was excluded from the time64 transition. The time64 patches we are carrying for the libc crate should take account of this and be a no-op on i386 but clearly they aren't. I'm currently trying to narrow down which patch is the problem.. |
Bug found and fixed, some padding fields were lost from the definition of struct stat when building with the time64 patches in a non-time64 configuration. Though you probably shouldn't be using struct stat in the first place, you should probably be using struct stat64 and associated APIs for large file support. |
I think that arises because glibc maps |
With libc patches updated by plugwash, it now builds and tests smoothly on i386: https://buildd.debian.org/status/package.php?p=fish&suite=experimental Thank you all for the help! (Fails on mips64el, that should be easy and reserved for later :>) |
No, the underlying syscall is still correctly |
@plugwash Can you provide the upstream discussion link for posterity? |
There isn't really an "upstream discussion link" for this particular issue, I essentially maintain the time64 patches in Debian's rust-libc package. The patches were based on rust-lang/libc#3175 but are not identical to it for several reasons.
There is also a related discussion issue for time64 in rust-libc at http://deb.debian.org/debian/pool/main/k/kdepim-runtime/kdepim-runtime_24.12.0-2.dsc Initially when I uploaded the patches to Debian, the series consisted of the commits from the then-current version of the pull request, plus a Debian specific patch to not enable time64 on i386. After doing so I quickly ran into multiple issues related to stat. One of them with discussed with upstream, but another was not and I just dealt with it downstream. Perhaps they should have been, but I was trying to juggle a lot of different issues at the time. Much later this issue showed up on my rader, I was alerted to it on IRC and after nc7s confirmed it was related to the time64 patches I started working on narrowing down exactly where the problem was. As part of doing so I re-ordered our time64 patch series to make it easier to test things. After confirming it was related to the stat patches, I then wrote some test code to allow me to print out the size of the various stat related structures both with and without the patches applied (which on i386 should be identical, since our patch series does not enable time64 on i386) I determined that "struct stat" had a different size, while the other stat-related structs I looked at seemed ok. So I carefully compared the before and after definitions of struct stat and discovered the missing padding fields. Which I added in https://salsa.debian.org/rust-team/debcargo-conf/-/commit/6f7deb2063cd1b46685bdbde598eccdd816b77b1 I've just made a post to the pull request, giving a link to the "fix-stat-i386" patch and suggesting to upstream that they need to perform verification tests both with and without time64 enabled. |
This is the problem reported in #10633 for 4.0b1, that fish segfaults in tests related to globs on i386.
My development/packaging setup:
Failed tests (see log for output of each test):
With the offending test taken out as a separate .fish script (from tests/checks/complete.fish, CHECK at line 359), running it as
build/fish t.fish
will always SIGSEGV. The core dump shows:Then I went the good ol' way: sprinkling
eprintln!
across the code.With
-DCMAKE_BUILD_TYPE=Debug
, the test somehow passed:With
=RelWithDebInfo
, though, it eprinted the same result, but failed with the same output as in Debian builds (or lack of, that is).This method seems to recurse a lot. On i386 it recursed 10 times. On amd64 it recursed 14 times, though passing the test. In other tests it recurses even more.
This looks like an under-optimized recursion to me, but I don't really know much about this part.
The text was updated successfully, but these errors were encountered: