Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[std.http] zig build fails to download large archives from github #15342

Closed
Tracked by #16672
LordMZTE opened this issue Apr 17, 2023 · 11 comments · Fixed by #16990
Closed
Tracked by #16672

[std.http] zig build fails to download large archives from github #15342

LordMZTE opened this issue Apr 17, 2023 · 11 comments · Fixed by #16990
Labels
bug Observed behavior contradicts documented or intended behavior zig build system std.Build, the build runner, `zig build` subcommand, package management
Milestone

Comments

@LordMZTE
Copy link
Contributor

LordMZTE commented Apr 17, 2023

Zig Version

0.11.0-dev.2648+3cd19dd89

Steps to Reproduce and Observed Behavior

Note: This is almost certainly a bug in std.http rather than the build system.

  1. Create a minimal build.zig:
const std = @import("std");

pub fn build(b: *std.Build) void {
    _ = b;
}
  1. Create a build.zig.zon with a link to reproduce the bug:
.{
    .name = "test",
    .version = "0.0.0",

    .dependencies = .{
        .test_dep = .{
            // The code of this repo is likely not important, it's just a link that allows to reproduce this.
            .url = "https://github.com/lordmzte/dawn/archive/9e2a193db129ebaf42eccfb9d6bb40055645c5d9.tar.gz",
        },
    },
}
  1. zig build

  2. The build will fail with one of two errors, seemingly at random:

  • error: TarComponentsOutsideStrippedPrefix
  • error: BadReaderState

Note that the tar archive the link points to is very large (>90MB). I suspect that this causes the GitHub server to use chunked encoding, which may cause either invalid data or a crash in the HTTP implementation in the two possible cases.

This is the stack trace produced by a debug build of Zig in the former case:

error: TarComponentsOutsideStrippedPrefix
/home/lordmzte/dev/zig/lib/std/tar.zig:181:13: 0x6000f17 in stripComponents (zig)
            return error.TarComponentsOutsideStrippedPrefix;
            ^
/home/lordmzte/dev/zig/lib/std/tar.zig:130:35: 0x6001c2c in pipeToFileSystem__anon_49378 (zig)
                const file_name = try stripComponents(unstripped_file_name, options.strip_components);
                                  ^
/home/lordmzte/dev/zig/src/Package.zig:561:5: 0x6002a57 in unpackTarball__anon_49145 (zig)
    try std.tar.pipeToFileSystem(out_dir, decompress.reader(), .{
    ^
/home/lordmzte/dev/zig/src/Package.zig:490:13: 0x60139ea in fetchAndUnpack (zig)
            try unpackTarball(gpa, &req, tmp_directory.handle, std.compress.gzip);
            ^
/home/lordmzte/dev/zig/src/Package.zig:282:25: 0x6018e6b in fetchAndAddDependencies (zig)
        const sub_pkg = try fetchAndUnpack(
                        ^
/home/lordmzte/dev/zig/src/main.zig:4402:13: 0x5e7a46a in cmdBuild (zig)
            try fetch_result;
            ^
/home/lordmzte/dev/zig/src/main.zig:298:9: 0x5e46dfb in mainArgs (zig)
        return cmdBuild(gpa, arena, cmd_args);
        ^
/home/lordmzte/dev/zig/src/main.zig:211:5: 0x5e458eb in main (zig)
    return mainArgs(gpa, arena, args);
    ^

Expected Behavior

The dependency is downloaded.

@LordMZTE LordMZTE added the bug Observed behavior contradicts documented or intended behavior label Apr 17, 2023
@Vexu Vexu added the zig build system std.Build, the build runner, `zig build` subcommand, package management label Apr 17, 2023
@Vexu Vexu added this to the 0.12.0 milestone Apr 17, 2023
@dadrian
Copy link

dadrian commented Jun 11, 2023

I don't have a minimally failing example, but I think I'm seeing the same thing directly in the HTTP client if I get a response that is both chunked and gzip'd. Posting just in case this is helpful.

Snipped loop, reading an HTTP response that has already been wait'd,

            var out = ArrayList(u8).init(allocator);
            var total: usize = 0;
            var buf: [4096]u8 = undefined;
            var w = out.writer();
            while (true) {
                const n = try req.readAll(&buf);
                total += n;
                if (n == 0) break;
                try w.writeAll(buf[0..n]);
            }

Ends with a GZIP error when it tries to read one more checksum byte after already reaching the end of its input stream.

C:\Users\david\scoop\apps\zig-dev\current\lib\std\io\reader.zig:56:37: 0x7ff69b7c954b in readNoEof (load.exe.obj)
            if (amt_read < buf.len) return error.EndOfStream;
                                    ^
C:\Users\david\scoop\apps\zig-dev\current\lib\std\io\reader.zig:256:13: 0x7ff69b7d21dd in readBytesNoEof__anon_10654 (load.exe.obj)
            try self.readNoEof(&bytes);
            ^
C:\Users\david\scoop\apps\zig-dev\current\lib\std\io\reader.zig:294:27: 0x7ff69b7d4201 in readIntLittle__anon_10657 (load.exe.obj)
            const bytes = try self.readBytesNoEof((@typeInfo(T).Int.bits + 7) / 8);
                          ^
C:\Users\david\scoop\apps\zig-dev\current\lib\std\compress\gzip.zig:136:26: 0x7ff69b7807fa in read (load.exe.obj)
            const hash = try self.in_reader.readIntLittle(u32);
                         ^
C:\Users\david\scoop\apps\zig-dev\current\lib\std\http\Client.zig:756:54: 0x7ff69b767cdd in read (load.exe.obj)
            .gzip => |*gzip| gzip.read(buffer) catch return error.DecompressionFailure,
                                                     ^
C:\Users\david\scoop\apps\zig-dev\current\lib\std\http\Client.zig:787:25: 0x7ff69b767ad2 in readAll (load.exe.obj)
            const amt = try read(req, buffer[index..]);

C:\Users\david\src\damp\zig\s3.zig:283:27: 0x7ff69b7689e8 in call (load.exe.obj)
                const n = try req.readAll(&buf);
                
[snip]

The actual response size is only 373 bytes, so the loop should only execute once in normal execution.

@luizberti
Copy link
Contributor

I don't have a minimally failing example

I do, and it is Zig itself :)

.{
    .name = "foo",
    .version = "0.0.0",
    .dependencies = .{
        .zig = .{
            .url = "https://github.com/ziglang/zig/archive/bf827d0b555df47ad2a2ea2062e2c855255c74d1.tar.gz",
        },
    },
}

@andrewrk andrewrk modified the milestones: 0.13.0, 0.11.0 Jul 17, 2023
@andrewrk andrewrk modified the milestones: 0.11.0, 0.11.1 Jul 24, 2023
@mitchellh
Copy link
Contributor

Noting that this is blocking Ghostty from adopting the package manager. The following projects are all too build to big:

  • libxml2
  • harfbuzz
  • Mach's xcode-frameworks

@truemedian
Copy link
Contributor

This issue should be renamed, the std.http bug has been fixed for a while now. It's now a std.tar issue.

189900 added a commit to 189900/zig that referenced this issue Aug 27, 2023
Handles .extended_header type to parse PAX attributes and check if they override
the path of the next file. Increases maximum file path length from 255 to 1014.

Fixes ziglang#15342
@189900
Copy link
Contributor

189900 commented Aug 27, 2023

The reproduction tar (link provided above) contains PAX attributes.
These attributes store file paths that don't fit in the header, specifically:

zig-bf827d0b555df47ad2a2ea2062e2c855255c74d1/test/cases/compile_errors/assigning_to_struct_or_union_fields_that_are_not_optionals_with_a_function_that_returns_an_optional.zig
zig-bf827d0b555df47ad2a2ea2062e2c855255c74d1/test/cases/compile_errors/regression_test_2980_base_type_u32_is_not_type_checked_properly_when_assigning_a_value_within_a_struct.zig

#16990 adds support for this.

189900 added a commit to 189900/zig that referenced this issue Aug 28, 2023
Handles .extended_header type to parse PAX attributes and check if they override
the path of the next file. Increases maximum file path length from 255 to 1014.

Fixes ziglang#15342
189900 added a commit to 189900/zig that referenced this issue Aug 29, 2023
Handles .extended_header type to parse PAX attributes and check if they override
the path of the next file. Increases file path limit to std.fs.MAX_PATH_BYTES.

Fixes ziglang#15342
@dadrian
Copy link

dadrian commented Aug 30, 2023

What was the issue / fix / PR for the HTTP part of the issue?

@truemedian
Copy link
Contributor

If I remember correctly #15445 fixed the problem encountered here, the redirect was causing the issue.

Vexu pushed a commit that referenced this issue Sep 8, 2023
Handles .extended_header type to parse PAX attributes and check if they override
the path of the next file. Increases file path limit to std.fs.MAX_PATH_BYTES.

Fixes #15342
@winterqt
Copy link
Contributor

If I remember correctly #15445 fixed the problem encountered here, the redirect was causing the issue.

@truemedian Do you remember/know how #15376 and various other issues only mention EndOfStream being encountered for the HTTP issue, while this issue's OP mentions TarComponentsOutsideStrippedPrefix and BadReaderState?

Is it possible that this issue was always about the PAX attributes, and the HTTP issue was merely a red herring?

@truemedian
Copy link
Contributor

truemedian commented Sep 11, 2023

The HTTP issue was certainly happening for everyone who tried to make a request to github.com/*/archive, because they all redirect to the github CDN. But that issue has been fixed for the past 5 months.

@winterqt
Copy link
Contributor

winterqt commented Sep 11, 2023

Right, but the redirect issue was fixed after this issue was filed.

The other issue about the redirect bug mentions EndOfStream, not any of the two errors that this one points to (TarComponentsOutsideStrippedPrefix and BadReaderState).

This makes me wonder if the HTTP bug was really related here (even though it should be because it uses the same GitHub archive link in the build.zig.zon).

Guess I could test it by reverting the PAX attribute impl and seeing what gets thrown.

@ivanstepanovftw
Copy link

ivanstepanovftw commented Sep 17, 2023

zig-linux-x86_64-0.12.0-dev.389+61b70778b not resolved =/

Can I specify local folder path instead of using .dependencies .url ? I have very slow internet, possibly slowest you ever can imagine, and I was downloaded git repo, put it into tar.gz folder, opened IntelliJ server to host the archive and I am getting subj. And this error (TarComponentsOutsideStrippedPrefix) does not cares if the archive is small (under 1M) when I cut off large unrelated folders.

The repo I am talking is capy, UI for Zig. I was hoping that I can learn Zig while doing UI while I have worst ever days with slowest possibly internet, but...

@andrewrk andrewrk modified the milestones: 0.11.1, 0.12.0 Jan 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Observed behavior contradicts documented or intended behavior zig build system std.Build, the build runner, `zig build` subcommand, package management
Projects
None yet
Development

Successfully merging a pull request may close this issue.

10 participants