Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

curl-impersonate: init at 0.5.4 and replace curl-impersonate-bin #194310

Merged
merged 2 commits into from
Jul 23, 2023

Conversation

lilyinstarlight
Copy link
Member

@lilyinstarlight lilyinstarlight commented Oct 4, 2022

Description of changes

This replaces the curl-impersonate-bin package with a from-source build that should work on more than just x86_64-linux

Notes:

  • There is an update script to regenerate deps.nix since it has to be kept up to date with the dependency versions defined in the makefile
  • There is no generic go modules fetcher so I used (fetchGoModules { ... }).go-modules to get at the internal fetcher and grab the boringssl deps but still letting curl-impersonate's build process actually do the building
  • The TLS fingerprints all match as they are supposed to and this is verified by the source project's regression tests (which create PCAPs and analyze the TLS client hello in them against known signatures)
  • Theoretically this can build on Darwin, but I haven't tried it and I am not certain ca certs would get picked up correctly
    • Edit: It looks like only curl-impersonate-chrome built on Darwin in ofborg and only on aarch64-darwin (due to old sdk on x86_64), and curl-impersonate-ff failed because libtool is missing -static on both for some reason
    • Edit: Darwin works now yay!
  • I submitted the --enable-static fix upstream in PR Fix --disable-static and --enable-static=no lwthiker/curl-impersonate#117 (edit: this has been merged)

cc: @deliciouslytyped @ivan

Closes #169150

Things done
  • Built on platform(s)
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • For non-Linux: Is sandbox = true set in nix.conf? (See Nix manual)
  • Tested, as applicable:
  • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
  • Tested basic functionality of all binary files (usually in ./result/bin/)
  • 22.11 Release Notes (or backporting 22.05 Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
    • (Release notes changes) Ran nixos/doc/manual/md-to-db.sh to update generated release notes
  • Fits CONTRIBUTING.md.

Copy link
Member

@ivan ivan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very cool and it does stuff I didn't know about :)

  1. This built and ran fine on my x86_64 NixOS master, and I didn't see CA issues with either -ff or -chrome.

  2. I didn't notice anything wrong in default.nix, but someone else should take a look.

  3. Shell completions (I tested zsh) seem to be missing for curl-impersonate-ff and curl-impersonate-chrome, but they don't seem important enough to block this.

  4. The update script did not work for me on NixOS master:

nixpkgs/pkgs/tools/networking/curl-impersonate# ./update.py
Traceback (most recent call last):
  File "/home/at/trash/nixpkgs/pkgs/tools/networking/curl-impersonate/./update.py", line 86, in <module>
    pkgpath = findpath(attr)
  File "/home/at/trash/nixpkgs/pkgs/tools/networking/curl-impersonate/./update.py", line 35, in findpath
    path = json.loads(subprocess.run(
  File "/nix/store/ay0kw1fy63pwa8n06wvm89d4p30jsajy-python3-3.10.7/lib/python3.10/subprocess.py", line 524, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['nix', '--extra-experimental-features', 'nix-command', 'eval', '--json', '.#curl-impersonate.meta.position']' returned non-zero exit status 1.

@lilyinstarlight
Copy link
Member Author

lilyinstarlight commented Oct 4, 2022

This is very cool and it does stuff I didn't know about :)

Yeah I had to dive deep into the nixpkgs code to even figure out how to do some of this stuff...

Also I've moved some things around based on the output of nixpkgs-hammering, and I set meta.broken = stdenv.isDarwin for now until the macOS SDK bump is complete and I'll look into fixing it then

  1. Shell completions (I tested zsh) seem to be missing for curl-impersonate-ff and curl-impersonate-chrome, but they don't seem important enough to block this.

I just added a postInstall hook to manually build and patch the completions for each executable and install them now (since the curl-impersonate build system does not seem to provide a way to do this and curl's makefile/perl script both assume an exe name of just curl)

Do completions work for you now?

  1. The update script did not work for me on NixOS master:
nixpkgs/pkgs/tools/networking/curl-impersonate# ./update.py
Traceback (most recent call last):
  File "/home/at/trash/nixpkgs/pkgs/tools/networking/curl-impersonate/./update.py", line 86, in <module>
    pkgpath = findpath(attr)
  File "/home/at/trash/nixpkgs/pkgs/tools/networking/curl-impersonate/./update.py", line 35, in findpath
    path = json.loads(subprocess.run(
  File "/nix/store/ay0kw1fy63pwa8n06wvm89d4p30jsajy-python3-3.10.7/lib/python3.10/subprocess.py", line 524, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['nix', '--extra-experimental-features', 'nix-command', 'eval', '--json', '.#curl-impersonate.meta.position']' returned non-zero exit status 1.

Oops I think I forgot a flag. Can you try it again with the changes I just force-pushed?

pkgs/tools/networking/curl-impersonate/default.nix Outdated Show resolved Hide resolved
Comment on lines 4 to 29
{
"curl-7.84.0.tar.xz" = fetchurl {
url = "https://curl.se/download/curl-7.84.0.tar.xz";
hash = "sha256-LRGLQ/VHv+W66AbY1HtOWW6lslpsHwgK70n7zYF8Xbg=";
};

"brotli-1.0.9.tar.gz" = fetchurl {
url = "https://github.com/google/brotli/archive/refs/tags/v1.0.9.tar.gz";
hash = "sha256-+ejYHQQFumbRgVKa9CozVPg4yTkJX/mZMNpqqc32/kY=";
};

"nss-3.77.tar.gz" = fetchurl {
url = "https://ftp.mozilla.org/pub/security/nss/releases/NSS_3_77_RTM/src/nss-3.77-with-nspr-4.32.tar.gz";
hash = "sha256-j6UEIeWcTk5x0Z8j3WV3uK+xwT4vm38OyxpVrjd1KV8=";
};

"boringssl.zip" = fetchurl {
url = "https://github.com/google/boringssl/archive/3a667d10e94186fd503966f5638e134fe9fb4080.zip";
hash = "sha256-HsDIkd1x5IH49fUF07dJaabMIMsQygW+NI7GneULpA8=";
};

"nghttp2-1.46.0.tar.bz2" = fetchurl {
url = "https://github.com/nghttp2/nghttp2/releases/download/v1.46.0/nghttp2-1.46.0.tar.bz2";
hash = "sha256-moKXjIcAcbdp8n0riBkct3/clFpRwdaFx/YafhP8Ryk=";
};
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we vendor those downloads instead of using ours?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would worry if we changed the versions too much, the TLS fingerprints may no longer match the browsers it is supposed to be impersonating or the patches from the project may not apply cleanly. Admittedly I did not test that theory, so it is possible using our dependencies could be fine

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, I just noticed this thread (I'm the author of curl-impersonate). I believe you can use whatever nghttp2 and brotli versions you'd like without it affecting the TLS signatures. I didn't see a case yet where it mattered (when I started curl-impersonate I didn't know that hence the fixed versions). Regarding nss and boringssl, it is important to use known working versions, otherwise the TLS signature might change (new TLS extensions, different default ciphers, and so on).

@@ -0,0 +1,104 @@
#!/usr/bin/env nix-shell
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we doing anything complicated in here that would be not that easy with bash, curl and jq?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you could write this in bash, but when I started trying to parse the makefile in bash the code got large and annoying real fast (so I switched to Python). If it's really desired, I can probably rewrite it in bash when I have time

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that we don't need to fetch other libraries the default nix-update $name might already be enough.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do need to fetch at least BoringSSL and NSS, as @lwthiker mentioned it being sensitive to those versions, and we would still have to parse the makefile to know where to name the sources for nghttp2 and brotli so that the build script can configure them

I'm not sure trying to avoid vendoring those two dependency source URLs is worth it either given it seems to require building them from source anyway, and if we use all of the vendored deps rather than some, automation is somewhat easier for package updates and maintenance burden is lower

I've rewritten the updater script in bash now, though, if that helps

Any thoughts @SuperSandro2000? (thanks you again for the reviews!)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We want to use as little vendored libraries as possible because over time they are usually neglected and they never have nix specific patches which might be very important.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get that, but to avoid vendoring we would need to also seprately build compatible libs (with some of the curl-impersonate patches for at least boringssl) and bypass the curl-impersonate makefile entirely to write our own build logic (since it expects to build libs with custom options itself)

I'll assess how much work it would be sometime in the next few days to get an idea of what it would take to implement (including upfront difficulty in rewriting all of the relevant source files and in continued maintenance costs)

Also yeah as it is now the static build of NSS in this package is missing the patches to set an NSS_LIB_DIR (even if that doesn't really provide any value to curl-impersonate). I don't remember other Nix-specific patches this would be missing off the top of my head, but there may be one in boringssl to set a NIX_SSL_DIR, idk

@lilyinstarlight lilyinstarlight force-pushed the pkg/curl-impersonate branch 2 times, most recently from 80d3dd7 to bdf2f75 Compare October 4, 2022 13:44
@ivan
Copy link
Member

ivan commented Oct 4, 2022

Oops I think I forgot a flag. Can you try it again with the changes I just force-pushed?

It still wasn't working for me, but with

diff --git a/pkgs/tools/networking/curl-impersonate/update.py b/pkgs/tools/networking/curl-impersonate/update.py
index 34fc3e91f34..09e777f812f 100755
--- a/pkgs/tools/networking/curl-impersonate/update.py
+++ b/pkgs/tools/networking/curl-impersonate/update.py
@@ -35,9 +35,12 @@ def findpath(attr):
     path = json.loads(subprocess.run(
         ['nix', '--extra-experimental-features', 'nix-command flakes', 'eval', '--json', f'.#{attr}.meta.position'],
         capture_output=True, check=True).stdout).split(':', 1)[0]
-    outpath = json.loads(subprocess.run(
+    out = subprocess.run(
         ['nix', '--extra-experimental-features', 'nix-command flakes', 'eval', '--json', '--impure', '--expr', 'builtins.fetchGit ./.'],
-        capture_output=True).stdout)
+        capture_output=True)
+    print(f"stdout is: {out.stdout}")
+    print(f"stderr is: {out.stderr}")
+    outpath = json.loads(out.stdout)
     if outpath:
         path = path.replace(outpath, os.getcwd())

I saw

  0 at@dev:/home/at/code/system/nixpkgs/pkgs/tools/networking/curl-impersonate# ./update.py
stdout is: b''
stderr is: b"fatal: '/home/at/code/system/nixpkgs/pkgs/tools/networking/curl-impersonate' does not appear to be a git repository\nfatal: Could not read from remote repository.ou have the correct access rights\nand the repository exists.\nwarning: could not read HEAD ref from repo at 'file:///home/at/code/system/nixpkgs/pkgs/tools/networking/curl-imter'\nfatal: '/home/at/code/system/nixpkgs/pkgs/tools/networking/curl-impersonate' does not appear to be a git repository\nfatal: Could not read from remote repository.\n\nPle the correct access rights\nand the repository exists.\nerror: program 'git' failed with exit code 128\n(use '--show-trace' to show detailed location information)\n"
Traceback (most recent call last):
  File "/home/at/code/system/nixpkgs/pkgs/tools/networking/curl-impersonate/./update.py", line 89, in <module>
    pkgpath = findpath(attr)
  File "/home/at/code/system/nixpkgs/pkgs/tools/networking/curl-impersonate/./update.py", line 43, in findpath
    outpath = json.loads(out.stdout)
  File "/nix/store/ay0kw1fy63pwa8n06wvm89d4p30jsajy-python3-3.10.7/lib/python3.10/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/nix/store/ay0kw1fy63pwa8n06wvm89d4p30jsajy-python3-3.10.7/lib/python3.10/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/nix/store/ay0kw1fy63pwa8n06wvm89d4p30jsajy-python3-3.10.7/lib/python3.10/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

and figured out that it does work if you change directory to the nixpkgs root:

  0 at@dev:/home/at/code/system/nixpkgs/pkgs/tools/networking/curl-impersonate# ..
  0 at@dev:/home/at/code/system/nixpkgs/pkgs/tools/networking# ..
  0 at@dev:/home/at/code/system/nixpkgs/pkgs/tools# ..
  0 at@dev:/home/at/code/system/nixpkgs/pkgs# ..
  0 at@dev:/home/at/code/system/nixpkgs:prime# ./pkgs/tools/networking/curl-impersonate/update.py
stdout is: b'"/nix/store/bn0q0s3rb7n05xc47dj2i1v01mhibymg-source"\n'
stderr is: b"warning: Git tree '/home/at/code/system/nixpkgs' is dirty\n"
update-source-version: New version same as old version, nothing to do.
update.py: New vendorHash same as old vendorHash, nothing to do.

@ivan
Copy link
Member

ivan commented Oct 4, 2022

Do completions work for you now?

Yes, thanks! Tested zsh again and they are present for all the binaries.

@lilyinstarlight
Copy link
Member Author

lilyinstarlight commented Oct 4, 2022

It still wasn't working for me, but with

[snip]

I saw

[snip]

and figured out that it does work if you change directory to the nixpkgs root:

[snip]

Ah, yeah it expects to be run from the root of the repo. I'll see if I can make an adjustment later when I'm at my computer again to fix that and make the script slightly more robust. Thank you for testing it!

Edit: I had a minute so I implemented some updates to it for robustness -- it should be good now when run from anywhere in the repo

@lilyinstarlight lilyinstarlight force-pushed the pkg/curl-impersonate branch 3 times, most recently from 2cab60b to 9a4d2c2 Compare October 5, 2022 12:59
@lilyinstarlight
Copy link
Member Author

lilyinstarlight commented Oct 5, 2022

I've done a little bit more of refactoring/cleaning up things to ensure the little details work (e.g. getting correct rpath if someone tries to use curl-impersonate-ff-config to link against the libcurl in the package or tries to run nix run nixpkgs#curl-impersonate.curl-impersonate-ff)

I think with that this is in good shape now that all 4 OfBorg builds pass, including the NixOS test which verifies TLS fingerprints from PCAPs against known browser signatures that passes on the Linux architectures

One question I have:

Is there a preference for attrs when doing passthru for the firefox/chrome derivations whether they are like (the current) curl-impersonate.curl-impersonate-ff or is something like curl-impersonate.ff preferable?

Or alternatively, should those inner passthru derivations be hoisted to the top-level package set with an inherit (curl-impersonate) curl-impersonate-ff curl-impersonate-chrome; in all-packages.nix?

Also thank you both for the reviews so far, I greatly appreciate them! :)

@lilyinstarlight
Copy link
Member Author

lilyinstarlight commented Oct 7, 2022

I decided on the latter option (hoisting curl-impersonate-{chrome,ff} to top-level attrset) and fixed a few other things for macOS

I've successfully tested this on macOS now, but curl-impersonate-ff won't find the CA store by default until #194844 lands in master. That PR is not a blocker though, and this PR can be merged now independently of that one

@lilyinstarlight
Copy link
Member Author

Those are all actions we need to check once in a while which makes more work. And the hardest part is remember doing that and I can guarantee you that it will be slowly forgotten in a while. I personally have zero trust in upstreams to properly update dependencies.

Do you want me to put a check for it in the update script so that r-ryantm can handle it? I don't think I can set meta.insecure and have r-ryantm still PR it (since it would break eval), but I can have it put a comment for the reviewer to mark it insecure. I truly get your concern and I don't want to have to have something be done manually or as it is remembered, but I'm having trouble understanding what you are suggesting I do about it then

If a package needs a dependency with custom patches then we can either use the vendored one or apply the patch on top of our package.

Applying the patches to our own versions would require even more custom and fragile makefile parsing against upstream and would require manual intervention for each update, but I suppose it can be done -- the problems below still would matter though

There is no technically reason to not use our libraries, or is there other than it could fail with later updates? That would apply to probably every other package, too, so that is acceptable.

Yes there are several technical reasons, I'll condense everything brought up in reviews and comments above into a numbered list here:

  1. The upstream build system builds these libraries statically from source (like how Rust or Golang do via locked dependency versions, even if this can be less than ideal)
  2. We could override versions with a bit of creativity to match the versions in nixpkgs, however given the upstream build system applies some patches to e.g. boringssl, this would require manual intervention a lot of/most of the time during updates and would still have problem 5 below
    • This is the most doable thing if problem 5 is not a concern, but if we do not solve problem 5 I'll only have limited capacity to perform intensive updates when it inevitably breaks and I may not be able to get to it (since this is just a leaf package that nothing else depends on -- more critical libraries I'd dedicate a lot more time to)
  3. We cannot just use the dependencies from nixpkgs directly without redoing upstream's build system to accept linking against pre-built .so's for these libraries and would require specialized and overridden versions of those packages to make sure all of the relevant stuff is turned on or off in the dependencies meson/autotools configuration
    • Without solving this, you will get zero benefit from nix-specific patches to these dependencies in nixpkgs since they will be unused and only source version numbers would be changed
  4. Making all of the above less difficult would require significant patches to the build system that diverge from upstream, which they are unlikely to accept given it goes against the determinism that project seeks to achieve
    • I do not like maintaining patches that significantly diverge from upstream in nixpkgs
  5. The tests are highly likely to start failing if some dependencies like NSS and boringssl don't match version exactly
    • The whole point of the project is to have exact TLS fingerprints (and exact HTTP request header fingerprints), but updates to TLS libraries will cause different TLS extensions and whatnot to be used as they are added and enabled by default and would break the fingerprints that the tests check

@SuperSandro2000, I really do understand your concerns, but I do not know of a way or am not skilled enough to address these in a way that doesn't make this package unmaintainable immediately after being merged

If that just means this does not get merged into nixpkgs and we stick with a prebuilt binary that has similar problems instead (but not tests), then so be it I suppose since your concerns are valid and I don't know what we can do about them

Also, I do not want this response to come off as antagonistic, as I do genuinely appreciate all of your reviews and insights on my PRs, so I apologize if it does come off that way. There just seems to be critical miscommunications about what the issue here is. Thank you again for reviewing and continuing to discuss this stuff ❤️

@deliciouslytyped
Copy link
Contributor

deliciouslytyped commented Nov 17, 2022

Just merge this already. It's a complicated package and it has tests.

@lilyinstarlight
Copy link
Member Author

lilyinstarlight commented Dec 18, 2022

That last push was just a minor nit in the NixOS test, but functionally nothing is different. I only changed #!/bin/sh in the test script to #!${pkgs.runtimeShell}

Edit: I forget that pkgs.writeShellScript exists and automatically uses pkgs.runtimeShell so I just pushed again to use that instead

@lilyinstarlight
Copy link
Member Author

I pushed once to use nativeBuildInputs for the runCommand cert stuff for the NixOS test. I rebased and then pushed again since this had a merge conflict with master (so it should merge cleanly again)

@lilyinstarlight lilyinstarlight changed the title curl-impersonate: init at 0.5.3 and replace curl-impersonate-bin curl-impersonate: init at 0.5.4 and replace curl-impersonate-bin Feb 26, 2023
@7c6f434c
Copy link
Member

7c6f434c commented Jul 17, 2023

There is no technically reason to not use our libraries, or is there other than it could fail with later updates?

Well, as the point of the package is fingerprint falsification, without comprehensive fingerprint tests we don't know if a no-downside (in the cooperative settings) improvement in a dependency breaks the illusion.

(This, though, also makes availability of a binary package useful — even though it's great to have a source build too)

@7c6f434c
Copy link
Member

@Janik-Haag When are you planning to review this? Because I think I'll merge this soon.

@lilyinstarlight
Copy link
Member Author

@ofborg eval

I have this in my personal overlay and it runs the tests so I know those still pass, but I'll request a fresh eval from ofborg just in case there's some other issue, since the last ofborg eval is fairly stale now

@Janik-Haag
Copy link
Member

@7c6f434c feel free to merge

{ fetchurl }:

{
"curl-7.84.0.tar.xz" = fetchurl {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This version has 17 known CVEs https://curl.se/docs/security.html

Are we just going to ignore them or do we want to mark them in meta.knownVulnerabilties?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it is safe to discard most of them based on the use case (only the protocols the imitated browsers support have any relevance, and I'd argue the use case is even outside the prime usefulness of HSTS support), but indeed there are some unfortunate things, like today's CVE-2023-32001 (well it also applies to our main curl, we'll see how long it takes for the fix to reach the channel).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with adding meta.knownVulnerabilities if they are relevant to the build configuration options curl-impersonate uses. I'll investigate and add any relevant ones today or tomorrow

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only relevant ones for how curl is built that are worth a meta.knownVulnerabilties that I could find from https://curl.se/docs/vuln-7.84.0.html are:

I'll push in a minute to update this PR for this and the removal of pytest from PYTHONPATH

nixos/tests/curl-impersonate.nix Outdated Show resolved Hide resolved
nativeBuildInputs = [ unzip ];

proxyVendor = true;
}).go-modules;
Copy link
Member

@SuperSandro2000 SuperSandro2000 Jul 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC this go renamed on staging to goModules

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checked: that's now in staging-next even. Unfortunate timing indeed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to know, I missed that change

We can either merge as-is and I open a new PR to staging-next after master gets merged into it to update the usage here, or we can wait for staging and then I rebase

I don't have much preference either way

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer the first option with merge as-is.

@7c6f434c 7c6f434c merged commit b02fd49 into NixOS:master Jul 23, 2023
8 of 9 checks passed
@7c6f434c
Copy link
Member

(Looks like master→staging-next merge today was after this merge)

@lilyinstarlight
Copy link
Member Author

Staging-next followup in #245070

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Package request: curl-impersonate
8 participants