Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

A different take on git submodule support for flakes #5497

Open
wants to merge 17 commits into
base: master
Choose a base branch
from

Conversation

kaii-zen
Copy link
Contributor

@kaii-zen kaii-zen commented Nov 4, 2021

As the title implies, this is a whole 'nother take. Ça veut dire, it does not interact AT ALL with submodules = true;. It does involve more code though.

Idea

If a tree has a .gitmodules in it, then we:

  • Get the paths where submodules go
  • Get the respective URLs that go with those paths
  • Traverse the git tree to work out the revs
  • Construct flake-friendly URLs (git+xxxx://host/who/what?rev=xxxx etc)
  • Build an attrset that maps paths to flake-friendly URLs
  • Slap this into the attrset returned by fetchTree

That's it. It looks like this:

nix eval --impure --expr 'builtins.fetchTree git+https://github.com/runtimeverification/iele-semantics' | nixfmt
{
  lastModified = 1636045922;
  lastModifiedDate = "20211104171202";
  modules = {
    "deps/secp256k1" =
      "git+https://github.com/bitcoin-core/secp256k1?allRefs=1&rev=f532bdc9f77f7bbf7e93faabfbe9c483f0a9f75f";
    plugin =
      "git+https://github.com/runtimeverification/blockchain-k-plugin?allRefs=1&rev=cc7384e565e4c8df4d17a3330cd9951d32d4830f";
    "tests/ethereum-tests" =
      "git+https://github.com/ethereum/tests.git?allRefs=1&rev=7057d0e021f7dbee37a2ddb6acf34f4946bc157d";
    "web/k-web-theme" =
      "git+https://github.com/runtimeverification/k-web-theme?allRefs=1&rev=189fe6650488f8f04dc89d25a61cdfd888cd9dd8";
  };
  narHash = "sha256-9GDWDomnRyJ47mDbygoBHThq9Q+3JGZDYTIHB8SsMlc=";
  outPath = "/nix/store/6d1w3h7fygfwl1f42x1bx6xhsrsmh7qj-source";
  rev = "431a4a44dad39aedb1dff126475cdce3b8d91e83";
  revCount = 1399;
  shortRev = "431a4a4";
  submodules = false;
}

It's just plain contextless strings so unless I'm fundamentally misunderstanding things, no dragons lie therein.
From here the user is empowered to do as much or as little as she wants with it: use builtins.getFlake for a flake, builtins.fetchTree for a non-flake, or ignore it altogether.

See tests/flakes-with-submodules.sh for a (rudimentary) example of how this might be used within a flake.

This could potentially be used as a basis for a more "flakeful" UX, where submodules behave like indirect (registry) inputs and get populated with the Right Thing if and only if they get added to the outputs function args set... or not.

Since my C++ is rusty and my familiarity with the Nix codebase is flakey (both puns intended btw 🙃), I invite y'all to punch holes in my reasoning and point out any blind spots so that we can address them and hopefully reach a solution that everyone's happy with.

I trust that this would save the sanity of at least a handful of Nixers/Nixresses out there who are forced to interact with git submodules against their better judgement (or in accordance with their poor judgement. to each their own 🤷‍♀️), thereby empowering them to once again pursue their (title-cased) Hopes and Dreams!

kk 'nuff prose. known issues/caveats:

  • this does add the submodule data to the cache, so a rm -rf ~/.cache/nix is required (otherwise you'd get error: input attribute 'modules' is missing)
  • dirty submodule scenario isn't handled yet added, and fixed so that it also works without __contentAddressed = true; (thanks @shlevy!)
  • does not solve the "I want self to have submodules = true;" use case (Support self-fetching parameters in flake.nix #5312)
  • might segfault, murder your kittens or both

Shout out to @cleverca22 for essentially onboarding me to the codebase (... and writing most of the harder parts)

Cheers

Copy link
Contributor

@Kha Kha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very cool. I had something like this in mind as well, but then decided for the simpler approach in #5399 for now.

{
auto rootTree = getRootTree(path, gitrev);
printTalkative("root tree is %s", rootTree);
auto entries = gitTreeList(path, rootTree);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this double git cat-file different from git ls-tree $rev?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Idk
/cc @cleverca22

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like you can just skip directly to ls-tree on a rev

but you still need to recurse manually, git ls-tree $rev $path/ for each dir

return results;
}

std::map<string,string> parseSubmodules(const string & contents) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since .gitmodules is technically an arbitrarily formatted git config file, it might be nicer to have git parse the file into a stricter format, which can also be combined with locating & reading the blob: git config --blob $rev:.gitmodules --list

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a very neat trick thank you!

return results;
}

string findCommitHash(const Path & path, const std::map<string, std::pair<gitType, string>> & _entries, const string & pathToFind) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think git ls-tree $rev $submodulePath gives you the same information.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ahhh, and that solves the recursion i mentioned a few comments up

prefix = prefix + "file://";


attrs.emplace(subPath, prefix + url + "?allRefs=1&rev=" + commitHash);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the submodule commit is available only in the local clone and not in the remote, we need to pass the former + submodule path as url here I think.

Copy link
Contributor

@Kha Kha Nov 5, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually we should probably always do this if the commit is locally available to avoid redundant fetches

@knedlsepp
Copy link
Member

Would this be able to handle nested submodules?

@kaii-zen
Copy link
Contributor Author

kaii-zen commented Nov 5, 2021

Would this be able to handle nested submodules?

Not as conveniently as it would be with simply enabling submodules for self. In my understanding the only way to achieve that is by enabling it across the board, which we cannot because it breaks existing workflows.

The happy use case here is when each of your submodules is a flake itself, handling its own submodules without you having to care about it.

With other use cases, such as requiring the recursively cloned source tree for the build itself, you'd have to write some nix code to manipulate the URL provided into an attrset (until #5434 adds &submodule=1 support) to which you'd // { submodule = true; } before passing it along to builtins.fetchTree. Somewhat ugly but works and can be factored out, tucked away and reused when necessary. IMHO this is still 100% better than the current state of affairs where you're simply out of luck 🤷‍♀️

fi

clearStore
rm -rf $TEST_HOME/.{cache,config}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's clearCache... Why do you need to nuke config?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part was just copied verbatim from flakes.sh

@@ -257,6 +257,13 @@ std::optional<time_t> Input::getLastModified() const
return {};
}

std::optional<std::string> Input::getModules() const
{
if (auto s = maybeGetStrAttr(attrs, "modules"))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm surprised that this works... Isn't modules an attrset? Doesn't maybeGetStrAttr throw if the attribute isn't a string?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's already stored as JSON. fetchTree reconstructs it into an attrset here: https://github.com/NixOS/nix/pull/5497/files#diff-e473b01a82cc31e51baa89b26b4fe5083e56df27cd834b9722789841ef9d5620R38-R50

not the prettiest thing in the world but it works 🤷‍♀️

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😱

Does the process producing the JSON sort keys?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops, you're right. we only sort in fetchTree 😅

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm... looks like nlohmann takes care of it, unless I'm missing something: https://json.nlohmann.me/features/object_order/

@FlorianFranzen
Copy link
Contributor

I rebased this on the latest master and pushed it to my fork under FlorianFranzen/nix/submodules. I have started using this instead of loading submodules by default and it safes me a lot of rebuilds. Would love to have getTree lazyly execute when I access any modules, otherwise I think it is a great first step in submodule support.

@kaii-zen
Copy link
Contributor Author

I rebased this on the latest master and pushed it to my fork under FlorianFranzen/nix/submodules. I have started using this instead of loading submodules by default and it safes me a lot of rebuilds. Would love to have getTree lazyly execute when I access any modules, otherwise I think it is a great first step in submodule support.

We actually did try to implement that using a thunk at some point but it ended up causing an infinite recursion (IIRC) so was reverted and scoped out... December kinda forced a context switch and I haven't had the time to re-wrap my head around this but I really want to see this through as it's been a recurring pain point in my workflow. Anyways I'm happy it's useful. Help me draw more attention 😎

@FlorianFranzen
Copy link
Contributor

FlorianFranzen commented Feb 3, 2022

@cleverca22 Any chance we could get some input on this? I see you did some earlier reviews.

I will have a look at how to implement a lazy version of this. If necessary I could also add an experimental feature flag for it.

I would also like to see this integrated with nix flake metadata.

@@ -37,6 +37,8 @@ path0_=$(nix eval --impure --raw --expr "(builtins.fetchTree { type = \"git\"; u
export _NIX_FORCE_HTTP=1
[[ $(tail -n 1 $path0/hello) = "hello" ]]

ls -l $repo
nix eval --debug --impure --raw --expr "(builtins.fetchGit file://$repo).outPath"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these really needed?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like debugging leftovers

Copy link

@dmadisetti dmadisetti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really excited for this change. Please let us know if any additional support is needed- but this looks fine to me

@@ -37,6 +37,8 @@ path0_=$(nix eval --impure --raw --expr "(builtins.fetchTree { type = \"git\"; u
export _NIX_FORCE_HTTP=1
[[ $(tail -n 1 $path0/hello) = "hello" ]]

ls -l $repo
nix eval --debug --impure --raw --expr "(builtins.fetchGit file://$repo).outPath"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like debugging leftovers

commit
};

string getBlob(const Path & path, const string & gitrev, const string & blobhash) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Forgive my ignorance, but is there a reason this isn't static (given that getRootTree is)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably because I suck at C++. @cleverca22 do you have a better answer? 😅

git -C $flakeWithSubmodules submodule add $nonFlakeDir
git -C $flakeWithSubmodules submodule add $flakeDir
git -C $flakeWithSubmodules add .
git -C $flakeWithSubmodules commit -m'add submodules'

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I were being a Mighty Stickler I'd request a test that declares this feature's behaviour in the presence of recursive submodules. (That is, repo A has a submodule B, which itself has a submodule C.)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Ah, I see you commented above that this isn't really supported. And anyway I wouldn't consider this blocking!)

}

/* Check whether this repo has any commits. There are
probably better ways to do this. */
Copy link

@Smaug123 Smaug123 Apr 27, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Strictly speaking there's a hilarious but more correct answer:

git cat-file --batch-check='%(objecttype)' --batch-all-objects --unordered | grep commit

But I'm sure any sane repo is in fact going to have an entry in /refs/heads.

@Smaug123
Copy link

Does anyone know any blockers for this going in? And/or who has The Power To Merge?

@gregistech
Copy link

Would love to see this implemented for secrets management.

@dmadisetti
Copy link

@thegergo02 A bit of a hack now for submodule secrets management is to reference a local "fake flake" and then override it with --override-input secret_flake submodule/path

Fake flake example: https://github.com/dmadisetti/.dots/blob/template/nix/spoof/flake.nix (I have a bit of templating stuff around it)
Main Flake: https://github.com/dmadisetti/.dots/blob/2c8b22900aa368f285a8eef88d658c2ffc01d48b/flake.nix#L33

@Smaug123
Copy link

Oh yuck, the merge with master is pretty hairy :(

@FlorianFranzen
Copy link
Contributor

I do not mind putting in the work to get this rebased onto master. But I will need some feedback if this is something that actually has a chance of being merged.

And given the ongoing work on #6530, it probably also makes sense to wait till that is merged.

@kaii-zen
Copy link
Contributor Author

I do not mind putting in the work to get this rebased onto master. But I will need some feedback if this is something that actually has a chance of being merged.

And given the ongoing work on #6530, it probably also makes sense to wait till that is merged.

Makes two of us 🤭

@Ericson2314
Copy link
Member

Triaged in the Nix Team meeting this morning (n.b. I could not make it):

  • One part might be obsolete given some changes on master. @edolstra will enquire.

  • The rest (exposing the submodules at the Nix level) should be discussed

@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/2023-08-04-nix-team-meeting-minutes-77/31487/1

@fricklerhandwerk
Copy link
Contributor

Discussed in Nix team meeting 2023-10-13:

  • Shouldn't be a blocker for stabilizing fetchTree
    • Maybe submodules should be experimental then
  • The lazy-trees branch has another solution for that (as part of the libgit rewrite), could be backported
    • Similar idea overall
    • @edolstra wants to open a first PR for that today
  • @edolstra: Prefer the libgit approach
  • Decision: The tests are great and should be incorporated. The rest is very likely conflicting with the planned libgit rewrite.

@FlorianFranzen @kaii-zen @Ericson2314 would one of you be interested in splitting out the tests?

@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/2023-10-13-nix-team-meeting-minutes-94/34395/1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.