Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

create append-only caches in remote execution #17290

Merged
merged 5 commits into from
Nov 1, 2022

Conversation

tdyas
Copy link
Contributor

@tdyas tdyas commented Oct 20, 2022

Use a wrapper script to create append-only caches in remote execution.

@tdyas tdyas added the category:internal CI, fixes for not-yet-released features, etc. label Oct 20, 2022
Copy link
Contributor

@Eric-Arellano Eric-Arellano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

caches: &BTreeMap<CacheName, RelativePath>,
base_path: &str,
) -> Result<String, String> {
let mut script = String::new();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably over-optimization, but maybe pre-allocate based on the length of caches?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or just preallocate something like 512 bytes.

for (cache_name, path) in caches {
writeln!(
&mut script,
"mkdir -p '{}/{}'",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you know how this will interact with Pants messing with the PATH sometimes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably get the mkdir and /bin/sh paths from a BinaryPathRequest.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or at least hardcode to /bin/mkdir for an initial implementation. Or maybe let users define the paths in remote_environment targets (to accommodate Docker containers)? Using BinaryPathRequest may be hard to do with chicken and egg?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, definitely a "chicken and egg" problem. it is unclear to me how to trigger a BinaryPathRequest from this part of the Rust code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another open question: Should failure to symlink the append-only cache fail the Process as well?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a failure fails all of Pants, and we instruct the user to disable this mechanism by unsettling the global option? We could mention that they can open a GitHub issue if they need more flexibility like setting /bin/mkdir

"TODO-DEV: Append-only caches and working directory cannot be set together in remote execution.".into()
);
}
// TODO-WIP: Make the base path configurable.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd recommend a global option, not the remote_environment target. That's because we use the same remote server for all remote_environment targets, and it seems desirable for them to share the same named cache. Also simpler modeling for users imo.

Copy link
Member

@stuhood stuhood Oct 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This behavior also needs to be disableable, since not all remote execution environments will support it (some will fail instead).

FWIW: I still think that I prefer explicit support in the backend rather than doing this in the client.

@tdyas tdyas force-pushed the remoting_append_only_caches branch from d3418be to 3df7898 Compare October 24, 2022 17:04
tdyas pushed a commit that referenced this pull request Oct 27, 2022
… tuple (#17358)

Refactor `make_execute_request` to return a struct instead of tuple so that (1) it is easier to refer to the returned instances by a more descriptive name than `.2`; (2) avoid call sites needing to know the order in which the values are returned if they are unpacking the result; and (3) make it easier to add new fields later (for example, the `input_root_digest` field to be added by #17290).
@tdyas
Copy link
Contributor Author

tdyas commented Oct 27, 2022

#17358 and #17370 refactor make_execute_request as prework for this PR.

tdyas pushed a commit that referenced this pull request Oct 27, 2022
… `DirectoryDigest` (#17370)

Modify `make_execute_request` to be async, take a `Store` as a parameter, and to return the input root to use. This is prework for #17290 which needs the ability to make a new input root digest if a wrapper script needs to be added to the input root. (It will also be useful for other wrapper script applications, such as to solve the issue at  #17367 (comment) for remote execution.)
@tdyas tdyas force-pushed the remoting_append_only_caches branch from 0c43a67 to 653f016 Compare October 31, 2022 18:16
@tdyas tdyas marked this pull request as ready for review October 31, 2022 18:17
@tdyas tdyas changed the title WIP: create append-only caches in remote execution create append-only caches in remote execution Oct 31, 2022
@tdyas
Copy link
Contributor Author

tdyas commented Oct 31, 2022

The latest commit rebases the PR on top of the pre-work. It also makes the wrapper for append-only caches both configurable and optional.

Copy link
Contributor

@Eric-Arellano Eric-Arellano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉

src/rust/engine/process_execution/src/remote.rs Outdated Show resolved Hide resolved
src/rust/engine/process_execution/src/remote.rs Outdated Show resolved Hide resolved
base_path: &str,
) -> Result<String, String> {
let mut script = String::new();
writeln!(&mut script, "#!/bin/sh").map_err(|err| format!("write! failed: {err:?}"))?;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: This assumes /bin/sh exists in the remote environment.

@tdyas
Copy link
Contributor Author

tdyas commented Oct 31, 2022

The process_executor crate is failing to build due to compilation hanging. Happens locally and in CI (https://github.com/pantsbuild/pants/actions/runs/3364103818/jobs/5578072429).

@Eric-Arellano
Copy link
Contributor

Merging so that I can cherry-pick to 2.15 to unblock some work. We can fix anything in followups. Thanks Tom!

@Eric-Arellano Eric-Arellano merged commit 7eb4180 into pantsbuild:main Nov 1, 2022
Eric-Arellano pushed a commit to Eric-Arellano/pants that referenced this pull request Nov 1, 2022
Use a wrapper script to create append-only caches in remote execution.
Eric-Arellano added a commit that referenced this pull request Nov 1, 2022
…#17420)

Use a wrapper script to create append-only caches in remote execution.
@stuhood stuhood mentioned this pull request Nov 11, 2022
@stuhood stuhood mentioned this pull request Nov 19, 2022
@tdyas tdyas deleted the remoting_append_only_caches branch February 20, 2023 20:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
category:internal CI, fixes for not-yet-released features, etc.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants