REv2: Allow downloads to be prevented by emitting symlinks #11703

EdSchouten · 2020-07-03T07:42:38Z

By default, Bazel's REv2 client downloads all output files of a build
action, which is slow. This can be disabled by using flags like
--remote_download_minimal, but this causes bazel-out/ to be mostly
empty, making the user experience suboptimal.

This change introduces a middle ground: it allows people to develop
virtual file systems (FUSE, NFS, 9p, etc. etc. etc.) that lazily expose
the CAS onto people's systems. It does this by adding a new flag,
--remote_download_symlink_template, that causes Bazel to generate
symlinks that conform to a given template.
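
To illustrate the idea, here is a minimal sketch (not Bazel's actual Java implementation) of expanding such a template and creating a symlink instead of downloading the blob. The `{hash}` and `{size_bytes}` placeholder names are taken from the examples later in this thread; the function names, the `/cas` mount point and the output path are hypothetical.

```python
import os
import tempfile


def expand_symlink_template(template: str, digest_hash: str, size_bytes: int) -> str:
    """Substitute the digest fields into the template (placeholder names assumed)."""
    return template.replace("{hash}", digest_hash).replace("{size_bytes}", str(size_bytes))


def emit_output_symlink(output_path: str, template: str, digest_hash: str, size_bytes: int) -> str:
    """Create a symlink at output_path that points at the expanded template path."""
    target = expand_symlink_template(template, digest_hash, size_bytes)
    os.makedirs(os.path.dirname(output_path), exist_ok=True)
    # The target does not need to exist yet; a lazy file system can serve it on first access.
    os.symlink(target, output_path)
    return target


def demo() -> None:
    # SHA-256 of the empty blob, used purely as sample input.
    empty_hash = "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"
    with tempfile.TemporaryDirectory() as root:
        out = os.path.join(root, "bazel-out", "bin", "hello.o")  # hypothetical output path
        emit_output_symlink(out, "/cas/{hash}-{size_bytes}", empty_hash, 0)
        print(os.readlink(out))


if __name__ == "__main__":
    demo()
```

The symlink is allowed to dangle: nothing is fetched until something actually opens the file through the virtual file system.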

ulfjack · 2020-07-03T10:51:17Z

I am not convinced this feature carries its weight. If you're going to write a FUSE filesystem, why not write one that shows the actual file system layout? Why not have the remote execution system return the symlinks directly by rewriting the responses?

At the same time, the template system here isn't powerful enough even for simple cases - on Linux ext2-4, having a single directory with all the cas entries is too slow. And, it's losing performance because symlinks are slow.

I hope and expect that someone (maybe me) will write a FUSE filesystem that shows the actual file layout and that fully integrates with Bazel, and then this feature will become irrelevant, but it will be difficult to remove again.

EdSchouten · 2020-07-03T11:12:53Z

Hey Ulf!

> I am not convinced this feature carries its weight. If you're going to write a FUSE filesystem, why not write one that shows the actual file system layout? Why not have the remote execution system return the symlinks directly by rewriting the responses?

Those are good questions. To answer the second question: that would require all clients to have the FUSE file system mounted, and potentially to have it mounted at the same location on the system. Otherwise you somehow have to make it possible for clients to specify a template to the remote execution system. That may work for ActionResult messages, but becomes a bit hairy when the ActionResult message points to Tree objects.

With regard to the first question: the idea behind this approach was to keep the responsibility of tracking state where it is now: in bazel-out/. The FUSE daemon that exposes the CAS would remain a stateless process. It could be a single per-user (or even system-wide?) daemon that caches and serves blobs for all Bazel workspaces/checkouts that a person may be working on.

> At the same time, the template system here isn't powerful enough even for simple cases - on Linux ext2-4, having a single directory with all the cas entries is too slow. And, it's losing performance because symlinks are slow.

That's a fair point. In my case I'm using a FUSE file system that exposes a directory that is not iterable. Hoping that the kernel is smart enough to use dirhashing, it should remain fairly fast.

> I hope and expect that someone (maybe me) will write a FUSE filesystem that shows the actual file layout and that fully integrates with Bazel, and then this feature will become irrelevant, but it will be difficult to remove again.

Oh, man. I would honestly donate one of my kidneys to see a feature like that appear. I mainly took this approach, because I already had a lot of good FUSE code lying around that was not written in Java, and because I'm not familiar enough with Bazel's innards to make a change like that.

buchgr · 2020-07-16T13:31:25Z

I really like the approach that you took, Ed. The code is very simple and easy to maintain, but gives users the option to use a FUSE filesystem as an alternative to --remote_download_minimal.

I have a few questions:

1. IIRC Bazel will follow symlinks in the output base (unless they are declared an output symlink) and compute their digest by reading the file contents and hashing them. I assume this is not happening here. Is that so and do you know why?
2. Wouldn't this feature already be a sufficient protocol to put the output base behind a FUSE filesystem (as Ulf suggests)? While Bazel would do a symlink() syscall, the FUSE filesystem could expose a file instead. Shouldn't that work?

> At the same time, the template system here isn't powerful enough even for simple cases - on Linux ext2-4, having a single directory with all the cas entries is too slow. And, it's losing performance because symlinks are slow.

I think this problem could easily be addressed by introducing a third template variable {delim_hash}. It would be equal to the first few bytes of {hash} and allow you to construct paths like /cas/{delim_hash}/{hash}-{size_bytes}. Wouldn't that work? If say the upper limit of the filesystem is 2^32 files a reasonable length for {delim_hash} would be 2^16, so the first 4 characters of {hash}. It's safe to assume that {hash} is truly random and so even in the case of 4 billion output files we should get no more than 2^16 files in one directory.
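
The arithmetic behind this suggestion can be sketched as follows. This is only an illustration of the proposed `{delim_hash}` variable, which does not exist in the patch; the function name and the `/cas` prefix are hypothetical. It assumes hex-encoded hashes, so 4 characters select one of 16^4 = 2^16 directories.

```python
def sharded_cas_path(digest_hash: str, size_bytes: int, prefix_len: int = 4) -> str:
    """Build /cas/{delim_hash}/{hash}-{size_bytes}, where delim_hash is a hash prefix."""
    delim_hash = digest_hash[:prefix_len]
    return f"/cas/{delim_hash}/{digest_hash}-{size_bytes}"


HEX_PREFIX_CHARS = 4
directories = 16 ** HEX_PREFIX_CHARS                 # one directory per 4-character hex prefix
total_files = 2 ** 32                                # the assumed upper limit from the comment
average_per_directory = total_files // directories   # expected load with uniformly random hashes

print(directories, average_per_directory)
print(sharded_cas_path("e3b0c44298fc1c14", 0))
```

Note that 2^16 files per directory is the expected average under uniformly distributed hashes, not a hard cap: individual directories can end up slightly above or below it.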

buchgr · 2020-07-16T13:38:59Z

> IIRC Bazel will follow symlinks in the output base (unless they are declared an output symlink) and compute their digest by reading the file contents and hashing them. I assume this is not happening here. Is that so and do you know why?

I am not trying to suggest that this can't be made to work, I am just surprised that it works :-). I would expect that, in addition to creating the symlink, you also need to inject the digest into the metadata handler, as Builds without the Bytes does.

EdSchouten · 2020-07-22T12:28:43Z

Hey Jakob! \o/

> I really like the approach that you took, Ed. The code is very simple and easy to maintain, but gives users the option to use a FUSE filesystem as an alternative to --remote_download_minimal.
>
> I have a few questions:
>
> IIRC Bazel will follow symlinks in the output base (unless they are declared an output symlink) and compute their digest by reading the file contents and hashing them. I assume this is not happening here. Is that so and do you know why?

Exactly! I'm using this change in combination with PR #11662, so that Bazel reads extended attributes in case it needs to reobtain the digest.

> Wouldn't this feature already be a sufficient protocol to put the output base behind a FUSE filesystem (as Ulf suggests)? While Bazel would do a symlink() syscall, the FUSE filesystem could expose a file instead. Shouldn't that work?

That could work as well, but has the downside that the FUSE file system needs to be stateful. It would need to keep track of the actual layout that was constructed by Bazel.

> At the same time, the template system here isn't powerful enough even for simple cases - on Linux ext2-4, having a single directory with all the cas entries is too slow. And, it's losing performance because symlinks are slow.

> I think this problem could easily be addressed by introducing a third template variable {delim_hash}. It would be equal to the first few bytes of {hash} and allow you to construct paths like /cas/{delim_hash}/{hash}-{size_bytes}. Wouldn't that work? If say the upper limit of the filesystem is 2^32 files a reasonable length for {delim_hash} would be 2^16, so the first 4 characters of {hash}. It's safe to assume that {hash} is truly random and so even in the case of 4 billion output files we should get no more than 2^16 files in one directory.

Sure! I was thinking that maybe we could use some kind of printf()-like mechanism, where people could do expressions like these:

--remote_download_symlink_template=/some/path/%.4(hash)s/%(hash)s-%(sizeBytes)d

Is there any convention for this in Java/Bazel land?
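
For comparison, Python's built-in `%` formatting already supports named fields with a precision, which gives a feel for what such a printf()-like template would do. This is only an analogy and says nothing about what Bazel supports; note that Python puts the key before the precision (`%(hash).4s`), unlike the `%.4(hash)s` spelling proposed above. The `expand` helper is hypothetical.

```python
# Python spelling of the proposed template: key first, then the ".4" precision.
TEMPLATE = "/some/path/%(hash).4s/%(hash)s-%(sizeBytes)d"


def expand(template: str, digest_hash: str, size_bytes: int) -> str:
    """Expand a printf-style template using a mapping of named fields."""
    return template % {"hash": digest_hash, "sizeBytes": size_bytes}


print(expand(TEMPLATE, "abcdef0123", 42))
```

The `.4` precision truncates the string to its first four characters, which is exactly the sharding prefix discussed earlier.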

buchgr · 2020-07-29T12:39:12Z

> Sure! I was thinking that maybe we could use some kind of printf()-like mechanism, where people could do expressions like these:

Why do you think this would carry its weight? In my mind hard-coding 4 characters seems sufficient.

> Is there any convention for this in Java/Bazel land?

Not that I am aware of.

buchgr · 2020-07-29T12:39:56Z

Overall the change looks good to me. @philwo

EdSchouten · 2020-07-31T12:03:01Z

> > Sure! I was thinking that maybe we could use some kind of printf()-like mechanism, where people could do expressions like these:
>
> Why do you think this would carry its weight? In my mind hard-coding 4 characters seems sufficient.

That's a fair point. Using full string formatting for this is a bit overkill.

For my use case I don't think I need to shard objects across directories. In fact, I even prefer to keep it a flat namespace to not waste inodes. Would it be all right if we just stick to the patch as is, keeping the option open to add a special tag for 4 character prefixes? I'm more than happy to add it right now, but I suppose it only makes sense to add such a feature if people observe it to be a problem.

EdSchouten · 2020-08-17T06:41:05Z

Friendly ping. :-)

philwo

This is so simple and elegant. I love it.

I'll import it and send it for a final review to @coeuvre who maintains this area now.

EdSchouten · 2020-09-03T14:21:00Z

Hey @coeuvre! I've just rebased this change on top of latest master. Is there anything left you want me to take care of?

coeuvre · 2020-09-10T06:51:39Z

Merged! Thanks for your PR.

In an attempt to achieve 'Builds without the Bytes' without losing access to build outputs, I am experimenting with a FUSE file system that gives access to objects stored in the CAS. In PR bazelbuild#11703, I added a command line flag to let Bazel emit symbolic links pointing into this FUSE file system, as opposed to downloading files into the exec root.

Though this change has allowed me to get quite a lot of stuff working, there are also many build actions that break. For example, Python calls realpath(argv[0]) to figure out its installation path. Because the FUSE file system does not mimic the execroot, Python won't be able to find its site packages. Similar problems hold with shared library resolution in general.

This is why I think the only proper way we can get this to work is by using hard links instead of symbolic links. That way the usual file hierarchy remains intact. This, however, requires that the exec root itself is placed on a FUSE file system. It is already possible to achieve this by setting --output_base, but that has the downside of also placing many other files on FUSE (e.g., the sandbox directories), which is detrimental to performance.

This change adds a new command line flag, --exec_root_base, which can be used to leave the output base at the regular place, but host the exec root directory on a FUSE file system.

This change originally seemed to work all right with Bazel 3.4-3.7. In order to make this work with Bazel master, I had to make a slight tweak to the changes in 0c249d5. That code added the assumption that "${output_base}/external" is always placed at "${exec_root_base}/../external". I suspect that already causes a regression in case a BlazeModule overrides the exec root base.

While there, rename 'execRootParent' to 'execRootBase', as it corresponds to the exec root itself; not its parent directory.

In an attempt to achieve 'Builds without the Bytes' without losing access to build outputs, I am experimenting with a FUSE file system that gives access to objects stored in the CAS. In PR bazelbuild#11703, I added a command line flag to let Bazel emit symbolic links pointing into this FUSE file system, as opposed to downloading files into the exec root.

Though this change has allowed me to get quite a lot of stuff working, there are also many build actions that break. For example, Python calls realpath(argv[0]) to figure out its installation path. Because the FUSE file system does not mimic the execroot, Python won't be able to find its site packages. Similar problems hold with shared library resolution in general.

This is why I think the only proper way we can get this to work is by using hard links instead of symbolic links. That way the usual file hierarchy remains intact.

This change renames the --remote_download_symlink_template flag to --remote_download_hard_link_template and changes the code to create hard links instead. When used in combination with --exec_root_base (bazelbuild#12558), it's now possible to let Bazel construct an exec root that does not have any additional indirection through symbolic links, thereby keeping programs that do symlink expansion happy.
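
The realpath() problem described above can be reproduced with a small sketch. It uses ordinary temporary directories rather than a FUSE mount, and the directory and file names (`cas`, `execroot/bin`, `tool_*`) are hypothetical stand-ins; it only demonstrates that symlink expansion leaves the exec root while a hard link does not. It assumes a POSIX system where both directories live on the same file system, so that a hard link can be created.

```python
import os
import tempfile


def resolve_via_links(root: str) -> tuple:
    """Expose one blob through a symlink and a hard link; return realpath() of each."""
    cas_dir = os.path.join(root, "cas")                  # stand-in for the CAS mount
    bin_dir = os.path.join(root, "execroot", "bin")      # stand-in for the exec root
    os.makedirs(cas_dir)
    os.makedirs(bin_dir)

    blob = os.path.join(cas_dir, "blob")
    with open(blob, "w") as f:
        f.write("#!/bin/sh\n")

    via_symlink = os.path.join(bin_dir, "tool_symlink")
    via_hard_link = os.path.join(bin_dir, "tool_hard_link")
    os.symlink(blob, via_symlink)
    os.link(blob, via_hard_link)

    return os.path.realpath(via_symlink), os.path.realpath(via_hard_link)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        sym_resolved, hard_resolved = resolve_via_links(root)
        print(sym_resolved)   # ends up inside the cas/ directory
        print(hard_resolved)  # stays inside execroot/bin/
```

A program that locates its resources relative to realpath(argv[0]) would look next to the blob in the CAS directory in the symlink case, but next to the tool in the exec root in the hard link case.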

googlebot added the cla: yes label Jul 3, 2020

EdSchouten changed the title ~~REv2: Allow downloads from being prevented by emitting symlinks~~ REv2: Allow downloads to be prevented by emitting symlinks Jul 3, 2020

EdSchouten mentioned this pull request Jul 27, 2020

UnixFileSystem: read cached hashes from extended attributes #11662

Closed

philwo approved these changes Aug 21, 2020

EdSchouten force-pushed the eschouten/20200702-symlinks branch from 21f49e4 to 3d08acb Compare September 3, 2020 14:20

bazel-io closed this in cb08ffc Sep 10, 2020

EdSchouten deleted the eschouten/20200702-symlinks branch September 10, 2020 07:44

EdSchouten mentioned this pull request Nov 25, 2020

Allow the exec root to be placed outside the output base #12558

Closed

EdSchouten mentioned this pull request Nov 26, 2020

Let --remote_download_symlink_template use hard links #12566

Closed
