vfs/zfs: remove per-file lock enforcement for ZFS read/write ops
The OSv VFS layer enforces a per-file lock for file operations, including read and write. This prevents ZFS from doing parallel reads on the same file, or even parallel writes into different regions of the same file. ZFS allows that parallelism by using r/w range locks from its own internal nodes, which are mapped 1:1 to VFS vnodes, and doing so properly satisfies POSIX requirements.

Currently, every read or write to a given file is serialized. Workloads where many threads concurrently read from or write into the same file would perform much worse than on FreeBSD, for example. FreeBSD also uses an r/w range lock within the VFS layer to protect concurrent read/write ops; however, that implies double range locking, which is unnecessary and redundant.

For ZFS specifically, this bottleneck, which prevents parallel ZFS reads/writes on the same file, can be eliminated by dropping the vnode locks surrounding VOP_READ and VOP_WRITE within vfs_file::read and vfs_file::write, respectively. For read/write ops from the virtual file systems, vnode locking can be moved into their respective functions. Unlike ZFS, the virtual file systems don't deal with per-file locks on their own.

From there on, ZFS r/w range locks work effectively. Before that, there was *no* r/w parallelism when working on the same file.

vfs_file::get_arcbuf, which is intended entirely for ZFS, can also drop the vnode lock. It calls zfs_arc, which also uses the r/w range lock approach and so properly protects the underlying file. writeback() from the pagecache doesn't need to take the vnode lock either, given that it calls zfs_write (protected by the r/w range lock).

Changes involved:
* For zfs_read, use ZFS internal node data to avoid reading the vnode.
* For zfs_write, when the file has grown in size, take the vnode lock to update the vnode size.
* For virtual file systems, surround their read()/write() ops with the vnode lock.

Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
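
To illustrate the idea behind the change, here is a minimal, self-contained sketch of byte-range locking compared with a single per-file lock. It is not the OSv or ZFS code: range_lock, toy_file, vnode_lock and demo_disjoint_ranges are hypothetical names made up for this example, the lock only supports exclusive ranges (no shared readers), and the size update is simplified to always take the per-file lock briefly.

```cpp
#include <algorithm>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical toy range lock (exclusive ranges only). An assumption for
// illustration; this is not the ZFS range lock implementation.
class range_lock {
public:
    // Block until [off, off + len) overlaps no held range, then claim it.
    void lock(std::size_t off, std::size_t len) {
        std::unique_lock<std::mutex> guard(_mtx);
        _cv.wait(guard, [&] { return !overlaps(off, len); });
        _held.push_back({off, len});
    }
    // Non-blocking variant: returns false if the range overlaps a held one.
    bool try_lock(std::size_t off, std::size_t len) {
        std::lock_guard<std::mutex> guard(_mtx);
        if (overlaps(off, len)) return false;
        _held.push_back({off, len});
        return true;
    }
    // Release a range previously claimed with the same offset and length.
    void unlock(std::size_t off, std::size_t len) {
        {
            std::lock_guard<std::mutex> guard(_mtx);
            auto it = std::find_if(_held.begin(), _held.end(),
                [&](const range& r) { return r.off == off && r.len == len; });
            if (it != _held.end()) _held.erase(it);
        }
        _cv.notify_all();
    }
private:
    struct range { std::size_t off, len; };
    bool overlaps(std::size_t off, std::size_t len) const {
        return std::any_of(_held.begin(), _held.end(), [&](const range& r) {
            return off < r.off + r.len && r.off < off + len;
        });
    }
    std::mutex _mtx;
    std::condition_variable _cv;
    std::vector<range> _held;
};

// Hypothetical stand-in for a file: data is protected per range, and the
// per-file lock is held only while the logical size is updated.
struct toy_file {
    explicit toy_file(std::size_t capacity) : data(capacity, 0) {}
    void write(std::size_t off, const std::vector<char>& buf) {
        assert(off + buf.size() <= data.size());
        ranges.lock(off, buf.size());
        std::copy(buf.begin(), buf.end(),
                  data.begin() + static_cast<std::ptrdiff_t>(off));
        {
            // Simplification: always take the per-file lock for the size update.
            std::lock_guard<std::mutex> guard(vnode_lock);
            size = std::max(size, off + buf.size());
        }
        ranges.unlock(off, buf.size());
    }
    std::vector<char> data;
    std::size_t size = 0;
    std::mutex vnode_lock;
    range_lock ranges;
};

// Two disjoint ranges can be held at once; an overlapping one is refused.
bool demo_disjoint_ranges() {
    toy_file f(8);
    bool first = f.ranges.try_lock(0, 4);
    bool second = f.ranges.try_lock(4, 4);
    bool third = f.ranges.try_lock(2, 4);
    f.ranges.unlock(0, 4);
    f.ranges.unlock(4, 4);
    f.write(0, std::vector<char>(4, 'a'));
    f.write(4, std::vector<char>(4, 'b'));
    return first && second && !third && f.size == 8 &&
           std::string(f.data.begin(), f.data.end()) == "aaaabbbb";
}
```

With a single per-file mutex around write(), the two writes above could never overlap in time even though they touch different bytes. With the range lock, threads writing to disjoint offsets only contend briefly on the size update, which is roughly the effect the commit message describes for ZFS.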
Showing 6 changed files with 16 additions and 19 deletions.