Add Scan and Lookup Traces. #2304
Conversation
```zig
/// Since the workload is always sorted by timestamp,
/// adjacent objects are often going to be in the same table-value block.
/// The grid is aware when N lookups ask for the same grid block concurrently,
/// and queues up the reads internally so that they hit storage only once.
///
/// To maximize IO utilization, we allow at least `Grid.read_iops_max` lookups to run,
/// up to a constant derived from the maximum number of objects per block.
/// Reasoning: the larger the block size, the higher the probability of multiple
/// lookups hitting the same grid block.
const lookup_workers_max = @max(
    stdx.div_ceil(
        @divFloor(constants.block_size, @sizeOf(Object)),
        Grid.read_iops_max,
    ),
    Grid.read_iops_max,
);
```
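For illustration only, the `lookup_workers_max` formula above can be re-expressed in Python. The constant values used here are assumptions for the sake of the example; the real values come from the Zig sources (`constants.block_size`, `@sizeOf(Object)`, `Grid.read_iops_max`).

```python
# Hedged sketch of the `lookup_workers_max` computation, with assumed
# example values (the actual constants live in the Zig sources).

def div_ceil(numerator: int, denominator: int) -> int:
    # Equivalent of stdx.div_ceil: integer division rounding up.
    return -(-numerator // denominator)

def lookup_workers_max(block_size: int, object_size: int, read_iops_max: int) -> int:
    # At least `read_iops_max` workers, scaled up when a block holds many
    # objects (more objects per block means more lookups can share one read).
    objects_per_block = block_size // object_size  # @divFloor
    return max(div_ceil(objects_per_block, read_iops_max), read_iops_max)

# Assumed example: 512 KiB blocks, 128-byte objects, 16 concurrent grid reads.
print(lookup_workers_max(512 * 1024, 128, 16))  # 4096 objects/block -> 256 workers
```

With a tiny block that holds a single object, the formula falls back to the `read_iops_max` floor, which is the `@max` branch the doc comment describes.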
This logic was removed:
- It simplified the mental model of always handling the max of `Grid.read_iops_max` parallel lookups.
- It also allowed using the same `trace` event for lookups from prefetch and scans.
- In practice, the benchmark didn't show any performance improvement.
- Reducing the number of `[max]LookupWorker` in the array also reduced the size of the binary 🤯
Just a couple nits, otherwise looks great!
`src/trace.zig` (Outdated)
```zig
inline else => |data| {
    comptime assert(@TypeOf(data) == void);
    return event_stack_base.get(event.*);
},
```
How about this:

```diff
-inline else => |data| {
-    comptime assert(@TypeOf(data) == void);
-    return event_stack_base.get(event.*);
-},
+inline else => |data, event_tag| {
+    comptime assert(@TypeOf(data) == void);
+    return comptime event_stack_base.get(event_tag);
+},
```
Iiuc we use `inline else` rather than `else` so that we can assert `@TypeOf(data) == void`. So since we are generating code for each variant anyway, we may as well inline the stack base as a constant rather than fetching it at runtime from the `EnumArray`.
```zig
    return stack_base + @as(u32, @intCast(data.iop));
},
inline else => |data| {
    comptime assert(@TypeOf(data) == void);
```
Also add `comptime assert(event_stack_cardinality.get(event_tag) == 1);`.
```zig
levels: [constants.lsm_levels]LevelBuffer,

pub fn init(self: *ScanBuffer, allocator: Allocator) !void {
```
Maybe pull `index` into an `options` struct, so that the call site is labeled?
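For illustration of the labeled-call-site idea (not the project's actual code), here is the same pattern re-expressed in Python, where keyword-only arguments play the role Zig's anonymous `options` struct would. All names below are hypothetical.

```python
# Hypothetical sketch: a bare positional `index` leaves the call site
# unlabeled, while forcing a keyword (the Python analog of passing a Zig
# `options` struct) makes every call site self-describing.

def init_unlabeled(buffer: dict, index: int) -> dict:
    buffer["index"] = index
    return buffer

def init_labeled(buffer: dict, *, index: int) -> dict:
    # `index` must now be spelled out at every call site.
    buffer["index"] = index
    return buffer

init_unlabeled({}, 3)      # reader must look up what 3 means
init_labeled({}, index=3)  # the call site is labeled
```

The trade-off is a slightly noisier call site in exchange for not having to check the signature to understand a bare integer argument.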
- Lookup/LookupWorker
- ScanTree/ScanTreeLevel

5486af4 to b2a8b6c
CFO found a failing seed due to tracing start/stop asserts: `./zig/zig build -Drelease vopr -- --lite 15404620186638443928`. I haven't looked into the details, but I'm guessing it is missing a `.reset()` somewhere.
Add new Traces:

Note 1: Spall sorts the "Thread ID" alphabetically, so it fails to render a usable visualization of this trace, as `ScanTree` and `ScanTreeLevel` are nested events (e.g., each `ScanTree` has up to `lsm_levels` `ScanTreeLevel`) and the stack number (displayed as "Thread ID: ##") is the only thing that groups them together.

Note 2: For a more "illustrated" trace sample, try running the benchmark with the flags `--id-order=random` to skip the negative lookup optimization and `--transfer-pending` to scan by expired transfers.