Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement LevelHistograms as a struct #1

Merged

Conversation

alamb
Copy link

@alamb alamb commented Jul 26, 2024

This PR targets apache#6105

Per the suggestion on apache#6105 (review) I tried out encapsulating the information about level histograms in a rust struct, LevelHistogram

I think I like this as it encapsulates the Vec<i64> for histograms, so that the operations on it can be named and documented

Copy link
Owner

@etseidl etseidl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @alamb. This looks great. I learned a lot from it.

Comment on lines +219 to +220
self.repetition_level_histogram.as_mut().map(LevelHistogram::reset);
self.definition_level_histogram.as_mut().map(LevelHistogram::reset);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So much nicer than what I was doing 😮

    if let Some(hist) = self.repetition_level_histogram.as_mut() {
        hist.reset()
    }

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think your way is pretty nice too -- this is borderline too functional for my taste, but it seemed to be ok :)

Comment on lines +279 to +280
if let (Some(page_hist), Some(chunk_hist)) = (page_histogram, chunk_histogram) {
chunk_hist.add(page_hist);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought I tried this and it didn't work. Thanks. My approach was to have add take an Option<LevelHistogram>.

/// Length will be `max_level + 1`.
///
/// Returns `None` when `max_level == 0` (because histograms are not necessary in this case)
pub fn try_new(max_level: i16) -> Option<Self> {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I struggled with this name, as the consensus seemed to be that try_new returns a Result. I wound up opting for maybe_new, but I prefer this!

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah maybe there is a better convention but I think this is pretty standard in my experience

Comment on lines +652 to +662
impl From<Vec<i64>> for LevelHistogram {
fn from(inner: Vec<i64>) -> Self {
Self { inner }
}
}

impl From<LevelHistogram> for Vec<i64> {
fn from(value: LevelHistogram) -> Self {
value.into_inner()
}
}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So much nicer. Thank you!

Comment on lines +897 to +898
let repetition_level_histogram = repetition_level_histogram.map(LevelHistogram::from);
let definition_level_histogram = definition_level_histogram.map(LevelHistogram::from);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❤️

@etseidl etseidl merged commit 9582e0d into etseidl:size_stats_histograms Jul 26, 2024
@alamb alamb deleted the alamb/level_histogram_structs branch July 26, 2024 17:32
etseidl pushed a commit that referenced this pull request Jul 31, 2024
Add test for metadata equivalence
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants