Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore(metrics): Reorganize Metric data type fields #6138

Merged
merged 7 commits into from
Jan 22, 2021
Merged

Conversation

bruceg
Copy link
Member

@bruceg bruceg commented Jan 19, 2021

chore(metrics): Reorganize Metric data type fields

The Metric type is composed of two types of fields:

  1. "key" fields which identify which data series the metric belongs to,
  2. "data" fields which contain a single data point.

This change splits struct Metric into two parts to allow for these
types of fields to be used separately in maps or arrays. Further, the
series data is split into the name and tags for further breakdown.

This will be rebased once #6034 is merged.

Signed-off-by: Bruce Guenter bruce@timber.io

Closes #6050

@bruceg bruceg self-assigned this Jan 19, 2021
@bruceg bruceg added domain: data model Anything related to Vector's internal data model domain: metrics Anything related to Vector's metrics events type: tech debt A code change that does not add user value. labels Jan 19, 2021
@bruceg bruceg force-pushed the split-metric-data branch 3 times, most recently from b95b6ef to 8b38a88 Compare January 19, 2021 20:27
Bruce Guenter added 2 commits January 20, 2021 09:57
The `Metric` type is composed of two types of fields:

1. "key" fields which identify which data series the metric belongs to,
2. "data" fields which contain a single data point.

This change splits `struct Metric` into two parts to allow for these
types of fields to be used separately in maps or arrays. Further, the
series data is split into the name and tags for further breakdown.

Signed-off-by: Bruce Guenter <bruce@timber.io>
Signed-off-by: Bruce Guenter <bruce@timber.io>
@bruceg bruceg force-pushed the split-metric-data branch from 14d402d to 3b1e584 Compare January 20, 2021 15:58
@bruceg bruceg marked this pull request as ready for review January 20, 2021 15:59
@bruceg bruceg requested a review from a team January 20, 2021 15:59
Signed-off-by: Bruce Guenter <bruce@timber.io>
Comment on lines 66 to 73
cond.check(&Event::from(Metric::new(
"test metric".to_string(),
None,
None,
None,
MetricKind::Incremental,
MetricValue::Counter { value: 1.0 },
))),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My only real concern here is that this is a lot of args for a constructor (i.e. None, None, None), which can make it more difficult to understand at a glance. It's not a huge deal, but if you see any common patterns it could be worth adding a couple of more focused constructors. The other option would be a builder of some kind, but I'm not sure it's worth that yet.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know, that constructor is ugly. I put it in as a convenience for converting code instead of writing an even uglier multi-level struct instantiation all over the place:

Metric {
    series: MetricSeries {
        name: MetricName {
            name: "test metric".to_string(),
            namespace: None,
        },
        tags: None,
    },
    data: MetricData {
        timestamp: None,
        kind: MetricKind::Incremental,
        value: MetricValue::Counter { value: 1.0 },
    },
}

I'm open to ideas on how to make it better. Note that doing so means rewriting over 300 call sites. 😢

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a good time to plug https://comby.dev/ or IntelliJ's structural search/replace 😄

I definitely think that we should keep using some kind of constructor or builder just to keep us decoupled from the internal representation. Since you always need a name, kind, and value, one option would be to have a constructor for those and then builder-style with_{namespace,timestamp,tags} methods that consume and return the metric and can be chained. Another option would be to see how many combinations there are and just add a couple of additional constructors to cover the majority of them.

If you start with the most specific pattern you're replacing (i.e. three Nones, then two, then one), you should be able to do a transformation like that in a pretty automated way.

Copy link
Contributor

@MOZGIII MOZGIII Jan 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a good fit for the derive_builder crate

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reduced the fn new parameters down to the minimum non-optional values, and added builder-like functions for the other three, and it dropped almost 400 lines of code after formatting. That's quite the win. Thanks for pointing me to comby!

src/event/metric.rs Outdated Show resolved Hide resolved
assert_eq!(
r#"{"name":"foos","namespace":"vector","timestamp":"2018-11-14T08:09:10.000000011Z","tags":{"Key3":"Value3","key1":"value1","key2":"value2"},"kind":"incremental","counter":{"value":100.0}}"#,
r#"{"name":"foos","namespace":"vector","tags":{"Key3":"Value3","key1":"value1","key2":"value2"},"timestamp":"2018-11-14T08:09:10.000000011Z","kind":"incremental","counter":{"value":100.0}}"#,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the reordering necessary here? Is it possibly flaky?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test is dependant on the exact ordering of the fields in the output JSON. That field order is in turn dependant on the order serde visits the struct fields, which has now changed. However, readers of JSON should not care what order the fields come in, so it's a harmless change in the output.

@@ -53,12 +44,10 @@ impl PartialEq for MetricEntry {
// This differs from a straightforward implementation of `eq` by
// comparing only the "shape" bits (name, tags, and type) while
// allowing the contained values to be different.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some unit (or docs tests) to cover the behavior which is described here would be nice.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is going to be changing in a follow-up to this work, which is the whole motivation for this change. Metric buffering should only be keying on the MetricSeries part, but that requires structural changes to the Metrics buffer.

Bruce Guenter added 2 commits January 21, 2021 17:53
Signed-off-by: Bruce Guenter <bruce@timber.io>
Signed-off-by: Bruce Guenter <bruce@timber.io>
@bruceg bruceg requested a review from lukesteensen January 22, 2021 00:01
Bruce Guenter added 2 commits January 22, 2021 09:07
Signed-off-by: Bruce Guenter <bruce@timber.io>
Signed-off-by: Bruce Guenter <bruce@timber.io>
@bruceg bruceg merged commit 82f7fef into master Jan 22, 2021
@bruceg bruceg deleted the split-metric-data branch January 22, 2021 21:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
domain: data model Anything related to Vector's internal data model domain: metrics Anything related to Vector's metrics events type: tech debt A code change that does not add user value.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Split metric data into key + value
4 participants