Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JSON serializer emits metric type information #3834

Closed
wants to merge 1 commit into from

Conversation

amoghe
Copy link
Contributor

@amoghe amoghe commented Feb 26, 2018

JSON serializer now includes type information in the string it emits.

Also riding on this patch - ValueType now exposes a String()
method that returns the name of the ValueType. This is useful
when we print the metric type information (e.g. when serializing)
since the reader does not need to be aware of the enum types used
within telegraf.

Required for all PRs:

  • [ ✔️ ] Signed CLA.
  • [ ❌ ] Associated README.md updated (<-- not needed)
  • [ ✔️ ] Has appropriate unit tests.

Also riding on this patch - ValueType now exposes a String()
method that returns the name of the ValueType. This is useful
when we print the metric type information (e.g. when serializing)
since the reader does not need to be aware of the enum types used
within telegraf.
@amoghe
Copy link
Contributor Author

amoghe commented Feb 26, 2018

@sparrc, @ch3lo - tagging you since you're the other committers to this serializer

@danielnelson
Copy link
Contributor

I think we shouldn't emit a type field if the metric is untyped.

@amoghe
Copy link
Contributor Author

amoghe commented Mar 6, 2018

I think we shouldn't emit a type field if the metric is untyped.

I see your point. But don't you think that its better to have a json schema that is consistent (i.e. the reader can expect the json to have a consistent set of keys) rather than some keys missing sometimes?

If I were to write a parser to parse these messages, I'd want this behavior because otherwise I'd be writing a branch in my code to handle the untyped case when the key is missing.

Thoughts?

@danielnelson
Copy link
Contributor

It probably depends on what language/library you are using to parse if it helps or not. I will say that most of the metrics in Telegraf are currently untyped because it is only used by the prometheus output.

@danielnelson
Copy link
Contributor

I've been thinking about this more and I don't think we should make this change, the main reason is that we want to make changes that will allow setting a type per field, instead of the current model of one type per Metric. This would fix some issues with splitting metrics across lines that otherwise should logically remain together.

@amoghe
Copy link
Contributor Author

amoghe commented Mar 15, 2018

Could you point me to any documentation/writing that could explain this new scheme? (one-metric-many-values). I don't think I follow what you are suggesting.

@danielnelson
Copy link
Contributor

I'll try to explain here.

Using line protocol, a metric can contain multiple values like:

cpu time_user=42 usage_user=20

However, there is no guarantee that all of the values are of the same type, in the above example usage_user is a gauge and time_user is a counter. Currently we split these across metrics so that we can assign different types:

cpu time_user=42
cpu usage_user=20

In line protocol, which is our primary serialization format, there is actually no reason they can't be on the same line. This is more efficient in terms of space and parsing speed, and also makes processing the data easier since the related values are all on one line. A concrete example of these difficulties is reported in influxdata/kapacitor#1217.

Because of this, in the future I'd like to be able to express this as a single metric/line, while still allowing outputs such as the prometheus output to have the required metadata. There are also suggestions to add additional types of metadata that would be on a per field basis. (#3670)

I think what would be nice is if we had a new JSON output format that instead of having a hardcoded "shape", would allow the user to specify how they would like the data structured and what data, including metadata, would be included. I'm not sure the exact details of how this would work, so that remains to be fleshed out.

@amoghe
Copy link
Contributor Author

amoghe commented Apr 15, 2018

I think I understand the main goal to split out Type information per Field. However the JSON emitted by this plugin will be restricted to whatever the Metric interface exposes. Currently, the Metric is defined as:

type Metric interface {
	// Getting data structure functions
	Name() string
	Tags() map[string]string
	TagList() []*Tag
	Fields() map[string]interface{}
	FieldList() []*Field
	Time() time.Time
	Type() ValueType
...

So the JSON emitted can simply reflect this. Eventually when each Field has its own Type , we can change the implementation to support that.

I guess I'm pointing out that feature-A (JSON output) need not be gated on feature-B (type-per-field). This is possible since there are no protocol guarantees provided by the JSON output. The output lines are simply a serialization of the Metric type which itself is not versioned.

Thoughts?

@danielnelson
Copy link
Contributor

While I don't consider the Metric interface to be part of the versioned API, if we add the type to JSON output then we do need to support it until the next major version. While this is not a disqualifier, I'm hesitant to add this when I know it will be changed.

That said, I had hoped to add per field types to 1.6, but it turns out doing this without using a lot more memory is a bit trickier than I anticipated and right now it is not a priority. So I think it would be reasonable to move forward with this pull request.

)

func (v ValueType) String() string {
return ValueTypeToName[v]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If ValueType is not in the map don't set a type, or optionally set "untyped".

@danielnelson
Copy link
Contributor

Closing due to lack of activity, let us know if you want to continue to work on this and we can reopen.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants