
Added hep protocol support as parser in telegraf #10039

Closed
wants to merge 2 commits

Conversation


@adubovikov adubovikov commented Nov 1, 2021

Required for all PRs:

resolves #


telegraf-tiger bot commented Nov 1, 2021

Thanks so much for the pull request!
🤝 ✒️ Just a reminder that the CLA has not yet been signed, and we'll need it before merging. Please sign the CLA when you get a chance, then post a comment here saying !signed-cla

@telegraf-tiger telegraf-tiger bot added the fix (pr to fix corresponding bug), new plugin, and plugin/parser (1. Request for new parser plugins 2. Issues/PRs that are related to parser plugins) labels Nov 1, 2021
@adubovikov (Author)

!signed-cla

@adubovikov (Author)

@srebhan here you are


@@ -0,0 +1,19 @@
# HEP

The HEP data format parses a HEP packet into metric fields.
Contributor

I'm not familiar with HEP and I suspect many telegraf users aren't either. Could you add a link to the project here in the docs? https://github.com/sipcapture/HEP

It might be worthwhile to provide a more comprehensive example of how this parser is meant to be used.


The parser provides compatibility with the HEP encapsulation protocol, which is almost universally supported in open-source VoIP platforms such as Asterisk, FreeSWITCH, Kamailio, OpenSIPS and many more, alongside major vendors like Genesys, Sansay, and others. This parser is dedicated to providing a layer of compatibility so those platforms can form and send metrics to Telegraf without implementing new patches/protocols.

@@ -0,0 +1,19 @@
# HEP
Contributor

It looks like it's going through a name change to EEP? If we do add a parser, shouldn't it use the new name?


Due to the large number of integrations using HEP, the name change has not gone through yet and won't for the foreseeable future, to avoid confusion around an already obscure protocol.


**NOTE:** All HEP packet headers are stored as tags unless a specific set is provided
with the `hep_header` array, and the body is parsed with the JSON parser.
All JSON parser features are available in the HEP parser. Please check the JSON parser documentation for more details.
Contributor

I'm not sure what you mean by "all json parser features were imported". Telegraf has two JSON parsers, "json" and "json_v2". This PR uses the older one, which doesn't work well for some common JSON object structures. Should you switch to v2?

The HEP data format parses a HEP packet into metric fields.

**NOTE:** All HEP packet headers are stored as tags unless a specific set is provided
with the `hep_header` array, and the body is parsed with the JSON parser.
Contributor

Does HEP only ever embed JSON formatted data?

Member

@reimda as far as I read into this, newer versions of the protocol can embed a JSON payload, so we then need embedded JSON parsing... :-(


@reimda HEP is a generic encapsulation protocol and per se can carry any payload, including binary. The focus is on JSON in this specific use case, as all the integrating platforms are capable of producing and consuming it.

Member

Hmmm, if this is the case, why not export the payload in a `payload` field together with a `payload_type`, and then use a parser processor to parse these? This way you can stay generic in this parser...

@srebhan srebhan (Member) left a comment

Thank you very much @adubovikov for the nice PR! Overall it looks quite good, but I do have some comments in addition to @reimda's. Please take a look and also check out the linter issues. You can check the linter status locally by running make lint-branch on your git branch.

Comment on lines +124 to +137
if len(h.HepHeader) != 0 {
var headerArray []int
for _, v := range h.HepHeader {
headerArray = append(headerArray, headerNames[v])
headerTags = h.addHeaders(headerArray, hep)
}
headerTags = h.addHeaders(headerArray, hep)
} else {
var headerArray []int
for k := range headerReverseMap {
headerArray = append(headerArray, k)
headerTags = h.addHeaders(headerArray, hep)
}
}
Member

I think this is much more complex than it needs to be. You first use the name in the header (e.g. "version") to look up an index. In addHeaders() you then compare this index to an enumeration exactly matching the order in headerNames to decide which field to add. Afterwards you revert the index back to the original name coming from the protocol itself. Why not just use the name directly in addHeaders() with a string comparison?
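
As a rough sketch of that suggestion (illustrative only — the `addHeader` helper, the tag names, and the field accessors on `hep` are assumptions, not the PR's actual code):

// Match on the configured header name directly instead of going
// name -> index -> name again.
func (h *Parser) addHeader(name string, hep *HEP, tags map[string]string) {
	switch name {
	case "version":
		tags["version"] = strconv.Itoa(int(hep.Version))
	case "protocol":
		tags["protocol"] = strconv.Itoa(int(hep.Protocol))
	case "src_ip":
		tags["src_ip"] = hep.SrcIP
	// ... remaining header names handled the same way ...
	}
}

// in Parse():
for _, name := range h.HepHeader {
	h.addHeader(name, hep, headerTags)
}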

var headerArray []int
for _, v := range h.HepHeader {
headerArray = append(headerArray, headerNames[v])
headerTags = h.addHeaders(headerArray, hep)
Member

As the linter says, you overwrite this assignment directly after the loop...

h.MetricName = "hep"
}
headerTags := make(map[string]string)
jsonParser, err := json.New(
Member

Please create the JSON-parser only once per Parser instead of doing it in each call to Parse().
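
A minimal sketch of that change (illustrative only: the struct layout, the `decodeHEP` placeholder, and the `json.Config` fields used here are assumptions, not the PR's actual code):

package hep

import (
	"github.com/influxdata/telegraf"
	"github.com/influxdata/telegraf/plugins/parsers/json"
)

type Parser struct {
	MetricName string
	HepHeader  []string

	jsonParser *json.Parser // built once, reused by every Parse() call
}

func newHEPParser(metricName string) (*Parser, error) {
	jp, err := json.New(&json.Config{MetricName: metricName})
	if err != nil {
		return nil, err
	}
	return &Parser{MetricName: metricName, jsonParser: jp}, nil
}

func (h *Parser) Parse(buf []byte) ([]telegraf.Metric, error) {
	hep, err := decodeHEP(buf) // HEP decoding as in the PR (placeholder name)
	if err != nil {
		return nil, err
	}
	// reuse the parser built in newHEPParser instead of calling json.New here
	return h.jsonParser.Parse([]byte(hep.Payload))
}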

}
}

if hep.ProtoType >= 2 && hep.Payload != "" && hep.ProtoType != 100 {
Member

Can we please also get definitions for those magic numbers?


Very valid point, this should have been better commented. HEP protocol types are defined in the HEP draft. Protocol type 100 is for logs/JSON objects, which are the only payload type relevant in the scope of this integration.

Member

Just define those as constants so they have speaking names and people not knowing the protocol in depth (like myself) can still read the code. Adding the link you posted as a comment would also be nice.
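
For illustration, a sketch of what that could look like (the constant names and the comment are suggestions; the numeric values come from the condition above):

// HEP protocol types as defined in the HEP/EEP draft,
// see https://github.com/sipcapture/HEP
const (
	hepProtoTypeSIP = 1   // SIP signalling
	hepProtoTypeLog = 100 // log / JSON payload
)

// in Parse(), instead of the bare literals:
if hep.ProtoType >= 2 && hep.Payload != "" && hep.ProtoType != hepProtoTypeLog {
	// ...
}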

Comment on lines +116 to +131
case 1:
h.ProtoString = "sip"
case 5:
h.ProtoString = "rtcp"
case 34:
h.ProtoString = "rtpagent"
case 35:
h.ProtoString = "rtcpxr"
case 38:
h.ProtoString = "horaclifix"
case 53:
h.ProtoString = "dns"
case 100:
h.ProtoString = "log"
case 112:
h.ProtoString = "alert"
Member

Can you please define those magic numbers similarly to the chunk-type etc?


Protocol strings are the equivalent of a type tag and are injected by HEP senders to further identify the payload type.

Member

I don't doubt any of your words, but it won't hurt to define those numbers as consts and use them as speaking names. You can then even implement a type with the Stringer interface to cover this conversion here.
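
A sketch of that idea (the type and constant names are illustrative; the numeric values and strings are taken from the switch above):

type protoType uint32

const (
	protoSIP        protoType = 1
	protoRTCP       protoType = 5
	protoRTPAgent   protoType = 34
	protoRTCPXR     protoType = 35
	protoHoraclifix protoType = 38
	protoDNS        protoType = 53
	protoLog        protoType = 100
	protoAlert      protoType = 112
)

// String implements fmt.Stringer, so the numeric protocol type can be
// converted to its textual form wherever it is needed.
func (p protoType) String() string {
	switch p {
	case protoSIP:
		return "sip"
	case protoRTCP:
		return "rtcp"
	case protoRTPAgent:
		return "rtpagent"
	case protoRTCPXR:
		return "rtcpxr"
	case protoHoraclifix:
		return "horaclifix"
	case protoDNS:
		return "dns"
	case protoLog:
		return "log"
	case protoAlert:
		return "alert"
	default:
		return "unknown"
	}
}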

if err != nil {
return nil, err
}
metric := m[0]
Member

Why only use the first metric?
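
If the embedded parser may legitimately return several metrics, a sketch of keeping all of them could look like this (assuming the headerTags variable from Parse() and the per-parser jsonParser field sketched above):

metrics, err := h.jsonParser.Parse([]byte(hep.Payload))
if err != nil {
	return nil, err
}
// Keep every metric produced by the embedded parser instead of only m[0],
// attaching the HEP header tags to each of them.
for _, m := range metrics {
	for k, v := range headerTags {
		m.AddTag(k, v)
	}
}
return metrics, nil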

Comment on lines +154 to +155
nFields := make(map[string]interface{})
nFields["protocol_type_field"] = hep.ProtoType
Member

Please name those fields.

Member

Furthermore, please think about what should be a tag and what should be a field. I think the protocol type is rather a tag, as you might want to query for all packets of a certain protocol type. However, the IPs and also the rapidly growing IDs below should really be fields, as otherwise cardinality might explode...
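
Roughly, that split could look like the following (the key names and the accessors on `hep` are illustrative, not the PR's actual ones):

// Low-cardinality values that you filter or group by become tags ...
tags := map[string]string{
	"protocol_type": hep.ProtoString, // e.g. "sip", "rtcp", "log"
}

// ... while high-cardinality values (addresses, call IDs) stay fields,
// so the number of series does not explode.
fields := map[string]interface{}{
	"src_ip":  hep.SrcIP, // assumed accessor names
	"dst_ip":  hep.DstIP,
	"call_id": hep.CID,
}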

nFields := make(map[string]interface{})
nFields["protocol_type_field"] = hep.ProtoType

metric := metric.New(h.MetricName, headerTags, nFields, time.Now())
Member

You don't need the time.Now() part as it is the default if not provided.

Comment on lines +242 to +243
parser, err = newHEPParser(config.MetricName,
config.HepMeasurementName,
Member

I think MetricName and HepMeasurementName denote the same thing and will conflict in the parser. Please remove HepMeasurementName in favor of MetricName.

@@ -232,6 +238,17 @@ func NewParser(config *Config) (Parser, error) {
config.GrokCustomPatternFiles,
config.GrokTimezone,
config.GrokUniqueTimestamp)
case "hep":
parser, err = newHEPParser(config.MetricName,
Member

Maybe directly construct the parser here...

@srebhan srebhan self-assigned this Nov 4, 2021

reimda commented Nov 4, 2021

It seems to me that HEP should be a Telegraf input plugin, not a parser.

Input plugins are data sources. Network protocols are usually implemented in input plugins (http, socket/tcp/udp, mqtt, kafka, amqp). Non-network data sources such as file content (file or tail input) or program output (exec or execd input) are also input plugins.

Parser plugins decode data (like json, csv, influx line protocol) that comes into telegraf through an input plugin. Parser plugins are useful because their formats are used by various inputs.

Take this example of the http listener using the json parser.

[[inputs.http_listener_v2]]
  service_address = ":8080"
  data_format = "json_v2"
    [[inputs.http_listener_v2.json_v2]]
        measurement_name = "json_data"
        measurement_name_path = "gjson/path"
        ...

HEP seems analogous to the HTTP listener in this example: the input, not the parser.


reimda commented Nov 4, 2021

@adubovikov The PR description is empty and there is no issue that describes what your use case or goals are with this PR. Could you open an issue and describe what you want from the integration of telegraf and HEP? I think it would help us reviewers to have a better idea of what you are trying to do.


lmangani commented Nov 4, 2021

@reimda very good points being made, but HEP is an encapsulation protocol and as such it can be transported over any other protocol (TCP/UDP/SCTP/queuing protocols/etc), so we thought it would be best positioned as a decoder in the pipeline. Having it as an input would force it to carry support for protocols Telegraf already offers a comfortable input for.

EDIT: I will provide an issue with the use-case description as requested.


srebhan commented Nov 5, 2021

Regarding the embedded JSON, XML, etc, it might be an option to keep the encapsulated part in a field and then chain another parser, only parsing this field... What do you think?


reimda commented Nov 5, 2021

Keep in mind that inputs already can support multiple protocols. This has been done typically by having the plugin accept a url that specifies the protocol. See socket_listener's service_address setting for a listening input example or mqtt_consumer's servers setting for a connecting client example.

We could have HEP be a parser in telegraf, but it sounds like it would need to own another parser to do the job of parsing the encapsulated data (JSON or another format). I understand that HEP is an encapsulation protocol and it may feel natural to do it this way. Telegraf has never needed a nested parser like this before. There are other encapsulated use cases that telegraf handles with the parser processor.

I see a few options here. To describe them I'll write example configuration for listening on a UDP port for HEP formatted data.

  1. HEP is a parser plugin that owns another parser
[[inputs.socket_listener]]
  service_address = "udp://:9060"
  data_format = "hep"
  [[inputs.socket_listener.hep]]
    data_format = "json_v2"
    [[inputs.socket_listener.hep.json_v2]]
      measurement_name = "json_data_from_hep"
      measurement_name_path = "gjson/path"
      ...

This would introduce a new concept of nested parsers to telegraf users. Nesting tables in TOML is repetitive and tends to confuse users. It could also be a little ugly to implement because telegraf parser settings are implemented as input-level settings.

  2. HEP is an input plugin that has a parser
[[inputs.hep]]
  service_address = "udp://:9060"
  data_format = "json_v2"
  [[inputs.hep.json_v2]]
    measurement_name = "json_data_from_hep"
    measurement_name_path = "gjson/path"
    ...

This requires hep to handle its own transport protocols, but the configuration is the simplest and feels the most familiar to me.

  3. HEP is a parser plugin that returns its encapsulated data as a field which can be decoded by a parser processor
[[inputs.socket_listener]]
  service_address = "udp://:9060"
  data_format = "hep" # produces a field called hep_payload
[[processors.parser]]
  parse_fields = ["hep_payload"]
  data_format = "json_v2"
  [[processors.parser.json_v2]]
    measurement_name = "json_data_from_hep"
    measurement_name_path = "gjson/path"

This feels relatively familiar to a telegraf user.

Thanks @lmangani for working on an issue to describe the use case. Having that should make it easier to decide which way to go.


srebhan commented Nov 8, 2021

Thanks for the very nice visualization @reimda. I agree that option 2 looks the easiest from the user perspective. However, from a maintainer's perspective it is the worst, as we would now fold all possible transports (e.g. http_listener, socket_listener, ...) into one plugin. Already now, socket_listener is a complex piece of software.
Even more severe, we see quite some work on the TLS part for http, so we risk duplicating those changes...

So I see two options from implementation side:

  1. Continue with reimda's option 2 and abstract the transport similarly to what we do in feat: Unify graphics SMI implementations #9815. This would mean the new input plugin can easily configure the transport and expose the config options to the user. This is nice in terms of easing work for other plugins that also want to implement such encapsulation protocols.
  2. Decide for option 3 or a variation thereof. Contrary to option 1, this can be handled by the current structure. You could parse the embedded JSON such that you simply keep every field instead of not parsing at all, but that would limit the parser to embedded JSON. It's up to you @adubovikov, @lmangani.

For reimda's option 1 we would need to restructure the parsers (maybe starting with something similar to #8791) first in order to allow them to specify the TOML options in the parser itself...

What do you guys think?


srebhan commented Feb 4, 2022

@adubovikov any news on this PR?


srebhan commented Jun 22, 2022

Is there still interest in this PR?

@srebhan srebhan added the waiting for response (waiting for response from contributor) label Jun 22, 2022

telegraf-tiger bot commented Jul 7, 2022

Hello! I am closing this issue due to inactivity. I hope you were able to resolve your problem, if not please try posting this question in our Community Slack or Community Page. Thank you!

@telegraf-tiger telegraf-tiger bot closed this Jul 7, 2022