Improving Top-level-Array support in metricbeat [ http module ] #10136
Conversation
Since this is a community submitted pull request, a Jenkins build has not been kicked off automatically. Can an Elastic organization member please verify the contents of this patch and then kick off a build manually?
Thanks for starting this. Can you share an example JSON event that you expect in the case of an array?
Hi @ruflin,

Suppose the Metricbeat http module hits a REST endpoint for a list of connected systems that returns a top-level JSON array. The solution proposed will accept strings, integers, or any other value type and assign it to a key (`_data`).

PS: These examples were inspired by the example and discussion in #9743.
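A minimal Go sketch of the behavior described here, assuming the wrapping logic proposed in this PR (a plain `map[string]interface{}` stands in for libbeat's `common.MapStr`, which has the same underlying type):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// normalizeItem sketches the proposed behavior: a top-level array item
// that is itself a JSON object passes through unchanged, while any bare
// value (string, number, nested array) is wrapped under the "_data" key
// so that every item ends up as a map and can become an event.
func normalizeItem(item interface{}) map[string]interface{} {
	if obj, ok := item.(map[string]interface{}); ok {
		return obj
	}
	return map[string]interface{}{"_data": item}
}

func main() {
	// The mixed array from the PR description.
	raw := `[["storm"], "trooper", 1, {"darth": "vader"}]`
	var items []interface{}
	if err := json.Unmarshal([]byte(raw), &items); err != nil {
		panic(err)
	}
	for _, it := range items {
		fmt.Println(normalizeItem(it))
	}
}
```

This is a simplified illustration of the idea under discussion, not the actual Metricbeat implementation.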
I see in your example above that you even mix array and non-array values. I wonder if we should support that? For the pure array values, I would expect that it's all sent as one array. In Elasticsearch / Lucene every key is automatically an array, so you can send multiple values under a single key.
Hey @ruflin,

☝️ Already done in the existing implementation.

☝️ Will change my PR to incorporate this change ☕️

What about other valid JSON arrays?
@ruflin,
@live-wire Can you share some examples of what …

For the PR review: It's on my list, but I really want to play around with it locally and probably will not get to it this week. Thanks for your patience.
Examples
@live-wire I'm really sorry for the super late reply, this completely slipped through on my end. For the 4 cases that you showed above, we should discuss what the expected Elasticsearch docs are. Here are some thoughts:
What I'm asking myself for the second use case is: what would the queries that you want to run against these values look like?
@live-wire Any updates from you on this one?
Hi @ruflin, sorry for the delayed response.

This is a choice we'd have to make :) My current implementation defines a …
The problem we have if we hard code it is that one of the top level arrays could be …

We could also argue that all the values we expect in an array should always be a number, and we make it a float or double by default. Like this we would even be able to have a template for it. But your example in the PR description is one with keywords, so I guess this is also not an option.
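A sketch of the "numbers only" option discussed here: `encoding/json` decodes every JSON number into a Go `float64`, so a simple check can tell whether a fixed float/double mapping template would be safe for a given array (`allNumeric` is a hypothetical helper, not part of Metricbeat):

```go
package main

import "fmt"

// allNumeric reports whether every decoded array item is a JSON number.
// encoding/json decodes all JSON numbers into float64, so a single type
// assertion covers integers and floats alike.
func allNumeric(items []interface{}) bool {
	for _, it := range items {
		if _, ok := it.(float64); !ok {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(allNumeric([]interface{}{1.0, 2.5, 3.0}))
	// A keyword value like "trooper" would make a float template unsafe.
	fmt.Println(allNumeric([]interface{}{1.0, "trooper"}))
}
```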
Closing this one due to inactivity. I wonder if https://www.elastic.co/guide/en/beats/metricbeat/current/metricbeat-metricset-http-json.html solves this already?
This PR aims to close the issue: #9743
More Context
#6480 added support for arrays as a top-level response. However, the arrays are expected to have items of type `common.MapStr`, which should not be the case: even `[["storm"], "trooper", 1, {"darth": "vader"}]` is a valid JSON response.

When an item in the array is not a `common.MapStr`, it is assigned as a value to the key `_data`. (Is it okay to do it?)

I tried running the integration tests with `test-module` (`$ make MODULE=http test-module`), but that seems to get stuck. I noticed that TravisCI also skips integration checks. (The test always times out while trying to kill old services in the function EnsureUpWithTimeout.) How do you suggest I run the integration tests? 😓

Is this fix acceptable? 👨‍🚒