Add filter structure to the Sumo Logic exporter #1


sumo-drosiek
Owner

Depends on open-telemetry#1565

// GetMetadata builds string which represents metadata in alphabetical order
func (f *filter) GetMetadata(attributes pdata.AttributeMap) Fields {
	attrs := f.filter(attributes)
	metadata := make([]string, 0, len(attrs))
Collaborator

Not sure if it changes much, but perhaps a pre-sized slice filled by index could be leveraged for performance reasons? I.e. something along the lines of:

metadata := make([]string, len(attrs))
i := 0
for k, v := range attrs {
  metadata[i] = ...
  i++
}

Collaborator

I tested this myself a while ago; the difference is marginal, but it's there.
I'm OK with both approaches.

Owner Author

I think it doesn't matter too much: the slice is created with capacity len(attrs), so append does not have to reallocate memory, and the current approach is clearer.
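To make the two variants discussed in this thread concrete, here is a minimal sketch of both. The `key=value` formatting and the plain `map[string]string` input are assumptions made for illustration; the original loop body is elided in the review comment, and the real code works on `pdata.AttributeMap`.

```go
package main

import (
	"fmt"
	"sort"
)

// buildWithAppend mirrors the approach in the PR: the slice starts with
// length 0 and capacity len(attrs), so append never has to grow it.
// The "key=value" format is an assumption for this sketch.
func buildWithAppend(attrs map[string]string) []string {
	metadata := make([]string, 0, len(attrs))
	for k, v := range attrs {
		metadata = append(metadata, fmt.Sprintf("%s=%s", k, v))
	}
	sort.Strings(metadata) // alphabetical order, as the doc comment describes
	return metadata
}

// buildWithIndex mirrors the reviewer's suggestion: the slice is created
// at full length and each element is assigned by index.
func buildWithIndex(attrs map[string]string) []string {
	metadata := make([]string, len(attrs))
	i := 0
	for k, v := range attrs {
		metadata[i] = fmt.Sprintf("%s=%s", k, v)
		i++ // i++ is a statement in Go, so it cannot be used inside the index
	}
	sort.Strings(metadata)
	return metadata
}

func main() {
	attrs := map[string]string{"host": "a", "cluster": "b"}
	fmt.Println(buildWithAppend(attrs)) // [cluster=b host=a]
	fmt.Println(buildWithIndex(attrs))  // [cluster=b host=a]
}
```

Both functions allocate the backing array exactly once, which is consistent with the observation above that any difference is marginal.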

@sumo-drosiek sumo-drosiek merged this pull request into drosiek-sumologic-output-filtering Nov 23, 2020
@sumo-drosiek sumo-drosiek deleted the drosiek-sumologic-output-filtering-pr branch November 23, 2020 13:19
djaglowski added a commit that referenced this pull request Jun 10, 2022
…y#9224)

* add vcenter vSAN collection

* checkpoint on getting property collection working

* checkpoint before integration test

* dual receivers under root receiver pointer

* checkpoint before updated mdatagen

* use syslog receiver rather than tcplogreceiver

* getting more performance counter refinements

* remove unnecessary component addition

* try to fix go.mod resolution issues

* try to fix go.mod resolution issues pt 2

* addlicense

* fix go.mod by fixing require directive

* add readme for metrics

* update readme

* fix go.mod referring nonexistent version

* add performance manager tests

* more tests

* add more attributes to virtual machines and host systems

* add more attributes to virtual machines and host systems

* spike changelog entry

* fix go.mod in both places

* fix go.mod in configschema

* add // import github.com/open-telemetry/opentelemetry-collector-contrib/receiver/vmwarevcenterreceiver to imports

* add quotations

* add to receiver lifecycle

* remove extra go generate direction

* fix typo of utilizaiton in metric description

* small changes to interval id in performance queries to be more consistent

* PR feedback including omitting company name prefix

* PR feedback to not fail starting the component on potential network failures

* minor grammar correction in vcenter readme

* update expected metrics

* update host_effective attribute value

* remove PerformanceInterval customizability

* add to codeowners

* fix indentation on merge conflict

* fix changelog entry place so its in the new components section

* update to be on 0.49.0 of the collector

* add PR number to changelog

* regenerate with newer version of mdatagen

* move error log if unable to connect on start to receiver.Start() rather than scraper.Start()

* fix test cases from last commit

* minor update to config with tests

* fix metric description

* use utc for host vsan collection as well

* update comments of public facing methods

* return errors on getting clusters to the scraper errors

* PR feedback #1

* instantiate new client if client is nil

* update all descriptions to have punctuation

* three more descs

* move ensureReceiver up to once we validated as a config

* some more PR feedback

* looking into race conditions

* run go tidy

* fix import order and remove unnecessary mutex

* remove mutex from struct

* refactor client to be responsible for knowing if the vsan endpoints are reachable

* fix integration test referencing old var

* change metrics.metrics => metrics.settings, update client pr feedback

* remove vSAN collection temporarily

* remove extra metric attributes for vSAN

* remove vsan specific variables

* clean up host PerfCounter disk latency metrics and fix some descriptions to better reflect interval

* add 20s interval to extended documentation as needed

* mdatagen fixes

* add integration test metric scrape

* fix import order

* go up to 0.49.1

* gotidy

* add replace directive for semconv

* gotidy fixes

* fix component not being on 0.50.0

* update to v0.50.1-0.20220429151328-041f39835df7

* use newer mdatagen

* remove any logging functionality change && update documentation

* fix integration test from flattening of config

* fix scraper start not erroring if connection cannot be established

* make scrapertest less flaky

* format test json

* Apply suggestions from code review

Co-authored-by: Daniel Jaglowski <jaglows3@gmail.com>

* adjust metric definition for vcenter.host.disk.throughput

* remove comment and move pm level 2 metrics to appropriate section

* try to be respective of datacenters

* fix only vCenter server functionality

* try building out a mock server for test coverage

* make goporto

* fix build issues

* use latest mdatagen

* add newlines to ends of xml recordings

* fix integration test

* moved around scrapererrors because now the receiver is datacenter dependent

* try and do an audit of performance metrics and requests/responses

* update testdata with correct units

* make tidy

* make tidy

* update collector version

* fix local testing code including modules

* remove deprecated use of commonponenterror

* pr feedback; add method of collection recording, return poweredOn/poweredOff VMs

* remove content.json

* fix description change in scraper_test.go

* update collector version

* bump replaced module; rebuild load tests

* fix alibaba version auto localizing

Co-authored-by: Daniel Jaglowski <jaglows3@gmail.com>
sumo-drosiek pushed a commit that referenced this pull request Aug 21, 2023
…emetry#24676)

**Description:** The metadata.yml for the SSH check receiver currently
documents a resource attribute containing the SSH endpoint but this is
not emitted. This PR updates the receiver to include this resource
attribute.

**Link to tracking Issue:** open-telemetry#24441 

**Testing:**

Example collector config:
```yaml
receivers:
  sshcheck:
    endpoint: 13.245.150.131:22
    username: ec2-user
    key_file: /Users/dewald.dejager/.ssh/sandbox.pem
    collection_interval: 15s
    known_hosts: /Users/dewald.dejager/.ssh/known_hosts
    ignore_host_key: false
    resource_attributes:
      "ssh.endpoint":
        enabled: true

exporters:
  logging:
    verbosity: detailed
  prometheus:
    endpoint: 0.0.0.0:8081
    resource_to_telemetry_conversion:
      enabled: true

service:
  pipelines:
    metrics:
      receivers: [sshcheck]
      exporters: [logging, prometheus]
```

The log output looks like this:
```
2023-07-30T16:52:38.724+0200    info    MetricsExporter {"kind": "exporter", "data_type": "metrics", "name": "logging", "resource metrics": 1, "metrics": 2, "data points": 2}
2023-07-30T16:52:38.724+0200    info    ResourceMetrics #0
Resource SchemaURL: 
Resource attributes:
     -> ssh.endpoint: Str(13.245.150.131:22)
ScopeMetrics #0
ScopeMetrics SchemaURL: 
InstrumentationScope otelcol/sshcheckreceiver 0.82.0-dev
Metric #0
Descriptor:
     -> Name: sshcheck.duration
     -> Description: Measures the duration of SSH connection.
     -> Unit: ms
     -> DataType: Gauge
NumberDataPoints #0
StartTimestamp: 2023-07-30 14:52:22.381672 +0000 UTC
Timestamp: 2023-07-30 14:52:38.404003 +0000 UTC
Value: 319
Metric #1
Descriptor:
     -> Name: sshcheck.status
     -> Description: 1 if the SSH client successfully connected, otherwise 0.
     -> Unit: 1
     -> DataType: Sum
     -> IsMonotonic: false
     -> AggregationTemporality: Cumulative
NumberDataPoints #0
StartTimestamp: 2023-07-30 14:52:22.381672 +0000 UTC
Timestamp: 2023-07-30 14:52:38.404003 +0000 UTC
Value: 1
```

And the Prometheus metrics look like this:
```
# HELP sshcheck_duration Measures the duration of SSH connection.
# TYPE sshcheck_duration gauge
sshcheck_duration{ssh_endpoint="13.245.150.131:22"} 311
# HELP sshcheck_status 1 if the SSH client successfully connected, otherwise 0.
# TYPE sshcheck_status gauge
sshcheck_status{ssh_endpoint="13.245.150.131:22"} 1
```
sumo-drosiek pushed a commit that referenced this pull request Aug 21, 2023
)

**Description:** 

Adding command line argument `--status-code` to `telemetrygen traces`,
which accepts `(Unset,Error,Ok)` (case sensitive) or the enum equivalent
of `(0,1,2)`.

Running 

```shell
telemetrygen traces --otlp-insecure --traces 1 --status-code 1
```

against a minimal local collector yields

```txt
2023-07-29T21:27:57.862+0100	info	ResourceSpans #0
Resource SchemaURL: https://opentelemetry.io/schemas/1.4.0
Resource attributes:
     -> service.name: Str(telemetrygen)
ScopeSpans #0
ScopeSpans SchemaURL:
InstrumentationScope telemetrygen
Span #0
    Trace ID       : f6dc4be32c78b9999c69d504a79e68c1
    Parent ID      : 4e2cd6e0e90cf2ea
    ID             : 20835413e32d26a5
    Name           : okey-dokey
    Kind           : Server
    Start time     : 2023-07-29 20:27:57.861602 +0000 UTC
    End time       : 2023-07-29 20:27:57.861726 +0000 UTC
    Status code    : Error
    Status message :
Attributes:
     -> net.peer.ip: Str(1.2.3.4)
     -> peer.service: Str(telemetrygen-client)
Span #1
    Trace ID       : f6dc4be32c78b9999c69d504a79e68c1
    Parent ID      :
    ID             : 4e2cd6e0e90cf2ea
    Name           : lets-go
    Kind           : Client
    Start time     : 2023-07-29 20:27:57.861584 +0000 UTC
    End time       : 2023-07-29 20:27:57.861726 +0000 UTC
    Status code    : Error
    Status message :
Attributes:
     -> net.peer.ip: Str(1.2.3.4)
     -> peer.service: Str(telemetrygen-server)
```

and similarly (the string version)

```shell
telemetrygen traces --otlp-insecure --traces 1 --status-code '"Ok"'
```

produces 

```txt
Resource SchemaURL: https://opentelemetry.io/schemas/1.4.0
Resource attributes:
     -> service.name: Str(telemetrygen)
ScopeSpans #0
ScopeSpans SchemaURL:
InstrumentationScope telemetrygen
Span #0
    Trace ID       : dfd830da170acfe567b12f87685d7917
    Parent ID      : 8e15b390dc6a1ccc
    ID             : 165c300130532072
    Name           : okey-dokey
    Kind           : Server
    Start time     : 2023-07-29 20:29:16.026965 +0000 UTC
    End time       : 2023-07-29 20:29:16.027089 +0000 UTC
    Status code    : Ok
    Status message :
Attributes:
     -> net.peer.ip: Str(1.2.3.4)
     -> peer.service: Str(telemetrygen-client)
Span #1
    Trace ID       : dfd830da170acfe567b12f87685d7917
    Parent ID      :
    ID             : 8e15b390dc6a1ccc
    Name           : lets-go
    Kind           : Client
    Start time     : 2023-07-29 20:29:16.026956 +0000 UTC
    End time       : 2023-07-29 20:29:16.027089 +0000 UTC
    Status code    : Ok
    Status message :
Attributes:
     -> net.peer.ip: Str(1.2.3.4)
     -> peer.service: Str(telemetrygen-server)
```

The default is `Unset` which is the current behaviour.
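
As an illustration of the accepted inputs described above, the following is a minimal sketch of a case-sensitive parser that takes either the names `Unset`, `Error`, `Ok` or the numbers `0`, `1`, `2`. The type `statusCode` and the function `parseStatusCode` are hypothetical names for this sketch and are not taken from telemetrygen. Trimming surrounding double quotes is an assumption made so that the quoted form used in the second example is accepted.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// statusCode is a hypothetical stand-in for a span status enum.
type statusCode int

const (
	statusUnset statusCode = iota // 0
	statusError                   // 1
	statusOk                      // 2
)

// parseStatusCode accepts "Unset", "Error", "Ok" (case sensitive) or the
// numeric equivalents "0", "1", "2". Anything else is rejected.
func parseStatusCode(s string) (statusCode, error) {
	s = strings.Trim(s, `"`) // assumption: tolerate a quoted value such as "Ok"
	switch s {
	case "Unset":
		return statusUnset, nil
	case "Error":
		return statusError, nil
	case "Ok":
		return statusOk, nil
	}
	n, err := strconv.Atoi(s)
	if err != nil || n < 0 || n > 2 {
		return statusUnset, fmt.Errorf("invalid status code %q: expected Unset, Error, Ok or 0, 1, 2", s)
	}
	return statusCode(n), nil
}

func main() {
	for _, in := range []string{"1", "Ok", `"Ok"`, "ok", "3"} {
		code, err := parseStatusCode(in)
		fmt.Printf("%-5s -> code=%d err=%v\n", in, code, err)
	}
}
```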

**Link to tracking Issue:**

24286

**Testing:**

Added unit tests which cover both valid and invalid inputs.

**Documentation:**

Command line arguments are self-documenting via the usage info in the
flag.

Co-authored-by: Pablo Baeyens <pbaeyens31+github@gmail.com>
sumo-drosiek pushed a commit that referenced this pull request Nov 27, 2023
open-telemetry#29116)

**Description:** 

As originally proposed in open-telemetry#26991 before I got distracted

Exposes the duration of generated spans as a command line parameter. It
uses a `DurationVar` flag so units can be easily provided and are
automatically applied.

Example usage:

```bash
telemetrygen traces --traces 100 --otlp-insecure --span-duration 10ns # nanoseconds
telemetrygen traces --traces 100 --otlp-insecure --span-duration 10us # microseconds
telemetrygen traces --traces 100 --otlp-insecure --span-duration 10ms # milliseconds
telemetrygen traces --traces 100 --otlp-insecure --span-duration 10s # seconds
```

**Testing:** 

Ran without the argument provided (`telemetrygen traces --traces 1
--otlp-insecure`) and saw spans published with the default value.

Ran again with the argument provided: `telemetrygen traces --traces 1
--otlp-insecure --span-duration 1s`

And observed the expected output:

```
Resource SchemaURL: https://opentelemetry.io/schemas/1.4.0
Resource attributes:
     -> service.name: Str(telemetrygen)
ScopeSpans #0
ScopeSpans SchemaURL: 
InstrumentationScope telemetrygen 
Span #0
    Trace ID       : 8b441587ffa5820688b87a6b511d634c
    Parent ID      : 39faad428638791b
    ID             : 88f0886894bd4ee2
    Name           : okey-dokey
    Kind           : Server
    Start time     : 2023-11-12 02:05:07.97443 +0000 UTC
    End time       : 2023-11-12 02:05:08.97443 +0000 UTC
    Status code    : Unset
    Status message : 
Attributes:
     -> net.peer.ip: Str(1.2.3.4)
     -> peer.service: Str(telemetrygen-client)
Span #1
    Trace ID       : 8b441587ffa5820688b87a6b511d634c
    Parent ID      : 
    ID             : 39faad428638791b
    Name           : lets-go
    Kind           : Client
    Start time     : 2023-11-12 02:05:07.97443 +0000 UTC
    End time       : 2023-11-12 02:05:08.97443 +0000 UTC
    Status code    : Unset
    Status message : 
Attributes:
     -> net.peer.ip: Str(1.2.3.4)
     -> peer.service: Str(telemetrygen-server)
	{"kind": "exporter", "data_type": "traces", "name": "debug"}
```

**Documentation:** No documentation added.

---------

Co-authored-by: Pablo Baeyens <pbaeyens31+github@gmail.com>
sumo-drosiek pushed a commit that referenced this pull request May 14, 2024
**Description:**
This PR implements the new container logs parser as proposed in
open-telemetry#31959.

**Link to tracking Issue:** open-telemetry#31959

**Testing:**

Added unit tests. Providing manual testing steps as well:

### How to test this manually

1. Using the following config file:
```yaml
receivers:
  filelog:
    start_at: end
    include_file_name: false
    include_file_path: true
    include:
    - /var/log/pods/*/*/*.log
    operators:
      - id: container-parser
        type: container
        output: m1
      - type: move
        id: m1
        from: attributes.k8s.pod.name
        to: attributes.val
      - id: some
        type: add
        field: attributes.key2.key_in
        value: val2

exporters:
  debug:
    verbosity: detailed

service:
  pipelines:
    logs:
      receivers: [filelog]
      exporters: [debug]
      processors: []
```
2. Start the collector:
`./bin/otelcontribcol_linux_amd64 --config
~/otelcol/container_parser/config.yaml`
3. Use the following bash script to create some logs:
```bash
#! /bin/bash

echo '2024-04-13T07:59:37.505201169-05:00 stdout P This is a very very long crio line th' >> /var/log/pods/kube-scheduler-kind-control-plane_49cc7c1fd3702c40b2686ea7486091d3/kube-scheduler43/1.log
echo '{"log":"INFO: log line here","stream":"stdout","time":"2029-03-30T08:31:20.545192187Z"}' >> /var/log/pods/kube-controller-kind-control-plane_49cc7c1fd3702c40b2686ea7486091d6/kube-controller/1.log
echo '2024-04-13T07:59:37.505201169-05:00 stdout F at is awesome! crio is awesome!' >> /var/log/pods/kube-scheduler-kind-control-plane_49cc7c1fd3702c40b2686ea7486091d3/kube-scheduler43/1.log
echo '2021-06-22T10:27:25.813799277Z stdout P some containerd log th' >> /var/log/pods/kube-scheduler-kind-control-plane_49cc7c1fd3702c40b2686ea7486091d3/kube-scheduler44/1.log
echo '{"log":"INFO: another log line here","stream":"stdout","time":"2029-03-30T08:31:20.545192187Z"}' >> /var/log/pods/kube-controller-kind-control-plane_49cc7c1fd3702c40b2686ea7486091d6/kube-controller/1.log
echo '2021-06-22T10:27:25.813799277Z stdout F at is super awesome! Containerd is awesome' >> /var/log/pods/kube-scheduler-kind-control-plane_49cc7c1fd3702c40b2686ea7486091d3/kube-scheduler44/1.log



echo '2024-04-13T07:59:37.505201169-05:00 stdout F standalone crio line which is awesome!' >> /var/log/pods/kube-scheduler-kind-control-plane_49cc7c1fd3702c40b2686ea7486091d3/kube-scheduler43/1.log
echo '2021-06-22T10:27:25.813799277Z stdout F standalone containerd line that is super awesome!' >> /var/log/pods/kube-scheduler-kind-control-plane_49cc7c1fd3702c40b2686ea7486091d3/kube-scheduler44/1.log
```
4. Run the above as a bash script to verify any parallel processing.
Verify that the output is correct.


### Test manually on k8s

1. `make docker-otelcontribcol && docker tag otelcontribcol
otelcontribcol-dev:0.0.1 && kind load docker-image
otelcontribcol-dev:0.0.1`
2. Install using the following helm values file:
```yaml
mode: daemonset
presets:
  logsCollection:
    enabled: true

image:
  repository: otelcontribcol-dev
  tag: "0.0.1"
  pullPolicy: IfNotPresent

command:
  name: otelcontribcol

config:
  exporters:
    debug:
      verbosity: detailed
  receivers:
    filelog:
      start_at: end
      include_file_name: false
      include_file_path: true
      exclude:
        - /var/log/pods/default_daemonset-opentelemetry-collector*_*/opentelemetry-collector/*.log
      include:
        - /var/log/pods/*/*/*.log
      operators:
        - id: container-parser
          type: container
          output: some
        - id: some
          type: add
          field: attributes.key2.key_in
          value: val2


  service:
    pipelines:
      logs:
        receivers: [filelog]
        processors: [batch]
        exporters: [debug]
```
3. Check collector's output to verify the logs are parsed properly:
```console
2024-05-10T07:52:02.307Z	info	LogsExporter	{"kind": "exporter", "data_type": "logs", "name": "debug", "resource logs": 1, "log records": 2}
2024-05-10T07:52:02.307Z	info	ResourceLog #0
Resource SchemaURL: 
ScopeLogs #0
ScopeLogs SchemaURL: 
InstrumentationScope  
LogRecord #0
ObservedTimestamp: 2024-05-10 07:52:02.046236071 +0000 UTC
Timestamp: 2024-05-10 07:52:01.92533954 +0000 UTC
SeverityText: 
SeverityNumber: Unspecified(0)
Body: Str(otel logs at 07:52:01)
Attributes:
     -> log: Map({"iostream":"stdout"})
     -> time: Str(2024-05-10T07:52:01.92533954Z)
     -> k8s: Map({"container":{"name":"busybox","restart_count":"0"},"namespace":{"name":"default"},"pod":{"name":"daemonset-logs-6f6mn","uid":"1069e46b-03b2-4532-a71f-aaec06c0197b"}})
     -> logtag: Str(F)
     -> key2: Map({"key_in":"val2"})
     -> log.file.path: Str(/var/log/pods/default_daemonset-logs-6f6mn_1069e46b-03b2-4532-a71f-aaec06c0197b/busybox/0.log)
Trace ID: 
Span ID: 
Flags: 0
LogRecord #1
ObservedTimestamp: 2024-05-10 07:52:02.046411602 +0000 UTC
Timestamp: 2024-05-10 07:52:02.027386192 +0000 UTC
SeverityText: 
SeverityNumber: Unspecified(0)
Body: Str(otel logs at 07:52:02)
Attributes:
     -> log.file.path: Str(/var/log/pods/default_daemonset-logs-6f6mn_1069e46b-03b2-4532-a71f-aaec06c0197b/busybox/0.log)
     -> time: Str(2024-05-10T07:52:02.027386192Z)
     -> log: Map({"iostream":"stdout"})
     -> logtag: Str(F)
     -> k8s: Map({"container":{"name":"busybox","restart_count":"0"},"namespace":{"name":"default"},"pod":{"name":"daemonset-logs-6f6mn","uid":"1069e46b-03b2-4532-a71f-aaec06c0197b"}})
     -> key2: Map({"key_in":"val2"})
Trace ID: 
Span ID: 
Flags: 0
...
```
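
The `k8s` values in the output above match the components of `log.file.path`. The following is a minimal sketch of extracting them from a path of the form `/var/log/pods/<namespace>_<pod>_<uid>/<container>/<restart>.log`. The regular expression, the `podInfo` type and the function `parsePodLogPath` are assumptions written for this illustration and are not the parser's actual implementation. Paths with a different number of underscore-separated parts are rejected by this sketch.

```go
package main

import (
	"fmt"
	"regexp"
)

// podLogPath captures namespace, pod name, pod UID, container name and
// restart count from a pod log file path. Hypothetical pattern for this sketch.
var podLogPath = regexp.MustCompile(`^/var/log/pods/([^_/]+)_([^_/]+)_([^/]+)/([^/]+)/(\d+)\.log$`)

// podInfo holds the metadata recovered from the path.
type podInfo struct {
	Namespace    string
	PodName      string
	PodUID       string
	Container    string
	RestartCount string
}

// parsePodLogPath returns the metadata encoded in path, or an error if the
// path does not have the expected shape.
func parsePodLogPath(path string) (podInfo, error) {
	m := podLogPath.FindStringSubmatch(path)
	if m == nil {
		return podInfo{}, fmt.Errorf("unexpected pod log path: %q", path)
	}
	return podInfo{
		Namespace:    m[1],
		PodName:      m[2],
		PodUID:       m[3],
		Container:    m[4],
		RestartCount: m[5],
	}, nil
}

func main() {
	info, err := parsePodLogPath("/var/log/pods/default_daemonset-logs-6f6mn_1069e46b-03b2-4532-a71f-aaec06c0197b/busybox/0.log")
	fmt.Printf("%+v err=%v\n", info, err)
}
```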


**Documentation:** Added

Signed-off-by: ChrsMark <chrismarkou92@gmail.com>
sumo-drosiek pushed a commit that referenced this pull request Aug 23, 2024
…try#33225)

**Description:**
For the DB span example below, the X-Ray exporter failed to generate the
expected DB call subsegment names because it could not parse JDBC
connection strings that start with the `jdbc:` prefix.
```
Span #1
    Trace ID       : 663a0b68a5e3849c09c07f914b3df738
    Parent ID      : 1052e2a4a2516884
    ID             : 374de78b552e23c2
    Name           : orders@no-appsignals-mysql-1.cnkqok6c8mo1.eu-west-1.rds.amazonaws.com
    Kind           : Client
    Start time     : 2024-05-07 11:07:20.62 +0000 UTC
    End time       : 2024-05-07 11:07:20.624 +0000 UTC
    Status code    : Unset
    Status message :
Attributes:
     -> db.connection_string: Str(jdbc:mysql://no-appsignals-mysql-1.cnkqok6c8mo1.eu-west-1.rds.amazonaws.com:3306)
     -> db.name: Str(orders)
     -> db.system: Str(MySQL)
     -> db.user: Str(myuser@10.0.149.233)
```

**Link to tracking Issue:** <Issue number if applicable>

**Testing:** Local tests.
sumo-drosiek pushed a commit that referenced this pull request Aug 23, 2024
…pen-telemetry#33353)

**Description:**
The container parser should add k8s metadata as resource attributes and not
as log record attributes.

**Link to tracking Issue:** Fixes open-telemetry#33341

**Testing:**
Manual testing on local k8s cluster:

```console
2024-06-04T06:40:08.219Z	info	ResourceLog #0
Resource SchemaURL: 
Resource attributes:
     -> k8s.pod.uid: Str(d5ecc924-e255-4525-b5be-6437939b1e4d)
     -> k8s.container.name: Str(busybox)
     -> k8s.namespace.name: Str(default)
     -> k8s.pod.name: Str(daemonset-logs-dhzcq)
     -> k8s.container.restart_count: Str(0)
ScopeLogs #0
ScopeLogs SchemaURL: 
InstrumentationScope  
LogRecord #0
ObservedTimestamp: 2024-06-04 06:40:08.007370503 +0000 UTC
Timestamp: 2024-06-04 06:40:07.855932421 +0000 UTC
SeverityText: 
SeverityNumber: Unspecified(0)
Body: Str(otel logs at 06:40:07)
Attributes:
     -> logtag: Str(F)
     -> key2: Map({"key_in":"val2"})
     -> log.file.path: Str(/var/log/pods/default_daemonset-logs-dhzcq_d5ecc924-e255-4525-b5be-6437939b1e4d/busybox/0.log)
     -> time: Str(2024-06-04T06:40:07.855932421Z)
     -> log.iostream: Str(stdout)
Trace ID: 
Span ID: 
Flags: 0
LogRecord #1
ObservedTimestamp: 2024-06-04 06:40:08.007451031 +0000 UTC
Timestamp: 2024-06-04 06:40:07.957875321 +0000 UTC
SeverityText: 
SeverityNumber: Unspecified(0)
Body: Str(otel logs at 06:40:07)
Attributes:
     -> log.file.path: Str(/var/log/pods/default_daemonset-logs-dhzcq_d5ecc924-e255-4525-b5be-6437939b1e4d/busybox/0.log)
     -> log.iostream: Str(stdout)
     -> time: Str(2024-06-04T06:40:07.957875321Z)
     -> key2: Map({"key_in":"val2"})
     -> logtag: Str(F)
Trace ID: 
Span ID: 
Flags: 0
```

**Documentation:** <Describe the documentation added.> ~

---------

Signed-off-by: ChrsMark <chrismarkou92@gmail.com>
sumo-drosiek pushed a commit that referenced this pull request Aug 23, 2024
…try.log_response_body` config (open-telemetry#33854)

**Description:**
- Add `telemetry.log_request_body` and `telemetry.log_response_body`
config for debugging. Debug log will contain field `request_body` and/or
`response_body` in the same log line instead of separate lines to avoid
interleaved log lines.
- Change "Request failed" log level to debug.

Output:
```
2024-07-02T14:09:24.983+0100	debug	elasticsearchexporter/elasticsearch_bulk.go:67	Request roundtrip completed.	{"kind": "exporter", "data_type": "logs", "name": "elasticsearch", "response_body": "{\"version\":{\"number\":\"1.2.3\"}}\n", "path": "/", "method": "GET", "duration": 0.000865486, "status": "200 OK"}
2024-07-02T14:09:24.984+0100	debug	elasticsearchexporter/elasticsearch_bulk.go:67	Request roundtrip completed.	{"kind": "exporter", "data_type": "logs", "name": "elasticsearch", "request_body": "{\"create\":{\"_index\":\"logs-test-idx\"}}\n{\"@timestamp\":\"2024-07-02T13:09:24.970187592Z\",\"Attributes\":{\"a\":\"test\",\"b\":5,\"batch_index\":\"batch_1\",\"c\":3,\"d\":true,\"item_index\":\"item_1\"},\"Body\":\"Load Generator Counter #0\",\"Scope\":{\"name\":\"\",\"version\":\"\"},\"SeverityNumber\":11,\"SeverityText\":\"INFO3\",\"TraceFlags\":1}\n{\"create\":{\"_index\":\"logs-test-idx\"}}\n{\"@timestamp\":\"2024-07-02T13:09:24.970187592Z\",\"Attributes\":{\"a\":\"test\",\"b\":5,\"batch_index\":\"batch_1\",\"c\":3,\"d\":true,\"item_index\":\"item_2\"},\"Body\":\"Load Generator Counter #1\",\"Scope\":{\"name\":\"\",\"version\":\"\"},\"SeverityNumber\":11,\"SeverityText\":\"INFO3\",\"TraceFlags\":1}\n", "response_body": "{\"took\":0,\"errors\":false,\"items\":[{\"create\":{\"_index\":\"logs-test-idx\",\"_id\":\"\",\"_version\":0,\"result\":\"\",\"status\":201,\"_seq_no\":0,\"_primary_term\":0,\"_shards\":{\"total\":0,\"successful\":0,\"failed\":0},\"error\":{\"type\":\"\",\"reason\":\"\",\"caused_by\":{\"type\":\"\",\"reason\":\"\"}}}},{\"create\":{\"_index\":\"logs-test-idx\",\"_id\":\"\",\"_version\":0,\"result\":\"\",\"status\":201,\"_seq_no\":0,\"_primary_term\":0,\"_shards\":{\"total\":0,\"successful\":0,\"failed\":0},\"error\":{\"type\":\"\",\"reason\":\"\",\"caused_by\":{\"type\":\"\",\"reason\":\"\"}}}}]}\n", "path": "/_bulk", "method": "POST", "duration": 0.000539979, "status": "200 OK"}
```

Required config to log
```
exporters:
  elasticsearch:
    telemetry:
      log_request_body: true
      log_response_body: true
    
service:
  telemetry:
    logs:
      level: debug
```

For easier analysis, limit the request body size: set `num_workers` to 1
and lower `flush.bytes` and/or `flush.interval`.

**Link to tracking Issue:** <Issue number if applicable>

**Testing:**

Manually verified with a modified integration test.

**Documentation:** <Describe the documentation added.>