Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[bug] Null attributes not serialized to protobuf correctly, breaking Loki #6138

Open
johnnyggalt opened this issue Feb 17, 2025 · 3 comments · May be fixed by #6149
Open

[bug] Null attributes not serialized to protobuf correctly, breaking Loki #6138

johnnyggalt opened this issue Feb 17, 2025 · 3 comments · May be fixed by #6149
Assignees
Labels
bug Something isn't working needs-triage New issues which have not been classified or triaged by a community member pkg:OpenTelemetry Issues related to OpenTelemetry NuGet package

Comments

@johnnyggalt
Copy link

johnnyggalt commented Feb 17, 2025

Package

OpenTelemetry

Package Version

Package Name Version
OpenTelemetry.Exporter.OpenTelemetryProtocol 1.11.1
OpenTelemetry.Exporter.Prometheus.AspNetCore 1.11.0-beta.1
OpenTelemetry.Extensions.Hosting 1.11.1
OpenTelemetry.Instrumentation.AspNetCore 1.11.0
OpenTelemetry.Instrumentation.Process 1.11.0-beta.1
OpenTelemetry.Instrumentation.Runtime 1.11.0

Runtime Version

net9.0

Description

As discovered via this issue, the TagWriter class of the OpenTelemetry.Exporter.OpenTelemetryProtocol package is misbehaving when a LogRecord has an attribute with a null value. Instead of writing the attribute's key followed by an empty (missing) value, it skips the attribute altogether (see this code). This results in what I assume is an invalid serialized object and it ends up breaking Loki when it receives such a log record.

To repro this, it was as simple as doing this and forwarding it onto Loki:

logger.LogInformation("A null value: {Value}", null)

Steps to Reproduce

  1. Stand up Loki instance. I'm using grafana/otel-lgtm:latest and mapping a volume to a local directory so I can quickly and easily delete all its storage and start afresh. See my docker compose service below
  2. dotnet new console
  3. dotnet add package Microsoft.Extensions.Hosting
  4. dotnet add package OpenTelemetry.Exporter.OpenTelemetryProtocol
  5. In Program.cs, copy/paste the below code
  6. Update the exporter endpoint URL as necessary
  7. Run the program and confirm log entries appear in Loki:
    1. Open Grafana dashboard
    2. Go to Explore
    3. Click the Label drop-down and choose service_name
    4. Click the Value drop-down and choose the value there
    5. Click the Run Query button as necessary
  8. Uncomment the line of code that logs a null value and re-execute the repro
  9. Go back to Grafana and attempt to refresh the logs. It should spin for a while before failing

Docker Compose service

  monitoring:
    image: grafana/otel-lgtm:latest
    ports:
      - 3001:3000 # Grafana admin
      - 4317:4317 # OpenTelemetry GRPC ingestion
      - 9091:9090 # Prometheus admin
    volumes:
      - ./path/to/grafana_data/:/data

Program.cs:

using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Logging;
using Microsoft.Extensions.Hosting;
using OpenTelemetry.Logs;

var hostBuilder =
    Host
        .CreateDefaultBuilder()
        .ConfigureServices(services =>
            services
                .AddLogging(logging =>
                    logging
                        .AddOpenTelemetry(openTelemetry =>
                            openTelemetry
                                .AddOtlpExporter(exporter =>
                                    exporter.Endpoint = new Uri("http://localhost:4317")
                                )
                            )
                    )
                .AddHostedService<HostedService>()
        );

hostBuilder.RunConsoleAsync().Wait();

public class HostedService(ILogger<HostedService> logger) : IHostedService
{
    public Task StartAsync(CancellationToken cancellationToken)
    {
        logger.LogInformation("Hosted Service is starting.");

        string? value = null;
        //logger.LogInformation("Here is a null {Value}", value);

        return Task.CompletedTask;
    }

    public Task StopAsync(CancellationToken cancellationToken)
    {
        return Task.CompletedTask;
    }
}

Expected Result

I should see the log entry in Grafana's log viewer.

Actual Result

Once the poisoned log entry is submitted to the GRPC endpoint, the Grafana log viewer will spin for a while before displaying an error message:

Image

Error message as text for search purposes:

failed to parse series labels to categorize labels: 1:2: parse error: unexpected "=" in label set, expected identifier or "}"

Additional Context

No response

@johnnyggalt johnnyggalt added bug Something isn't working needs-triage New issues which have not been classified or triaged by a community member labels Feb 17, 2025
@github-actions github-actions bot added the pkg:OpenTelemetry Issues related to OpenTelemetry NuGet package label Feb 17, 2025
@sergey-v9
Copy link

I encountered the same issue.
This problem was introduced in 1.11.0 when transitioning from Google.Protobuf to an in-house protobuf serializer.
So downgrading to 1.10.0 is a workaround.

@rajkumar-rangaraj rajkumar-rangaraj self-assigned this Feb 18, 2025
@matt-hensley
Copy link
Contributor

@rajkumar-rangaraj I'm also taking a look at this

@Usergitbit
Copy link

Usergitbit commented Feb 20, 2025

Just ran into this when trying to use the OtlpExporter in YARP with no extra logging apart from what it does by default.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working needs-triage New issues which have not been classified or triaged by a community member pkg:OpenTelemetry Issues related to OpenTelemetry NuGet package
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants