Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: allow setting Datadog syslog.hostname attribute #87

Merged
merged 4 commits into from
Jun 21, 2022

Conversation

btkostner
Copy link
Contributor

What is breaking

When logging to Datadog from a kubernetes cluster, the syslog hostname is set to the container hostname. This means Datadog will try to use the container hostname as the actual host name for metrics, traces, etc. Because of this, nothing will link up successfully. Sadly you can't rewrite that or do some sort of pipeline fix for this.

Why this change fixes it

Removing the syslog hostname means that dd-agent will fall back to its native hostname logic. This will pick up the same hostname when running on a regular server (non containerized), and it will pick up the correct (node) host name when running in a container (kubernetes).

@coveralls
Copy link

coveralls commented May 23, 2022

Coverage Status

Coverage increased (+0.2%) to 75.494% when pulling fe26628 on btkostner:patch-1 into 349c817 on Nebo15:master.

@AndrewDryga
Copy link
Member

AndrewDryga commented May 30, 2022

Hello @btkostner, I'm not sure about this one because it would be a breaking change for users of DataDog agents that rely on a hostname to be an Erlang one. How DD-agent hostname is different? Are you sure that we can't read it in the same way as DD-agent does and give you an option to override the hostname?

@btkostner
Copy link
Contributor Author

The only time it differs is when running Elixir in a container, because it grabs the container hostname instead of the host's hostname. Sadly this isn't something I can overwrite in the agent configuration.

@AndrewDryga
Copy link
Member

@btkostner do you run it in Kubernetes or some other environment?

@btkostner
Copy link
Contributor Author

@AndrewDryga Yes, this is in kubernetes. I can show some screenshots in Datadog of some different setups if you want more clarity of what's happening.

@AndrewDryga
Copy link
Member

AndrewDryga commented May 31, 2022

@btkostner I think there is a way to do this without introducing breaking changes: you can expose the hostname using an environment variable and then read it and pass it to the logger (https://kubernetes.io/docs/tasks/inject-data-application/environment-variable-expose-pod-information/) so that both DD-agent values and Logger one would be equal. In logger we can allow overriding the host with anything you want.

@btkostner
Copy link
Contributor Author

@AndrewDryga That would work. We could also just make a bool to include the hostname and have it default to true. Both options work and I can update the PR with the solution. Which one would you prefer?

@AndrewDryga
Copy link
Member

AndrewDryga commented May 31, 2022

@btkostner Great! I can share the plan that I have in my head that should fit all the use-cases:

We can add a formater_state here and a formatter_opts to the configuration. It would allow passing any formatter-specific options when the logger is started and make sure that people won't do expensive calls for each of the log entries. It can be used like so (eg. in release.exs):

config :logger_json, :backend,
  json_encoder: Jason,
  formatter: LoggerJSON.Formatters.GoogleCloudLogger,
  formatter_opts: [hostname: System.get_env("KUBERNETES_NODE_HOSTNAME")]

To make it work we probably would need to extend the behavior (https://github.com/Nebo15/logger_json/blob/master/lib/logger_json/formatter.ex) and add an init callback that will receive formatter_opts and return formatter_state which would be saved and passed to format_event. New init callback should be called somewhere here to make sure it would support runtime reconfiguration.

When the backend initializes we can check if such an option was given and:

  • if it's a string just write it to state variable formatter_opts and use it when we format the JSON;
  • if it's a :from_system_agent atom DataDog formatter can skip writing it entirely (your use-case);
  • if it's unset we read the hostname as it's done now, persist it to formatter_opts and use that, so by default behavior would not change.

@AndrewDryga
Copy link
Member

I know it shoulds like a lot, if you feel like it's too much I'll handle this myself but a bit later

@btkostner
Copy link
Contributor Author

That looks like a solid well thought out plan. I'll start working on it today 👍

@btkostner btkostner changed the title Remove syslog hostname from datadog formatter feat: allow setting Datadog syslog.hostname attribute May 31, 2022
@btkostner
Copy link
Contributor Author

Alrighty, PR is updated. Let me know how it looks 👍

@btkostner
Copy link
Contributor Author

As a stop gap, I found a way to unset that value in Datadog itself. You can remove syslog.hostname from the hostname attributes by editing the "Preprocessing for JSON logs" pipeline.

Screen Shot 2022-06-20 at 8 17 45 PM

@@ -16,6 +23,7 @@ defmodule LoggerJSON.Formatter do
msg :: Logger.message(),
ts :: Logger.Formatter.time(),
md :: [atom] | :all,
state :: map
state :: map,
formatter_state :: Keyword.t()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should use a map for the state. Keywords have higher complexity for reading operations because sometimes we need to traverse the full list to get a value, which would be harmful as the state grows. Maps are highly optimized for that in Erlang:

Suggested change
formatter_state :: Keyword.t()
formatter_state :: map()

Copy link
Member

@AndrewDryga AndrewDryga left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One small change and looks good, thank you!

@AndrewDryga
Copy link
Member

Hey @btkostner sorry for taking it so long, was distracted by life and work :). I think this still is a great change that can be used by other providers so let's merge it anyways, even with a workaround.

@btkostner
Copy link
Contributor Author

@AndrewDryga No problem. I figured there was more important stuff happening in your life. No need to apologize.

@AndrewDryga AndrewDryga merged commit cad53fe into Nebo15:master Jun 21, 2022
@AndrewDryga
Copy link
Member

Thank you @btkostner, great work! Can you run this in your project and verify it's what you need? I'll release a new hex version once I get feedback from you and make sure it's stable.

@btkostner btkostner deleted the patch-1 branch June 21, 2022 17:36
@btkostner
Copy link
Contributor Author

Alright. Aside from the small doc fix I opened above, it looks to be working for me. I'm going to put it up on our staging cluster to make sure there are no side effects, but it should be good from my end.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants