Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Set env vars for service instead of machine #3930

Merged
merged 78 commits into from
Jan 23, 2024

Conversation

pjanotti
Copy link
Contributor

@pjanotti pjanotti commented Nov 17, 2023

Description:
On Windows move the environment variables to the service scope. This servers two purposes:

  1. Avoid issues with Splunk instrumentations that use some of the same environment variables as the collector and could end up sending data straight to the backend instead of the sending locally to the collector.
  2. Reduce the scope of visibility of the access token, which should only be visible to the collector

Thanks @pellared for noticing this issue.

Review notes

  • "Monkey see, monkey does": in retrospect, after learning a bit about some of the tech used here, I wanted to change some other things, e.g.: use an array instead of maps that a later converted to arrays in Chef, but, this is already taking too long and such improvements can be addressed separately.

  • Sorting variables when writing to the registry: this facilitates human inspection and tests.

  • Choco: if there was a previous version with environment variables on the machine scope, the environment variables are removed from the machine level. Not doing the same for other deployments because they don't seem to support upgrading deployments, even the install script requires removing the previously installed collector.

  • Ansible:

    • Removed some redundant restarts of the service on deployments/ansible/roles/collector/tasks/otel_win_install.yml, the restart is done only after the environment variables for the service are updated.
    • Tests are failing on CI but passing on mac dev box.
  • Puppet:

    • Unlike the other mass deployments Puppet doesn't support deploying SFx.NET to Windows+IIS.
    • deployments/puppet/manifests/collector_win_registry.pp: respective tests are at internal\buildscripts\packaging\tests\deployments\puppet\puppet_test.py (internal/buildscripts/packaging/tests/helpers/win_utils.py was modified to support new environment variables location).

Link to Splunk idea:
N/A

Testing:
Local tests and CI.

Documentation:
Updating CHANGELOG.md

cc @RassK @theletterf

@pjanotti pjanotti force-pushed the move-windows-env-vars-to-service branch from 9647687 to e091ad7 Compare November 27, 2023 22:43
@pjanotti pjanotti marked this pull request as ready for review November 27, 2023 23:09
@pjanotti pjanotti requested review from a team as code owners November 27, 2023 23:09
@pjanotti pjanotti marked this pull request as draft November 28, 2023 00:42
@pjanotti
Copy link
Contributor Author

Returning to draft until I can look at the test failure.

@jeffreyc-splunk
Copy link
Contributor

The msi itself also needs to be updated: https://github.com/signalfx/splunk-otel-collector/blob/main/internal/buildscripts/packaging/msi/splunk-otel-collector.wxs#L42

@jeffreyc-splunk
Copy link
Contributor

And we'll have to update our ansible, chef, and puppet modules since they currently only configure HKLM:\SYSTEM\CurrentControlSet\Control\Session Manager\Environment.

@pjanotti pjanotti reopened this Jan 17, 2024
@pjanotti pjanotti marked this pull request as ready for review January 17, 2024 20:17
CHANGELOG.md Outdated Show resolved Hide resolved
Comment on lines -39 to -42
- name: Assert splunk-otel-collector service status
assert:
that: not service_status.changed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this removal related to the env var changes? Wonder if there should be a new var to dictate if expected.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was failing consistently for me because the status had changed to "started" - I confess that I don't understand how it was working before.

@@ -37,7 +37,6 @@
ansible.windows.win_package:
path: "{{otel_msi_package.dest}}"
state: present
notify: "restart windows splunk-otel-collector"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are these removals related and necessary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the restart should only happen after the environment variables are set for the service.

Copy link
Contributor

@atoulme atoulme left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved, feel free to merge if enough CI coverage.

@pjanotti pjanotti merged commit 1da0e2c into signalfx:main Jan 23, 2024
486 of 520 checks passed
@pjanotti pjanotti deleted the move-windows-env-vars-to-service branch January 23, 2024 14:22
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants