Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(@opentelemetry/exporter-prometheus): unref prometheus server to prevent process running indefinitely #2558

Merged

Conversation

mothershipper
Copy link
Contributor

Which problem is this PR solving?

Prometheus exporter is holding the process open for scripts that are meant to exit.

Short description of the changes

Unrefs the prometheus exporter to allow the process to exit.

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Run in our production environment, with a server that was purposely failing to boot -- something like:

const main = async () => {
  await initOtel()
  await initServer() // throws
}

main.catch(err => console.log)

Previously, prometheus would hold the process open. With this change, the process exits as expected.

@mothershipper mothershipper requested a review from a team October 21, 2021 19:59
Copy link
Member

@dyladan dyladan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch

@dyladan
Copy link
Member

dyladan commented Oct 21, 2021

If its easy a test would be nice. If not, then this is such a small change its unlikely to be a problem.

@codecov
Copy link

codecov bot commented Oct 21, 2021

Codecov Report

Merging #2558 (95d36da) into main (b531acf) will increase coverage by 0.02%.
The diff coverage is n/a.

❗ Current head 95d36da differs from pull request most recent head 4fc5b3b. Consider uploading reports for the commit 4fc5b3b to get more accurate results

@@            Coverage Diff             @@
##             main    #2558      +/-   ##
==========================================
+ Coverage   93.05%   93.08%   +0.02%     
==========================================
  Files         140       58      -82     
  Lines        5172     1894    -3278     
  Branches     1111      408     -703     
==========================================
- Hits         4813     1763    -3050     
+ Misses        359      131     -228     
Impacted Files Coverage Δ
...s/opentelemetry-core/src/platform/node/sdk-info.ts 0.00% <0.00%> (-100.00%) ⬇️
...opentelemetry-core/src/platform/node/globalThis.ts 0.00% <0.00%> (-100.00%) ⬇️
...pentelemetry-core/src/platform/node/performance.ts 0.00% <0.00%> (-100.00%) ⬇️
...emetry-api-metrics/src/platform/node/globalThis.ts 0.00% <0.00%> (-100.00%) ⬇️
...ckages/opentelemetry-exporter-zipkin/src/zipkin.ts 84.48% <0.00%> (-15.52%) ⬇️
...ntelemetry-sdk-metrics-base/src/BoundInstrument.ts
...sync-hooks/src/AbstractAsyncHooksContextManager.ts
...k-metrics-base/src/export/ConsoleMetricExporter.ts
...telemetry-sdk-metrics-base/src/export/Processor.ts
...pentelemetry-sdk-metrics-base/src/BatchObserver.ts
... and 77 more

@mothershipper
Copy link
Contributor Author

We've got this monkeypatched it in our environment, so I'm not in a rush to get this in upstream -- I can take a look at adding tests after hours.

Would you mock createServer and just test for unref being called? Usually I prefer the behavioral tests, but I'm not sure offhand how I'd safely test for the reference count not increasing / exit isn't hung.

@dyladan
Copy link
Member

dyladan commented Oct 21, 2021

I can't see anywhere in the docs to check if a socket is unref'd so I guess the only way to do it would be with a mock

@mothershipper
Copy link
Contributor Author

took a stab at the test

@mothershipper mothershipper force-pushed the task/jack/unref-prom-server branch 2 times, most recently from e07925e to fbaf06a Compare October 22, 2021 19:45
@dyladan
Copy link
Member

dyladan commented Oct 26, 2021

This branch is out of date and I don't have permission to update your branch so I can't merge it. Please update the branch and if you give maintainers permission to push to the branch that makes it much easier for us.

@mothershipper
Copy link
Contributor Author

@dyladan I've updated, but to be honest, I'm not entirely sure how to give you access -- looking into that now. Would you consider adding a link to those instructions to the CONTRIBUTING.md doc if I can find some?

@dyladan
Copy link
Member

dyladan commented Oct 26, 2021

Yeah it should be a checkbox when you create the PR that says something along the lines of "allow maintainers to push changes to the branch"

Not sure how its done after the fact but I know its possible.

@dyladan dyladan added the bug Something isn't working label Oct 26, 2021
@mothershipper
Copy link
Contributor Author

I can recreate it if that's going to be easier

@dyladan
Copy link
Member

dyladan commented Oct 26, 2021

It should be right underneath the screenshot you posted
image

Don't bother recreating. The tests are failing btw

@Flarna
Copy link
Member

Flarna commented Oct 26, 2021

Yeah it should be a checkbox when you create the PR that says something along the lines of "allow maintainers to push changes to the branch"

As far as I know this works only for personal forks but not for forks owned by an organisation.

@dyladan
Copy link
Member

dyladan commented Oct 26, 2021

Oh I didn't see it was an org fork. Tests are still failing.

@mothershipper
Copy link
Contributor Author

I agree - it's just not there on my end.

Screen Shot 2021-10-26 at 10 30 04 AM

Apologies on the test, pushing a fix now - shouldn't have ran mine in isolation, forgot to unstub the http lib.

…revent process running indefinitely

Adds a test case as well, but, unfortunately, just that the server is
unrefed. Not immediately clear how we could test based on open handles
or reference counting to get at the behavior over the specific
implementation.
@mothershipper
Copy link
Contributor Author

Ah, good to know! I can open another PR to add instructions / caveats to the CONTRIBUTING.md

@mothershipper
Copy link
Contributor Author

tests should be passing now, ran the full suite locally this time

@vmarchaud vmarchaud merged commit c1939a7 into open-telemetry:main Oct 26, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants