Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NH-11358 Add integration tests #67

Merged
merged 69 commits into from
Nov 22, 2022
Merged

Conversation

tammy-baylis-swi
Copy link
Contributor

@tammy-baylis-swi tammy-baylis-swi commented Oct 25, 2022

This PR adds working, automated, integration tests! They will run as part of the existing GH workflow for automated test runs that we added in this PR.

This replaces a previous PR that wasn't working out so I closed it without merging.

To run the tests on my local a bit quicker (only one env) and to make sure all the new dependencies install, I've used:

make tox OPTIONS="--recreate -vv -e py37-nh-staging"

Rationale for this setup

This is based heavily on this conversation we had in May. It tests acceptance criteria 1, 4, and 6. EDIT 2022-11-15: and now 8 too.

EDIT 2022-11-15: Ignore the rest of this. It's now totally inaccurate.

Summary of changes

(1)
SolarWindsDistroTestBase does setup shared by the two Test* classes below. To initialize our custom OTel components and have them readily accessible at test time, we call individual SolarWindsConfigurator methods. Our testbase extends OTel's TestBase to prepare the InMemorySpanExporter used to check finished spans. This way, these tests all done in the same process/thread (not sure which) and doesn't go so far as to spin up separate containers like the testbed.

(2)
TestSpanAttributes uses the OTel SDK to manual create spans with our custom OTel components then check their attributes and trace states, all while mocking liboboe decisions.

For scenario 4, we manually extract traceparent and tracestate headers with our custom (composite) propagator. The resulting OTel context needs to be manually passed to the SDK at span creation else it won't find it. (This also wouldn't work with Flask as attempted in the failed previous PR.)

For scenario 6, we do the same as 4 plus extract x-trace-options and x-trace-options-signature for signed TT.

There is an additional method test_child_span_attrs. We mock its remote parent by setting a full span context instead of providing OTel context.

(3)
PropagationTest defines a minimal Flask app that we can instrument. It has a Werkzeug client exposed for making requests in-code instead of to a separate service. Its route /test_trace makes an outgoing request so that we can capture the traceparent and tracestate it sends out and check them.

(4)
TestHeaderPropagation uses the Flask app to test header propagation performed by our custom OTel components, also while mocking liboboe decision. The main reason this is separate from TestSpanAttributes is that I couldn't get the Flask app's spans to export to InMemorySpanExporter (what I attempted in the failed previous PR).

(5)
Bonus: while writing these tests I found a typo bug that's been fixed here in 431bd12.

Not in this PR

I realized while writing this that this doesn't test what happens when invalid values are given, e.g. incorrect traceparent and tracestate. Given the size of this PR already, I'd like to add them in a different PR.

The ignoring of some x-trace-options that shouldn't be will be fixed in
https://swicloud.atlassian.net/browse/NH-24786. It is there that I will add tests for incorrect TT-related values and fix the TODOs I've added.


I'll add this information to this doc soon so it's not lost. EDIT: Done, new section: https://swicloud.atlassian.net/wiki/spaces/NIT/pages/2917272644/NH+Python+Automated+Regression+Testing#Example-implementation

Please let me know what you think! 😺

@tammy-baylis-swi
Copy link
Contributor Author

Regarding this comment:

So why did I put all that in there?

It's because non-entry spans need an up-to-date tracestate at time of injection, rather than the placeholder set by the sampler!

@tammy-baylis-swi
Copy link
Contributor Author

Hi @cheempz ! Thanks again for all those suggestions. I've addressed all that I can while also making big changes to how these tests are set up. Therefore, the original PR description is now inaccurate but I won't update Confluence in good English until this makes sense to us both 😃

Key updates:

  1. I figured out how to set up all of custom composite OTel propagation for sw and trigger trace tracestate dealings, custom SW sampling logic for attribute setting, and export to InMemorySpanExporter for span inspection!
  2. Redesigns integration tests so that each script contains one of the W3C trace context Acceptance Criteria scenarios 1, 4, 6, or 8 (https://swicloud.atlassian.net/wiki/spaces/NIT/pages/2325479753/W3C+Trace+Context#Acceptance-Criteria).
  3. Each of the scenario scripts extends a single, new shared class TestBaseSwHeadersAndAttributes for configuration of OTel components, test app config, and shared setup/teardown.
  4. All the classes that existed at the start of this PR are gone (SolarWindsDistroTestBase, TestSpanAttributes, TestHeaderPropagation, PropagationTest)
  5. Also merges from main the liboboe bump to 11.0.0 and other changes to stop some version conflicts.

Please let me know what you think! This is a lot of changes; sorry for another mega PR.

cheempz
cheempz previously approved these changes Nov 21, 2022
Copy link
Contributor

@cheempz cheempz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💯 what an awesome revamp @tammy-baylis-swi! This was great to read, educational, and such a good foundation that you managed to glue all the pieces together (header extraction/injection, resulting span attributes) into a simple to understand structure, kudos!

I left minor comments feel free to ignore or address in subsequent PRs.

(done by OTel TraceContextTextMapPropagator).
4. Sampling-related attributes are set for the root/service entry span.
5. The span_id of the outgoing request span matches the span_id portion in the
tracestate header.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love this docstring and your point 5 reminded me that this was never made clear in the W3C Trace context spec! I've updated the paragraph about tracestate accordingly.

expected_trace_state = trace_api.TraceState([
("sw", "0000000000000000-01"),
("xtrace_options_response", "trigger-trace####ok"),
])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really appreciate how you have tests that now document previously opaque parts of our customization :)

entry span (mocked). There is no OTel context extracted from request headers,
so this is the root and start of the trace.
2. Some traceparent and tracestate are injected into service's outgoing request (done by OTel TraceContextTextMapPropagator).
3. The injected x-trace-options header is also propagated.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The inbound x-trace-options header doesn't actually need to be propagated to any outbound calls. But I'm guessing here you are referring to how we propagate the Trigger Trace response value (from the tracing decision call) from span to span, in order to make it available at the end of the request when we inject our custom x-trace-options-response header.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes you're right! Fixed the wording in f327539

expected_trace_state = trace_api.TraceState([
("sw", "0000000000000000-01"),
("xtrace_options_response", "trigger-trace####ok"),
])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above comment, this is great test-as-documentation on we use TraceState to stash the trigger trace response so it's available at the time of our custom injecting of the x-trace-options-response header.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! Addressed in cce07b4

new_span_id,
new_trace_flags,
)
# TODO NH-24786 will not ignored foo
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hehe it's actually correct right now, foo should be ignored.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! My silly brain. Rm those todos in b55984c

assert "trigger-trace=ignored" in resp.headers["x-trace-options-response"]
assert "ignored=foo" in resp.headers["x-trace-options-response"]
# TODO NH-24786 will not ignored foo
# assert "ignored" not in resp.headers["x-trace-options-response"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change with NH-24786 is that keys like custom-* should now be set as attributes on the span, so for this case I'd expect these attributes:

  • TriggeredTrace: true
  • custom-from: lin
  • SWKeys: custom-sw-from:tammy,baz:qux

The TT spec does not make this easy to see, I will extra highlight in the Jira and also update the spec to hopefully make it clearer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Lin! I've removed this todo in b55984c and will update the PR for NH-24786!

assert new_trace_id in resp.headers["x-trace"]

# Verify x-trace-options-response response header present
# and has same values as tracestate but different delimiters
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nit: "and has same values as internally cached in TraceState..." would make clearer that we don't expect it to have the same value as the actual tracestate header that was injected in outbound requests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea! Addressed in 20a8de4


#### Integration tests

The integration tests are defined in `tests/integration/`. They require a compiled C-extension and should be run inside the build container. They are also run under tox as per the unit tests. Here is how to run tests locally:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hehe, very minor: I gave this a try and was BAD and didn't set any SW_APM_SERVICE_KEY_TOX_* env vars, but the tests passed fine. So it seems that the tox-driven unit and integration tests don't actually need valid SW_APM_ env vars (which makes sense given what they're testing).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yess!!! This must have been leftover from a very old setup. Removed in ea5d5d8

@tammy-baylis-swi
Copy link
Contributor Author

Thank you again @cheempz ! All of last comments addressed. Please could I get another review? 😺

Copy link
Contributor

@cheempz cheempz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks @tammy-baylis-swi!

@tammy-baylis-swi tammy-baylis-swi merged commit a136522 into main Nov 22, 2022
@tammy-baylis-swi tammy-baylis-swi deleted the NH-11358-functional-tests branch November 22, 2022 01:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants