Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

use ddtracepy extractor #379

Closed
wants to merge 10 commits into from
Closed

use ddtracepy extractor #379

wants to merge 10 commits into from

Conversation

zARODz11z
Copy link
Contributor

What does this PR do?

Motivation

Testing Guidelines

Additional Notes

Types of Changes

  • Bug fix
  • New feature
  • Breaking change
  • Misc (docs, refactoring, dependency upgrade, etc.)

Check all that apply

  • This PR's description is comprehensive
  • This PR contains breaking changes that are documented in the description
  • This PR introduces new APIs or parameters that are documented and unlikely to change in the foreseeable future
  • This PR impacts documentation, and it has been updated (or a ticket has been logged)
  • This PR's changes are covered by the automated tests
  • This PR collects user input/sensitive content into Datadog
  • This PR passes the integration tests (ask a Datadog member to run the tests)

@zARODz11z zARODz11z requested a review from a team as a code owner October 16, 2023 21:54
@@ -19,6 +19,11 @@ class TraceHeader(object):
PARENT_ID = "x-datadog-parent-id"
SAMPLING_PRIORITY = "x-datadog-sampling-priority"

class TraceState(object):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rm

) = _extract_context_from_eventbridge_sqs_event(event)
return trace_id, parent_id, sampling_priority
context = _extract_context_from_eventbridge_sqs_event(event)
return context
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably check if None here.

@@ -175,8 +174,9 @@ def extract_context_from_lambda_context(lambda_context):
trace_id = client_context.custom.get(TraceHeader.TRACE_ID)
parent_id = client_context.custom.get(TraceHeader.PARENT_ID)
sampling_priority = client_context.custom.get(TraceHeader.SAMPLING_PRIORITY)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider what if the client_context.custom has w3c headers, not just dd headers.

span = tracer.trace("dummy.span")
span.trace_id = int(context[TraceHeader.TRACE_ID])
span.span_id = int(context[TraceHeader.PARENT_ID])
if isinstance(context, Context) and context is not None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's try to make sure that context is always a Context object.

@@ -175,8 +174,9 @@ def extract_context_from_lambda_context(lambda_context):
trace_id = client_context.custom.get(TraceHeader.TRACE_ID)
parent_id = client_context.custom.get(TraceHeader.PARENT_ID)
sampling_priority = client_context.custom.get(TraceHeader.SAMPLING_PRIORITY)

return trace_id, parent_id, sampling_priority
context = Context(trace_id=trace_id, span_id=parent_id,sampling_priority=sampling_priority)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit

Suggested change
context = Context(trace_id=trace_id, span_id=parent_id,sampling_priority=sampling_priority)
context = Context(trace_id=trace_id, span_id=parent_id, sampling_priority=sampling_priority)

return trace_id, parent_id, sampling_priority
context = propagator.extract(dd_data)
if context is not None:
return context
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The if not None here seems unnecessary

@@ -328,10 +317,9 @@ def _extract_context_from_eventbridge_sqs_event(event):

Copy link
Contributor

@purple4reina purple4reina Nov 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change

body_str = first_record.get("body", {})

to

body_str = first_record.get("body")

We already know the "body" key is in the dictionary. Plus, we're then calling json.loads on the returned value. That will fail if given a non-string like {}.

return trace_id, parent_id, sampling_priority
context = propagator.extract(dd_context)
if context is not None:
return context
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The if is not None here is unnecessary

return trace_id, parent_id, sampling_priority
context = propagator.extract(dd_ctx)
if context is not None:
return context
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The if is not None here is unnecessary

@@ -408,7 +393,8 @@ def extract_context_from_step_functions(event, lambda_context):
execution_id + "#" + state_name + "#" + state_entered_time
)
sampling_priority = SamplingPriority.AUTO_KEEP
return trace_id, parent_id, sampling_priority
context = Context(trace_id=trace_id, parent_id= parent_id, sampling_priority=sampling_priority)
return context
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or just

return Context(trace_id=trace_id, parent_id= parent_id, sampling_priority=sampling_priority)

@@ -660,18 +623,32 @@ def is_lambda_context():

def set_dd_trace_py_root(trace_context_source, merge_xray_traces):
if trace_context_source == TraceContextSource.EVENT or merge_xray_traces:
context = dict(dd_trace_context)
global dd_trace_context
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changing this to a global alters how this function works. If merge xray traces is True, then you're mutating the global trace context object, which was not done before.

dd_trace_context.span_id = xray_context.span_id
dd_trace_context.trace_id = xray_context.trace_id
dd_trace_context.sampling_priority = xray_context.sampling_priority
dd_trace_context.dd_origin = xray_context.dd_origin
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously we were only setting parent-id from xray context. We shouldn't change that functionality.

# dd_origin=xray_trace_entity.get("source")
# )
context = Context(trace_id=parent, span_id=parent, sampling_priority=sampled, dd_origin=TraceContextSource.XRAY)
return context
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just

return Context(trace_id=parent, span_id=parent, sampling_priority=sampled, dd_origin=TraceContextSource.XRAY)



def generate_random_id():
return binascii.b2a_hex(os.urandom(8)).decode("utf-8")


def build_segment(context, key, metadata):
def build_segment(context: Context, key, metadata):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should all or nothing the python type checks.

"accept": "*/*",
"accept-encoding": "gzip, deflate",
"host": "szcqwshtpwosbcodv4hge5zj4a0bmker.lambda-url.sa-east-1.on.aws",
"traceparent": "00-00000000000000005f486460f346b80c-dc070c2bb8714351-01",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"traceparent": "00-00000000000000005f486460f346b80c-dc070c2bb8714351-01",
"traceparent": "01-00000000000000005f486460f346b80c-dc070c2bb8714351-01",

@@ -42,7 +44,7 @@
"Root=1-5e272390-8c398be037738dc042009320;Parent=94ae789b969f1cc5;Sampled=1"
)
fake_xray_header_value_parent_decimal = "10713633173203262661"
fake_xray_header_value_root_decimal = "3995693151288333088"
fake_xray_header_value_root_decimal = "1490261136348486853"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't need to change this.

ctx,
{"trace-id": "123", "parent-id": "321", "sampling-priority": "1"},
Context(trace_id=123, span_id=321, sampling_priority=1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably best to do as you did previously, an assert each for trace id, span id, and sampling priority.

self.assertEqual(
ctx.sampling_priority,
'2'
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Go ahead and put this all on one line. We dont need this file longer than it already is.

self.mock_send_segment.assert_called_with(
XraySubsegment.TRACE_KEY,
{"trace-id": "123", "parent-id": "321", "sampling-priority": "1"},
actual_context
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is wrong. You should be doing assertions against the args returned from .call_args.

)
self.assertEqual(str(ctx.trace_id), "666")
self.assertEqual(str(ctx.span_id), "777")
self.assertEqual(str(ctx.sampling_priority), "1")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of casting to a string, just assert the values are ints.

self.mock_send_segment.assert_called_with(
XraySubsegment.TRACE_KEY,
{"trace-id": "666", "parent-id": "777", "sampling-priority": "1"},
actual_context
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is wrong, see above.

self.mock_send_segment.assert_called_with(
XraySubsegment.TRACE_KEY,
{"trace-id": "666", "parent-id": "777", "sampling-priority": "1"},
actual_context
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is wrong, see above.

propagator = HTTPPropagator()
context = propagator.extract(headers)
print(context)
self.assertEqual(1,2)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This block should probably be removed.

mock_span.trace_id = 123
mock_span.span_id = 456
mock_trace.return_value = mock_span
mock_current_span.return_value = mock_span
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using mocks is confusing and unnecessary. If you're worried about the random number generator creating different ids each time, try seeding the generator. Something like https://docs.python.org/3/library/random.html#random.seed

self.assertEqual(context["sampling-priority"], "1")
self.assertEqual(context.trace_id, 7379586022458917877)
self.assertEqual(context.span_id, 2644033662113726488)
self.assertEqual(context.sampling_priority, 1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good.

)
self.mock_activate.assert_called()
self.mock_activate.assert_has_calls([call(expected_context)])
def test_use_dd_trace_context(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test doesn't seem to add anything we didn't already have.

ctx = get_mock_context()
context, source, event_type = extract_dd_trace_context(event, ctx)
self.assertEqual(context._traceparent, "00-00000000000000005f486460f346b80c-dc070c2bb8714351-01")
self.assertEqual(context._tracestate, "dd=s:1;t.dm:-1")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's actually test the trace id, span id, and sampling priority on the context directly. Attributes starting with an _ in Python are implied to be private and could change at anytime.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will also need tests that check for w3c trace headers for all of the extract methods tested in this file.

parent_id = dd_data.get(TraceHeader.PARENT_ID)
sampling_priority = dd_data.get(TraceHeader.SAMPLING_PRIORITY)
dd_data = client_context.custom.get("_datadog")
context = propagator.extract(dd_data)
elif (
TraceHeader.TRACE_ID in client_context.custom
and TraceHeader.PARENT_ID in client_context.custom
and TraceHeader.SAMPLING_PRIORITY in client_context.custom
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can't check for these headers exclusively. This would not support w3c headers in the client_context.custom.


return trace_id, parent_id, sampling_priority
context = propagator.extract(client_context.custom)
return context
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bug. If neither the if or the elif are hit, then the variable context is undefined.

@astuyve astuyve closed this Dec 12, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants