aws-sdk: Use compile time reference+@NoMuzzle instead of reflection. #8603

Oberon00 · 2023-05-30T16:36:26Z

This is a first chunk, if we want to follow this approach, I will add more, also to avoid inconsistencies.

Thank you @laurit for helping me with various hurdles I ran into when trying this! (See #8394)

This is a first chunk, if we want to follow this approach, I can add more.

Oberon00 · 2023-05-31T07:32:44Z

Sorry, something is still not right here, I will push an update once I figure it out (it's not only the spotless, but also some NPE I can't fully explain).

Oberon00 · 2023-05-31T09:28:24Z

My shortcut of adding the muzzle test to library tests was the problem: It drags in the autoconfigure library indirectly through the agent, and this leads to double instrumentation, and conflicting access to the execution attributes.

.../src/main/java/io/opentelemetry/instrumentation/awssdk/v2_2/SqsSendMessageRequestAccess.java

Oberon00 · 2023-06-01T08:00:47Z

Note that I'd prefer at least a general "yes, let's go ahead with this approach" / "no this does not look good" / etc. before I'd work on the remaining follow-up tasks I'd like to implement for SQS/SNS propagation (as outlined in the description of #8405), since depending on whether this approach finds acceptance, I will have to either implement it with boilerplate reflection classes or not.

mateuszrzeszutek · 2023-06-01T09:43:08Z

.../src/main/java/io/opentelemetry/instrumentation/awssdk/v2_2/TracingExecutionInterceptor.java

@@ -125,35 +126,22 @@ public SdkRequest modifyRequest(
    return request;
  }

+  @NoMuzzle


@NoMuzzle turns off a pretty important security feature of the agent -- in general I think I'd prefer all the no-muzzle'd methods to be kept as minimal as possible, contained in classes separate from the rest of the code, and either guarded with extra defensive conditions (e.g. it'd be great to move/copy the SqsSendMessageRequestAccess.isInstance check to the beginning of this method) or at the very least heavily commented why it is safe.

With that in mind, WDYT about moving all the affected code into a separate class, and minimizing the entry points? It'd be great if it was just one method that was used outside of it, and the rest of them would be private and encapsulated in the new class.

If you do this then you may run into an issue where classes used only in the @NoMuzzle method are not found. This is because besides the safety checks muzzle plugin also composes the list of the instrumentation classes used. You may need to manually add these classes using getAdditionalHelperClassNames method as described in https://github.com/open-telemetry/opentelemetry-java-instrumentation/blob/main/docs/contributing/writing-instrumentation-module.md#inject-additional-instrumentation-helper-classes-manually-with-the-getadditionalhelperclassnames-method

Hmm, that's true, although it's true for the current code as well. E.g. MessageAttributeValueMapGetter is only referenced in a @NoMuzzle method.

If we want to avoid that, we can modify the ReferenceCollectingClassVisitor to create a special MethodVisitor for @NoMuzzle methods, that just collects helper classes and not references.

If you do this then you may run into an issue where classes used only in the @NoMuzzle method are not found.

I don't think that's true, it seems that @NoMuzzle is not transitive. I.e. if a method annotated with NoMuzzle calls another method, that method is checked again -- I even had to make a lambda a separate method to add NoMuzzle.

So it unfortunately won't help us minimize NoMuzzle if we minimize the entrypoints. But it would still probably be a nice refactoring to move out SQS-specific code to a separate class.

It'd be great if it was just one method that was used outside of it,

It will not be that well-encapsulated, since we need a few different hooks (I think two) where we do SQS specific things.

@NoMuzzle turns off a pretty important security feature of the agent

But: If you use reflection instead, you turn off pretty important safety features of the Java compiler (I think you mean safety here, not security). And IIUC, reflection stops Muzzle from noticing problems just like @NoMuzzle (though it is more fine grained with reflection)

So it unfortunately won't help us minimize NoMuzzle if we minimize the entrypoints. But it would still probably be a nice refactoring to move out SQS-specific code to a separate class.

👍

It will not be that well-encapsulated, since we need a few different hooks (I think two) where we do SQS specific things.

That's fine. I just want to make sure that any usage of @NoMuzzle is very deliberate and well documented.

But: If you use reflection instead, you turn off pretty important safety features of the Java compiler (I think you mean safety here, not security).

Yup, safety, I mixed up these two for some reason 🙈
That's true; that's why we've had all these null checks on MethodHandles.

I think I found a good solution to this build failure by testing the core module and the sqs module in separate muzzle runs using excludedInstrumentationNames (e06b345). When you have multiple other modules (e.g. sqs, and sns) it could get a bit unwieldy, but then adding a feature includedInstrumentationNames to muzzle might not be that hard.

(NB: For some reason, these checks don't run when I run gradle using IntelliJ, but they do run when I execute the same gradle task from the command line)

I think excludeInstrumentationName is a very good solution for now. I didn't know we had such an option, would have thought that you'd need to use extraDependency to get around this.

excludeInstrumentationName also has the advantage that we test that it works without the SQS references. Which makes me realize, I can probably remove the manual muzzle unit tests as this is now effectively done by this muzzle directive.

Removed the whole javaagent-unit-tests module in a095c37. We now lose the check that we actually do reference SQS from the SQS module, but that sanity check does not seem valuable enough to justify the module.

Do you think this GH conversation can be considered "resolved" now?

instrumentation/aws-sdk/aws-sdk-2.2/library-extra-test/build.gradle.kts

...est/src/test/java/io/opentelemetry/instrumentation/awssdk/v2_2/SqsNoMuzzleReferenceTest.java

Oberon00 · 2023-06-05T16:03:45Z

To be honest, I do not really feel up to more fundamental reworks of this concept like splitting into multiple instrumentation modules. Personally, I think the current concept is fine enough to count as improvement (after all the reflection were are also not Muzzle checked, now we have slightly more code not covered by Muzzle, but far less boilerplate and more coverage by basic Java compiler type checks).

Personally (and for Dynatrace, as I'm doing this in my working time), my priority is to get the features mentioned under "Further follow-up tasks" in the description of #8405 done. I thought that refactoring this a bit would be worth the time but if it becomes a larger undertaking, I would rather go ahead with the reflection approach.

These are my first contributions to the java-instrumentation repository, so I don't really know my way around the more intricate parts of the Java agent infrastructure, so it may very well be that my assessment of this PR's approach being an improvement over the status quo is wrong.

I would prefer one of these two options (in order of preference):

Go ahead with the current approach without major changes (i.e., I would address the comments about renaming of course, but would not do larger refactorings like splitting up the module)
Close this PR unmerged for now, and I will go ahead with the old/current reflection approach for the remaining follow-up features I want to add for SQS/SNS.

In the end, I have the feeling that the ideal approach would require some improvements to Muzzle infrastructure. Basically, what we want is all the bells and whistles of Muzzle and no @NoMuzzle anywhere, except that during instrumentation module loading, if a SQS-class is missing (but not present but incompatible), we still want to load the module.

Alternatively, re-architecting the AWS SDK instrumentation to mirror the AWS SDK's plugin split might also work (though not sure if that is possible without introducing much boilerplate) and would have the advantage that you can't accidentally call into classes that are not present, so that would be the safest approach. If you then additionally added a way that the user still gets a single all-in-one maven/gradle module that adds everything without picking manually, that would be great.
But I'm not up for working on such a large refactoring at this time.

laurit · 2023-06-06T11:07:36Z

To be honest, I do not really feel up to more fundamental reworks of this concept like splitting into multiple instrumentation modules. Personally, I think the current concept is fine enough to count as improvement (after all the reflection were are also not Muzzle checked, now we have slightly more code not covered by Muzzle, but far less boilerplate and more coverage by basic Java compiler type checks).

The purpose of muzzle checks is to improve run time safety by verifying that the methods that the instrumentation calls exist and are accessible. The checks reflection code performs are essentially the same as what muzzle does, if it does not find the method it reverts to do nothing and avoids failing at runtime.
Using plain java is definitely much nicer than using reflection, but the main question here is do we want to give up on the muzzle checks. One could argue that library instrumentation does not have the muzzle checks and if there is an incompatible change in the instrumented library runtime failures are a possibility, why bother with the agent, could let it fail the same way. I prepared a small sample for you that shows how you could use java code with @NoMuzzle and also keep the muzzle checks https://github.com/dynatrace-oss-contrib/opentelemetry-java-instrumentation/compare/aws-sdk-no-muzzle...laurit:opentelemetry-java-instrumentation:aws-no-muzzle-test?expand=1 Something like that only works when you can isolate the code that touches sqs types from the rest of the code.

Personally (and for Dynatrace, as I'm doing this in my working time), my priority is to get the features mentioned under "Further follow-up tasks" in the description of #8405 done. I thought that refactoring this a bit would be worth the time but if it becomes a larger undertaking, I would rather go ahead with the reflection approach.

Unfortunately what you are trying here is something new, we don't have any sample code that you could learn from. Totally understand it if you want to give up on this.

Oberon00 · 2023-06-06T14:51:58Z

Thank you, that branch is quite helpful.

// types that are only used in @NoMuzzle methods are not injected, with agent here you'll get
// NoClassDefFoundError for MessageAttributeValueMapGetter

I think this means that I should run the respective unit test from the agent module as well (currently it runs only in the library module with the settings that enable this code path)

I also noticed you did manage to put two instrumentation modules in the same gradle module, with the same primary name. Is this OK then? I think it would reduce the boilerplate a bit.

Oberon00 · 2023-06-07T16:55:45Z

So here's what I found while running the tests also for the java-agent ae2e8d8:

Indeed just as you noted @laurit, the NoMuzzle prevents the class from being injected, causing a class loader exception (most unit tests fail because of a strict context violation that is caused by this instead of directly because of this, interestingly)
Muzzle only seems to collect references to methods, so if you just use the type in the signature or even cast to it (at least when the cast only involves the class in a type parameter position), it does not cause Muzzle to collect a reference. Hopefully/probably this is in line with JVM rules. So in this case, the relevant @NoMuzzle annotations could be removed which causes all classes to be found again.
The design of the generic SDK tests is not fit for the SQS injection as it hard-codes an MD5 and has no way to calculate it dynamically (since the span + trace ID is part of the message attributes, the MD5 will always be different). This causes an exception in the client SDK. I removed the test and instead updated the SQS-specific test to also test the async client.

(cherry picked from commit 08fc75e)

Oberon00 · 2023-06-13T21:09:35Z

(See #8603 (comment) for latest comments)

.../aws-sdk-2.2/library/src/main/java/io/opentelemetry/instrumentation/awssdk/v2_2/SqsImpl.java

laurit · 2023-06-14T09:34:24Z

Muzzle only seems to collect references to methods, so if you just use the type in the signature or even cast to it (at least when the cast only involves the class in a type parameter position), it does not cause Muzzle to collect a reference. Hopefully/probably this is in line with JVM rules. So in this case, the relevant @NoMuzzle annotations could be removed which causes all classes to be found again.

The relevant code is in

opentelemetry-java-instrumentation/muzzle/src/main/java/io/opentelemetry/javaagent/tooling/muzzle/ReferenceCollectingClassVisitor.java

Line 308 in f0b9a25

private class AdviceReferenceMethodVisitor extends MethodVisitor {

If I understand it correctly then it collects used method parameters & return type, field type, type used in instanceof, checkcast or new, class used in Foo.class. It does not look at generic signatures, which is fine as vm doesn't care about these either.

.../src/main/java/io/opentelemetry/instrumentation/awssdk/v2_2/TracingExecutionInterceptor.java

Oberon00 · 2023-06-14T17:16:05Z

...g/src/main/groovy/io/opentelemetry/instrumentation/awssdk/v2_2/AbstractAws2ClientTest.groovy

-        <SendMessageResponse>
-            <SendMessageResult>
-                <MD5OfMessageBody>d41d8cd98f00b204e9800998ecf8427e</MD5OfMessageBody>
-                <MD5OfMessageAttributes>3ae8f24a165a8cedc005670c81a27295</MD5OfMessageAttributes>


I had to drop this test because of this hardcoded MD5. Alternative would be to run the test suite without the experimental message propgator enabled.

However, I instead added an async variant to the SQS-specific test class, so hopefully still covering everything.

If it helps you then setting a flag for only some tests is also quite easy

opentelemetry-java-instrumentation/instrumentation/netty/netty-4.1/javaagent/build.gradle.kts

Lines 45 to 68 in 4a0d725

tasks {

val testConnectionSpan by registering(Test::class) {

filter {

includeTestsMatching("Netty41ConnectionSpanTest")

includeTestsMatching("Netty41ClientSslTest")

}

include("**/Netty41ConnectionSpanTest.*", "**/Netty41ClientSslTest.*")

jvmArgs("-Dotel.instrumentation.netty.connection-telemetry.enabled=true")

jvmArgs("-Dotel.instrumentation.netty.ssl-telemetry.enabled=true")

}

test {

systemProperty("testLatestDeps", findProperty("testLatestDeps") as Boolean)

filter {

excludeTestsMatching("Netty41ConnectionSpanTest")

excludeTestsMatching("Netty41ClientSslTest")

}

}

check {

dependsOn(testConnectionSpan)

}

}

This seems like a good idea, but would also require some reshuffling of the tests, as I don't think (or wouldn't know how) you can exclude tests based on parameters.

Would it be OK not to add this in this PR anymore?

I think it is fine, as there are not the only tests that exercise sqs instrumentation. Idk whether this is what you are looking for, but to conditionally disable tests we usually use

opentelemetry-java-instrumentation/instrumentation/jaxrs/jaxrs-common/testing/src/main/groovy/AbstractJaxRsFilterTest.groovy

Line 45 in ca24c38

Assumptions.assumeTrue(!abortPrematch || testAbortPrematch())

For java tests it is pretty much the same but use junit5 Assumptions class.

...ws-sdk-2.2/library/src/main/java/io/opentelemetry/instrumentation/awssdk/v2_2/SqsAccess.java

...c/main/groovy/io/opentelemetry/instrumentation/awssdk/v2_2/AbstractAws2SqsTracingTest.groovy

trask · 2023-06-16T20:12:12Z

thx @Oberon00 (and @laurit)!

Oberon00 · 2023-06-19T07:40:25Z

Thank you very much for your help @laurit and @mateuszrzeszutek!

aws-sdk: Use compile time reference+@NoMuzzle instead of reflection.

060bed5

This is a first chunk, if we want to follow this approach, I can add more.

Oberon00 requested a review from a team May 30, 2023 16:36

spotless

86b511e

Oberon00 marked this pull request as draft May 31, 2023 07:32

Split muzzle ref check in new module to avoid autoconfigure

6a7f46d

github-actions bot requested a review from theletterf May 31, 2023 09:17

Oberon00 marked this pull request as ready for review May 31, 2023 10:09

Oberon00 commented May 31, 2023

View reviewed changes

.../src/main/java/io/opentelemetry/instrumentation/awssdk/v2_2/SqsSendMessageRequestAccess.java Show resolved Hide resolved

mateuszrzeszutek reviewed Jun 1, 2023

View reviewed changes

laurit reviewed Jun 2, 2023

View reviewed changes

instrumentation/aws-sdk/aws-sdk-2.2/library-extra-test/build.gradle.kts Outdated Show resolved Hide resolved

laurit reviewed Jun 2, 2023

View reviewed changes

...est/src/test/java/io/opentelemetry/instrumentation/awssdk/v2_2/SqsNoMuzzleReferenceTest.java Outdated Show resolved Hide resolved

laurit reviewed Jun 2, 2023

View reviewed changes

...est/src/test/java/io/opentelemetry/instrumentation/awssdk/v2_2/SqsNoMuzzleReferenceTest.java Outdated Show resolved Hide resolved

Merge remote-tracking branch 'upstream/main' into aws-sdk-no-muzzle

c27eb0b

Oberon00 added 3 commits June 7, 2023 11:24

Merge remote-tracking branch 'upstream/main' into aws-sdk-no-muzzle

6af15ff

Enable SQS messaging propagation for more tests.

ae2e8d8

Lint

b04a906

laurit and others added 5 commits June 13, 2023 17:00

Use muzzle and @NoMuzzle together

b941bad

(cherry picked from commit 08fc75e)

spotless

1a14cd5

Address review comments.

3952448

Merge remote-tracking branch 'upstream/main' into aws-sdk-no-muzzle

ad818e2

Fix muzzle checks.

e06b345

laurit reviewed Jun 14, 2023

View reviewed changes

.../aws-sdk-2.2/library/src/main/java/io/opentelemetry/instrumentation/awssdk/v2_2/SqsImpl.java Outdated Show resolved Hide resolved

laurit reviewed Jun 14, 2023

View reviewed changes

.../src/main/java/io/opentelemetry/instrumentation/awssdk/v2_2/TracingExecutionInterceptor.java Outdated Show resolved Hide resolved

Oberon00 added 3 commits June 14, 2023 18:43

Tighten package-privateness.

d2bf029

Move afterConsumerResponse to SqsAccess, avoiding reflection

f4822a4

Remove manual muzzle checks: covered by muzzle directive

a095c37

Oberon00 commented Jun 14, 2023

View reviewed changes

Oberon00 commented Jun 15, 2023

View reviewed changes

...ws-sdk-2.2/library/src/main/java/io/opentelemetry/instrumentation/awssdk/v2_2/SqsAccess.java Outdated Show resolved Hide resolved

Replace unsafe call to .class.getPackage()

169d15f

laurit reviewed Jun 16, 2023

View reviewed changes

...c/main/groovy/io/opentelemetry/instrumentation/awssdk/v2_2/AbstractAws2SqsTracingTest.groovy Show resolved Hide resolved

laurit approved these changes Jun 16, 2023

View reviewed changes

trask merged commit 5b4e05a into open-telemetry:main Jun 16, 2023

Oberon00 deleted the aws-sdk-no-muzzle branch June 19, 2023 07:40

This was referenced Jun 20, 2023

aws-sdk-2.2: More reflection cleanup. #8775

Merged

Fix aws-sdk-2.2 library instrumentation crashing app when SQS is not on classpath #8805

Merged

Oberon00 mentioned this pull request Jul 4, 2023

aws-sdk-1.1: Copy SQS plugin/NoMuzzle approach from 2.2 #8866

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

aws-sdk: Use compile time reference+@NoMuzzle instead of reflection. #8603

aws-sdk: Use compile time reference+@NoMuzzle instead of reflection. #8603

Oberon00 commented May 30, 2023 •

edited

Loading

Oberon00 commented May 31, 2023 •

edited

Loading

Oberon00 commented May 31, 2023

Oberon00 commented Jun 1, 2023 •

edited

Loading

mateuszrzeszutek Jun 1, 2023

laurit Jun 1, 2023

mateuszrzeszutek Jun 1, 2023 •

edited

Loading

Oberon00 Jun 1, 2023 •

edited

Loading

mateuszrzeszutek Jun 1, 2023

Oberon00 Jun 13, 2023 •

edited

Loading

laurit Jun 14, 2023

Oberon00 Jun 14, 2023

Oberon00 Jun 14, 2023 •

edited

Loading

Oberon00 Jun 14, 2023

Oberon00 commented Jun 5, 2023

laurit commented Jun 6, 2023

Oberon00 commented Jun 6, 2023 •

edited

Loading

Oberon00 commented Jun 7, 2023 •

edited

Loading

Oberon00 commented Jun 13, 2023

laurit commented Jun 14, 2023

Oberon00 Jun 14, 2023 •

edited

Loading

laurit Jun 15, 2023

Oberon00 Jun 15, 2023

laurit Jun 16, 2023

trask commented Jun 16, 2023

Oberon00 commented Jun 19, 2023

	tasks {
	val testConnectionSpan by registering(Test::class) {
	filter {
	includeTestsMatching("Netty41ConnectionSpanTest")
	includeTestsMatching("Netty41ClientSslTest")
	}
	include("*/Netty41ConnectionSpanTest.", "*/Netty41ClientSslTest.")
	jvmArgs("-Dotel.instrumentation.netty.connection-telemetry.enabled=true")
	jvmArgs("-Dotel.instrumentation.netty.ssl-telemetry.enabled=true")
	}

	test {
	systemProperty("testLatestDeps", findProperty("testLatestDeps") as Boolean)

	filter {
	excludeTestsMatching("Netty41ConnectionSpanTest")
	excludeTestsMatching("Netty41ClientSslTest")
	}
	}

	check {
	dependsOn(testConnectionSpan)
	}
	}

aws-sdk: Use compile time reference+@NoMuzzle instead of reflection. #8603

aws-sdk: Use compile time reference+@NoMuzzle instead of reflection. #8603

Conversation

Oberon00 commented May 30, 2023 • edited Loading

Oberon00 commented May 31, 2023 • edited Loading

Oberon00 commented May 31, 2023

Oberon00 commented Jun 1, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mateuszrzeszutek Jun 1, 2023 • edited Loading

Choose a reason for hiding this comment

Oberon00 Jun 1, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Oberon00 Jun 13, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Oberon00 Jun 14, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Oberon00 commented Jun 5, 2023

laurit commented Jun 6, 2023

Oberon00 commented Jun 6, 2023 • edited Loading

Oberon00 commented Jun 7, 2023 • edited Loading

Oberon00 commented Jun 13, 2023

laurit commented Jun 14, 2023

Oberon00 Jun 14, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

trask commented Jun 16, 2023

Oberon00 commented Jun 19, 2023

Oberon00 commented May 30, 2023 •

edited

Loading

Oberon00 commented May 31, 2023 •

edited

Loading

Oberon00 commented Jun 1, 2023 •

edited

Loading

mateuszrzeszutek Jun 1, 2023 •

edited

Loading

Oberon00 Jun 1, 2023 •

edited

Loading

Oberon00 Jun 13, 2023 •

edited

Loading

Oberon00 Jun 14, 2023 •

edited

Loading

Oberon00 commented Jun 6, 2023 •

edited

Loading

Oberon00 commented Jun 7, 2023 •

edited

Loading

Oberon00 Jun 14, 2023 •

edited

Loading