Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds parsers and handlers for Rpc abstraction and ports Dubbo and gRPC #999

Merged
merged 6 commits into from
Apr 28, 2020

Conversation

codefromthecrypt
Copy link
Member

@codefromthecrypt codefromthecrypt commented Sep 30, 2019

This adds basic infrastructure for uniform parsing of RPC libraries

By default, the following are added to both RPC client and server spans:

  • Span.name is the RPC service/method. Ex. "zipkin.proto3.SpanService/Report"
    • If the service is absent, the method is the name and visa versa.
  • Tags:
    • "rpc.method", eg "Report"
    • "rpc.service", eg "zipkin.proto3.SpanService"
    • "rpc.error_code", eg "CANCELLED"
    • "error" the RPC error code if there is no exception
  • Remote IP and port information

As we know propagation fields are not always headers, "getters and setter" methods used for propagation are protected, not public methods, and not named headers.

Copy link
Member

@jeqo jeqo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM so far

instrumentation/rpc/README.md Outdated Show resolved Hide resolved
@codefromthecrypt
Copy link
Member Author

#1000 pulls the most important parts out into a separate change

@codefromthecrypt codefromthecrypt changed the title Adds abstraction layer for RPC Adds basic parsers and handlers for RPC instrumentation Oct 2, 2019
@codefromthecrypt
Copy link
Member Author

@jeqo @anuraaga do you have some modeling thoughts? should we just add the following to the response types?

@Nullable String message()
@Nullable Boolean isError()


## Span data policy
By default, the following are added to both rpc client and server spans:
* Span.name is the rpc method in lowercase: ex "zipkin.proto3.spanservice/report"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NOTE: ignore this it is copy/paste and needs to be re-written

@jeqo
Copy link
Member

jeqo commented Oct 2, 2019

maybe @Nullable String errorMessage() or similar would be better, as just message could be confusing

@codefromthecrypt
Copy link
Member Author

codefromthecrypt commented Oct 2, 2019 via email

@codefromthecrypt
Copy link
Member Author

ok I added error parsing. next steps on this are to integrate gRPC and dubbo.

After this is merged, I'd recommend pulling out the brave.Response type and backporting http to implement it. After that, try again on the messaging abstraction!

@codefromthecrypt
Copy link
Member Author

PS I plan to stop on this PR last friday :) I will hard stop on it tomorrow morning, so hopefully we can have a volunteer to carry forward after that.

@codefromthecrypt
Copy link
Member Author

ok formally setting this down for someone else to pick up or delete.

NEXT STEPS:

  • implement dubbo, dubbo-rpc, grpc

@codefromthecrypt
Copy link
Member Author

looks like this stopped, which is also blocking #914, which is probably blocking some other things..

@codefromthecrypt
Copy link
Member Author

so this is from otel (possibly just lifting grpc stuff) https://github.com/open-telemetry/opentelemetry-java/blob/c24e4a963669cfd044d7478a801e8f170faba4fe/api/src/main/java/io/opentelemetry/trace/Status.java#L38

I think we do need a message, regardless of whether it is success or failure. I'm less sure about "code" as not sure that's entirely portable.

@minwoox do you know between the various frameworks armeria supports (grpc, thrift etc) which properties are actually portable?
@trytocatch I looked at this for dubbo.. seems mostly to define stats, not generic properties like status message or code. do you have any comments about properties most required for stats/tracing? https://github.com/apache/dubbo/blob/master/dubbo-rpc/dubbo-rpc-api/src/main/java/org/apache/dubbo/rpc/RpcStatus.java

@trytocatch
Copy link

trytocatch commented Apr 15, 2020

so this is from otel (possibly just lifting grpc stuff) https://github.com/open-telemetry/opentelemetry-java/blob/c24e4a963669cfd044d7478a801e8f170faba4fe/api/src/main/java/io/opentelemetry/trace/Status.java#L38

I think we do need a message, regardless of whether it is success or failure. I'm less sure about "code" as not sure that's entirely portable.

@minwoox do you know between the various frameworks armeria supports (grpc, thrift etc) which properties are actually portable?
@trytocatch I looked at this for dubbo.. seems mostly to define stats, not generic properties like status message or code. do you have any comments about properties most required for stats/tracing? https://github.com/apache/dubbo/blob/master/dubbo-rpc/dubbo-rpc-api/src/main/java/org/apache/dubbo/rpc/RpcStatus.java

sorry, I just fixed a small concurrency problem there

@codefromthecrypt
Copy link
Member Author

thanks for the reply @trytocatch @beiwei30 do you have any commentary on RPC status/message portability from Dubbo?

codefromthecrypt pushed a commit that referenced this pull request Apr 27, 2020
Similar to #999, HTTP should also be able to derive the error directly
from `HttpResponse` vs a second argument. This also allows exceptions
that nest status codes to become parsable into span tags.
codefromthecrypt pushed a commit that referenced this pull request Apr 27, 2020
This fixes some glitches in preparation of #999

* Bumps gRPC to 1.29 (there was no impact)
* GrpcClientRequest and GrpcServerRequest unwrapped `this`, not the call.
* Anonymous GrpcClientParser and GrpcServerParser used ErrorParser unnecessarily
  * This is implicitly done in the Zipkin span handler
* Status.cause was ignored
  * This led to a less precise "error" tag
* Client headers were unnecessarily null during sampling. This was caused by
  too eager allocation of a span.
* ClientCall.Listener was wrapped twice
* Most ClientCall hooks were not placed in scope
* ServerCall's span was started too late
codefromthecrypt pushed a commit that referenced this pull request Apr 27, 2020
This fixes some glitches in preparation of #999

* Bumps gRPC to 1.29 (there was no impact)
* GrpcClientRequest and GrpcServerRequest unwrapped `this`, not the call.
* Anonymous GrpcClientParser and GrpcServerParser used ErrorParser unnecessarily
  * This is implicitly done in the Zipkin span handler
* Status.cause was ignored
  * This led to a less precise "error" tag
* Client headers were unnecessarily null during sampling. This was caused by
  too eager allocation of a span.
* ClientCall.Listener was wrapped twice
* Most ClientCall hooks were not placed in scope
* ServerCall's span was started too late
codefromthecrypt pushed a commit that referenced this pull request Apr 27, 2020
)

This fixes some glitches in preparation of #999

* Bumps gRPC to 1.29 (there was no impact)
* GrpcClientRequest and GrpcServerRequest unwrapped `this`, not the call.
* Anonymous GrpcClientParser and GrpcServerParser used ErrorParser unnecessarily
  * This is implicitly done in the Zipkin span handler
* Status.cause was ignored
  * This led to a less precise "error" tag
* Client headers were unnecessarily null during sampling. This was caused by
  too eager allocation of a span.
* ClientCall.Listener was wrapped twice
* Most ClientCall hooks were not placed in scope
* ServerCall's span was started too late
codefromthecrypt pushed a commit that referenced this pull request Apr 27, 2020
This fixes some glitches in preparation of #999. These changes were
already reviewed, just being raised separately to reduce the diff.

* Restores old dubbo.error_code numeric value
  * 999 will use this unless initialized with RpcTracing
* Sets callback context of alibaba dubbo client to invocation context.
* Makes integration tests consistently use setTracing unless sampling
  * This helps classify legacy behavior vs new
* General cleanups on alibaba dubbo Future code
* Addition of types used for 999
codefromthecrypt pushed a commit that referenced this pull request Apr 27, 2020
…1177)

This fixes some glitches in preparation of #999. These changes were
already reviewed, just being raised separately to reduce the diff.

* Restores old dubbo.error_code numeric value
  * 999 will use this unless initialized with RpcTracing
* Sets callback context of alibaba dubbo client to invocation context.
* Makes integration tests consistently use setTracing unless sampling
  * This helps classify legacy behavior vs new
* General cleanups on alibaba dubbo Future code
* Addition of types used for 999
@codefromthecrypt codefromthecrypt changed the title Adds basic parsers and handlers for RPC instrumentation Adds parsers and handlers for Rpc abstraction and ports Dubbo and gRPC Apr 27, 2020
@codefromthecrypt
Copy link
Member Author

ok this is back in business. I'm happy with the code I think and hope you are. @anuraaga has done a lot of work moving my hands. The commit log will betray this, so please thank him.

tomorrow, I will polish up the readmes for shipit!

@@ -25,3 +25,14 @@ record the application exception.
In short, we choose to not mask the gRPC status with a potential application
exception on close. In worst case, if there is no instrumentation for the layer
that throws, its trace and span ID could be known via log correlation.

## Why don't we use RpcClientHandler or RpcServerHandler for unexpected errors?
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@anuraaga since we now have a more compelling story about parsing, I agree that we should keep it nicer on the happy path vs the edge cases. This follows up your points on the other pull request I hope. Otherwise, let me know how to change it.

@@ -509,6 +489,59 @@ void initMessageTaggingClient() {
assertThat(reporter.takeRemoteSpan(Span.Kind.CLIENT).parentId()).isNull();
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NOTE: all ITs still pass, across dubbo and grpc. This means the data policy is exactly the same unless, someone initializes with RpcTracing, which will set the new policy from the rpc abstraction (ex "rpc.error_code")

@codefromthecrypt
Copy link
Member Author

rebased over a jax-rs test flake fix.

proceeding to add README content to finish this

@codefromthecrypt
Copy link
Member Author

OK this is done now, imho

Copy link
Contributor

@anuraaga anuraaga left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small nits but looks good, thanks!

*
* @since 5.12
*/
@Override public Metadata headers() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

*/
// NOTE: gRPC is Java 1.7+, so we cannot add methods to this later
public interface GrpcRequest {
MethodDescriptor<?, ?> methodDescriptor();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest just method()

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree. that's a lot better and now's the only time to change it!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ergh. now I remember. we define method() in RpcRequest already. I'll make a comment

Scope scope = currentTraceContext.newScope(span.context());
Result result = null;
Throwable error = null;
try {
result = invoker.invoke(invocation);
error = result.getException();
Future<Object> future = rpcContext.getFuture(); // the case on async client invocation
if (future instanceof FutureAdapter) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

before, this instanceof was also checking null. In practice, this is always a (dubbo) FutureAdapter

@codefromthecrypt codefromthecrypt merged commit aea76a8 into master Apr 28, 2020
@codefromthecrypt codefromthecrypt deleted the rpc-instrumentation branch April 28, 2020 06:14
@codefromthecrypt
Copy link
Member Author

@anuraaga you're the hero of this change! thanks for the weeks of review on this (and all the ancillary changes around it)

thgoexpt added a commit to thgoexpt/Java-distributed-tracing-implementation that referenced this pull request Apr 7, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants