-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gRPC post spans #1042
gRPC post spans #1042
Conversation
Signed-off-by: Isaac Hier <isaachier@gmail.com>
eb0a65a
to
0db4f12
Compare
Signed-off-by: Isaac Hier <isaachier@gmail.com>
Codecov Report
@@ Coverage Diff @@
## grpc #1042 +/- ##
==========================================
+ Coverage 99.67% 99.81% +0.13%
==========================================
Files 140 139 -1
Lines 6424 6403 -21
==========================================
- Hits 6403 6391 -12
+ Misses 21 12 -9
Continue to review full report at Codecov.
|
lis, err := net.Listen("tcp", "localhost:0") | ||
require.NoError(t, err) | ||
go func() { | ||
err := server.Serve(lis) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doesn't this introduce a potential race condition where the client is initialized before the server is ready?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, I believe that this is somewhat unavoidable. The issue is that Serve
will block indefinitely. Using a sync.WaitGroup
would help, if I could call the Done
method before the the server blocks, but there is no good place to do that.
defer server.Stop() | ||
client, conn := newClient(t, addr) | ||
defer conn.Close() | ||
ctx, cancel := context.WithTimeout(context.Background(), time.Second) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can't we just use context.Background()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe timeout helps with the race when client starts before server. Although, the conn car above, is it already open?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough. I think this originally helped me when I had an issue with client hanging forever. Now that it's fixed, can remove this.
func (p *mockSpanProcessor) ProcessSpans(spans []*model.Span, spanFormat string) ([]bool, error) { | ||
p.mux.Lock() | ||
defer p.mux.Unlock() | ||
p.spans = append(p.spans, spans...) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this all necessary? It looks like your tests call ProcessSpans
just once, the extra complexity (mutex, keeping track of spans, etc) doesn't look required.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's needed since read and write are in different goroutines
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. Luckily for me, the only code I had to write here was essentially copied from the equivalent Thrift test. It does locking the same way IIRC.
}) | ||
require.NoError(t, err) | ||
require.False(t, r.GetOk()) | ||
require.Len(t, processor.getSpans(), 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so when you do this, you're essentially testing a function on the mock, kinda weird.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It tests sending spans over the wire, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ya again, Thrift handler does this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean if spans go over the wire then we're not just testing mock, we're testing the transmission and marshalling.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes exactly.
return p.spans | ||
} | ||
|
||
func initializeGRPCTestServer(t *testing.T, err error) (*grpc.Server, *mockSpanProcessor, net.Addr) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: expectedError
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea.
} | ||
return &api_v2.PostSpansResponse{Ok: true}, nil | ||
return &api_v2.PostSpansResponse{Ok: success}, nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We may want to rethink this API. What are we trying to communicate back to the caller with this Boolean flag? At least (accepted, rejected) counts would make more sense. When can the processor return not ok? Should it be returning specific errors instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that the API as it stands isn't exactly perfect. If anything, a boolean vector (or even bit vector) might make more sense here. Count works too if we don't care which spans failed. An optional error message might be better than the boolean response.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any thoughts @yurishkuro?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The question is: what is the sender going to do with the errors?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, I agree with you here. The only reason to provide an error is if we believe in some far-fetched scenario it would be retried. Also, an empty response sort of seems weird, but IDK what else to do here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we want to support retries we should be using gRPC error code for that. Can you check what openCensus is doing in this case?
func (p *mockSpanProcessor) ProcessSpans(spans []*model.Span, spanFormat string) ([]bool, error) { | ||
p.mux.Lock() | ||
defer p.mux.Unlock() | ||
p.spans = append(p.spans, spans...) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's needed since read and write are in different goroutines
|
||
func TestPostSpansWithError(t *testing.T) { | ||
processorErr := errors.New("test-error") | ||
server, processor, addr := initializeGRPCTestServer(t, processorErr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should simplify this. You can return a test server struct with fields instead of multiple values. And we can rewrite this in FP style withServer(params, func) to hide all the client/server complexity from the test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. Again, sort of copied this stuff. Can easily clean this up a bit.
defer server.Stop() | ||
client, conn := newClient(t, addr) | ||
defer conn.Close() | ||
ctx, cancel := context.WithTimeout(context.Background(), time.Second) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe timeout helps with the race when client starts before server. Although, the conn car above, is it already open?
}) | ||
require.NoError(t, err) | ||
require.False(t, r.GetOk()) | ||
require.Len(t, processor.getSpans(), 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It tests sending spans over the wire, no?
Signed-off-by: Isaac Hier <ihier@uber.com>
Signed-off-by: Isaac Hier <ihier@uber.com>
fmt.Printf("PostSpans(%+v)\n", *r) | ||
for _, s := range r.Batch.Spans { | ||
println(s.OperationName) | ||
oks, err := g.spanProcessor.ProcessSpans(r.GetBatch().Spans, JaegerFormatType) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is losing Batch.Process data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the spans have pointers to the process. I'm not sure when that back-reference is set. If it is already set, there should not be a problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
by convention when a bunch of spans arrive in a Batch and logically they are all from the same Process then Span.Process will be nil, but Batch.Process will be defined. This is done to minimize the amount of data sent over the wire. So you need to loop through spans and copy the Process pointer from Batch if Span.Process == nil
done in #1165 |
Add
PostSpans
functionality to gRPC branch.