Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gRPC post spans #1042

Closed
wants to merge 4 commits into from
Closed

Conversation

isaachier
Copy link
Contributor

Add PostSpans functionality to gRPC branch.

Signed-off-by: Isaac Hier <isaachier@gmail.com>
Signed-off-by: Isaac Hier <isaachier@gmail.com>
@codecov
Copy link

codecov bot commented Sep 4, 2018

Codecov Report

Merging #1042 into grpc will increase coverage by 0.13%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##             grpc    #1042      +/-   ##
==========================================
+ Coverage   99.67%   99.81%   +0.13%     
==========================================
  Files         140      139       -1     
  Lines        6424     6403      -21     
==========================================
- Hits         6403     6391      -12     
+ Misses         21       12       -9
Impacted Files Coverage Δ
cmd/collector/app/grpc_handler.go 55.55% <100%> (+55.55%) ⬆️
cmd/ingester/app/flags.go

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update aa9059d...1222ead. Read the comment docs.

lis, err := net.Listen("tcp", "localhost:0")
require.NoError(t, err)
go func() {
err := server.Serve(lis)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doesn't this introduce a potential race condition where the client is initialized before the server is ready?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I believe that this is somewhat unavoidable. The issue is that Serve will block indefinitely. Using a sync.WaitGroup would help, if I could call the Done method before the the server blocks, but there is no good place to do that.

defer server.Stop()
client, conn := newClient(t, addr)
defer conn.Close()
ctx, cancel := context.WithTimeout(context.Background(), time.Second)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can't we just use context.Background()?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe timeout helps with the race when client starts before server. Although, the conn car above, is it already open?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough. I think this originally helped me when I had an issue with client hanging forever. Now that it's fixed, can remove this.

func (p *mockSpanProcessor) ProcessSpans(spans []*model.Span, spanFormat string) ([]bool, error) {
p.mux.Lock()
defer p.mux.Unlock()
p.spans = append(p.spans, spans...)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this all necessary? It looks like your tests call ProcessSpans just once, the extra complexity (mutex, keeping track of spans, etc) doesn't look required.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's needed since read and write are in different goroutines

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Luckily for me, the only code I had to write here was essentially copied from the equivalent Thrift test. It does locking the same way IIRC.

})
require.NoError(t, err)
require.False(t, r.GetOk())
require.Len(t, processor.getSpans(), 1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so when you do this, you're essentially testing a function on the mock, kinda weird.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It tests sending spans over the wire, no?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ya again, Thrift handler does this.

Copy link
Member

@yurishkuro yurishkuro Oct 8, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean if spans go over the wire then we're not just testing mock, we're testing the transmission and marshalling.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes exactly.

return p.spans
}

func initializeGRPCTestServer(t *testing.T, err error) (*grpc.Server, *mockSpanProcessor, net.Addr) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: expectedError

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea.

}
return &api_v2.PostSpansResponse{Ok: true}, nil
return &api_v2.PostSpansResponse{Ok: success}, nil
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may want to rethink this API. What are we trying to communicate back to the caller with this Boolean flag? At least (accepted, rejected) counts would make more sense. When can the processor return not ok? Should it be returning specific errors instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that the API as it stands isn't exactly perfect. If anything, a boolean vector (or even bit vector) might make more sense here. Count works too if we don't care which spans failed. An optional error message might be better than the boolean response.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any thoughts @yurishkuro?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The question is: what is the sender going to do with the errors?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, I agree with you here. The only reason to provide an error is if we believe in some far-fetched scenario it would be retried. Also, an empty response sort of seems weird, but IDK what else to do here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we want to support retries we should be using gRPC error code for that. Can you check what openCensus is doing in this case?

func (p *mockSpanProcessor) ProcessSpans(spans []*model.Span, spanFormat string) ([]bool, error) {
p.mux.Lock()
defer p.mux.Unlock()
p.spans = append(p.spans, spans...)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's needed since read and write are in different goroutines


func TestPostSpansWithError(t *testing.T) {
processorErr := errors.New("test-error")
server, processor, addr := initializeGRPCTestServer(t, processorErr)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should simplify this. You can return a test server struct with fields instead of multiple values. And we can rewrite this in FP style withServer(params, func) to hide all the client/server complexity from the test.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. Again, sort of copied this stuff. Can easily clean this up a bit.

defer server.Stop()
client, conn := newClient(t, addr)
defer conn.Close()
ctx, cancel := context.WithTimeout(context.Background(), time.Second)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe timeout helps with the race when client starts before server. Although, the conn car above, is it already open?

})
require.NoError(t, err)
require.False(t, r.GetOk())
require.Len(t, processor.getSpans(), 1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It tests sending spans over the wire, no?

Isaac Hier added 2 commits September 28, 2018 13:07
Signed-off-by: Isaac Hier <ihier@uber.com>
Signed-off-by: Isaac Hier <ihier@uber.com>
fmt.Printf("PostSpans(%+v)\n", *r)
for _, s := range r.Batch.Spans {
println(s.OperationName)
oks, err := g.spanProcessor.ProcessSpans(r.GetBatch().Spans, JaegerFormatType)
Copy link
Member

@yurishkuro yurishkuro Oct 8, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is losing Batch.Process data.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the spans have pointers to the process. I'm not sure when that back-reference is set. If it is already set, there should not be a problem.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

by convention when a bunch of spans arrive in a Batch and logically they are all from the same Process then Span.Process will be nil, but Batch.Process will be defined. This is done to minimize the amount of data sent over the wire. So you need to loop through spans and copy the Process pointer from Batch if Span.Process == nil

@pavolloffay
Copy link
Member

done in #1165

@isaachier isaachier deleted the grpc-post-spans branch November 28, 2018 00:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants