Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor agent configuration #1092

Merged
merged 11 commits into from
Oct 29, 2018
Merged

Conversation

pavolloffay
Copy link
Member

@pavolloffay pavolloffay commented Sep 28, 2018

Resolves: #927

  • agent tchannel flags moved to reporter.tchannel.*
  • agent reporters have to be specified explicitly b.WithReporters - if none added create fails
  • agent client configuration manager have to be specified explicitly b.WithClientConfigManager - if none added create fails

@ghost ghost assigned pavolloffay Sep 28, 2018
@ghost ghost added the review label Sep 28, 2018
@pavolloffay pavolloffay changed the title WIP: Agent flags WIP: Refactor agent configuration Sep 28, 2018
@@ -100,8 +98,11 @@ type HTTPServerConfiguration struct {
}

// WithReporter adds auxiliary reporters.
func (b *Builder) WithReporter(r reporter.Reporter) *Builder {
b.otherReporters = append(b.otherReporters, r)
func (b *Builder) WithReporter(r reporter.ReporterBuilder) *Builder {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For multiple builders, I think this is meant to be called more than once, right?

b.WithReporter(reporterBuilderA).WithReporter(reporterBuilderB)

Wouldn't it be better to accept multiple builders at once? Like:

b.WithReporters(reporterBuilderA, reporterBuilderB)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, can make it a vararg

cmd/agent/app/reporter/config.go Outdated Show resolved Hide resolved
cmd/agent/app/reporter/tchannel/reporter.go Outdated Show resolved Hide resolved
cmd/all-in-one/main.go Outdated Show resolved Hide resolved
if len(tb.CollectorHostPorts) == 0 {
tb.CollectorHostPorts = append(tb.CollectorHostPorts, fmt.Sprintf("127.0.0.1:%d", cOpts.CollectorPort))
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no other possible reporter builder yet, but you never know the future... Perhaps it would be better to add a default case where, to log a warning in in case the type isn't recognized.

rs = append(rs, r)
switch val := rb.(type) {
case *tchreporter.Builder:
server = b.HTTPServer.GetHTTPServer(val.CollectorServiceName, r.(*tchreporter.Reporter).Channel(), b.metricsFactory)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a problem. The abstraction wasn't clear here previously, i.e. both the reporter (invoked from UDP servers) and the HTTP sampling/config endpoint were relying on the TChannel connection to the collector. If everything was wired manually in the main() then we could've cleaned it up by instantiating the connection first and passing it to both the reporter and the http config server. But with the configuration approach it was less obvious how to do that. Yet I think we should address it now since we're making a breaking change (from configuration file pov) anyway.

@codecov
Copy link

codecov bot commented Oct 2, 2018

Codecov Report

Merging #1092 into master will not change coverage.
The diff coverage is 100%.

Impacted file tree graph

@@          Coverage Diff           @@
##           master   #1092   +/-   ##
======================================
  Coverage     100%    100%           
======================================
  Files         142     144    +2     
  Lines        6743    6769   +26     
======================================
+ Hits         6743    6769   +26
Impacted Files Coverage Δ
cmd/agent/app/flags.go 100% <ø> (ø) ⬆️
cmd/agent/app/reporter/tchannel/flags.go 100% <100%> (ø)
cmd/agent/app/reporter/tchannel/collector_proxy.go 100% <100%> (ø)
cmd/agent/app/builder.go 100% <100%> (ø) ⬆️
cmd/agent/app/reporter/tchannel/reporter.go 100% <100%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update fb10d4c...8af5d3e. Read the comment docs.

@pavolloffay pavolloffay changed the title WIP: Refactor agent configuration Refactor agent configuration Oct 2, 2018
@pavolloffay
Copy link
Member Author

@yurishkuro @jpkrohling PR updated. I think all concerns were addressed.

@@ -125,33 +116,36 @@ func (b *Builder) getMetricsFactory() (metrics.Factory, error) {
return nil, err
}

return baseFactory.Namespace("agent", nil), nil
b.metricsFactory = baseFactory.Namespace("agent", nil)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unexpected side effect. Is it necessary? Can we not pass it explicitly as before?

I'd rather do it in one way, either always pass explicitly (e.g. to CreateAgent) or change to InitMetricsFactory and never pass to member methods (but pretty much everywhere else we pass metrics and logger explicitly).

@@ -25,7 +25,7 @@ services:

jaeger-agent:
image: jaegertracing/jaeger-agent
command: ["--collector.host-port=jaeger-collector:14267"]
command: ["--reporter.tchannel.collector.host-port=jaeger-collector:14267"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we want to do this as a breaking change or support the old config options as deprecated

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will try to depreciate in a way that values from old flags would be used and logged a warning.

I don't want to keep old flags just as noop. Then I think it's better to fail at init time to avoid misconfiguration.

@@ -50,9 +52,19 @@ func main() {

builder := &app.Builder{}
builder.InitFromViper(v)
tchanRep := tchannel.NewBuilder().InitFromViper(v)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what signal would this code use to instantiate grpc instead of tchannel reporter?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if I follow. Do you mean doing something similar as SPAN_STORAGE_TYPE?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could do something similar to storage type or add all reporter flags and introduce flag --reporter.type

cmd/agent/app/builder.go Outdated Show resolved Hide resolved
func (c HTTPServerConfiguration) GetHTTPServer(svc string, channel *tchannel.Channel, mFactory metrics.Factory) *http.Server {
mgr := httpserver.NewCollectorProxy(svc, channel, mFactory)
func (c HTTPServerConfiguration) getHTTPServer(manager httpserver.ClientConfigManager, mFactory metrics.Factory, mBuilder *jmetrics.Builder) (*http.Server, error) {
if manager == nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we never do these checks anywhere else afaik

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added these checks to better understand requirements. manager and reporter is mandatory. We can also change the builder to build(reporters, manager)

@@ -50,9 +52,19 @@ func main() {

builder := &app.Builder{}
builder.InitFromViper(v)
tchanRep := tchannel.NewBuilder().InitFromViper(v)
mFactory, err := builder.GetMetricsFactory()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CreateAgent calls this again

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is cached

if err != nil {
logger.Fatal("Could not create metrics", zap.Error(err))
}
r, err := tchanRep.CreateReporter(mFactory, logger)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it would make more sense to encapsulate "tchannel collector proxy" that will either implement or at least expose both Reporter and ClientConfigManager APIs. It would make the code in main() simpler (and code itself can be unit tested), and will be easier to switch for gRPC implementation.

Right now httpserver.NewCollectorProxy() below is leaking tchannel detail outside of tchannel package.

@pavolloffay
Copy link
Member Author

@yurishkuro could you please have a look?

@yurishkuro
Copy link
Member

I'm trying to build out internal versions with this branch.

@yurishkuro
Copy link
Member

I was able to build our internal agent with these changes, but haven't had a change to test it, esp. loading of the old configs.

- 127.0.0.1:14269

collectorServiceName: some-collector-service
minPeers: 4
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just to confirm - this is no longer supported? I thought there was a fallback with the options that might make the old config still work.

Copy link
Member Author

@pavolloffay pavolloffay Oct 15, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

old flags are supported and deprecated https://github.com/jaegertracing/jaeger/pull/1092/files#diff-1fc7689d387f4303d6e478ca01841286R29. The new flags take precedence over old ones.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Jaeger agent is a daemon program that runs on every host and receives tracing data submitted by Jaeger client libraries.

Usage:
  jaeger-agent [flags]
  jaeger-agent [command]

Available Commands:
  help        Help about any command
  version     Print the version

Flags:
      --collector.host-port string                                Deprecated; comma-separated string representing host:ports of a static list of collectors to connect to directly (e.g. when not using service discovery)
      --config-file string                                        Configuration file in JSON, TOML, YAML, HCL, or Java properties formats (default none). See spf13/viper for precedence.
      --discovery.conn-check-timeout duration                     Deprecated; sets the timeout used when establishing new connections (default 250ms)
      --discovery.min-peers int                                   Deprecated; if using service discovery, the min number of connections to maintain to the backend (default 3)
  -h, --help                                                      help for jaeger-agent
      --http-server.host-port string                              host:port of the http server (e.g. for /sampling point and /baggage endpoint) (default ":5778")
      --log-level string                                          Minimal allowed log Level. For more levels see https://github.com/uber-go/zap (default "info")
      --metrics-backend string                                    Defines which metrics backend to use for metrics reporting: expvar, prometheus, none (default "prometheus")
      --metrics-http-route string                                 Defines the route of HTTP endpoint for metrics backends that support scraping (default "/metrics")
      --processor.jaeger-binary.server-host-port string           host:port for the UDP server (default ":6832")
      --processor.jaeger-binary.server-max-packet-size int        max packet size for the UDP server (default 65000)
      --processor.jaeger-binary.server-queue-size int             length of the queue for the UDP server (default 1000)
      --processor.jaeger-binary.workers int                       how many workers the processor should run (default 10)
      --processor.jaeger-compact.server-host-port string          host:port for the UDP server (default ":6831")
      --processor.jaeger-compact.server-max-packet-size int       max packet size for the UDP server (default 65000)
      --processor.jaeger-compact.server-queue-size int            length of the queue for the UDP server (default 1000)
      --processor.jaeger-compact.workers int                      how many workers the processor should run (default 10)
      --processor.zipkin-compact.server-host-port string          host:port for the UDP server (default ":5775")
      --processor.zipkin-compact.server-max-packet-size int       max packet size for the UDP server (default 65000)
      --processor.zipkin-compact.server-queue-size int            length of the queue for the UDP server (default 1000)
      --processor.zipkin-compact.workers int                      how many workers the processor should run (default 10)
      --reporter.tchannel.collector.host-port string              comma-separated string representing host:ports of a static list of collectors to connect to directly (e.g. when not using service discovery)
      --reporter.tchannel.discovery.conn-check-timeout duration   sets the timeout used when establishing new connections (default 250ms)
      --reporter.tchannel.discovery.min-peers int                 if using service discovery, the min number of connections to maintain to the backend (default 3)

Use "jaeger-agent [command] --help" for more information about a command.

Signed-off-by: Pavol Loffay <ploffay@redhat.com>
Signed-off-by: Pavol Loffay <ploffay@redhat.com>
Signed-off-by: Pavol Loffay <ploffay@redhat.com>
Signed-off-by: Pavol Loffay <ploffay@redhat.com>
Signed-off-by: Pavol Loffay <ploffay@redhat.com>
Signed-off-by: Pavol Loffay <ploffay@redhat.com>
Signed-off-by: Pavol Loffay <ploffay@redhat.com>
Signed-off-by: Pavol Loffay <ploffay@redhat.com>
Signed-off-by: Pavol Loffay <ploffay@redhat.com>
Signed-off-by: Pavol Loffay <ploffay@redhat.com>
Signed-off-by: Pavol Loffay <ploffay@redhat.com>
@pavolloffay
Copy link
Member Author

@yurishkuro is there anything more what should be done?

@yurishkuro
Copy link
Member

We're testing it internally to make sure we can still do multi-reporter configs. Will update soon.

cc @isaachier

@pavolloffay
Copy link
Member Author

Any news @isaachier or @yurishkuro ?

Copy link
Contributor

@isaachier isaachier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We tested this internally and it looks good.

@yurishkuro yurishkuro merged commit 3bdb8e6 into jaegertracing:master Oct 29, 2018
@ghost ghost removed the review label Oct 29, 2018
@yurishkuro yurishkuro mentioned this pull request Oct 29, 2018
7 tasks
@yurishkuro yurishkuro added this to the 1.8 Release milestone Nov 9, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants