Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: Replace component registration system with depencencies #138

Closed
owais opened this issue Jul 8, 2019 · 6 comments
Closed

Proposal: Replace component registration system with depencencies #138

owais opened this issue Jul 8, 2019 · 6 comments
Assignees
Labels
discussion-needed Community discussion needed
Milestone

Comments

@owais
Copy link
Contributor

owais commented Jul 8, 2019

Overview

Currently all components factories such as receivers, exporters, etc self-register themselves with a central factory "repository". This add a number of restrictions on how these components and the greater OpenTelemetry Service project can be used. The registries maintain global lists of all registered components and as soon as a component package is imported by any code, it is immediately registered globally. This is convenient when workng directly on the opentelemetry service but can seriously limit flexibility for people consuming otel as a service, doing custom builds or writing tests.

Proposal

I propose we remove the global registries and follow more of a bare-bones DI pattern to "register" the components. The otel command would be responsible for importing the core infrastructure and the components, and then stitching them together. This would make the whole system a lot more flexible and enable many possible use-cases that would be very hard with central registries. I'll try to list down some of the benefits of this approach.

Example

Here is a very basic example of what it would look like. This can obviously be improved a lot and I'll touch on that later.

// cmd/otelsvc/main.go
package main

import (
    // import base command infrastructure
	"github.com/open-telemetry/opentelemetry-service/collector"
	"github.com/open-telemetry/opentelemetry-service/receivers"
	"github.com/open-telemetry/opentelemetry-service/exporters"

    // import components
	"github.com/open-telemetry/opentelemetry-service/receiver/vmmetricsreceiver"
	"github.com/open-telemetry/opentelemetry-service/receiver/opencensusrecever"
	"github.com/open-telemetry/opentelemetry-service/exporters/opencensusexporter"
)

func main() {
    recs := []receivers.Factory{
        opencensusrecever.Factory{},
        vmmetricsreceiver.Factory{},
    } 
    exps := []exporters.Factory{
        opencensusexporter.Factory{},
    }

    // collector.App is no longr
    app := collector.New(recs, exps)
    app.StartUnified()
}

Discoverability

With this approach, anyone can immediately look at the command entry point code and discover which components are registered by default. No more hunting through the codebase or adding debug/log statements to figure out the default set of components.

Testing

IMO this will make writing and understanding tests a lot more straightforward. For example, a test case can create multiple instances of the collector with different dependencies (components). A package can import different components to test different things but if the components self-register at import, they can make tests non-deterministic in the sense that adding a new test can affect existing tests in strange ways if the new test needs to import any of the components. A very basic example:

// myPackage/a_test.go
package myPackage

import (
	"fmt"
	"path"
	"testing"

	"github.com/open-telemetry/opentelemetry-service/configv2"
	_ "github.com/open-telemetry/opentelemetry-service/exporter/opencensusexporter"
	_ "github.com/open-telemetry/opentelemetry-service/receiver/opencensusreceiver"
)

func TestA(t *testing.T) {
	c, err := configv2.LoadConfigFile(t, path.Join(".", "config.yaml"))
	// assert err is raised because no processor is available (registered)
}

config.yaml used by the test

receivers:
  opencensus:
    enabled: true

exporters:
  opencensus:
    enabled: true


processors:
  attributes:
    enabled: true

pipelines:
  traces:
    receivers: [opencensus]
    processors: [attributes]
    exporters: [opencensus]

TestA will work until someone else adds a new test to the package that needs the addattributesprocessor.

// myPackage/b_test.go
import (
	"github.com/open-telemetry/opentelemetry-service/processor/addattributesprocessor"
)

func TestB(t *testing.T) {
	// test something with addattributesprocessor
}

At this point TestA will start failing because attributes processor is now auto-registered. Of course, this is not a big issue right now and within the OpenTelemetry project but as the project and the ecosystem around it grows, I fear global auto-registration will cause issues like these.

Also, this one is a pretty basic and easy to fix scenario. As things get more complex, it might not be as easy to detect the cause for failing tests. For example, imagine a test written to test the whole pipeline where it prepares some input data, runs that through the pipeline and then checks the output data. It's not hard to imagine that another test importing something could affect the data being processed and affect the existing test in a non-deterministic way.

Instead, if we had more of a dependency based system instead of a registration based one, each test could specify it's own dependencies (components) completely independently without affecting any other tests in any way.

Testing Commands

This would especially be useful when the main collector Application needs to be tested. Different tests can initialize the main app multiple times with different dependencies and test them indepedently irrespective of what was or wasn't imported by others tests.

Composability and Wrappers

net/http is a great package. It defines very basic interfaces for handlers and accepts them when they are explicitly handed over to the listener. This has resulted in a highly rich ecosystem around net/http where the community has created so many different libraries and frameworks that build on top of net/http. This wouldn't be possible if defined the interfaces for handlers but then asked the interfaces to register themselves gloablly. For example, one can write an authentication middleware that takes any net/http compatible handler and geenerates a new one which can be registered with a net/http listener.

We obviously don't need as much flexibility as net/http but I think we still need to enable the community to be able to take any component (receiver, exporter, processor, etc), write a thin wrapper over it and create a new component with modified behaviour.

For example, if someone discovered an issue with the opencensusexporter for a very specific usecase, they create a new package called fixed-ocexporter, import the real opencesus exporter inside adding a thin wrapper over it that fixed their specific use case, and then import their custom build as shown below:

// cmd/otelsvc/main.go
package main

import (
    // import base command infrastructure
	"github.com/open-telemetry/opentelemetry-service/collector"
	"github.com/open-telemetry/opentelemetry-service/receivers"
	"github.com/open-telemetry/opentelemetry-service/exporters"

    // import components
	"github.com/open-telemetry/opentelemetry-service/receiver/vmmetricsreceiver"
	"github.com/open-telemetry/opentelemetry-service/exporters/opencensusexporter"

	"private/wrappedOCReceiver" // imported private receiver that wraps stock OC receiver
)

func main() {
    recs := []receivers.Factory{
        wrappedOCReceiver.Factory{}, 
        vmmetricsreceiver.Factory{},
    } 
    exps := []exporters.Factory{
        opencensusexporter.Factory{},
    }

    // collector.App is no longr
    app := collector.New(recs, exps)
    app.StartUnified()
}

This'll allow anyone to easily create or consume component wrappers or "middlewares/decorators" in the web development sense and compose their OT service from different components. People will be able to easily fix issues or modify the behaviour any component without having to fork them or change any config. This is a simple change that can help usher a very rich ecosystem around OT.

Verbosity

The above examples remove one line registration calls from all packages and add at least as many lines to the main command code. This consolidated the "registration" code in a single place which I think it very helpful even if it makes the command more verbose. The increase in command/app verbosity is pretty harmless. It mostly just lists the dependencies and doesn't contain any complex logic. However, it can still be reduced by a number of ways. I think adding a default package would probably be most helpful. An example:

// cmd/otelsvc/main.go
package main

import (
    // import base command infrastructure
	"github.com/open-telemetry/opentelemetry-service/collector"
	
	"github.com/open-telemetry/default"
)

func main() {
    app := collector.New(default.Receivers, default.Exporters, default.Processors)
    app.StartUnified()
}

Custom builds can then import the default package, add other packages to the defaults, remove some packages or even wrap some packages before handing them over to the command.

// cmd/otelsvc/main.go
package main

import (
	"github.com/open-telemetry/opentelemetry-service/collector"
	"github.com/open-telemetry/receiver"
	"github.com/open-telemetry/default"

	"ecosystem/opencensusAuth"

	"private/proprietaryReceiver"
)

func main() {
	// register an custom exporter
	exporters := append(default.Exporters, properitaryExporter)
	receivers := []receiver.Factory{}

	// wrap opencensus receiver with an auth wrapper to enable custom authn/authz
	for i, r := range default.Receivers {
		if isOCReceiver(r) {
			r := opencensusAuth.Wrap(r)
		}
		receivers := append(receivers, r)
	}

    app := collector.New(receivers, exporters, default.Processors)
    app.StartUnified()
}

Conclusion

The gist of the proposal is that the components should not have the knowledge about how, when or where they'll be used. They should just provide an implementation of a receiver, exporter, processor or something else and never make any assumptions about how they'll be consumed. This makes components less dependent on the core opentelemetry service while making the service itself a lot more flexible, and at the same time open up possibility of a richer ecosystem built around the OT service. This will also help reduce tight coupling between components, for example, configv2 will not need to consult receivers, exporters or processors packages as it won't need to call the GetFactory() function anymore. It'll instead take a dependency on a slices of Factories for different types which will be passed to it by the main Application. This makes Application the one and only coordinator of the service, exposes a clear, easy to follow code path and considerably reduces the the number of packages different components need to import. This also makes it trivial to easily substitute implementations with alternate versions or wrap/decorate them to add features or modify behaviour. Lastly this makes testing simpler and more deterministic by eliminating the side-effects of importing component packages.

@owais
Copy link
Contributor Author

owais commented Jul 8, 2019

@tigrannajaryan
Copy link
Member

tigrannajaryan commented Jul 8, 2019

@owais I like the proposal. (I think) I wrote the code for automatic factory registration and global registry and I am the first to admit it is not a good solution. I like what you suggest here and the reasoning is good. It will require to pass the factory lists as parameters in a few places so that they can be used instead of global registries so a bunch of files will need to be touched but it is a straightforward, low-risk change.

Unless someone objects let's do this early. The longer we wait the more will will need to be changed (since we will be adding more exporters/processor very soon).

@bogdandrutu
Copy link
Member

+1 to pass the list of available components in the main function. Also as you can see the "application" package should be extracted in a standalone component that others can reuse and not be part of the collector.

@pjanotti
Copy link
Contributor

pjanotti commented Jul 8, 2019

I already discussed this with Owais and I agree with his approach.

@owais
Copy link
Contributor Author

owais commented Jul 8, 2019

Thanks everyone. If we have consensus, can someone assign this issue to me? I'll try to get this done this week.

@pjanotti
Copy link
Contributor

Closed via #159

MovieStoreGuy pushed a commit to atlassian-forks/opentelemetry-collector that referenced this issue Nov 11, 2021
* Run golangci-lint in all directories with go.mod

* Imports fix by golangci-lint

* Propagate failures from commands running in loops
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
discussion-needed Community discussion needed
Projects
None yet
Development

No branches or pull requests

5 participants