Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Separate ES client code from ES output code #16150

Merged
merged 91 commits into from
Mar 6, 2020
Merged

Separate ES client code from ES output code #16150

merged 91 commits into from
Mar 6, 2020

Conversation

ycombinator
Copy link
Contributor

@ycombinator ycombinator commented Feb 6, 2020

What does this PR do?

This PR introduces a new package, libbeat/esleg/eslegclient. This package contains code pertaining to an Elasticsearch HTTP API client implementation that was previously part of the libbeat/outputs/elasticsearch package.

This PR is just the first step towards eventually introducing the official Elasticsearch HTTP API client, go-elasticsearch, into the Beats codebase. To that end, the new package introduced in this PR is named eslegclient, to stand for legacy Elasticsearch client.

Why is it important?

  • To reduce the responsibilities of the libbeat/outputs/elasticsearch package to focus primarily on publishing events to Elasticsearch, and
  • to make it easier to later introduce go-elasticsearch into the Beats codebase.

Developer Docs

The elasticsearch package below refers to libbeat/outputs/elasticsearch. The eslegclient package below refers to libbeat/esleg/eslegclient.

The phrase "replaced by" implies that the new symbol isn't a like-for-like drop-in substitute for the old one. Please look at the new symbol's definition to understand how to use it in place of the old one.

The phrase "moved to" implies a simple relocation of the symbol. The new symbol can be used as a like-for-like drop-in substitute for the old one; just the package suffix will need to be changed.

  • Connecting to Elasticsearch
    • elasticsearch.Client is significantly pared down in terms of fields and methods. It only contains fields and methods needed for publishing events using the Elasticsearch output. All other fields and methods have been moved into eslegclient.Connection.
    • elasticsearch.Connection is replaced by eslegClient.Connection.
    • elasticsearch.NewConnectedClient is replaced by eslegclient.NewConnectedClient.
    • elasticsearch.NewElasticsearchClients is replaced by eslegclient.NewClients.
  • Bulk API
    • elasticsearch.MetaBuilder has been removed.
    • elasticsearch.BulkResult has moved to eslegclient.BulkResult
    • elasticsearch.BulkReadToItems has been unexported.
    • elasticsearch.BulkReadItemStatus has been unexported.
    • elasticsearch.JSONReader has been unexported.
    • elasticsearch.NewJSONReader has been unexported.
  • Errors
    • elasticsearch.ErrNotConnected has moved to eslegclient.ErrNotConnected.
    • elasticsearch.ErrJSONEncodeFailed has moved to eslegclient.ErrJSONEncodeFailed.
    • elasticsearch.ErrResponseRead has moved to eslegclient.ErrResponseRead.
  • Other
    • elasticsearch.QueryResult has moved to eslegclient.QueryResult.
    • elasticsearch.SearchResults has moved to eslegclient.SearchResults.
    • elasticsearch.Hits has moved to eslegclient.Hits.
    • elasticsearch.CountResults has moved to eslegclient.CountResults.
    • elasticsearch.Total has moved to eslegclient.Total.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • [ ] I have made corresponding changes to the documentation
  • [ ] I have made corresponding change to the default configuration files
  • [ ] I have added tests that prove my fix is effective or that my feature works Since this is a refactoring, existing tests must continue to pass.

Related issues

@ycombinator ycombinator added in progress Pull request is currently in progress. libbeat :Outputs labels Feb 7, 2020
@ycombinator ycombinator added [zube]: In Review Team:Integrations Label for the Integrations team refactoring and removed in progress Pull request is currently in progress. labels Feb 7, 2020
@ycombinator ycombinator marked this pull request as ready for review February 7, 2020 23:59
@ycombinator ycombinator requested a review from ph February 7, 2020 23:59
@ycombinator ycombinator requested a review from urso February 8, 2020 00:00
CompressionLevel int `config:"compression_level" validate:"min=0, max=9"`
EscapeHTML bool `config:"escape_html"`
Timeout time.Duration `config:"timeout"`
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe as follow up, but these settings look quite generic, but HTTP focused. Just wondering if it would make sense to have these in libbeat/common/transport in one way or the other. We've got a many HTTP clients, with different set of settings.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, that makes sense to me. I'd prefer to do it as a follow up PR in which we look at HTTP settings across the various clients and consolidate the common ones somewhere under libbeat/common/transport.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created #16856 to track this refactoring as a follow up.

@ycombinator
Copy link
Contributor Author

@urso BTW, check out 6788cfd — the monitoring ES reporter no longer depends on the ES output package!

EscapeHTML: false,
TLS: nil,
}
)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

config can store pointers and map[string]interface{}. I'd sleep a little better if we'd turn these template configurations into functions:

func defaultConfig() config {
  ...
}

This guarantees that we can never share pointers/references by accident.

Copy link
Contributor Author

@ycombinator ycombinator Mar 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 63664e5.

@ycombinator
Copy link
Contributor Author

CI failures are unrelated (same failures are also see in latest master builds). Merging.

@ycombinator ycombinator merged commit 8b8829e into elastic:master Mar 6, 2020
@ycombinator ycombinator deleted the lb-esclientleg branch March 6, 2020 02:08
@axw axw mentioned this pull request Mar 6, 2020
4 tasks
@simitt
Copy link
Contributor

simitt commented Mar 24, 2020

Is this going to be backported to 7.x and therefore 7.7?

@ycombinator
Copy link
Contributor Author

ycombinator commented Mar 24, 2020

@urso WDYT about @simitt's question above? It would be very hard to backport this PR into 7.x without introducing breaking changes. Type aliasing and wrapper functions might help in some cases but not with all the changes introduced in this PR.

OTOH, we will need these changes in 7.x if we are to fix the output reloading issues for Agent in 7.x as well. So my vote is to backport these to 7.x even though it introduces breaking changes. We can try to minimize the breaking changes via type aliasing and wrapper functions as much as possible and document the remaining breaking changes clearly. WDYT?

@simitt
Copy link
Contributor

simitt commented Mar 24, 2020

Just for clarification - I did not mean to suggest backporting, just wanted to know as I am in the process of creating missing backports for APM Server.

mikemadden42 pushed a commit that referenced this pull request Mar 25, 2020
* Separate ES client code from ES output code (#16150)

* Basic extraction of ES client code from ES output code

* Move test

* Removing duplicate function in monitoring reporter

* Break import cycle

* Guard onConnect callback execution

* Replace use of field with getter

* Moving common bulk API response processing into esclientleg

* Moving API integration tests

* Fixing references in tests

* Adding developer CHANGELOG entry

* Move LoadJSON method

* Move callbacks to own file

* Move client-related constructors into client.go file

* Reducing global logging usage

* Use new constructor in test

* Passing logger in test

* Use logger in test

* Use struct fieldnames when initializing

* Use constructor in test

* Fixing typos

* Replace esclient.ParseProxyURL with generic function in common

* Imports formatting

* Moving more fields from ES output client to esclientleg.Connection

* Moving more fields

* Update test code

* Use new TLS package

* Extracting common test code into eslegtest package

* Replace uses of elasticsearch output client with esclientleg.NewConnection

* Replacing uses of ES output client struct with esclientleg.Connection

* Handle callbacks

* Fixing formatting

* Fixing import cycle

* Fixing import and package name

* Fixing imports

* More fixes

* Breaking import cycle

* Removing unused function

* Adding back missing statement

* Fixing param

* Fixing package name

* Include ES output plugin so it's registered

* Proxy handling

* Let Connection handle ProxyDisable setting

* Only parse proxy field from config if set

* Cast timeout ints

* Parse proxy URL

* Fixing proxy integration test

* Fixing ILM test

* Updating expected request count in test

* Fixing package names

* Lots more refactoring!!!

* Move timeout field

* More fixes

* Adding missing files

* No need to pass HTTP any more

* Simplifying Bulk API usage

* Removing unused code

* Remove bulk state from Connection

* Removing empty file

* Moving Bulk API response streaming parsing code back into ES output package

* Don't make monitoring bulk parsing code use streaming parser

* Replacing old HTTP struct passing

* Removing HTTP use

* Adding build tag

* Fixing up tests

* Allow default scheme to be configurable

* Adding versions to import paths

* Remove redundant check

* Undoing unnecessary heartbeat import change

* Forgot to resolve conflicts

* Fixing imports

* Running go mod tidy

* Revert "Remove redundant check"

This reverts commit c5fde6ff3be765a89c0bc20f9cae8f697d08d47e.

* Fixing args order

* Removing extraneous parameter

* Removing wrong errors package import

* Fixing order of arguments

* Fixing package name

* Instantiating logger for tests

* Making streaming JSON parser private to ES output package

* Detect and try to fix scheme before parsing URL

* Making Connection private to ES output Client

* Update test

* Replace client.Ping() calls with client.Connect() calls in test code

* Updating tests

* Removing usage of ES output from monitoring code!

* Using strings.Index instead of strings.SplitN

* Return default config via function call

* Removing "escape hatch" method to expose underlying connection from ES output client

* Using client connection in tests

* Re-implement Test() method for ES output client

* Adding back missing import / sorting imports

* Removing unused import

* Fixing up developer CHANGELOG

* Clean up rebase

* Rebase cleanup

* Making 7.x specific adaptations (ML setup in Filebeat)

* Updating go.mod and go.sum files

* Running go mod tidy
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
libbeat :Outputs release-note:dev_docs Contains a Dev Docs section Team:Integrations Label for the Integrations team Team:Services (Deprecated) Label for the former Integrations-Services team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants