-
Notifications
You must be signed in to change notification settings - Fork 314
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for multiple clusters in custom runners #488
Conversation
Example invocations like: --target-hosts='{"default":"127.0.0.1:9200","remote":"127.0.0.1:19200"}' or (backwards compatible, single cluster) --target-hosts="127.0.0.1:9200"
and also move handling of es single/multi cluster passing to runners, inside the wrapper runners used by register_runner()
As things are now, --target-hosts does not need to be explicitly accompanied by --client-options (in which case the default timeout:60 gets applied). Allow this behavior as well, when --target-hosts specifies multiple clusters, assigning the default client-option to each cluster. Add needed tests and fix ClientOpotions test bug with test_json_file_parameter_parses().
Runners, such as many core runners, can be context-manager enabled. Ensure that when they get wrapped, the context gets preserved.
ExternalLauncherTests currently have some hard coded values for hosts and client options, which need to be adjusted now that both cfg options store an Object.
Ensure client-options per cluster get properly propagated
Rally defaults to 127.0.0.1:39200 when no --target-host is used (and no external candidate is used). Fix passing of host/port in this default case.
and use the prefix "Single" instead of "Uni" for consistency across the code.
We stick to only attaching telemetry devices against one cluster (the "default"). Having a telemetry device retrieve data from a different cluster relies only on the "es" parameter (the ES client) and thus there's no advantage making this code "cluster aware".
This should have been a part of ab8af4a
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR. Nice work! The general direction looks fine but I left some comments.
docs/command_line_reference.rst
Outdated
This will run the benchmark against the hosts 10.17.0.5 and 10.17.0.6 on port 9200. See ``client-options`` if you use X-Pack Security and need to authenticate or Rally should use https. | ||
** This will run the benchmark against the hosts 10.17.0.5 and 10.17.0.6 on port 9200. See ``client-options`` if you use X-Pack Security and need to authenticate or Rally should use https. | ||
|
||
You can also target multiple clusters with ``--target-hosts`` for specific use cases. This is described in the Advanced topics section. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe add an in-page link?
docs/command_line_reference.rst
Outdated
@@ -435,7 +434,9 @@ If you run the ``benchmark-only`` :doc:`pipeline </pipelines>` or you want Rally | |||
|
|||
esrally --pipeline=benchmark-only --target-hosts=10.17.0.5:9200,10.17.0.6:9200 | |||
|
|||
This will run the benchmark against the hosts 10.17.0.5 and 10.17.0.6 on port 9200. See ``client-options`` if you use X-Pack Security and need to authenticate or Rally should use https. | |||
** This will run the benchmark against the hosts 10.17.0.5 and 10.17.0.6 on port 9200. See ``client-options`` if you use X-Pack Security and need to authenticate or Rally should use https. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason for adding **
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo. Thanks for catching.
docs/command_line_reference.rst
Outdated
~~~~~~~~~~~~~~~~ | ||
|
||
Rally can also create client connections for multiple Elasticsearch clusters. | ||
This is only useful if you want to create :ref:`custom runners <adding_tracks_custom_runners>` that execute operations against remote clusters, for example for Cross Cluster Search or Cross Cluster Replication. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
replace "against remote clusters" with "against multiple clusters"? Also maybe add a link for cross cluster search? I also think that both names need to be lower case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
docs/command_line_reference.rst
Outdated
Rally can also create client connections for multiple Elasticsearch clusters. | ||
This is only useful if you want to create :ref:`custom runners <adding_tracks_custom_runners>` that execute operations against remote clusters, for example for Cross Cluster Search or Cross Cluster Replication. | ||
|
||
To define the host:port pairs for additional clusters with ``target-hosts`` you can specify either a JSON filename (ending in ``.json``) or an inline JSON string. The JSON object should be a collection of name:value pairs. ``name`` is string for the cluster name and there **must be** one ``default``. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
replace "for additional clusters" with "for multiple clusters"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
docs/command_line_reference.rst
Outdated
{"timeout": 60}, | ||
"remote": | ||
{"use_ssl":true,"verify_certs": false,"ca_certs":"/path/to/cacert.pem"} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Maybe reformat so it's easier to read?
{
"default": {
"timeout": 60
},
"remote": {
"use_ssl": true,
"verify_certs": false,
"ca_certs": "/path/to/cacert.pem"
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
tests/mechanic/launcher_test.py
Outdated
@@ -76,11 +77,15 @@ def info(self, *args, **kwargs): | |||
|
|||
|
|||
class ExternalLauncherTests(TestCase): | |||
test_host = config_helper.TargetHosts('127.0.0.1:9200,10.17.0.5:19200') | |||
client_options = config_helper.ClientOptions('timeout:60') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
tests/utils/config_test.py
Outdated
def test_json_file_parameter_parses(self): | ||
self.assertEqual( | ||
{"default": ["127.0.0.1:9200","10.127.0.3:19200"] }, | ||
config.TargetHosts(os.path.join(os.path.dirname(__file__), "resources/target_hosts_1.json")).all_hosts) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: I think this should rather be os.path.join(os.path.dirname(__file__), "resources", "target_hosts_1.json")
) (all path components as separate elements)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes makes perfect sense. Addressing.
tests/utils/config_test.py
Outdated
{"host": "88.33.27.15", "port": 39200} | ||
] | ||
}, | ||
config.TargetHosts(os.path.join(os.path.dirname(__file__), "resources/target_hosts_2.json")).all_hosts) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see above w.r.t. usage of os.path.join
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes makes perfect sense. Addressing.
tests/utils/config_test.py
Outdated
{'default': {'timeout':60}, | ||
'remote_1': {'use_ssl': True,'verify_certs': True,'basic_auth_user':'elastic','basic_auth_password':'changeme'}, | ||
'remote_2': {'use_ssl': True,'verify_certs': True, 'ca_certs':'/path/to/cacert.pem'}}, | ||
config.ClientOptions(os.path.join(os.path.dirname(__file__), "resources/client_options_1.json")).all_client_options) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
os.path.join(os.path.dirname(file), "resources/target_hosts_1.json")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure I understand the suggestion here with file
in bold, shouldn't I just split resources
and target_hosts_1.json
as separate arguments in os.path.join()
or?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was a copy & paste error. I suggest the same as for the other test case above, i.e.
os.path.join(os.path.dirname(__file__), "resources", "client_options_1.json")
tests/utils/config_test.py
Outdated
|
||
self.assertEqual( | ||
{'default': {'timeout':60}}, | ||
config.ClientOptions(os.path.join(os.path.dirname(__file__), "resources/client_options_2.json")).all_client_options) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
os.path.join(os.path.dirname(file), "resources/target_hosts_1.json")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure I understand the suggestion here with file
in bold, shouldn't I just split resources
and target_hosts_1.json
as separate arguments in os.path.join()
or?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was a copy & paste error. I suggest the same as for the other test case above, i.e.
os.path.join(os.path.dirname(__file__), "resources", "client_options_2.json")
change utils/config.py to utils/opts.py
remove __call__() method from ConnectOptions base class; cfg.opts("client", "hosts") and cfg.opts("client", "options") will now need refer to the .default property instead of `()`.
Add context-manager capability for multi-cluster enabled runners. Also add tests.
Removing unnecessary Class in lieu of a simple function
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for iterating. I left a few more comments.
esrally/driver/driver.py
Outdated
@@ -582,12 +582,18 @@ def __init__(self): | |||
|
|||
@actor.no_retry("load generator") | |||
def receiveMsg_StartLoadGenerator(self, msg, sender): | |||
def EsClients(all_hosts, all_client_options): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rename to es_clients
? (I immediately thought that this is a class when I read the code below that calls this function)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed in 0a17876
esrally/racecontrol.py
Outdated
if len(configured_hosts()) !=0 : | ||
logger.info("Using configured hosts %s" % configured_hosts()) | ||
if len(configured_hosts.default) !=0 : | ||
logger.info("Using configured hosts {}".format(configured_hosts.default)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we settled on using the idiom logger.info("...%s...", parameter0, parameter1)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. Addressed in 0a17876
esrally/racecontrol.py
Outdated
else: | ||
logger.info("Setting default host to [%s:%d]" % (host, port)) | ||
logger.info("Setting default host to [{}:{}]".format(host, port)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we settled on using the idiom logger.info("...%s...", parameter0, parameter1)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ugh too fast replacing things here. Will revert.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed in 0a17876
tests/mechanic/launcher_test.py
Outdated
@@ -147,7 +147,7 @@ def test_error_on_cluster_launch(self, sleep): | |||
cfg = config.Config() | |||
cfg.add(config.Scope.application, "client", "hosts", self.test_host) | |||
# Simulate that the client will raise an error upon startup | |||
cfg.add(config.Scope.application, "client", "options", config_helper.ClientOptions('raise-error-on-info:true')) | |||
cfg.add(config.Scope.application, "client", "options", opts.ClientOptions('raise-error-on-info:true')) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Use double-quotes for string?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed in e04df54
Stick to double quotes consistently for strings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That was a large change which touched quite a few areas! Congratulations! LGTM.
Thank you for the thorough review and insightful comments. With this foundation in, adding telemetry devices for additional clusters should be a simpler task. I'll squash and merge this in. |
With this PR we add support for specifying multiple clusters using
target-hosts
andclient-options
.The existing way of specifying
target-hosts
, as a csv host:port string andclient-options
as csv key:value pairs in a string is preserved. Additionally, we now allow passing parameters fortarget-hosts
andclient-options
using a json filename as well as an in-line json string, which is already possible for other cli arguments. The documentation now reflects how to use this and we back this with a number of rigorous test cases.All core runners and custom runners, by default, still receive an es client object pointing to the "default" cluster (for compatibility with the regular use case of Rally benchmarking one ES cluster) but it is now possible to specify
multi_cluster = True
as a class attribute in a custom runner. This results in anes
dictionary instead where the user can freely choose which cluster to target inside the custom runner code.Additionally we now wrap context-manager enabled runners and add a test case to ensure that wrapping it in this way really works as expected. The documentation for custom runners has also been fixed to not show a half complete example of a context-manager enabled Class.