Add a configuration file reference (#1137)
With this commit we add reference docs for Rally's configuration file
`rally.ini`. We also move one configuration property from the `system`
to the `reporting` section as it is more appropriate there.

We intentionally placed this information on the existing configuration
page instead of creating a new one. We did this to provide continuity in
the future because we intend to remove the dedicated `configure`
subcommand and instead rely on users editing the configuration file
directly. When we remove this functionality, we can also remove obsolete
sections from this page and move it to the reference documentation.

Closes #991
danielmitterdorfer authored Dec 17, 2020
1 parent bf95cb8 commit b3a0e15
Showing 2 changed files with 126 additions and 1 deletion.
125 changes: 125 additions & 0 deletions docs/configuration.rst
@@ -79,6 +79,131 @@ Rally will ask you a few more things in the advanced setup:
* **Name for this benchmark environment** (only for metrics store type ``elasticsearch``): You can use the same metrics store for multiple environments (e.g. local, continuous integration etc.) so you can separate metrics from different environments by choosing a different name.
* whether or not Rally should keep the Elasticsearch benchmark candidate installation including all data by default. This will use lots of disk space so you should wipe ``~/.rally/benchmarks/races`` regularly.

Configuration File Reference
----------------------------

Rally stores its configuration in the file ``~/.rally/rally.ini``. It comprises the following sections.
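
Because the file uses INI syntax, it can be inspected with Python's standard ``configparser`` module. The following is a minimal sketch under stated assumptions and not part of Rally: the helper name ``parse_rally_config`` is hypothetical, the sample content only mirrors two defaults documented below, and interpolation is disabled as a choice of this sketch (Rally's own loader may behave differently).

```python
import configparser

# Sample content for illustration only; it mirrors documented defaults.
SAMPLE = """
[system]
env.name = local

[reporting]
datastore.type = in-memory
"""


def parse_rally_config(text):
    """Parse INI-formatted configuration text (hypothetical helper)."""
    # Sketch choice: treat "%" in values literally instead of interpolating.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return parser


config = parse_rally_config(SAMPLE)
for section in config.sections():
    # Print every section together with its key/value pairs.
    print(section, dict(config[section]))
```

To inspect a real file, pass its contents instead of ``SAMPLE``, for example ``pathlib.Path("~/.rally/rally.ini").expanduser().read_text()``.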

meta
~~~~

This section contains meta information about the configuration file.

* ``config.version``: The version of the configuration file format. This property is managed by Rally and should not be changed.

system
~~~~~~

This section contains global information for the current benchmark environment. This information should be identical on all machines where Rally is installed.

* ``env.name`` (default: "local"): The name of this benchmark environment. It is used as meta-data in metrics documents if an Elasticsearch metrics store is configured. Only alphanumeric characters are allowed.
* ``probing.url`` (default: "https://github.com"): This URL is used by Rally to check for a working Internet connection. It's useful to change this to an internal server if all data are hosted inside the corporate network and connections to the outside world are prohibited.
* ``available.cores`` (default: number of logical CPU cores): Determines the number of available CPU cores. Rally aims to create one asyncio event loop per core and will distribute clients evenly across event loops.
* ``async.debug`` (default: false): Enables debug mode on Rally's internal `asyncio event loop <https://docs.python.org/3/library/asyncio-eventloop.html#enabling-debug-mode>`_. This setting is mainly intended for troubleshooting.
* ``passenv`` (default: "PATH"): A comma-separated list of environment variable names that should be passed to the Elasticsearch process.
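
The effect of ``passenv`` can be pictured as a filter over the current environment. This is an illustrative sketch only: the function name ``select_passed_env`` is hypothetical, and the handling of surrounding whitespace and of variables that are not set is an assumption, not documented Rally behavior.

```python
import os


def select_passed_env(passenv, environ=None):
    """Return only the variables named in a comma-separated ``passenv`` value."""
    source = os.environ if environ is None else environ
    # Assumption: whitespace around names is ignored and empty entries are skipped.
    names = [name.strip() for name in passenv.split(",") if name.strip()]
    # Assumption: names that are not set in the environment are silently dropped.
    return {name: source[name] for name in names if name in source}


# Example with an explicit environment so the result does not depend on the host.
example_env = {"PATH": "/usr/bin", "HOME": "/home/rally", "JAVA_HOME": "/opt/jdk"}
print(select_passed_env("PATH, JAVA_HOME", example_env))
```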

node
~~~~

This section contains machine-specific information.

* ``root.dir`` (default: "~/.rally/benchmarks"): Rally uses this directory to store all benchmark-related data. It assumes that it has complete control over this directory and any of its subdirectories.
* ``src.root.dir`` (default: "~/.rally/benchmarks/src"): The directory where the source code of Elasticsearch or any plugins is checked out. Only relevant for benchmarks from sources.

source
~~~~~~

This section contains more details about the source tree.

* ``remote.repo.url`` (default: "https://github.com/elastic/elasticsearch.git"): The URL from which to check out Elasticsearch.
* ``elasticsearch.src.subdir`` (default: "elasticsearch"): The local path, relative to ``src.root.dir``, of the Elasticsearch source tree.
* ``cache`` (default: true): Enables Rally's internal :ref:`source artifact <pipelines_from-sources>` cache (``elasticsearch*.tar.gz`` and optionally ``*.zip`` files for plugins). Artifacts are cached based on their git revision.
* ``cache.days`` (default: 7): The number of days for which an artifact should be kept in the source artifact cache.

benchmarks
~~~~~~~~~~

This section contains details about the benchmark data directory.

* ``local.dataset.cache`` (default: "~/.rally/benchmarks/data"): The directory in which benchmark data sets are stored. Depending on the benchmarks that are executed, this directory may contain hundreds of GB of data.

reporting
~~~~~~~~~

This section defines how metrics are stored.

* ``datastore.type`` (default: "in-memory"): If set to "in-memory" all metrics will be kept in memory while running the benchmark. If set to "elasticsearch" all metrics will instead be written to a persistent metrics store and the data are available for further analysis.
* ``sample.queue.size`` (default: 2^20): The number of metrics samples that can be stored in Rally's in-memory queue.
* ``metrics.request.downsample.factor`` (default: 1): Determines how many service time and latency samples are kept in the metrics store. By default, all values are kept. To keep only every 100th sample, for example, specify 100. This helps to avoid overwhelming the metrics store in benchmarks with many clients (tens of thousands).
* ``output.processingtime`` (default: false): If set to "true", Rally shows an additional metric called "processing time" in the command line report. In contrast to "service time", which is measured as close as possible to the wire, "processing time" also includes Rally's client-side processing overhead. A large difference between service time and processing time indicates high overhead in the client and can thus point to a client-side bottleneck that requires investigation.
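
To make the downsampling factor and the queue size concrete, here is a small sketch. It is not Rally's implementation: the function name ``downsample`` is hypothetical, and which sample of each group is kept (here the first one) is an assumption. The queue size constant matches the default of 2^20 stated above.

```python
# Default from the text above: 2^20 samples fit into the in-memory queue.
DEFAULT_SAMPLE_QUEUE_SIZE = 1 << 20


def downsample(samples, factor=1):
    """Keep every ``factor``-th sample; a factor of 1 keeps all samples."""
    if factor < 1:
        raise ValueError("factor must be at least 1")
    # Assumption: the first sample of each group of ``factor`` samples is kept.
    return samples[::factor]


samples = list(range(250))
print(len(downsample(samples)))       # all 250 samples are kept
print(downsample(samples, 100))       # one sample per 100
```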

The following settings are applicable only if ``datastore.type`` is set to "elasticsearch":

* ``datastore.host``: The host name of the metrics store, e.g. "10.17.20.33".
* ``datastore.port``: The port of the metrics store, e.g. "9200".
* ``datastore.secure``: If set to ``false``, Rally assumes an HTTP connection. If set to ``true``, it assumes an HTTPS connection.
* ``datastore.ssl.verification_mode`` (default: "full"): By default, the metrics store's SSL certificate is checked ("full"). To disable certificate verification, set this value to "none".
* ``datastore.ssl.certificate_authorities`` (default: empty): Determines the path on the local file system to the certificate authority's signing certificate.
* ``datastore.user``: Sets the name of the Elasticsearch user for the metrics store.
* ``datastore.password``: Sets the password of the Elasticsearch user for the metrics store.
* ``datastore.probe.cluster_version`` (default: true): Enables automatic detection of the metrics store's version.

**Examples**

Define an unprotected metrics store in the local network::

[reporting]
datastore.type = elasticsearch
datastore.host = 192.168.10.17
datastore.port = 9200
datastore.secure = false
datastore.user =
datastore.password =

Define a secure connection to a metrics store in the local network with a self-signed certificate::

[reporting]
datastore.type = elasticsearch
datastore.host = 192.168.10.22
datastore.port = 9200
datastore.secure = true
datastore.ssl.verification_mode = none
datastore.user = rally
datastore.password = the-password-to-your-cluster

Define a secure connection to an Elastic Cloud cluster::

[reporting]
datastore.type = elasticsearch
datastore.host = 123456789abcdef123456789abcdef1.europe-west4.gcp.elastic-cloud.com
datastore.port = 9243
datastore.secure = true
datastore.user = rally
datastore.password = the-password-to-your-cluster
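
The relationship between ``datastore.secure``, ``datastore.host`` and ``datastore.port`` can be sketched as follows. This is a hedged illustration and not Rally's client code: the function name ``metrics_store_url`` is hypothetical, and the sketch assumes the section can be parsed with Python's ``configparser``. The sample reuses the first example above.

```python
import configparser

# First example from above: an unprotected metrics store in the local network.
EXAMPLE = """
[reporting]
datastore.type = elasticsearch
datastore.host = 192.168.10.17
datastore.port = 9200
datastore.secure = false
datastore.user =
datastore.password =
"""


def metrics_store_url(reporting):
    """Derive the base URL of the metrics store from a [reporting] section."""
    # "false" selects HTTP and "true" selects HTTPS, as described above.
    secure = reporting.getboolean("datastore.secure", fallback=False)
    scheme = "https" if secure else "http"
    host = reporting["datastore.host"]
    port = reporting.getint("datastore.port")
    return f"{scheme}://{host}:{port}"


# Sketch choice: disable interpolation so "%" in passwords is taken literally.
parser = configparser.ConfigParser(interpolation=None)
parser.read_string(EXAMPLE)
url = metrics_store_url(parser["reporting"])
print(url)
```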


tracks
~~~~~~

This section defines how :doc:`tracks </track>` are retrieved. All keys are read by Rally using the convention ``<<track-repository-name>>.url``, e.g. ``custom-track-repo.url``, which can be selected on the command line via ``--track-repository="custom-track-repo"``. By default, Rally chooses the track repository specified via ``default.url``, which points to https://github.com/elastic/rally-tracks.
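
The key convention can be illustrated with a short lookup sketch. It is not Rally's code: the function name ``repository_url`` is hypothetical, the ``example.org`` URL is a placeholder, and raising an error for unknown repository names is an assumption.

```python
def repository_url(section, repository_name="default"):
    """Look up ``<repository_name>.url`` in a mapping of configuration keys."""
    key = f"{repository_name}.url"
    if key not in section:
        # Assumption: an unknown repository name is treated as an error.
        raise KeyError(f"no repository configured under '{key}'")
    return section[key]


tracks = {
    "default.url": "https://github.com/elastic/rally-tracks",
    # Placeholder URL for illustration only.
    "custom-track-repo.url": "https://example.org/custom-tracks.git",
}
print(repository_url(tracks))
print(repository_url(tracks, "custom-track-repo"))
```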

teams
~~~~~

This section defines how :doc:`teams </car>` are retrieved. All keys are read by Rally using the convention ``<<team-repository-name>>.url``, e.g. ``custom-team-repo.url``, which can be selected on the command line via ``--team-repository="custom-team-repo"``. By default, Rally chooses the team repository specified via ``default.url``, which points to https://github.com/elastic/rally-teams.

defaults
~~~~~~~~

This section defines default values for certain command line parameters of Rally.

* ``preserve_benchmark_candidate`` (default: false): Determines whether Elasticsearch installations will be preserved or wiped by default after a benchmark. For preserving an installation for a single benchmark, use the command line flag ``--preserve-install``.

distributions
~~~~~~~~~~~~~

* ``release.cache`` (default: true): Determines whether released Elasticsearch versions should be cached locally.

Proxy Configuration
-------------------

2 changes: 1 addition & 1 deletion esrally/driver/driver.py
@@ -819,7 +819,7 @@ def receiveMsg_StartWorker(self, msg, sender):
self.worker_id = msg.worker_id
self.config = load_local_config(msg.config)
self.on_error = self.config.opts("driver", "on.error")
- self.sample_queue_size = int(self.config.opts("system", "sample.queue.size", mandatory=False, default_value=1 << 20))
+ self.sample_queue_size = int(self.config.opts("reporting", "sample.queue.size", mandatory=False, default_value=1 << 20))
self.track = msg.track
track.set_absolute_data_path(self.config, self.track)
self.client_allocations = msg.client_allocations