Skip to content

Commit

Permalink
Fallback to closest minor branch for track/team repositories (#1165)
Browse files Browse the repository at this point in the history
So far Rally's fallback logic for picking the track/team repository
branch has been basically either that of exact match, or failing that
use the matching major version.

This means that e.g. a major change introduced in 7.11 would mean that
e.g. rally-teams would require branches 7.11, 7.12 etc. throughout the
entire lifecycle of the 7 version.

This commit alters this fallback logic and instead will prefer the
nearest minor branch, if available.
  • Loading branch information
dliappis authored Feb 4, 2021
1 parent ee860dd commit f859867
Show file tree
Hide file tree
Showing 5 changed files with 197 additions and 71 deletions.
7 changes: 1 addition & 6 deletions docs/car.rst
Original file line number Diff line number Diff line change
Expand Up @@ -121,12 +121,7 @@ Rally provides a default team repository that is hosted on `Github <https://gith
* The `master` branch needs to work with the latest `master` branch of Elasticsearch.
* All other branches need to match the version scheme of Elasticsearch, i.e. ``MAJOR.MINOR.PATCH-SUFFIX`` where all parts except ``MAJOR`` are optional.

Rally implements a fallback logic so you don't need to define a branch for each patch release of Elasticsearch. For example:

* The branch `6.0.0-alpha1` will be chosen for the version ``6.0.0-alpha1`` of Elasticsearch.
* The branch `5` will be chosen for all versions for Elasticsearch with the major version 5, e.g. ``5.0.0``, ``5.1.3`` (provided there is no specific branch).

Rally tries to use the branch with the best match to the benchmarked version of Elasticsearch.
Rally implements a fallback logic similar to the one used for :ref:`track-repositories <track-repositories-fall-back-logic>`.

Creating a new team repository
""""""""""""""""""""""""""""""
Expand Down
14 changes: 10 additions & 4 deletions docs/track.rst
Original file line number Diff line number Diff line change
Expand Up @@ -38,12 +38,18 @@ Alternatively, you can store Rally tracks also in a dedicated git repository whi
* The `master` branch needs to work with the latest `master` branch of Elasticsearch.
* All other branches need to match the version scheme of Elasticsearch, i.e. ``MAJOR.MINOR.PATCH-SUFFIX`` where all parts except ``MAJOR`` are optional.

Rally implements a fallback logic so you don't need to define a branch for each patch release of Elasticsearch. For example:
.. _track-repositories-fall-back-logic:

* The branch `6.0.0-alpha1` will be chosen for the version ``6.0.0-alpha1`` of Elasticsearch.
* The branch `5` will be chosen for all versions for Elasticsearch with the major version 5, e.g. ``5.0.0``, ``5.1.3`` (provided there is no specific branch).
Rally implements a fallback logic in order of specificity up to the minor version level, so you don't need to define a branch for each patch release of Elasticsearch.

Rally tries to use the branch with the best match to the benchmarked version of Elasticsearch.
Assuming the track repository has several branches, the order is:

1. Exact branch matches; e.g. if the repo contains branches `7`, `7.1` and `7.10.2` and Elasticsearch version is `7.10.2`, `7.10.2` will be checked out.
2. Nearest prior minor matches; e.g. if the repo contains branches `7`, `7.1` and `7.10` and Elasticsearch version is `7.10.2`, `7.10` will be checked out. Alternatively if version is `7.9`, `7.1` will be checked out.
3. Major branch matches; e.g. if the repo contains branches `7` and `7.10` and Elasticsearch version is `7.1`, `7` will be checked out.
4. Failing everything, `master` will be elected, e.g. if the repo contains branches `6`, `7` and `master` and Elasticsearch version is `8.1.0`, `master` will be checked out.

In general, Rally tries to use the branch with the best match to the benchmarked version of Elasticsearch.

Rally will also search for related files like mappings or custom runners or parameter sources in the track repository. However, Rally will use a separate directory to look for data files (``~/.rally/benchmarks/data/$TRACK_NAME/``). The reason is simply that we do not want to check multi-GB data files into git.

Expand Down
2 changes: 1 addition & 1 deletion esrally/utils/repo.py
Original file line number Diff line number Diff line change
Expand Up @@ -103,7 +103,7 @@ def update(self, distribution_version):

def _find_matching_tag(self, distribution_version):
tags = git.tags(self.repo_dir)
for version in versions.versions(distribution_version):
for version in versions.variants_of(distribution_version):
# tags have a "v" prefix by convention.
tag_candidate = "v{}".format(version)
if tag_candidate in tags:
Expand Down
99 changes: 81 additions & 18 deletions esrally/utils/versions.py
Original file line number Diff line number Diff line change
Expand Up @@ -67,45 +67,79 @@ def components(version, strict=True):
raise exceptions.InvalidSyntax("version string '%s' does not conform to pattern '%s'" % (version, versions_pattern.pattern))


def versions(version):
def variants_of(version):
for v, _ in VersionVariants(version).all_versions:
yield v

class VersionVariants:
"""
Determines possible variations of a version.
Build all possible variations of a version.
E.g. if version "5.0.0-SNAPSHOT" is given, the possible variations are:
e.g. if version "5.0.0-SNAPSHOT" is given, the possible variations are:
self.with_suffix: "5.0.0-SNAPSHOT",
self.with_patch: "5.0.0",
self.with_minor: "5.0",
self.with_major: "5"
"""

["5.0.0-SNAPSHOT", "5.0.0", "5.0", "5"]
def __init__(self, version):
"""
:param version: A version string in the format major.minor.path-suffix (suffix is optional)
"""
self.major, self.minor, self.patch, self.suffix = components(version)

:param version: A version string in the format major.minor.path-suffix (suffix is optional)
:return: a list of version variations ordered from most specific to most generic variation.
"""
v = []
major, minor, patch, suffix = components(version)
self.with_major = f"{int(self.major)}"
self.with_minor = f"{int(self.major)}.{int(self.minor)}"
self.with_patch = f"{int(self.major)}.{int(self.minor)}.{int(self.patch)}"
self.with_suffix = f"{int(self.major)}.{int(self.minor)}.{int(self.patch)}-{self.suffix}" if self.suffix \
else None

@property
def all_versions(self):
"""
:return: a list of tuples containing version variants and version type
ordered from most specific to most generic variation.
if suffix:
v.append("%d.%d.%d-%s" % (major, minor, patch, suffix))
v.append("%d.%d.%d" % (major, minor, patch))
v.append("%d.%d" % (major, minor))
v.append("%d" % major)
return v
Example:
[("5.0.0-SNAPSHOT", "with_suffix"), ("5.0.0", "with_patch"), ("5.0", "with_minor"), ("5", "with_major")]
"""

versions = [(self.with_suffix, "with_suffix")] if self.suffix else []
versions.extend([(self.with_patch, "with_patch"),
(self.with_minor, "with_minor"),
(self.with_major, "with_major")])

return versions


def best_match(available_alternatives, distribution_version):
"""
Finds the most specific branch for a given distribution version assuming that versions have the pattern:
major.minor.patch-suffix
and the provided alternatives reflect this pattern.
Best matches for distribution_version from available_alternatives may be:
1. exact matches of major.minor
2. nearest prior minor within the same major
3. major version
4. as a last resort, `master`.
See test_find_best_match() for examples.
:param available_alternatives: A list of possible distribution versions (or shortened versions).
:param distribution_version: An Elasticsearch distribution version.
:return: The most specific alternative that is available (most components of the version match) or None.
:return: The most specific alternative that is available or None.
"""
if is_version_identifier(distribution_version):
for version in versions(distribution_version):
versions = VersionVariants(distribution_version)
for version, version_type in versions.all_versions:
if version in available_alternatives:
return version
# match nearest prior minor
if version_type == "with_minor" and (latest_minor := latest_bounded_minor(available_alternatives, versions)):
if latest_minor:
return f"{versions.major}.{latest_minor}"
# not found in the available alternatives, it could still be a master version
major, _, _, _ = components(distribution_version)
if major > _latest_major(available_alternatives):
Expand All @@ -122,3 +156,32 @@ def _latest_major(alternatives):
major, _, _, _ = components(a, strict=False)
max_major = max(major, max_major)
return max_major


def latest_bounded_minor(alternatives, target_version):
"""
Finds the closest minor version that is smaller or eq to target_version from a list of version alternatives.
Versions including patch or patch-suffix in alternatives are ignored.
See test_latest_bounded_minor() for examples.
:param alternatives: list of alternative versions
:param target_version: a VersionVariants object presenting the distribution version
:return: the closest minor version (if available) from alternatives otherwise None
"""
eligible_minors = []
for a in alternatives:
if is_version_identifier(a, strict=False):
major, minor, patch, suffix = components(a, strict=False)
if patch is not None or suffix is not None:
# branches containing patch or patch-suffix aren't supported
continue
if major == target_version.major and minor and minor <= target_version.minor:
eligible_minors.append(minor)

# no matching minor version
if not eligible_minors:
return None

eligible_minors.sort()

return min(eligible_minors, key=lambda x: abs(x - target_version.minor))
146 changes: 104 additions & 42 deletions tests/utils/versions_test.py
Original file line number Diff line number Diff line change
Expand Up @@ -15,64 +15,126 @@
# specific language governing permissions and limitations
# under the License.

from unittest import TestCase
import random
import re

import pytest # type: ignore

from esrally import exceptions
from esrally.utils import versions


class VersionsTests(TestCase):
class TestsVersions:
def test_is_version_identifier(self):
self.assertFalse(versions.is_version_identifier(None))
self.assertFalse(versions.is_version_identifier(""))
self.assertFalse(versions.is_version_identifier(" \t "))
self.assertFalse(versions.is_version_identifier("5-ab-c"))
self.assertFalse(versions.is_version_identifier("5.1"))
self.assertFalse(versions.is_version_identifier("5"))
self.assertTrue(versions.is_version_identifier("5.0.0"))
self.assertTrue(versions.is_version_identifier("1.7.3"))
self.assertTrue(versions.is_version_identifier("20.3.7-SNAPSHOT"))

self.assertFalse(versions.is_version_identifier(None, strict=False))
self.assertFalse(versions.is_version_identifier("", strict=False))
self.assertTrue(versions.is_version_identifier("5.1", strict=False))
self.assertTrue(versions.is_version_identifier("5", strict=False))
self.assertTrue(versions.is_version_identifier("23", strict=False))
self.assertTrue(versions.is_version_identifier("20.3.7-SNAPSHOT", strict=False))
assert versions.is_version_identifier(None) is False
assert versions.is_version_identifier("") is False
assert versions.is_version_identifier(" \t ") is False
assert versions.is_version_identifier("5-ab-c") is False
assert versions.is_version_identifier("5.1") is False
assert versions.is_version_identifier("5") is False

assert versions.is_version_identifier("5.0.0")
assert versions.is_version_identifier("1.7.3")
assert versions.is_version_identifier("20.3.7-SNAPSHOT")

assert versions.is_version_identifier(None, strict=False) is False
assert versions.is_version_identifier("", strict=False) is False
assert versions.is_version_identifier("5.1", strict=False)
assert versions.is_version_identifier("5", strict=False)
assert versions.is_version_identifier("23", strict=False)
assert versions.is_version_identifier("20.3.7-SNAPSHOT", strict=False)

def test_finds_components_for_valid_version(self):
self.assertEqual((5, 0, 3, None), versions.components("5.0.3"))
self.assertEqual((5, 0, 3, "SNAPSHOT"), versions.components("5.0.3-SNAPSHOT"))
assert versions.components("5.0.3") == (5, 0, 3, None)
assert versions.components("7.12.1-SNAPSHOT") == (7, 12, 1, "SNAPSHOT")

self.assertEqual((25, None, None, None), versions.components("25", strict=False))
self.assertEqual((5, 1, None, None), versions.components("5.1", strict=False))
assert versions.components("25", strict=False) == (25, None, None, None)
assert versions.components("5.1", strict=False) == (5, 1, None, None)

def test_major_version(self):
self.assertEqual(5, versions.major_version("5.0.3"))
self.assertEqual(5, versions.major_version("5.0.3-SNAPSHOT"))
self.assertEqual(25, versions.major_version("25.0.3"))
assert versions.major_version("7.10.2") == 7
assert versions.major_version("7.12.1-SNAPSHOT") == 7
assert versions.major_version("25.0.3") == 25

@pytest.mark.parametrize("seed", range(40))
def test_latest_bounded_minor(self, seed):
_alternatives = ["7", "7.10", "7.11.2", "7.2", "5", "6", "master"]
random.seed(seed)
alternatives = _alternatives.copy()
random.shuffle(alternatives)

assert versions.latest_bounded_minor(alternatives, versions.VersionVariants("7.6.3")) == 2
assert versions.latest_bounded_minor(alternatives, versions.VersionVariants("7.12.3")) == 10,\
"Nearest alternative with major.minor, skip alternatives with major.minor.patch"
assert versions.latest_bounded_minor(alternatives, versions.VersionVariants("7.11.2")) == 10,\
"Skips all alternatives with major.minor.patch, even if exact match"
assert versions.latest_bounded_minor(alternatives, versions.VersionVariants("7.1.0")) is None,\
"No matching alternative with minor version"

def test_components_ignores_invalid_versions(self):
with self.assertRaises(exceptions.InvalidSyntax) as ctx:
with pytest.raises(
exceptions.InvalidSyntax,
match=re.escape(
r"version string '5.0.0a' does not conform to pattern "
r"'^(\d+)\.(\d+)\.(\d+)(?:-(.+))?$'")):
versions.components("5.0.0a")
self.assertEqual(r"version string '5.0.0a' does not conform to pattern '^(\d+)\.(\d+)\.(\d+)(?:-(.+))?$'", ctx.exception.args[0])

def test_versions_parses_correct_version_string(self):
self.assertEqual(["5.0.3", "5.0", "5"], versions.versions("5.0.3"))
self.assertEqual(["5.0.0-SNAPSHOT", "5.0.0", "5.0", "5"], versions.versions("5.0.0-SNAPSHOT"))
self.assertEqual(["10.3.63", "10.3", "10"], versions.versions("10.3.63"))
def test_versionvariants_parses_correct_version_string(self):
assert versions.VersionVariants("5.0.3").all_versions == [
("5.0.3", "with_patch"),
("5.0", "with_minor"),
("5", "with_major")]
assert versions.VersionVariants("7.12.1-SNAPSHOT").all_versions == [
("7.12.1-SNAPSHOT", "with_suffix"),
("7.12.1", "with_patch"),
("7.12", "with_minor"),
("7", "with_major")]
assert versions.VersionVariants("10.3.63").all_versions == [
("10.3.63", "with_patch"),
("10.3", "with_minor"),
("10", "with_major")]

def test_versions_rejects_invalid_version_strings(self):
with self.assertRaises(exceptions.InvalidSyntax) as ctx:
versions.versions("5.0.0a-SNAPSHOT")
self.assertEqual(r"version string '5.0.0a-SNAPSHOT' does not conform to pattern '^(\d+)\.(\d+)\.(\d+)(?:-(.+))?$'"
, ctx.exception.args[0])
with pytest.raises(
exceptions.InvalidSyntax,
match=re.escape(r"version string '5.0.0a-SNAPSHOT' does not conform to pattern "
r"'^(\d+)\.(\d+)\.(\d+)(?:-(.+))?$'")
):
versions.VersionVariants("5.0.0a-SNAPSHOT")

def test_find_best_match(self):
self.assertEqual("master", versions.best_match(["1.7", "2", "5.0.0-alpha1", "5", "master"], "6.0.0-alpha1"),
"Assume master for versions newer than latest alternative available")
self.assertEqual("5", versions.best_match(["1.7", "2", "5.0.0-alpha1", "5", "master"], "5.1.0-SNAPSHOT"),
"Best match for specific version")
self.assertEqual("master", versions.best_match(["1.7", "2", "5.0.0-alpha1", "5", "master"], None),
"Assume master on unknown version")
self.assertIsNone(versions.best_match(["1.7", "2", "5.0.0-alpha1", "5", "master"], "0.4"), "Reject versions that are too old")
assert versions.best_match(["1.7", "2", "5.0.0-alpha1", "5", "master"], "6.0.0-alpha1") == "master",\
"Assume master for versions newer than latest alternative available"

assert versions.best_match(["1.7", "2", "5.0.0-alpha1", "5", "master"], "5.1.0-SNAPSHOT") == "5",\
"Best match for specific version"

assert versions.best_match(["1.7", "2", "5.0.0-alpha1", "5", "master"], None) == "master",\
"Assume master on unknown version"

assert versions.best_match(["1.7", "2", "5.0.0-alpha1", "5", "master"], "0.4") is None,\
"Reject versions that are too old"

assert versions.best_match(["7", "7.10.2", "7.11", "7.2", "5", "6", "master"], "7.10.2") == "7.10.2", \
"Exact match"

assert versions.best_match(["7", "7.10", "master"], "7.1.0") == "7", \
"Best match is major version"

assert versions.best_match(["7", "7.11", "7.2", "5", "6", "master"], "7.11.0") == "7.11",\
"Best match for specific minor version"

assert versions.best_match(["7", "7.11", "7.2", "5", "6", "master"], "7.12.0") == "7.11",\
"If no exact match, best match is the nearest prior minor"

assert versions.best_match(["7", "7.11", "7.2", "5", "6", "master"], "7.3.0") == "7.2",\
"If no exact match, best match is the nearest prior minor"

assert versions.best_match(["7", "7.11", "7.2", "5", "6", "master"], "7.10.0") == "7.2", \
"If no exact match, best match is the nearest prior minor"

assert versions.best_match(["7", "7.1", "7.11.1", "7.11.0", "7.2", "5", "6", "master"], "7.12.0") == "7.2",\
"Patch or patch-suffix branches are not supported and ignored, best match is nearest prior minor"

assert versions.best_match(["7", "7.11", "7.2", "5", "6", "master"], "7.1.0") == "7",\
"If no exact match and no minor match, next best match is major version"

0 comments on commit f859867

Please sign in to comment.