Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fallback to closest minor branch for track/team repositories #1165

Merged
merged 5 commits into from
Feb 4, 2021
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
7 changes: 1 addition & 6 deletions docs/car.rst
Original file line number Diff line number Diff line change
Expand Up @@ -121,12 +121,7 @@ Rally provides a default team repository that is hosted on `Github <https://gith
* The `master` branch needs to work with the latest `master` branch of Elasticsearch.
* All other branches need to match the version scheme of Elasticsearch, i.e. ``MAJOR.MINOR.PATCH-SUFFIX`` where all parts except ``MAJOR`` are optional.

Rally implements a fallback logic so you don't need to define a branch for each patch release of Elasticsearch. For example:

* The branch `6.0.0-alpha1` will be chosen for the version ``6.0.0-alpha1`` of Elasticsearch.
* The branch `5` will be chosen for all versions for Elasticsearch with the major version 5, e.g. ``5.0.0``, ``5.1.3`` (provided there is no specific branch).

Rally tries to use the branch with the best match to the benchmarked version of Elasticsearch.
Rally implements a fallback logic similar to the one used for :ref:`track-repositories <track-repositories-fall-back-logic>`.

Creating a new team repository
""""""""""""""""""""""""""""""
Expand Down
14 changes: 10 additions & 4 deletions docs/track.rst
Original file line number Diff line number Diff line change
Expand Up @@ -38,12 +38,18 @@ Alternatively, you can store Rally tracks also in a dedicated git repository whi
* The `master` branch needs to work with the latest `master` branch of Elasticsearch.
* All other branches need to match the version scheme of Elasticsearch, i.e. ``MAJOR.MINOR.PATCH-SUFFIX`` where all parts except ``MAJOR`` are optional.

Rally implements a fallback logic so you don't need to define a branch for each patch release of Elasticsearch. For example:
.. _track-repositories-fall-back-logic:

* The branch `6.0.0-alpha1` will be chosen for the version ``6.0.0-alpha1`` of Elasticsearch.
* The branch `5` will be chosen for all versions for Elasticsearch with the major version 5, e.g. ``5.0.0``, ``5.1.3`` (provided there is no specific branch).
Rally implements a fallback logic in order of specificity up to the minor version level, so you don't need to define a branch for each patch release of Elasticsearch.

Rally tries to use the branch with the best match to the benchmarked version of Elasticsearch.
Assuming the track repository has several branches, the order is:

1. Exact branch matches; e.g. if the repo contains branches `7`, `7.1` and `7.10.2` and Elasticsearch version is `7.10.2`, `7.10.2` will be chosen as the track repo.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe say "will be chosen as branch" or "checked out" instead of "will be chosen as the track repo"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in 3ae2fa4

2. Nearest minor matches; e.g. if the repo contains branches `7`, `7.1` and `7.10` and Elasticsearch version is `7.10.2`, `7.10` will be chosen as the track repo. Alternatively if version is `7.9`, `7.1` will be chosen.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the term "nearest" is too vague. Consider the version is 7.8.0 and we have branches 7.9 and 7.1. To me, 7.9 is "nearer than" 7.1 because it's only one minor version distance. Therefore we need to refer to this behavior more accurately, e.g. as "nearest prior minor"?

Note that the term "nearest" is used elsewhere too (e.g. in code comments).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in 861a176

3. Major branch matches; e.g. if the repo contains branches `7` and `7.10` and Elasticsearch version is `7.1`, `7` will be chosen as the track repo.
4. Failing everything, `master` will be elected, e.g. if the repo contains branches `6`, `7` and `master` and Elasticsearch version is `8.1.0`, `master` will be chosen as the track repo.

In general, Rally tries to use the branch with the best match to the benchmarked version of Elasticsearch.

Rally will also search for related files like mappings or custom runners or parameter sources in the track repository. However, Rally will use a separate directory to look for data files (``~/.rally/benchmarks/data/$TRACK_NAME/``). The reason is simply that we do not want to check multi-GB data files into git.

Expand Down
2 changes: 1 addition & 1 deletion esrally/utils/repo.py
Original file line number Diff line number Diff line change
Expand Up @@ -103,7 +103,7 @@ def update(self, distribution_version):

def _find_matching_tag(self, distribution_version):
tags = git.tags(self.repo_dir)
for version in versions.versions(distribution_version):
for version, _ in versions.VersionVariants(distribution_version).all_versions:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to expose this as a function instead? This would maybe read a bit nicer:

for version, _ in versions.variants_of(distribution_version):
    ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The class VersionVariants is used in utils/versions.py leveraging state. repo.py doesn't require state, just the list expansion of all versions.
Are you suggesting variants_of() could be a utility function within util/versions.py that does the object instantiation/calling of .all_versions() or?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I am proposing a utility function along the lines of:

def variants_of(distribution_version):
    for version, _ in versions.VersionVariants(distribution_version).all_versions
        yield version

This implementation would even get rid of the superfluous second return value of #all_versions() and give us the chance to keep the class VersionVariants internal to to the versions module.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok fixed in 13ba1ad

# tags have a "v" prefix by convention.
tag_candidate = "v{}".format(version)
if tag_candidate in tags:
Expand Down
95 changes: 77 additions & 18 deletions esrally/utils/versions.py
Original file line number Diff line number Diff line change
Expand Up @@ -67,45 +67,75 @@ def components(version, strict=True):
raise exceptions.InvalidSyntax("version string '%s' does not conform to pattern '%s'" % (version, versions_pattern.pattern))


def versions(version):
class VersionVariants:
"""
Determines possible variations of a version.
Build all possible variations of a version.

E.g. if version "5.0.0-SNAPSHOT" is given, the possible variations are:
e.g. if version "5.0.0-SNAPSHOT" is given, the possible variations are:
self.with_suffix: "5.0.0-SNAPSHOT",
self.with_patch: "5.0.0",
self.with_minor: "5.0",
self.with_major: "5"
"""

["5.0.0-SNAPSHOT", "5.0.0", "5.0", "5"]
def __init__(self, version):
"""
:param version: A version string in the format major.minor.path-suffix (suffix is optional)
"""
self.major, self.minor, self.patch, self.suffix = components(version)

:param version: A version string in the format major.minor.path-suffix (suffix is optional)
:return: a list of version variations ordered from most specific to most generic variation.
"""
v = []
major, minor, patch, suffix = components(version)
self.with_major = f"{int(self.major)}"
self.with_minor = f"{int(self.major)}.{int(self.minor)}"
self.with_patch = f"{int(self.major)}.{int(self.minor)}.{int(self.patch)}"
self.with_suffix = f"{int(self.major)}.{int(self.minor)}.{int(self.patch)}-{self.suffix}" if self.suffix \
else None

@property
def all_versions(self):
"""
:return: a list of tuples containing version variants and version type
ordered from most specific to most generic variation.

Example:
[("5.0.0-SNAPSHOT", "with_suffix"), ("5.0.0", "with_patch"), ("5.0", "with_minor"), ("5", "with_major")]
"""

if suffix:
v.append("%d.%d.%d-%s" % (major, minor, patch, suffix))
v.append("%d.%d.%d" % (major, minor, patch))
v.append("%d.%d" % (major, minor))
v.append("%d" % major)
return v
versions = [(self.with_suffix, "with_suffix")] if self.suffix else []
versions.extend([(self.with_patch, "with_patch"),
(self.with_minor, "with_minor"),
(self.with_major, "with_major")])

return versions


def best_match(available_alternatives, distribution_version):
"""

Finds the most specific branch for a given distribution version assuming that versions have the pattern:

major.minor.patch-suffix

and the provided alternatives reflect this pattern.
Best matches for distribution_version from available_alternatives may be:
1. exact matches of major.minor
2. nearest minor within the same major
3. major version
4. as a last resort, `master`.

See test_find_best_match() for examples.

:param available_alternatives: A list of possible distribution versions (or shortened versions).
:param distribution_version: An Elasticsearch distribution version.
:return: The most specific alternative that is available (most components of the version match) or None.
:return: The most specific alternative that is available or None.
"""
if is_version_identifier(distribution_version):
for version in versions(distribution_version):
versions = VersionVariants(distribution_version)
for version, version_type in versions.all_versions:
if version in available_alternatives:
return version
# match nearest minor
if version_type == "with_minor" and (latest_minor := latest_bounded_minor(available_alternatives, versions)):
if latest_minor:
return f"{versions.major}.{latest_minor}"
# not found in the available alternatives, it could still be a master version
major, _, _, _ = components(distribution_version)
if major > _latest_major(available_alternatives):
Expand All @@ -122,3 +152,32 @@ def _latest_major(alternatives):
major, _, _, _ = components(a, strict=False)
max_major = max(major, max_major)
return max_major


def latest_bounded_minor(alternatives, target_version):
"""
Finds the closest minor version that is smaller or eq to target_version from a list of version alternatives.
Versions including patch or patch-suffix in alternatives are ignored.
See test_latest_bounded_minor() for examples.

:param alternatives: list of alternative versions
:param target_version: a VersionVariants object presenting the distribution version
:return: the closest minor version (if available) from alternatives otherwise None
"""
eligible_minors = []
for a in alternatives:
if is_version_identifier(a, strict=False):
major, minor, patch, suffix = components(a, strict=False)
if patch is not None or suffix is not None:
# branches containing patch or patch-suffix aren't supported
continue
if major == target_version.major and minor and minor <= target_version.minor:
eligible_minors.append(minor)

# no matching minor version
if not eligible_minors:
return None

eligible_minors.sort()

return min(eligible_minors, key=lambda x: abs(x - target_version.minor))
146 changes: 104 additions & 42 deletions tests/utils/versions_test.py
Original file line number Diff line number Diff line change
Expand Up @@ -15,64 +15,126 @@
# specific language governing permissions and limitations
# under the License.

from unittest import TestCase
import random
import re

import pytest # type: ignore

from esrally import exceptions
from esrally.utils import versions


class VersionsTests(TestCase):
class TestsVersions:
def test_is_version_identifier(self):
self.assertFalse(versions.is_version_identifier(None))
self.assertFalse(versions.is_version_identifier(""))
self.assertFalse(versions.is_version_identifier(" \t "))
self.assertFalse(versions.is_version_identifier("5-ab-c"))
self.assertFalse(versions.is_version_identifier("5.1"))
self.assertFalse(versions.is_version_identifier("5"))
self.assertTrue(versions.is_version_identifier("5.0.0"))
self.assertTrue(versions.is_version_identifier("1.7.3"))
self.assertTrue(versions.is_version_identifier("20.3.7-SNAPSHOT"))

self.assertFalse(versions.is_version_identifier(None, strict=False))
self.assertFalse(versions.is_version_identifier("", strict=False))
self.assertTrue(versions.is_version_identifier("5.1", strict=False))
self.assertTrue(versions.is_version_identifier("5", strict=False))
self.assertTrue(versions.is_version_identifier("23", strict=False))
self.assertTrue(versions.is_version_identifier("20.3.7-SNAPSHOT", strict=False))
assert versions.is_version_identifier(None) is False
assert versions.is_version_identifier("") is False
assert versions.is_version_identifier(" \t ") is False
assert versions.is_version_identifier("5-ab-c") is False
assert versions.is_version_identifier("5.1") is False
assert versions.is_version_identifier("5") is False

assert versions.is_version_identifier("5.0.0")
assert versions.is_version_identifier("1.7.3")
assert versions.is_version_identifier("20.3.7-SNAPSHOT")

assert versions.is_version_identifier(None, strict=False) is False
assert versions.is_version_identifier("", strict=False) is False
assert versions.is_version_identifier("5.1", strict=False)
assert versions.is_version_identifier("5", strict=False)
assert versions.is_version_identifier("23", strict=False)
assert versions.is_version_identifier("20.3.7-SNAPSHOT", strict=False)

def test_finds_components_for_valid_version(self):
self.assertEqual((5, 0, 3, None), versions.components("5.0.3"))
self.assertEqual((5, 0, 3, "SNAPSHOT"), versions.components("5.0.3-SNAPSHOT"))
assert versions.components("5.0.3") == (5, 0, 3, None)
assert versions.components("7.12.1-SNAPSHOT") == (7, 12, 1, "SNAPSHOT")

self.assertEqual((25, None, None, None), versions.components("25", strict=False))
self.assertEqual((5, 1, None, None), versions.components("5.1", strict=False))
assert versions.components("25", strict=False) == (25, None, None, None)
assert versions.components("5.1", strict=False) == (5, 1, None, None)

def test_major_version(self):
self.assertEqual(5, versions.major_version("5.0.3"))
self.assertEqual(5, versions.major_version("5.0.3-SNAPSHOT"))
self.assertEqual(25, versions.major_version("25.0.3"))
assert versions.major_version("7.10.2") == 7
assert versions.major_version("7.12.1-SNAPSHOT") == 7
assert versions.major_version("25.0.3") == 25

@pytest.mark.parametrize("seed", range(40))
def test_latest_bounded_minor(self, seed):
_alternatives = ["7", "7.10", "7.11.2", "7.2", "5", "6", "master"]
random.seed(seed)
alternatives = _alternatives.copy()
random.shuffle(alternatives)

assert versions.latest_bounded_minor(alternatives, versions.VersionVariants("7.6.3")) == 2
assert versions.latest_bounded_minor(alternatives, versions.VersionVariants("7.12.3")) == 10,\
"Closest alternative with major.minor, skip alternatives with major.minor.patch"
assert versions.latest_bounded_minor(alternatives, versions.VersionVariants("7.11.2")) == 10,\
"Skips all alternatives with major.minor.patch, even if exact match"
assert versions.latest_bounded_minor(alternatives, versions.VersionVariants("7.1.0")) is None,\
"No matching alternative with minor version"

def test_components_ignores_invalid_versions(self):
with self.assertRaises(exceptions.InvalidSyntax) as ctx:
with pytest.raises(
exceptions.InvalidSyntax,
match=re.escape(
r"version string '5.0.0a' does not conform to pattern "
r"'^(\d+)\.(\d+)\.(\d+)(?:-(.+))?$'")):
versions.components("5.0.0a")
self.assertEqual(r"version string '5.0.0a' does not conform to pattern '^(\d+)\.(\d+)\.(\d+)(?:-(.+))?$'", ctx.exception.args[0])

def test_versions_parses_correct_version_string(self):
self.assertEqual(["5.0.3", "5.0", "5"], versions.versions("5.0.3"))
self.assertEqual(["5.0.0-SNAPSHOT", "5.0.0", "5.0", "5"], versions.versions("5.0.0-SNAPSHOT"))
self.assertEqual(["10.3.63", "10.3", "10"], versions.versions("10.3.63"))
def test_versionvariants_parses_correct_version_string(self):
assert versions.VersionVariants("5.0.3").all_versions == [
("5.0.3", "with_patch"),
("5.0", "with_minor"),
("5", "with_major")]
assert versions.VersionVariants("7.12.1-SNAPSHOT").all_versions == [
("7.12.1-SNAPSHOT", "with_suffix"),
("7.12.1", "with_patch"),
("7.12", "with_minor"),
("7", "with_major")]
assert versions.VersionVariants("10.3.63").all_versions == [
("10.3.63", "with_patch"),
("10.3", "with_minor"),
("10", "with_major")]

def test_versions_rejects_invalid_version_strings(self):
with self.assertRaises(exceptions.InvalidSyntax) as ctx:
versions.versions("5.0.0a-SNAPSHOT")
self.assertEqual(r"version string '5.0.0a-SNAPSHOT' does not conform to pattern '^(\d+)\.(\d+)\.(\d+)(?:-(.+))?$'"
, ctx.exception.args[0])
with pytest.raises(
exceptions.InvalidSyntax,
match=re.escape(r"version string '5.0.0a-SNAPSHOT' does not conform to pattern "
r"'^(\d+)\.(\d+)\.(\d+)(?:-(.+))?$'")
):
versions.VersionVariants("5.0.0a-SNAPSHOT")

def test_find_best_match(self):
self.assertEqual("master", versions.best_match(["1.7", "2", "5.0.0-alpha1", "5", "master"], "6.0.0-alpha1"),
"Assume master for versions newer than latest alternative available")
self.assertEqual("5", versions.best_match(["1.7", "2", "5.0.0-alpha1", "5", "master"], "5.1.0-SNAPSHOT"),
"Best match for specific version")
self.assertEqual("master", versions.best_match(["1.7", "2", "5.0.0-alpha1", "5", "master"], None),
"Assume master on unknown version")
self.assertIsNone(versions.best_match(["1.7", "2", "5.0.0-alpha1", "5", "master"], "0.4"), "Reject versions that are too old")
assert versions.best_match(["1.7", "2", "5.0.0-alpha1", "5", "master"], "6.0.0-alpha1") == "master",\
"Assume master for versions newer than latest alternative available"

assert versions.best_match(["1.7", "2", "5.0.0-alpha1", "5", "master"], "5.1.0-SNAPSHOT") == "5",\
"Best match for specific version"

assert versions.best_match(["1.7", "2", "5.0.0-alpha1", "5", "master"], None) == "master",\
"Assume master on unknown version"

assert versions.best_match(["1.7", "2", "5.0.0-alpha1", "5", "master"], "0.4") is None,\
"Reject versions that are too old"

assert versions.best_match(["7", "7.10.2", "7.11", "7.2", "5", "6", "master"], "7.10.2") == "7.10.2", \
"Exact match"

assert versions.best_match(["7", "7.10", "master"], "7.1.0") == "7", \
"Best match is major version"

assert versions.best_match(["7", "7.11", "7.2", "5", "6", "master"], "7.11.0") == "7.11",\
"Best match for specific minor version"

assert versions.best_match(["7", "7.11", "7.2", "5", "6", "master"], "7.12.0") == "7.11",\
"If no exact match, best match is the closest minor"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I interpret "closest" as a synonym for "nearest" in this context. IMHO we should stick to one term consistently.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in 861a176


assert versions.best_match(["7", "7.11", "7.2", "5", "6", "master"], "7.3.0") == "7.2",\
"If no exact match, best match is the closest minor"

assert versions.best_match(["7", "7.11", "7.2", "5", "6", "master"], "7.10.0") == "7.2", \
"If no exact match, best match is the closest minor"

assert versions.best_match(["7", "7.1", "7.11.1", "7.11.0", "7.2", "5", "6", "master"], "7.12.0") == "7.2",\
"Patch or patch-suffix branches are not supported and ignored, best match is closest minor"

assert versions.best_match(["7", "7.11", "7.2", "5", "6", "master"], "7.1.0") == "7",\
"If no exact match and no minor match, next best match is major version"