-
Notifications
You must be signed in to change notification settings - Fork 314
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rally improve artifact download headers #1159
Conversation
To prevent Rally artifact downloads from being counted in statistics add parameter to download url: x-elastic-no-kpi=true
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the quick turnaround!
Left a few comments, main one is about the no kpi parameter in supplier.py
.
esrally/utils/net.py
Outdated
@@ -182,6 +182,13 @@ def download_http(url, local_path, expected_size_in_bytes=None, progress_indicat | |||
return expected_size_in_bytes | |||
|
|||
|
|||
def add_url_param(url, params): | |||
url_parsed = urlparse(url) | |||
query = dict(parse_qsl(url_parsed.query)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we want to get a dict
back and update it with params
, should we use parse_qs instead since it actually returns a dict
rather an dictify the list that parse_qsl
returns? According the above docs, if we do that, we should use urlencode(query, doseq=True)
two lines below.
esrally/mechanic/supplier.py
Outdated
@@ -509,7 +509,7 @@ def __init__(self, repo, version, distributions_root): | |||
|
|||
def fetch(self): | |||
io.ensure_dir(self.distributions_root) | |||
download_url = self.repo.download_url | |||
download_url = net.add_url_param(self.repo.download_url, {"x-elastic-no-kpi": "true"}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having URL parameter specifics in this file (which deals with matter in a higher layer of abstraction) feels a bit out of place to me.
I wonder, would it make sense to just have an additional convenience function in net.py
like net.url_elastic_no_kpi()
where this param is added by invoking net.add_url_param()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
makes sense, i will add that
@@ -60,6 +60,12 @@ def test_gcs_object_url(self, seed): | |||
assert net._build_gcs_object_url(bucket_name, bucket_path) == \ | |||
"https://storage.googleapis.com/storage/v1/b/unittest-bucket.test.me/o/path%2Fto%2Fobject?alt=media" | |||
|
|||
def test_add_url_param(self): | |||
url = "https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.2.0.tar.gz?flag1=true" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we add a second test to check if a flag value contains something that needs special quoting e.g. a whitespace? Reason I am saying this is because I noticed that urlencode by default uses quote_plus
which e.g. will encode a param like flag=test me
to flat=test+me
. The plus is OK for queries according to http://en.wikipedia.org/wiki/URL_encoding and http://www.w3.org/TR/html401/interact/forms.html#h-17.13.4.1 but perhaps having this test helps us figure out some edge case in the future?
@dliappis Thank you for the review, I addressed your comments. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Thank you for the quick review! @dliappis |
To prevent Rally artifact downloads from being counted in statistics add parameter to download url: x-elastic-no-kpi=true