-
Notifications
You must be signed in to change notification settings - Fork 166
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Python3 support #600
Python3 support #600
Conversation
To run the unit tests in Python2, just run `manage ./n2`, for python3, it's `./manage n3`. I have delibrately avoided the use of .eggs directory so we can keep the virtual environment directories when we switch environments. So I have added two requirements files, one for each environment. Comparing to the requirements described in setpy.py, I had made the folllowing modifications: 1. add future, the library I have used to migrate the codebase. 2. add flake8 and backports.ssl-match-hostname, to avoid import error. 3. install formic for python2 and formic-py3 for python3. Signed-off-by: Kai Xia <xiaket@gmail.com>
Run this command against the codebase. Signed-off-by: Kai Xia <xiaket@gmail.com>
Signed-off-by: Kai Xia <xiaket@gmail.com>
At this stage, all py2 tests passed. py3 tests has 52 errors=52 and 1 failure Signed-off-by: Kai Xia <xiaket@gmail.com>
Signed-off-by: Kai Xia <xiaket@gmail.com>
Signed-off-by: Kai Xia <xiaket@gmail.com>
Signed-off-by: Kai Xia <xiaket@gmail.com>
Signed-off-by: Kai Xia <xiaket@gmail.com>
run `./manage run`, this will run linter in py2 and py3, then nose in py2 and py3. Signed-off-by: Kai Xia <xiaket@gmail.com>
For zipfile usage, we should be using BytesIO, for we are storing bytes in the object. For the usage in config, we could use decode to convert the string to a unicode object, for now. lint2 pass lint3 pass nose2 pass nose3 failed with 73 errors and 4 failures. Signed-off-by: Kai Xia <xiaket@gmail.com>
FIXME: Please review this change. For I do not understand the full consequence of this change. nose3 failed with 39 errors and 3 failures. Signed-off-by: Kai Xia <xiaket@gmail.com>
36 error and 4 failures in nose3 test. Signed-off-by: Kai Xia <xiaket@gmail.com>
Signed-off-by: Kai Xia <xiaket@gmail.com>
Also, an infinite loop issue was fixed in py3. nose3 failed with 20 errors and 4 failures. Signed-off-by: Kai Xia <xiaket@gmail.com>
nose2 actually has a failure, we deal with that later. nose3 has 13 errors and 5 failures. Signed-off-by: Kai Xia <xiaket@gmail.com>
So instead of comparing to a fixed list of strings, I compare the stripped result. nose2: 1 failure nose3: 13 errors and 3 failures. Signed-off-by: Kai Xia <xiaket@gmail.com>
Also, fixed a simple compatibility issue(string.letters -> ascii_letters) nose2: 1 failure nose3: 11 errors and 3 failures Signed-off-by: Kai Xia <xiaket@gmail.com>
Can someone suggest a better way? This looks ugly. :( nose2: 1 failure nose3: 3 errors and 4 failures Signed-off-by: Kai Xia <xiaket@gmail.com>
nose2: 1 failure nose3: 4 failures Signed-off-by: Kai Xia <xiaket@gmail.com>
nose2: 1 failure nose3: 2 failures Signed-off-by: Kai Xia <xiaket@gmail.com>
.decode is required here because unlike its conterpart in py2, which returns a string object, the b64decode will return bytes. nose2: 1 failure nose3: 1 failure Are these two the same failure? You betcha. Signed-off-by: Kai Xia <xiaket@gmail.com>
in py2, the message shows as: ${message_starts} ${message_contains} and in py3: ${message_starts} '${message_contains}' I have tried for 20 minutes for a simple fix but I cannot find it, so I gave up and turned to plan B. Signed-off-by: Kai Xia <xiaket@gmail.com>
I've seen the downstream value of a turn to ['b', 'c', 'd']. given the nature of this API, we should use set to compare. Signed-off-by: Kai Xia <xiaket@gmail.com>
All tests will fail, because I left Makefile unchanged, and essentially all code will import |
I may not get a chance to review this for a while, but just wanted to say this is awesome and thanks for opening the PR @xiaket. For what it's worth, I definitely think we should start getting python3 tested (maybe just running the full test suite against python3 on the master branch), so that we can get it green, and then keep it green. /cc @phobologic |
This is awesome - thanks so much @xiaket - I'll try to get a look soon! |
I have to change the Makefile and setup.py to make this work as expected. It runs locally and I hope it will behave in the same way in CircleCI. Signed-off-by: Kai Xia <xiaket@gmail.com>
FYI, I've done another change and hopefully it will make all the tests pass in CircleCI. One other thing I want to report is I can see a 100% speed boost when we run the tests in Python 3:
|
Signed-off-by: Kai Xia <xiaket@gmail.com>
@xiaket I have good experience with tox, if you need any support you can just ask on the stacker Slack channel and I'll do my best to help :) |
Signed-off-by: Kai Xia <xiaket@gmail.com>
@xiaket it needs a small change to make formic2 work, sorry I didn't catch that before:
|
Credit should go to @troyready: #600 (comment) Signed-off-by: Kai Xia <xiaket@gmail.com>
In the `topological_sort` method of `DAG` class, `in_degree` is a dictionary to keep track of how deep an element is dependent: the higher the value, the lower the element in the dependence tree. In the test case, the initialization dict is: ``` { 'a': ['b', 'c'], 'b': ['d'], 'c': ['d'], 'd': [], } ``` The calculated `in_degree` is: ``` {'a': 0, 'b': 1, 'c': 1, 'd': 2} ``` which looks good so far. Then later in the while loop, we are getting the elements of the same "in degree" from `graph[u]`, which in fact is a set and the ordering of the elements are in general indeterministic. Our test case happen to have (again) relied on that specific order, which is changed in Python 3. That's why half of the time we can see a failure from this test. The solution is simple, since we are facing elements of the same "in degree" here, we could simply order the elements by their name by adding a `sorted()` call around `graph[u]`, which will return a deterministic list of elements. Signed-off-by: Kai Xia <xiaket@gmail.com>
Hey @ejholmes, this PR has been here for quite a while, can you please squeeze some time to review some of the DAG related changes? Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @xiaket - sorry this has taken so long, changing jobs as well as a vacation followed by sickness has left me with very little time.
I'm gonna give @ejholmes till Wednesday this week to chime in, but I think this is looking good from everything I can tell.
Thanks again for all your patience with this - really appreciate all the work you've done!
if xz and xy and yz: | ||
logger.debug("Removing edge %s -> %s" % (x, z)) | ||
graph[x].remove(z) | ||
combinations = [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I'm tempted to accept this - @ejholmes wrote plenty of tests for this, and they are passing - though just one more question: Was there a reason we needed to make this change related to python3, or was this meant to change it for some other reason @xiaket? Do you have any idea if the performance is any different between the two?
@@ -72,7 +76,7 @@ def __init__(self, definition, context, variables=None, mappings=None, | |||
self.force = force | |||
self.enabled = enabled | |||
self.protected = protected | |||
self.context = copy.deepcopy(context) | |||
self.context = context |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I created a quick script to test if memory usage changed after removing the deepcopy:
import sys
import resource
from stacker.context import Context
from stacker.config import Config, Stack
def generate_context(stacks, conf):
stack_names = [d["name"] for d in stacks]
stack_objs = [Stack(d) for d in stacks]
conf.stacks = stack_objs
conf.validate()
return Context(
environment={},
stack_names=stack_names,
config=conf,
)
conf = Config()
conf.namespace = "test-namespace"
stacks = []
STACK_COUNT = int(sys.argv[1])
for i in range(STACK_COUNT):
stacks.append({"name": "vpc%s" % i, "class_path": "blueprints.VPC"})
start_mem = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
for i in range(len(stacks)):
ii = i + 1
s = stacks[:ii]
ctx = generate_context(s, conf)
stacks_created = ctx.get_stacks()
end_mem = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
total_mem = end_mem - start_mem
print total_mem,
And then ran it repeatedly, increasing the config size by one stack each time with this shell script, and dumped the output into a csv format file:
#!/bin/bash
for i in $(seq 100)
do
echo "$i,$(python ./test_stacker_mem.py $i)"
done
That resulted in this graph:
So it looks like in 1.3 we regressed and started using more memory once more, and this should help fix it. @ejholmes unless you can think of a reason this might be bad, I'm going to go ahead and go witht his too.
(BTW, here's the google spreadsheet with my run's results: https://docs.google.com/spreadsheets/d/15rP-RPYg7DtFQIwB_lK6oaJwPQHG6tz1NfxsvUKVJeM/edit?usp=sharing)
Also cleanup pep8 issues in hooks/command.py
Hey @xiaket - I went ahead and created a branch in the cloudtools/stacker repo to kick off functional tests, and it looks like we're running into issues with 2.7: https://circleci.com/gh/cloudtools/stacker/1992 (here's the fake PR I made to kick off the functional tests: #610) Any ideas? |
Ok, after some creative debugging with print statements (and dealing with the OS X systems integrity protection, ugh), I think I've tracked it down to urllib in python2.7 having issues with byte strings:
It's specifically |
Hmm, nevermind, i was wrong about the address being the first byte string - there are others before it. |
Ok, running into a little bit of a roadblock of lack of knowledge around strings/bytestrings in the python2/3 world. I have to work on some other stuff, but I'll be around most of the day to chat (here or on Slack) if anyone gets a chance to take a look! |
ref: http://python-future.org/compatible_idioms.html#queue future add a queue dir in site-packages so after the installation of future, we can import queue both in Python2 and Python3. Signed-off-by: Kai Xia <xiaket@gmail.com>
@phobologic Sorry for the late reply, I took the last two days fighting this (properly setup an environment alone took me one day) and I already have a dirty hack that can fix this issue, I'm looking for a clean solution ATM. I'll keep you updated about my progress. |
please see github comment in PR 610 for details: #610 (comment) Signed-off-by: Kai Xia <xiaket@gmail.com>
Signed-off-by: Kai Xia <xiaket@gmail.com>
Does the removal of importing the str method cause any other issues? That doesn't look super dirty to me, though maybe I don't understand the implications for python3. Thanks @xiaket! |
Signed-off-by: Kai Xia <xiaket@gmail.com>
@phobologic I believe the failure in dc10db4 is unrelated(throttling of API). I updated README and forced another round of tests, it's all green now. |
@xiaket unfortunately the functional-tests in non- The good news is that it looks like after forcing retries (to get around the Throttling) it looks like the functional tests passed. I'm going to merge this now! Thanks! |
* add script to ease the pain to test in two virtualenvs. To run the unit tests in Python2, just run `manage ./n2`, for python3, it's `./manage n3`. I have delibrately avoided the use of .eggs directory so we can keep the virtual environment directories when we switch environments. So I have added two requirements files, one for each environment. Comparing to the requirements described in setpy.py, I had made the folllowing modifications: 1. add future, the library I have used to migrate the codebase. 2. add flake8 and backports.ssl-match-hostname, to avoid import error. 3. install formic for python2 and formic-py3 for python3. Signed-off-by: Kai Xia <xiaket@gmail.com> * futurize -a -w -1 stacker Run this command against the codebase. Signed-off-by: Kai Xia <xiaket@gmail.com> * ignore venv dirs. Signed-off-by: Kai Xia <xiaket@gmail.com> * fix formic reinstallation for python3. At this stage, all py2 tests passed. py3 tests has 52 errors=52 and 1 failure Signed-off-by: Kai Xia <xiaket@gmail.com> * fix lint issues. Signed-off-by: Kai Xia <xiaket@gmail.com> * add lint functions, mute pip. Signed-off-by: Kai Xia <xiaket@gmail.com> * futurize -w -2 stacker Signed-off-by: Kai Xia <xiaket@gmail.com> * fix linter in python2 and python3 Signed-off-by: Kai Xia <xiaket@gmail.com> * add general lint and run. run `./manage run`, this will run linter in py2 and py3, then nose in py2 and py3. Signed-off-by: Kai Xia <xiaket@gmail.com> * fix StringIO related issues. For zipfile usage, we should be using BytesIO, for we are storing bytes in the object. For the usage in config, we could use decode to convert the string to a unicode object, for now. lint2 pass lint3 pass nose2 pass nose3 failed with 73 errors and 4 failures. Signed-off-by: Kai Xia <xiaket@gmail.com> * Python3 is not happy with this deepcopy. FIXME: Please review this change. For I do not understand the full consequence of this change. nose3 failed with 39 errors and 3 failures. Signed-off-by: Kai Xia <xiaket@gmail.com> * fix encoding issue for md5 checksum. 36 error and 4 failures in nose3 test. Signed-off-by: Kai Xia <xiaket@gmail.com> * expose nose3 test Signed-off-by: Kai Xia <xiaket@gmail.com> * fix encoding issues in these two files. Also, an infinite loop issue was fixed in py3. nose3 failed with 20 errors and 4 failures. Signed-off-by: Kai Xia <xiaket@gmail.com> * Fix the message attribute in exception with str(exception) nose2 actually has a failure, we deal with that later. nose3 has 13 errors and 5 failures. Signed-off-by: Kai Xia <xiaket@gmail.com> * fix the case where the returned template does not have trailing space. So instead of comparing to a fixed list of strings, I compare the stripped result. nose2: 1 failure nose3: 13 errors and 3 failures. Signed-off-by: Kai Xia <xiaket@gmail.com> * fix type errors in the utils and test_utils. Also, fixed a simple compatibility issue(string.letters -> ascii_letters) nose2: 1 failure nose3: 11 errors and 3 failures Signed-off-by: Kai Xia <xiaket@gmail.com> * fix encoding by type checking. Can someone suggest a better way? This looks ugly. :( nose2: 1 failure nose3: 3 errors and 4 failures Signed-off-by: Kai Xia <xiaket@gmail.com> * fix type. nose2: 1 failure nose3: 2 errors and 4 failures Signed-off-by: Kai Xia <xiaket@gmail.com> * fix encoding issue in kms. nose2: 1 failure nose3: 4 failures Signed-off-by: Kai Xia <xiaket@gmail.com> * fix encoding issues in config tests. nose2: 1 failure nose3: 2 failures Signed-off-by: Kai Xia <xiaket@gmail.com> * fix encoding issue in test_file. .decode is required here because unlike its conterpart in py2, which returns a string object, the b64decode will return bytes. nose2: 1 failure nose3: 1 failure Are these two the same failure? You betcha. Signed-off-by: Kai Xia <xiaket@gmail.com> * this is not a complete fix, but I think it's good enough. in py2, the message shows as: ${message_starts} ${message_contains} and in py3: ${message_starts} '${message_contains}' I have tried for 20 minutes for a simple fix but I cannot find it, so I gave up and turned to plan B. Signed-off-by: Kai Xia <xiaket@gmail.com> * fix occasional failure in DAG test. I've seen the downstream value of a turn to ['b', 'c', 'd']. given the nature of this API, we should use set to compare. Signed-off-by: Kai Xia <xiaket@gmail.com> * try to make tests pass in CircleCI. I have to change the Makefile and setup.py to make this work as expected. It runs locally and I hope it will behave in the same way in CircleCI. Signed-off-by: Kai Xia <xiaket@gmail.com> * update circleCI settings to include python3. Signed-off-by: Kai Xia <xiaket@gmail.com> * fix output match. The output for missing arguments is different in Python3. Signed-off-by: Kai Xia <xiaket@gmail.com> * fix another encoding issue. Signed-off-by: Kai Xia <xiaket@gmail.com> * Change the way we validate graph output. Key order has changed between python2 and python3, so this old test that relies on python2 key order behavior is no longer working. For a comparison, this is the output of the command in Python 3: ``` { "steps": { "vpc": { "deps": [] }, "bastion2": { "deps": [ "vpc" ] }, "bastion1": { "deps": [ "vpc" ] }, "app2": { "deps": [ "bastion2" ] }, "app1": { "deps": [ "bastion1" ] } } } ``` which, apart from the key order, is the same with the output embeded in the results. In order to support Python 2 and Python 3 at the same time, I have validated three keys in that json output, hopes that's enough. Signed-off-by: Kai Xia <xiaket@gmail.com> * fix another md5 encoding issue. This is the last one of the family, I've checked. Signed-off-by: Kai Xia <xiaket@gmail.com> * Fix another test that relies on dictionary key order. Previous this will cry because it cannot find a variable named `PublicSubnets`, but that's a consequence of python dictionary key order and in Python3, the order has changed. I've checked the blueprint, all other variables have a default value. Signed-off-by: Kai Xia <xiaket@gmail.com> * hopefully with this change, we can run lint last. Because it takes less time so it could run last. Signed-off-by: Kai Xia <xiaket@gmail.com> * add tests for python 3.5 Signed-off-by: Kai Xia <xiaket@gmail.com> * fix another dict key order. The desired output looks like: ``` "bastion2" -> "vpc"; "bastion1" -> "vpc"; "app2" -> "bastion2"; "app1" -> "bastion1"; ``` I've seen a test result that looks like this: ``` "bastion1" -> "vpc"; "bastion2" -> "vpc"; "app1" -> "bastion1"; "app2" -> "bastion2"; ``` It turns out that the sorting mechanism in `dag.DAG.topological_sort` cannot guarantee the order of keys at the same degree, because `in_degree` is a dict. So at most, we can say, bastion -> vpc will appear before app -> bastion. To validate this behavior, I've added the following test: ``` assert $(echo "$output" | grep -A 2 vpc | tail -n 2 | grep -c vpc) = '0' ``` So grep 2 lines after the last appearance of vpc, show the last two lines, and there should be no occurence of 'vpc' in these two lines. Were there any app line before bastion line, we will the last two lines would at least contain one line that has vpc in it. Signed-off-by: Kai Xia <xiaket@gmail.com> * rewrite dag reduction, so it would handle deep reduction. The previous implementation cannot handle the cases described in `test_transitive_deep_reduction`. This fault is discovered when we run the unit tests in Python 3.5, where the dictionary key order is different from Python 2.7 and Python 3.6. Signed-off-by: Kai Xia <xiaket@gmail.com> * Fix issue where the elements can only be single character. I realized this on my way home, better late than never. Signed-off-by: Kai Xia <xiaket@gmail.com> * Hack PyYAML so it will use OrderedDict for mapping. This function aiming to keep the order from the yaml file in an OrderedDict in the output of the yaml.load output, as well as make sure that some of the keys are unique. Our previous attempt to keep the order in the yaml in an OrderedDict is flawed. Take our test case for example: ``` raw_config = """ pre_build: hook2: path: foo.bar hook1: path: foo1.bar1 """ config = yaml_to_ordered_dict(raw_config) ``` The type of the config instance is dict instead of OrderedDict, moreover, the type of config['pre_build'] is also a dict. We fixed this behavior in this commit. PyYAML used dict internally to save mappings. This hardcoding of dict happens in two places, one in `SafeLoader.construct_yaml_map`, the other happens in our `_validate_mapping`, where a dictionary is used to keep the validated keys. This validated mapping is then sent to OrderedDict, but at this stage, all the key order is scrambled. So we need to: 1. Override the default behavior of `SafeLoader.construct_yaml_map`. 2. Use OrderedDict in `_validate_mapping`, and return it directly. The second one is easy to change, however, we have two choices to override the default method in `SafeLoader`. We can either monkey patch it: ``` yaml.loaders.SafeLoader.construct_yaml_map = our_construct_yaml_map ``` Or, the way I've used in this commit: ``` OrderedUniqueLoader.add_constructor( u'tag:yaml.org,2002:map', OrderedUniqueLoader.construct_yaml_map, ) ``` The first method is by far the cleaner solution, except that it's a monkey patch which may in some unforeseeable way bring disaster to us. The current solutions involves the hardcoding of the type identifier, which normally is not good, but even the PyYAML devs are hardcoding this string in 3 different places so I think it's safe. Signed-off-by: Kai Xia <xiaket@gmail.com> * some change to cleanup Python3 branch. 1. remove manage script. 2. normalize_json will not generate template with trailing spaces. 3. some more cleanup. Signed-off-by: Kai Xia <xiaket@gmail.com> * review 220244d and cleanup. In 220244d, we have used futurize to carry out some automatic changes. The changes are review in this commit and in general I have: 1. remove unnecessary list 2. remove unnecessary keys() call (in __iter__ and __len__ like calls). Signed-off-by: Kai Xia <xiaket@gmail.com> * remove the byproduct of manage script. Signed-off-by: Kai Xia <xiaket@gmail.com> * add tooling to enable future import checks. I'll rehash what I've learned from the doc of `flake8-future-import` here. * `--require-code` means this tool will ignore the empty or near empty python files. For example, `__init__.py`. * `--min-version=2.7` will tell the tool to ignore features that's already in Python 2.7. * `--ignore FI50,FI51,FI53,FI14` tells the tool to expect the following three future imports in the files: ``` from __future__ import print_function from __future__ import division from __future__ import absolute_import ``` and ignore the absence of `from __future__ import unicode_literals`, which is tricky and is not included in the default `-a` option of futurize for a good reason. Signed-off-by: Kai Xia <xiaket@gmail.com> * use formic2 over formic and formic-py3. Signed-off-by: Kai Xia <xiaket@gmail.com> * revert 66f4be3 because of failed tests. Signed-off-by: Kai Xia <xiaket@gmail.com> * cleanup logic here. Signed-off-by: Kai Xia <xiaket@gmail.com> * fix formic2 compatibility. Credit should go to @troyready: cloudtools#600 (comment) Signed-off-by: Kai Xia <xiaket@gmail.com> * fix indeterministic result in DAG.topological_sort. In the `topological_sort` method of `DAG` class, `in_degree` is a dictionary to keep track of how deep an element is dependent: the higher the value, the lower the element in the dependence tree. In the test case, the initialization dict is: ``` { 'a': ['b', 'c'], 'b': ['d'], 'c': ['d'], 'd': [], } ``` The calculated `in_degree` is: ``` {'a': 0, 'b': 1, 'c': 1, 'd': 2} ``` which looks good so far. Then later in the while loop, we are getting the elements of the same "in degree" from `graph[u]`, which in fact is a set and the ordering of the elements are in general indeterministic. Our test case happen to have (again) relied on that specific order, which is changed in Python 3. That's why half of the time we can see a failure from this test. The solution is simple, since we are facing elements of the same "in degree" here, we could simply order the elements by their name by adding a `sorted()` call around `graph[u]`, which will return a deterministic list of elements. Signed-off-by: Kai Xia <xiaket@gmail.com> * Fix full path to Queue * Fix hook future import issue * Setup dependencies in circle runs Also cleanup pep8 issues in hooks/command.py * Fix circle syntax * Fix future imports in tests_command * Try to get non-parallel testing workflows right * Fix queue import for python2/3 * Argh, one more fix on the queue/Queue import * use queue provided by future. ref: http://python-future.org/compatible_idioms.html#queue future add a queue dir in site-packages so after the installation of future, we can import queue both in Python2 and Python3. Signed-off-by: Kai Xia <xiaket@gmail.com> * avoid the transform to newstr. please see github comment in PR 610 for details: cloudtools#610 (comment) Signed-off-by: Kai Xia <xiaket@gmail.com> * review all occurences and remove unnecessary use of newstr. Signed-off-by: Kai Xia <xiaket@gmail.com> * force another round of test. Signed-off-by: Kai Xia <xiaket@gmail.com>
Hi everyone,
This is at this stage an EARLY EXPERIMENTAL PR for python3 support. All unit tests passed locally both in Python 2.7.14 and 3.6.5.
I've utilized future do the stupid changes, and it's part of the stacker code base now. In future, if we want to drop python2 support, we can simply remove it from our code base.
To make it easier to run the tests in 2 environments, I've created two virtual environments and they are managed by a bash script embedded in this PR. Here's a little help for the script(
manage
) here:To run a linting test for python2:
./manage lint2
(or./manage l2
for short)To run a nose test for python3:
./manage nose3
(or./manage n3
for short)To run everything:
./manage run
It's a handy tool created to make the development easier. Technically, it should not be a part of this PR, but for now, I decide to let it in so you do not have to set up a test environment.
This is a huge PR, I've tried my best to make each commit atomic, and I hope I've explained the rationale for the change in the commit message. If you have any questions, just leave a message here or shoot me a message on Slack. Thanks.