Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Python3 support #600

Merged
merged 59 commits into from
Jul 2, 2018
Merged

Python3 support #600

merged 59 commits into from
Jul 2, 2018

Conversation

xiaket
Copy link
Contributor

@xiaket xiaket commented May 19, 2018

Hi everyone,

This is at this stage an EARLY EXPERIMENTAL PR for python3 support. All unit tests passed locally both in Python 2.7.14 and 3.6.5.

I've utilized future do the stupid changes, and it's part of the stacker code base now. In future, if we want to drop python2 support, we can simply remove it from our code base.

To make it easier to run the tests in 2 environments, I've created two virtual environments and they are managed by a bash script embedded in this PR. Here's a little help for the script(manage) here:

To run a linting test for python2: ./manage lint2 (or ./manage l2 for short)
To run a nose test for python3: ./manage nose3 (or ./manage n3 for short)
To run everything: ./manage run

It's a handy tool created to make the development easier. Technically, it should not be a part of this PR, but for now, I decide to let it in so you do not have to set up a test environment.

This is a huge PR, I've tried my best to make each commit atomic, and I hope I've explained the rationale for the change in the commit message. If you have any questions, just leave a message here or shoot me a message on Slack. Thanks.

xiaket added 24 commits May 19, 2018 20:15
To run the unit tests in Python2, just run `manage ./n2`, for python3,
it's `./manage n3`.

I have delibrately avoided the use of .eggs directory so we can keep the
virtual environment directories when we switch environments. So I have
added two requirements files, one for each environment.

Comparing to the requirements described in setpy.py, I had made the
folllowing modifications:

1. add future, the library I have used to migrate the codebase.
2. add flake8 and backports.ssl-match-hostname, to avoid import error.
3. install formic for python2 and formic-py3 for python3.

Signed-off-by: Kai Xia <xiaket@gmail.com>
Run this command against the codebase.

Signed-off-by: Kai Xia <xiaket@gmail.com>
Signed-off-by: Kai Xia <xiaket@gmail.com>
At this stage, all py2 tests passed. py3 tests has 52 errors=52 and 1 failure

Signed-off-by: Kai Xia <xiaket@gmail.com>
Signed-off-by: Kai Xia <xiaket@gmail.com>
Signed-off-by: Kai Xia <xiaket@gmail.com>
Signed-off-by: Kai Xia <xiaket@gmail.com>
Signed-off-by: Kai Xia <xiaket@gmail.com>
run `./manage run`, this will run linter in py2 and py3, then nose in
py2 and py3.

Signed-off-by: Kai Xia <xiaket@gmail.com>
For zipfile usage, we should be using BytesIO, for we are storing bytes
in the object. For the usage in config, we could use decode to convert
the string to a unicode object, for now.

lint2 pass
lint3 pass
nose2 pass
nose3 failed with 73 errors and 4 failures.

Signed-off-by: Kai Xia <xiaket@gmail.com>
FIXME: Please review this change. For I do not understand the full
consequence of this change.

nose3 failed with 39 errors and 3 failures.

Signed-off-by: Kai Xia <xiaket@gmail.com>
36 error and 4 failures in nose3 test.

Signed-off-by: Kai Xia <xiaket@gmail.com>
Signed-off-by: Kai Xia <xiaket@gmail.com>
Also, an infinite loop issue was fixed in py3.

nose3 failed with 20 errors and 4 failures.

Signed-off-by: Kai Xia <xiaket@gmail.com>
nose2 actually has a failure, we deal with that later.
nose3 has 13 errors and 5 failures.

Signed-off-by: Kai Xia <xiaket@gmail.com>
So instead of comparing to a fixed list of strings, I compare the
stripped result.

nose2: 1 failure
nose3: 13 errors and 3 failures.

Signed-off-by: Kai Xia <xiaket@gmail.com>
Also, fixed a simple compatibility issue(string.letters -> ascii_letters)

nose2: 1 failure
nose3: 11 errors and 3 failures

Signed-off-by: Kai Xia <xiaket@gmail.com>
Can someone suggest a better way? This looks ugly.  :(

nose2: 1 failure
nose3: 3 errors and 4 failures

Signed-off-by: Kai Xia <xiaket@gmail.com>
nose2: 1 failure
nose3: 2 errors and 4 failures

Signed-off-by: Kai Xia <xiaket@gmail.com>
nose2: 1 failure
nose3: 4 failures

Signed-off-by: Kai Xia <xiaket@gmail.com>
nose2: 1 failure
nose3: 2 failures

Signed-off-by: Kai Xia <xiaket@gmail.com>
.decode is required here because unlike its conterpart in py2, which
returns a string object, the b64decode will return bytes.

nose2: 1 failure
nose3: 1 failure

Are these two the same failure? You betcha.

Signed-off-by: Kai Xia <xiaket@gmail.com>
in py2, the message shows as:

${message_starts} ${message_contains}

and in py3:

${message_starts} '${message_contains}'

I have tried for 20 minutes for a simple fix but I cannot find it, so I
gave up and turned to plan B.

Signed-off-by: Kai Xia <xiaket@gmail.com>
I've seen the downstream value of a turn to ['b', 'c', 'd'].
given the nature of this API, we should use set to compare.

Signed-off-by: Kai Xia <xiaket@gmail.com>
@xiaket
Copy link
Contributor Author

xiaket commented May 19, 2018

All tests will fail, because I left Makefile unchanged, and essentially all code will import future and cause an error. I hope that's not an issue for now.

@ejholmes
Copy link
Contributor

I may not get a chance to review this for a while, but just wanted to say this is awesome and thanks for opening the PR @xiaket.

For what it's worth, I definitely think we should start getting python3 tested (maybe just running the full test suite against python3 on the master branch), so that we can get it green, and then keep it green.

/cc @phobologic

@phobologic
Copy link
Member

This is awesome - thanks so much @xiaket - I'll try to get a look soon!

I have to change the Makefile and setup.py to make this work as
expected. It runs locally and I hope it will behave in the same way in
CircleCI.

Signed-off-by: Kai Xia <xiaket@gmail.com>
@xiaket
Copy link
Contributor Author

xiaket commented May 23, 2018

FYI, I've done another change and hopefully it will make all the tests pass in CircleCI.

One other thing I want to report is I can see a 100% speed boost when we run the tests in Python 3:

# running unit test in python2
Ran 343 tests in 22.320s
# running unit test in python3
Ran 343 tests in 11.238s

Signed-off-by: Kai Xia <xiaket@gmail.com>
@danielkza
Copy link
Contributor

@xiaket I have good experience with tox, if you need any support you can just ask on the stacker Slack channel and I'll do my best to help :)

Signed-off-by: Kai Xia <xiaket@gmail.com>
@troyready
Copy link
Contributor

@xiaket it needs a small change to make formic2 work, sorry I didn't catch that before:

diff --git a/setup.py b/setup.py
index 7b04581..88dbd5a 100644
--- a/setup.py
+++ b/setup.py
@@ -11,7 +11,7 @@ install_requires = [
     "boto3>=1.3.1",
     "PyYAML~=3.12",
     "awacs>=0.6.0",
-    "formic~=0.9b",
+    "formic2~=1.0",
     "gitpython~=2.0",
     "schematics~=2.0.1",
 ]
diff --git a/stacker/hooks/aws_lambda.py b/stacker/hooks/aws_lambda.py
index 245d1bf..b96c4db 100644
--- a/stacker/hooks/aws_lambda.py
+++ b/stacker/hooks/aws_lambda.py
@@ -6,7 +6,6 @@ import hashlib
 from StringIO import StringIO
 from zipfile import ZipFile, ZIP_DEFLATED
 import botocore
-from functools import partial
 import formic
 from troposphere.awslambda import Code
 from stacker.session_cache import get_session
@@ -119,9 +118,7 @@ def _find_files(root, includes, excludes, follow_symlinks):
 
     root = os.path.abspath(root)
     file_set = formic.FileSet(directory=root, include=includes,
-                              exclude=excludes,
-                              walk=partial(
-                                  os.walk, followlinks=follow_symlinks))
+                              exclude=excludes, symlinks=follow_symlinks)
 
     for filename in file_set.qualified_files(absolute=False):
         yield filename

xiaket added 2 commits June 4, 2018 10:01
Credit should go to @troyready:

#600 (comment)

Signed-off-by: Kai Xia <xiaket@gmail.com>
In the `topological_sort` method of `DAG` class, `in_degree` is a dictionary
to keep track of how deep an element is dependent: the higher the value, the
lower the element in the dependence tree.

In the test case, the initialization dict is:

```
{
    'a': ['b', 'c'],
    'b': ['d'],
    'c': ['d'],
    'd': [],
}
```

The calculated `in_degree` is:

```
{'a': 0, 'b': 1, 'c': 1, 'd': 2}
```

which looks good so far. Then later in the while loop, we are getting the
elements of the same "in degree" from `graph[u]`, which in fact is a set and
the ordering of the elements are in general indeterministic. Our test case
happen to have (again) relied on that specific order, which is changed in
Python 3. That's why half of the time we can see a failure from this test.

The solution is simple, since we are facing elements of the same "in degree"
here, we could simply order the elements by their name by adding a `sorted()`
call around `graph[u]`, which will return a deterministic list of elements.

Signed-off-by: Kai Xia <xiaket@gmail.com>
@xiaket
Copy link
Contributor Author

xiaket commented Jun 11, 2018

Hey @ejholmes, this PR has been here for quite a while, can you please squeeze some time to review some of the DAG related changes? Thanks!

Copy link
Member

@phobologic phobologic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @xiaket - sorry this has taken so long, changing jobs as well as a vacation followed by sickness has left me with very little time.

I'm gonna give @ejholmes till Wednesday this week to chime in, but I think this is looking good from everything I can tell.

Thanks again for all your patience with this - really appreciate all the work you've done!

if xz and xy and yz:
logger.debug("Removing edge %s -> %s" % (x, z))
graph[x].remove(z)
combinations = []
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I'm tempted to accept this - @ejholmes wrote plenty of tests for this, and they are passing - though just one more question: Was there a reason we needed to make this change related to python3, or was this meant to change it for some other reason @xiaket? Do you have any idea if the performance is any different between the two?

@@ -72,7 +76,7 @@ def __init__(self, definition, context, variables=None, mappings=None,
self.force = force
self.enabled = enabled
self.protected = protected
self.context = copy.deepcopy(context)
self.context = context
Copy link
Member

@phobologic phobologic Jun 23, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I created a quick script to test if memory usage changed after removing the deepcopy:

import sys
import resource

from stacker.context import Context
from stacker.config import Config, Stack


def generate_context(stacks, conf):
    stack_names = [d["name"] for d in stacks]
    stack_objs = [Stack(d) for d in stacks]

    conf.stacks = stack_objs
    conf.validate()
    return Context(
        environment={},
        stack_names=stack_names,
        config=conf,
    )


conf = Config()
conf.namespace = "test-namespace"

stacks = []
STACK_COUNT = int(sys.argv[1])

for i in range(STACK_COUNT):
    stacks.append({"name": "vpc%s" % i, "class_path": "blueprints.VPC"})


start_mem = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
for i in range(len(stacks)):
    ii = i + 1
    s = stacks[:ii]
    ctx = generate_context(s, conf)

stacks_created = ctx.get_stacks()

end_mem = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
total_mem = end_mem - start_mem

print total_mem,

And then ran it repeatedly, increasing the config size by one stack each time with this shell script, and dumped the output into a csv format file:

#!/bin/bash

for i in $(seq 100)
do
    echo "$i,$(python ./test_stacker_mem.py $i)"
done

That resulted in this graph:

screenshot 2018-06-23 15 36 16

So it looks like in 1.3 we regressed and started using more memory once more, and this should help fix it. @ejholmes unless you can think of a reason this might be bad, I'm going to go ahead and go witht his too.

(BTW, here's the google spreadsheet with my run's results: https://docs.google.com/spreadsheets/d/15rP-RPYg7DtFQIwB_lK6oaJwPQHG6tz1NfxsvUKVJeM/edit?usp=sharing)

@phobologic
Copy link
Member

Hey @xiaket - I went ahead and created a branch in the cloudtools/stacker repo to kick off functional tests, and it looks like we're running into issues with 2.7:

https://circleci.com/gh/cloudtools/stacker/1992

(here's the fake PR I made to kick off the functional tests: #610)

Any ideas?

@phobologic
Copy link
Member

phobologic commented Jun 24, 2018

Ok, after some creative debugging with print statements (and dealing with the OS X systems integrity protection, ugh), I think I've tracked it down to urllib in python2.7 having issues with byte strings:

   KeyError: 49
   [2018-06-24T11:20:55] vpc: failed (49)
   [2018-06-24T11:20:55] The following steps failed: vpc
### The following are debug print statements
DEBUG: urllib.quote: s arg: b'10.128.0.0/16'
DEBUG: urllib.quote: QUOTER var: <built-in method __getitem__ of dict object at 0x1099f3168>
   
DEBUG: botocore.vendored.requests.models._encode_params: data arg: {'Parameters.member.2.ParameterKey': 'PublicSubnets', 'Parameters.member.5.ParameterValue': 'NAT', 'StackName': u'stacker-test-namespace-vpc', 'Parameters.member.6.ParameterValue': 'false', 'Version': u'2010-05-15', 'Capabilities.member.1': 'CAPABILITY_NAMED_IAM', 'Parameters.member.5.ParameterKey': 'ImageName', 'Parameters.member.4.ParameterKey': 'BaseDomain', 'Parameters.member.7.ParameterValue': '10.128.0.0/16', 'Parameters.member.2.ParameterValue': '10.128.0.0/24,10.128.1.0/24,10.128.2.0/24,10.128.3.0/24', 'Parameters.member.1.ParameterValue': '', 'TemplateURL': u'https://s3.amazonaws.com/stacker-stacker-test-namespace/stack_templates/stacker-test-namespace-vpc/vpc-829da93c.json', 'Parameters.member.7.ParameterKey': 'CidrBlock', 'Tags.member.1.Value': u'stacker-test-namespace', 'Parameters.member.1.ParameterKey': 'InternalDomain', 'Parameters.member.3.ParameterValue': '10.128.8.0/22,10.128.12.0/22,10.128.16.0/22,10.128.20.0/22', 'Action': u'UpdateStack', 'Parameters.member.3.ParameterKey': 'PrivateSubnets', 'Tags.member.1.Key': 'stacker_namespace', 'Parameters.member.6.ParameterKey': 'UseNatGateway', 'Parameters.member.4.ParameterValue': '', 'Parameters.member.8.ParameterKey': 'InstanceType', 'Parameters.member.8.ParameterValue': 'm3.medium'}
   

DEBUG: botocore.vendored.requests.models._encode_params: RESULT var: [('Parameters.member.2.ParameterKey', 'PublicSubnets'), ('Parameters.member.5.ParameterValue', b'NAT'), ('StackName', 'stacker-test-namespace-vpc'), ('Parameters.member.6.ParameterValue', b'false'), ('Version', '2010-05-15'), ('Capabilities.member.1', 'CAPABILITY_NAMED_IAM'), ('Parameters.member.5.ParameterKey', 'ImageName'), ('Parameters.member.4.ParameterKey', 'BaseDomain'), ('Parameters.member.7.ParameterValue', b'10.128.0.0/16'), ('Parameters.member.2.ParameterValue', b'10.128.0.0/24,10.128.1.0/24,10.128.2.0/24,10.128.3.0/24'), ('Parameters.member.1.ParameterValue', b''), ('TemplateURL', 'https://s3.amazonaws.com/stacker-stacker-test-namespace/stack_templates/stacker-test-namespace-vpc/vpc-829da93c.json'), ('Parameters.member.7.ParameterKey', 'CidrBlock'), ('Tags.member.1.Value', 'stacker-test-namespace'), ('Parameters.member.1.ParameterKey', 'InternalDomain'), ('Parameters.member.3.ParameterValue', b'10.128.8.0/22,10.128.12.0/22,10.128.16.0/22,10.128.20.0/22'), ('Action', 'UpdateStack'), ('Parameters.member.3.ParameterKey', 'PrivateSubnets'), ('Tags.member.1.Key', 'stacker_namespace'), ('Parameters.member.6.ParameterKey', 'UseNatGateway'), ('Parameters.member.4.ParameterValue', b''), ('Parameters.member.8.ParameterKey', 'InstanceType'), ('Parameters.member.8.ParameterValue', b'm3.medium')]

It's specifically urllib.quote having an issue with the byte-string b"10.128.0.0/16" - it may have trouble with others as well, this is just the first byte string it's dealing with.

@phobologic
Copy link
Member

Hmm, nevermind, i was wrong about the address being the first byte string - there are others before it.

@phobologic
Copy link
Member

Ok, running into a little bit of a roadblock of lack of knowledge around strings/bytestrings in the python2/3 world. I have to work on some other stuff, but I'll be around most of the day to chat (here or on Slack) if anyone gets a chance to take a look!

ref: http://python-future.org/compatible_idioms.html#queue

future add a queue dir in site-packages so after the installation of
future, we can import queue both in Python2 and Python3.

Signed-off-by: Kai Xia <xiaket@gmail.com>
@xiaket
Copy link
Contributor Author

xiaket commented Jun 26, 2018

@phobologic Sorry for the late reply, I took the last two days fighting this (properly setup an environment alone took me one day) and I already have a dirty hack that can fix this issue, I'm looking for a clean solution ATM. I'll keep you updated about my progress.

xiaket added 2 commits June 27, 2018 12:33
please see github comment in PR 610 for details:

#610 (comment)

Signed-off-by: Kai Xia <xiaket@gmail.com>
Signed-off-by: Kai Xia <xiaket@gmail.com>
@phobologic
Copy link
Member

Does the removal of importing the str method cause any other issues? That doesn't look super dirty to me, though maybe I don't understand the implications for python3. Thanks @xiaket!

Signed-off-by: Kai Xia <xiaket@gmail.com>
@xiaket
Copy link
Contributor Author

xiaket commented Jul 1, 2018

@phobologic I believe the failure in dc10db4 is unrelated(throttling of API). I updated README and forced another round of tests, it's all green now.

@phobologic
Copy link
Member

@xiaket unfortunately the functional-tests in non-cloudtools/stacker based branches don't actually run (for security reasons). The output shows each of the tests as skipped in circleci.

The good news is that it looks like after forcing retries (to get around the Throttling) it looks like the functional tests passed. I'm going to merge this now! Thanks!

@phobologic phobologic merged commit 1cb24aa into cloudtools:master Jul 2, 2018
phrohdoh pushed a commit to phrohdoh/stacker that referenced this pull request Dec 18, 2018
* add script to ease the pain to test in two virtualenvs.

To run the unit tests in Python2, just run `manage ./n2`, for python3,
it's `./manage n3`.

I have delibrately avoided the use of .eggs directory so we can keep the
virtual environment directories when we switch environments. So I have
added two requirements files, one for each environment.

Comparing to the requirements described in setpy.py, I had made the
folllowing modifications:

1. add future, the library I have used to migrate the codebase.
2. add flake8 and backports.ssl-match-hostname, to avoid import error.
3. install formic for python2 and formic-py3 for python3.

Signed-off-by: Kai Xia <xiaket@gmail.com>

* futurize -a -w -1 stacker

Run this command against the codebase.

Signed-off-by: Kai Xia <xiaket@gmail.com>

* ignore venv dirs.

Signed-off-by: Kai Xia <xiaket@gmail.com>

* fix formic reinstallation for python3.

At this stage, all py2 tests passed. py3 tests has 52 errors=52 and 1 failure

Signed-off-by: Kai Xia <xiaket@gmail.com>

* fix lint issues.

Signed-off-by: Kai Xia <xiaket@gmail.com>

* add lint functions, mute pip.

Signed-off-by: Kai Xia <xiaket@gmail.com>

* futurize -w -2 stacker

Signed-off-by: Kai Xia <xiaket@gmail.com>

* fix linter in python2 and python3

Signed-off-by: Kai Xia <xiaket@gmail.com>

* add general lint and run.

run `./manage run`, this will run linter in py2 and py3, then nose in
py2 and py3.

Signed-off-by: Kai Xia <xiaket@gmail.com>

* fix StringIO related issues.

For zipfile usage, we should be using BytesIO, for we are storing bytes
in the object. For the usage in config, we could use decode to convert
the string to a unicode object, for now.

lint2 pass
lint3 pass
nose2 pass
nose3 failed with 73 errors and 4 failures.

Signed-off-by: Kai Xia <xiaket@gmail.com>

* Python3 is not happy with this deepcopy.

FIXME: Please review this change. For I do not understand the full
consequence of this change.

nose3 failed with 39 errors and 3 failures.

Signed-off-by: Kai Xia <xiaket@gmail.com>

* fix encoding issue for md5 checksum.

36 error and 4 failures in nose3 test.

Signed-off-by: Kai Xia <xiaket@gmail.com>

* expose nose3 test

Signed-off-by: Kai Xia <xiaket@gmail.com>

* fix encoding issues in these two files.

Also, an infinite loop issue was fixed in py3.

nose3 failed with 20 errors and 4 failures.

Signed-off-by: Kai Xia <xiaket@gmail.com>

* Fix the message attribute in exception with str(exception)

nose2 actually has a failure, we deal with that later.
nose3 has 13 errors and 5 failures.

Signed-off-by: Kai Xia <xiaket@gmail.com>

* fix the case where the returned template does not have trailing space.

So instead of comparing to a fixed list of strings, I compare the
stripped result.

nose2: 1 failure
nose3: 13 errors and 3 failures.

Signed-off-by: Kai Xia <xiaket@gmail.com>

* fix type errors in the utils and test_utils.

Also, fixed a simple compatibility issue(string.letters -> ascii_letters)

nose2: 1 failure
nose3: 11 errors and 3 failures

Signed-off-by: Kai Xia <xiaket@gmail.com>

* fix encoding by type checking.

Can someone suggest a better way? This looks ugly.  :(

nose2: 1 failure
nose3: 3 errors and 4 failures

Signed-off-by: Kai Xia <xiaket@gmail.com>

* fix type.

nose2: 1 failure
nose3: 2 errors and 4 failures

Signed-off-by: Kai Xia <xiaket@gmail.com>

* fix encoding issue in kms.

nose2: 1 failure
nose3: 4 failures

Signed-off-by: Kai Xia <xiaket@gmail.com>

* fix encoding issues in config tests.

nose2: 1 failure
nose3: 2 failures

Signed-off-by: Kai Xia <xiaket@gmail.com>

* fix encoding issue in test_file.

.decode is required here because unlike its conterpart in py2, which
returns a string object, the b64decode will return bytes.

nose2: 1 failure
nose3: 1 failure

Are these two the same failure? You betcha.

Signed-off-by: Kai Xia <xiaket@gmail.com>

* this is not a complete fix, but I think it's good enough.

in py2, the message shows as:

${message_starts} ${message_contains}

and in py3:

${message_starts} '${message_contains}'

I have tried for 20 minutes for a simple fix but I cannot find it, so I
gave up and turned to plan B.

Signed-off-by: Kai Xia <xiaket@gmail.com>

* fix occasional failure in DAG test.

I've seen the downstream value of a turn to ['b', 'c', 'd'].
given the nature of this API, we should use set to compare.

Signed-off-by: Kai Xia <xiaket@gmail.com>

* try to make tests pass in CircleCI.

I have to change the Makefile and setup.py to make this work as
expected. It runs locally and I hope it will behave in the same way in
CircleCI.

Signed-off-by: Kai Xia <xiaket@gmail.com>

* update circleCI settings to include python3.

Signed-off-by: Kai Xia <xiaket@gmail.com>

* fix output match.

The output for missing arguments is different in Python3.

Signed-off-by: Kai Xia <xiaket@gmail.com>

* fix another encoding issue.

Signed-off-by: Kai Xia <xiaket@gmail.com>

* Change the way we validate graph output.

Key order has changed between python2 and python3, so this old test that
relies on python2 key order behavior is no longer working. For a
comparison, this is the output of the command in Python 3:

```
{
    "steps": {
        "vpc": {
            "deps": []
        },
        "bastion2": {
            "deps": [
                "vpc"
            ]
        },
        "bastion1": {
            "deps": [
                "vpc"
            ]
        },
        "app2": {
            "deps": [
                "bastion2"
            ]
        },
        "app1": {
            "deps": [
                "bastion1"
            ]
        }
    }
}
```

which, apart from the key order, is the same with the output embeded in
the results.

In order to support Python 2 and Python 3 at the same time, I have
validated three keys in that json output, hopes that's enough.

Signed-off-by: Kai Xia <xiaket@gmail.com>

* fix another md5 encoding issue.

This is the last one of the family, I've checked.

Signed-off-by: Kai Xia <xiaket@gmail.com>

* Fix another test that relies on dictionary key order.

Previous this will cry because it cannot find a variable named
`PublicSubnets`, but that's a consequence of python dictionary key order
and in Python3, the order has changed.

I've checked the blueprint, all other variables have a default value.

Signed-off-by: Kai Xia <xiaket@gmail.com>

* hopefully with this change, we can run lint last.

Because it takes less time so it could run last.

Signed-off-by: Kai Xia <xiaket@gmail.com>

* add tests for python 3.5

Signed-off-by: Kai Xia <xiaket@gmail.com>

* fix another dict key order.

The desired output looks like:

```
  "bastion2" -> "vpc";
  "bastion1" -> "vpc";
  "app2" -> "bastion2";
  "app1" -> "bastion1";
```

I've seen a test result that looks like this:

```
  "bastion1" -> "vpc";
  "bastion2" -> "vpc";
  "app1" -> "bastion1";
  "app2" -> "bastion2";
```

It turns out that the sorting mechanism in `dag.DAG.topological_sort`
cannot guarantee the order of keys at the same degree, because
`in_degree` is a dict.

So at most, we can say, bastion -> vpc will appear before app -> bastion.
To validate this behavior, I've added the following test:

```
  assert $(echo "$output" | grep -A 2 vpc | tail -n 2 | grep -c vpc) = '0'
```

So grep 2 lines after the last appearance of vpc, show the last two
lines, and there should be no occurence of 'vpc' in these two lines.
Were there any app line before bastion line, we will the last two lines
would at least contain one line that has vpc in it.

Signed-off-by: Kai Xia <xiaket@gmail.com>

* rewrite dag reduction, so it would handle deep reduction.

The previous implementation cannot handle the cases described in
`test_transitive_deep_reduction`. This fault is discovered when we run
the unit tests in Python 3.5, where the dictionary key order is
different from Python 2.7 and Python 3.6.

Signed-off-by: Kai Xia <xiaket@gmail.com>

* Fix issue where the elements can only be single character.

I realized this on my way home, better late than never.

Signed-off-by: Kai Xia <xiaket@gmail.com>

* Hack PyYAML so it will use OrderedDict for mapping.

This function aiming to keep the order from the yaml file in an OrderedDict
in the output of the yaml.load output, as well as make sure that some of the
keys are unique.

Our previous attempt to keep the order in the yaml in an OrderedDict is
flawed. Take our test case for example:

```
raw_config = """
pre_build:
  hook2:
    path: foo.bar
  hook1:
    path: foo1.bar1
"""
config = yaml_to_ordered_dict(raw_config)
```

The type of the config instance is dict instead of OrderedDict, moreover,
the type of config['pre_build'] is also a dict. We fixed this behavior
in this commit.

PyYAML used dict internally to save mappings. This hardcoding of dict happens
in two places, one in `SafeLoader.construct_yaml_map`, the other happens in our
`_validate_mapping`, where a dictionary is used to keep the validated keys.
This validated mapping is then sent to OrderedDict, but at this stage,
all the key order is scrambled.

So we need to:

1. Override the default behavior of `SafeLoader.construct_yaml_map`.
2. Use OrderedDict in `_validate_mapping`, and return it directly.

The second one is easy to change, however, we have two choices to override
the default method in `SafeLoader`. We can either monkey patch it:

```
yaml.loaders.SafeLoader.construct_yaml_map = our_construct_yaml_map
```

Or, the way I've used in this commit:

```
OrderedUniqueLoader.add_constructor(
    u'tag:yaml.org,2002:map', OrderedUniqueLoader.construct_yaml_map,
)
```

The first method is by far the cleaner solution, except that it's a
monkey patch which may in some unforeseeable way bring disaster to us.
The current solutions involves the hardcoding of the type identifier,
which normally is not good, but even the PyYAML devs are hardcoding this
string in 3 different places so I think it's safe.

Signed-off-by: Kai Xia <xiaket@gmail.com>

* some change to cleanup Python3 branch.

1. remove manage script.
2. normalize_json will not generate template with trailing spaces.
3. some more cleanup.

Signed-off-by: Kai Xia <xiaket@gmail.com>

* review 220244d and cleanup.

In 220244d, we have used futurize to carry out some automatic changes.
The changes are review in this commit and in general I have:

1. remove unnecessary list
2. remove unnecessary keys() call (in __iter__ and __len__ like calls).

Signed-off-by: Kai Xia <xiaket@gmail.com>

* remove the byproduct of manage script.

Signed-off-by: Kai Xia <xiaket@gmail.com>

* add tooling to enable future import checks.

I'll rehash what I've learned from the doc of `flake8-future-import` here.

* `--require-code` means this tool will ignore the empty or near empty python
  files. For example, `__init__.py`.

* `--min-version=2.7` will tell the tool to ignore features that's already
  in Python 2.7.

* `--ignore FI50,FI51,FI53,FI14` tells the tool to expect the following
  three future imports in the files:

  ```
  from __future__ import print_function
  from __future__ import division
  from __future__ import absolute_import
  ```

  and ignore the absence of `from __future__ import unicode_literals`,
  which is tricky and is not included in the default `-a` option of
  futurize for a good reason.

Signed-off-by: Kai Xia <xiaket@gmail.com>

* use formic2 over formic and formic-py3.

Signed-off-by: Kai Xia <xiaket@gmail.com>

* revert 66f4be3 because of failed tests.

Signed-off-by: Kai Xia <xiaket@gmail.com>

* cleanup logic here.

Signed-off-by: Kai Xia <xiaket@gmail.com>

* fix formic2 compatibility.

Credit should go to @troyready:

cloudtools#600 (comment)

Signed-off-by: Kai Xia <xiaket@gmail.com>

* fix indeterministic result in DAG.topological_sort.

In the `topological_sort` method of `DAG` class, `in_degree` is a dictionary
to keep track of how deep an element is dependent: the higher the value, the
lower the element in the dependence tree.

In the test case, the initialization dict is:

```
{
    'a': ['b', 'c'],
    'b': ['d'],
    'c': ['d'],
    'd': [],
}
```

The calculated `in_degree` is:

```
{'a': 0, 'b': 1, 'c': 1, 'd': 2}
```

which looks good so far. Then later in the while loop, we are getting the
elements of the same "in degree" from `graph[u]`, which in fact is a set and
the ordering of the elements are in general indeterministic. Our test case
happen to have (again) relied on that specific order, which is changed in
Python 3. That's why half of the time we can see a failure from this test.

The solution is simple, since we are facing elements of the same "in degree"
here, we could simply order the elements by their name by adding a `sorted()`
call around `graph[u]`, which will return a deterministic list of elements.

Signed-off-by: Kai Xia <xiaket@gmail.com>

* Fix full path to Queue

* Fix hook future import issue

* Setup dependencies in circle runs

Also cleanup pep8 issues in hooks/command.py

* Fix circle syntax

* Fix future imports in tests_command

* Try to get non-parallel testing workflows right

* Fix queue import for python2/3

* Argh, one more fix on the queue/Queue import

* use queue provided by future.

ref: http://python-future.org/compatible_idioms.html#queue

future add a queue dir in site-packages so after the installation of
future, we can import queue both in Python2 and Python3.

Signed-off-by: Kai Xia <xiaket@gmail.com>

* avoid the transform to newstr.

please see github comment in PR 610 for details:

cloudtools#610 (comment)

Signed-off-by: Kai Xia <xiaket@gmail.com>

* review all occurences and remove unnecessary use of newstr.

Signed-off-by: Kai Xia <xiaket@gmail.com>

* force another round of test.

Signed-off-by: Kai Xia <xiaket@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants