Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upgrade Modules to Python 3 #1686

Merged
merged 9 commits into from
Aug 21, 2019
Merged

Upgrade Modules to Python 3 #1686

merged 9 commits into from
Aug 21, 2019

Conversation

amarkpayne
Copy link
Member

@amarkpayne amarkpayne commented Aug 12, 2019

Changes

  • PEP-8 reformatting
  • Renaming local variables following PEP-8
  • Futurize
  • Fix relative imports
  • Fix iterators

@amarkpayne
Copy link
Member Author

Not complete yet, but I want to see what happens with the tests and Codacy checks

@amarkpayne amarkpayne self-assigned this Aug 12, 2019
@amarkpayne amarkpayne added Py3 Status: Blocked This PR should not be merged labels Aug 12, 2019
@rwest
Copy link
Member

rwest commented Aug 12, 2019

Looks like a great start! Please can you post somewhere a list of steps taken to do similar on other unmerged pull requests, other branches, and other projects that build on or use RMG?

@amarkpayne
Copy link
Member Author

amarkpayne commented Aug 12, 2019

Looks like a great start! Please can you post somewhere a list of steps taken to do similar on other unmerged pull requests, other branches, and other projects that build on or use RMG?

Certainly! First off, I recommend taking a look at the google doc that @mliu49 sent out in the shared Slack channel, there are a lot of good details in there. The general workflow I am taking so far is as follows (@mliu49, @mjohnson541, @alongd, and @xiaoruiDong feel free to add anything or correct any mistakes I make):

  1. PEP-8 reformatting (optional but highly encouraged. It has made a lot of the files easier to parse through already). As Max showed me I highly recommend using PyCharm's Reformat Code tool (Code > Reformat Code) and then going through the file to fix up any mistakes or changes that still need to be made. This should focus just on whitespace and formatting issues.

  2. Renaming local variables following PEP-8 (optional, but some of us think that it is worthwhile). Again, PyCharm's refactor feature is great for this. DO NOT rename any public variables (functions/methods, classes, etc.)

  3. Futurize (required) Get the futurize script conda install future and then call it on a single file with futurize -0 -wn file_name. The zero flag performs both stage 1 and stage 2 improvements to the code. The w flag actually writes changes to the file, and the n flag prevents creating a backup file (this is what git is for of course). I haven't run into as many of the issues with this as Max and Matt, but one issue that comes up is that if the code has any division statements it will try to import old style division and place it everywhere in your code. If this happens I recommend reverting the file back, placing a from __future__ import division statement at the top of the import block, and running the futurize script again. Thanks to Max for figuring out that this prevents the script from using old style division (make sure to unit test that old style division wasn't needed anywhere).

  4. Replace relative import statements with absolute import statements (required). The futurize scripts will fix the relative import statements, but we want to get rid of them completely. For example, no from molecule import Molecule, replace this with from rmgpy.molecule import Molecule.

  5. Replace iter[keys, values, items] with [keys, values, items] (required). Searching the files for iter is helpful here. If an iterator won't work and a list is need be sure to convert with list()

  6. Unit Test one more time (required)

  7. Squash commits so that there is one commit for each of steps 1-5. (required)

  8. Open PR to py3 branch or master (if this is after the initial transition or an existing PR)

One final note: the order here is important. In particular, do the style changes before the futurize changes so that the futurize changes are not masked in the git blame.

@codecov
Copy link

codecov bot commented Aug 12, 2019

Codecov Report

Merging #1686 into py3 will decrease coverage by 0.06%.
The diff coverage is 34.29%.

Impacted file tree graph

@@            Coverage Diff             @@
##              py3    #1686      +/-   ##
==========================================
- Coverage   41.82%   41.75%   -0.07%     
==========================================
  Files         167      167              
  Lines       29684    29725      +41     
  Branches     6203     6222      +19     
==========================================
- Hits        12415    12413       -2     
- Misses      16348    16385      +37     
- Partials      921      927       +6
Impacted Files Coverage Δ
rmgpy/rmg/__init__.py 100% <ø> (ø) ⬆️
rmgpy/rmg/main.py 21.74% <ø> (-0.11%) ⬇️
rmgpy/rmg/model.py 40.98% <ø> (-0.1%) ⬇️
rmgpy/rmg/listener.py 100% <100%> (ø) ⬆️
rmgpy/rmg/react.py 86.53% <100%> (ø) ⬆️
rmgpy/rmg/input.py 34.31% <29.29%> (-1.67%) ⬇️
rmgpy/display.py 50% <50%> (ø) ⬆️
rmgpy/rmg/output.py 53.36% <60.78%> (ø) ⬆️
rmgpy/rmg/settings.py 82.22% <83.33%> (ø) ⬆️
rmgpy/rmg/pdep.py 12.35% <9.33%> (+0.02%) ⬆️
... and 4 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b3fa250...2913cb7. Read the comment docs.

@mjohnson541
Copy link
Contributor

It should be noted that you shouldn't rename local variables because if you do so it will make rebasing on RMG an absolute nightmare.

@amarkpayne amarkpayne force-pushed the amp_py3_pr branch 4 times, most recently from 9720913 to cbc2329 Compare August 14, 2019 15:55
@amarkpayne amarkpayne marked this pull request as ready for review August 14, 2019 15:55
@amarkpayne amarkpayne requested a review from mliu49 August 14, 2019 15:56
@amarkpayne amarkpayne force-pushed the amp_py3_pr branch 2 times, most recently from 60d0fe9 to b42bd3b Compare August 14, 2019 20:47
@amarkpayne amarkpayne added Status: Ready for Review PR is complete and ready to be reviewed and removed Status: Blocked This PR should not be merged labels Aug 15, 2019
@amarkpayne
Copy link
Member Author

I have altered the author dates on these commits so that they show up in the easiest order to review them in. Note that in the "Futurize" commit I introduce the absolute import feature, but this is always removed in the next commit "replace relative imports".

@amarkpayne amarkpayne force-pushed the amp_py3_pr branch 2 times, most recently from 70dd9fe to 764147c Compare August 16, 2019 14:24
@amarkpayne
Copy link
Member Author

Please note that all of the steps (PEP-8 reformatting, renaming local variables, futurize, replace relative imports) have been done for all files list here and with my name on them in the google doc. If you don't see a commit about it, it just means that the step was unnecessary for that file.

I have also separated the commits out by groups of files to make it easier to review. @alongd feel free to do what you want with the Arkane commits at the end of this PR.

@alongd
Copy link
Member

alongd commented Aug 16, 2019

Thanks! The Arkane commits were cherry-picked and rebased on top of the Arkane Py3 PR

@mliu49
Copy link
Contributor

mliu49 commented Aug 16, 2019

@alongd, can you confirm that it's safe to remove the last 5 commits from this branch?

@alongd
Copy link
Member

alongd commented Aug 17, 2019

I confirm. They were implemented appropriately in the Arkane branch, split by content and squashed into existing commits.

Copy link
Contributor

@mliu49 mliu49 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made a couple of comments. Two things which you could consider doing in all of these files (if applicable) are

  1. Replacing star imports (my approach has been to delete it and wait for PyCharm to flag unknown names)
  2. Replacing import numpy with import numpy as np for consistency

rmg.py Show resolved Hide resolved
@@ -273,12 +273,12 @@ def setUp(self):
self.scalar_dict = {'class': 'ScalarQuantity', 'value': 500.0, 'units': 'K'}

# Abbreviate name
PRO = PseudoRMGObject
pro = PseudoRMGObject
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is technically aliasing a class, so should it be CamelCase? (Bringing up for discussion)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the catch! I am the one who wrote this, so I guess past me got it right then.

https://stackoverflow.com/questions/13765980/what-is-the-naming-convention-for-python-class-references

Also, I just realized that this abbreviation is not necessary, as without it this code will still be within the character limit. I will get rid of it entirely then.

from .model import CoreEdgeReactionModel, ReactionModel
from .output import *
from rmgpy.rmg.model import CoreEdgeReactionModel, ReactionModel
from rmgpy.rmg.output import *
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we would also like to replace star imports if possible.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, thanks!

rmgpy/__init__.py Show resolved Hide resolved
environment_linux.yml Show resolved Hide resolved
@amarkpayne
Copy link
Member Author

Thanks for all of the comments @mliu49! You were right about removing the * imports, we were getting imports in very strange ways. I have rebased this branch on the current py3 branch, let me know what you think.


from rmgpy.rmg.main import RMG
from rmgpy.rmg import input as inp
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would be the difference between this and import rmgpy.rmg.input as inp?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After doing some searching and some playing around in python, there is no difference between these two statements, and it is purely up to us as to which one we prefer. It is worth noting that if it wasn't for the as inp statements in each that there would be a slight difference, as import rmgpy.rmg.input would put rmgpy.rmg into the namespace, whereas from rmgpy.rmg import input would not. There is an even more interesting but subtle effect if we were trying to import a module level variable (see the second from the top response here), but this does not apply in this case.

Given this, which one do we prefer?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to know!

I have a slight preference for import rmgpy.rmg.input as inp since it's cleaner and seems a bit more straightforward. I always figured that from imports were mainly if you wanted something from within a file. Up to you whether or not to change it though.

Copy link
Contributor

@mliu49 mliu49 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I added a couple minor comments.

Are you planning to squash commits together?

def loadInput(self, path=None):
"""
Load an RMG job from the input file located at `inputFile`, or
from the `inputFile` attribute if not given as a parameter.
"""
from input import readInputFile
from .input import readInputFile
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replace with absolute import.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the catch! I forgot to check import statements nested in the code. I did some further searching using regex in PyCharm and I didn't find any further instances in the files covered by this PR, so I think we have it all now.

def loadThermoInput(self, path=None):
"""
Load an Thermo Estimation job from a thermo input file located at `inputFile`, or
from the `inputFile` attribute if not given as a parameter.
"""
from input import readThermoInputFile
if path is None: path = self.inputFile
from .input import readThermoInputFile
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here also

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

def saveInput(self, path=None):
"""
Save an RMG job to the input file located at `path`, or
from the `outputFile` attribute if not given as a parameter.
"""
from input import saveInputFile
if path is None: path = self.outputFile
from .input import saveInputFile
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And here

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

self.newSurfaceRxnsLoss = self.newSurfaceRxnsLoss | lost_surface_rxns

return not (
self.newSurfaceRxnsAdd != set() or self.newSurfaceRxnsLoss != set() or self.newSurfaceSpcsLoss != set() or self.newSurfaceSpcsAdd != set())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should either be one line or split after each or.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, I agree this is much better

Copy link
Member Author

@amarkpayne amarkpayne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the comments, I have made the fixes (including an error I made in the last rebase that caused the tests to fail, but I have made sure that everything is back to what it should be).

Before I push the commits, I can squash them if you would prefer. I wasn't sure if we were doing that before each merge into py3 or if we were going to squash the py3 branch before merging into master


from rmgpy.rmg.main import RMG
from rmgpy.rmg import input as inp
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After doing some searching and some playing around in python, there is no difference between these two statements, and it is purely up to us as to which one we prefer. It is worth noting that if it wasn't for the as inp statements in each that there would be a slight difference, as import rmgpy.rmg.input would put rmgpy.rmg into the namespace, whereas from rmgpy.rmg import input would not. There is an even more interesting but subtle effect if we were trying to import a module level variable (see the second from the top response here), but this does not apply in this case.

Given this, which one do we prefer?

def loadInput(self, path=None):
"""
Load an RMG job from the input file located at `inputFile`, or
from the `inputFile` attribute if not given as a parameter.
"""
from input import readInputFile
from .input import readInputFile
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the catch! I forgot to check import statements nested in the code. I did some further searching using regex in PyCharm and I didn't find any further instances in the files covered by this PR, so I think we have it all now.

def loadThermoInput(self, path=None):
"""
Load an Thermo Estimation job from a thermo input file located at `inputFile`, or
from the `inputFile` attribute if not given as a parameter.
"""
from input import readThermoInputFile
if path is None: path = self.inputFile
from .input import readThermoInputFile
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

def saveInput(self, path=None):
"""
Save an RMG job to the input file located at `path`, or
from the `outputFile` attribute if not given as a parameter.
"""
from input import saveInputFile
if path is None: path = self.outputFile
from .input import saveInputFile
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

self.newSurfaceRxnsLoss = self.newSurfaceRxnsLoss | lost_surface_rxns

return not (
self.newSurfaceRxnsAdd != set() or self.newSurfaceRxnsLoss != set() or self.newSurfaceSpcsLoss != set() or self.newSurfaceSpcsAdd != set())
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, I agree this is much better

@amarkpayne
Copy link
Member Author

Final note: the commit Add missing InputError import in arkane/util.py is no longer needed and will be dropped.

@mliu49
Copy link
Contributor

mliu49 commented Aug 21, 2019

I'm still a bit undecided on the best way to handle commits for this process. Perhaps we can discuss tomorrow.

Only covers the following files:
chemkinTest.py
constants*.py
constraints*.py
display.py
exceptions.py
quantityTest.py
rmgObjectTest.py
Before we were getting it from a dependency of a dependency
Previous wildcard imports masked this issue.
Thanks to @mliu49 for catching this and making this fix.
@amarkpayne
Copy link
Member Author

I have squashed the commits and rebased on the current py3 branch

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Py3 Status: Ready for Review PR is complete and ready to be reviewed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants