Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Py3 only fixes breaking Py2 compatibility #1713

Merged
merged 44 commits into from
Sep 11, 2019
Merged

Py3 only fixes breaking Py2 compatibility #1713

merged 44 commits into from
Sep 11, 2019

Conversation

mliu49
Copy link
Contributor

@mliu49 mliu49 commented Aug 29, 2019

This PR contains changes necessary for full compatibility with Python 3, some of which require breaking Python 2 compatibility.

#1683, which replaces dde with chemprop, has been included in this PR. The purpose is to have a single PR which switches (the py3 branch) from being fully functional in Python 2 to fully functional in Python3.

As far as testing goes, I think the requirement for merging this into the py3 branch will simply be reviewer approval and passing standard Travis tests. Once that's done, more rigorous testing will be done on the py3 branch.

@mliu49
Copy link
Contributor Author

mliu49 commented Aug 29, 2019

Unresolved Issue 1: XYZ string generation in Arkane

======================================================================
FAIL: Test properly loading the ArkaneSpecies object and respective sub-objects
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/mjliu/Code/RMG-Py/arkane/commonTest.py", line 369, in test_create_and_load_yaml
    self.assertTrue('C 7.54e-14 1.193e-13 5.52e-14' in arkane_spc.xyz)
AssertionError: False is not true

----------------------------------------------------------------------

I commented out the failing assertion temporarily so that unit tests will pass and RMG-tests will run. The cause has been identified as a change in how floats are handled by str.format.

Unresolved Issue 2: Sorting Entry objects

I just discovered this while running a test job. I think we might need to do a global search for sorting and verify that all instances of implicit sorting of custom classes are addressed.

Traceback (most recent call last):
  File "/home/mjliu/Code/RMG-Py/rmg.py", line 185, in <module>
    main()
  File "/home/mjliu/Code/RMG-Py/rmg.py", line 179, in main
    rmg.execute(**kwargs)
  File "/home/mjliu/Code/RMG-Py/rmgpy/rmg/main.py", line 894, in execute
    trimolecularReact=self.trimolecularReact)
  File "/home/mjliu/Code/RMG-Py/rmgpy/rmg/model.py", line 646, in enlarge
    self.updateUnimolecularReactionNetworks()
  File "/home/mjliu/Code/RMG-Py/rmgpy/rmg/model.py", line 1756, in updateUnimolecularReactionNetworks
    network.update(self, self.pressureDependence)
  File "/home/mjliu/Code/RMG-Py/rmgpy/rmg/pdep.py", line 776, in update
    spec.generateStatMech()
  File "rmgpy/species.py", line 779, in rmgpy.species.Species.generateStatMech
  File "/home/mjliu/Code/RMG-Py/rmgpy/data/statmech.py", line 667, in getStatmechData
    statmech_model = self.getStatmechDataFromGroups(molecule, thermoModel)
  File "/home/mjliu/Code/RMG-Py/rmgpy/data/statmech.py", line 701, in getStatmechDataFromGroups
    return self.groups['groups'].getStatmechData(molecule, thermoModel)
  File "/home/mjliu/Code/RMG-Py/rmgpy/data/statmech.py", line 404, in getStatmechData
    for entry in group_count if group_count[entry] > 0])
TypeError: '<' not supported between instances of 'Entry' and 'Entry'

Note: This particular comparison issue was fixed, but there could still be others.

@codecov
Copy link

codecov bot commented Aug 29, 2019

Codecov Report

Merging #1713 into py3 will decrease coverage by 8.79%.
The diff coverage is 44.66%.

Impacted file tree graph

@@            Coverage Diff            @@
##              py3    #1713     +/-   ##
=========================================
- Coverage   41.41%   32.62%   -8.8%     
=========================================
  Files         167       87     -80     
  Lines       30118    26184   -3934     
  Branches     6421     6875    +454     
=========================================
- Hits        12473     8542   -3931     
+ Misses      16685    16677      -8     
- Partials      960      965      +5
Impacted Files Coverage Δ
rmgpy/pdep/network.py 12.62% <0%> (ø) ⬆️
rmgpy/molecule/symmetry.py 0% <0%> (ø) ⬆️
rmgpy/data/statmech.py 42.39% <0%> (+0.67%) ⬆️
rmgpy/rmg/pdep.py 12.21% <0%> (ø) ⬆️
rmgpy/yml.py 15.71% <0%> (-0.12%) ⬇️
rmgpy/molecule/group.py 0% <0%> (ø) ⬆️
rmgpy/tools/regression.py 29.76% <0%> (ø) ⬆️
rmgpy/tools/fluxdiagram.py 9.91% <0%> (ø) ⬆️
rmgpy/data/solvation.py 65.2% <0%> (+1.53%) ⬆️
rmgpy/data/kinetics/rules.py 51.31% <0%> (+0.13%) ⬆️
... and 37 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update af9dc41...6cfa689. Read the comment docs.

@alongd
Copy link
Member

alongd commented Aug 29, 2019

Why is the coverage significantly lower?

@mliu49
Copy link
Contributor Author

mliu49 commented Aug 29, 2019

Good question! It might be due to the fact that codecov is comparing between py2 results and py3 results, although it's not apparent what differences would affect coverage that much.

Copy link
Member

@alongd alongd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few comments.
Do we also need/want to convert all u""" instances in the database repo?

rmgpy/data/test_data/thermo/groups/group.py Show resolved Hide resolved
rmgpy/data/thermo.py Show resolved Hide resolved
rmgpy/ml/estimator.py Show resolved Hide resolved
rmgpy/ml/estimator.py Show resolved Hide resolved
rmgpy/molecule/translatorTest.py Show resolved Hide resolved
Colin Grambow and others added 10 commits August 29, 2019 13:09
Only use two models, one that predicts enthalpy and one that predicts
entropy and heat capacity simultaneously. Uncertainty cannot be estimated
with the new models.

If chemprop is not available, don't raise error until trying to use it.
Packages requiring libgcc have been rebuilt to require libgcc-ng instead. With this the conda solver can resolve the environment without needing pydot-ng and pygpu to be given explicitly
Primary goal is to not overwrite settings.pxi if it already exists.

When using `cythonize`, Cython does a better job of figuring out
if files have changed and need to be recompiled. Since we would
rewrite settings.pxi every time upon calling `make solver`, Cython
would think that it needs to recompile the entire solver module
since it depends on that file. This change prevents excessive
rewriting and recompilation.
@mliu49
Copy link
Contributor Author

mliu49 commented Aug 29, 2019

Rebased onto py3. The two unresolved issues mentioned earlier have been fixed.

Copy link
Member

@amarkpayne amarkpayne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have taken a look through all of the commits and I have a few quick comments. Let me know what you think.

rmgpy/ml/estimator.py Show resolved Hide resolved
rmgpy/ml/estimator.py Show resolved Hide resolved
@@ -2,8 +2,8 @@
# encoding: utf-8

name = "Surface Adsorption Corrections"
shortDesc = u""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also make this change in the documentation as well? For example, line 242 of database/introduction.rst.txt and many others

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also remove this from RMG-database? It would make the database no longer compatible with python 2 going forward, but I think this is a good incentive for people to switch to the python 3 version of RMG-Py. I have not made my mind up on this, but rather just want to get the conversation going

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additionally, remove unicode strings that use single quotes in the following files (there may be more). I searched u' (leading space) to find these in PyCharm.

  • molecule/draw.py
  • reference.py
  • documentation/source/conf.py (these ones may be okay)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should remove them from RMG-database, but it's not crucial. I removed them from the testing databases because it was straightforward to do.

Ideally, we would machine write the entire database, like we planned to do a while back. Another consideration is whether or not we want to change the database syntax.

.travis.yml Outdated Show resolved Hide resolved
setup.py Outdated Show resolved Hide resolved
rmgpy/speciesTest.py Show resolved Hide resolved
arkane/inputTest.py Show resolved Hide resolved
rmgpy/rmg/model.py Show resolved Hide resolved
Makefile Show resolved Hide resolved
rmgpy/data/statmech.py Show resolved Hide resolved
setup.py Outdated Show resolved Hide resolved
@mliu49
Copy link
Contributor Author

mliu49 commented Sep 5, 2019

I got this warning while running a test job on this branch:

/home/mjliu/Code/RMG-Py/rmgpy/tools/plot.py:211: RuntimeWarning: More than 20 figures have been opened. Figures created through the pyplot interface (`matplotlib.pyplot.figure`) are retained until explicitly closed and may consume too much memory. (To control this warning, see the rcParam `figure.max_open_warning`).
  fig = plt.figure()

Maybe this resulted from switching some plotting functionality from pylab to pyplot? Regardless, we should probably explicitly close the figure anytime we plot something, using matplotlib.pyplot.close.

Improper syntax issues which didn't break in Python 2
Quantity objects cannot be implicitly sorted anymore
None can no longer be sorted along with strings
The `file` function not longer exists
The for loop over the bond dictionary would go through too many iterations
because we were modifying the dictionary while looping

This worked in Python 2, possibly because dictionaries weren't ordered
It is now passing for an unclear reason
To clean up Travis output a bit
Now, stdout will only be displayed if a test fails
The InChI class is defined as a subclass of str
How pickling worked before is a mystery...

Due to some change in objects and/or pickling in Python 3,
this functionality no longer works as expected.

Pickling of InChI/AugmentedInChI objects is not critical,
so just mark as work_in_progress for now.
mliu49 and others added 6 commits September 10, 2019 18:00
They were temporary for py2/3 compatibility
They can be removed since we will only support py3 going forward
Since we don't need str/unicode conversion anymore, there is no
need to have it as a property.
The Entry class does not have comparison methods implemented
In Arkane's create_hindered_rotor_figure() method
Also in rmgpy.stats and rmgpy.tools.plot
Otherwise they accumulate in memory
@mliu49
Copy link
Contributor Author

mliu49 commented Sep 10, 2019

I went ahead and squashed the fixes and rebased. I would like to get this in asap so we can move on to finalizing other stuff.

One last thing to address is how we want to handle the two temp commits. Since this branch incorporates chemprop and makes changes to environment files, both RMG-database and RMG-tests require updates.

My proposal is to add the chemprop models into RMG-database without removing the dde models yet, so that python 2 and python 3 builds will work, and to modify RMG-tests to use environment.yml when available and environment_[os].yml otherwise. This way, all tests will pass. We can then remove the dde models from database at a later date.

@cgrambow, any thoughts about database?

@cgrambow
Copy link

I'd say just kick out the DDE models but no strong feelings either way.

@amarkpayne
Copy link
Member

I made a quick comment about the file encoding lines, but otherwise once the TEMP commit issues get resolved I approve of merging this PR into py3 as long as @alongd and @cgrambow feel the same way.

I'll work with @mliu49 to review and merge the needed changes for RMG-tests

@mliu49
Copy link
Contributor Author

mliu49 commented Sep 11, 2019

If we remove the dde models, all master based RMG-Py builds, RMG-database builds, and RMG-tests builds will fail until we merge the py3 branch. If anyone were to pull current database during this time, their install would also be broken. If the time between merging this PR into py3 and merging py3 into master were to be less than 24 hours, I think it would be acceptable, but I feel like that is unlikely.

@amarkpayne amarkpayne merged commit 28931a0 into py3 Sep 11, 2019
@amarkpayne amarkpayne deleted the py3_only branch September 11, 2019 19:49
@mliu49 mliu49 mentioned this pull request Sep 23, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants