Report reactants and products in a consistent deterministic order #1616

Closed
rwest opened this issue Jun 5, 2019 · 6 comments

rwest (Member) commented Jun 5, 2019

Motivation or Problem

Whenever I re-run an RMG simulation and then diff the log files or Chemkin files to see what has changed, I get a ton of changes like this:
[Screenshot of the diff]
where the same reaction is simply written as
16 + 34 <=> 7 one time but
34 + 16 <=> 7 the next.

Desired Outcome

I'd like the log files and Chemkin files to come out identical if the models are chemically identical, so that whatever shows up in a git diff is an actual difference.

Possible solutions

We could sort by species ID, e.g. always put the lowest number first:
16 + 34 <=> 7

Or, for library and seed reactions, it could preserve the order as originally written (making it easier to look up the original sources).

For template reactions we might use the labels (e.g. the molecule with *1 always goes first) so that reactions of the same type are written consistently (e.g. abstractee + abstractor => or alkene + radical =>).
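
As a rough illustration of the first option (sort by species ID), here is a minimal sketch. The `Species` class and `reaction_string` helper are hypothetical stand-ins invented for this example, not RMG's actual classes or writer code; the only assumption is that each species carries an integer index.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Species:
    # Hypothetical stand-in for an RMG species; only the fields needed here.
    index: int
    label: str = ""


def reaction_string(reactants, products):
    """Format a reaction with each side ordered by species index."""
    def by_index(s):
        return s.index

    left = " + ".join(str(s.index) for s in sorted(reactants, key=by_index))
    right = " + ".join(str(s.index) for s in sorted(products, key=by_index))
    return f"{left} <=> {right}"


a, b, c = Species(16), Species(34), Species(7)
print(reaction_string([a, b], [c]))  # 16 + 34 <=> 7
print(reaction_string([b, a], [c]))  # 16 + 34 <=> 7
```

Both calls produce the same string regardless of the order in which the reactants were stored, which is the property a clean diff needs.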

mliu49 (Contributor) commented Jun 5, 2019

I tried sorting by species ID recently. It seemed to work very well, but resulted in an unexpected change to one of the unit tests for the explorer feature of Arkane. I haven't had a chance to look more into what caused the change yet.

rwest (Member, Author) commented Jun 6, 2019

Is there a reason they're shuffled in the first place? I would have thought they'd be generated in a deterministic order.

mliu49 (Contributor) commented Jun 6, 2019

#388 and #409 are related.

We sort the reactants and products, but because Species objects do not have comparison methods and we don't provide a sorting key, they are sorted by object ID. Supposedly that corresponds to the object's memory address, but it's not clear why that is so non-deterministic.
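
To make the behaviour described above concrete, here is a small sketch. It assumes the Python 2 fallback of ordering same-type objects by address, which is imitated here with `key=id` because Python 3 raises a `TypeError` when asked to sort objects that define no comparison methods. The `Species` class and both helper functions are hypothetical.

```python
class Species:
    # Hypothetical minimal class with no comparison methods and no sort key.
    def __init__(self, index):
        self.index = index


def order_by_identity(species):
    # Imitates the fallback: order by id(), which in CPython is the
    # object's memory address and so depends on allocation history.
    return [s.index for s in sorted(species, key=id)]


def order_by_index(species):
    # An explicit key ties the order to the data, not to where objects live.
    return [s.index for s in sorted(species, key=lambda s: s.index)]


first_run = [Species(16), Species(34)]
second_run = [Species(34), Species(16)]  # same species, created in another order

# The identity-based order may or may not agree between the two lists.
print(order_by_identity(first_run), order_by_identity(second_run))
# The index-based order always agrees.
print(order_by_index(first_run), order_by_index(second_run))
```

Anything that changes when or where objects are allocated can change the identity-based result, whereas the keyed sort depends only on the index values.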

rwest (Member, Author) commented Jun 6, 2019

Wow. A couple of four-year-old issues that even look like they have solutions proposed!
Thanks for digging those out. I should have done a better search myself.
Whoever fixes this can close three issues at once :-)

rwest (Member, Author) commented Aug 27, 2019

This one is still bugging me! But I'm hopeful the recent changes to sorting and comparison methods I spotted on the Python 3 branch may put this to bed...

mliu49 (Contributor) commented Dec 4, 2019

I think this should be fixed now.

Species sorting now uses comparison methods (implemented in 0c0c7ed), which use the new sorting_key property (implemented in c90c407). This will provide deterministic sorting, although the current implementation depends on the species label and index, which can change depending on the thermo libraries used and the order in which species are generated.
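
For readers who have not looked at those commits, the general pattern can be sketched roughly as follows. This is a hypothetical, simplified class written for illustration, not the code from the commits; it assumes the key is a `(label, index)` tuple, based only on the description above.

```python
from functools import total_ordering


@total_ordering
class Species:
    # Hypothetical sketch of the pattern; not the RMG-Py class.
    def __init__(self, label, index):
        self.label = label
        self.index = index

    @property
    def sorting_key(self):
        # Assumed key: (label, index), per the description above.
        return (self.label, self.index)

    def __eq__(self, other):
        if not isinstance(other, Species):
            return NotImplemented
        return self.sorting_key == other.sorting_key

    def __lt__(self, other):
        if not isinstance(other, Species):
            return NotImplemented
        return self.sorting_key < other.sorting_key

    def __hash__(self):
        return hash(self.sorting_key)

    def __repr__(self):
        return f"{self.label}({self.index})"


print(sorted([Species("CH4", 34), Species("OH", 16), Species("CH4", 2)]))
# [CH4(2), CH4(34), OH(16)]
```

With a key like this, the order is reproducible for a given set of labels and indices, but it shifts if a label or index changes between runs, which is the caveat noted above.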

mliu49 closed this as completed Dec 4, 2019