-
Notifications
You must be signed in to change notification settings - Fork 227
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Sampling Reactors #1331
Sampling Reactors #1331
Conversation
Codecov Report
@@ Coverage Diff @@
## master #1331 +/- ##
==========================================
- Coverage 42.24% 42.06% -0.19%
==========================================
Files 168 168
Lines 27508 27695 +187
Branches 5367 5432 +65
==========================================
+ Hits 11621 11650 +29
- Misses 15130 15275 +145
- Partials 757 770 +13
Continue to review full report at Codecov.
|
aaa4ca1
to
8dd9d0d
Compare
8dd9d0d
to
06e85e6
Compare
Could you add some documentation on this feature? |
Documentation should be added now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this PR! please see comments/questions below
Advanced Setting: Range Based Reactors | ||
------------------------------------------------- | ||
|
||
Under this setting rather than use reactors at fixed points reaction conditions are sampled from a range of conditions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a comma after "at fixed points,"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
------------------------------------------------- | ||
|
||
Under this setting rather than use reactors at fixed points reaction conditions are sampled from a range of conditions. | ||
Conditions are chosen by maximizing distance from prior run conditions (weighted by how recent they were, the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's add(in terms of T, P, concentrations)
after maximizing distance
Also, what are the constraints? Simply maximizing the distance would mean choosing the most extreme T/P in the given range. Or did I miss understood something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They're weighted I've added "maximizing weighted distance (in terms of ...."
examples/rmg/SR_test/input.py
Outdated
simpleReactor( | ||
temperature=[(1000,'K'),(1500,'K')], | ||
pressure=[(1.0,'bar'),(10.0,'bar')], | ||
nSimsTerm=12, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any particular reason that 12 is used here? Does it work best from your experience?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
12 was my estimate of real reactors to do this system properly, I think 24 might be a more robust choice, but that makes the example run long. I'll add a comment noting that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you direct me to the comment that was added?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, in my tests it turns out 12 is actually a really good number for this one so I don't think a comment is necessary. I've also added a rule of thumb for choosing nSimsTerm to the documentation.
rmgpy/rmg/input.py
Outdated
): | ||
logging.debug('Found SimpleReactor reaction system') | ||
|
||
for value in initialMoleFractions.values(): | ||
if value < 0: | ||
if value != list and value < 0: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be if not isinstance(value, list) and ...
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm really embarrassed about these ones...
@@ -764,15 +798,15 @@ def execute(self, **kwargs): | |||
# Run sensitivity analysis post-model generation if sensitivity analysis is on | |||
for index, reactionSystem in enumerate(self.reactionSystems): | |||
|
|||
if reactionSystem.sensitiveSpecies: | |||
if reactionSystem.sensitiveSpecies and reactionSystem.sensConditions: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that sensConditions
is None
if ranges aren't specified. Does this condition allow "normal" SA to be run?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should do so now since the non-ranged values will default as sensConditions
@@ -782,8 +816,9 @@ def execute(self, **kwargs): | |||
pdepNetworks = self.reactionModel.networkList, | |||
sensitivity = True, | |||
sensWorksheet = sensWorksheet, | |||
modelSettings = self.modelSettingsList[-1], | |||
modelSettings = ModelSettings(toleranceMoveToCore=1e8,toleranceInterruptSimulation=1e8), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you explain this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before RMG was guaranteed to not add objects on the sensitivity runs because otherwise the run would not have terminated. Now if we don't do this the sensitivity analysis can terminate without completing the simulation.
rmgpy/rmg/mainTest.py
Outdated
Rmem.getCond() | ||
Rmem.addtConvN(1.0,.2,2) | ||
Rmem.generateCond() | ||
Rmem.getCond() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps add relevant assertions and helpful error messages?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I couldn't really think of anything I felt the need to specifically check in the object other than that the basic operations could be done. It is deterministic so I could hard code the test (check whether its generating the same condition), but I feel like unless I'm checking something I have reason to worry about that just forces someone to change the hard coded numbers anytime they change something in the weighing or optimization algorithm.
if conditions: | ||
isConc = hasattr(self,'initialConcentrations') | ||
keys = conditions.keys() | ||
if 'T' in keys and hasattr(self,'T'): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could T
be in keys
w/o having a self.T attribute?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This if statement fundamentally just guarantees that the next line won't error. It was made to be general because base.pyx stuff has to work with any kind of reactor we propose.
rmgpy/solver/base.pyx
Outdated
for k in keys: | ||
if isConc: | ||
if k in self.initialConcentrations.keys(): | ||
self.initialConcentrations[k] = Quantity(conditions[k],'mol/m^3') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not too familiar with LiqReactor, but couldn't the conditions be specified in different units? If so, we're missing a conversion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The RMG_Memories object only outputs stuff in SI units
5c1f051
to
021490e
Compare
Ok, I've run tests I want to and updated the documentation accordingly to give better guidelines. I believe I've also fixed the issues you pointed out. |
021490e
to
432814b
Compare
I made some significant changes to the documentation so if you could look that over I'd appreciate it. |
I would like to see some more test results before this gets merged, including but not limited to:
To be honest, the mistakes which Alon pointed out were a bit concerning. It would be best if you could improve the test coverage of new code which you added. |
rmgpy/rmg/input.py
Outdated
initialMoleFractions[key] = float(value) | ||
if value < 0: | ||
raise InputError('Initial mole fractions cannot be negative.') | ||
elif isinstance(value,list): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be enough to just write else:
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed
rmgpy/rmg/input.py
Outdated
if value < 0: | ||
raise InputError('Initial mole fractions cannot be negative.') | ||
elif isinstance(value,list): | ||
initialMoleFractions[key] = [float(value[0]),float(value[1])] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we check for cases where the user inputs the concentration as a list but with just one value? (i.e., value[1]
does not exist?) I think this should be a separate Error, it it might happen frequently if modifying an input file that has some concentration ranges.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added InputErrors with useful messages for these cases.
432814b
to
4108eb2
Compare
@mliu49 I've greatly improved coverage from the original PR, however it doesn't show up because except for testing the new RMG_Memory objects it makes the most sense to test the rest of this system using functional tests. I believe under the functional tests nearly all lines of code I've added are covered now. I'm planning to present some results related to the output in RMG meeting this week. What question are you trying to answer with runtime information? |
I understand that it may be difficult to write unit tests for these changes, and I wasn't referring explicitly to the coverage percentage. I mainly meant that the fact that you didn't catch those errors during testing suggested that you weren't running the appropriate tests, whether unit or functional. I mentioned runtime comparison since you said in your initial comment that this should provide improved efficiency. That makes sense, but I was hoping you could quantify it, even if just in one example case. |
I apologize for the state of the original PR, but I'm pretty confident the tests I've added since are sufficient to thoroughly test this feature. Ok, I expect I can demonstrate that. |
4108eb2
to
a38fc24
Compare
So after discovering a space sampling issue with my original method for choosing conditions I discussed the sampling problem with Kevin Silmore from the Swan lab and he proposed a different sampling algorithm I've now adopted in this PR. In this algorithm we evaluate the objective function at a grid of condition points, normalize these values and then choose a point where each point has probability of its normalized objective function value of being chosen. After a grid point is chosen a random step (of maximum length half the distance to the next grid point) is taken from the grid point to choose a final sample point. |
71cca85
to
0045ce9
Compare
The deterministic-ness issue was actually unrelated to random number generation and actually related to non-deterministic solver output. By removing termination conversion and time from the objective function the algorithm is now as deterministic as regular RMG. |
0045ce9
to
410bb68
Compare
Ok, I believe I've fixed everything that needed fixed in this. |
Where does that occur? |
410bb68
to
4cf6f20
Compare
Ok, I believe I've fixed all of the above issues. |
62c168a
to
6686976
Compare
Are there any other issues or can we merge this? |
Could you take a look at the Codacy report? I think there a some recommendations there we could accept. Everything else looks good to me. |
this system allows RMG to choose new conditions for each simulation using the Weighted Stochastic Grid Sampling Algorithm (idea credit to Kevin Silmore) discussed below. First all conditions are normalized (P is normalized in log P units). An objective function is defined that weighs conditions based on their distance from prior conditions, on the number of objects returned and how recent the prior condition was. This objective function is evaluated in a multidimensional grid and these evaluation points are normalized so that they sum to one. Then a condition is chosen randomly with each point being chosen with the probability of its normalized objective function value. After this a random step (maximum magnitude is sqrt(2)/2 times the distance to another grid point) is taken from the grid point to give a final condition. This causes it to sample from the entire condition range. The random number generator used is seeded so the process is deterministic. The RMG_Memory object keeps track of all of this. The number of simulations completed successfully (without adding objects) is also tracked to determine when the reactor can complete.
…actors Also adaption of conversion calculations and return type for simulate so that the time and conversion can be put into the RMG_Memory objects. Addition of sensConditions for specifying the conditions under, which to do sensitivity analysis.
6686976
to
232ebb8
Compare
Ok, I've fixed some of codacy's recommendations. |
If a balance species is set mole fractions are generated thus: the ranged mole fractions are varied within their ranges according to the algorithm while the others remain constant, then the balance species is adjusted so that all of the mole fractions sum to one. This causes all mole fractions except the balanceSpecies to remain in their ranges
232ebb8
to
ea83368
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great
The pull request enables reactors to sample over a range of conditions rather than a single condition. Conditions are chosen by maximizing distance from prior run conditions (weighted by how recent they were, the conversion/time reached and how many objects they returned). These variable condition reactors terminate after a defined number of iterations (nSimsTerm). This should allow us to avoid missing chemistry that occurs between two fixed reactors and improve our efficiency with simulations.