Sampling Reactors #1331

mjohnson541 · 2018-03-26T21:48:44Z

This pull request enables reactors to sample over a range of conditions rather than a single condition. Conditions are chosen by maximizing distance from prior run conditions (weighted by how recent they were, the conversion/time reached, and how many objects they returned). These variable-condition reactors terminate after a defined number of iterations (nSimsTerm). This should allow us to avoid missing chemistry that occurs between two fixed reactors and to run simulations more efficiently.

codecov · 2018-03-26T22:18:33Z

Codecov Report

Merging #1331 into master will decrease coverage by 0.18%.
The diff coverage is 16.07%.

@@            Coverage Diff             @@
##           master    #1331      +/-   ##
==========================================
- Coverage   42.24%   42.06%   -0.19%     
==========================================
  Files         168      168              
  Lines       27508    27695     +187     
  Branches     5367     5432      +65     
==========================================
+ Hits        11621    11650      +29     
- Misses      15130    15275     +145     
- Partials      757      770      +13

| Impacted Files | Coverage Δ | |
|---|---|---|
| rmgpy/tools/fluxdiagram.py | `9.5% <ø> (ø)` | ⬆️ |
| rmgpy/reduction/reduction.py | `19.11% <0%> (ø)` | ⬆️ |
| rmgpy/rmg/input.py | `40.87% <33.8%> (-1.5%)` | ⬇️ |
| rmgpy/rmg/main.py | `22.89% <7.89%> (-2.24%)` | ⬇️ |
| rmgpy/data/kinetics/family.py | `59.22% <0%> (+0.4%)` | ⬆️ |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 5a07cc9...ea83368.

alongd · 2018-03-28T20:45:57Z

Could you add some documentation on this feature?

mjohnson541 · 2018-04-01T22:57:29Z

Documentation should be added now.

alongd

Thanks for this PR! Please see comments/questions below.

alongd · 2018-04-05T18:07:52Z

documentation/source/users/rmg/input.rst

+Advanced Setting: Range Based Reactors
+-------------------------------------------------
+
+Under this setting rather than use reactors at fixed points reaction conditions are sampled from a range of conditions.  


Please add a comma after "at fixed points,"

alongd · 2018-04-16T00:18:11Z

documentation/source/users/rmg/input.rst

+-------------------------------------------------
+
+Under this setting rather than use reactors at fixed points reaction conditions are sampled from a range of conditions.  
+Conditions are chosen by maximizing distance from prior run conditions (weighted by how recent they were, the 


Let's add "(in terms of T, P, concentrations)" after "maximizing distance".
Also, what are the constraints? Simply maximizing the distance would mean choosing the most extreme T/P in the given range. Or did I misunderstand something?

They're weighted; I've added "maximizing weighted distance (in terms of ....".

alongd · 2018-04-16T00:21:52Z

examples/rmg/SR_test/input.py

+simpleReactor(
+    temperature=[(1000,'K'),(1500,'K')],
+    pressure=[(1.0,'bar'),(10.0,'bar')],
+    nSimsTerm=12,


Any particular reason that 12 is used here? Does it work best from your experience?

12 was my estimate of the number of real reactors needed to do this system properly. I think 24 might be a more robust choice, but that makes the example run long. I'll add a comment noting that.

Could you direct me to the comment that was added?

Sorry, in my tests it turns out 12 is actually a really good number for this one so I don't think a comment is necessary. I've also added a rule of thumb for choosing nSimsTerm to the documentation.

alongd · 2018-04-16T00:38:26Z

rmgpy/rmg/input.py

                  ):
    logging.debug('Found SimpleReactor reaction system')

    for value in initialMoleFractions.values():
-        if value < 0:
+        if value != list and value < 0:


Shouldn't this be if not isinstance(value, list) and ...?

I'm really embarrassed about these ones...

rmgpy/rmg/input.py

alongd · 2018-04-16T02:24:49Z

rmgpy/rmg/main.py

@@ -764,15 +798,15 @@ def execute(self, **kwargs):
        # Run sensitivity analysis post-model generation if sensitivity analysis is on
        for index, reactionSystem in enumerate(self.reactionSystems):

-            if reactionSystem.sensitiveSpecies:
+            if reactionSystem.sensitiveSpecies and reactionSystem.sensConditions:


I think that sensConditions is None if ranges aren't specified. Does this condition allow "normal" SA to be run?

It should do so now since the non-ranged values will default as sensConditions

alongd · 2018-04-16T02:26:41Z

rmgpy/rmg/main.py

@@ -782,8 +816,9 @@ def execute(self, **kwargs):
                    pdepNetworks = self.reactionModel.networkList,
                    sensitivity = True,
                    sensWorksheet = sensWorksheet,
-                    modelSettings = self.modelSettingsList[-1],
+                    modelSettings = ModelSettings(toleranceMoveToCore=1e8,toleranceInterruptSimulation=1e8),


Could you explain this change?

Previously, RMG was guaranteed not to add objects during the sensitivity runs, because otherwise the run would not have terminated. Now, if we don't do this, the sensitivity analysis can terminate without completing the simulation.

alongd · 2018-04-16T02:28:12Z

rmgpy/rmg/mainTest.py

+            Rmem.getCond()
+            Rmem.addtConvN(1.0,.2,2)
+            Rmem.generateCond()
+            Rmem.getCond()


Perhaps add relevant assertions and helpful error messages?

I couldn't really think of anything I felt the need to specifically check in the object other than that the basic operations could be done. It is deterministic, so I could hard-code the test (check whether it's generating the same condition), but unless I'm checking something I have reason to worry about, that just forces someone to change the hard-coded numbers any time they change something in the weighting or optimization algorithm.

alongd · 2018-04-16T02:30:19Z

rmgpy/solver/base.pyx

+        if conditions:
+            isConc = hasattr(self,'initialConcentrations')
+            keys = conditions.keys()
+            if 'T' in keys and hasattr(self,'T'):


Could T be in keys w/o having a self.T attribute?

This if statement fundamentally just guarantees that the next line won't error. It was made to be general because base.pyx stuff has to work with any kind of reactor we propose.

alongd · 2018-04-16T02:32:31Z

rmgpy/solver/base.pyx

+            for k in keys:
+                if isConc:
+                    if k in self.initialConcentrations.keys():
+                        self.initialConcentrations[k] = Quantity(conditions[k],'mol/m^3')


I'm not too familiar with LiqReactor, but couldn't the conditions be specified in different units? If so, we're missing a conversion

The RMG_Memories object only outputs stuff in SI units

mjohnson541 · 2018-04-28T19:20:41Z

Ok, I've run tests I want to and updated the documentation accordingly to give better guidelines. I believe I've also fixed the issues you pointed out.

mjohnson541 · 2018-04-28T21:45:02Z

I made some significant changes to the documentation so if you could look that over I'd appreciate it.

mliu49 · 2018-04-28T23:13:40Z

I would like to see some more test results before this gets merged, including but not limited to:

RMG-tests
Comparison of output/runtime between multiple reactors vs. sampling reactor over a T range (and potentially P and composition ranges as well)

To be honest, the mistakes which Alon pointed out were a bit concerning. It would be best if you could improve the test coverage of new code which you added.

alongd · 2018-04-29T01:39:53Z

rmgpy/rmg/input.py

+            initialMoleFractions[key] = float(value)
+            if value < 0:
+                raise InputError('Initial mole fractions cannot be negative.')
+        elif isinstance(value,list):


Would it be enough to just write else:?

alongd · 2018-04-29T01:42:28Z

rmgpy/rmg/input.py

+            if value < 0:
+                raise InputError('Initial mole fractions cannot be negative.')
+        elif isinstance(value,list):
+            initialMoleFractions[key] = [float(value[0]),float(value[1])]


Do we check for cases where the user inputs the concentration as a list but with just one value (i.e., value[1] does not exist)? I think this should be a separate error, as it might happen frequently when modifying an input file that has some concentration ranges.

I've added InputErrors with useful messages for these cases.
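The checks discussed in this thread can be sketched roughly as follows. This is a minimal illustration, not the code merged in the PR: `check_mole_fractions` is a hypothetical helper name, the error messages are made up, and `InputError` is redefined locally as a stand-in so the snippet runs on its own.

```python
class InputError(Exception):
    """Local stand-in for RMG's input exception (assumption: keeps the sketch self-contained)."""


def check_mole_fractions(initial_mole_fractions):
    """Return validated mole fractions; a value is either a number or a [low, high] range."""
    checked = {}
    for key, value in initial_mole_fractions.items():
        if isinstance(value, (list, tuple)):
            # A range must have exactly two entries, otherwise value[1] would fail.
            if len(value) != 2:
                raise InputError(
                    'Mole fraction range for {0} must have exactly two values.'.format(key)
                )
            low, high = float(value[0]), float(value[1])
            if low < 0 or high < 0:
                raise InputError('Initial mole fractions cannot be negative.')
            checked[key] = [low, high]
        else:
            value = float(value)
            if value < 0:
                raise InputError('Initial mole fractions cannot be negative.')
            checked[key] = value
    return checked


checked = check_mole_fractions({'C2H6': 0.1, 'O2': [0.1, 0.3]})
```

Using `isinstance` rather than comparing the value to the `list` type is what makes the range branch reachable, which was the point of the earlier review comment on this file.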

mjohnson541 · 2018-04-29T21:11:23Z

@mliu49 I've greatly improved coverage from the original PR. However, it doesn't show up in the report because, apart from the new RMG_Memory objects, it makes the most sense to test the rest of this system using functional tests. I believe nearly all of the lines I've added are now covered by the functional tests.

I'm planning to present some results related to the output in RMG meeting this week. What question are you trying to answer with runtime information?

mliu49 · 2018-04-30T03:03:57Z

I understand that it may be difficult to write unit tests for these changes, and I wasn't referring explicitly to the coverage percentage. I mainly meant that the fact that you didn't catch those errors during testing suggested that you weren't running the appropriate tests, whether unit or functional.

I mentioned runtime comparison since you said in your initial comment that this should provide improved efficiency. That makes sense, but I was hoping you could quantify it, even if just in one example case.

mjohnson541 · 2018-04-30T03:33:51Z

I apologize for the state of the original PR, but I'm pretty confident the tests I've added since are sufficient to thoroughly test this feature.

Ok, I expect I can demonstrate that.

mjohnson541 · 2018-05-04T15:46:24Z

After discovering a space-sampling issue with my original method for choosing conditions, I discussed the sampling problem with Kevin Silmore from the Swan lab, and he proposed a different sampling algorithm, which I've now adopted in this PR. In this algorithm we evaluate the objective function on a grid of condition points, normalize these values, and then choose a grid point at random, with each point's probability of being chosen equal to its normalized objective function value. After a grid point is chosen, a random step (of maximum length half the distance to the next grid point) is taken from it to give the final sample point.
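A rough sketch of that sampling scheme on normalized conditions in [0, 1] is given below. It is illustrative only and is not the RMG implementation: `sample_condition` and `distance_to_prior` are hypothetical names, the objective shown (squared distance to the nearest prior condition) is an assumption, and the random step is applied per axis as a simplification.

```python
import itertools
import random


def sample_condition(objective, n_dims, n_grid=5, seed=0):
    """Pick one point in the unit hypercube [0, 1]^n_dims.

    `objective` maps a tuple of normalized coordinates to a non-negative weight.
    """
    rng = random.Random(seed)  # seeded, so repeated calls give the same point
    axis = [i / (n_grid - 1) for i in range(n_grid)]
    grid = list(itertools.product(axis, repeat=n_dims))

    # Evaluate the objective on the grid; random.choices treats the weights
    # as relative, which is equivalent to normalizing them to sum to one.
    weights = [objective(point) for point in grid]
    chosen = rng.choices(grid, weights=weights, k=1)[0]

    # Random step of at most half the grid spacing along each axis,
    # clipped so the result stays inside the allowed range.
    half_step = 0.5 / (n_grid - 1)
    return tuple(
        min(1.0, max(0.0, x + rng.uniform(-half_step, half_step))) for x in chosen
    )


def distance_to_prior(prior):
    """Hypothetical objective: squared distance to the nearest prior condition."""
    def objective(point):
        return min(sum((a - b) ** 2 for a, b in zip(point, p)) for p in prior)
    return objective


prior_conditions = [(0.0, 0.0), (1.0, 1.0)]
new_point = sample_condition(distance_to_prior(prior_conditions), n_dims=2)
```

Because grid points that coincide with a prior condition get zero weight here, they are never selected, while the random step lets the final sample fall anywhere between grid points.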

mjohnson541 · 2018-05-04T20:59:34Z

The determinism issue was unrelated to random number generation; it came from non-deterministic solver output. With termination conversion and time removed from the objective function, the algorithm is now as deterministic as regular RMG.

mjohnson541 · 2018-05-04T21:10:27Z

Ok, I believe I've fixed everything that needed fixed in this.

mjohnson541 · 2018-05-09T16:08:19Z

Where does that occur?

mjohnson541 · 2018-05-09T18:08:03Z

Ok, I believe I've fixed all of the above issues.

mjohnson541 · 2018-05-15T20:40:36Z

Are there any other issues or can we merge this?

alongd · 2018-05-16T00:54:40Z

Could you take a look at the Codacy report? I think there are some recommendations there we could accept. Everything else looks good to me.

…nated

This system allows RMG to choose new conditions for each simulation using the Weighted Stochastic Grid Sampling Algorithm (idea credit to Kevin Silmore) described below. First, all conditions are normalized (P is normalized in log P units). An objective function is defined that weights conditions based on their distance from prior conditions, the number of objects returned, and how recent the prior condition was. This objective function is evaluated on a multidimensional grid, and the values at the grid points are normalized so that they sum to one. A grid point is then chosen randomly, with each point's probability equal to its normalized objective function value. After this, a random step (maximum magnitude sqrt(2)/2 times the distance to another grid point) is taken from the grid point to give a final condition. This lets it sample from the entire condition range. The random number generator is seeded, so the process is deterministic. The RMG_Memory object keeps track of all of this. The number of simulations completed successfully (without adding objects) is also tracked to determine when the reactor can complete.
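The normalization step mentioned in this commit message can be illustrated as below. This is a sketch under stated assumptions, not the RMG_Memory code: `normalize` and `denormalize` are hypothetical helpers, and the ranges are example values (the temperature range from the example input, and 1 to 10 bar written in Pa).

```python
import math


def normalize(value, low, high, log_scale=False):
    """Map a value in [low, high] to [0, 1]; work in log units if log_scale is True."""
    if log_scale:
        value, low, high = math.log(value), math.log(low), math.log(high)
    return (value - low) / (high - low)


def denormalize(fraction, low, high, log_scale=False):
    """Inverse of normalize: map a fraction in [0, 1] back to a physical value."""
    if log_scale:
        return math.exp(math.log(low) + fraction * (math.log(high) - math.log(low)))
    return low + fraction * (high - low)


T_range = (1000.0, 1500.0)  # K
P_range = (1.0e5, 1.0e6)    # Pa

# The midpoint of the normalized range is the arithmetic mean for T,
# but the geometric mean for P because P is handled in log units.
t_mid = denormalize(0.5, *T_range)
p_mid = denormalize(0.5, *P_range, log_scale=True)
```

Working in log P means a pressure range spanning several orders of magnitude is sampled evenly per decade rather than being dominated by its high end.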

…actors Also adaptation of conversion calculations and the return type of simulate so that the time and conversion can be put into the RMG_Memory objects. Addition of sensConditions for specifying the conditions under which to do sensitivity analysis.

mjohnson541 · 2018-05-16T20:47:38Z

Ok, I've fixed some of codacy's recommendations.

If a balance species is set, mole fractions are generated as follows: the ranged mole fractions are varied within their ranges according to the algorithm while the others remain constant, and then the balance species is adjusted so that all of the mole fractions sum to one. This keeps all mole fractions except that of the balanceSpecies within their ranges.
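The balance-species adjustment described here might look like the following. It is a minimal sketch, not RMG's implementation: `apply_balance` is a hypothetical helper, the species and values are made up, and the error raised when the other fractions exceed one is an assumption.

```python
def apply_balance(mole_fractions, balance_species):
    """Return a copy in which balance_species absorbs the remainder so the total is one."""
    others = sum(x for name, x in mole_fractions.items() if name != balance_species)
    if others > 1.0:
        # Assumption: treat this as an error rather than a negative balance.
        raise ValueError('Non-balance mole fractions sum to more than one.')
    adjusted = dict(mole_fractions)
    adjusted[balance_species] = 1.0 - others
    return adjusted


# C2H6 and O2 stand for sampled or fixed values; N2 is the balance species.
sampled = {'C2H6': 0.12, 'O2': 0.21, 'N2': 0.79}
balanced = apply_balance(sampled, 'N2')
```

Only the balance species changes, so the sampled species stay exactly where the sampling algorithm put them.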

alongd

Great

mjohnson541 changed the title ~~Sampling reactors~~ Sampling Reactors Mar 26, 2018

mjohnson541 force-pushed the SamplingReactors branch from aaa4ca1 to 8dd9d0d Compare March 27, 2018 00:38

mjohnson541 added the Type: Feature label Mar 27, 2018

mjohnson541 force-pushed the SamplingReactors branch from 8dd9d0d to 06e85e6 Compare March 27, 2018 15:40

mjohnson541 added the Status: Ready for Review PR is complete and ready to be reviewed label Mar 27, 2018

mjohnson541 requested a review from alongd March 27, 2018 16:06

alongd reviewed Apr 16, 2018

mjohnson541 force-pushed the SamplingReactors branch 4 times, most recently from 5c1f051 to 021490e Compare April 28, 2018 19:16

mjohnson541 force-pushed the SamplingReactors branch from 021490e to 432814b Compare April 28, 2018 19:54

alongd reviewed Apr 29, 2018

mjohnson541 force-pushed the SamplingReactors branch from 432814b to 4108eb2 Compare April 29, 2018 20:14

mjohnson541 force-pushed the SamplingReactors branch from 4108eb2 to a38fc24 Compare May 4, 2018 15:37

mjohnson541 force-pushed the SamplingReactors branch 2 times, most recently from 71cca85 to 0045ce9 Compare May 4, 2018 20:57

mjohnson541 force-pushed the SamplingReactors branch from 0045ce9 to 410bb68 Compare May 4, 2018 21:09

mjohnson541 force-pushed the SamplingReactors branch from 410bb68 to 4cf6f20 Compare May 9, 2018 18:05

mjohnson541 force-pushed the SamplingReactors branch 2 times, most recently from 62c168a to 6686976 Compare May 14, 2018 19:45

mjohnson541 added 10 commits May 16, 2018 16:34

Allow input to allow specification of temperature and pressure ranges

9b1ff15

adapt simulate to return the time and conversion at which a run termi…

5f21807

…nated

adapt simple and liquid reactors to deal with T and P range input

aaa0241

adapt simple and liquid reactors initialization to take a P and T input

0f2240a

adaption of tests and tools systems to match new syntax

43f130a

Add sampling reactor example for ethane oxidation

0e4dc4d

Add testing for RMG_Memory class

7679db4

Adjust sensitivity analysis to work with range based reactors

93a02f6

mjohnson541 force-pushed the SamplingReactors branch from 6686976 to 232ebb8 Compare May 16, 2018 20:47

mjohnson541 added 4 commits May 16, 2018 17:38

Added range based reactors example

5064ae3

added documentation for range based reactors

dae1585

fix checking of pdep input

ea83368

mjohnson541 force-pushed the SamplingReactors branch from 232ebb8 to ea83368 Compare May 16, 2018 21:41

alongd approved these changes May 16, 2018

mjohnson541 merged commit ef51186 into master May 16, 2018

mjohnson541 deleted the SamplingReactors branch May 16, 2018 22:26

rwest mentioned this pull request Apr 3, 2019

Surface reactor type doesn't do range reactors fully cfgoldsmith/RMG-Py#73

Open

mliu49 mentioned this pull request Oct 3, 2019

Reactors no longer normalize mole fractions #1750

Closed

kspieks mentioned this pull request Nov 5, 2019

Allow mole fractions to be normalized if they do not sum to one #1809

Merged
