Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implementation of linear-QSSA based on GasKinetics #437

Closed
wants to merge 13 commits into from

Conversation

haowu80s
Copy link

Fixes # .

Changes proposed in this pull request:

  • GasQSSKinetics derived from GasKinetics
  • Takes two thermo object, one for bulk species and one for QSS species
  • Linear QSSA is solved by SparseQR from Eigen. SparseLU may be faster but QR is rank revealing, which ensures correctness at singularity.
  • Update ctml_writer.py to read qssa_gas from CTI files.
  • About 10% more expensive than a corresponding skeletal mech.

The functionality is complete and has been tested. Development is still "in progress", in terms of tests, samples, and formatting/naming. Need some input from the community to finalize it.

@codecov
Copy link

codecov bot commented Feb 22, 2017

Codecov Report

Merging #437 into master will increase coverage by 0.6%.
The diff coverage is 0%.

@@            Coverage Diff            @@
##           master     #437     +/-   ##
=========================================
+ Coverage   57.44%   58.04%   +0.6%     
=========================================
  Files         377      382      +5     
  Lines       40273    40455    +182     
  Branches     6701     6763     +62     
=========================================
+ Hits        23133    23482    +349     
+ Misses      15178    15010    -168     
- Partials     1962     1963      +1
Impacted Files Coverage Δ
src/kinetics/Kinetics.cpp 78.15% <ø> (ø)
include/cantera/kinetics/GasQSSKinetics.h 0% <ø> (ø)
src/kinetics/GasQSSKinetics.cpp 0% <ø> (ø)
include/cantera/thermo/PDSS_ConstVol.h 0% <ø> (-100%)
include/cantera/base/ValueCache.h 50% <ø> (-50%)
include/cantera/base/FactoryBase.h 14.28% <ø> (-35.72%)
src/thermo/MultiSpeciesThermo.cpp 61.16% <ø> (-7.77%)
src/thermo/PDSS_ConstVol.cpp 56.25% <ø> (-5%)
src/thermo/RedlichKisterVPSSTP.cpp 37.74% <ø> (-2.9%)
src/thermo/PDSS_IdealGas.cpp 56.25% <ø> (-1.82%)
... and 34 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a02753a...1d8298a. Read the comment docs.

@speth
Copy link
Member

speth commented Feb 23, 2017

Thanks for working on this; it looks very interesting. Can you please add some examples/tests? It is not entirely clear to me how one is supposed to use this class, e.g. what is the _buff_full argument to the constructor for?

@haowu80s
Copy link
Author

haowu80s commented Feb 23, 2017

A simple example ("qss_ignition") is added, which demonstrates how to use GasQSSKinetics with a constant pressure reactor.

The buffer is no longer needed to be provided by the user. The buffer was used to hold the rate of production/destruction for all species (including qss ones), while the corresponding methods only takes and outputs those for the bulk species. The usage of a user supplied buffer is now replaced by the usage of m_grt.

I'm not very familiar with the testing part. May need some help if tests are desired.

I also added the corresponding skeletal mechanism of the QSSA ones used in the sample for reference.

Copy link
Member

@speth speth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this. I think my one big question with this is that given that for the provided example the QSS version is slower in the sample ignition delay problem, under what circumstances would it be expected to offer a meaningful performance benefit?

It's good that you already have this working based on the CTI input file format. One thing that we still need to do is get this class set up so that a GasQSSKinetics object can be created via the factory functions so that they can be created in Python and Matlab as well.

In addition to addressing the specific comments made on various parts of the code, one other change that should be made is to remove the reversion of the fmt submodule from version 3.0.1 to 3.0.0.


// This file is part of Cantera. See License.txt in the top-level directory or
// at http://www.cantera.org/license.txt for license and copyright information.
// Created by Hao Wu (wuhao@stanford.edu), Feb. 2017
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should add yourself to the AUTHORS file instead of here


GasQSSKinetics(thermo_t *thermo = 0);

virtual ~GasQSSKinetics();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't need the destructor if it does nothing (only needed in the base class)

@@ -20,7 +20,7 @@ export PYTHON_CMD
PATH=@ct_bindir@:$PATH
export PATH

if [ "@python_cmd@" != `which python` ]; then
if [ "@python_cmd@" != "which python" ]; then
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the purpose of this change? The backticks are deliberate...

CH2OCH2O2H""",
reactions="all",
phases="DME_SK39",
initial_state=state(temperature=300.0, pressure =OneAtm) )
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A separate input file duplicating all the thermo data and reactions isn't really needed -- The three phase definitions can just be put into a single file.

* Kinetics manager for elementary gas-phase chemistry with QSSA species.
* @ingroup kinetics
*/
class GasQSSKinetics : public GasKinetics
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment block for the class should have at least a short description of the model, a reference to a source describing the QSSA model, and provide some idea of how to use the class (i.e. the requirements for the two IdealGasPhase objects this uses).

}
// fill m_ropf_noqss
if (!qss_rts.size()) { // if no qss reactants
for (const auto pd : qss_pds) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In cases where the type is something as simple as size_t, I think it is more clear to use the type directly than const auto. auto is certainly preferred for more complex types, of course.

std::vector<std::vector<size_t>> m_ropr_qss;
//
static const char IROPF = 1;
static const char IROPR = 1 << 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These can be declared as static const variables in the .cpp file, and don't need to be exposed as class members. (Also probably worth a comment explaining what they do).

// make sparsity pattern for m_rop_qss
vector< Triplet<double> > tripletList;
for (size_t k = 0; k < m_nSpeciesQSS; k++) {
tripletList.push_back(Triplet<double>(k, k, 1.));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consider writing as tripletList.emplace_back(k, k, 1.);

//
static const char IROPF = 1;
static const char IROPR = 1 << 1;
std::vector<char> m_ifr_qss;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to explain what this variable does

vector<vector<size_t>>::iterator it_ropf_qss = m_ropf_qss.begin();
vector<vector<size_t>>::iterator it_ropr_qss = m_ropr_qss.begin();
for (int k = 0; k < m_rop_qss.outerSize(); ++k)
for (SparseMatrix<double>::InnerIterator it(m_rop_qss,k); it; ++it) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think some more explanation of what is happening here would be useful.

@haowu80s
Copy link
Author

Hi Ray,
Thanks for the feedback.
A quick answer to the performance question. The performance gain for a QSSA mechanism comes mostly from the following three factors.

  1. The reduced number of indepedent species makes it cheaper to numerically evaluate and solve the Jacobian system for implicit ode solvers.
  2. For advection-reaction-diffusion systems, the reduced species also means fewer scalar conservation equations to solve.
  3. The QSS species are the stiff ones. The removal of them typically results in a less stiff mechanism and makes it possible for using explicit ode solvers in certain cases.

@speth
Copy link
Member

speth commented Mar 15, 2017

Right, I understand why the QSSA method should lead to performance improvements, my question was why, at least for the example problem given, it apparently doesn't -- I ran a loop to calculate the ignition delay 100 times using both the skeletal and QSS versions of the mechanism, and found the QSSA version to be slightly slower.

I think an ignition problem should be as good a place as any to see a performance increase, since you should see both the improvement in reducing the number of species (and the cost of the species number cubed cost of Jacobian factorization) and any reduction in stiffness via the variable timestep of the CVODES solver. However, I only saw a ~2% reduction in the number of internal timesteps needed to solve during the ignition delay period, and I guess the savings in Jacobian evaluation are wiped out by the extra work involved in the QSSA algorithm.

@skyreflectedinmirrors
Copy link
Contributor

I tend to agree with @speth. For a problem this sized (30 bulk, 9 QSS species) the majority of the simulation time is likely to be spent in FD Jacobian evaluation, hence we should expect a considerable reduction in overall simulation time from putting ~1/4 of the species as QSS.

I wonder what looking at say, the number of Jacobian evaluations in CVODE between the QSS and non-QSS baseline would reveal. Is the QSS case causing more discontinuities or invalidating the Jacobian more often? That might explain the difference.

@haowu80s
Copy link
Author

Ray,
I guess I miss understood your questions.

You are perfectly right that QSSA does not give much, if any, performance boost for a 0D ignition problem of this size, at least in the way cantera solves it. A quick back-of-envelop calculation to show this and one can benchmark it to verify :

Let's say a full RHS evaluation for the 39-species mechanism is of cost 1, and the corresponding 30-species QSS version is 1.1. Within the RHS evaluation, update_T is of cost 0.6, which is the same among the two. Matrix factorizing is of cost 2 for 39x39 and 1 for 30x30.

Jacobian evaluation for the 39-species mechanism is of cost 2 + 39x(1-0.6) (species perturbation avoids update_T). By the same token, it is 2.2 + 30x(1.1-0.6) for the 30-species QSS.

All in all, for the 39-species mech. the cost is 2 + 39x0.4 + 2 = 19.2, while for the 39-species QSS version, it is 2.2 + 30x0.5 + 1 = 18.2. The saving will be more significant if one uses CKWYP type of subroutines, which does not cache temperature related calculations.

In addition, CVODE only performs 1 Jacobian evaluation for as much as 50 steps. So the saving in Jacobian evaluations is really minimal for a system of this size.

Furthermore, the internal step size taken by CVODE/BDF is typically bounded by the accuracy constraints for an ignition problem, not the stability constraints. So a less stiff system does not really lead to fewer steps.

On a more positive note, the applications where I saw meaningful gains via QSSA are large-scale 3D calculations. QSSA reduces the number of scalar transport equations to solve and also enables the usage of explicit time-stepping as the CFL limited step size is rather small to begin with.

@speth
Copy link
Member

speth commented Mar 15, 2017

That makes sense, and seems to fit with a bit of profiling I just did. There is definitely a reduction in the time spent on Jacobian factorization, but for this problem size, that is gain is eaten up in the QSS algorithm. So maybe it would be effective for larger mechanisms where the Jacobian factorization becomes a bigger issue, if you are able to identify enough QSS species (but that's a separate problem).

@speth speth closed this Nov 18, 2018
@ghost
Copy link

ghost commented Dec 13, 2020

I tried @haowu80s 's LQSSA algorithm for a larger QSSA problem using mechanism from Prof. T.F. Lu. Small acceleration is observed. :-|

cd $HOME/download
wget http://spark.engr.uconn.edu/mechs/PRF3.zip
unzip PRF3.zip
cd PRF3/sk171
ck2cti --input chem.inp --output sk171.cti  --thermo therm.dat --transport tran.dat --permissive
ctml_writer sk171.cti
cd ../rd116
ck2cti --input chem.inp --output rd116.cti  --thermo therm.dat --transport tran.dat --permissive
ctml_writer rd116.cti

Then I manually modified the xml file to meet the need of LQSSA implementation.

test condition:
export OMP_NUM_THREADS=1
gas.setState_TPX(1000.0, OneAtm, "IC8H18:0.012766 , NC7H16:0.012766 , O2:1 , N2:3.76"); // ER=0.3, "PRF50", it should be 50%:50% by liquid volume.
IdealGasReactor r; // const volume reactor
dt =1e-4 ;// secs
N = 1000 ;//steps

The result is:

ODE - 14.563s
LQSSA - 11.315s

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants