Implement Func1-derived tabulated functions #797

ischoegl · 2020-01-22T20:29:08Z

Checklist

- There is a clear use-case for this code change
- The commit message has a short title & references relevant issues
- Build passes (scons build & scons test) and unit tests address code coverage
- The pull request is ready for review

If applicable, fill in the issue number this pull request is fixing

Changes proposed in this pull request

- use existing C++ Func1.h framework
- create Python interface for new tabulated Func1 object and add unit tests

Implementing a tabulated function as a class derived from Func1 allows for straightforward integration into the existing framework without further code changes. An implementation in the C++ layer has the advantage that this solution is usable from (or at least can be extended to) all platforms.

Use case and benchmark

In [1]: import cantera as ct
   ...: import numpy as np
   ...: arr = np.array([[0, 2], [1, 1], [2, 0]])

In [2]: fcn1 = ct.Func1(arr)
   ...: fcn1(1.3)
Out[2]: 0.7

In [3]: fcn2 = ct.Func1(arr, interpolation='previous')
   ...: fcn2(1.3)
Out[3]: 1.0

In [4]: %timeit fcn1(1.3)
The slowest run took 29.33 times longer than the fastest. This could mean that an intermediate result is being cached.
10000000 loops, best of 3: 122 ns per loop

In [5]: time = arr[:, 0]
   ...: fval = arr[:, 1]
   ...: fcn3 = lambda t: np.interp(t, time, fval)

In [6]: %timeit fcn3(1.3)
The slowest run took 9.63 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 5.46 us per loop

In [7]: fcn4 = ct.Func1(fcn3)

In [8]: %timeit fcn4(1.3)
The slowest run took 13.10 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 5.76 us per loop

codecov · 2020-01-22T21:00:40Z

Codecov Report

Merging #797 into master will increase coverage by 0.02%.
The diff coverage is 50.84%.

@@            Coverage Diff             @@
##           master     #797      +/-   ##
==========================================
+ Coverage   71.39%   71.42%   +0.02%     
==========================================
  Files         372      372              
  Lines       43482    43580      +98     
==========================================
+ Hits        31045    31128      +83     
- Misses      12437    12452      +15

Impacted Files                              Coverage Δ
include/cantera/numerics/Func1.h            0.54% <10.00%> (+0.26%) ⬆️
src/numerics/Func1.cpp                      8.44% <59.18%> (+7.81%) ⬆️
src/oneD/Domain1D.cpp                       58.46% <0.00%> (-3.85%) ⬇️
include/cantera/oneD/Sim1D.h                62.50% <0.00%> (ø)
test/kinetics/kineticsFromYaml.cpp          97.75% <0.00%> (+0.10%) ⬆️
include/cantera/oneD/StFlow.h               91.83% <0.00%> (+1.13%) ⬆️
src/oneD/Sim1D.cpp                          73.01% <0.00%> (+1.28%) ⬆️
src/kinetics/KineticsFactory.cpp            96.34% <0.00%> (+1.34%) ⬆️
src/oneD/StFlow.cpp                         89.52% <0.00%> (+2.58%) ⬆️
include/cantera/oneD/OneDim.h               38.29% <0.00%> (+8.51%) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b449e83...6e637a9. Read the comment docs.

bryanwweber · 2020-01-23T17:12:03Z

This is great, and I'm glad the implementation turned out to be so simple. I disagree, however, that interpolation should be done between values. In my opinion, the nearest previous value should be returned. If we want to include interpolation, it should be an option that can be passed, IMO, e.g.,

ct.Func1(arr, interpolation=None)  # or 'None'?
ct.Func1(arr, interpolation='linear')
ct.Func1(arr, interpolation='quadratic')

etc.

ischoegl · 2020-01-23T19:44:53Z

@bryanwweber ... happy that this may be useful.

The main reason why I implemented a linear interpolation is that it is more physical/flexible, and already contains the previous value case via:

ct.Func1([0, 1, 1, 2], [0, 0, 1, 1]) # using the two argument version for clarity
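A minimal plain-Python sketch of this point, not Cantera code: `lerp_table` is a hypothetical helper that interpolates linearly and is assumed to clamp outside the table. With a repeated abscissa, the zero-width segment is never selected, so the result is a step.

```python
def lerp_table(times, values, t):
    """Piecewise-linear lookup (hypothetical helper, not Cantera's Tabulated1)."""
    for i in range(len(times) - 1):
        t0, t1 = times[i], times[i + 1]
        # A zero-width segment (t0 == t1) can never satisfy this test,
        # so a repeated abscissa simply produces a jump.
        if t0 <= t < t1:
            frac = (t - t0) / (t1 - t0)
            return float(values[i] + frac * (values[i + 1] - values[i]))
    # Outside the table: hold the end values (an assumption of this sketch).
    return float(values[-1] if t >= times[-1] else values[0])


times = [0, 1, 1, 2]
values = [0, 0, 1, 1]
samples = [0.0, 0.25, 0.99, 1.0, 1.5, 2.0]
step = [lerp_table(times, values, t) for t in samples]
print(step)
```

The value stays at 0 up to the repeated point and is 1 from there on, which is the behavior a 'previous value' lookup would give for the table ([0, 1, 2], [0, 1, 1]).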

Beyond that, I'm unsure about the fate of Func1.h, which defines methods for derivatives, etc. that are not broken out to Python. As I'm working from within that framework, things aren't straightforward, and I'd rather not get into this deeper than absolutely necessary.

PS: while I have a strong preference for interpolation (we don't necessarily have to mimic CHEMKIN here), adding the 'previous value' alternative is still relatively simple. I do not think that it's worth going into quadratic or spline variants in this PR.

ischoegl · 2020-01-27T15:57:04Z

@bryanwweber ... I ended up adding the non-interpolating option per your suggestion, i.e.

ct.Func1([0, 1, 1, 2], [0, 0, 1, 1], interpolation='previous') # CHEMKIN style
ct.Func1([0, 1, 1, 2], [0, 0, 1, 1], interpolation='linear')
ct.Func1([0, 1, 1, 2], [0, 0, 1, 1]) # defaults to 'linear'

I am proposing to leave the default as linear as it is more intuitive. The interpolation method strings match scipy.interpolate.interp1d to future-proof potential additions beyond this PR.
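For readers without a Cantera build at hand, the two modes can be mimicked in plain Python. This is only a sketch: `make_tabulated` is a hypothetical name, the clamping outside the tabulated range is an assumption, and nothing here reflects how the C++ class is actually written. The data are the three points from the PR description.

```python
from bisect import bisect_right


def make_tabulated(times, values, interpolation="linear"):
    """Return a callable over tabulated data (illustrative, not Cantera's API)."""
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("need at least two matching samples")
    if any(t1 < t0 for t0, t1 in zip(times, times[1:])):
        raise ValueError("times must be non-decreasing")
    if interpolation not in ("linear", "previous"):
        raise ValueError("unknown interpolation method: %r" % interpolation)

    def evaluate(t):
        # Assumption: hold the end values outside the tabulated range.
        if t <= times[0]:
            return float(values[0])
        if t >= times[-1]:
            return float(values[-1])
        # Index of the last sample at or before t, so times[i] <= t < times[i + 1].
        i = bisect_right(times, t) - 1
        if interpolation == "previous":
            return float(values[i])
        frac = (t - times[i]) / (times[i + 1] - times[i])
        return float(values[i] + frac * (values[i + 1] - values[i]))

    return evaluate


f_lin = make_tabulated([0, 1, 2], [2, 1, 0])                            # default: 'linear'
f_prev = make_tabulated([0, 1, 2], [2, 1, 0], interpolation="previous")
print(f_lin(1.3))   # about 0.7, as in Out[2] of the PR description
print(f_prev(1.3))  # 1.0, as in Out[3] of the PR description
```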

bryanwweber · 2020-01-27T16:22:29Z

@ischoegl Thanks for adding the "previous" method! I appreciate you taking the time to consider my suggestion and add that. I want to advocate for having the "previous" method be the default, let me know what you think.

I think there is also a physical justification for the "previous" method being the default, beyond compatibility with CHEMKIN (which I agree is not strictly necessary). The justification I'm thinking of is that by assuming a linear variation, we are assuming more about the input data than the user has provided us. We are assuming that if the user were to add additional values in between, they would fall on a straight line between the two surrounding points. However, there is no particular reason we should assume this; the data could just as easily vary quadratically, sinusoidally, etc. (note, I'm not suggesting these other options be implemented here). So on that basis, the case with the fewest assumptions on our part is that the data do not vary between the provided points. IMO this is the more reasonable assumption to make. What do you think?

ischoegl · 2020-01-27T16:35:34Z

> I want to advocate for having the "previous" method be the default, let me know what you think.

Sure! I am mainly following the precedent set by the default options of scipy.interpolate.interp1d and MATLAB's interp1, i.e. imho 'linear' is the expected behavior. There are also some practical reasons: this approach avoids discontinuous inputs when integrating ODEs. As a physical example, positions resulting from velocity inputs become C2 continuously differentiable, which avoids unphysical 'dirac'-type accelerations.
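The practical side of this argument can be illustrated numerically. The sketch below is not part of the PR: it hard-codes a hypothetical velocity table (0, 0), (1, 1), (2, 1) and estimates the acceleration across t = 1 with a central difference of width h.

```python
def v_linear(t):
    """Linear interpolation of the table (0, 0), (1, 1), (2, 1), clamped at the ends."""
    return min(max(t, 0.0), 1.0)


def v_previous(t):
    """'Previous value' lookup for the same table."""
    return 0.0 if t < 1.0 else 1.0


def accel(v, t, h):
    """Central-difference estimate of dv/dt at time t."""
    return (v(t + h / 2) - v(t - h / 2)) / h


for h in (1e-2, 1e-3, 1e-4):
    print(h, accel(v_linear, 1.0, h), accel(v_previous, 1.0, h))
```

For the interpolated input the estimate stays bounded as h shrinks, while for the stepwise input it grows like 1/h, which is the 'dirac'-type behavior referred to above.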

bryanwweber · 2020-01-27T17:28:09Z

> As a physical example, positions resulting from velocity inputs become C2 continuously differentiable, which avoids unphysical 'dirac'-type accelerations.

Hmm, this makes a lot of sense as well. In the past, I've always set the maximum time step in the integration to be the smallest time interval between data points, which I think helps with this problem. I wonder how much difference it would make practically.
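As a small sketch of that workaround, the step limit can be derived from the table itself. `smallest_interval` is a hypothetical helper and the time values are made up; how the limit is handed to the integrator is left out here.

```python
def smallest_interval(times):
    """Smallest positive spacing between consecutive time points."""
    # Repeated time points (zero spacing) are skipped on purpose.
    gaps = [t1 - t0 for t0, t1 in zip(times, times[1:]) if t1 > t0]
    if not gaps:
        raise ValueError("need at least two distinct time points")
    return min(gaps)


times = [0.0, 0.5, 0.6, 2.0]
max_step = smallest_interval(times)
print(max_step)  # roughly 0.1, the gap between 0.5 and 0.6
```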

> I am mainly following the precedent set by the default options of scipy.interpolate.interp1d and MATLAB's interp1, i.e. imho 'linear' is the expected behavior.

I think the expected behavior would be linear if the user is expecting interpolation to happen. I don't think it is obvious that interpolation should happen for tabular data input.

ischoegl · 2020-01-27T23:23:46Z

> I think the expected behavior would be linear if the user is expecting interpolation to happen. I don't think it is obvious that interpolation should happen for tabular data input.

Hmm. Based on past experience (using MATLAB/Scipy/etc.) I would automatically assume that tabulated input involves interpolation. I guess I've never used CHEMKIN much, so I'm not used to their convention. I.e. there may be no clear answer here.

I am still partial to the physics argument: if given the choice of C1 vs C0 input, I'd stick with the more 'natural' choice. My intuition would be that the system is less stiff for C1, as the ODE solver is unable to predict the effect of discontinuous input for C0. (Ps: similar arguments hold for higher derivatives, but C0 is imho the worst-case scenario for ODE input.)

Considering the alternatives, I'm still advocating for linear.

ischoegl · 2020-02-22T20:04:22Z

@speth ... while this is motivated by RCM work, would you mind having a quick look? Parts of this are reviving C++ Func1 objects where I’m not sure what their fate was intended to be.

- add new derived class to Func1.h that interpolates a tabulated function in C++

- the C++ defined Tabulated1 class is added as a variant of the Func1 object defined in Python - this approach is compatible with existing interfaces of various Python methods with Func1 arguments

- replace Python lambda function defining constant Func1 variants by pre-existing Const1 class defined in C++

speth

I have plenty of misgivings about the current implementation of the Func1 set of classes, but this barely touches on the problematic parts of that interface, so I think it should be okay and won't require too many changes if/when I get around to a more complete overhaul of Func1.

bryanwweber

Thanks @ischoegl! I have a few comments and questions about the Python interface here.

bryanwweber · 2020-04-03T19:09:00Z

interfaces/cython/cantera/func1.pyx

+                if arr.shape[1] == 2:
+                    time = arr[:, 0]
+                    fval = arr[:, 1]
+                    method = kwargs.get('interpolation', 'linear')
+                    self._set_tables(time, fval, stringify(method))


Any reason not to allow row-based input here as well? It seems as long as ndim is 2, it doesn't matter whether it is rows or columns.

speth · 2020-04-04

What about the ambiguity of the case where a 2x2 array is provided? I sort of prefer just having the two-argument form. That way, you can use your input if it's either rows or columns. Assuming numpy arrays, that would just be one of

TabulatedFunction(x[0], x[1])
TabulatedFunction(x[:,0], x[:,1])

ischoegl · 2020-04-04

That’s a valid point. I don’t have any objections to just requiring the two argument form. @bryanwweber ... any thoughts?

ischoegl · 2020-04-04

Going back to Cantera/enhancements#17, I believe there was the vel_array = np.genfromtxt('velocity-trace.txt') case, with the trace having two columns. I believe it would make sense to allow for a shorthand

TabulatedFunction(vel_array)

I.e. I'm tempted to just revert to the original implementation. I also think it may make sense to support something like [(0, 1), (1, 1.5), (2, 2)] where tuples are used to specify points.

bryanwweber · 2020-04-04

I personally prefer the two argument form that @speth suggested. A two column or row array is easy enough for the user to specify as two arguments.

ischoegl · 2020-04-05

Done - two arguments it shall be ...

speth · 2020-04-05

If a user is coming into this with a list of tuples, they can always transform it with something like TabulatedFunction(*zip(*vel_array)).
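The unpacking idiom can be tried without Cantera. In this sketch `two_argument_form` is a hypothetical stand-in for any constructor that takes separate time and value sequences; the points are the ones quoted earlier in this thread.

```python
def two_argument_form(time, fval):
    """Hypothetical stand-in for a constructor taking two sequences."""
    return list(time), list(fval)


# List of (time, value) tuples: zip(*points) transposes it into two sequences.
points = [(0, 1), (1, 1.5), (2, 2)]
time, fval = two_argument_form(*zip(*points))
print(time, fval)

# Column-wise and row-wise nested lists reduce to the same call.
columns = [[0, 1], [1, 1.5], [2, 2]]  # one (time, value) pair per row
rows = [[0, 1, 2], [1, 1.5, 2]]       # first row times, second row values
from_columns = two_argument_form(*zip(*columns))
from_rows = two_argument_form(rows[0], rows[1])
```

Either layout ends up as the same pair of sequences, which is why a single two-argument signature covers both cases.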

bryanwweber

Some copy editing changes this time around. Thanks for the quick turnaround!

ischoegl · 2020-04-03T22:29:00Z

@bryanwweber / @speth ... thanks for the comments/recommendations!

bryanwweber

Looks good, one minor change, and a curiosity question

ischoegl force-pushed the tabulated-func1 branch from d848696 to 75cee19 Compare January 22, 2020 20:32

ischoegl changed the title from "Implement tabulated functions for func1" to "Implement Func1-derived tabulated functions" Jan 22, 2020

ischoegl requested a review from speth January 22, 2020 21:11

ischoegl force-pushed the tabulated-func1 branch 2 times, most recently from 2a089c3 to 5a88863 Compare January 23, 2020 16:23

ischoegl force-pushed the tabulated-func1 branch 4 times, most recently from 8a9d8f7 to 0fb3ece Compare January 27, 2020 14:55

ischoegl force-pushed the tabulated-func1 branch from 0fb3ece to 2658a88 Compare January 27, 2020 22:38

ischoegl requested review from bryanwweber and removed request for speth February 13, 2020 04:04

ischoegl added 5 commits March 23, 2020 15:39

[numerics] Implement tabulated Func1 object

ddc6a09

- add new derived class to Func1.h that interpolates a tabulated function in C++

[numerics] Break out tabulated Func1 to Cython interface

45c4c43

- the C++ defined Tabulated1 class is added as a variant of the Func1 object defined in Python - this approach is compatible with existing interfaces of various Python methods with Func1 arguments

[numerics] Add unit tests for tabulated Func1

7600ce2

[numerics] Use Const1 objects in Python

545dc61

- replace Python lambda function defining constant Func1 variants by pre-existing Const1 class defined in C++

[numerics] Add previous value option to tabulated Func1 objects

5d57220

ischoegl force-pushed the tabulated-func1 branch from 00f3ec5 to 5d57220 Compare March 23, 2020 20:39

speth requested changes Apr 1, 2020

ischoegl added 2 commits April 1, 2020 21:26

[numerics] Streamline and update docstrings for Tabulated1

8c8329f

[numerics] Improve unit tests for Tabulated1

597eff5

[numerics] Create Tabulated1 from raw pointers

d1f994a

ischoegl requested a review from speth April 3, 2020 03:39

bryanwweber requested changes Apr 3, 2020

[numerics] Introduce dedicated TabulatedFunction class in Python

d51a3f5

ischoegl force-pushed the tabulated-func1 branch from 042e013 to d51a3f5 Compare April 3, 2020 20:43

ischoegl requested a review from bryanwweber April 3, 2020 20:49

bryanwweber reviewed Apr 3, 2020

ischoegl requested a review from bryanwweber April 3, 2020 21:43

bryanwweber approved these changes Apr 4, 2020

[numerics] Improve documentation for TabulatedFunction

d625c6e

ischoegl force-pushed the tabulated-func1 branch from 663acb7 to d625c6e Compare April 4, 2020 01:36

[numerics] Simplify TabulatedFunction constructor

6e637a9

speth approved these changes Apr 5, 2020

speth merged commit 8a0ac7e into Cantera:master Apr 5, 2020

ischoegl deleted the tabulated-func1 branch July 13, 2020 17:21
