Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a API definining header hiding the function pointer as commonly done #4643

Closed
eddelbuettel opened this issue Jul 26, 2020 · 4 comments · Fixed by #4645
Closed

Add a API definining header hiding the function pointer as commonly done #4643

eddelbuettel opened this issue Jul 26, 2020 · 4 comments · Fixed by #4645
Milestone

Comments

@eddelbuettel
Copy link
Contributor

eddelbuettel commented Jul 26, 2020

Congratulations on getting 1.3.0 out. It is impressive to watch your well-oiled release process which I should do more often :)

It is lovely that the beginnings of a C API are now out. Big thanks to @lsilvest for getting it going and all of you for making it happen with #3751 and building on the older #1694.

This is immediatly useable in something like the following (and I apologise for bring C++ and Rcpp in but it makes one-liners, here broken for readability, easier to test)

R> Rcpp::cppFunction(' SEXP mysub(SEXP x, SEXP rows, SEXP cols) { 
  SEXP(*fun)(SEXP, SEXP, SEXP) = (SEXP(*)(SEXP,SEXP,SEXP)) R_GetCCallable("data.table", "CsubsetDT"); 
  return fun(x,rows,cols);
}')
R>

As your documentation states, if one is familiar with what WRE says or willing to read, this is easy to use as we proxy here from R giving us the desired subsetter from C(++).

R> library(data.table)
R> dt <- data.table(iris)
R> mysub(dt, 1:4, 1:4)
   Sepal.Length Sepal.Width Petal.Length Petal.Width
1:          5.1         3.5          1.4         0.2
2:          4.9         3.0          1.4         0.2
3:          4.7         3.2          1.3         0.2
4:          4.6         3.1          1.5         0.2
R> 

Now, it is not uncommon to export APIs such as this---and fully understood it is subject to change and will change---by an API defining header. I have done that twice in packages exporting C API portions from R itself (RApiDatetime, RApiSerialize) and added it to other packages (xts. In each case a header was added which client packages include and which makes the C function pointer assignment and SEXP-counting internal to a simpler helper function. With it (and I have a version here which I could PR) the C(++) snippet becomes

R> Rcpp::cppFunction("SEXP mysub2(SEXP x, SEXP rows, SEXP cols) { return subsetDT(x,rows,cols); }", 
     include="#include <datatableAPI.h>", 
     depends="data.table")
R> mysub2(dt, 1:4, 1:4)
   Sepal.Length Sepal.Width Petal.Length Petal.Width
1:          5.1         3.5          1.4         0.2
2:          4.9         3.0          1.4         0.2
3:          4.7         3.2          1.3         0.2
4:          4.6         3.1          1.5         0.2
R> 

I.e. we now only include a single header which can be added to other packages via LinkingTo:. And all (programmer) users need than is to call the function declared in the header which I named here subsetDT().

The code in the header is essentially a three-liner and easy to maintain and update if and when the API changes -- and can shield the users from API updates as it picks up your API name and re-exports it. It simply has to be updated in concordance with the change in src/init.c. (It is entirely up to you to declare this as subsetDT() as internally, or as CsubsetDT as exported called and imported. You even make it CsubsetDTapi() to be extra explicit. I like subsetDT() myself :).

Let me know what you think. If it is deemed unimportant feel free to close, if you want to send a PR I can do so as I have the file in a branch here.

@jangorecki
Copy link
Member

I agree, the plan was to have a dedicated file for exported C functions. Your idea is perfect. As for the naming I would prefer DT_ prefix instead, so in other packages code it is easy to spot calls to data.table C functions.

@eddelbuettel
Copy link
Contributor Author

eddelbuettel commented Jul 26, 2020

Ok, I can and will set-up a simple PR then.

One thing I just realized as you wrote that is that we can do what the R API does and make that a #define. By default you get that (butt-ugly but safe) DT_ prefix, but if you set the right #define up it becomes the shorter name that echose the name in your C sources, here, subsetDT. Good idea?

(Also, DT_subsetDT() is a monster. Just saying. I really wish C had namespaces. Maybe I just add a properfly if-def'ed for C++ namespace so that those of us including into C++ would get DT::subset() which I like best here.)

@mattdowle
Copy link
Member

mattdowle commented Aug 4, 2020

Great. Implemented and merged by your #4645. I just noticed the PR was missing the magic "closes 4643" (now edited in) so it didn't close automatically.

If you'd like to add a news item, please feel free in a new PR. I usually add a news item where I can, but in this case, you and others here are better equipped than me to write this one. Or it could be left out of NEWS too as a strategy to more easily make breaking changes to it in future, if you prefer.

@mattdowle mattdowle added this to the 1.13.1 milestone Aug 4, 2020
@eddelbuettel
Copy link
Contributor Author

I just noticed the PR was missing the magic "closes 4643"

True, and my bad. But when I wrote it I meant it more as an opening in a dialogue of "what interface do we want?" (i.e. wrt to naming the functions, with or without namespace) rather than a "final" opus magnus.

Anyway, thanks for merging -- we can refine as needed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants