Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Naming of function names (in R) #11

Closed
flahn opened this issue Feb 26, 2018 · 9 comments
Closed

Naming of function names (in R) #11

flahn opened this issue Feb 26, 2018 · 9 comments

Comments

@flahn
Copy link
Member

flahn commented Feb 26, 2018

Splitting an off-topic discussion from Open-EO/openeo-api#47 into this issue.

from @nuest Re (3): Have you considered generating the functions in the R package from the API spec? See https://github.com/richfitz/stevedore/blob/master/R/swagger_args.R

@flahn
Copy link
Member Author

flahn commented Feb 26, 2018

Personally I don't see this as a real issue. We are currently using the detailed process description to get relevant information about the process and its arguments (the name of the process and the names of the arguments). To build the process graph we use basic functions process, collection and defineUdf. Each function requires the openeo-client object, the name of the process and arguments passed by ...-operator. This process is as generic as it can be and but the process graph simply resembles a list of nested lists (for each object one) that will be send to a openeo-backend.

@m-mohr
Copy link
Member

m-mohr commented Feb 27, 2018

The previous discussion:

@edzer wrote:

I see processes in the API as functions in R. Following your argumentation (and given that function is a reserved word) you would use

my_sum = Function("sum", a, b, c)

rather than

my_sum = sum(a, b, c)

That may reflect how the API works, but not what an R user would expect.

@GreatEmerald wrote:

Indeed, I agree. But it is not feasible to implement every single possible backend process as a function in the R client. Unless there's a way to make a function generator that works upon the list of processes in the backend and makes a function for each, but that it itself would be rather unusual (and cause issues when trying to connect to more than one backend within the same R session, not to mention possible name collisions with built-in functions).

@edzer wrote:

I think that in the end the set of compulsory openEO processes may not be that large. Let's start simple, but keep close to what R users are used to.

m-mohr wrote:

In PHP magic methods would be the solution to that. In R you probably need the process("xyz", args = ...) anyway as a fallback for non-core processes or name collisions. Of course it might be handy to have helper funtions that wrap core processes, e.g. sum(...) to process("sum", ...). Not sure whether you might want to prefix them? In other languages I'd use openeo.sum() or so...

@nuest wrote (mentioned above):

Re (3): Have you considered generating the functions in the R package from the API spec? See https://github.com/richfitz/stevedore/blob/master/R/swagger_args.R

@GreatEmerald
Copy link
Member

I'd agree with @flahn that it's not a big deal at the moment. If we were to define functions for each process (the way the Python client does at the moment AFAIK), we start running into what is effectively the question of profiles: which, if any, processes should be "mandatory" for openEO backends? Or which of them should be in a "core" openEO profile? There will probably be backends that don't implement all the processes in the core profile, anyway (AWS, for instance). And then you'd get sum(a, b, c) returning an error that it's not implemented on the backend.

Perhaps in the long run, once we see which backends implement which processes, having a function for each popular one would be an option. But I'd say it's too soon now.

@nuest On the Swagger codegen: I don't think this helps for the process definitions. Those are defined on each backend, not part of the core API.

@flahn
Copy link
Member Author

flahn commented Feb 27, 2018

This is kind of a design choice. If we want to define separate (standalone) functions for each process, then we would store each function in the global environment. The benefit would be that you will see every function offered by the backend as a function variable in the global environment. The drawback is almost the same, you will see ALL the functions offered by the backend in the global environment...

  1. the number of functions might increase drastically
  2. If you have variables for your analysis they might get lost between all that information

The idea of an "openeo core profile" we have already discussed at WWU and it seems a promosing approach. And for those it probably makes sense to have predefined functions. But as you said, this is a discussion for later.

@GreatEmerald
Copy link
Member

Yea, indeed, the standard practice is for packages to not pollute the environment. Name collisions is another problem, if the user had a variable called filter_bbox, that would get overwritten... And using a prefix is not terribly user-friendly either.

R is nice with the support for function signatures, though: if we define a task as an object of type Task, then defining a sum.Task() function would allow the user to simply use Task %>% sum() to get the sum of the result of the task.

But yes, I could see that being implemented if we have a core profile. Maybe it would make sense to also discuss the profiles idea in a separate issue?

@m-mohr
Copy link
Member

m-mohr commented Feb 28, 2018

The core profile is discussed here: Open-EO/openeo-api#2

@GreatEmerald
Copy link
Member

Ha, yes, I just noticed that myself :)

@flahn
Copy link
Member Author

flahn commented Apr 18, 2018

In order to have an easier way to formulate processes, I have implemented a process graph builder. If you are using RStudio or another IDE for R with autocompletion you get suggestions what function is available and which Arguments are allowed by the backend.

The way it work: when creating the process graph builder, we first query the backend for all offered processes GET /process and then we iterate over the list and query for a detailed description. The information is then translated into a function that wraps the clients process function and registers it as a function of the R6 object (ProcessGraphBuilder class).

Look here for more information.

@GreatEmerald
Copy link
Member

That's pretty neat, but it still results in an R6 object rather than a regular S3 function. Though that does make the functions self-contained and specific for a particular connection, so perhaps in this case it's worth the weird semantics of calling functions inside objects...

@flahn flahn closed this as completed Jun 19, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants