-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Type definitions for process arguments #47
Comments
Good point. I think in the end you'd want to have this: CropS2 <- filter(collection = "Sentinel-2") %>% filter(bbox = extent(MyAOI)) in principle, R can take care of all the dirty work for this; I'm not entirely sure how the typing in the API would look like, but if it can be done it sounds like an idea worth exploring. |
Well, that would work as well, but in the R client Having types in the core API would also be useful for users listing the arguments; for instance, at the moment it's unclear what |
I see processes in the API as functions in R. Following your argumentation (and given that my_sum = Function("sum", a, b, c) rather than my_sum = sum(a, b, c) That may reflect how the API works, but not what an R user would expect. |
Indeed, I agree. But it is not feasible to implement every single possible backend process as a function in the R client. Unless there's a way to make a function generator that works upon the list of processes in the backend and makes a function for each, but that it itself would be rather unusual (and cause issues when trying to connect to more than one backend within the same R session, not to mention possible name collisions with built-in functions). |
I think that in the end the set of compulsory openEO processes may not be that large. Let's start simple, but keep close to what R users are used to. |
Maybe I try to sketch the problem we are facing now and point out how the argument type might be an asset to solve this issue. If we want to use R objects as values for process arguments, then I need to know from a technical perspective how to relate the fields of the R object to the correct corresponding process arguments or how to serialize the R object into JSON in order to provide the backend with an understandable value. As for now we don't consider this case since argument values are either numeric, character strings or one of the predefined objects (collection, process or udf). The parameter assignment in this egard is done by name matching: R function parameter "xyz" corresponds to process parameter "xyz". No type checking can be done at this point. We leave it completely to the user to enter the correct values for the arguments. Now, the idea was to use argument type definitions together with a set of common object serializations for JSON like for an extent (or bounding box) to 1) be able to precheck process graphs on the client side, 2) be able to map common R objects against API defined counterparts and use them on the client. |
So, basically there are three things we discuss here:
(1) is a API related issue. I am not sure yet how we could define that properly. Any suggestion? In my opinion, I would let (2) be handled by the clients. They should do the translation, but it's true that they need some more information (see [1]) for it. (3) is mostly a R related issue and should be discussed in the perspective repo?! In PHP magic methods would be the solution to that. In R you probably need the process("xyz", args = ...) anyway as a fallback for non-core processes or name collisions. Of course it might be handy to have helper funtions that wrap core processes, e.g. sum(...) to process("sum", ...). Not sure whether you might want to prefix them? In other languages I'd use openeo.sum() or so... |
Re (3): Have you considered generating the functions in the R package from the API spec? See https://github.com/richfitz/stevedore/blob/master/R/swagger_args.R |
I agree. @flahn Can you make a new issue for (3) in the R client repository? Back to (1): it makes me wonder whether types are necessary for every argument, or just for the special spatial types (extents, points, polygons, etc.). The latter would limit the scope, but the former would be useful by itself (to make sure users know that For a proper definition, the backends could send an extra field for each process argument defining the name of the type (quite like there is a description field now). Then it would be a matter of documenting what each of the types mean (which is something that will be needed for #46 too). A composite type such as That said, there's a question of which types are useful, since each client is going to have somewhat different native types. |
It seems there is no language-independent function definition language: https://stackoverflow.com/questions/50059706/function-definiton-language-like-openapi Therefore I just started to use the idea behind OpenAPI and experiment with my own function definition language: https://github.com/m-mohr/functio-spec |
An initial draft of Functio is available, which allows us to formally specify our processes. The processes currently defined in the API have been converted to a Functio example file. |
A subset of Functio has been ported to the API spec with a proper OpenAPI schema. Issue should be solved. |
At the moment the process definition is typeless, arguments only have a human-readable description and no type definition. This is a problem for usability, since in different languages there are native equivalents for spatial types; users are very likely to already have their data in the native type and would want the client to handle the native type, rather than having to decompose the native type into what the backend expects manually. As an example, in R, this is what users need to do to pass an extent to the
bbox_filter
process at the moment:Whereas ideally this would just be:
But there is no way for the client to know how to map the Extent to the arguments, because they're loosely defined. We cannot just make an assumption that every process with an argument
left
expects the ymin coordinate of the object extent.This is not just a problem with extents; same goes for other spatial types, such as defining points. The users have points in a
SpatialPoints
object and would expect to be able to pass it, rather than having to manually extract information about every single point in that object. The problem gets even bigger with polygons, that don't even have a set number of vertices.The text was updated successfully, but these errors were encountered: