Semantics for POST method #154
Comments
Suggested HTTP status codes for POST. Other suggestions welcome!
Suggested extended problem details type. According to RFC 7807 we should return a Problem Details type with a small number of required fields, and the HTTP media type "application/problem+json".
Within the errors object we could have the following:
Thoughts, suggestions, changes welcome!
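As a concrete illustration of the shape such a response body might take, here is a small Python sketch. The `errors` extension member and the field names inside it are assumptions drawn from this discussion; RFC 7807 itself only defines `type`, `title`, `status`, `detail`, and `instance`, while allowing arbitrary extension members:

```python
import json

def make_problem(status, title, detail, errors=None):
    """Build an RFC 7807 problem document, to be served with the
    media type application/problem+json.

    The 'errors' extension member is an assumption based on the
    discussion above, not part of RFC 7807 itself.
    """
    problem = {
        "type": "about:blank",  # or a URI identifying the problem type
        "title": title,
        "status": status,
        "detail": detail,
    }
    if errors:
        problem["errors"] = errors
    return problem

body = make_problem(
    400,
    "Validation failed",
    "One or more fields were invalid",
    errors=[{"field": "eventDateTime", "message": "required"}],
)
print(json.dumps(body, indent=2))
```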
We need to agree on whether we want to support POST/PUT only per single resource, or handle multiple resources in one request. For single-resource actions it seems more straightforward, and we can use best practices defined by jsonapi/restfulapi etc. If we support multiple resources in one request, how do we handle the overall status if the request partly succeeded and partly failed? For example, sending a POST request for 2 milking visits when one ended up with 201 and the second with 400 (or other). Should we require that within one request data handling is transactional - all succeed/all fail?
From our perspective we always advise starting with single resource updates. This makes the error handling a lot easier, and (for our role as a proxy) checking permissions does not require introspection of a message.
I also agree with @alamers that starting with single resource handling would be the most conventional way for both clients and services. The biggest issue is still to find a way to work with resource IDs if the other side is not capable of using a single key for identifying a resource. I also like the suggestion for HTTP statuses and error/problem types as defined in Andrew's messages.
Also for HTTP codes and the problem response message - we might want to use them for GET calls as well (invalid query params, no access, not implemented etc.). Would it be a breaking change for v1.1 if we define it for the GET scheme instead of the default "exampleErrorResource" resource array?
We had some discussion with our clients and most of them feel that they would prefer to use POST methods with batches rather than with single resources. In our Nordic system clients are handling data exchange once or twice a day – not in real time. For them posting resources as single events is not an option, especially in big herds where a huge amount of data is collected during the day to send. So I guess from our point of view we would need a definition for sending resources in batches in addition to the single resource POST (also for updating and deleting).
@alamers wrote:
Makes sense, although I also note the comment from @ahokkonen regarding batches.
@cookeac @alamers @erwinspeybroeck As discussed at Friday's meeting:
At the last technical meeting on the 19th of November some concerns were raised (mainly from the FMS side) about having POST events based on single resource posting instead of bulk insert (or update). For our case it is becoming an urgent task to decide the implementation standard of the services and the POST/PUT handling on the API layer. Another valid point was raised by @alamers about my previous proposal of having /batch as an extension of the main collection endpoint. If we want to support single resource updates (PUT, PATCH) according to the RESTful specification for collections (aka /locations/{location-scheme}/{location-id}/milking-visits/{resource-id}), then this extension is not the best solution.
@cookeac @alamers @erwinspeybroeck @thomasd-gea at today's meeting we were discussing POST/PUT semantics for bulk/batches. Let's look at the situation for POST events (pushing new resources to the data service):
[
{ "id": null, "eventDateTime": "2020-09-17T10:24:35.997Z", ... },
{ "id": null, "eventDateTime": "2020-09-18T10:24:35.997Z", ... },
{ "id": null, "eventDateTime": "2020-09-19T10:24:35.997Z", ... }
]
How should the response model then be designed?
[
{ "id": "1", "eventDateTime": "2020-09-17T10:24:35.997Z", ... },
{ "id": "2", "eventDateTime": "2020-09-18T10:24:35.997Z", ... },
{ "id": "3", "eventDateTime": "2020-09-19T10:24:35.997Z", ... }
]
In the response, the service should reply with resource processing statuses, new ids and/or possible errors in case of processing failure.
Hi, as far as I know the order of JSON arrays is guaranteed; however, depending on the JSON implementation you use, the order might not be preserved (https://stackoverflow.com/questions/7214293/is-the-order-of-elements-in-a-json-list-preserved). In any case, I think having an explicit response that identifies entries would be great. BTW, @ahokkonen: was the example with two IDs being "2" intentional? I assume you meant "3" in the last entry...

Now, the question is how the sent entries and the response entries relate to each other. In general I see two ways: identify the order of the sent/returned entries explicitly (e.g. by numbering them), or use source/target identifiers to identify them. The latter would be quite easy with our current API, and I would expect that every system either has an ID for an entity or could at least create one on the fly (e.g. just assign an increasing number) and put that into the ID field of the resource. Whenever such an entry is put into another system via the batch API, the receiving system could just return that ID in the meta-data field "sourceId" (https://github.com/adewg/ICAR/blob/Develop/types/icarMetaDataType.json) of the resulting object and put its own ID in the ID field of the resource. That way, a clear relation between entries can be maintained. This would at least be true if the full object is returned.

The "smaller" (i.e. more efficient) alternative would be for the receiving system to return just an object containing the "source" id and either the "assigned" id from the receiving system or an error message. That would make the communication quite easy. I guess returning just this list of "small" objects vs. returning a list of fully specified objects (which is what I'd do for the single POST) would save some bandwidth (and most likely time).

On the question of whether the receiving system needs to store the sourceId: I guess this could be decided by the receiving system. It is not required, imho. One could just use it for the batch result and then forget about it. The sending system can then track the new id for the entries that it sent and could also see errors per entry. The alternative of identifying the order of the entries - something we would impose on the JSON array because we are not sure whether user implementations really keep the order - seems somehow strange, so I'd vote for the "sourceId" way as described above. @ahokkonen would that solve all of your requirements?
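To make the "sourceId" proposal concrete, here is a minimal Python sketch of how a sender might correlate batch responses back to its own entries without relying on array order. The payload shapes are illustrative assumptions based on this thread, not the final API:

```python
def correlate_by_source_id(sent, received):
    """Match batch responses back to the resources that were sent,
    using the client-generated meta.sourceId instead of relying on
    array ordering. Payload shapes are assumptions from this thread."""
    by_source = {r["meta"]["sourceId"]: r for r in received}
    # A value of None means the server never echoed that entry back.
    return {item["meta"]["sourceId"]: by_source.get(item["meta"]["sourceId"])
            for item in sent}

sent = [
    {"id": None, "meta": {"sourceId": "a1"}},
    {"id": None, "meta": {"sourceId": "a2"}},
]
received = [  # the server may return entries in any order
    {"id": "2", "meta": {"sourceId": "a2"}, "errors": None},
    {"id": "1", "meta": {"sourceId": "a1"}, "errors": None},
]
result = correlate_by_source_id(sent, received)
```

Even with the response array shuffled, each sent entry finds its assigned server id (or its error) through the sourceId key.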
Example: Input from sending system:
Result returned from receiving system:
BTW: for single entry posts (so non-batch) this object could be re-used. This would make the API quite clear. On the other hand, returning the whole object would still be a valid option.
I think @thomasd-gea summarizes it nicely: (1) either we need to rely on the implicit ordering of the array or (2) on some id from the client for correlating the responses with the input. Relying on the implicit ordering of the array feels very fragile. Why not make the id compulsory in these batch cases? The client should have some form of id internally, since it is handling a batch of entities here. Exposing that id (in some form) should not be too complicated?

I would also like to remark that, as far as I remember, we agreed that (at least as a default) we'd prefer an accept-all-or-reject-all policy. This does not change the discussion much, since we'd still need to correlate the responses with the input, but it does give us an alternative (3): for a successful post, we may not need to communicate anything back (completely ignoring the correlation problem), and for a failure, we could send back the full entities that failed, leaving the correlation problem with the client. For me, only option (2) feels robust. But current systems may not require that much robustness and may rely more on manual corrections.
@alamers Option (3) is not possible, I think. The sending system at least has to get back the ids that are assigned by the target system - if the source wants to reference them somehow, that is. Of course, for "fire and forget" this would be possible, but I would expect that to be a rather rare case. And even there, the mapping does not hurt.
In our solution we allow server systems to implement "partial failure/success" cases - part of the events can be approved and saved (and the proper status returned) while part may fail (errors should be returned). At first we were thinking about having some sort of wrapper for batch resources so that the client can have their own "correlation/tracking" ids for each resource at the upper level: [
{
"referenceId": "323424234324", // or 'clientReferenceId' generated by client side used for tracking
"data": { // resource is embedded inside 'data' property
"id": null, // not specified for new resources
"status": "Pregnant",
"eventDateTime": "2020-09-17T10:24:35.997Z",
"animal": {
"id": "12543488552",
"scheme": "xxx"
}
}
}
] and the response would be: [
{
"referenceId": "323424234324", //or 'clientReferenceId' generated by client side used for tracking
"status": 200,
"data": {
"id": "34", // 'id' generated by data service
"meta": {
"source": "DEN",
"modified": "2021-01-01"
}
},
"errors": null
}
] So the client system receives back the correlation id, status, possible errors and the event resource core information - basically only the created id and timestamp. (Another simple option would be to just add a "correlationId/referenceId" field directly into the EventCore resource.) But then we realised that this would be over-engineering for the purpose. I would like to have a more generic and standard way of handling this. Maybe utilize "sourceId" in the "meta" property for client tracking? Because in the case of sending a new event, the client side is the source (or?). It is a bit open for me how "sourceId" should be used :) [
{
"id": null,
"status": "Pregnant",
"eventDateTime": "2020-09-17T10:24:35.997Z",
"animal": {
"id": "12543488552",
"scheme": "xxx"
},
"meta": {
"source": "client identifier",
"sourceId": "1" // client generated id
}
}
] and then the response: [
{
"id": "12345", // server generated resource id
"status": "OK",
"meta": { // meta should be always returned!
"source": "client identifier",
"sourceId": "1" // original client generated id
},
"errors": null
}
]
@ahokkonen In your 3rd code example, I'd prefer to have the client-generated ID in the "id" field, because for the receiving system this is actually the ID in the sending system, isn't it? Example 4 is in general OK for me, but I wouldn't reuse the meta object, as it contains some more info (unless info like "created" and "validTo" is also relevant for the sender). Also, do we need the status field? I would expect it to be only "success" or "failure", so we could either leave it out completely or replace it with a boolean like
@thomasd-gea the client-generated ID is irrelevant for the receiving system, at least in our cases where there are 2 completely disconnected systems on the client and server sides, each with their own ids. Only for the client side is it important to know the ids of the server resources for update/delete operations. That is why we need to get the server-generated id back to the caller. With globally unique identifiers (UUIDs) life would be much easier, of course :) But I am afraid that is not an option.
Yes, we have a similar case. Our system doesn't care about external IDs (and does not store them), but every system that wants to communicate with our system needs to keep track of the changes and, hence, has to know the correlation of their ids with our ids. Otherwise one would have to rely on object equality based on fields (e.g. created on etc.), but at least I wouldn't want to touch that topic in the ICAR API, not even with a veeeeery long pole :) Having said that, I'd still say we should design the interface as if both systems should be able to know the other system's ids. Then each system can decide whether it wants to track the other system's IDs or not.
I agree that icarMetaDataType should not be included in full in the response.
@thomasd-gea @cookeac @alamers we did some additional research on this case and decided to go with the following solution to cover our POST/PUT requirements. [
{
// empty for new resource
"id": null,
"eventDateTime": "2020-09-17",
"animal": {
"id": "12543488552",
"scheme": "scheme-ххх"
},
"meta": {
// original id generated by client
"sourceId": "4431696f-8c7c-458b-97d3-b9fa426ce56e",
"source": "client identifier"
}
}
] Response for such request should be: [
{
// server generated resource id or (IF possible) same value as "sourceId" if UUID support on both sides
"id": "12345",
"status": "OK",
// meta should always be returned, even for failed resources, as it is used for response tracking
"meta": {
// original id generated by client, used as a correlation id for response tracking
"sourceId": "4431696f-8c7c-458b-97d3-b9fa426ce56e",
"source": "client identifier",
// timestamp for the created resource in the data service, to get the exact time
"modified": "2020-09-17"
},
"errors": null
}
] For PUT everything is the same; the only difference is that in the request the "id" field should be assigned the server identifier value. Are there any big weaknesses in this solution, or any critical conflicts with the ADE design principles? p.s. is there any way of putting comments inside JSON without those ugly red lines? :)
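A minimal Python sketch of building such request bodies under the assumptions of the proposal above: `id` stays null plus a client-generated UUID in `meta.sourceId` for POST, and the server-assigned value goes into `id` for PUT. Field names follow the examples above; the helper functions themselves are hypothetical:

```python
import uuid

def new_event(event_date_time, animal_id, scheme, source="client identifier"):
    """Build a resource for a batch POST following the proposal above:
    'id' stays null for new resources, and meta.sourceId carries a
    client-generated correlation id (a UUID here)."""
    return {
        "id": None,
        "eventDateTime": event_date_time,
        "animal": {"id": animal_id, "scheme": scheme},
        "meta": {"sourceId": str(uuid.uuid4()), "source": source},
    }

def to_put(resource, server_id):
    """For PUT the body is the same as for POST, except that 'id' must
    carry the server-assigned identifier returned from the earlier POST."""
    updated = dict(resource)
    updated["id"] = server_id
    return updated

event = new_event("2020-09-17", "12543488552", "scheme-xxx")
put_body = to_put(event, "12345")
```

The sourceId survives unchanged through both operations, so either side can keep using it for correlation.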
Well, I'd still prefer to have the id of the source system in the "id" field, as then I just have to pass that entity on to another system without changing it. That would also allow the receiving system to reuse the id if UUID is enabled on both sides (so it covers @ahokkonen's case in the second example and is even better for that use case, as the entity is stored "as is"). Whether the source id is then returned in the meta section of the created entity is in the hands of the managing system. And for the response, I'd still vote for not including the "status" field (i.e.

And no, JSON does not allow comments. You have to use explicit fields for that. But I miss this feature too :D
@ahokkonen - I think that is a good suggestion. :) I have just one question though: would another naming be better? Since we already talk about "client" and "dataprovider", I would suggest naming them like this: source => client or clientId
@thomasd-gea |
For "status" I agree - (errors == null) is a good enough indication of a successful request. Meta is also something that I would like to have, even if not with its full content.
The "charm" of the ID field for me is that you don't have to change the entity when it originates from another system (which will be the case 100% of the time if you send it). There, it will have an ID (even if it is only an enumeration of the current batch). In the rare case where we have globally unique IDs on both sides, the receiving system could even decide to store the given entity exactly as it was given (e.g. if it is a "backup storage" system that only mirrors what another system contains). Hence, the given ID on a POST is just an indicator of what the ID was in the originating system. The receiving system can decide to assign a new one, if required, and can also return the original ID in the sourceId field (if that does not already contain a value that the client set AND the system stores that information at all). As a result I would see something like what we discussed in our call on Tuesday. I don't have it exactly as we discussed, but it looked similar to the following example. Name changes are of course possible:
For completeness, examples for batch POST and PUT (similar to what we discussed):
PUT (exactly as POST, but IDs have to exist):
And maybe an interesting side-effect (that I'm happy to have): we could make the id field mandatory for all resources in all cases (maybe except for a single item PATCH, but that is something special anyway). It will always be filled by using the approach sketched above. Cases where we want a "[store|change]-all-or-nothing" behavior can also be covered (however, that is the case for all the examples we had, I think). PS: for a DELETE batch, the
In addition to my proposal above (which should answer the part of your question on how the response could look): using POST, PUT, GET and DELETE (and maybe more) are different use cases that you have to handle anyway. If you PUT, DELETE or GET something, you have to be sure that the entity already exists, or otherwise you already know that the call will fail. For POST your expectation is just that the item is stored (if it has a shape that is accepted by the receiving system, e.g. all fields required by that system are filled). In general I think that in most cases the ID field of an entity will not be null. At least for all the systems that exchange data stored in their system, that would be true (if they provide an interface for data exchange; could be ICAR). So taking that entity and just posting it to another interface (in a batch or otherwise) would be very easy, as the sending system wouldn't have to change anything. Of course, when using PUT or GET you would have to know the entity ID, but after you POST something these should be known (based on the response I sketched above). Otherwise, you would have to change EVERY entity that you want to POST: remove the ID and set the ID somewhere else. Not a real biggie, but still additional work (that would be superfluous if the systems share a common ID system). At least for PUT you would have to overwrite the ID anyway if the ID systems differ. But you would have to do the mapping only here. That means: a little less logic on a POST, the same for the other cases.
For us the solution proposed by @thomasd-gea sounds good. If all agree that this is the way we should work with POST batches, we'll use it.
Hello, I like the concept of sourceId mentioned in: Therefore I think the client should fill out the fields in meta, and we would recommend that the receiving partner store these if they plan to later pass this event on to a third party. In addition, I see no problem with filling in the id in a POST if it's used as a UUID, but this needs to be confirmed by both exchanging parties during development or in their data exchange contract. As stated in all of the last discussions: as long as we don't have UUIDs everywhere we need to map IDs somehow, but this needs to be implemented only once.
So to summarise our discussion at the meeting today: When POSTing resources (single or batch):
the server MUST return the object(s) posted with:
Extract from the minutes of the meeting on 3/6/2021: Anton states they need partial processing capability for the batch POST. This is not available in the PR from Andrew (single POST, transactional batch). Next to that, there are doubts whether Collections (paging) are also needed in a batch POST. The proposed response is not really future-proof (if we also want partial batch POST). An option is to define all fields in the response as optional, so it is not needed to return the complete resource. Errors have to be returned in some way. Conclusion: Anton and Thomas will prepare a proposal with some examples for the next meeting, so we can better understand and see whether we can find a common standard for partial and transactional batch POST.
I spent a lot of time looking at how other schemes handled "partial acceptance". The general feedback seemed to be "this is not RESTful, don't do it that way" (i.e., it doesn't perfectly fit the "resource" model and becomes more like a remote procedure call). However, if we wanted to handle this, then instead we should:
For our part, we would be likely to move to using HTTP/2 and single object POST methods to retain compliance with REST and HTTP (while still achieving performance) rather than use this approach, but it is definitely feasible.
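As a rough illustration of that alternative, the sketch below fans a batch out into concurrent single-object POSTs. The HTTP transport is injected and stubbed with a fake coroutine so the sketch stays self-contained; in practice `send` would be something like an HTTP/2-capable async client, which is an assumption on my part, not part of this issue:

```python
import asyncio

async def post_single(send, url, resource):
    """POST one resource and return (resource, response). 'send' is any
    coroutine performing the actual HTTP call - e.g. an HTTP/2-capable
    async client (an assumption); injected here to keep the sketch
    self-contained."""
    response = await send(url, resource)
    return resource, response

async def post_batch_as_singles(send, url, resources, concurrency=10):
    """Fan a batch out into concurrent single-object POSTs. Each
    request/response pair stays strictly RESTful, while multiplexing
    the connection (e.g. over HTTP/2) recovers most of the batch
    performance."""
    sem = asyncio.Semaphore(concurrency)

    async def bounded(resource):
        async with sem:
            return await post_single(send, url, resource)

    return await asyncio.gather(*(bounded(r) for r in resources))

# Demo with a fake transport that assigns sequential server ids:
async def fake_send(url, resource, _counter=[0]):  # mutable default as a cheap counter
    _counter[0] += 1
    return {"status": 201, "id": str(_counter[0])}

results = asyncio.run(post_batch_as_singles(
    fake_send, "/milking-visits", [{"id": None}, {"id": None}]))
```

Because every resource gets its own request and its own status code, the partial-acceptance problem disappears: each item simply succeeds or fails on its own.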
Moved milking related observations and resources into an OpenAPI file for the ADE-milk label. Covers milking visits, test days and test day results, lactation status observations, and summaries for daily milking averages and lactations. Resolves adewg#154
Create a separate URL scheme for reproduction observations (label ADE-1.2-reproduction). There are 11 observations in this set, including mating, reproductive status, pregnancy checks, abortion, and parturition. Along with other commits, this resolves adewg#154
Added URL schemes for management (animal sets, animal set join/leave, and devices), and performance (weights, conformation scores, breeding values). Along with other commits, resolves adewg#154.
Create a URL scheme for feeds and feed management (storage, rations, intake). Support GET, batch and single POST. Along with other commits, resolves adewg#154.
As part of our work on URL schemes and recommended common implementations, this issue covers using HTTP POST to send new resources from a client to a service. This issue is defined to allow discussion. I've made a start below:
Goals
There are several parts to this issue:
How to determine if a collection on a server supports adding resources
There are three options:
The method and URL path to use when adding new resources
Adding single resources or arrays/collections of resources
There might be several options:
How to uniquely identify resources being added to the server
A discussion area in itself - broken out into issue #153
Determining status - success or errors
The recommended method for REST web services is to use HTTP status codes. For instance:
Substantial work on this has been done by JSON-API (https://jsonapi.org/format/#crud-creating-responses).
Return results when resources are successfully added
Based on the RFC-2616 specifications and JSON-API, it would seem that when new resources are added successfully, the server should:
Return results when the server is processing new resources asynchronously
Based on the RFC-2616 specifications and JSON-API, it would seem that when the server is still processing the submitted resources (and would otherwise time out), it should:
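The bullet list that followed here was not captured above. As a hedged illustration of common REST practice for this case (not necessarily what the group agreed), a server can reply 202 Accepted with a Location header pointing at a status resource the client can poll; all names in this sketch are hypothetical:

```python
def accept_for_processing(batch, job_store, next_job_id):
    """Accept a batch for asynchronous processing: return an HTTP 202
    Accepted response whose Location header points at a status resource
    the client can poll. Names and shapes here are hypothetical."""
    job_id = next_job_id()
    job_store[job_id] = {"state": "processing", "items": batch}
    return {
        "status": 202,
        "headers": {"Location": f"/jobs/{job_id}"},
        "body": None,
    }

jobs = {}
resp = accept_for_processing([{"id": None}], jobs, lambda: "j1")
```

The client then GETs the Location URL until the job reports completion, at which point the created resources (and their ids) become available.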
Returning error results
The usual recommended approach is to return an HTTP 4xx client error status, and then return an error or problem response in the body.
We will need to decide the representation we want to use AND whether a server should always fail when an error (rather than a warning) is encountered, or whether it should attempt to process other objects in the request.
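The two policies can be sketched side by side. This is an illustrative Python model of the server-side choice, with `validate` and `store` as hypothetical placeholders, not an implementation from this repository:

```python
def process_batch(items, validate, store, transactional=True):
    """Two batch policies from the discussion above: all-or-nothing
    (reject the whole batch if any item is invalid) versus partial
    acceptance (store the valid items, report per-item errors).
    'validate' returns None for a valid item or an error message;
    'store' persists an item and returns the stored representation."""
    errors = {i: validate(item) for i, item in enumerate(items)}
    errors = {i: e for i, e in errors.items() if e}
    if transactional and errors:
        return {"stored": [], "errors": errors}  # nothing persisted
    stored = [store(item) for i, item in enumerate(items) if i not in errors]
    return {"stored": stored, "errors": errors}

items = [{"eventDateTime": "2020-09-17"}, {}]  # second item is invalid
validate = lambda it: None if "eventDateTime" in it else "eventDateTime required"
store = lambda it: {**it, "id": "srv-1"}
all_or_nothing = process_batch(items, validate, store, transactional=True)
partial = process_batch(items, validate, store, transactional=False)
```

Either way the response must carry per-item errors keyed to the input, which is why the correlation-id discussion above matters for both policies.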