Save and Update Nodes to API
We construct one giant JSON document from our Project node and send all the nodes to the API in a single request. This is an important step because it guards against bad data.

In the previous SDK, nodes had to be saved one at a time. This created issues: if the script hit an error halfway through, half of the project would have been uploaded to the database and half would not. The problem is even worse for updates, because a bug halfway through an upload leaves half of the nodes updated and half outdated, producing inconsistent data overall.

With the new SDK we wanted all-or-nothing behavior: either all the nodes get saved to the API, or, if an error occurs while the API is parsing the request, the API rejects the entire request, saves nothing, and returns an error to be corrected. This way the database always contains valid data.
We construct the giant JSON from the Project node. Within the JSON we use UUIDs to point to nodes that are already defined elsewhere in the document. Referring to a node by its UUID instead of embedding the full node in every place it appears keeps the JSON small.

Repeated nodes can inflate the JSON very quickly; in the beginning the serialized JSON was over 8 million lines.
The database schema enforces the use of UUIDs and guards against repeated nodes: before the API does any processing of the JSON, it first validates the JSON against the DB schema, and if validation fails it returns the error as the response. As a result, the SDK has to send some nodes in full and others as UUID references. For example, inventory is written so that it only accepts UUID references when sending the Project JSON and expects the materials to already be defined on the project.

The API also expects every UUID reference to point to a node that has already been saved to the API.
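The condensing idea can be sketched with a small recursive helper. This is a minimal illustration under assumed node shapes, not the SDK's actual implementation; `condense_node` is a hypothetical name:

```python
def condense_node(node: dict, seen_uuids: set) -> dict:
    """Recursively walk a node tree; any node whose UUID has already been
    emitted in full is replaced by a bare {"uuid": ...} reference."""
    uuid = node.get("uuid")
    if uuid is not None:
        if uuid in seen_uuids:
            return {"uuid": uuid}  # already serialized in full: reference only
        seen_uuids.add(uuid)
    condensed = {}
    for key, value in node.items():
        if isinstance(value, dict):
            condensed[key] = condense_node(value, seen_uuids)
        elif isinstance(value, list):
            condensed[key] = [
                condense_node(item, seen_uuids) if isinstance(item, dict) else item
                for item in value
            ]
        else:
            condensed[key] = value
    return condensed
```

With this, a material embedded both under `project.material` and under an inventory collapses to `{"uuid": ...}` on its second occurrence, which is exactly what keeps the serialized Project small.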
I believe a UID (rather than a UUID) is supposed to be used as the reference when the node is new.
The SDK does not know which nodes are already in the DB and which are new, short of traversing the entire tree and checking every single node one by one. Instead, the SDK handles this by getting feedback from the API: it learns from the errors, corrects them, and saves the nodes.
- Construct the giant JSON and send it to the API in a POST request.
- If the API gives back an HTTP 200, then everything is fine and we can move on ✅
- If the API responds with HTTP 400 and a `Bad UUID` error, then a node has been condensed to a UUID that does not yet exist on the API. Save the condensed node to the API in full, then send the giant JSON again.
# core.py

```python
def get_json(
    self,
    handled_ids: Optional[Set[str]] = None,
    known_uuid: Optional[Set[str]] = None,
    suppress_attributes: Optional[Dict[str, Set[str]]] = None,
    is_patch=False,
    condense_to_uuid={
        "Material": ["parent_material", "component"],
        "Inventory": ["material"],
        "Ingredient": ["material"],
        "Property": ["component"],
        "ComputationProcess": ["material"],
        "Data": ["material"],
        "Process": ["product", "waste"],
        "Project": ["member", "admin"],
        "Collection": ["member", "admin"],
    },
    **kwargs
):
```
- Find the `Bad UUID` node by its UUID within the Project tree.
- Save the full node to the API.
- Remove all occurrences of the full node from the Project tree and refer to the saved node only by its UUID, because it has already been saved to the API.
- Repeat this process until there are no more `Bad UUID` API errors and everything has been saved.
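The retry loop above can be sketched as follows. The HTTP transport and the graph search are injected as callables so the control flow stays visible; the `Bad uuid: ...` message format and the helper names are assumptions, not the SDK's actual API:

```python
import re


def save_project(project_json: dict, find_node_by_uuid, post) -> None:
    """Keep POSTing the full project JSON; whenever the API reports a
    bad UUID, save that node in full on its own, then retry the project.

    post(payload) -> (status_code, body) is an assumed transport helper;
    find_node_by_uuid(tree, uuid) is an assumed graph-search helper.
    """
    while True:
        status, body = post(project_json)
        if status == 200:
            return  # everything saved ✅
        # assumed error shape: "Bad uuid: <uuid> provided"
        match = re.search(r"Bad uuid: ([0-9a-zA-Z-]+)", body)
        if status == 400 and match:
            node = find_node_by_uuid(project_json, match.group(1))
            post(node)  # save the missing node in full first
        else:
            raise RuntimeError(f"unrecoverable API error: {body}")
```

The loop terminates because each iteration either succeeds or saves one more previously-missing node, so the set of bad UUIDs shrinks on every pass.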
can update `computation.notes`:

```json
{
  "node": ["Computation"],
  "updated_at": "2023-07-12T02:44:13.942095Z",
  "model_version": "1.0.0",
  "name": "my computation name",
  "type": "data_fit",
  "notes": "computation notes UPDATED"
}
```
can update `computation.notes` with software:

```json
{
  "node": ["Computation"],
  "updated_by": {
    "node": ["User"],
    "uid": "_:0x15f901"
  },
  "updated_at": "2023-07-12T02:57:39.943587Z",
  "model_version": "1.0.0",
  "name": "my computation name",
  "type": "analysis",
  "notes": "computation > software_configuration > software UPDATED",
  "software_configuration": [
    {"uuid": "37c2223a-164a-4ef7-b01f-4f6876fdf877"}
  ]
}
```
can update `computation.notes` and has `software_configuration` with software:

```json
{
  "node": ["Computation"],
  "updated_by": {
    "node": ["User"],
    "uid": "_:0x15f901"
  },
  "updated_at": "2023-07-12T02:57:39.943587Z",
  "model_version": "1.0.0",
  "name": "my computation name",
  "type": "analysis",
  "notes": "these here are UPDATED NOTES BABY AGAINNNNN!!!!",
  "software_configuration": [
    {
      "uuid": "37c2223a-164a-4ef7-b01f-4f6876fdf877"
    }
  ]
}
```
- Isolate the node that you want to update and remove all the outer layers. For example, to update an experiment, isolate it and strip away the outer layers of collection and project.
- Strip `created_at`, `created_by`, `uid`, and `uuid` from the JSON. Otherwise, the API will respond with:
  `Additional properties are not allowed ('created_at', 'created_by', 'uid', 'uuid' were unexpected) at path: /`
- Send the request to get feedback on your update.
- If you get a `Duplicate uuid: 5aa3f648-f27b-4478-a81c-fd64965e87bb provided` error (I think some other errors can be solved with this fix as well), find the node and reduce it to just its UUID:
  - From: `"software": { "node": ["Software"], "uid": "_:0x196205", "uuid": "5aa3f648-f27b-4478-a81c-fd64965e87bb" }`
  - To: `"software": { "uuid": "5aa3f648-f27b-4478-a81c-fd64965e87bb" }`
- If an error pops up saying a node `is not valid under any of the given schemas at path: properties/software_configuration/items`, e.g. for
  `{ "uuid": "37c2223a-164a-4ef7-b01f-4f6876fdf877", "software": { "uuid": "5aa3f648-f27b-4478-a81c-fd64965e87bb" }, "model_version": "1.0.0", "updated_at": "now()", "created_at": "now()" }`,
  then remove that node from the JSON and save it separately.
- See if the node exists.
  - If yes:
    - Send a PATCH request.
    - If there is an error, find the failing node within the graph and send that to the API, then try the request again.
    - If there is still an error, repeat the process.
  - If not found:
    - POST the project.
    - If a bad UUID error comes back, find the bad UUID node in the graph and POST it on its own.
    - Once that returns a 200 response, POST the project again.
    - If there is an error, repeat the process.
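The decision flow above can be sketched as one function. All the callables here are assumed helpers rather than SDK functions, and the `send_*` helpers are assumed to return an `(ok, failing_uuid)` pair:

```python
def save_or_update(project, exists, send_patch, send_post, find_failing_node):
    """Sketch of the save-or-update flow: pick PATCH for existing nodes
    and POST for new ones, then retry until the whole project is accepted."""
    send = send_patch if exists(project) else send_post
    while True:
        ok, failing_uuid = send(project)
        if ok:
            return  # whole project accepted
        # find the offending node in the graph, save it on its own,
        # then retry the whole project
        node = find_failing_node(project, failing_uuid)
        send_post(node)
```

This mirrors the SDK's feedback-driven approach: rather than pre-computing which nodes are new, it lets the API's errors drive which nodes get saved individually.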
```python
REMOVE_ATTRIBUTES = [
    "uid",
    "uuid",
    "public",
    "locked",
    "model_version",
    "created_at",
    "updated_at",
    "created_by",
    "updated_by",
]
```
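A small helper can apply this list before sending an update payload. This is a sketch, not SDK code (`strip_attributes` is a hypothetical name), and note it only strips the top level of the node, not nested children:

```python
REMOVE_ATTRIBUTES = [
    "uid", "uuid", "public", "locked", "model_version",
    "created_at", "updated_at", "created_by", "updated_by",
]


def strip_attributes(node: dict) -> dict:
    """Return a copy of the node with read-only attributes removed,
    so the API does not reject the update payload as having
    unexpected additional properties."""
    return {k: v for k, v in node.items() if k not in REMOVE_ATTRIBUTES}
```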
- When working on integration, write out the HTTP request payload by hand.
- Use Postman to send the HTTP request to the API and inspect the response.
- Design the step-by-step logic on paper, then attempt to code it.