Meta: Future file structure #340
Thanks for opening the thread @christianlupus. Let me try to summarize the discussion on this topic from #120. Both issues should be considered together, as sharing is an important functionality (Nextcloud is developing into a collaboration platform). So when deciding on a storage solution, we should probably do so in that spirit.
Wishes
Suggested storage solutions
Pure database storage of recipes
Pure (json-)file-based storage
Mixed file and database storage
Some arguments
Pro database
Pro file-storage
|
Thank you @seyfeb for the summary! Considering the wishes you formulated, the pure DB solution seems to be the worst candidate. Neither access from external programs nor easy backup is really given. Considering my experiences with reading files directly, the only option I see is to create a central JSON file with an index or something in that sense. Iterating through all files in the recipes folder takes something in the range of 30 ms per recipe on my test machine (SSD). This does not look like much, but if your cookbook grows to a few hundred recipes, it soon becomes significant, especially as it would be required for almost every HTTP request. So, if we want to stick with our requirements/wishes, I fear the only way is to go with a DB plus separate files, unless there are completely new ideas. However we do things, I highly recommend inserting an abstraction layer if we start designing a new file interface. I am thinking of an abstract class that defines the CRUD operations on a complete recipe/file. Then we can later do migrations and changes in the file structure more easily without affecting the other parts of the app. The current structure is to use the name of the recipe as a folder name. Inside this folder (in the configured recipe folder search path) there are a few files so far:
There are more things that need to be stored with a recipe
This list is not necessarily complete. There might be more things, that need to be considered in the future. So, we should define a generic structure that allows for such extensions. I am thinking of having a For metadata I suggest to add a file |
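As a rough illustration of the central index idea mentioned above: at about 30 ms per recipe, scanning 300 recipe folders would cost roughly 9 s per request, whereas a pre-built index is a single file read. The following is only a sketch under assumptions. The index file name `index.json`, the `name` field inside `recipe.json`, and the helper names are hypothetical and not something the thread has settled on.

```python
import json
from pathlib import Path


def build_index(cookbook_dir: Path) -> dict:
    """Collect one small entry per recipe folder that contains a recipe.json."""
    index = {}
    for recipe_file in sorted(cookbook_dir.glob("*/recipe.json")):
        data = json.loads(recipe_file.read_text(encoding="utf-8"))
        folder = recipe_file.parent.name
        index[folder] = {
            # "name" is an assumed field; fall back to the folder name
            "name": data.get("name", folder),
            # modification time lets a later scan skip unchanged recipes
            "mtime": recipe_file.stat().st_mtime,
        }
    return index


def write_index(cookbook_dir: Path, index_name: str = "index.json") -> Path:
    """Write the index next to the recipe folders (file name is hypothetical)."""
    target = cookbook_dir / index_name
    target.write_text(
        json.dumps(build_index(cookbook_dir), indent=2), encoding="utf-8"
    )
    return target
```

The expensive scan then only has to run when something changed, and ordinary requests read the one index file.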
Ahh, and I forgot one thing. Just a quick addition: It might be a good point in time to consider extending to multiple cookbooks per person. With that abstraction layer in place, one could have multiple folders representing multiple different cookbooks. Regarding sharing, that would allow sharing a whole cookbook (ro and rw with add/delete) or sharing per recipe. Adding the feature to copy/move a recipe to another cookbook would allow moving it from the shared one (or the shared-with-me cookbook for single recipe shares) to one's own for modification/... That would solve the link vs copy issue discussed in #120. |
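To make the abstraction layer more concrete, here is a minimal sketch of an abstract class with CRUD operations on a complete recipe, plus one folder-based implementation. The class names, method names, and the use of the folder name as identifier are assumptions for illustration only. The actual app is not written in Python, and this is not its interface.

```python
import json
import shutil
from abc import ABC, abstractmethod
from pathlib import Path


class RecipeStorage(ABC):
    """CRUD operations on a complete recipe, independent of the file layout."""

    @abstractmethod
    def create(self, recipe_id: str, recipe: dict) -> None: ...

    @abstractmethod
    def read(self, recipe_id: str) -> dict: ...

    @abstractmethod
    def update(self, recipe_id: str, recipe: dict) -> None: ...

    @abstractmethod
    def delete(self, recipe_id: str) -> None: ...


class FolderRecipeStorage(RecipeStorage):
    """One folder per recipe with a recipe.json inside (the current layout)."""

    def __init__(self, cookbook_dir: Path) -> None:
        self.cookbook_dir = cookbook_dir

    def _file(self, recipe_id: str) -> Path:
        return self.cookbook_dir / recipe_id / "recipe.json"

    def create(self, recipe_id: str, recipe: dict) -> None:
        path = self._file(recipe_id)
        if path.exists():
            raise FileExistsError(recipe_id)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(json.dumps(recipe), encoding="utf-8")

    def read(self, recipe_id: str) -> dict:
        return json.loads(self._file(recipe_id).read_text(encoding="utf-8"))

    def update(self, recipe_id: str, recipe: dict) -> None:
        path = self._file(recipe_id)
        if not path.exists():
            raise FileNotFoundError(recipe_id)
        path.write_text(json.dumps(recipe), encoding="utf-8")

    def delete(self, recipe_id: str) -> None:
        # removes the whole recipe folder, including images
        shutil.rmtree(self._file(recipe_id).parent)
```

With such an interface, several cookbooks would simply be several `FolderRecipeStorage` instances pointing at different folders, and a later change of the on-disk layout would only touch the implementing class.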
I like the idea of having multiple cookbooks as this would give the user more flexibility. However, this might not solve the problem as you expect. What if a user wants to have a recipe appear in multiple cookbooks? Then again the question arises: Are all recipes linked (and therefore changes of the recipe in a single cookbook propagate through all books), or do I create a copy of a recipe when adding to a different cookbook? I think the second case should be easily solvable by copying the recipe folder and is represented in the design you proposed. The first use case is probably more difficult to solve. One possibility could be to create a folder for this recipe containing a single file representing a “recipe node” which contains information about the recipe location:
I might be constructing a use case that nobody needs. Not sure about that, but we should at least actively decide not to support this ;) Anyways, I would propose using node files as an alternative to the Regarding the database question: The consensus seems to be to have both, a database and file storage, so we should probably settle on this. All advantages of having easily accessible data, simple backups, and fast searches seem to be possible. |
I am thinking of this analogously to how files and folders are handled in NC. The relation is obviously Folder<->Cookbook and File<->Recipe. Whether we allow cookbooks within cookbooks is another open question. Maybe for later.
Regarding the Another point is the following: If A shares a cookbook/recipe with B, this sharing is represented just by a link in the DB. I was thinking that this way we could optimize storage (each recipe is stored only once), run time (recipes need to be indexed only once, not once per user) and code simplicity (just use the folder from another user internally). This, however, causes the issue that for user B the main files app of NC does not know anything about these shared files. Thus the recipes are not going to be visible in the local file structure synced by the client. It might be possible to register the files to be shared in the main NC app, but I would have to look this up to be sure. The alternative would be to share the corresponding cookbook folder from the files app (so the file management is taken away from the cookbook app, similar to the current state but possible for multiple cookbooks). As we are building on the files app, all users would have the same set of files (recipes) during normal file sync. |
Actually, I’m totally with you. The confusion probably comes from the fact that I was talking about a single user and you were talking about sharing between users. :)
Reusing recipes in multiple cookbooks
What I was trying to illustrate was the case when somebody wants to have the identical recipe in two different cookbooks. Two cookbooks means having two folders - one for each cookbook. Storing the recipe in one of the cookbooks can be handled as discussed (create a recipe folder and store all related data (JSON files, images, vids, etc) in the respective folder). Only how to link (not copy) the recipe to the second cookbook in a way that can be backed up (i.e., does not need entries in the database) would be less clear to me.
Recipe node
The idea of the node file was not to handle sharing between users. I totally agree that this should be done as you said - using the Nextcloud internals of file sharing. The idea was rather to have one file for each recipe that contains/links to all relevant information. For example, it determines the local location, i.e.: Is the recipe data located in the same folder or in a different folder (it would be something like a symlink then)? But the node file could also contain all data as you proposed for the Maybe the Obviously this linking part does require some implementation stuff: what if the original recipe is deleted from a cookbook? -> A popup must inform the user that the recipe is used in cookbooks X, Y, Z and asked if it should be deleted from all cookbooks or only from a selection. Depending on the response the original data then might need to be moved to a different location. In a first shot, the feature does not need to be there. But at least we would have a possible way to build such a feature later.
Reindexing
Location of cookbooks
I agree that reindexing of cookbooks when the user is free to move them around his system is difficult. What if we require the user to have all his cookbooks located in a single folder? 
As a first step we could require a This would allow reindexing without the requirement to iterate over the complete Nextcloud content and look for recipe data. |
OK, I see we have two different approaches in mind here.
Inode-like approach
We have a set of cookbooks and a set of recipes. Initially these are unrelated. Then associations are added that define which cookbooks contain which recipes. This is similar to the data storage structure in most Linux file systems (see inodes). The clear benefit of this approach is that the association defines the access rights, so sharing between users is easy once we have it running for one user. It is then merely a UI issue. I doubt that the NC files app itself supports this file structure, so we would need to implement it inside the cookbook app. This has the additional drawback that external programs can no longer work on the synced file structure, as the files are only saved outside that structure. The files app does not recognize these things. [1] The real location of the data could be one of several locations:
Folder-based approach
The main idea behind this approach is to consider a folder (in the user's file tree) to be a cookbook. Details can be discussed (e.g. should cookbooks be allowed to contain other cookbooks?). There are in fact two subtypes. The benefit of this approach is a clear notion of the owner of a recipe.
Sharing the files/folders using the official files app's process
Files/folders are shared using the NC internal functionality. This approach has the drawback that linking a recipe into multiple cookbooks of the same user will not be possible. If cookbooks within cookbooks are allowed, this impact could be reduced by nesting several cookbooks in a matching fashion. [3] As everything is done with the knowledge of the files app, usage of external programs on local files is fulfilled trivially.
Sharing internally in the cookbook app
Here the files are saved in the file structure of the owner. The other apps do not know anything about the shared files, as the sharing is mainly done in the cookbook app. The clear drawback is that the other users do not see the files in their synced local file tree. It might be possible to overcome this by adding a share using the files app. However, this is just an educated guess. As we are not restricted to the possibilities of the files app, more sophisticated sharing might be possible (more tailored towards the cookbook's use cases).
Footnotes
[1] The only exception to this rule I see would be to define a virtual dummy user that holds all files, with these files shared across the NC instance. However, I feel this breaks many architectural decisions made by the core team. So, let's forget that quickly.
[2] This is especially true, as we have no say in changes made to the files through the files app. The user might decide to remove the recipe from his account, thereby rendering it unreachable from all other cookbooks (his and other users').
[3] Something like a cookbook Christmas bakery could be included in the cookbook Baking. If the containing cookbook (Baking) contains the union of all sub-cookbooks, a recipe in Christmas bakery would also appear in Baking. Obviously, this is only possible in a tree-like structure and is thus a certain restriction. |
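Footnote [3] can be sketched as a small tree in which a cookbook reports its own recipes plus those of all nested cookbooks. This is only an illustration of the union rule. The `Cookbook` class and its fields are hypothetical, and recipes are represented by plain names.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Cookbook:
    name: str
    recipes: Set[str] = field(default_factory=set)            # recipes stored directly here
    children: List["Cookbook"] = field(default_factory=list)  # nested cookbooks

    def all_recipes(self) -> Set[str]:
        """Union of this cookbook's recipes and those of all sub-cookbooks."""
        result = set(self.recipes)
        for child in self.children:
            result |= child.all_recipes()
        return result


christmas = Cookbook("Christmas bakery", {"Gingerbread"})
baking = Cookbook("Baking", {"Sourdough"}, [christmas])
print(sorted(baking.all_recipes()))
```

Because each cookbook has at most one parent in a folder tree, a recipe can only propagate upwards along one path, which is the restriction the footnote points out.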
So, I might not have been clear enough on this^^ I really would want to stick with the approach of having all data available in the filesystem (for the requested backup and offline-editing solution) just as you propose. What about this approach:
General setup
All cookbooks are located in
Sharing of whole cookbooks and recipes between users is handled using the NC file app by sharing the respective folder. The functionality for sharing can be exposed in the Cookbook’s interface with setting up the receiving-user’s directory later. I think at this point our approaches are identical and we should probably settle on such an approach.
Internal sharing between cookbooks (not users)
Now for the internal sharing between cookbooks of the same user. The For illustration: Assuming we have four cookbooks Given this setup, the file
while To prevent data loss (if the user accidentally deletes the "single source of truth") we could still have copies of the data (images, etc.) in all of the folders. Also, if a recipe is deleted from one cookbook (i.e., it is not available in the file system anymore) the entry from the synced recipes’ arrays can be deleted. The data of the linked recipes is available in the respective folders anyway. Updating a recipe via the Cookbook app could automatically update all synced recipes. If a recipe is edited only locally (e.g., offline) without updating the recipes in sync, an update via the cookbook app would have to be done manually (or periodically). Which recipe has been updated could be checked by looking at the timestamp of the |
OK, so it seems we are settling on a consensus to use the stock files app to save and share the recipes. The details might be discussed and will most probably be discussed in #120. Regarding the issue with the user removing files, I reconsidered, and maybe you are right: If the user is advanced (or silly) enough to fiddle around with the internal file structure (in destructive ways), he might be on his own. We cannot provide a second, third and fourth backup system in the app. Assuming you have the cookbooks A, B, and C. The recipe should be located in A (just arbitrarily). Then what about the following structure?: In A you have a
And in B and C both only a
The only implication I see here is that a shared recipe would only be in A in the offline cookbook folder. The others would be just arbitrary JSON files as the linking/cloning is not understood. Or we do a periodical sync of all |
There are multiple things.
Your example
If I understand you correctly, your example reflects what I was trying to propose earlier - having the files in a single location and having placeholder/links in the other locations. As you said (and after thinking about this I would second the concern), only having the meta.json that tells you that a recipe is a clone of "A" and not having any data in the folder would collide with the approach of using the NC files app for sharing. If I share a complete cookbook with a second user and the cookbook contains clone references the user who receives the shared cookbook has no access to the recipe data.
Periodic sync
I did also see the problem of a "split-brain/conflict scenario". That is why I suggested to sync recipes based on the timestamp of the It gets tricky if there are changes in both directories. I only see two solutions to this scenario: (a) Show a dialog with the differences to the user and let him decide which ones to keep; (b) unlink the clones, keep all as separate instances and maybe inform the user about this.
Single recipe.json
Having a single recipe.json, a single source for images, etc., as you said, would be ideal. However, I guess, this might be a decision against having linked clones.
Sidenote
I just tried to create a hard link on the file system level, share the link with a user and edit the shared file. The hard link did not survive ;) |
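The timestamp-based sync and its conflict case can be sketched as a small decision function. This assumes that a last successful sync time is stored somewhere (the thread does not say where) and that each clone has one relevant modification time. All names are hypothetical.

```python
from enum import Enum


class SyncAction(Enum):
    NOTHING = "nothing"
    COPY_A_TO_B = "copy_a_to_b"
    COPY_B_TO_A = "copy_b_to_a"
    CONFLICT = "conflict"  # changes on both sides: ask the user or unlink


def decide_sync(mtime_a: float, mtime_b: float, last_sync: float) -> SyncAction:
    """Compare both clones' modification times against the last sync time."""
    a_changed = mtime_a > last_sync
    b_changed = mtime_b > last_sync
    if a_changed and b_changed:
        return SyncAction.CONFLICT
    if a_changed:
        return SyncAction.COPY_A_TO_B
    if b_changed:
        return SyncAction.COPY_B_TO_A
    return SyncAction.NOTHING
```

The `CONFLICT` result is where options (a) and (b) from above would come in. The function only detects the split-brain situation and does not resolve it.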
Yes, mainly.
As I wrote, we must distinguish between sharing between users and linking between cookbooks with one user's data. For the sharing with other users, the whole original folder (with all images, resources and For sharing internally, I see no real chance. A single
That is exactly the split-brain scenario (two independent evolutions of the data).
Yes, this is obvious as there is an additional level of abstraction. This will cause really strange effects, as the core almost certainly does not expect hard links to be present at all. You might end up overwriting the inode and thus changing files "behind the back of NC", opening a complete can of worms.
Not necessarily. We could keep the links within the web view (stored in DB or To get this issue finished, I suggest we get back on track. This whole discussion might well be better located in #120, as it is mainly a discussion of the requirements and wishes regarding sharing. This issue was more related to the file structure within a single recipe. I see the main concerns of this issue as: What files should be saved in which location? What information should be stored where? |
You’re right, we got a little off track. Still, this specific use case might have (had) an influence on the structure of the file storage, so I guess that’s fine. Some last comments:
Recursive access of recipes - definitely. The only question which will arise is how to represent this sensibly in the UI. But that’s not for now. Linkings in the web-view are fine. That’s what I was going for anyway. Let’s just remember: All information that we only keep in the database won’t be available for the requested “easy backup”.
BTT
I think we are closing the circle here. Based on the current file structure, we can simply add a File system level:
Use cases:
Number 4 as described above does not consider recipes stored as links (see 5.1). This would require further (future) investigation. For syncing database/folders, it might be helpful to have a single access point in the file structure. For example, in the NC user root under Recipes/ or a user-defined location. Please correct me if I missed something or got something wrong. |
A new issue #364 came up recently. It should be straightforward to implement with the structure I have in mind: We could have a subfolder
I think this matches with the latest summary from you, @seyfeb. Any comments/enhancements? Regarding the single access point: I'd for now not restrict that. Let's keep it in mind but see how the implementation works out. |
OK, after some discussion in #364 I think I might need to reconsider my structure a bit. I will try to give it a bit of structure.
Recipe folders
Such a folder contains exactly one recipe. The proposed structure is as follows (similar to above):
All files related to a recipe ( New files are the
Cookbook folder
A cookbook is represented by a cookbook folder. Such a folder is special in the sense that it has a hidden folder Here without the content of the folders
Cookbook storage folder
|
These are interesting ideas. The proposal contains two new concepts:
Incremental vs. full storage
I would prefer to have an incremental storage of different versions to prevent too much duplicate data cluttering the system, especially when large or many images are used (e.g., for feature requests requiring more than a single image per recipe), or when video files may be present. However, I lean toward storing changed files as a whole and not as a diff. This might be comparable to something like docker’s layered file system. Some files like images can’t be stored sensibly as diffs anyway. And textual files are pretty small anyway. This would have the advantage that each file is readable as a whole and potential errors when merging diffs won’t be a problem.
Notes
Some questions which were not immediately clear to me. I think the answers are contained in your post, but it may be helpful to have them stated as clearly as possible. I’ll give it a try:
Is this correct? Documentation
I think you have already created a good starting point, although you got me confused for a second with the |
One night later I think it makes sense to name the files in About the incremental vs full storage, you are perfectly right. I would suggest going with incremental as much as possible. Just the very first commit must be full. Especially the binary data I would store completely every time anyway.
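One way to read "incremental" here, following the whole-file (not diff) suggestion from earlier in the thread, is a layered lookup: each version stores only the files that changed, in full, and anything else is resolved from its ancestors. This is a sketch under that assumption. The `Version` class is hypothetical and file deletions are not modelled.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Version:
    # only the files that changed in this version, each stored as a whole
    files: Dict[str, bytes] = field(default_factory=dict)
    parent: Optional["Version"] = None

    def resolve(self, filename: str) -> bytes:
        """Return the newest stored copy of a file, walking up the parents."""
        version: Optional[Version] = self
        while version is not None:
            if filename in version.files:
                return version.files[filename]
            version = version.parent
        raise KeyError(filename)


# the very first commit is full, later ones only carry what changed
first = Version({"recipe.json": b'{"name": "Soup"}', "full.jpg": b"image-v1"})
second = Version({"recipe.json": b'{"name": "Tomato soup"}'}, parent=first)
```

In this example `second` stores only the changed `recipe.json`, while the image is still found through `first`, so no diffs ever need to be merged.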
They are repeated (sort of, see below [1]). This allows read/usage in 3rd party apps on the synced files.
Correct
I am not yet sure about a doubly linked list (aka child commits). This might speed things up when iterating over the whole tree, but it requires careful storage, especially if a branch is cut off (deleted). We will have to keep everything in sync.
Exactly that was the idea behind the structure
Yes, in the I suggest to make at least a message required for human readability if commits are made manually [2]. For the branches I'd say yes as well. In the
Yes that is right unless the recipe should start a new history from scratch. Anyways, we will need a regular garbage collection to remove old entries in the history similar to the current database reindex approach.
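A garbage collection over the history could look roughly like the following mark phase, assuming (this is an interpretation, not something stated above) that "old entries" means commits no longer reachable from any branch tip. Commits are modelled as a mapping from hash to parent hash, and all names are hypothetical.

```python
from typing import Dict, Optional, Set


def unreachable_commits(
    parents: Dict[str, Optional[str]], branch_tips: Dict[str, str]
) -> Set[str]:
    """Return the commit hashes that no branch tip can reach via parent links."""
    reachable: Set[str] = set()
    for tip in branch_tips.values():
        current: Optional[str] = tip
        # walk towards the root; stop early on already visited history
        while current is not None and current not in reachable:
            reachable.add(current)
            current = parents.get(current)
    return set(parents) - reachable
```

Only parent links are needed for this walk, so it works with a singly linked history. The returned set is what a sweep step would then delete from the history storage.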
Here we need to differentiate between the recipe folder and a recipe itself. The recipe folder might be identified by a simplified name of the latest recipe version (replace all special chars with dashes or so to avoid issues with the file name). The recipe is by definition inside the recipe folder. This makes manual identification of the recipes easy for those syncing and running 3rd party code. The history is referenced by the hashes. Thus these serve well for identification. [3]: I'd say (purely linguistically): A version or commit is a change of a recipe that has evolved over time.
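The "replace all special chars with dashes" rule could be as simple as the sketch below. Which characters count as special is not defined in the thread, so the ASCII-only rule, the function name, and the fallback name are assumptions.

```python
import re


def recipe_folder_name(recipe_name: str) -> str:
    """Replace every run of characters outside A-Z, a-z, 0-9 with one dash."""
    slug = re.sub(r"[^A-Za-z0-9]+", "-", recipe_name).strip("-")
    # arbitrary fallback so the folder name is never empty
    return slug or "recipe"


print(recipe_folder_name("Mom's Apple Pie!"))  # Mom-s-Apple-Pie
```

Note that this simple rule also replaces accented letters, so two differently spelled recipes can map to the same folder name. A real implementation would need a collision strategy.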
Yes, any recipes in nested cookbooks have their history in the nested one. So we need to first identify the closest cookbook and then look up the history files there. The only place where I suggest a deeper folder structure is the one I already mentioned:
Some more notes: [1] The history should be considered fixed once the commit has been made. Thus, by diffing the latest history with the work copy under the recipe folder, we can detect if a change was made (just the detection, the next steps to be taken are not yet defined). Here it comes into play that we need to decide whether any change should automatically generate a commit or only if the user clicks on a button or [2] If we do auto-commits, the commit messages might need to be something automatically generated. If the user uses the internal frontend, we can ask for a commit message during saving, but for the 3rd party changes I see no chance to do so at all.
Uups. I updated. |
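The change detection from note [1] (comparing the work copy against the latest commit) might be sketched like this. The hash algorithm, the canonical JSON form, and the `Commit` fields are assumptions for illustration. The thread only says that the history is referenced by hashes.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Optional


def content_hash(recipe: dict) -> str:
    """Hash a canonical JSON form so that key order does not matter."""
    canonical = json.dumps(recipe, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class Commit:
    recipe_hash: str        # hash of the recipe content at commit time
    parent: Optional[str]   # recipe_hash of the parent commit, None for the first
    message: str


def make_commit(recipe: dict, parent: Optional[Commit], message: str) -> Commit:
    parent_hash = parent.recipe_hash if parent is not None else None
    return Commit(content_hash(recipe), parent_hash, message)


def has_uncommitted_changes(work_copy: dict, latest: Commit) -> bool:
    """True if the work copy differs from what the latest commit recorded."""
    return content_hash(work_copy) != latest.recipe_hash
```

A 3rd party edit to the synced `recipe.json` would then show up as `has_uncommitted_changes(...)` returning `True`, at which point an auto-commit with a generated message or a prompt in the frontend could follow.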
So for our current requirements, I see most problems solved with the proposed structure. Problems may arise in the future when trying to implement recipe sharing (#120 and discussion above).
That’s a good idea. I also like your term “variant” for the different branches. Maybe we should name the folders like that - "variants" instead of "versions".
What I don’t get is, what determines the "main recipe" in the recipe folder? Is this not just one of multiple variants? I guess you propose this structure to comply with the old standard and to not break existing Cookbook-viewer clients? The latest commit is based on one of the variants. When you say "the work copy under the recipe folder", are you talking about the corresponding variant? Just to make sure: The recipe in the versions/variants folder is the status at the tip of a branch? What happens if I switch to an older commit of that branch and want to make that the one referenced in the folder? What if I add a new change/commit from there? What about the abandoned commits? Probably at some point this requires some UI to manage the tree.
Open questions
As far as I see, the open questions to be answered for the file structure are
Questions that don’t need to be answered in this thread
|
I think there is a small misunderstanding here. The history can be seen mostly similar to a git history with the tree-like structure (no merges). This is only the Unlike the classical git approach, where I have only one working copy that I can switch between, I am voting to have all branches checked out at the same time. This was due to the intention to allow 3rd party apps to access all branches, not only one of them. How should changes be realized? Checking out another branch may cause trouble when syncing etc. Therefore my intention was for a recipe folder (see this comment's structure for a recipe folder) to guarantee a main variant workspace (think of the genderized git What I meant by my statement with the diffing: For each variant (be it named or We could move the
I think this is not 100% fixed yet. Of course, some UI will be needed. For the NC app the following is valid: I was a bit inspired by onshape (a CAD program I use from time to time for 3D printing). There you can open a side panel where a branch structure of the commits is depicted. I'd suggest adding a link to allow a user to view each individual version and to create a new branch from it. For all branches there needs to be a way to edit each one. But this is only the UI issue around the whole thing. So from a backend perspective, I suggest allowing commits only to branches (no dangling ones). I assume this is what you mean by abandoned commits?
Here come a few suggestions (might be changed again):
Keep it empty for now, ready for any extensions, where we might need additional data
Most probably the required fields in the corresponding
This is the one question that might really be answered during implementation. But it is purely an optimization issue for the backend. |
This PR/issue depends on:
|
Currently the files are structured in a folder of the user. In this folder, one recipe.json and at most two images are located.
I suggest to extend this file/folder structure to allow for more information to be stored while keeping with the main motivation of saving all information as files for easier backup:
I suggest adding one optional file meta.json to each folder. This file can hold different options that are related to other currently open issues here (e.g. #311). The exact structure of this meta file needs to be discussed. As it is only internal to the cookbook app, we are relatively free in the data to be stored there.
This issue should serve as a discussion basis.
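Reading a recipe folder with the optional meta file could be sketched as follows. Since the structure of meta.json is still to be discussed, the sketch treats it as an opaque dictionary. The function name is hypothetical.

```python
import json
from pathlib import Path
from typing import Tuple


def load_recipe_folder(folder: Path) -> Tuple[dict, dict]:
    """Return (recipe, meta); meta is empty if the folder has no meta.json."""
    recipe = json.loads((folder / "recipe.json").read_text(encoding="utf-8"))
    meta_file = folder / "meta.json"
    if meta_file.is_file():
        meta = json.loads(meta_file.read_text(encoding="utf-8"))
    else:
        meta = {}  # the file is optional, so existing folders keep working
    return recipe, meta
```

Because a missing meta.json simply yields an empty dictionary, existing recipe folders remain valid without any migration.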
Depends on #1126