Improving metadata handling #1669
Comments
That makes a lot of sense, I mean not creating commits to update metadata. If you need to store more complex data, you could use the description of the pull/merge request.
To @timaschew's point, there is a technique you can use to append data to a PR description which is visible only when editing the description. In fact, we're already using it in our templates - HTML comments. A tagged JSON object in an HTML comment in the (already auto-generated) PR description would seem to be a good fit for storing metadata. It's visible to the users when necessary, but hidden when irrelevant. The amount of metadata we'd need to store would be reduced if it's stored directly on the PR, since a lot of the existing metadata only serves to help identify which PR corresponds to each unpublished entry, or is an outright duplicate of information already provided by requesting the PR (e.g. usernames when using the We may have to introduce more HTTP requests to check cached things like the title and description if we want to minimize the amount of data we store in the PR description, but we will also be able to eliminate many requests for metadata since the metadata will already be in the PR description. Additionally, for PRs from forks, descriptions (as well as the contents of branches in PRs with "allow edits by maintainers" checked) have the unique property of being editable by both the user who created them and the maintainers of the repo. This is not true of labels, which are only editable by users who have write access to the repo. The only case this approach doesn't address is unpublished changes without a corresponding PR, which don't exist currently and will only exist in the fork workflow (drafts in the fork workflow don't create PRs right away). I think this case can be handled either with an approach similar to our current metadata (but much simplified) or we can simply infer or request all the required metadata for that edge case.
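The HTML-comment idea above can be sketched roughly as follows. This is only an illustration: the `netlify-cms-metadata` tag and the `readMetadata`/`writeMetadata` helpers are hypothetical names, not something the thread or the CMS defines.

```typescript
// Hypothetical tag used to find the metadata comment; the thread does not specify one.
const METADATA_TAG = "netlify-cms-metadata";

type PrMetadata = Record<string, unknown>;

// Matches <!-- netlify-cms-metadata {...} --> anywhere in the description.
const METADATA_PATTERN = new RegExp(
  `<!--\\s*${METADATA_TAG}\\s+([\\s\\S]*?)\\s*-->`
);

// Returns the stored object, or null when the block is absent or malformed.
function readMetadata(description: string): PrMetadata | null {
  const match = METADATA_PATTERN.exec(description);
  if (!match) return null;
  try {
    const parsed: unknown = JSON.parse(match[1] ?? "");
    return typeof parsed === "object" && parsed !== null && !Array.isArray(parsed)
      ? (parsed as PrMetadata)
      : null;
  } catch {
    // A hand-edited, broken block is treated as "no metadata".
    return null;
  }
}

// Replaces the existing block, or appends one to the end of the description.
function writeMetadata(description: string, metadata: PrMetadata): string {
  // Escape ">" so a string value containing "-->" cannot close the comment early.
  const json = JSON.stringify(metadata).replace(/>/g, "\\u003e");
  const block = `<!-- ${METADATA_TAG} ${json} -->`;
  if (METADATA_PATTERN.test(description)) {
    // A replacer function avoids "$" in the JSON being read as a replacement pattern.
    return description.replace(METADATA_PATTERN, () => block);
  }
  return `${description.replace(/\s+$/, "")}\n\n${block}`;
}
```

Treating a malformed block as missing means a manual edit to the description can never leave the CMS in a state it cannot recover from; it would simply fall back to inferring or rewriting the metadata.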
Eager to see this pave the way for #568
Below is a short proposal on how to make our metadata handling less error-prone by using it less and making it more obvious. The purpose of this proposal is to get feedback from the community.
Overview
Netlify CMS has a few kinds of metadata, and they're all called "metadata", unfortunately. This issue deals with the highest level of metadata, which is used to provide state for the editorial workflow and is only supported by the GitHub backend (currently).
This metadata is kept in a `_netlify_cms` prefixed orphan ref and consists of a separate JSON file for each editorial workflow item.
How it works
A current metadata entry will look something like this (prettified):
Netlify CMS knows which editorial workflow entries exist by checking for branches prefixed with `cms/` and then checking for corresponding metadata files that look like the one above. Most of the metadata above is just copied from GitHub's API response for the pull request's data.
What needs fixin'
The problem with this approach is that the metadata files must be kept in sync with the pull requests they represent. We've seen this break in two ways:
Manual pull request edits should not cause any issues at all, and bugs in metadata handling shouldn't be able to cause problems that a subsequent fix can't recover from.
Proposal
It'd be cool if we could:
Also keep in mind that, while this kind of metadata currently only serves the editorial workflow, it could serve lots of purposes in the future.
Inferred metadata
Most of our current metadata can be inferred directly from a given PR. Inferring metadata for a workflow entry in GitHub, for example, can look like:
cms/
That's it! The pull request data gives us what we need to infer the collection and title, and there's no metadata to keep synced. The only thing we can't track this way is workflow status.
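A minimal sketch of that inference, under some assumptions: the `PullRequest` and `ChangedFile` shapes keep only a few fields that GitHub's REST API returns for a pull request and its file list, `Collection` mirrors the `name` and `folder` keys of the CMS config, and `inferEntry` is a hypothetical helper. Matching the changed file path against collection folders, and using the PR title as the entry title, are illustrative guesses rather than the proposal's stated method.

```typescript
// Subset of GitHub's pull request response used here.
interface PullRequest {
  number: number;
  title: string;
  head: { ref: string }; // branch name, e.g. "cms/my-post"
  user: { login: string };
}

// Subset of an item from the pull request's changed-files list.
interface ChangedFile {
  filename: string;
}

// Mirrors the `name` and `folder` keys of a collection in the CMS config.
interface Collection {
  name: string;
  folder: string;
}

interface InferredEntry {
  slug: string;
  collection: string;
  title: string;
  author: string;
  pullRequest: number;
}

const BRANCH_PREFIX = "cms/";

// Returns null when the PR is not a workflow entry or no collection matches.
function inferEntry(
  pr: PullRequest,
  files: ChangedFile[],
  collections: Collection[]
): InferredEntry | null {
  if (pr.head.ref.indexOf(BRANCH_PREFIX) !== 0) return null;
  const slug = pr.head.ref.slice(BRANCH_PREFIX.length);

  for (const collection of collections) {
    // Normalize "content/posts" and "content/posts/" to the same prefix.
    const folder = collection.folder.replace(/\/+$/, "") + "/";
    for (const file of files) {
      if (file.filename.indexOf(folder) === 0) {
        return {
          slug,
          collection: collection.name,
          title: pr.title, // assumption: the PR title doubles as the entry title
          author: pr.user.login,
          pullRequest: pr.number,
        };
      }
    }
  }
  return null;
}
```

Because everything is derived on demand from the branch and the pull request, a manual edit to the PR changes what gets inferred but cannot leave a stale copy behind.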
Explicit metadata
Arbitrary data such as editorial workflow entry status can be handled by an explicit metadata concept we can call "annotations". For now I'd expect this annotations concept to only apply to unpublished entries. In GitHub, they would ideally be expressed as pull request labels. By default we'd use something like `netlifycms/draft`, `netlifycms/review`, etc. A user could manually add and remove these labels from GitHub without consequence, and the CMS can automatically fix any idiosyncrasies it finds (like having two conflicting status labels). These labels could also be customized via config.
The risk of someone changing a label should not be viewed as a risk at all - it's a feature. If this kind of ephemeral metadata is lost, the damage is limited as all unpublished entries are still in place, and only need their statuses updated. It's also not expected that annotations would be easily lost in the first place.
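One way the label handling could work is sketched below, with several assumptions: the `ready` status and its label are hypothetical (the proposal only names `netlifycms/draft` and `netlifycms/review`), and the repair policy of keeping the earliest status in the workflow and defaulting to draft is just one possible choice. `resolveStatus` is an illustrative helper, not an existing CMS function.

```typescript
// "ready" is a hypothetical third status; only draft and review appear in the proposal.
type Status = "draft" | "review" | "ready";

const LABEL_PREFIX = "netlifycms/";

// Workflow order, earliest first. Used to pick a winner among conflicting labels.
const STATUS_ORDER: Status[] = ["draft", "review", "ready"];

interface StatusResolution {
  status: Status;
  labelsToRemove: string[]; // conflicting status labels the CMS should delete
  labelToAdd: string | null; // set when the PR carries no status label at all
}

// Derives the workflow status from a PR's labels and reports how to repair them.
function resolveStatus(labels: string[]): StatusResolution {
  const found = STATUS_ORDER.filter(
    (status) => labels.indexOf(LABEL_PREFIX + status) !== -1
  );

  // Assumed policy: with no status label, fall back to draft;
  // with several, keep the earliest one in the workflow.
  const status: Status = found[0] ?? "draft";

  return {
    status,
    labelsToRemove: found.slice(1).map((extra) => LABEL_PREFIX + extra),
    labelToAdd: found.length === 0 ? LABEL_PREFIX + status : null,
  };
}
```

Labels outside the `netlifycms/` namespace are never touched, so users can keep labelling pull requests for their own purposes.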
Hidden metadata
Hidden metadata shouldn't be necessary yet, but it would apply for performance operations like creating and caching thumbnails, because Netlify CMS can recreate and push up new thumbnails if they ever disappear. That kind of metadata doesn't need to be explicit or visible because it isn't critical.