
[Task] Refactor the Account state #433

Closed
4 tasks
JelleMillenaar opened this issue Oct 8, 2021 · 11 comments · Fixed by #453
Labels: Added (A new feature that requires a minor release; part of the "Added" section in the changelog)

Comments

@JelleMillenaar
Collaborator

JelleMillenaar commented Oct 8, 2021

Note: This issue used to be about low-level API access, but this task has moved to #475.

Description

Refactor the Account state to enable lower-level access (#475) to the document stored inside. This means refactoring the account snapshots and their event system to use IotaDocuments to represent the state of an identity. The stored document represents the last resolved Tangle state of the identity. When an account is created, this document is loaded, and any updates to the identity are applied to that in-memory document.

Whether these updates are written into storage or stay temporary until published still needs to be decided. Writing them to storage would enable updates to be applied to the identity locally and published at a later point. The downside is the higher complexity involved: low-level access to the document is not visible to the account, so the account cannot write such changes into storage immediately.

Finally, the refactor should introduce an abstraction that allows for easier implementation of future DID methods.

Motivation

Simplify the internal state representation and take advantage of the account only having to manage one identity now.

To-do list

Create a task-specific to-do list. Please link PRs that match a to-do list item behind that item once the PR has been submitted.

  • Remove IdentitySnapshot and the event/commit system
  • Upgrade IotaDocument with any better APIs from IdentityState
  • Replace the internal state with an IotaDocument and apply all updates to that document
  • Abstract identity management into an interface
@JelleMillenaar JelleMillenaar added the Added A new feature that requires a minor release. Part of "Added" section in changelog label Oct 8, 2021
@PhilippGackstatter
Contributor

State representation

To implement this issue I think we first need to come to a conclusion about how we want to represent the state of a document in the account. The Account currently stores the state of a document as a sequence of events, such as MethodCreated, ServiceDeleted, etc. To create the latest state, all events are folded together to create an IdentitySnapshot.

Another approach is to have two IotaDocuments, one representing the current state, and one to which all updates are applied. Once we want to publish to the tangle, we use the diff of both documents as a diff message, or the second document as an integration message. To implement the low-level API here, we can either give out a mutable reference to the document and let developers mutate it directly, or give out a copy of the document and let developers resubmit that document into the account. Both of those versions are simple to implement with this two-documents approach, whereas it's much harder with the event system. Finally, this approach likely also simplifies synchronizing with the on-tangle state (#140).

The current approach is hard to reconcile with the low-level API access, since the state is not naturally represented as a document, and we cannot simply return a mutable IotaDocument. So the question is whether the event system has advantages over the two documents approach.

The events are used to determine whether a collection of events should be published as a diff message or an integration message. However, iiuc, the logic is basically that it's an integration message if the document was just created, and a diff otherwise. This logic can be used with documents as well.
This kind of event sourcing would also be useful to explore the history of a document. However, we're using the tangle state as a single source of truth for the history API and so that use case also doesn't require it.

So are there any reasons for keeping the event system?

Where is the state?

A second discussion is around where that state is stored. With #436 an account handles only one identity. Loading the IotaDocument (or the IdentityState, which may be required for the chain state) that represents the identity and storing it in the account is feasible now. Then it would be easy to return a mutable document. But it also creates some problems. Suppose we have a pre-existing identity in storage, and we use the same DID to load this state into two accounts. Then both accounts hold the same document, but we can add a service on one of them and delete a method on the other. Now we have two different states of the same identity.

The first question here is whether we want to prevent this or leave it up to downstream developers. I think some in the team have previously argued for the latter. I'm not quite convinced yet that this would be a great API, but can also see that the alternative is more complex.

Just to have it as an alternative, one other approach is to have the state live in the Storage like it does now. Since Storage is shared across accounts, it's possible to update the same identity from different accounts - only if we add (more) synchronization, though. This puts more burden on the Storage to implement that. Storage implementations could check whether a did is already in use and if so, throw an error during account creation. Or return an IotaDocument wrapped in a lock to synchronize access, so multiple accounts could manage the same did.

Any thoughts on whether we should synchronize access to the same identity or leave it up to the developer?

@eike-hass
Collaborator

Regarding the state representation: I agree that, although elegant, the event sourcing is not very useful if it is not used as the source of truth. Is the logic for integration messages really that simplistic? I remember discussions where other cases were mentioned, but maybe they never made it into code. @JelleMillenaar, what do you think?

@cycraig
Contributor

cycraig commented Oct 21, 2021

The events are used to determine whether a collection of events should be published as a diff message or an integration message. However, iiuc, the logic is basically that it's an integration message if the document was just created, and a diff otherwise. This logic can be used with documents as well.

If the Account is changed to hold two documents internally (the last published state and the local state with changes), whether to publish an integration message can be determined by checking if any methods able to sign updates were added/removed.
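To make that check concrete, here is a hedged sketch, with `Method`, `Document`, and `publish_type` all stand-ins for the real IotaDocument types: compare the sets of methods capable of signing updates in the last-published and local documents; any change forces an integration message, otherwise a diff message suffices.

```rust
// Stand-in types; the real crate has richer verification method types.
#[derive(Clone, PartialEq, Debug)]
struct Method {
    fragment: String,
    can_sign_updates: bool,
}

struct Document {
    methods: Vec<Method>,
}

#[derive(Debug, PartialEq)]
enum PublishType {
    Integration,
    Diff,
}

// Collect the fragments of all methods able to sign document updates.
fn signing_methods(doc: &Document) -> Vec<String> {
    doc.methods
        .iter()
        .filter(|m| m.can_sign_updates)
        .map(|m| m.fragment.clone())
        .collect()
}

// Any change to the set of update-signing methods forces an
// integration message; otherwise a diff message suffices.
fn publish_type(last_published: &Document, local: &Document) -> PublishType {
    if signing_methods(last_published) != signing_methods(local) {
        PublishType::Integration
    } else {
        PublishType::Diff
    }
}
```

Adding a non-signing service method would yield `PublishType::Diff`, while adding or removing a signing-capable method yields `PublishType::Integration`.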

Any thoughts on whether we should synchronize access to the same identity or leave it up to the developer?

I think #436 took a stance on this to prevent concurrent access to a DID from more than one (local) Account. Synchronising from the Tangle using multiple (remote) Accounts is a more complex issue but could be easier to deal with if we go with an IotaDocument. Otherwise, I believe we would need some way of converting a document from the Tangle to a set of events, or maybe just fetch the latest version and apply the local events on top of that document? Not sure.

tl;dr I'm in favour of using two documents to represent the internal state in the Account. Synchronising state from the Tangle is still a difficult issue.

Edit: synchronising from the Tangle might actually be more difficult with an IotaDocument. I'm unsure how we would determine if e.g. a service that was removed on the Tangle was added back in the local state or not (i.e. lost updates). In that case maintaining a set of local update events and applying them on top of the latest version would be easier. In light of this, maybe we store one IotaDocument which represents the previously-published state (and can be synchronised) and a set of update events to apply? Then if someone modifies that document unsafely, we just apply the regular updates on top of it, and they can choose whether or not to synchronise with the Tangle explicitly? And we clear the local events after publishing, storing the resulting document as the next source of truth.

@PhilippGackstatter
Contributor

So how the tangle sync works is one of the primary concerns for the state representation, but I don't think we have a clear picture of how that's going to work yet. This feels like a prerequisite for making a good decision here.

@PhilippGackstatter
Contributor

Jelle and I discussed this a bit. The documents approach should work, if the sync works roughly like this: Whenever Account::load_identity is called, the latest state from the tangle is fetched (could be turned off via an option to make non-multi-device setups more efficient). Updates are applied and published to the tangle. If there was an update in the meantime, our update will be ignored due to the previous_message_id ordering. That's not ideal, but more importantly, it doesn't destroy the state on the tangle. This should work that way for both diff and integration message updates.

The local update is essentially lost. If we store updates as events in the store, then we'd have to remove that specific update from the store during the next sync. It seems non-trivial to convert the on-tangle state to events and figure out whether an event should be removed. Having just a document as state means that on the next sync, the local doc is simply overwritten with the latest on-tangle state, and the update is applied again.

Essentially it means: If you have a multi-device setup, you should call Account::sync (or whatever it's called) yourself at appropriate times, so updates aren't lost.

I'm unsure how we would determine if e.g. a service that was removed on the Tangle was added back in the local state or not (i.e. lost updates).

The tangle is the source of truth. Any updates applied to a local state that didn't have that service removed, are ignored on-tangle, and a sync is required before a valid update can be published. Not entirely sure if that answers your point, though.

Aside: It would be nice to have update_identity check whether the update was valid at the time of publication, i.e. that no other message was published with that same previous_message_id. That shouldn't require an entire resolution process, right?

@PhilippGackstatter
Contributor

PhilippGackstatter commented Oct 27, 2021

Since writing the above message, I felt like we should do better than just say "you can publish updates, but they might just be lost". This is mainly an elaboration on Craig's idea with the document and the set of updates.

State representation

As state, we keep in the storage (or in account, see open questions):

  1. the latest state of the identity as an IotaDocument (maybe IdentityState, not super important for this) which we verified in the past is part of the on-tangle state, and
  2. a set of update events (updates) we haven't yet published to the tangle - every time we publish (successfully) the set is cleared.

Synchronization

Synchronization here is defined as the process of exchanging messages between account and Tangle, such that afterwards, both local and on-tangle state are the same.

To that end, we first do a lighter-weight resolution:

As an additional local state, we keep two sets of message ids for the int (seen_int_message_ids) and diff chain (seen_diff_message_ids), that have been synced in the past. These message ids are not fetched in future synchronizations as an optimization. This only spells out the optimization over the normal resolution process.

  1. Query the integration chain index and check if newer integration messages have been published. Only fetch messages not in seen_int_message_ids.
  2. From the latest integration chain message, determine the latest diff chain message. If the integration message did not change, we can use the local seen_diff_message_ids, to only fetch unseen diff messages. Otherwise, that set is cleared and the newer diff messages are queried.
  3. Now we have resolved the latest on-tangle DID document. The on-tangle state always takes precedence, as it is our single source of truth. So, if the local state is older, it is overwritten by the newer on-tangle state.
  4. If the update set is not empty, we attempt to apply the changes to the local document and publish. This process might produce errors, as, for example, a service we wanted to remove may have already been removed on-tangle.

Publish Updates

The publish process itself is also modified, such that updates originally applied to a local state older than what is on the tangle are re-applied to the newer state. The reason is that if we publish updates without first synchronizing, we risk our updates being ignored (because of previous_message_id and milestone ordering). This is why we should ideally check whether the on-tangle state is newer before we publish our updates. Still, for setups where developers know that no other setup publishes updates to that identity, the prior synchronization should be optional. Hence:

  • publish_updates would not include synchronization and work like today
  • Per the definition of synchronization, that process includes publishing updates. Essentially, Account::synchronize would call Account::publish_updates internally.

Low-Level Access

Edit: I mistook EventData::DiffMessage as being able to represent an arbitrary diff message, which is not the case. So to do the following, we'll probably need a new variant in the EventData enum that can represent an arbitrary diff between two documents. EventData::DiffDocument(identity_did::diff::diff_document::DiffDocument) might be possible?

With this state representation, granting low-level access is not quite straightforward. With normal updates, each update produces an Event that is appended to the update set. Exposing a mutable document means the account doesn't know when an update has been made, and so it cannot generate a new event to append to the update set. Instead, on every regular update through update_identity or on publish_updates / synchronize, we'd have to load the old state from storage and check if updates have been made, and if so, append it to the update set. That seems very inefficient, annoying to code and easy to get wrong.

To make this easier, we can also give out a copy of the current account state, let them make changes to it, and resubmit it, in which case we can easily make a diff. This raises other problems such as updating the account in the meantime, which is not a problem with the previous approach, since the document is mutably borrowed and so the account cannot be mutably borrowed to make updates at the same time.

To suffer from neither problem, we could take a closure and let users update the document through it. For example:

account.update_identity_unchecked(|document: &mut IotaDocument| {
    // Do some unchecked things
    document.methods.append(...)
}).await?;

In update_identity_unchecked we can then again simply do a diff of the old and new state and append an arbitrary diff to the update set. That allows doing regular updates, then some unchecked updates, then more regular updates, in any order. It should be flexible enough for our requirements. I can't see any major downside of this closure approach, but maybe I'm missing something.

Notes & Open Questions

  • State location: We could load the latest document into the Account once on load/create, and then apply multiple updates without going through storage, increasing efficiency. The natural place to write updates into storage would then be Account::synchronize. However, dropping the account without synchronizing would result in lost updates (both local and on-tangle), which is a feature or a bug, depending on the perspective.
  • Synchronization: If on-tangle state and local state are equal, then this process should be pretty quick, as no new messages have to be fetched (however, spam is a problem).
  • Low-level access with closure: We might have to come up with another approach for (Wasm) bindings, though, as I'm not sure how easy it is to take a closure there.
  • Does it make sense to combine publishing with synchronizing? Or should synchronizing only mean updating the local state?
  • This can be split across three PRs, state refactor, synchronization and low-level access.

Overall thoughts on proceeding this way? What did I miss?

@cycraig
Contributor

cycraig commented Oct 27, 2021

Thank you for the writeup. I largely agree with most points.

1. State representation:

I agree that storing the latest document and a set of updates seems necessary to prevent lost updates while retaining local changes.

2. Synchronization:

The approach for lighter-weight resolution is a neat optimisation for frequent synchronisations (it might be useful in the resolver too), but I would not implement it until milestone ordering is used for deterministic resolution. I'd rather be slow and correct than fast, but it's still up for discussion.

3. Publish Updates:

I fully agree that synchronisation should be optional to avoid the resolution overhead. I imagine the range of applications which do not encounter distributed document updates is larger, so ensuring they are not negatively impacted is important.

"Does it make sense to combine publishing with synchronizing? Or should synchronizing only mean updating the local state?".

I envisioned that synchronisation would simply replace the local version of the document with whatever is on the Tangle, then a subsequent publish_updates call would attempt to apply the pending updates/commands to that document and publish it. Minor difference in ergonomics but I feel like the separation is better in code, e.g. easier to distinguish resolution errors from the synchronisation versus errors from applying conflicting updates.

4. Low-Level Access:

This seems to be where we diverge.

  • My initial idea was that the user would simply update the local version of the document (that is meant to represent the previous Tangle state), then any updates would be applied on top of that altered document. Can this break a lot of managed logic like enforcing an integration chain update if a signing method is changed? Yes, but that's why the low-level API is meant to be "unsafe" through the Account.
  • I dislike EventData::DiffDocument as it essentially allows two ways to represent any change, as a managed Account update event or as an arbitrary diff, and provides the illusion that the Account can or should handle the low-level changes gracefully.
  • I'm biased against using closures in this manner: it complicates error handling (no ? operator, which is annoying when many of the low-level operations are fallible) and may not translate well into other languages. I stand to be corrected on the latter point; I know closures are possible with wasm_bindgen but haven't seen how they would look in our Wasm API. You also raised this as a note.

Update: After thinking about this more, I agree the EventData::DiffDocument solution is elegant. However, I would like to raise two points:

  1. Both direct mutation of the previous IotaDocument and generating EventData::DiffDocument events only allow mutation of the DID document content - without something like a publish_unchecked() function it is still impossible to change e.g. the previousMessageId, updatedAt, or other automatic metadata set on publish_updates(). Is that fine? I recall the only examples raised of where direct low-level access is needed was to experiment with those fields set automatically.
  2. The alternative to closures would be allowing developers to apply DocumentDiff events directly, or is there a better way?

E.g.

let mut updated_document: IotaDocument = account.previous_document().clone();
updated_document.insert_method(...);
updated_document.insert_service(...);
updated_document.remove_method(...);

let diff: DocumentDiff = account.previous_document().diff(updated_document)?;
account.update_identity_unchecked(diff);
account.publish_updates().await?;

My philosophy with low-level/direct access is that the Account should provide practically all functionality necessary to manage and update a DID document, and if something can't be achieved through it then we should endeavour to add it as a feature if it makes sense. The point is that I believe we should stick to something simple and maintainable for low-level API access since it's not really a "feature" of the Account, it's a way to circumvent it.

Instead, on every regular update through update_identity or on publish_updates / synchronize, we'd have to load the old state from storage and check if updates have been made, and if so, append it to the update set. That seems very inefficient, annoying to code and easy to get wrong.

I don't quite understand this, isn't the old state/previous document already loaded from storage into memory? Edit: if I interpret this correctly, you're right that loading the previous document from storage would be needed to calculate a diff if low-level updates are applied directly to the document rather than as a diff event. I think it would only be necessary on publish_updates, however, and disk access is still generally faster than the network calls needed to publish a document so I would consider this acceptable if necessary.

This can be split across three PRs, state refactor, synchronization and low-level access.

That seems like a decent way of tackling it, they may be on an epic branch but could also be merged into dev separately. I honestly prefer incremental updates to dev to prevent major changes from epic branches compounding and breaking everything at once with large merge conflicts.

@cycraig
Contributor

cycraig commented Oct 27, 2021

Alternative Solution Exploration

The entire idea for the proposed solution (representing the state as a document with update events) revolves around merging local changes with remote changes. What if we disregard that requirement and simply discard local changes on synchronisation? This is basically playing devil's advocate for a bit.

Why disregard merging local updates with the remote document?

Local changes are likely to be short-lived: are there any use-cases for making one or more document updates and not publishing them to the Tangle almost immediately? Any new/updated verification methods cannot/should not be used to sign anything since if they're not published the signature is meaningless, and services are similarly only useful when publicly viewable on the Tangle, so an update would naturally be published as soon as feasible.

In that case, a typical workflow would synchronise first and look something like:

account.synchronise().await?;       // fetch latest document
account.update_identity()...        // make some changes
account.update_identity()...
account.publish_changes().await?;   // publish immediately

The timespan between synchronising and publishing, which is where a conflicting document update to the Tangle from a different agent could occur, is going to be relatively small. I imagine the time between synchronising and publishing here is similar to the proposed workflow of applying updates then synchronising just before publishing, since proof-of-work and retrying takes some time in addition to confirming the message.

account.update_identity()...        // make some changes
account.update_identity()...
account.synchronise().await?;       // fetch latest document
account.publish_changes().await?;   // merge updates and publish, still takes time due to PoW/confirmation

To immediately counter this argument, however, the timespan can be greater when dealing with e.g. lost internet connection on a mobile SSI wallet, or a data centre outage causing updates to be stuck on local storage before being published to the Tangle.

Dual Document State:

This is my previous proposal where two documents represent the Account state: the last Tangle document and the local updated document.

Advantages:

  • updates are immediate rather than deferred, allowing errors to be caught earlier,
  • allows access to a document reference (&IotaDocument) of the latest changes cheaply, whereas the current and proposed solutions need to construct it each time,
  • simplifies updates significantly by leveraging the existing low-level API; commands/events would not be necessary to store but we could still use their builders to maintain the current API.

Disadvantages:

  • impossible to safely synchronise and merge a local document with a remote document due to the possibility for lost updates (e.g. adding back a removed service, or un-revoking a MerkleKeyCollection index due to the bitmap being unaltered locally), so synchronisation would have to overwrite the local document to be "safe".
  • since updates are immediate, dropping the local document means losing any newly-generated private keys in the storage, unless it tracks them somehow to clear them. I'm actually not sure how the proposed solution would handle this either: if update events need to be dropped due to conflicts with a synchronised remote document, would only the conflicting update events be dropped, so that new verification methods would still be added?

NOTE

I believe both solutions suffer from the possibility for "lost" messages where:

  • update messages are published almost at the same time by different agents, so they cannot reference each other and one "wins" over the other, or
  • one agent publishes a diff-chain update while another publishes an integration chain update.

Coordination in distributed systems is a difficult problem; without some sort of locking mechanism we cannot eliminate these problems entirely.

To be clear: I am still partially in favour of the proposed solution (compared to the alternative solution with two documents), this was just to explore what would change if we dropped a certain requirement and could simplify the Account.

@PhilippGackstatter
Contributor

Synchronization & State

You raise some excellent points there, thanks a lot! I agree that the assumption for a need to merge changes is moot, considering the timespan in the general case. In that case I no longer see a good reason to keep the event system, and am in favour of using only documents as state. I'm glad you brought this up, since we might be able to get rid of the event system completely and work with diffs only. I also like that your alternative solution is easier to understand from an outside perspective, as synchronizing, updating and publishing are their own methods, with the slight cost of potentially forgetting them.

In terms of where the state lives, we may want to have it in the account and storage at the same time. We write to storage on every update, but that doesn't necessarily get propagated to disk, if I recall correctly. To be more specific, we'd have two documents in storage as state, the tangle doc and the updated one, but the account only needs to hold the latest doc. In the account we can then grant access to the latest document, as you mentioned, and we can apply the unchecked updates directly to that, too.
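As a rough sketch of this dual-document state (all names here are stand-ins, assuming storage keeps both documents while the account works with the latest one):

```rust
// `Doc` is a stand-in for IotaDocument.
#[derive(Clone, PartialEq, Debug)]
struct Doc {
    services: Vec<String>,
}

// Hypothetical storage-side state: the last-published Tangle document
// plus the locally updated one.
struct IdentityState {
    tangle_doc: Doc, // last state confirmed on the Tangle
    local_doc: Doc,  // tangle_doc plus unpublished updates
}

impl IdentityState {
    // A fetch overwrites local state, since the Tangle is the single
    // source of truth.
    fn overwrite_with_tangle(&mut self, resolved: Doc) {
        self.tangle_doc = resolved.clone();
        self.local_doc = resolved;
    }

    // After a successful publish, the local document becomes the new
    // Tangle document.
    fn commit_publish(&mut self) {
        self.tangle_doc = self.local_doc.clone();
    }

    // Pending updates exist whenever the two documents differ.
    fn has_pending_updates(&self) -> bool {
        self.tangle_doc != self.local_doc
    }
}
```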

Nit: Not sure if I like the name "synchronization" in that case, since that implies for me that it's a bidirectional exchange of messages, not just one-way. Fetch or pull might be more appropriate. With these method names I would be less surprised if my local state was overwritten. Not too strong of an opinion, though :).

I believe both solutions suffer from the possibility for "lost" messages

Absolutely. It's just a tiny bit less likely to happen when we can guarantee that synchronizing and publishing happen one after the other, rather than giving users the responsibility of synchronizing and applying updates in time. But as you've shown, it doesn't matter for the majority of cases.

since updates are immediate, dropping the local document means losing any newly-generated private keys in the storage

I feel like this is less of a problem with your solution, actually. Since you just synchronized, it's very unlikely that an update you applied has to be dropped. In the merge solution, this is more likely to happen, and I had not thought about how to deal with this. I'm not sure we need to fix this, given how unlikely it is. Having a few stray private keys in a stronghold isn't a big deal, I think.

Low-Level Access

I will skip over some of the points you raised for EventData::DiffDocument since the event system seems to have no merits left, or does it?

Overall, the document-based state makes low-level access easier so that's an improvement. If we do want to write the latest unchecked update to storage though, we have the same issues as I described in the latest of my proposals. I still think the closure approach is not too bad.

it complicates error handling (no ? operator, which is annoying when many of the low-level operations are fallible)

The closure can return an error too, which in turn propagates out of update_identity_unchecked. I.e. the signature would be:

fn update_identity_unchecked<F: Fn(&mut IotaDocument) -> identity_account::errors::Result<()>>(&mut self, f: F) -> identity_account::errors::Result<()>

so ? is possible.
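A minimal self-contained sketch of that point, using stub types rather than the real identity_account API: the closure returns a Result, so `?` works inside it and errors propagate out of the call.

```rust
// Stand-ins for the real error and document types.
#[derive(Debug)]
struct Error(String);

struct Document {
    methods: Vec<String>,
}

struct AccountStub {
    document: Document,
}

impl AccountStub {
    // The closure's error type matches the method's, so `?` inside the
    // closure propagates out of update_identity_unchecked.
    fn update_identity_unchecked<F>(&mut self, f: F) -> Result<(), Error>
    where
        F: FnOnce(&mut Document) -> Result<(), Error>,
    {
        // A real implementation would snapshot the old state here and
        // diff it against the mutated document afterwards.
        f(&mut self.document)
    }
}

// A fallible helper to demonstrate `?` inside the closure.
fn parse_fragment(raw: &str) -> Result<String, Error> {
    if raw.starts_with('#') {
        Ok(raw.to_string())
    } else {
        Err(Error(format!("invalid fragment: {}", raw)))
    }
}
```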

it is still impossible to change e.g. the previousMessageId, updatedAt, or other automatic metadata set on publish_updates(). Is that fine?

Honestly, I don't know. The only example I know of is setting the previous message id as you mentioned, but it'd be really helpful to have a few more examples of what unchecked updates should and shouldn't allow. That being said, if we allow arbitrarily modifying a document and add a publish_unchecked function, that simply publishes the diff without setting any other fields automatically, that might allow for most of it? That function could also take some optional parameters, if they make sense, such as the index to publish on, whether it's a diff or int message, etc. But again, I'm mostly speculating on this as the need is unclear.
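For illustration only, a speculative sketch of what such a publish_unchecked could look like. Every name and parameter here is an assumption, not the real API; the point is just that the raw diff is submitted as-is, with optional overrides for fields that publish_updates would otherwise set automatically.

```rust
// Hypothetical options for fields normally set automatically.
#[derive(Default)]
struct PublishOptions {
    index: Option<String>,        // override the tangle index to publish on
    as_integration: Option<bool>, // force an integration (or diff) message
}

// A raw, unvalidated diff payload.
struct RawDiff {
    json: String,
}

struct AccountStub;

impl AccountStub {
    // Publish the diff without setting previous_message_id, updated
    // timestamps, or other metadata automatically.
    fn publish_unchecked(&self, diff: &RawDiff, opts: &PublishOptions) -> Result<String, String> {
        // A real implementation would submit `diff` to the Tangle here;
        // this sketch just reports what would be published.
        let kind = match opts.as_integration {
            Some(true) => "integration",
            _ => "diff",
        };
        Ok(format!("published {} bytes as {} message", diff.json.len(), kind))
    }
}
```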

The alternative to closures would be allowing developers to apply DocumentDiff events directly, or is there a better way?

While we want to allow access to low-level, unsafe APIs, I would still argue we should do what we can to make them as easy to use as possible. Diffs are an API devs need to learn, and it seems like something we should be able to wrap. Rather than going in this direction, I would still prefer the other options, like returning a copy of a document, letting developers mutate it arbitrarily, and publishing it unchecked (which just does the diff internally). Here, we just need to be very clear about what invariants we expect the caller to uphold (because simultaneous updates to the account can be bad).

To summarize, I'm in favor of the dual document state. I would like to get started with the implementation, so if there is no disagreement, I would go ahead. We can still continue to discuss the synchronization and the low-level API.

@cycraig
Contributor

cycraig commented Oct 28, 2021

Synchronization & State

I also like that your alternative solution is easier to understand from an outside perspective, as synchronizing, updating and publishing are their own methods, with the slight cost of potentially forgetting them.

We could still have updates publish immediately, but honestly I think that's just a convenience, and explicitly choosing when to publish isn't that much of a hassle. I feel like "forgetting to publish" is just a bug by any other name, and when to synchronise depends on the application.
So would we do away with lazy/batch publishing etc. in the Account and always make publishing explicit?

Nit: Not sure if I like the name "synchronization" in that case, since that implies for me that it's a bidirectional exchange of messages, not just one-way. Fetch or pull might be more appropriate. With these method names I would be less surprised if my local state was overwritten. Not too strong of an opinion, though :).

I think pull implies automatic merging from how it works in git. I'm in favour of fetch or synchronise.

Having a few stray private keys in a stronghold isn't a big deal, I think.

Maybe we could purge any key locations for methods no longer in the new document? I think "just leave them there" is a valid option since discovering my Stronghold keys were deleted automatically may be a bit of a shock.
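For concreteness, a purge along those lines could be as simple as diffing the verification method fragments of the old and new documents; here's a rough sketch (the function and its inputs are illustrative stand-ins, not the real storage API):

```rust
use std::collections::HashSet;

// Hypothetical sketch: `old_methods`/`new_methods` stand in for the
// verification method fragments of the previously stored document and
// the newly resolved one. Fragments present in the old document but
// absent from the new one are candidates for key-location purging.
fn stale_key_locations(old_methods: &[&str], new_methods: &[&str]) -> Vec<String> {
    let new_set: HashSet<&str> = new_methods.iter().copied().collect();
    old_methods
        .iter()
        .filter(|fragment| !new_set.contains(**fragment))
        .map(|fragment| fragment.to_string())
        .collect()
}

fn main() {
    let old = ["#sign-0", "#auth-1", "#kex-2"];
    let new = ["#sign-0", "#kex-2"];
    // "#auth-1" is gone from the resolved document, so its key location
    // could be purged from the Stronghold (or deliberately left in place).
    assert_eq!(stale_key_locations(&old, &new), vec!["#auth-1".to_string()]);
}
```

Whether we then delete those locations or just report them is exactly the "shock" trade-off above.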

Low-Level Access

I will skip over some of the points you raised for EventData::DiffDocument since the event system seems to have no merits left, or does it?

I'm still nervous about going all-in on simplifying the Account and removing events entirely. If we decide we do want to handle merging in the future, events will be necessary.

If we do want to write the latest unchecked update to storage though, we have the same issues as I described in the latest of my proposals. I still think the closure approach is not too bad.

In the dual-document approach we don't need to create the IotaDocument from events and generate a diff from it, the developer can just mutate the local document directly. Is the issue that they need some way to commit those changes to storage afterwards, and the closure approach would do it automatically for them? I think a closure is less necessary but we can see how it works out, particularly in the Wasm bindings.

That being said, if we allow arbitrarily modifying a document and add a publish_unchecked function that simply publishes the diff without setting any other fields automatically, that might cover most use cases.

Yes, I think publish_unchecked might be necessary to allow mutating the metadata unsafely.

That function could also take some optional parameters, if they make sense, such as the index to publish on, whether it's a diff or integration message, etc. But again, I'm mostly speculating here, as the need is unclear.

I don't think we need to go that far; we'll have to see on a case-by-case basis. On a related note, we probably need more options for the regular Account publishing too: e.g. which method to use for signing, and whether to force an integration chain update (maybe, but not to force a diff update, since that could break the specification).
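Something minimal along these lines might already be enough (all field and type names here are placeholders, not a proposed identity.rs API):

```rust
// Illustrative sketch of publishing options along the lines discussed
// above; names are assumptions, not the actual identity.rs API.
#[derive(Debug, Default)]
struct PublishOptions {
    // Fragment of the verification method to sign the update with;
    // `None` lets the account fall back to its default signing method.
    sign_with: Option<String>,
    // Force publishing to the integration chain instead of the diff chain.
    // (The reverse - forcing a diff update - is deliberately not offered,
    // since it could break the specification.)
    force_integration_update: bool,
}

fn main() {
    // Default: let the account pick the signing method and chain.
    let defaults = PublishOptions::default();
    assert!(defaults.sign_with.is_none());
    assert!(!defaults.force_integration_update);

    // Explicitly sign with a specific method and force an integration update.
    let opts = PublishOptions {
        sign_with: Some("#sign-0".into()),
        force_integration_update: true,
    };
    assert_eq!(opts.sign_with.as_deref(), Some("#sign-0"));
}
```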

Conclusion

I think we're mostly aligned now?

  • State: maintain two documents: the last resolved document from the Tangle (or, more likely, from the integration and diff chains), and a copy of that document with local changes applied.
  • Synchronising: resolve the latest version of the document from the Tangle and replace both local documents with it, overwriting any local changes. This operation may be optimised, but probably not until milestone queries are implemented.
  • Publishing: publish the local document with its changes to the Tangle; it then also replaces the last resolved document.
  • Low-Level Access: return a mutable document (and metadata once that's implemented, with or without a closure) to the developer to update as needed; add publish_unchecked to bypass the regular signing, metadata updates (previousMessageId, updatedAt, etc.) and other checks.
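For concreteness, the state model in those bullet points could be sketched roughly like this (every name here is an illustrative stand-in, not the actual identity.rs API; the real `publish` would diff, sign, and send a message rather than just clone):

```rust
// Minimal sketch of the agreed dual-document state model.
#[derive(Clone, Debug, PartialEq)]
struct Document {
    services: Vec<String>, // stand-in for real DID document contents
}

struct AccountState {
    resolved: Document, // last state resolved from the Tangle
    local: Document,    // resolved state plus unpublished local changes
}

impl AccountState {
    // Synchronising replaces both documents, discarding local changes.
    fn synchronise(&mut self, from_tangle: Document) {
        self.resolved = from_tangle.clone();
        self.local = from_tangle;
    }

    // Low-level access: hand out the local document for arbitrary mutation.
    fn document_mut(&mut self) -> &mut Document {
        &mut self.local
    }

    // Publishing pushes the local changes to the Tangle (elided here) and
    // makes the local document the new last-resolved state.
    fn publish(&mut self) {
        self.resolved = self.local.clone();
    }
}

fn main() {
    let base = Document { services: vec![] };
    let mut state = AccountState { resolved: base.clone(), local: base };

    state.document_mut().services.push("#linked-domain".into());
    assert_ne!(state.resolved, state.local); // unpublished local change

    state.publish();
    assert_eq!(state.resolved, state.local); // publish aligns both documents

    state.synchronise(Document { services: vec![] });
    assert!(state.local.services.is_empty()); // sync overwrote local changes
}
```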

Edit:

I felt like we should do better than just say "you can publish updates, but they might just be lost".

We are essentially saying this now. However, as previously mentioned, the possibility for lost updates due to distributed updates to a DID document is something developers will need to deal with in any case.

@PhilippGackstatter

So would we do away with lazy/batch publishing etc. in the Account and always make publishing explicit?

Yes, I'm perfectly fine with that.

If we decide we do want to handle merging in the future, events will be necessary.

I'm not sure actually, since there is merge functionality in the lower-level API, too. We could still hold a list of diffs that may have been applied to another document, and try to reapply (identity_diff::traits::Diff::merge) them to a newer one. Although I suspect this would be less robust than the event system.
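For illustration, "reapplying stored diffs to a newer document" could look something like the following, with a diff reduced to a plain update function; this is a toy stand-in for the idea, not the real identity_diff::traits::Diff::merge machinery:

```rust
#[derive(Clone, Debug, PartialEq)]
struct Document {
    services: Vec<String>, // stand-in for real DID document contents
}

// A stored "diff" reduced to a reapplicable update function.
type Update = fn(&mut Document);

// Reapply a list of pending diffs, in order, on top of a newer document.
fn reapply(mut doc: Document, diffs: &[Update]) -> Document {
    for diff in diffs {
        diff(&mut doc);
    }
    doc
}

fn main() {
    // A local change recorded against an older document version.
    let add_service: Update = |d| d.services.push("#svc".into());

    // A newer document resolved from the Tangle in the meantime.
    let newer = Document { services: vec!["#existing".into()] };

    let merged = reapply(newer, &[add_service]);
    assert_eq!(
        merged.services,
        vec!["#existing".to_string(), "#svc".to_string()]
    );
}
```

The fragility is exactly what's noted above: a stored diff may silently no longer make sense against the newer document, which the event system could detect more robustly.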

Is the issue that they need some way to commit those changes to storage afterwards, and the closure approach would do it automatically for them?

Yes, but now I understand your point about only needing to do that in publish_updates. My point only applies to the event system, my bad. Without the merge functionality, we could reasonably say that any changes to the document (regular or unchecked) loaded into the account aren't written to storage until published. These changes are meaningless until published, and without merging we don't need to store updates until they are actually published. This would also significantly reduce how often we write to storage. Dropping the account gets rid of the updates, but that is much less surprising without merge functionality, imo.

To summarize: just returning a &mut IotaDocument works (in the dual-document approach), so there's no need for the closure. Basically, what you wrote in the conclusion but without the closure part :).

On a related note, we probably need more options for the regular Account publishing too

I think we're mostly aligned now?

👍

We are essentially saying this now.

I know. Your point about the timespans showed that choosing the more complex system with merging has almost the same problems as the one without, so I'm okay with going the simpler route.

@PhilippGackstatter PhilippGackstatter changed the title [Task] Provide low-level API access from within Account [Task] Refactor the Account state Nov 1, 2021