
Commit 4eb824f

feat(data): Datastore Docs (#9753)
1 parent 0c9c401 commit 4eb824f

14 files changed

+476
-0
lines changed

README.md

+4
@@ -128,3 +128,7 @@ If you can't migrate to [aws-sdk-js-v3](https://github.com/aws/aws-sdk-js-v3) or
```js
import { Auth } from 'aws-amplify';
```

### DataStore Docs

For more information on contributing to DataStore / how DataStore works, see the [DataStore Docs](packages/datastore/README.md).

packages/datastore/README.md

+154
@@ -0,0 +1,154 @@
1+
# AWS Amplify DataStore Docs
2+
3+
[Amplify DataStore](https://docs.amplify.aws/lib/datastore/getting-started/q/platform/js/) provides a programming model for leveraging shared and distributed data without writing additional code for offline and online scenarios, which makes working with distributed, cross-user data just as simple as working with local-only data.
4+
5+
---
6+
7+
| package | version | open issues | closed issues |
8+
| ---------------------- | --------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
9+
| @aws-amplify/datastore | ![npm](https://img.shields.io/npm/v/@aws-amplify/datastore.svg) | [![Open Issues](https://img.shields.io/github/issues/aws-amplify/amplify-js/DataStore?color=red)](https://github.com/aws-amplify/amplify-js/issues?q=is%3Aissue+label%3ADataStore+is%3Aopen) | [![Closed Issues](https://img.shields.io/github/issues-closed/aws-amplify/amplify-js/DataStore)](https://github.com/aws-amplify/amplify-js/issues?q=is%3Aissue+label%3ADataStore+is%3Aclosed) |
10+
11+
---
12+
13+
## **👋 Note For Contributors: 👋**
14+
15+
_**Please update these docs any time you find something that is incorrect or lacking. In particular, if a line in the docs prompts a question, take a moment to figure out the answer, then update the docs with the necessary detail.**_
16+
17+
---
18+
19+
## Getting Started
20+
21+
Before you start reading through these docs, take a moment to understand [how DataStore works at a high level](https://docs.amplify.aws/lib/datastore/how-it-works/q/platform/js/). Additionally, we recommend first reading through [docs.amplify.aws](https://docs.amplify.aws/lib/datastore/getting-started/q/platform/js/). The purpose of these docs is to dive deep into the codebase itself and understand the inner workings of DataStore for the purpose of contributing. Understanding these docs is **not** necessary for using DataStore. Lastly, before reading, take a look at [the diagrams below](#diagrams).
22+
23+
---
24+
25+
## Docs
26+
27+
- [Conflict Resolution](docs/conflict-resolution.md)
28+
- [Contributing](docs/contributing.md)
29+
- [DataStore Lifecycle Events ("Start", "Stop", "Clear")](docs/datastore-lifecycle-events.md)
30+
- This explains how DataStore fundamentally works, and is a great place to start.
31+
- [Getting Started](docs/getting-started.md) (Running against a sample app, etc.)
32+
- [Namespaces](docs/namespaces.md)
33+
- [How DataStore uses Observables](docs/observables.md)
34+
- [Schema Changes](docs/schema-changes.md)
35+
- [Storage](docs/storage.md)
36+
- [Sync Engine](docs/sync-engine.md)
37+
- ["Unsupported hacks" / workarounds](docs/workarounds.md)
38+
39+
---
40+
41+
# Diagrams
42+
43+
_Note: relationships with dotted lines are explained more in a separate diagram._
44+
45+
## How the DataStore API and Storage Engine Interact
46+
47+
```mermaid
flowchart TD
  %% API and Storage
  api[[DS API]]-- observe -->storage{Storage Engine}
  storage-- next -->adapter[[Adapter]]
  adapter-->db[[Local DB]]
  db-->api
  sync[[Sync Engine*]]-.-storage
  sync-.-appSync[(AppSync)]
```
57+
58+
## How the Sync Engine Observes Changes in Storage and AppSync
59+
60+
_Note: All green nodes belong to the Sync Engine._
61+
62+
\* Merger first checks outbox
63+
64+
\*\* Outbox sends outgoing messages to AppSync
65+
66+
```mermaid
flowchart TD

subgraph SyncEngine
index{index.ts}-- observe -->reach[Core reachability]

subgraph processors
mp[Mutation Processor]
sp[Subscription Processor]
syp[Sync Processor]
end

reach--next-->mp[Mutation Processor]
reach--next-->sp[Subscription Processor]
reach--next-->syp[Sync Processor]

subgraph outbox / merger
outbox[Outbox]
merger[Merger]
outbox---merger
end

end

api[DS API]-.->storage
mp-- 1. observe -->storage{Storage Engine}
storage-- 2. next -->merger[merger*]-- next -->storage

sp-- observe -->appsync[(AppSync)]
appsync-- next -->sp

syp---appsync

mp-->outbox[outbox**]

appsync<--->outbox
%% styling
classDef syncEngineClass fill:#8FB,stroke:#333,stroke-width:4px,color:#333;
class index,mp,sp,syp,merger,outbox syncEngineClass;
```
107+
108+
---
109+
110+
# Project Structure
111+
112+
<pre>
amplify-js/packages/datastore/src
├── authModeStrategies
│   └── defaultAuthStrategy.ts
│   └── index.ts
│   └── multiAuthStrategy.ts
├── datastore
│   └── datastore.ts # Entry point for DataStore
├── predicates
│   └── index.ts
│   └── sort.ts
├── ssr
├── storage # Storage Engine
│   └── adapter # Platform-specific Storage Adapters
│       └── getDefaultAdapter
│       └── AsyncStorageAdapter.ts
│       └── AsyncStorageDatabase.ts
│       └── index.ts
│       └── IndexedDBAdapter.ts
│       └── InMemoryStore.native.ts
│       └── InMemoryStore.ts
│   └── storage.ts # Entry point for Storage
├── sync # Sync Engine
│   └── dataStoreReachability
│       └── index.native.ts
│       └── index.ts
│   └── processors # Sync Engine Processors
│       └── mutation.ts
│       └── subscription.ts
│       └── sync.ts
│   └── datastoreConnectivity.ts # Subscribe to reachability monitor
│   └── index.ts # Entry point for Sync Engine
│   └── merger.ts # <a href="https://github.com/aws-amplify/amplify-js/blob/datastore-docs/packages/datastore/docs/sync-engine.md#merger" title="merger doc">doc</a>
│   └── outbox.ts # <a href="https://github.com/aws-amplify/amplify-js/blob/datastore-docs/packages/datastore/docs/sync-engine.md#outbox" title="outbox doc">doc</a>
</pre>
147+
148+
---
149+
150+
## Other Resources:
151+
152+
- [High-level overview of how DataStore works](https://docs.amplify.aws/lib/datastore/how-it-works/q/platform/js/)
153+
- [DataStore Docs](https://docs.amplify.aws/lib/datastore/getting-started/q/platform/js/)
154+
- [re:Invent talk](https://www.youtube.com/watch?v=KcYl6_We0EU)
@@ -0,0 +1,8 @@
1+
# Conflict Resolution
2+
- **AppSync is the source of truth for conflict resolution**
3+
1. In the event AppSync fails to resolve a conflict, the network response will contain an error message (`conflict unhandled`). This is how we give customers the chance to make an update, or try again.
4+
2. We use jittered retry (10x).
5+
- TODO: add more detail / links to how this retry logic occurs.
6+
- We err on the side of not deleting customer data when performing conflict resolution.
7+
- Auto-merge is the default resolution strategy. This relies on the version, and will attempt to merge fields that changed when possible.
8+
- For more, see [the AppSync docs](https://docs.aws.amazon.com/appsync/latest/devguide/conflict-detection-and-sync.html)
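
The jittered retry mentioned above can be sketched roughly as follows. This is a minimal illustration, not DataStore's implementation: the helper names `jitteredDelayMs` and `retryWithJitter`, the base delay, and the delay cap are all assumptions; only the limit of 10 attempts comes from the text.

```js
// Sketch only: helper names and delay constants are assumptions, not DataStore's values.
const MAX_ATTEMPTS = 10; // the "10x" from the list above

// Full-jitter exponential backoff: a random delay in [0, min(capMs, baseMs * 2^attempt)).
function jitteredDelayMs(attempt, { baseMs = 100, capMs = 30000, random = Math.random } = {}) {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

// Calls `operation` until it succeeds or MAX_ATTEMPTS is reached.
// `sleep` is injectable so the loop can be exercised without real timers.
async function retryWithJitter(operation, options = {}) {
  const sleep = options.sleep ?? (ms => new Promise(resolve => setTimeout(resolve, ms)));
  let lastError;
  for (let attempt = 0; attempt < MAX_ATTEMPTS; attempt++) {
    try {
      return await operation(attempt);
    } catch (error) {
      lastError = error;
      // No point in waiting after the final failed attempt.
      if (attempt < MAX_ATTEMPTS - 1) {
        await sleep(jitteredDelayMs(attempt, options));
      }
    }
  }
  throw lastError;
}
```

Randomizing the delay spreads out retries from many clients so they do not hit the service at the same moment after an outage.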
+17
@@ -0,0 +1,17 @@
1+
# Contributing
2+
3+
## Formatting
4+
- We use Prettier to format our code. We recommend installing it within your IDE to prevent formatting code within other Amplify packages (as opposed to formatting from the Prettier CLI directly). Example: [VS Code Extension](https://marketplace.visualstudio.com/items?itemName=esbenp.prettier-vscode).
5+
6+
## Testing DataStore changes locally
7+
- On first build:
8+
- Within **amplify-js**: `yarn && yarn build && yarn link-all && yarn build:esm:watch`
9+
- Within sample app: `yarn && yarn link aws-amplify && yarn link @aws-amplify/datastore && yarn start`
10+
- On subsequent builds (useful if something isn't working):
11+
- Within **amplify-js**: `yarn clean && yarn build && yarn link-all && yarn build:esm:watch`
12+
- Within sample app: `rm -rf node_modules && yarn && yarn link aws-amplify && yarn link @aws-amplify/datastore && yarn start`
13+
14+
## Contributing to these docs
15+
- Do not link to specific lines of code, as these frequently change. Instead, do the opposite: link to the documentation from within the code itself, as the docs are less likely to change.
16+
- Prefer small, self-contained sections over large, monolithic documents.
17+
- Do not use permalinks - instead, link to the most current files.
@@ -0,0 +1,88 @@
1+
# DataStore Lifecycle Events ("Start", "Stop", "Clear")
2+
3+
# DataStore Initialization ("Start")
4+
5+
**Understanding how DataStore starts is critical to understanding how DataStore fundamentally works.** At a high level, starting DataStore does the following things in order for each model:
6+
7+
> 1. Init Schema
8+
> 2. Init the Storage Engine
9+
> 3. Migrate schema changes
10+
> 4. Sync Engine Operations
11+
> 5. Empty the Outbox / process the mutation queue
12+
> 6. Begin processing the subscription buffer
13+
> 7. DataStore is now in "ready" state
14+
15+
- _We can eagerly start DataStore by calling `DataStore.start`. Otherwise, invoking a method (query, save, delete, observe) will start it up_
16+
- _When importing a model, DS consumes `schema.js`, and creates the IndexedDB store._
17+
18+
## **How it works:**
19+
20+
1. ### **Init schema**
21+
- **1.1** First we call `initSchema` [here](packages/datastore/src/datastore/datastore.ts)
22+
- **1.2** Codegen then generates `schema.js` from the schema
23+
- **1.3** DataStore consumes `schema.js`
24+
25+
2. ### **Init the storage engine**
26+
- **2.1** The adapter is initialized
27+
- **2.2** The local database gets created if it doesn’t exist already
28+
- **2.3** The adapter [has a `setUp`](packages/datastore/src/storage/adapter/IndexedDBAdapter.ts#L82) method that then calls the database's `init` method
29+
- **2.4** Relations are established
30+
- _Nothing happens until the first interaction with DataStore_
31+
32+
3. ### **Migrate schema changes (if needed)**
33+
- See [schema-changes.md](./schema-changes.md)
34+
- If the user has updated the schema, we perform the migration here
35+
36+
4. ### **Sync Engine Operations**
37+
- #### 4.1 Instantiate Sync Engine (`this.sync = new SyncEngine(`)
38+
- The Sync Engine is only instantiated if there is a GraphQL endpoint (meaning we’ve already provisioned the backend). Otherwise, DataStore is in local-only mode. See [datastore.ts](packages/datastore/src/datastore/datastore.ts#L735)
39+
- **Note: at this step, we do not yet process the buffer**
40+
- There are three subscriptions per model: `create`, `update`, and `delete`
41+
42+
- #### 4.2 Sync Engine is started (`syncSubscription = this.sync.start(`)
43+
- **4.2.1 Subscribe to the Sync Engine**
44+
- Messages from this subscription are emitted as Hub events for DataStore
45+
- When ready, we call `initResolve`
46+
- If unauthorized, DS keeps working, as we may have publicly readable models
47+
- If a validation error occurs, DataStore Sync breaks entirely
48+
- Without subscriptions, DataStore doesn’t work (when there is an endpoint present)
49+
- Subscriptions are the only component that update the local store from remote
50+
- If updates come in, they’re buffered to be processed after sync is complete
51+
- Prepares the sync predicates (similar to adapter setup)
52+
53+
- **4.2.2 Subscribe to DataStore connectivity observable (notifications about network status)**
54+
- `this.datastoreConnectivity.status().subscribe`
55+
- Subscribe to the Amplify Core component that monitors network reachability
56+
- We do this here because we need the ability to stop or start the sync process. When offline, we disconnect the websocket and stop syncing. Once online, we reconnect the websocket and start base / delta syncing
57+
- Sync engine subscribes to the Storage Engine
58+
- Every write may need to get translated to a mutation in the outbox
59+
- Storage engine is local source of truth for DataStore, all other pieces are observing
60+
61+
- **4.2.3 Run the Sync Queries (when online)**
62+
- _Note: We perform a topological sort of the data - we sync the children first, so when we query the parent, the children are already present. We also use an optimisation to parallelize this process if possible (i.e. non-dependent models)._
63+
- Sync queries are GraphQL queries that are necessary to hydrate the local store initially
64+
- The first time we run the app, we run a sync query that scans (or queries) DynamoDB for up to 10k records per table. This populates the local store. With selective sync, we perform a query instead of a scan against DynamoDB.
65+
- Subsequent changes after the initial sync query come in through subscriptions
66+
- There are two mechanisms:
67+
- Base sync - retrieve all records up to total sync value
68+
- Delta sync - one table per model, one delta sync table per DS store
69+
- AppSync makes the final decision regarding which sync (base vs delta) to perform
70+
- The client sends the last sync param with the sync query, and the service then compares the diff
71+
- There is a TTL on all delta sync table records
72+
- To find the TTL within the AppSync Console, see "Update Data Source"
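
The topological ordering and parallelization described in the note under 4.2.3 can be sketched as below. This is an illustration under assumptions, not the Sync Engine's code: the function names and the shape of the `dependencies` map (each model mapped to the models that must be synced before it) are hypothetical.

```js
// Hypothetical input: for each model, the models that must be synced before it.
// Returns the models in an order where every model comes after its dependencies.
function topologicalSyncOrder(dependencies) {
  const order = [];
  const state = new Map(); // model -> 'visiting' | 'done'

  function visit(model) {
    if (state.get(model) === 'done') return;
    if (state.get(model) === 'visiting') {
      throw new Error(`Cyclic dependency involving ${model}`);
    }
    state.set(model, 'visiting');
    for (const dep of dependencies[model] ?? []) visit(dep);
    state.set(model, 'done');
    order.push(model);
  }

  Object.keys(dependencies).forEach(visit);
  return order;
}

// Groups models into "waves": every model in a wave depends only on models in
// earlier waves, so the models within one wave could be synced in parallel.
function syncWaves(dependencies) {
  const depth = new Map();
  for (const model of topologicalSyncOrder(dependencies)) {
    const deps = dependencies[model] ?? [];
    const level = deps.length === 0 ? 0 : 1 + Math.max(...deps.map(d => depth.get(d)));
    depth.set(model, level);
  }
  const waves = [];
  for (const [model, level] of depth) {
    if (!waves[level]) waves[level] = [];
    waves[level].push(model);
  }
  return waves;
}
```

For example, with the made-up schema `{ Post: ['Comment'], Comment: [], Tag: [] }`, `Comment` and `Tag` have no dependencies and land in the first wave, while `Post` waits for the second.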
73+
74+
5. ### **Empty the Outbox / process the mutation queue**
75+
- Example: when performing mutations offline, records are added to the queue. Once there is connectivity, we start sending these **ONE BY ONE**.
76+
- **Note: No batch API is exposed to consumers**
77+
- Mutation events have ids
78+
- Syncs get applied before mutations are sent
79+
6. ### **Begin processing the subscription buffer**
80+
- If we receive subscription messages any time in the process of initializing subscriptions, performing sync queries, or processing the mutation queue, we buffer the subscription messages until everything else is completed. Once we have completed processing the mutation queue, we then process the subscription buffer
81+
7. ### **DataStore is now in "ready" state**
82+
- For additional reference, and how the above are published as Hub events, see [the docs](https://docs.amplify.aws/lib/datastore/datastore-events/q/platform/js/)
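
The ordering in steps 4 through 7 (apply sync results, drain the outbox one mutation at a time, then replay buffered subscription messages) can be modelled with a small sketch. The class `StartupSequencer` and its method names are hypothetical and exist only for illustration; the real Sync Engine is structured differently.

```js
// Illustrative model of the startup ordering; class and method names are hypothetical.
class StartupSequencer {
  constructor({ applyToStorage, sendMutation }) {
    this.applyToStorage = applyToStorage; // writes an incoming record to the local store
    this.sendMutation = sendMutation;     // sends one outgoing mutation to the backend
    this.outbox = [];                     // pending local mutations (FIFO)
    this.buffer = [];                     // subscription messages received before "ready"
    this.ready = false;
  }

  enqueueMutation(mutation) {
    this.outbox.push(mutation);
  }

  onSubscriptionMessage(message) {
    // Until startup completes, messages are buffered instead of applied.
    if (this.ready) this.applyToStorage(message);
    else this.buffer.push(message);
  }

  async start(syncedRecords) {
    // Sync results are applied before any mutation is sent.
    for (const record of syncedRecords) this.applyToStorage(record);

    // Drain the outbox one mutation at a time, never as a batch.
    // The head is removed only after it was sent successfully.
    while (this.outbox.length > 0) {
      await this.sendMutation(this.outbox[0]);
      this.outbox.shift();
    }

    // Only now replay the buffered subscription messages, in arrival order.
    for (const message of this.buffer.splice(0)) this.applyToStorage(message);
    this.ready = true;
  }
}
```

Peeking at the head of the outbox and removing it only after a successful send means a failed mutation stays queued for a later attempt.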
83+
84+
## Stop
85+
- Stops the DataStore sync process. This will close the real-time subscription connection when your app is no longer interested in updates. You will typically call DataStore.stop() just before your application is closed. You can also force your DataStore sync expressions to be re-evaluated at runtime by calling stop(), followed by start()
86+
87+
## Clear
88+
- Clears local data from DataStore. DataStore will now require a full sync (not a delta sync) to populate the local store with data
+26
@@ -0,0 +1,26 @@
1+
# Onboarding with DataStore
2+
3+
## Understand the primary DataStore events
4+
- Read the [DataStore lifecycle events doc](docs/datastore-lifecycle-events.md)
5+
6+
## Building a sample app with DataStore to understand how it works
7+
1. Build a basic DataStore sample app (no @auth, just 1-2 models).
8+
2. Build a similar sample app with the API category.
9+
3. Add @auth rules to the DataStore app:
10+
1. `amplify update api`
11+
2. Modify schema according to auth rules docs (https://docs.amplify.aws/lib/datastore/setup-auth-rules/q/platform/js).
12+
3. `amplify push`
13+
4. `amplify codegen models`
14+
4. Add [selective sync](https://docs.amplify.aws/lib/datastore/sync/q/platform/js#selectively-syncing-a-subset-of-your-data).
15+
5. Enable [real-time changes](https://docs.amplify.aws/lib/datastore/real-time/q/platform/js).
16+
6. While interacting with your app, examine the IndexedDB tables (Application > IndexedDB within Chrome dev tools):
17+
1. Check out the different stores in IDB that get created for your schema. Note the internal stores prefixed with sync_, and the stores corresponding to your models prefixed with user_.
18+
2. Familiarize yourself with how actions taken in the UI affect the data stored in IDB. This may be easier to do while throttling the network connection. You'll be able to see how outgoing mutations first get persisted into the corresponding store, then added to the mutation queue / outbox (sync_MutationEvent), and then updated in the store with data from AppSync.
19+
7. Turn on DEBUG logging (`Amplify.Logger.LOG_LEVEL = "DEBUG";`) at the root of your project, and inspect the logs in the console while using your app. Additionally, [enable hub events](https://docs.amplify.aws/lib/datastore/datastore-events/q/platform/js#usage) for DataStore.
20+
8. The best way to understand DataStore events is to place several debuggers or breakpoints throughout DataStore.
21+
- With logging / Hub events enabled, you can see what operations DataStore is performing (e.g. start, sync) as you step through with the debugger.
22+
9. Testing offline scenarios / concurrent user sessions is a useful way to test the full functionality of DataStore, and to fully understand how the sync process actually works.
23+
10. Next steps:
24+
- Create a React Native example (uses a different storage type)
25+
- Try more complex schema types
26+
- Observe changes in records within DynamoDB (for instance, soft deletion).

packages/datastore/docs/namespaces.md

+52
@@ -0,0 +1,52 @@
1+
# Namespaces
2+
- `datastore`
3+
- Settings
4+
- `user`
5+
- Models that came from user schema
6+
- `sync`
7+
- Metadata (last time ran query, etc.)
8+
- `storage`
9+
- Deprecated
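
Local store names such as `datastore_Setting`, `sync_ModelMetadata`, and `user_[Model Name]` follow a `<namespace>_<ModelName>` pattern. A minimal sketch of that naming scheme is shown here; the helpers `getStoreName` and `parseStoreName` are hypothetical names chosen for illustration, not functions confirmed to exist in the codebase.

```js
// Namespaces from the list above; `storage` is included only because the doc lists it as deprecated.
const NAMESPACES = Object.freeze({
  DATASTORE: 'datastore',
  USER: 'user',
  SYNC: 'sync',
  STORAGE: 'storage',
});

// Hypothetical helper: builds a local store name as `<namespace>_<ModelName>`.
function getStoreName(namespace, modelName) {
  if (!Object.values(NAMESPACES).includes(namespace)) {
    throw new Error(`Unknown namespace: ${namespace}`);
  }
  return `${namespace}_${modelName}`;
}

// Hypothetical helper: splits on the first underscore only, so a model name
// that itself contains underscores is kept intact.
function parseStoreName(storeName) {
  const index = storeName.indexOf('_');
  if (index === -1) throw new Error(`Not a namespaced store name: ${storeName}`);
  return {
    namespace: storeName.slice(0, index),
    modelName: storeName.slice(index + 1),
  };
}
```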
10+
# Local database examples:
11+
12+
- *Note: Anything prefixed with `sync_` is an internal table.*
13+
14+
- ## datastore_Setting
15+
- Used for schema versioning
16+
- See the [schema changes doc](docs/schema-changes.md)
17+
```
{
  id: "01FYABF3DMBZZJ46W1CC214NH2"
  key: "schemaVersion"
  value: "\"4401034582a70c60713e1f7f9da3b752\""
}
```
24+
- ## sync_ModelMetadata
25+
- Sync Engine metadata
26+
- Includes information about the last time we synced a model
27+
```
{
  fullSyncInterval: 86400000
  id: "01FYABF3DMBZZJ46W1CC214NH3"
  lastFullSync: 1647467532307
  lastSync: 1647467532307
  lastSyncPredicate: null
  model: "Todo"
  namespace: "user"
}
```
38+
- ## sync_MutationEvent
39+
- ## user_[Model Name]
40+
- The actual records themselves.
41+
```
{
  createdAt: "2022-03-16T21:52:07.718Z"
  description: null
  id: "6f69055b-b081-4225-8fc4-1d6d52732660"
  name: "name 1647467527489"
  updatedAt: "2022-03-16T21:52:07.718Z"
  _deleted: null
  _lastChangedAt: 1647467527754
  _version: 1
}
```
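
The `sync_ModelMetadata` fields shown earlier (`lastFullSync`, `lastSync`, `fullSyncInterval`) suggest how a client could decide whether to ask for a full sync or a delta sync. The sketch below is an assumption about that logic, not a copy of the Sync Engine: the function names are hypothetical, and the service may still decide differently.

```js
// Assumption: a full ("base") sync is due when none has run yet, or when
// `fullSyncInterval` milliseconds have passed since `lastFullSync`.
function nextSyncKind(metadata, now = Date.now()) {
  const { lastFullSync, fullSyncInterval } = metadata;
  const fullSyncDue = lastFullSync == null || now - lastFullSync >= fullSyncInterval;
  return fullSyncDue ? 'base' : 'delta';
}

// For a delta sync the client would send its last sync timestamp;
// for a base sync it would send none.
function lastSyncParam(metadata, now = Date.now()) {
  return nextSyncKind(metadata, now) === 'delta' ? metadata.lastSync : undefined;
}
```

With the example record above, a `fullSyncInterval` of 86400000 ms means a full sync would be requested again once 24 hours have passed since `lastFullSync`.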
+11
@@ -0,0 +1,11 @@
1+
# How DataStore uses Observables
2+
- Internally, DataStore uses event-driven methods (observables) to handle everything from the sync process to observing online connectivity. **This makes the Storage Engine the single source of truth for DataStore.**
3+
- Examples:
4+
- The Sync Engine observes DataStore Connectivity
5+
- The Sync Engine observes the Storage Engine
6+
- The client observes DataStore with `observe` and `observeQuery`:
7+
- https://docs.amplify.aws/lib/datastore/real-time/q/platform/js/
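
The "Sync Engine observes the Storage Engine" relationship can be illustrated with a tiny stand-in observable. This sketch deliberately avoids any dependency: `TinyObservable` and `StorageEngineSketch` are hypothetical classes, not DataStore or `zen-observable` APIs, and the `{ opType, element }` message shape is used here only as an assumed example.

```js
// Minimal stand-in for an observable (DataStore itself uses zen-observable).
class TinyObservable {
  constructor() {
    this.observers = new Set();
  }
  subscribe(next) {
    this.observers.add(next);
    return { unsubscribe: () => this.observers.delete(next) };
  }
  next(value) {
    // Copy first so an observer can unsubscribe while being notified.
    for (const observer of [...this.observers]) observer(value);
  }
}

// Hypothetical storage engine that announces every write to its observers.
class StorageEngineSketch {
  constructor() {
    this.records = new Map();
    this.changes = new TinyObservable();
  }
  save(model) {
    const opType = this.records.has(model.id) ? 'UPDATE' : 'INSERT';
    this.records.set(model.id, model);
    this.changes.next({ opType, element: model });
  }
  observe(next) {
    return this.changes.subscribe(next);
  }
}

// Usage: the "Sync Engine" side turns every observed write into an outbox entry.
const storage = new StorageEngineSketch();
const outbox = [];
const subscription = storage.observe(change => outbox.push(change));
storage.save({ id: '1', name: 'first' });
storage.save({ id: '1', name: 'renamed' });
subscription.unsubscribe();
storage.save({ id: '2', name: 'not observed' });
```

Because every write goes through the storage engine and observers only react to it, the local store stays the single place where state is decided.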
8+
9+
## Understanding Observables
10+
- DataStore uses [`zen-observable`](https://github.com/zenparsing/zen-observable)
11+
- [The RXJS docs](https://rxjs.dev/guide/observable) do a good job of describing observables in more detail.
