Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Missing database migration for active_slot_coeff #1251

Closed
KtorZ opened this issue Jan 7, 2020 · 14 comments
Closed

Missing database migration for active_slot_coeff #1251

KtorZ opened this issue Jan 7, 2020 · 14 comments
Assignees

Comments

@KtorZ
Copy link
Member

KtorZ commented Jan 7, 2020

Context

We "recently" added the active_slot_coeff blockchain parameters to the checkpoint data. Yet, for database existing before the release, we have missed a migration..

Information -
Version v2019.12.16+
Platform All
Installation N/A

Steps to Reproduce

  1. Create a wallet with a version prior to v2019.12.16
  2. Stop the server
  3. Start a server running on v2019.12.16+

Expected behavior

  1. Wallets can be restored successfully

Actual behavior

  1. The worker exits because of a failing migration.
SQLite3 returned ErrorConstraint while attempting to perform step: NOT NULL constraint failed: checkpoint_backup.active_slot_coeff

Resolution


QA

  • Correct behavior is visible from the logs:
[...]
[cardano-wallet.network:Info:22] [2020-01-18 17:01:00.96 UTC] Waiting for Jörmungandr to be ready on tcp/35027
[cardano-wallet.application:Info:22] [2020-01-18 17:01:00.97 UTC] Found existing wallet: c8416497faead398770a1ae14eb31a41683c8f67
[cardano-wallet.wallet-db:Notice:28] [2020-01-18 17:01:00.97 UTC] checkpoint table does not contain required field 'active_slot_coeff'. Adding this field with a default value of 1.0.
[...]
[cardano-wallet.wallet-engine:Info:28] [2020-01-18 17:01:00.99 UTC] c8416497: Applying blocks [1158224.6 ... 1158226.8]

How to reproduce the above:

  1. Download v2019-12-13
  2. Launch the server using integration config params
  3. Create a wallet
  4. Turn it off, and launch the server using latest master
  5. Observe

On subsequent runs, nothing is logged, unless the severity is at least DEBUG

$ cardano-wallet launch ... --trace-wallet-db debug --log-level debug -- ... 
[...]
[cardano-wallet.wallet-db:Debug:28] [2020-01-18 17:02:06.23 UTC] checkpoint table already contains required field 'active_slot_coeff'.
[cardano-wallet.wallet-db:Debug:28] [2020-01-18 17:02:06.23 UTC] Running database action - Start
[cardano-wallet.wallet-db:Debug:28] [2020-01-18 17:02:06.23 UTC] Running database action - Finish
[cardano-wallet.wallet-db:Debug:28] [2020-01-18 17:02:06.23 UTC] No database migrations were necessary.
@jonathanknowles
Copy link
Member

Hi @KtorZ

What value should the migration provide, and how should it obtain this value?

The current values seem to be obtained by reading getActiveSlotCoefficient from the value obtained by calling getInitialBlockchainParameters (which is dependent on the Jormungandr backend). Can this value change between calls?

Is this parameter available at the point where we perform a DB migration, and should we use the current value, or a historical value instead?

(Inspecting the DB manually shows this value is currently 1.0.)

@jonathanknowles
Copy link
Member

jonathanknowles commented Jan 9, 2020

I tried to reproduce this with the following steps:

  1. Build release v2019-12-13 (the release prior to v2019-12-16).
  2. Start the server (with launch) and create a wallet.
  3. Shut down the server.
  4. Build e135938 (the current tip of master)
  5. Start the server (with `launch).

I get the following pool_owner_index error, which is slightly different from the error reported in this issue:

[cardano-wallet:Debug:22] [2020-01-09 03:15:00.49 UTC] Using connection string: /home/jsk/.local/share/cardano-wallet/jormungandr/testnet/wallets/stake-pools.sqlite
[cardano-wallet.query:Debug:22] [2020-01-09 03:15:00.49 UTC] CREATE TEMP TABLE "pool_owner_backup"("pool_id" VARCHAR NOT NULL,"pool_owner" VARCHAR NOT NULL,"pool_owner_index" INTEGER
 NOT NULL, PRIMARY KEY ("pool_id","pool_owner","pool_owner_index"))
[cardano-wallet.query:Debug:22] [2020-01-09 03:15:00.49 UTC] INSERT INTO "pool_owner_backup"("pool_id","pool_owner") SELECT "pool_id","pool_owner" FROM "pool_owner"
SQLite3 returned ErrorConstraint while attempting to perform step: NOT NULL constraint failed: pool_owner_backup.pool_owner_index
[cardano-wallet:Debug:4] [2020-01-09 03:15:00.49 UTC] Logging shutdown.
cardano-wallet-jormungandr: SQLite3 returned ErrorConstraint while attempting to perform step: NOT NULL constraint failed: pool_owner_backup.pool_owner_index

I'll try to reproduce with an earlier version (v2019-12-16), and see if I can replicate the same error reported in this issue.

@jonathanknowles
Copy link
Member

jonathanknowles commented Jan 9, 2020

Just tried with v2019-12-13 --> v2019-12-16, and got the same pool_owner_index error on startup:

[cardano-wallet:Debug:22] [2020-01-09 03:50:23.50 UTC] Using connection string: /home/jsk/.local/share/cardano-wallet/jormungandr/testnet/wallets/stake-pools.sqlite
[cardano-wallet.query:Debug:22] [2020-01-09 03:50:23.51 UTC] CREATE TEMP TABLE "pool_owner_backup"("pool_id" VARCHAR NOT NULL,"pool_owner" VARCHAR NOT NULL,"pool_owner_index" INTEGER NOT NULL, PRIMARY KEY ("pool_id","pool_owner","pool_owner_index"))
[cardano-wallet.query:Debug:22] [2020-01-09 03:50:23.51 UTC] INSERT INTO "pool_owner_backup"("pool_id","pool_owner") SELECT "pool_id","pool_owner" FROM "pool_owner"
SQLite3 returned ErrorConstraint while attempting to perform step: NOT NULL constraint failed: pool_owner_backup.pool_owner_index
[cardano-wallet:Debug:4] [2020-01-09 03:50:23.51 UTC] Logging shutdown.
cardano-wallet-jormungandr: SQLite3 returned ErrorConstraint while attempting to perform step: NOT NULL constraint failed: pool_owner_backup.pool_owner_index

@jonathanknowles
Copy link
Member

jonathanknowles commented Jan 9, 2020

Okay. I can finally reproduce the bug described in this issue, but only after performing the following steps:

  1. Build release v2019-12-13 (the release prior to v2019-12-16).
  2. Start the server (with launch) and create a wallet.
  3. Shut down the server.
  4. Delete the file ~/.local/share/cardano-wallet/jormungandr/testnet/wallets/stake-pools.sqlite.
  5. Build release v2019-12-16.
  6. Start the server (with launch).

Then we see:

SQLite3 returned ErrorConstraint while attempting to perform step: NOT NULL constraint failed: checkpoint_backup.active_slot_coeff
[cardano-wallet.worker:Error:28] [2020-01-09 04:13:50.61 UTC] bf04d83b: Worker has exited unexpectedly: SQLite3 returned ErrorConstraint while attempting to perform step: NOT NULL constraint failed: checkpoint_backup.active_slot_coeff

@jonathanknowles
Copy link
Member

jonathanknowles commented Jan 9, 2020

@KtorZ Should we create a separate bug report for #1251 (comment)?

It's trivial to work around (by deleting ~/.local/share/cardano-wallet/jormungandr/testnet/wallets/stake-pools.sqlite and restarting), but perhaps it is something that we should handle automatically, instead of asking users to delete files manually.

@KtorZ
Copy link
Member Author

KtorZ commented Jan 9, 2020

@jonathanknowles This was a known issue taken care of by #1177. The pool database will "migrate" itself automatically but that's only since v2019.12.23 (or exactly, a few commits after v2019.12.16).

@KtorZ
Copy link
Member Author

KtorZ commented Jan 9, 2020

So, in order to reproduce the bug above, it's probably best to go from v2019.12.13 to v2019.12.23 (or simply current master) such that the pool db migration error is handled, and remains only the active_slot_coeff one.

@jonathanknowles
Copy link
Member

jonathanknowles commented Jan 13, 2020

So, in order to reproduce the bug above, it's probably best to go from v2019.12.13 to v2019.12.23 (or simply current master) such that the pool db migration error is handled, and remains only the active_slot_coeff one.

@KtorZ Unfortunately this doesn't appear to be working.

Going from v2019-12-13 (f3576be) to the current master (10b5512) still results in an NOT NULL constraint failed: pool_owner_backup.pool_owner_index error:

jsk@neon:~/projects/input-output-hk/cardano-wallet$ stack exec cardano-wallet-jormungandr -- launch --genesis-block lib/jormungandr/test/data/jormungandr/block0.bin
[cardano-wallet:Info:4] [2020-01-13 03:17:06.42 UTC] Using directory: /home/jsk/.local/share/cardano-wallet/jormungandr/testnet
[cardano-wallet:Info:4] [2020-01-13 03:17:06.42 UTC] Using directory: /home/jsk/.local/share/cardano-wallet/jormungandr/testnet/wallets
[cardano-wallet:Info:4] [2020-01-13 03:17:06.42 UTC] Running as v2020.1.7 (git revision: 10b5512222e09ef788dcea82628113add4291460)
[cardano-wallet.application:Info:4] [2020-01-13 03:17:06.42 UTC] Wallet backend server starting...
[cardano-wallet.application:Info:4] [2020-01-13 03:17:06.42 UTC] Node is Jörmungandr on testnet
[cardano-wallet.daedalus-ipc:Info:17] [2020-01-13 03:17:06.42 UTC] Daedalus IPC is not enabled.
[cardano-wallet.network:Notice:18] [2020-01-13 03:17:06.44 UTC] Starting process jormungandr
     --genesis-block lib/jormungandr/test/data/jormungandr/block0.bin
     --rest-listen 127.0.0.1:35759
     --storage /home/jsk/.local/share/cardano-wallet/jormungandr/testnet/chain

[cardano-wallet.network:Info:18] [2020-01-13 03:17:06.44 UTC] [jormungandr.17209] Process started
Jan 13 03:17:06.451 INFO Starting jormungandr 0.8.5 (, release, linux [x86_64]) - [rustc 1.38.0 (625451e37 2019-09-23)], task: init
Jan 13 03:17:06.451 WARN Node started without path to the stored secret keys (not a stake pool or a BFT leader), task: init
Jan 13 03:17:06.452 INFO storing blockchain in '"/home/jsk/.local/share/cardano-wallet/jormungandr/testnet/chain/blocks.sqlite"', task: init
Jan 13 03:17:06.493 WARN No trusted peers joinable to bootstrap the network, task: bootstrap
[cardano-wallet.network:Info:22] [2020-01-13 03:17:07.45 UTC] Waiting for Jörmungandr to be ready on tcp/35759
[cardano-wallet.network:Info:22] [2020-01-13 03:17:07.45 UTC] Jörmungandr is ready.
cardano-wallet-jormungandr: SQLite3 returned ErrorConstraint while attempting to perform step: NOT NULL constraint failed: pool_owner_backup.pool_owner_index

But as pointed out in #1251 (comment), I can work around this issue by deleting ~/.local/share/cardano-wallet/jormungandr/testnet/wallets/stake-pools.sqlite and restarting, and only then do I see the following error:

wallet.application:Info:22] [2020-01-13 03:20:19.70 UTC] Found existing wallet: b15b1cc1109485096c779c3605a7e22e71654fb9
[cardano-wallet.wallet-engine:Info:28] [2020-01-13 03:20:19.70 UTC] b15b1cc1: Worker has exited unexpectedly: SQLite3 returned ErrorConstraint while attempting to perform step: NOT N
ULL constraint failed: checkpoint_backup.active_slot_coeff                                     

@KtorZ
Copy link
Member Author

KtorZ commented Jan 13, 2020

@jonathanknowles indeed :o !
I thought it was an occurence of #1200 but, it is yet something else. See #1261

@Fell-x27
Copy link

Fell-x27 commented Jan 13, 2020

I'm afraid there is one more trouble. Don't know why, but sometimes in the end of an epoch cardanoWallet pools worker may just stuck and remove the database with no any actions after. But it recreates this after manually restart. I cant reproduce it by my wish, so didnt open an issue. Prolly its related with something from this thread.

@jonathanknowles
Copy link
Member

jonathanknowles commented Jan 14, 2020

Hi @KtorZ

@rvl and I had a discussion about this issue earlier today. We came up with two potential solutions:

Solution 1: Remove the field

  • Change the code so that it gets the activeSlotCoefficient value from the (initial) result of calling getInitialBlockchainParameters on the backend node (e.g., Jörmungandr) rather than reading it from the checkpoint table.
  • Remove the active_slot_coeff field from the checkpoint table schema definition inCardano.Wallet.DB.Sqlite.TH.
  • Change newDBLayer so that immediately after calling startSqliteBackend, we check whether the checkpoint table has an active_slot_coeff field.
    If the field IS present, we:
    • issue a query to delete this field from the table;
    • continue with startup as normal.

Solution 2: Keep the field

  • Change newDBLayer so that after calling startSqliteBackend, we check whether the checkpoint table has the active_slot_coeff field.
    If the field is NOT present, we:
    • issue a query to delete all rows from the checkpoint table (since they don't have a value for this field, and we can't assume a particular default value);
    • issue a query to add the active_slot_coeff field;
    • continue with startup as normal.

@KtorZ From the above options, do you have a strong preference for how we solve this issue?

Thanks!

iohk-bors bot added a commit that referenced this issue Jan 14, 2020
1261: widen 'isMigrationError' to catch constraint errors r=KtorZ a=KtorZ

# Issue Number

<!-- Put here a reference to the issue this PR relates to and which requirements it tackles -->

#1251 

# Overview

<!-- Detail in a few bullet points the work accomplished in this PR -->

On start-up, we run persistent migrations and in case of migration issues, we
execute some particular handling (e.g. the pool db resets itself). Yet, we
failed at catching migration issues coming from constraints introduced later
and that can't be satisfied on already existing data. This commit fixes it.

Before
======

```
[cardano-wallet.network:Info:22] Jörmungandr is ready.
cardano-wallet-jormungandr: SQLite3 returned ErrorConstraint while attempting to perform step: NOT NULL constraint failed: pool_owner_backup.pool_owner_index
(exiting with an error code)
```
After
=====
```
[cardano-wallet.network:Info:22] Jörmungandr is ready.
[cardano-wallet.stake-pool-db:Error:22]  Failed to migrate the database: : NOT NULL constraint failed: pool_owner_backup.pool_owner_index
[cardano-wallet.stake-pool-db:Notice:22] Non backward compatible database found. Removing old database and re-creating it from scratch. Ignore the previous error.
[cardano-wallet.stake-pool-db:Notice:22] 1 migrations were applied to the database.
(... and continuing as normal)
```
# Comments

<!-- Additional comments or screenshots to attach if any -->

<!-- 
Don't forget to:

 ✓ Self-review your changes to make sure nothing unexpected slipped through
 ✓ Assign yourself to the PR
 ✓ Assign one or several reviewer(s)
 ✓ Once created, link this PR to its corresponding ticket
 ✓ Assign the PR to a corresponding milestone
 ✓ Acknowledge any changes required to the Wiki
-->


Co-authored-by: KtorZ <matthias.benkort@gmail.com>
@KtorZ
Copy link
Member Author

KtorZ commented Jan 14, 2020

@Fell-x27 if you'd like to open an issue describing what you saw (logs are especially useful in that case), please do 🙏

@jonathanknowles Solution 1 is not going to be an option I am afraid. The active_slot_coefficient belongs to the realm of updatable protocol parameters. Thus, even though reading it from the genesis block would work as an immediate fix, we would have to re-introduce it to be tracked alongside checkpoint anyway.

Solution 2 works better if I understand correctly what the approach would be. Incidentally, I just discussed with Daedalus something suggested by a user earlier in a GH ticket: the ability to force-resync a wallet. This would be pretty handy at this level here, forcing to drop every checkpoints in the database (but the genesis one) and, resync.
The main problem with doing this for the migration is that we would have to also drop the genesis checkpoint here, because it also contains the active slot coefficient. This checkpoint can however be replaced easily without loss of user-data (which are stored in a separate table).

iohk-bors bot added a commit that referenced this issue Jan 18, 2020
1266: Add a manual DB migration step for `active_slot_coeff`. r=KtorZ a=jonathanknowles

# Issue Number

#1251 

# Overview

This PR:

- [x] Adds the ability to perform manual DB migration actions after connecting to a SQLite database.
- [x] Uses this ability to detect whether or not the `active_slot_coeff` field is present in the `checkpoint` table.
- [x] Adds the `active_slot_coeff` field manually, if it was not present already, using a default value that is passed down from the network layer.
- [ ] Figure out the cause of repeated log entries of the form:
```hs
[cardano-wallet.wallet-engine:Info:29] [2020-01-16 09:05:14.78 UTC] e834cdb3: Try rolling back to 0.0
[cardano-wallet.wallet-engine:Info:29] [2020-01-16 09:05:14.78 UTC] e834cdb3: Rolled back to 0.0
```

# Comments

Co-authored-by: Jonathan Knowles <jonathan.knowles@iohk.io>
Co-authored-by: KtorZ <matthias.benkort@gmail.com>
iohk-bors bot added a commit that referenced this issue Jan 18, 2020
1266: Add a manual DB migration step for `active_slot_coeff`. r=KtorZ a=jonathanknowles

# Issue Number

#1251 

# Overview

This PR:

- [x] Adds the ability to perform manual DB migration actions after connecting to a SQLite database.
- [x] Uses this ability to detect whether or not the `active_slot_coeff` field is present in the `checkpoint` table.
- [x] Adds the `active_slot_coeff` field manually, if it was not present already, using a default value that is passed down from the network layer.
- [ ] Figure out the cause of repeated log entries of the form:
```hs
[cardano-wallet.wallet-engine:Info:29] [2020-01-16 09:05:14.78 UTC] e834cdb3: Try rolling back to 0.0
[cardano-wallet.wallet-engine:Info:29] [2020-01-16 09:05:14.78 UTC] e834cdb3: Rolled back to 0.0
```

# Comments

Co-authored-by: Jonathan Knowles <jonathan.knowles@iohk.io>
Co-authored-by: KtorZ <matthias.benkort@gmail.com>
@piotr-iohk
Copy link
Contributor

OK. migration from 2019-12-13 to recent master works fine now. I suppose. Wallet is operational and works fine after migration.

[cardano-wallet.pools-db:Error:22] [2020-01-20 09:36:33.35 UTC] Failed to migrate the database: : NOT NULL constraint failed: pool_owner_backup.pool_owner_index
[cardano-wallet.pools-db:Notice:22] [2020-01-20 09:36:33.35 UTC] Non backward compatible database found. Removing old database and re-creating it from scratch. Ignore the previous error.

@piotr-iohk
Copy link
Contributor

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants