-
Notifications
You must be signed in to change notification settings - Fork 160
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Blockfrost RYO specific improvements #1686
Comments
Some initial thoughts on this:
We've made some work on this. It slowed down syncing speed, so we decided to park it. It could become an optional feauture though, or default if we neutralise its effect with separate performance improvements
A challenge with these cases is that each tx triggers an update instead of an inertion. When there are updates, handling rollbacks is never easy.
DBSync should be able to extract this from the epoch boundary ledger state.
I don't really understand this. Would this have an entry for each tx and address? |
Wut? 😄 Hope if this is prioritized, the nomen clature continues to be based on utility, not companies. There are multiple API providers as well as projects that have requested parts of this over months (besides other featuresets)🙂 . Sorry for knit picking this - but just want to point it out , now that it is part of IO umbrella.
@kderme - Given the recent config additions, could maybe make that a flag - as discussed in #1333 , the advantage from query pov (+ resource, particularly storage and IOPs consumption) is abundantly higher than initial 1-time sync |
Fear not our goal is to improve usability/functionality/performance for the whole community who use In all honesty we're really wanting to hear back from the community to know about their real life user cases so we can better steer what is and isn't needed. Also for us as a team to better explain the restrictions we always have to balance when trying to provide new features/fix existing issues and keeping up with new standards etc. We have started a public Matrix channel (which we'll publicly advertise asap) we hope it will be a great place for informal discussions/questions to happen and open the communication channels up more. Then if things of importance arises on matrix we can move them to github issues with agreed specs and such. 😄 |
Optional sound reasonable.
Yes, I understand that. If it helps, I believe I have a working and battle-tested trigger solution (inserts/upserts/rollbacks,...), although just for ADA balances.
Great to hear that!
Basically yes. The way we are currently retrieving txs are per address. Since there are some addresses which can be very large and we are using offset pagination instead of cursor (multiple reasons why), data can be very sparse. We need them dense and ideally know which page lands exactly where so we can retrieve the data with constant complexity. |
FYI DBSync now has this table https://github.com/IntersectMBO/cardano-db-sync/blob/master/doc/schema.md#pool_stat It's not populated by default only when "pool_stat": enabled is used in the db-sync insert config. |
Hi, I would like to propose additional improvements/additions to the db sync project, which would help to improve speed of RYO in the future.
A separate
--blockfrost
switch (or more, to allow for granularity) could be introduced, as some of the data might generally be not needed and would take considerably large amount of disk space, since some of the improvements would require additional de-normalized tables. Other improvements include pre-calculation of historic data, which is not live, but takes considerable amount of resources to calculate on the fly.Running balances
A starting point is to introduce
address
andpayment_cred
tables containing a single address just once (as opposed totx_out
), similar tostake_address
table, which could then be used in several ways.e.g.:
balance
(i.e. the current state) of anaddress
/payment_cred
andstake_address
(stake_address table would need additionalbalance
field). This would greatly improve the overall speed when fetching balance. There are issues with multi assets (foraddress
/payment_cred
) though, as those could bloat some addresses, especially those related to exchanges etc. so this would require further thought, but calculating ADA balance is doable right off the bat. Having utxos available sure helps, but I believe it would be even better to also keep the balances.balance_in
andbalance_out
fields as well, or use those instead of a singlebalance
fieldLive data
In addition, I would like to opt for:
pool_hash
table, or create a new table withlive_stake
,active_stake
,live_delegators
, and historicalblocks_minted
information which is expensive to calculate on the fly.assets
table, which would contain a given asset withid
(appended/ordered by MIN(ma_tx_mint.id) to allow for pagination), and calculatedquantity
andmint_or_burn_count
.stats
table, containing the network supply:total
,circulating
,locked
,treasury
,reserves
and total stake stats:live
andactive
- this info is needed for example to calculate pool saturation, live size etc.Auxiliary data
Next, data which is not live, but still expensive to calculate on the fly:
stake_amount
(SUM) intoepoch
table (helpful in calculation of historicallive_size
of a pool).pool_history
table, which would contain data grouped by epoch, e.g.:Denormalized transactions
Due to how pagination works in RYO, transaction history of
address
,payment_cred
andasset
will require denormalized tables in order to speed up lookup, e.g.:with a requirement of
tx_id
being consecutive i.e. without any gaps. Otherwise the lookups won't be optimal. This means not relying on pg's serial, but maintaining the ids internally (or usingrow_number () OVER ...
and always selecting the previous MAX).Practically all aforementioned improvements have been tested one way or the other (in forms or triggers or matviews), so in case further SQL implementation/examples are needed, I am more than happy to share them with the team.
Note: this is a first attempt to address RYO improvements directly in db-sync, so it will definitely get amended and/or edited.
THANKS!
The text was updated successfully, but these errors were encountered: