Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor(merkle_tree): general purpose nested list #378

Merged
merged 4 commits into from
Nov 12, 2024

Conversation

0xNineteen
Copy link
Contributor

  • uses a new general purposed nested list struct which can be used with any [][]T type

@0xNineteen 0xNineteen enabled auto-merge (squash) November 12, 2024 19:44
Copy link
Contributor

@InKryption InKryption left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one question, otherwise looks solid.

Copy link
Contributor

@InKryption InKryption left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@0xNineteen 0xNineteen merged commit fa66022 into main Nov 12, 2024
6 checks passed
@0xNineteen 0xNineteen deleted the 19/merkle-refactor branch November 12, 2024 23:13
dadepo added a commit that referenced this pull request Nov 20, 2024
 fix(snapshot-download): url (#355)

test(ledger): cleanup database testing (#353)

* fix database testing

* move `std.testing.allocator` usages into test blocks

---------

Co-authored-by: Drew Nutter <dnut@users.noreply.github.com>

fix(shred-collector): shred receiver metrics not being counted (#356)

The following metrics were not being incremented:
- shred_receiver_received_count
- shred_receiver_satisfactor_shred_count

This pr fixes it by incrementing them in the appropriate places.

Fix after merge

Scope MerkleRootValidator

Remove unused import

Scoping functions not attached to structs

Update runService to take logger as anytype

Update printTimeEstimate to take logger as anytype

Remove unused imports

Scope init function of AppBase to AppBase

Scope RocksDB

refactor(ledger): rename roots and orphans to rooted_slots and orphan_slots (#357)

improve(benchmarks): support multiple values + visualizing runtimes (#316)

* build: add a `no-run` option
  Useful for when you need to build a specific executable, but not install it.

* speed up the snapshot downloader

* bench: add verifySnapshot benchmark

* download: return an error instead of panicing in `writeCallback`

* free `loading_threads` if `AccountsDB.init` fails

* benchmark progress

* feat: csv benchmark outputs

* fix: lint and remove unused

* more fixes

* add variance to single output

* fix: path allocations

* re-enable read/write benchmarks

* fix: formatting

* fix: column headers

* fix: leaks in read/write test

* fix benchmarks

* ci fix: dont run all the benchmarks

* attempt fix: OOM ci

* CI check

* fix leak in CI

* fix and identify leaks

* fix lint

* feat: start agg print results

* feat: add option to output all runtimes

* remove extra bench

* add script to view benchmark runtimes

* more improvements on scripts

* fix: update script for different lengths

* fixes

* fix formatting

* add docs

* fix: lint

* reduce logger memory

* fix script

* remove commented out code

* fix tests / build after merge

* output data to files

* support output for humans

* improve formatting

* support ledger benchs

* merge cache changes

* remove trailing commas

* fix test leak

* fix another test leak

* address comments

* support custom time units per benchmark

* fix field output

* fix post merge

* fix lint

* add back bincode benchs

* change to switch and use tmpdir

* remove assert type alias

* address more comments

* remove account_ref defaults

* lower case enums

* more comments

* fix tests

---------

Co-authored-by: David Rubin <daviru007@icloud.com>

Remove unused import

feat(ci) add line size checking to the CI (#360)

* add line size checking to the CI

* add exclude list + format the python scripts

* lexicographically order the exclude list

* add missing exclude files

feat(ledger): bench getting code shreds (#354)

* fix(ledger): backward merkle root chaining bugs

It was ignoring result of checking chaining, and just always marked it as a duplicate (as if the check failed), and always returned true (as if the check succeeded). complete nonsense

It was comparing chained root to chained root, which is wrong. it should be chained to actual root

This change also refactors common logic into a single place.

Also this now returns an error if the merkle root is missing, because that means there is a severe bug in the validator and it needs to be shutdown and patched immediately.

* Update src/ledger/shred_inserter/merkle_root_checks.zig

Co-authored-by: InKryption <59504965+InKryption@users.noreply.github.com>

* fix(ledger): zig fmt

* refactor(ledger): merkle root validator struct

the four methods all have basically the same dependencies so this bundles them together to clarify their purpose and simplify their expression

* refactor(ledger): use working stores to clarify and minimize dependencies, and to organize working state code

* refactor(ledger): in merkle root checks, rename early and late shreds to old and new shreds

* Added code shreds

* Add BlockstoreReader.getCodeShred

* Bench reader.getCodeShredsForSlot

---------

Co-authored-by: Drew Nutter <dnut@users.noreply.github.com>
Co-authored-by: InKryption <59504965+InKryption@users.noreply.github.com>

feat(accountsdb,gossip): Initial snapshot propagation work (#333)

* Refactor latest snapshot info state
  1. Guard the full and incremental snapshot info under the same rwmux,
    both to deduplicate the full slot value, and to synchronize the
    existence of incremental snapshots relative to full snapshots.
  2. Also do the same with the first snapshot info.

* Improve, simplify, and fix some bincode code

* Minor significant gossip/data rework & misc

* Pass const slice instead of arraylist

* Remove unused import

* Big refactor on accountsdb fields and init & misc
* Grouped fields into a few categories.
* Change init to take only an `InitParams` struct.
* Lowercase `AllocatorConfig` fields.
* Share the `AllocatorConfig` enum tag across a few different places.
* Inline `InitConfig`'s fields and remove it.
* Fix code that potentially leaked a disk allocator pointer.

* Notify gossip of new full snapshot

* Re-group and alphabetize type aliases

* Remove pointless `file_size` field

* Fix `writeSnapshotTarWithFields`
It was incorrectly reporting `file_info.length` as the file size,
when it should have been using `account_file.memory.len`.
This also adds a couple more runtime checks to assert the total written
bytes are aligned to full 512 byte blocks.

* Simplify allocation scheme in untar procedure

* Set up test for snapshot gossip advertisement
It's currently disabled because it overwrites the snapshot archives
in the test data dir; can comment the skip statement in order to
test it locally for now. As the TODO says, we need a mechanism for
loading from a snapshot which isn't in the accountsdb's snapshot
working directory.
This also adds a simple way to test the equality of the `SnapshotHashes`
structure in tests.

* Re-simplify SnapshotHashes incremental list
The potential memory savings were minimal at best, and came at a
significant cost in complexity - a simple tagged union is sufficient
to avoid the overhead of an allocation, whilst retaining simplicity
across the rest of the code (especially wrt generic comparisons).

* Defer gossip service shutdown
Sequence it before deinit

* Use workaround in gossip snapshot update unit test
In the future there should be a better API for loading from a snapshot
that's outside the accountsdb snapshot working directory.

* Fixes possible invalid deinit

* Remove unused imports

* Remove unused default field values

* Improve loading thread code & safety.

* Various naming improvements

* Gossip push queue & signing refactor
* Refactor `initSigned`, and related init methods, and call sites.
* Move methods into more scoped types.
* Flatten the `Gossip` struct into `AccountsDB`, and only
  store `my_pubkey`, instead of the whole key pair.
* Change `push_msg_queue` to queue unsigned gossip data, and then
  sign it in `drainPushQueueToGossipTable`.
* Set the wallclock time in `drainPushQueueToGossipTable` as well.

* Remove useless debug field

* Affirm and improve warning logs

* Put pubkey and queue back into combined optional

* Remove old commented code, adjust comparison

fix(scripts): unused import regex and misc refactoring (#362)

* fix(scripts): check_style.py regex issues, avoid early exit, and refactoring

* refactor(scripts) rename check_style.py to style.py since it's not just for checking, it does autofix as well. --check is a cli option and confusingly redundant with the filename

* refactor: remove unused imports

I noticed the regex for detecting unused imports was changed in a recent PR. This broke the detection of unused imports. The change made it so you would only detect unused lines that explicitly say `@import("...")`. But those are not the only lines we want to check. We also want to check for things like `const Thing = sig.utils.Thing;`. The recent change also broke detection of identifiers with numbers and underscores. So I fixed these issues.

I also made the regex slightly better at detecting file paths within `@import` lines by including `/` as a valid character. This identified few more unused imports which I also removed in this PR.

I consolidated the code to a main function so it's clearer what's actually running when you run the script.

I used return values to indicate failed checks, instead of directly calling `exit(1)` within the check function. That way it always runs all the checks, instead of exiting early.

I renamed the file to `style.py`. `check_style.py` is misleading because it's not just a check. By default this script will automatically edit the code to adjust its style. Only if you provide the `--check` flag does it limit itself to checks. At that point, the name is redundant with the flag. I see "style" as analogous to zig's "fmt" subcommand. By default `zig fmt` will autoformat the code, and then it limits itself to checks if you use `--check`.

fix(gossip): send pull requests with ContactInfo instead of LegacyContactInfo (#374)

perf(shred-collector): cache verified merkle roots (#376)

This improves the performance of signature verification in the shred collector by a factor of up to 64.

The shred signature signs the merkle root. Since all the shreds from an erasure set share the same merkle root, you actually only need to verify the signature once for that merkle root, then every shred from the erasure set should be considered "verified" as long as you can reproduce the merkle root from its merkle proof.

Before this pr, the code was redundantly verifying the signature of the merkle root for every single shred. This leads to a performance bottleneck in the shred verifier because signature verification is computationally expensive.

This change removes the redundancy. It uses an lru cache to keep track of the last 1024 merkle roots to be verified.

add fix (#377)

feat(ledger): user-selectable db, hashmapdb fixes, add tests, safer memory management (#372)

* feat(ledger): user-selectable db, hashmapdb fixes, add tests, safer memory management

* fix(ledger): hashmapdb thread safety problem when copied

* fix(ledger): potential deadlock in hashmap db

* refactor(allocators): remove dead code noAlloc

feat(bincode): config default values (#380)

refactor(merkle_tree): general purpose nested list (#378)

feat(shred-collector): add metrics and grafana dashboard (#375)

* feat(shred-collector): distinguish turbine vs repair packets in metrics

* feat(grafana): shred collector dashboard

refactor(bincode): file_id config (#379)

* refactor(bincode): file_id config

* move config into file_id.zig

feat(ci) start correcting files to fit the new style guide (#361)

* start correcting files to fit the new style

* continue formatting files

* remove non-style change

* some more refactors

* fix some try precedence issues

fix(accountsdb): inf loop (#383)

Remove unused import

scope runService logger to ServiceManager

Un-scope logger on ShredCollectorDependencies

docs(research): add lmdb/accountsdb writing to docs/research (#390)

* add lmdb details to docs/research

feat(dashboards): add hardware metrics (#392)

fix(rocksdb): zig build fails on linux intel systems (#394)

This upgrades rocksdb to a new commit from our fork of rocksdb. That commit includes some fixes, and upgrades rocksdb from 9.5 to 9.7.

----
Normally, I build sig for my work machine, which is a macbook, and it builds fine. I tried to build sig on my personal laptop and my desktop, and it did not build on either system. Both are running linux with intel cpus (tiger lake and arrow lake).

There are two errors. Both are failures to build rocksdb.

----

The first error is due to incorrect syntax from our commit to our fork of rocksdb.
```
/home/drew/.cache/zig/p/1220dc54736c65c61ee181cd2716db055cb78131ebed2b88ccf4733f43d35ac39eeb/port/lang.h:79:16: error: extra tokens at end of #ifdef directive
               ^
/home/drew/.cache/zig/p/1220dc54736c65c61ee181cd2716db055cb78131ebed2b88ccf4733f43d35ac39eeb/util/crc32c.cc:18:10: note: in file included from /home/drew/.cache/zig/p/1220dc54736c65c61ee181cd2716db055cb78131ebed2b88ccf4733f43d35ac39eeb/util/crc32c.cc:18:
         ^
```
Commit introducing the bug: facebook/rocksdb@a9c9eee
Fix I committed today: facebook/rocksdb@7c32e83

----

The next error is due to a bug in zig. Zig should be able to detect that crc32 is supported by any architecture supporting sse4, but it does not. So zig fails to activate the crc32 feature even though it is available.

Meanwhile, RocksDB simply checks to see if sse4 is supported, then it proceeds to use crc32 functions. The error occurs because zig has failed to activate the crc32 feature, despite it being supported by the cpu.
```
/home/drew/.cache/zig/p/1220dc54736c65c61ee181cd2716db055cb78131ebed2b88ccf4733f43d35ac39eeb/util/crc32c.cc:611:13: error: always_inline function '_mm_crc32_u64' requires target feature 'crc32', but would be inlined into function 'crc32c_3way' that is compiled without support for 'crc32'
            CRCtriplet(crc, next, -128);
            ^
/home/drew/.cache/zig/p/1220dc54736c65c61ee181cd2716db055cb78131ebed2b88ccf4733f43d35ac39eeb/util/crc32c.cc:432:12: note: expanded from macro 'CRCtriplet'
  crc##0 = _mm_crc32_u64(crc##0, *(buf##0 + offset)); \
```
To be more precise, RocksDB should check for crc32 itself, instead of checking for sse4, since it's actually only interested in whether crc32 is supported. So I committed that change here: facebook/rocksdb@d879aa8

This avoids the compilation error because the compiler thinks crc32 is not supported, so it will not attempt to use the code without activating it. Ideally the zig compiler should also be fixed so we can actually use crc32 on platforms that support it.

feat(allocator): add recycle buffer (#381)

feat(accountsdb): integrate improved reference allocator (#382)

* private recycle method

* add unsafe suffix to unsafe methods

* feat(accountsdb): improved reference allocator

improve(benchmarks): better cli parsing (#391)

* improve(benchmarks): better cli parsing

* update usage to by dynamic with enum

Use same scope identier in logger in cleanup_service.zig

Scope to rockdb in rocksdb.zig

remove usage of @src().fn_name in log scope
dnut pushed a commit that referenced this pull request Nov 20, 2024
* Add scope to logger in Gossip Service

* Add scope to logger in Shred Inserter

* Improve adding scope to logger in Gossip Service

* Add scope to logger in transaction sender service

* Add scope to logger in AccountsDB

* Add scope to logger in GossipDumpService

* Add scope to logger in RepairService

* Add scope to logger in DownloadProgress

* Add scope to logger in BlockstoreReader

* Add scope to logger in RepairRequester

* Add scope to logger in BlockstoreWriter

* Add scope to logger in Client

* Add scope to logger in ShredCollectorDependencies

* Add scope to logger in BasicShredTracker

* Removed unused import

* Add scope to logger in ShredReceiver

* Add scope to logger in LeaderInfo

* Fix leader info

* Add scope to logger in ServiceManager

* Add scope to logger in AppBase

* Scope Gossip metrics to gossip service

* Fix after merge

* Removed unused imports

* More scoping

 fix(snapshot-download): url (#355)

test(ledger): cleanup database testing (#353)

* fix database testing

* move `std.testing.allocator` usages into test blocks

---------

Co-authored-by: Drew Nutter <dnut@users.noreply.github.com>

fix(shred-collector): shred receiver metrics not being counted (#356)

The following metrics were not being incremented:
- shred_receiver_received_count
- shred_receiver_satisfactor_shred_count

This pr fixes it by incrementing them in the appropriate places.

Fix after merge

Scope MerkleRootValidator

Remove unused import

Scoping functions not attached to structs

Update runService to take logger as anytype

Update printTimeEstimate to take logger as anytype

Remove unused imports

Scope init function of AppBase to AppBase

Scope RocksDB

refactor(ledger): rename roots and orphans to rooted_slots and orphan_slots (#357)

improve(benchmarks): support multiple values + visualizing runtimes (#316)

* build: add a `no-run` option
  Useful for when you need to build a specific executable, but not install it.

* speed up the snapshot downloader

* bench: add verifySnapshot benchmark

* download: return an error instead of panicing in `writeCallback`

* free `loading_threads` if `AccountsDB.init` fails

* benchmark progress

* feat: csv benchmark outputs

* fix: lint and remove unused

* more fixes

* add variance to single output

* fix: path allocations

* re-enable read/write benchmarks

* fix: formatting

* fix: column headers

* fix: leaks in read/write test

* fix benchmarks

* ci fix: dont run all the benchmarks

* attempt fix: OOM ci

* CI check

* fix leak in CI

* fix and identify leaks

* fix lint

* feat: start agg print results

* feat: add option to output all runtimes

* remove extra bench

* add script to view benchmark runtimes

* more improvements on scripts

* fix: update script for different lengths

* fixes

* fix formatting

* add docs

* fix: lint

* reduce logger memory

* fix script

* remove commented out code

* fix tests / build after merge

* output data to files

* support output for humans

* improve formatting

* support ledger benchs

* merge cache changes

* remove trailing commas

* fix test leak

* fix another test leak

* address comments

* support custom time units per benchmark

* fix field output

* fix post merge

* fix lint

* add back bincode benchs

* change to switch and use tmpdir

* remove assert type alias

* address more comments

* remove account_ref defaults

* lower case enums

* more comments

* fix tests

---------

Co-authored-by: David Rubin <daviru007@icloud.com>

Remove unused import

feat(ci) add line size checking to the CI (#360)

* add line size checking to the CI

* add exclude list + format the python scripts

* lexicographically order the exclude list

* add missing exclude files

feat(ledger): bench getting code shreds (#354)

* fix(ledger): backward merkle root chaining bugs

It was ignoring result of checking chaining, and just always marked it as a duplicate (as if the check failed), and always returned true (as if the check succeeded). complete nonsense

It was comparing chained root to chained root, which is wrong. it should be chained to actual root

This change also refactors common logic into a single place.

Also this now returns an error if the merkle root is missing, because that means there is a severe bug in the validator and it needs to be shutdown and patched immediately.

* Update src/ledger/shred_inserter/merkle_root_checks.zig

Co-authored-by: InKryption <59504965+InKryption@users.noreply.github.com>

* fix(ledger): zig fmt

* refactor(ledger): merkle root validator struct

the four methods all have basically the same dependencies so this bundles them together to clarify their purpose and simplify their expression

* refactor(ledger): use working stores to clarify and minimize dependencies, and to organize working state code

* refactor(ledger): in merkle root checks, rename early and late shreds to old and new shreds

* Added code shreds

* Add BlockstoreReader.getCodeShred

* Bench reader.getCodeShredsForSlot

---------

Co-authored-by: Drew Nutter <dnut@users.noreply.github.com>
Co-authored-by: InKryption <59504965+InKryption@users.noreply.github.com>

feat(accountsdb,gossip): Initial snapshot propagation work (#333)

* Refactor latest snapshot info state
  1. Guard the full and incremental snapshot info under the same rwmux,
    both to deduplicate the full slot value, and to synchronize the
    existence of incremental snapshots relative to full snapshots.
  2. Also do the same with the first snapshot info.

* Improve, simplify, and fix some bincode code

* Minor significant gossip/data rework & misc

* Pass const slice instead of arraylist

* Remove unused import

* Big refactor on accountsdb fields and init & misc
* Grouped fields into a few categories.
* Change init to take only an `InitParams` struct.
* Lowercase `AllocatorConfig` fields.
* Share the `AllocatorConfig` enum tag across a few different places.
* Inline `InitConfig`'s fields and remove it.
* Fix code that potentially leaked a disk allocator pointer.

* Notify gossip of new full snapshot

* Re-group and alphabetize type aliases

* Remove pointless `file_size` field

* Fix `writeSnapshotTarWithFields`
It was incorrectly reporting `file_info.length` as the file size,
when it should have been using `account_file.memory.len`.
This also adds a couple more runtime checks to assert the total written
bytes are aligned to full 512 byte blocks.

* Simplify allocation scheme in untar procedure

* Set up test for snapshot gossip advertisement
It's currently disabled because it overwrites the snapshot archives
in the test data dir; can comment the skip statement in order to
test it locally for now. As the TODO says, we need a mechanism for
loading from a snapshot which isn't in the accountsdb's snapshot
working directory.
This also adds a simple way to test the equality of the `SnapshotHashes`
structure in tests.

* Re-simplify SnapshotHashes incremental list
The potential memory savings were minimal at best, and came at a
significant cost in complexity - a simple tagged union is sufficient
to avoid the overhead of an allocation, whilst retaining simplicity
across the rest of the code (especially wrt generic comparisons).

* Defer gossip service shutdown
Sequence it before deinit

* Use workaround in gossip snapshot update unit test
In the future there should be a better API for loading from a snapshot
that's outside the accountsdb snapshot working directory.

* Fixes possible invalid deinit

* Remove unused imports

* Remove unused default field values

* Improve loading thread code & safety.

* Various naming improvements

* Gossip push queue & signing refactor
* Refactor `initSigned`, and related init methods, and call sites.
* Move methods into more scoped types.
* Flatten the `Gossip` struct into `AccountsDB`, and only
  store `my_pubkey`, instead of the whole key pair.
* Change `push_msg_queue` to queue unsigned gossip data, and then
  sign it in `drainPushQueueToGossipTable`.
* Set the wallclock time in `drainPushQueueToGossipTable` as well.

* Remove useless debug field

* Affirm and improve warning logs

* Put pubkey and queue back into combined optional

* Remove old commented code, adjust comparison

fix(scripts): unused import regex and misc refactoring (#362)

* fix(scripts): check_style.py regex issues, avoid early exit, and refactoring

* refactor(scripts) rename check_style.py to style.py since it's not just for checking, it does autofix as well. --check is a cli option and confusingly redundant with the filename

* refactor: remove unused imports

I noticed the regex for detecting unused imports was changed in a recent PR. This broke the detection of unused imports. The change made it so you would only detect unused lines that explicitly say `@import("...")`. But those are not the only lines we want to check. We also want to check for things like `const Thing = sig.utils.Thing;`. The recent change also broke detection of identifiers with numbers and underscores. So I fixed these issues.

I also made the regex slightly better at detecting file paths within `@import` lines by including `/` as a valid character. This identified few more unused imports which I also removed in this PR.

I consolidated the code to a main function so it's clearer what's actually running when you run the script.

I used return values to indicate failed checks, instead of directly calling `exit(1)` within the check function. That way it always runs all the checks, instead of exiting early.

I renamed the file to `style.py`. `check_style.py` is misleading because it's not just a check. By default this script will automatically edit the code to adjust its style. Only if you provide the `--check` flag does it limit itself to checks. At that point, the name is redundant with the flag. I see "style" as analogous to zig's "fmt" subcommand. By default `zig fmt` will autoformat the code, and then it limits itself to checks if you use `--check`.

fix(gossip): send pull requests with ContactInfo instead of LegacyContactInfo (#374)

perf(shred-collector): cache verified merkle roots (#376)

This improves the performance of signature verification in the shred collector by a factor of up to 64.

The shred signature signs the merkle root. Since all the shreds from an erasure set share the same merkle root, you actually only need to verify the signature once for that merkle root, then every shred from the erasure set should be considered "verified" as long as you can reproduce the merkle root from its merkle proof.

Before this pr, the code was redundantly verifying the signature of the merkle root for every single shred. This leads to a performance bottleneck in the shred verifier because signature verification is computationally expensive.

This change removes the redundancy. It uses an lru cache to keep track of the last 1024 merkle roots to be verified.

add fix (#377)

feat(ledger): user-selectable db, hashmapdb fixes, add tests, safer memory management (#372)

* feat(ledger): user-selectable db, hashmapdb fixes, add tests, safer memory management

* fix(ledger): hashmapdb thread safety problem when copied

* fix(ledger): potential deadlock in hashmap db

* refactor(allocators): remove dead code noAlloc

feat(bincode): config default values (#380)

refactor(merkle_tree): general purpose nested list (#378)

feat(shred-collector): add metrics and grafana dashboard (#375)

* feat(shred-collector): distinguish turbine vs repair packets in metrics

* feat(grafana): shred collector dashboard

refactor(bincode): file_id config (#379)

* refactor(bincode): file_id config

* move config into file_id.zig

feat(ci) start correcting files to fit the new style guide (#361)

* start correcting files to fit the new style

* continue formatting files

* remove non-style change

* some more refactors

* fix some try precedence issues

fix(accountsdb): inf loop (#383)

Remove unused import

scope runService logger to ServiceManager

Un-scope logger on ShredCollectorDependencies

docs(research): add lmdb/accountsdb writing to docs/research (#390)

* add lmdb details to docs/research

feat(dashboards): add hardware metrics (#392)

fix(rocksdb): zig build fails on linux intel systems (#394)

This upgrades rocksdb to a new commit from our fork of rocksdb. That commit includes some fixes, and upgrades rocksdb from 9.5 to 9.7.

----
Normally, I build sig for my work machine, which is a macbook, and it builds fine. I tried to build sig on my personal laptop and my desktop, and it did not build on either system. Both are running linux with intel cpus (tiger lake and arrow lake).

There are two errors. Both are failures to build rocksdb.

----

The first error is due to incorrect syntax from our commit to our fork of rocksdb.
```
/home/drew/.cache/zig/p/1220dc54736c65c61ee181cd2716db055cb78131ebed2b88ccf4733f43d35ac39eeb/port/lang.h:79:16: error: extra tokens at end of #ifdef directive
               ^
/home/drew/.cache/zig/p/1220dc54736c65c61ee181cd2716db055cb78131ebed2b88ccf4733f43d35ac39eeb/util/crc32c.cc:18:10: note: in file included from /home/drew/.cache/zig/p/1220dc54736c65c61ee181cd2716db055cb78131ebed2b88ccf4733f43d35ac39eeb/util/crc32c.cc:18:
         ^
```
Commit introducing the bug: facebook/rocksdb@a9c9eee
Fix I committed today: facebook/rocksdb@7c32e83

----

The next error is due to a bug in zig. Zig should be able to detect that crc32 is supported by any architecture supporting sse4, but it does not. So zig fails to activate the crc32 feature even though it is available.

Meanwhile, RocksDB simply checks to see if sse4 is supported, then it proceeds to use crc32 functions. The error occurs because zig has failed to activate the crc32 feature, despite it being supported by the cpu.
```
/home/drew/.cache/zig/p/1220dc54736c65c61ee181cd2716db055cb78131ebed2b88ccf4733f43d35ac39eeb/util/crc32c.cc:611:13: error: always_inline function '_mm_crc32_u64' requires target feature 'crc32', but would be inlined into function 'crc32c_3way' that is compiled without support for 'crc32'
            CRCtriplet(crc, next, -128);
            ^
/home/drew/.cache/zig/p/1220dc54736c65c61ee181cd2716db055cb78131ebed2b88ccf4733f43d35ac39eeb/util/crc32c.cc:432:12: note: expanded from macro 'CRCtriplet'
  crc##0 = _mm_crc32_u64(crc##0, *(buf##0 + offset)); \
```
To be more precise, RocksDB should check for crc32 itself, instead of checking for sse4, since it's actually only interested in whether crc32 is supported. So I committed that change here: facebook/rocksdb@d879aa8

This avoids the compilation error because the compiler thinks crc32 is not supported, so it will not attempt to use the code without activating it. Ideally the zig compiler should also be fixed so we can actually use crc32 on platforms that support it.

feat(allocator): add recycle buffer (#381)

feat(accountsdb): integrate improved reference allocator (#382)

* private recycle method

* add unsafe suffix to unsafe methods

* feat(accountsdb): improved reference allocator

improve(benchmarks): better cli parsing (#391)

* improve(benchmarks): better cli parsing

* update usage to by dynamic with enum

Use same scope identier in logger in cleanup_service.zig

Scope to rockdb in rocksdb.zig

remove usage of @src().fn_name in log scope

* Fixes after rebasing

* Style

* More fixes
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

2 participants