This contains notes for a workshop at Surfin' Bitcoin 2022, specifically to review PR #25768, but can be generalized for whichever PR you want.
-
Fork the bitcoin/bitcoin repo.
-
Clone your fork locally. Add both bitcoin/bitcoin and youruser/bitcoin as remotes and name them something non-confusing for yourself. There are multiple ways to do this. Here is the way through https. I've named my remote "origin" and the bitcoin/bitcoin remote "upstream."
git clone bitcoin https://github.com/<user>/bitcoin.git
cd bitcoin
git remote add upstream https://github.com/bitcoin/bitcoin.git
git remote add origin https://github.com/<user>/bitcoin.git
- Download the dependencies and build the source code. See docs for building instructions and options for your computer. Here is what I do on my Ubuntu box:
./autogen.sh
./configure
make -j $(nproc + 1)
- See that you now have a bitcoind executable. You can run your own bitcoin node! If you built the gui, you can also run that.
./src/bitcoind
- Run some tests. I'd start with the unit and functional tests.
make check
test/functional/test_runner.py
(Optional)
See the Bitcoin Core productivity tips. Top tips: install ccache and multi-thread builds.
Bitcoin Core is a large codebase. Make it easy for yourself to find things in the code. Install a good grepper, use ctags, or use an IDE with search functionality built in.
Get familiar with github, collaborating using pull requests and using git in an open source project.
A diff viewer can go a long way in helping you review PRs.
There are various Bitcoin Core-specific resources on general practices, review culture, first-hand accounts, etc. You don't need to read everything before you get started, but they might help.
Find an IRC client you like and join the #bitcoin-core-dev channel on the libera.chat server to attend meetings, ask questions, and/or ask for CI restarts.
This is what I normally do to review a PR.
Check out the PR locally. There are multiple ways to do this, e.g., fetching through the repo,
adding the author as a remote, or setting git config to automatically create branches for pull
requests (I like this a lot because it also means I get PR numbers whenever I use git log
).
For today, we can do the simplest method:
git fetch upstream pull/25768/head:review25768
git checkout review25768
Rebase on master to detect silent merge conflicts. Build and run the unit + functional test suite for a smoke check. This should tell us if there is something obvious wrong. By no means should you consider "all tests passed" to mean this PR is good to go.
git rebase master
make check -j12 && test/functional/test_runner.py -j12
If there are multiple commits, run this smoke check on each one to make sure the commits compile and
"work" individually. Where n
is the number of commits to be merged from the branch:
git rebase -i --exec "make check && test/functional/test_runner.py" HEAD~n
These steps above might take a while, so now is a good time to pull up the PR on github, look at the PR description if we haven't already, read other reviewers' comments, and gather background information on the concept. Usually I've already done this before I checked it out locally, but getting building and running as early as possible (especially if people are doing this for the first time) makes sense for our 60 minute-workshop.
For example, was there a PR Review Club on this PR? In our case, yes!
All PRs must be sufficiently motivated, even if it's a "minor" change that wouldn't affect users. Before looking at the details of the C++ code, let's ask ourselves whether this PR is a good idea.
PR #25768's motivation is something we can classify as a bug fix.
These questions should follow:
- Is this actually a bug? Is it a real bug or a misuse of the software, something impossible to hit realistically? Can the issue be solved simply with better documentation?
- "How bad" is the bug? What users are affected?
- Is there a test that can be run before to reproduce the issue, and after to verify the fix?
- Is there a description of how to reproduce the bug manually? Can we reproduce the bug ourselves?
Since PR #25768 includes a test, let's first make sure the test passes with the changes in this PR and fails without it. Since this is a test that only sometimes fails, we can run it multiple times:
# while on the PR branch, this should succeed
make && test/functional/wallet_resendwallettransactions.py
# one way is to drop the first commit with the wallet changes
git rebase -i HEAD~2 # drop the top commit
# another way is to cherry-pick the test commit onto master
git checkout master
git cherry-pick e40ddf36be
# this should fail
make
for i in {1..10}; do test/functional/wallet_resendwallettransactions.py; done
Once you've determined whether this PR is a good idea, you might be comfortable leaving a "Concept ACK" review on the PR.
So we've determined this is a good idea. Now we want to look at the code. This part is much less structured. We'll probably just walk through the code, try to understand it, test things out, and ask outselves questions like:
- Are there any alternative approaches? How does this approach compare?
- Are space and time complexity acceptable for the use case?
- Could this implementation easily be swapped for an STL algorithm or container?
We'll also look at the test and ask ourselves questions like:
- Is there good test coverage? Can we think of any untested edge-cases? If we change a line, do any tests fail?
- Is the type of test appropriate?
- Is the test well-written? Is it unacceptably slow, thread-safe, or unmaintainably complex?
Once you've decided this approach is more appropriate than any alternatives, you might be comfortable leaving an "Approach ACK" review on the PR.
Some other questions (from the review club):
-
Why is it difficult to write a test that reliably demonstrates that transactions may be rebroadcast in the wrong order? How does
std::unordered_map
order its elements? -
Can you think of any other methods of getting the wallet transactions sorted in this order?
-
Would it make sense to keep the wallet transactions sorted in this order? Why is
mapWallet
implemented as astd::unordered_map
; would another data structure make sense? -
If you thought of an alternative approach, how does it compare? What is its runtime complexity? How much {more, less} memory is allocated? Does it require a lot of code to be (re)written? Is its complexity appropriate for this purpose?
-
How is
GetSortedTxs()
, the function extracted fromReacceptWalletTransactions()
, implemented? How much memory is needed, as a function of the transactions inmapWallet
? -
GetSortedTxs()
is declared as requiring the wallet lock to already be held. Why or why not does this make sense? -
A structured binding declaration is used in two places in
GetSortedTxs()
. What are the types of each variable (wtxid
,wtx
,_
, andwtx
) and what do they represent? Why or why not is this better than declaring the types explicitly? -
Can you find any behavior changes to
ReacceptWalletTransactions()
, or is it a move-only code change? -
The test calls both
setmocktime
andmockscheduler
to trigger a rebroadcast. What is the difference between these two calls, and why or why not is it necessary to call both of them? -
Can you find any behavior changes to
ReacceptWalletTransactions()
, or is it a move-only code change? -
The test "evicts" the transactions from the mempool by calling
setmocktime
. Why is this necessary, and why does it work?
Doxygen is quite helpful in looking at the caller/callee graphs of functions in the codebase.
For example, in this PR, we can look up the path for
CWallet::ResendWalletTransactions
.
Bitcoin Core is a large and old(ish) software project, but its development culture is very unique in
that everything must be strongly motivated and almost all decisions are publicly documented in
the commit messages and discussions. At times, it may seem tedious to have paragraphs of
justification in commit messages, but if you're ever wondering "when and why was this implemented
this way?" you can always find the answer through a few git blame
, git log
, and/or github
searches. This is incredibly powerful!
For example, if you're wondering why wallet rebroadcast works the way it does, or
are confused by the comment above ResendWalletTransactions
, use the github blame
feature
to see who wrote it, copy the commit hash into your github search bar, and read the discussion on
the original PR and
issue.
If you want to see the history of changes to a function, even if things were renamed or moved to
different files, you can use git log
to see the history. For example, using git log
on the
ReacceptWalletTransactions()
can show us when SubmitMemoryPoolAndRelay
was introduced:
$ git log -s -L 1837,1852:src/wallet/wallet.cpp
commit 9a556564e9dc64ae0ad723c78da33d0c982f006f
Author: Andrew Chow <achow101-github@achow101.com>
Date: Mon Aug 1 20:10:37 2022 -0400
wallet: Sort txs in ResendWalletTransactions
ReacceptWalletTransactions sorts txs by insertion order,
ResendWalletTransactions should as well.
Includes a minor refactor of ResendWalletTranactions in order to share
tx sorting code.
commit b11a195ef450bd138aa03204a5e74fdd3ddced26 (upstream/pr/22100)
Author: Russell Yanofsky <russ@yanofsky.org>
Date: Fri Feb 12 18:01:22 2021 -0500
refactor: Detach wallet transaction methods (followup for move-only)
Followup to commit "MOVEONLY: CWallet transaction code out of
wallet.cpp/.h" that detaches and renames some CWalletTx methods, making
into them into standalone functions or CWallet methods instead.
There are no changes in behavior and no code changes that aren't purely
mechanical. It just gives spend and receive functions more consistent
names and removes the circular dependencies added by the earlier
MOVEONLY commit.
There are also no comment or documentation changes. Removed comments
from transaction.h are just migrated to spend.h, receive.h, and
wallet.h.
commit b66c429c56c85fa16c309be0b2bca9c25fdd3e1a
Author: Antoine Riard <ariard@student.42.fr>
Date: Mon Apr 29 09:52:01 2019 -0400
Remove locked_chain from GetDepthInMainChain and its callers
We don't remove yet Chain locks as we need to preserve lock
order with CWallet one until swapping at once to avoid
deadlock failures (spotted by --enable-debug)
commit c8b53c3beafa289dcdbd8c2ee9f957bdeca79cbc (upstream/pr/16557)
Author: John Newbery <john@johnnewbery.com>
Date: Fri Aug 9 11:07:30 2019 -0400
[wallet] Restore confirmed/conflicted tx check in SubmitMemoryPoolAndRelay()
Restores the confirmed/conflicted tx check removed in
8753f5652b4710e66b50ce87788bf6f33619b75a. There should be no external
behaviour change (these txs would not get accepted to the mempool
anyway), but not having the check in the wallet causes log spam.
Also adds a comment to ResentWalletTransactions() that
confirmed/conflicted tx check is done in SubmitMemoryPoolAndRelay().
commit 8753f5652b4710e66b50ce87788bf6f33619b75a
Author: Antoine Riard <ariard@student.42.fr>
Date: Thu Apr 11 16:01:58 2019 +0000
Remove duplicate checks in SubmitMemoryPoolAndRelay
IsCoinBase check is already performed early by
AcceptToMemoryPoolWorker
GetDepthInMainChain check is already perfomed by
BroadcastTransaction
To avoid deadlock we MUST keep lock order in
ResendWalletTransactions and CommitTransaction,
even if we lock cs_main again further.
in BroadcastTransaction. Lock order will need
to be clean at once in a future refactoring
commit 611291c198eb2be9bf1aea1bf9b2187b18bdb3aa
Author: Antoine Riard <ariard@student.42.fr>
Date: Thu Apr 11 15:58:53 2019 +0000
Introduce CWalletTx::SubmitMemoryPoolAndRelay
Higher wallet-tx method combining RelayWalletTransactions and
AcceptToMemoryPool, using new Chain::broadcastTransaction
Bitcoin Core is written in C++, which is usually a new language to people. But don't worry, the need to learn more C++ will never stop! Sometimes syntax is new to C++14 or C++17, some annotations are used with specific compilers and/or options. For example, this PR uses structured binding and clang thready safety analysis, and the test's nondeterminism is due to how STL unordered maps work.
If you're coming from higher-level programming languages, something like A Tour of C++ can give you a nice tutorial overview to the C++ memory model, usage of STL containers and algorithms, and an introduction to some C++17 and C++20 niceities. I've also found Effective Modern C++ to be a good "recipe book" for C++14.
Now that you've reviewed your first PR, what's next?
Leave a review on the PR. Here's a writeup on what makes a "good" or "bad" review. Some more resources on "how to contribute" to Bitcoin Core:
Attend a PR review club meeting! I personally learned about Bitcoin Core by attending these review clubs weekly. I reviewed the PR ahead of time on Tuesdays (initially, it would take me a whole day to get through the notes), asked questions at the meeting on Wednesdays, and usually left a review on Thursdays.
"How do I know what PRs or issues to work on?"
- High Priority PR Board
- Marco's Code Coverage Graphs
- good first issues
- Jonatack's Blog Post On Review