- Feature Name:
stolon
- Start Date: 2020-08-31
- Design PR: ZcashFoundation/zebra#1006
- Zebra Issue: ZcashFoundation/zebra#1643
- Zebra Implementation: ZcashFoundation/zebra#1007
Note: This is a draft Zebra RFC. See ZcashFoundation/zebra#1643 for more details.
Stolon is a transaction diffusion algorithm inspired by Dandelion, with additional privacy properties. Dandelion diffuses transactions across a p2p network in two phases, a "stem" phase where transactions are relayed in a line and a "fluff" phase where transactions are broadcast as widely as possible. Stolon, named for runners that extend horizontally at or just below the soil to spread new plants, tweaks this approach to perform the "stem" phase over ephemeral Tor connections and the "fluff" phase over the clearnet p2p network.
Stolon is designed to for use both by full nodes such as Zebra and Zcashd as well as by wallet software. This RFC describes:
- the threat model Stolon addresses;
- how Zebra's architecture makes Stolon easy to implement;
- how Stolon can be used by Zebra or Zcashd;
- and how Stolon can be used by wallet software.
At a high level, Zcash can be thought of as a fault-tolerant system for replicating ledger state (such as transactions and blocks) among the participants in the Zcash network. This ledger state is public and globally-shared, as in Bitcoin, but unlike Bitcoin, Zcash uses advanced cryptographic techniques to provide a private ledger using public data.
This functionality is provided by shielded transactions, whose public data (sent over the network and stored in the block chain) does not reveal private information about ledger state (such as transaction amounts, senders, or receivers).
However, shielded transactions are created by some particular user, and this poses a linkability challenge: how can we manage the transition of some particular transaction from its creation by some particular user into being part of the public, globally-shared chain state, without revealing its provenance. This is the problem of private transaction diffusion.
Zcashd's transaction diffusion mechanism does not attempt to provide privacy. Zebra does not yet have any transaction diffusion mechanism, so designing one for Zebra is an opportunity to improve on this state of affairs. This RFC proposes Stolon.
As noted in the definitions section, this RFC considers a broad definition of transaction diffusion that encompasses the complete process from the initial production of a transaction up to its inclusion in a mined block. Thus, while Stolon is primarily designed as a transaction diffusion mechanism for Zebra, it is also intended to be useful for wallet software.
This problem is made more important to address by the existence of Zebra. Unlike Zcashd, Zebra uses its scalable network architecture to attempt to connect to as many peers as possible, so every Zebra node is a supernode in the connection graph. And because Zebra is designed to support observability and instrumentation, Zebra can be trivially repurposed as the network monitoring adversary described in the Dandelion paper by setting an appropriate tracing filter.
-
transaction diffusion: the complete process from the initial production of a transaction up to its inclusion in a mined block.
-
relay phase: the initial, small-fanout diffusion phase, described below.
-
broadcast phase: the final, wide-fanout diffusion phase, described below.
-
node: a participant in the Zcash network.
-
wallet software: software that generates transactions. This can either be standalone software (light wallets) or integrated with a full node.
Stolon, named for runners that extend horizontally at or just below the soil to spread new plants, diffuses transactions in two phases:
- a small-fanout relay phase, performed over special, ephemeral Tor connections. These connections are isolated from all other node state and from each other, and use a special, minimally-distinguishable Zcash handshake.
- a wide-fanout broadcast phase, performed over clearnet p2p connections, spreading transactions as quickly as possible.
Transaction diffusion starts in the relay phase, and transitions to the broadcast phase with random probability.
The relay phase does not require any changes to the Zcash network protocol,
but it does require a protocol implementation that can create completely
isolated network connections and configure them not to leak distinguishing
information. This is impossible with Zcashd's implementation, which
commingles protocol state between all connections, but is easy to implement
using zebra-network
. The implementation of the relay phase is provided by a
new stolon
crate, which also handles communication with Tor's SOCKS proxy.
This crate can be used as a Rust library, and provides a standalone stolon
binary that reads a transaction from stdin and relays it to the Zcash
network. This allows the relay phase to be easily performed by wallet
software or Zcashd, by integrating the stolon
binary or library.
Stolon is inspired by Dandelion, a design for more private transaction diffusion in Bitcoin. Dandelion is described in detail in the Dandelion paper, the Dandelion++ paper, and in BIP156.
At a high level, Dandelion works as follows:
-
Each node chooses two "Dandelion destinations" among its peer set, which it will use for transaction relaying. These destination peers serve a similar role to guard nodes in Tor, and each node's choice of destination peers forms a subgraph of the full connection graph, called the "privacy subgraph".
-
In the "stem phase", transactions are forwarded along the privacy subgraph. This forwarding is performed with Dandelion-specific network messages and transactions are stored in a special "stempool" separate from the ordinary mempool. At each step, a transaction can transition to the fluff phase with small probability.
-
In the "fluff phase", transactions are broadcast as widely as possible across the full connection graph.
An adversary maintaining connections to many peers and monitoring which transactions they advertise has much more limited insight into the privacy subgraph than the full connection graph. Forwarding transactions along the privacy subgraph aims to obscure the origin of any particular transaction.
The original Dandelion design in BIP156 required changes to the Bitcoin network protocol, and was not adopted. A simplified version, Dandelion-lite, also exists. The simplified version drops the special network messages and only performs the stem phase for newly-generated transactions.
Dandelion's privacy gains are estimated empirically, but it's unclear how well these estimates apply to Zcash in practice. For instance, the original paper studies information flow in 16-regular graphs, on the basis that the default Bitcoin configuration creates 8 peer connections. But in practice the connection graph is not regular, because outbound connections are not uniformly distributed across other nodes. This issue becomes more extreme in the case of Zcash, where every Zebra node is a supernode, so further analysis of information flow in irregular graph models would be desirable.
Zcash inherits the Bitcoin networking protocol. Transactions are handled using the following messages:
-
inv
: theinv
message contains a list of object hashes, and is used by a node to advertise the availability of some object to other peers. In particular, transactions can be advertised to other peers by sending them aninv
message containing the hash of the transaction. -
getdata
: thegetdata
message requests objects by hash, and is used by a node to request the contents of some previously-advertised data. -
tx
: thetx
message contains transaction data. It can be sent in response to agetdata
or unsolicited. -
mempool
: themempool
message requests aninv
message containing a list of hashes of transactions in the node's mempool.
As described above, Stolon has two phases: a relay phase, using isolated, minimally-distinguishable Tor connections, and a broadcast phase, using clearnet p2p connections. Stolon has the following parameters:
- The transition probability
0 < q < 1
; - The fanout degree
f
.
Newly created transactions begin in the relay phase. When a node receives an
unsolicited tx
message, it assumes that the transaction it contains is in the
relay phase. It attempts to commit the transaction to its mempool. If
successful, the node continues relaying with probability 1-q
or enters the
broadcast phase with probability q
.
To continue relaying, the node chooses f
relay peers uniformly at random from
its active peer connections. For each relay peer, the node attempts to open an
ephemeral Tor connection to that peer (as described in more detail below),
forwards the transaction in an unsolicited tx
message, and closes the
connection. Failures are ignored.
To enter the broadcast phase, the node immediately sends an inv
message
containing the transaction hash to every connected peer.
When a node receives an inv
message advertising a transaction, it assumes
that the advertised transaction is in the broadcast phase. If the hash is
unknown, it attempts to download the transaction and commit the transaction to
its mempool. If successful, it broadcasts the transaction to its connected
peers.
Because transactions enter the mempool in the relay phase, it would be possible for an adversary to learn of the presence of a new transaction in advance of it entering the broadcast phase. To prevent this, the mempool can maintain a list of relay transaction hashes used to filter responses to mempool queries. Hashes can be removed from the relay transaction list when they are observed from other peers, for instance by removing a hash from the relay transaction list as part of a mempool membership query (since the mempool is queried to determine whether an advertised transaction is already known).
Using Zcash or Bitcoin over Tor is generally inadvisable. The reason is that while Tor provides network-level anonymity, the Bitcoin p2p protocol used in Zcash leaks an enormous amount of distinguishing information about a node's state, with at best half-hearted and ineffective fingerprinting countermeasures.
For instance, peer discovery is implemented the getaddr
message, which
requests an addr
message containing a list of peers and their last-seen
timestamps. The last-seen timestamps record the time the responding node last
received a message from that peer. In addition to node fingerprinting, this is
probably a useful additional source of information for an adversary seeking to
trace transaction origins. Zcashd attempts to prevent fingerprinting by
serving at most one addr
message per peer connection, but does so
statelessly, so tearing down and rebuilding the connection defeats this
prevention.
Building Zcash connections over Tor is therefore only useful if the connections can be:
-
completely isolated from all other node state, so that peers cannot make queries revealing information about the node;
-
constructed with minimal distinguishing information, such as IP addresses, timestamps, etc.
This is not possible with the network implementation in Zcashd but is fairly
straightforward to implement using zebra-network
. The required steps are:
-
Change the
Handshake
service to allow full configuration of the data sent while constructing a connection; -
Provide a new method that constructs an isolated connection:
pub async fn connect_isolated( conn: TcpStream, user_agent: String, ) -> Result<BoxService<Request, Response, BoxedStdError>, BoxedStdError>
This method takes an existing
TcpStream
so that it can be used with a pre-constructed Tor circuit. This method should construct a newHandshake
instance and configure it with the fixed default values specified below. TheHandshake
service takes a handlerService
used to process inbound requests from the remote peer;connect_isolated
should pass a no-opservice_fn
that ignores requests. -
Use
tokio-socks
to construct aTcpStream
backed by a fresh Tor circuit, and pass it toconnect_isolated
.
This allows Zebra, or any other user of zebra-network
, to construct anonymous
connections to Zcash peers. However, there is some amount of work required to
glue these pieces together.
This glue is implemented in the stolon
crate, which provides a single method
for the relay operation:
pub async fn send_transaction(
tor_addr: SocketAddr,
peer_addr: SocketAddr,
transaction: Arc<Transaction>,
) -> Result<(), BoxError>
This method requires knowledge of the peer address and the use of an async Rust
library. However, this may be inconvenient for wallet software. To address this
use-case, the stolon
crate provides a standalone stolon
binary, which reads
a Zcash transaction from stdin, obtains peers by performing a DNS-over-Tor
lookup to the Zcash Foundation's DNS seeders, and then sends the transaction to
those peers.
Zcashd can implement Stolon by using the stolon
crate as a library.
To be minimally-distinguishable, the version message sent during the handshake should have the following fields:
Field | Value |
---|---|
version |
Current supported Zebra network version |
services |
0 , no advertised services |
timestamp |
Local UNIX timestamp, truncated to 5-minute intervals |
address_recv |
(NODE_ADDR , 0.0.0.0:8233 ) |
address_from |
(0 , 0.0.0.0:8233 ) |
nonce |
Random value |
user_agent |
"" , the empty string |
start_height |
0 |
relay |
false |
The timestamp is chosen to be truncated to 5-minute intervals as a balance between two goals:
- providing minimally-distinguishable data in the handshake;
- not interfering with
zcashd
's existing code.
There is no specification of the network protocol, so when considering the
latter goal the best we can do is to make inferences based on the current state
of the zcashd
source code. Currently, zcashd
only uses the handshake
timestamp at the end of parsing the version message:
pfrom->nTimeOffset = timeWarning.AddTimeData(pfrom->addr, nTime, GetTime());
The AddTimeData
method defined in src/timedata.cpp
is a no-op as long as
the difference between the specified timestamp and zcashd
's local time is less
than a TIMEDATA_WARNING_THRESHOLD
set to 10 minutes. Otherwise, the data is fed
into a process that attempts to detect when the local time is wrong.
Truncating to the nearest 5 minutes (half of TIMEDATA_WARNING_THRESHOLD
)
attempts to try to stay within this interval, so as to not confuse zcashd
.
In general, truncation may not be sufficient to prevent an observer from obtaining distinguishing information, because of aliasing between the truncated data and the original signal. For instance, continuously broadcasting the node's timestamp truncated to a 5-minute interval would not hide the node's clock skew, which would be revealed at the moment the node switched to the next truncation. However, this is unlikely to be a problem in practice, because the number of samples an observer can obtain from a single node is likely to be small. And, unlike other approaches like adding random noise, truncation is simple, reliable, and deterministic, making it easy to test.
Unfortunately, Tor is difficult to use as a library and its ease of deployment varies by platform. We therefore need to consider how to support Tor integrations and to provide a fallback mechanism when Tor is not available.
To handle fallback, a Zebra node without Tor access can perform the relay phase over an existing peer connection, rather than over an ephemeral Tor connection. This is much worse for privacy, but is similar to the design of Dandelion-lite. Ideally, Zebra+Tor is the common case for Zcash nodes, and Zebra nodes with Tor support can improve network privacy for Zebra without Tor and Zcashd nodes.
We should consider how we can encourage Tor deployments for Zebra. One option would be to recommend Docker as the primary install mechanism, and bundle a Tor relay image with the Zebra image. Another would be to attempt to package Zebra with a dependency on Tor.
-
This design is more complicated than simply immediately broadcasting transactions to all peers.
-
This design requires integration with external software (Tor) to be effective, although the fallback method probably provides better privacy than immediate broadcast.
-
Zcashd keeps IP-based peer reputation scores and may reject connections from Tor exits if too many invalid transactions are sent from one exit. However, only one relay-to-Zcashd has to succeed for Zcashd to broadcast a transaction to all of its peers, and Zebra, which does not yet have a reputation system, can implement one that's aware of transaction relaying. It's unclear whether this will be a problem in practice, and the solution would probably be to fix Zcashd.
-
Dandelion without Tor: this option has the advantage of not requiring integration with external software, but it provides weaker privacy benefits. Unlike Dandelion, in Stolon, a single step in the relay phase completely unlinks a transaction from its source.
-
Overlap with wallet code: the code required to implement Stolon is reusable for wallet software, and provides strong anonymity properties for that use case regardless of how widely Zebra is deployed.
-
It's something: Zebra needs a transaction diffusion mechanism, and this one can share code with wallet software and provide strong benefits in the Zebra+Tor case, even if the privacy properties of the fallback mechanism are unknown.
-
Using Tor connections only for wallets, not for node relay: this doesn't save much complexity, since the Tor code has to exist anyways, but it does prevent Zebra+Tor nodes from anonymously relaying transactions on behalf of software that does not use Tor connections.
-
Guard peers: instead of using random clearnet peers for non-Tor fallback, we could choose
f
guard peers as in Tor. This is probably workable but would complicate the backpressure design of the peer set somewhat, as we would probably have to dedicatef
peers to only serve transaction relay requests. This might introduce additional network distinguishers. -
Mixnets: Tor provides strong privacy properties in practice, and can be used today. In contrast, mixnets are not yet at the same level of technology readiness and are significantly more complex to integrate.
This design is inspired by Dandelion and its variants, as described above.
Zcashd's transaction diffusion algorithm immediately advertises new transactions to all peers, and does not attempt to hide their source.
-
How should we encourage Zebra to be deployed alongside Tor?
-
What privacy benefits, if any, are provided by the clearnet fallback relay method, compared to Zcashd's baseline?
-
What parameter choices should we make?
q = 0.33
,f = 2
seems like a reasonable starting point, but we should use actual estimates, not just made up parameters. -
How can we help wallet software integrate Stolon?
- Quantitative analysis of Stolon's privacy properties in the fallback relay case.