\green{}
The Swarm project sets out to build a permissionless storage and communication infrastructure for the self-sovereign digital society of tomorrow. From a developer's perspective, Swarm is best seen as public infrastructure that powers real-time interactive web applications familiar from the \gloss{Web 2.0} era. It provides a low-level API to primitives that serve as building blocks of complex applications, as well as the basis for the tools and libraries of a Swarm-based \gloss{Web 3.0} development stack. Designed to allow access from any traditional web browser, the API and tools ensure that Swarm can swiftly offer a private and decentralised alternative to today's \gloss{World Wide Web} (\gloss{WWW}).
\begin{figure}[htbp]
\centering
\includegraphics[width=0.8\textwidth]{fig/swarm-layered-design_new.pdf}
\caption[Swarm's layered design \statusgreen]{Swarm's layered design}
\label{fig:Swarm-layered-design}
\end{figure}
This part details the design and architecture of the system. In accordance with the principles laid out in \ref{sec:design-principles}, we prioritise a modular design approach for Swarm, conceiving it with clearly separable layers, each dependent on the previous one (see Figure \ref{fig:Swarm-layered-design}):
\begin{itemize}
\item[(1)] {A peer-to-peer network protocol to serve as underlay transport.}
\item[(2)] {An overlay network with protocols powering a distributed immutable storage of \glossplural{chunk} (fixed size data blocks).}
\item[(3)] {A component providing high-level data access and defining APIs for base-layer features.}
\item[(4)] {An application layer defining standards and outlining best practices for more elaborate use-cases.}
\end{itemize}
We regard (2) and (3) as the core of Swarm. Since the overlay network relies on it, we also formulate the requirements for (1), but we consider the detailed treatment of both (1) and (4) outside the scope of this book.
Central to the design is the architecture of Swarm's overlay network (layer 2 in Figure \ref{fig:Swarm-layered-design}), which is discussed in Chapter \ref{sec:network}. Chapter \ref{sec:incentivisation} complements this by describing the system of economic incentives that makes Swarm self-sustaining. In Chapter \ref{sec:high-level-functionality}, we introduce the algorithms and conventions that allow Swarm to map data concepts onto the chunk layer to enable the high-level functionalities for storage and communication, notably data structures such as filesystems and databases, \gloss{access control}, indexed \gloss{feeds}, and direct messaging, which comprise layer 3 of Swarm.
In Chapter \ref{sec:persistence}, we present ways to prevent garbage-collected chunks from disappearing from the network, including: \glossplural{erasure code}, \gloss{pinning} and insurance, and also provide ways to monitor, recover and re-upload them using missing chunk notifications and insurance challenges.
Finally, in Chapter \ref{sec:ux}, we will look at functionality from the perspective of the developer who builds on Swarm.
\chapter{Network}\label{sec:network}
This chapter elaborates on how the Swarm overlay network is built on top of a \gloss{peer-to-peer} network protocol to form a topology that allows for the routing of messages between nodes (\ref{sec:topology-routing}). In \ref{sec:kademlia-storage}, we describe how such a network can serve as a scalable \gloss{distributed storage} solution for data chunks (\ref{sec:chunks}) and present the logic underpinning the protocols for retrieval/download and syncing/upload (\ref{sec:push-and-pull}).
\section{Topology and routing \statusgreen}\label{sec:topology-routing}
This section sets the scene (\ref{sec:underlay-transport}) for the \gloss{overlay network} of Swarm by making explicit the assumptions about the underlay network. \ref{sec:overlay-addressing} introduces the \gloss{overlay address space} and explains how nodes are assigned an address. In \ref{sec:kademlia-routing}, we present the \gloss{Kademlia} \gloss{overlay topology} (connectivity pattern) and explain how it solves routing between nodes. In \ref{sec:bootstrapping}, we show how nodes running the Swarm client can discover each other, bootstrap, and then maintain the overlay topology.
\subsection{Requirements for underlay network \statusyellow}\label{sec:underlay-transport}
\yellow{}
Swarm is a network operated by its users. Each node in the network is supposed to run a client complying with the protocol specifications. On the lowest level, the nodes in the network connect using a peer-to-peer network protocol as their transport layer. This is called the \gloss{underlay network}.
In its overall design, Swarm is agnostic of the particular underlay transport used as long as it satisfies the following requirements:
\begin{labelledlist}
\item[\emph{Addressing}] Nodes are identified by their \gloss{underlay address}.
\item[\emph{Dialling}] Nodes can initiate a direct connection to a peer by dialling them on their underlay address.
\item[\emph{Listening}] Nodes can listen to other peers dialling them and can accept incoming connections. Nodes that do not accept incoming connections are called \glossplural{light node}.
\item[\emph{Live connection}] A node connection establishes a channel of communication, indicating that the remote peer is online and accepting messages until explicitly disconnected.
\item[\emph{Channel security}]
The channel provides identity verification and implements encrypted and authenticated transport resisting man-in-the-middle attacks.
\item[\emph{Protocol multiplexing}]
The underlay network service can accommodate several protocols running on the same connection. Peers communicate the names and versions of the protocols they implement, and the underlay service identifies compatible protocols and starts up a peer connection on each matched protocol.
\item[\emph{Delivery guarantees}]
Protocol messages have \gloss{guaranteed delivery}, i.e. any delivery failures due to network problems prompt direct error responses.
The sequence in which messages are delivered is guaranteed within each protocol, and ideally, the underlay protocol provides prioritisation.
If protocol multiplexing takes place on the same transport channel, framing is likely implemented to prevent long messages from blocking higher-priority ones.
\item[\emph{Serialisation}]
The protocol message construction supports arbitrary data structure serialisation conventions.
\end{labelledlist}
The \gloss{libp2p} library can provide all the needed functionality and is the designated underlay connectivity driver in the specification.%
%
\footnote{Swarm's initial Golang implementation uses Ethereum's \gloss{devp2p}/rlpx which satisfies the above criteria and uses TCP/IP with custom cryptography added for security. The underlay network address that devp2p uses is represented using the \gloss{enode URL scheme}. Devp2p dispatches protocol messages based on their message ID. It uses RLP serialisation which is extended with higher level data type representation conventions. In order to provide support for the Ethereum 1.x blockchain and for storing its state on Swarm, we may provide a thin devp2p node that proxies queries to the new libp2p-based Swarm client, or just uses its API. Otherwise we expect the devp2p networking support to be discontinued.}
\subsection{Overlay addressing\statusgreen}\label{sec:overlay-addressing}
\green{}
While clients use the \gloss{underlay address} to establish connections to peers, each node running Swarm is additionally identified with an \gloss{overlay address}. It is this address that determines which peers a node will connect to and also directs the way messages are forwarded. The overlay address is assumed to be stable as it defines a node's identity across sessions and ultimately affects which content is prioritised for storage in the node's local storage.
The node's \gloss{overlay address} is derived from an Ethereum account by hashing the corresponding elliptic curve public key with the \gloss{BZZ network ID} using the 256-bit Keccak algorithm. The inclusion of the BZZ network ID is rooted in the fact that there can be multiple Swarm networks (e.g.\ test net, main net, or private Swarms); it ensures that the same address cannot be used across different networks. Assuming the base accounts are independently selected, the resulting overlay addresses are expected to be uniformly distributed in the address space of 256-bit integers. It is important to derive the address from a public key, as this allows nodes to issue commitments associated with an overlay location using cryptographic signatures that are verifiable by third parties.
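For concreteness, the following Go sketch (Go being the language of the reference implementation) shows one way the derivation could be realised with the go-ethereum \texttt{crypto} package; the exact serialisation of the public key and the encoding of the network ID are assumptions of this sketch, not normative.
\begin{verbatim}
import (
    "crypto/ecdsa"
    "encoding/binary"

    "github.com/ethereum/go-ethereum/crypto"
)

// overlayAddress derives a node's overlay address by hashing the
// elliptic curve public key together with the BZZ network ID.
// The byte encodings chosen here are illustrative assumptions.
func overlayAddress(pub *ecdsa.PublicKey, networkID uint64) []byte {
    id := make([]byte, 8)
    binary.LittleEndian.PutUint64(id, networkID)
    return crypto.Keccak256(crypto.FromECDSAPub(pub), id)
}
\end{verbatim}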
Using the long-lived communication channels of the \gloss{underlay network}, Swarm nodes form a network with \emph{quasi-permanent} peer connections. The resulting connectivity graph can then realise a particular topology defined over the address space. The \gloss{overlay topology} chosen is called \gloss{Kademlia}: It enables communication between any two arbitrary nodes in the Swarm network by providing a strategy to relay messages using only underlay peer connections. The protocol that describes how nodes share information with each other about themselves and other peers is called 'hive'. How nodes use this protocol to bootstrap the \gloss{overlay topology} is discussed in \ref{sec:bootstrapping}.
% The theoretical basis for \gloss{Kademlia topology} is formalised rigorously in \ref{sec:kademlia-connectivity}.
It is crucial that the overlay address space encompasses the full range of 256-bit integers. One central concept in Swarm is \gloss{proximity order} (\gloss{PO}), which quantifies the relatedness of two addresses on a discrete scale.%
%
\footnote{Proximity order is the discrete logarithmic scale of proximity, which, in turn is the inverse of normalised XOR distance.% \cite{specs}.
% See \ref{sec:proximity} for a formal definition.
}
%
Given two addresses, $x$ and $y$, $\mathit{PO}(x,y)$ counts the matching bits of their binary representation starting from the most significant bit up to the first one that differs. The highest proximity order is therefore 256, designating the maximum relatedness, i.e. where $x=y$.
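A minimal sketch of this calculation in Go, assuming two equal-length (32-byte) addresses:
\begin{verbatim}
import "math/bits"

// po returns the proximity order of two equal-length addresses:
// the number of leading bits that x and y share.
func po(x, y []byte) int {
    for i := range x {
        if d := x[i] ^ y[i]; d != 0 {
            return i*8 + bits.LeadingZeros8(d)
        }
    }
    return len(x) * 8 // identical addresses: maximum proximity order
}
\end{verbatim}
For 32-byte addresses the result ranges from $0$ (first bits differ) to $256$ ($x=y$).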
\subsection{Kademlia routing \statusgreen}\label{sec:kademlia-routing}
\yellow{rework together with appendix}
\glossupper{Kademlia topology} can be used to route messages between nodes in a network using overlay addressing. It offers excellent scalability, as it allows for universal routing such that both (1) the number of hops and (2) the number of peer connections are logarithmic in the size of the network.
In what follows, we present the two common flavours of routing: \emph{iterative/zooming} and \emph{recursive/forwarding}. Swarm's design crucially relies on the latter, forwarding flavour, a less common approach than the iterative one that is prevalent in the peer-to-peer literature and used in most other implementations \citep[see][]{maymounkov2002kademlia,baumgart2007s,lua2005survey}. To provide a comprehensive understanding, we walk the reader through both approaches to reveal their idiosyncrasies.
\subsubsection{Iterative and forwarding Kademlia}
Let $R$ be an arbitrary binary relation over nodes in a network. Nodes that are in relation $R$ with a particular node $x$ are called \glossplural{peer} of $x$. Peers of $x$ can be indexed by their \gloss{proximity order} (\gloss{PO}) relative to $x$%
% (see \ref{sec:proximity})%
.
The equivalence classes of peers are called \glossplural{proximity order bin}, or just bins for short. Once arranged in bins, these groups of peers form the \gloss{Kademlia table} of the node $x$ (see Figure \ref{fig:kademlia-table}).
\begin{figure}[htbp]
\centering
\includegraphics[width=.8\textwidth]{fig/kademlia-3.5_new.pdf}
\caption[From overlay address space to Kademlia table \statusgreen]{From overlay address space to Kademlia table. \textbf{Top}: the overlay address space is represented with a binary tree, coloured leaves are actual nodes. The path of the pivot node (+) is shown with thicker lines. \textbf{Centre}: peers of the pivot node are shown keyed by the bits of their XOR distance measured from the pivot. Here, $0$s represent a matching bit with the pivot, and $1$s show a differing bit. The leaf nodes are ordered by their XOR distance from the pivot (leftmost node). \textbf{Bottom}: the Kademlia table of the pivot: the subtrees branching off from the pivot path on the left are displayed as the rows of the table, representing proximity order bins in increasing order.}
\label{fig:kademlia-table}
\end{figure}
Node $x$ has a \gloss{saturated Kademlia table} if there is a $0\leq d_x\leq \mathit{maxPO}$ called the \gloss{neighbourhood depth} such that (1) the node has at least one peer in each bin up to and excluding \gloss{proximity order} bin $d_x$ and (2) all nodes at least as near as $d_x$ (called the \gloss{nearest neighbours}) are peers of $x$. If each node in a network has a saturated Kademlia table, then we say that the network has \gloss{Kademlia topology}.
\begin{figure}[htbp]
\centering
\includegraphics[width=\textwidth]{fig/kademlia-3.pdf}
\caption[Nearest neighbours \statusgreen]{Nearest neighbours in a 4 bit network with $d = 2$ }
\label{fig:bin-density}
\end{figure}
Let $R$ be the ``is known to'' relation: $y$ ``is known to'' $x$ if $x$ has both overlay and underlay addressing information for $y$.
In iterative Kademlia routing, the \gloss{requestor node} iteratively extends the graph of peers that are known to it. Using their \gloss{underlay address}, the requestor contacts the peers it knows that are nearest to the destination address, asking them for peers even closer to it (commonly using UDP). On each successive iteration, the peers become at least one order closer to the destination (see Figure \ref{fig:iterative-forwarding-kademlia}). Because of the Kademlia criteria, the requestor will eventually discover the destination node's underlay address and can then establish direct communication with it. This iterative strategy%
%
\footnote{The iterative protocol is equivalent to the original Kademlia routing that is described in \citet{maymounkov2002kademlia}.
}
%
critically depends on the nodes' ability to find peers that are currently online. In order to find such a peer, a node needs to collect several candidates for each bin. The best predictor of availability is the recency of the peer's last response, so peers in a bin should be prioritised according to this ordering.
\begin{figure}[p]
\centering
\vspace{-0.5cm}
\includegraphics[width=.8\textwidth]{fig/iterative-kademlia_new.pdf}\\\vspace{-1.3cm}
\includegraphics[width=.8\textwidth]{fig/forwarding-kademlia-3_new.pdf}
\caption[Iterative and Forwarding Kademlia routing \statusgreen]{Iterative and Forwarding Kademlia routing: A requestor node, shown with a cross in the circle at address $...0000...$, wants to route to a destination address $...1111...$, to which the closest online peer is the blue circle at $...1110...$ The initial ellipses represent the prefix shared by the requestor and destination addresses, which is $n$ bits long. \textbf{Top:} In the iterative flavour, the requestor contacts the peers (Step 1, dotted black arrows) that it knows are nearest to the destination address. Peers that are online (yellow) respond with information about nodes that are even closer (green arrow, Step 2), so the requestor can now repeat the query using these closer peers (green, Step 3). On each successive iteration, the peers (yellow, green and blue) are at least one PO closer to the destination until eventually the requestor is in direct contact with the node that is nearest to the destination address. \textbf{Bottom:} In the forwarding flavour, the requestor forwards a message to the connected peer it knows that is nearest to the destination (yellow). The recipient peer does the same. Applying this strategy recursively relays the message via a chain of peers (yellow, green, blue), each at least one PO closer to the destination.}
\label{fig:iterative-forwarding-kademlia}
\end{figure}
Swarm uses an alternative flavour of Kademlia routing, first described in \citet{heep2010r} and then expanded on and worked out by \citet{tronetal2019-network}. Here, a recursive method is employed, whereby the successive steps of the iteration are "outsourced" to a \gloss{downstream peer}.
Each node recursively passes a message to a direct peer at least one \gloss{proximity order} closer to the destination. Thus, \gloss{routing} using this approach simply means relaying messages via a chain of peers which are ever closer to the destination, as shown in Figure \ref{fig:iterative-forwarding-kademlia}.
In this way, Swarm's underlay transport offers quasi-stable peer connections over TCP with communication channels that are kept alive. These open connections can then be used as $R$ to define another notion of a \gloss{peer}. The two criteria of healthy \gloss{Kademlia connectivity} in Swarm translate as: For each node $x$, there exists a \gloss{neighbourhood depth} $d_x$ such that (1) node $x$ has an open connection with at least one node for each \gloss{proximity order bin} up to but excluding $d_x$, and (2) is connected to all the online nodes that are at least as near as $d_x$. If each node in the network has a saturated Kademlia table of peers, then the network is said to have \gloss{Kademlia topology}. Since connected peers are guaranteed to be online, the recursive step consists solely of forwarding the message to a connected peer strictly closer to the destination. We can call this alternative \gloss{forwarding Kademlia}.
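The recursive step can thus be sketched as selecting, among the currently connected peers, one strictly closer to the destination. The snippet below reuses the \texttt{po} helper sketched earlier and assumes \texttt{peers} holds the overlay addresses of live connections:
\begin{verbatim}
// closerPeer returns a connected peer whose proximity order to the
// destination exceeds our own, i.e. the next hop of forwarding
// Kademlia; ok is false only if no such peer exists.
func closerPeer(self, dest []byte, peers [][]byte) (best []byte, ok bool) {
    bestPO := po(self, dest)
    for _, p := range peers {
        if d := po(p, dest); d > bestPO {
            best, bestPO, ok = p, d, true
        }
    }
    return best, ok
}
\end{verbatim}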
In a \gloss{forwarding Kademlia} network, a message is said to be \emph{routable} if there exists a path from sender to destination through which the message can be relayed. In a mature subnetwork with \gloss{Kademlia topology} every message is routable.
If all peer connections are stably online, a \gloss{thin Kademlia table}, i.e.\ a single \gloss{peer} for each bin up to $d$, is sufficient to guarantee routing between nodes. In reality, however, networks are subject to \gloss{churn}, i.e.\ nodes are expected to go offline regularly. In order to ensure \gloss{routability} in the face of churn, the network needs to maintain \gloss{Kademlia topology}. This means that each individual node needs to have a \gloss{saturated Kademlia table} at all times. By keeping several connected peers in each \gloss{proximity order bin}, a node can ensure that node dropouts do not damage the saturation of their Kademlia table. Given a model of node dropouts, we can calculate the minimum number of peers needed per bin to guarantee that nodes are saturated with a probability that is arbitrarily close to 1. The more peers a node keeps in a particular proximity order bin, the more likely that the message destination address and the peer will have a longer matching prefix. As a consequence of forwarding the message to that peer, the \gloss{proximity order} increases more quickly, and the message ends up closer to the destination than it would with fewer peers in each bin (see also Figure \ref{fig:bindensity}).
With \gloss{Kademlia} saturation guaranteed, a node will always be able to forward a message and ensure \gloss{routability}. If nodes comply with the forwarding principles (and that is ensured by aligned incentives), the only case when relaying could possibly break down is when a node drops out of the network after having received a message but before it managed to forward it.%
%
\footnote{Healthy nodes could commit to being able to forward within a (very short) constant time; let's call this the \gloss{forwarding lag}. In the case that a \gloss{downstream peer} disconnects before this forwarding lag passes, then the \gloss{upstream peer} can re-forward the message to an alternative peer, thereby keeping the message passing unbroken. See \ref{sec:retrieval} for more detail.
}
An important advantage of forwarding Kademlia is that it requires considerably less bandwidth than the iterative algorithm. In the iterative version, known peers are not guaranteed to be online, so finding one that is available adds a further layer of unpredictability.
\subsubsection{Sender anonymity}
\glossupper{sender anonymity} is a crucial feature of Swarm. Because requests are relayed from peer to peer, peers further down in the request cascade can never know who originated the request.
The above rigid formulation of Kademlia routing would suggest that if a node receives a message from a peer such that the proximity order between that peer and the message destination is $0$, then the recipient could conclude that this peer must be the sender. If we allow \gloss{light node} Swarm clients, i.e. clients that due to resource constraints do not maintain full Kademlia saturation but keep only a local neighbourhood, then even a message arriving from a peer in bin $0$ remains of ambiguous origin.
\subsubsection{Bin density and multi-order hops} \label{sec:bindensity}
As a consequence of logarithmic distance and uniform node distribution, the population of peers exponentially increases as we move further away from a particular node.
This means that unless the number of required connections in a bin doubles as bins increase in distance from the node, shallower bins will always allow more choice of nodes for potential connection. In particular, nodes have a chance to increase the number of connections per bin in such a way that peer addresses maximise density (i.e., in proximity order bin $b$, the subsequent bits of peer addresses form a \gloss{balanced binary tree}). Such an arrangement is optimal in the sense that for a bin depth of $d$, nodes are able to relay all messages so that in one hop the proximity order of the destination address will increase by $d$ (see Figure \ref{fig:bindensity}).
\begin{figure}[htbp]
\centering
\includegraphics[width=\textwidth]{fig/bindensity.pdf}
\caption[Bin density \statusgreen]{Bin density: types of saturation for PO bin $0$ for a node with overlay address starting with bit $0$. \textbf{Top left}: A "thin" bin with a single peer is not resilient to churn and only increases PO by $1$ in one hop. \textbf{Top right:} At least two peers are needed to maintain Kademlia topology in case of churn; two peers when not balanced cannot guarantee multi-order hops. \textbf{Bottom left:} Two peers balanced guarantees an increase of 2 POs in one hop. \textbf{Bottom right:} Four peers, when balanced, can guarantee an increase of $3$ POs in one hop.}
\label{fig:bindensity}
\end{figure}
\subsubsection{Factoring in underlay proximity}
It is expected that as Swarm clients continue to evolve and develop, nodes may factor in throughput when they select peers for connection. All things being equal, nodes physically closer to each other tend to have higher throughput, and therefore will be preferred in the long run. This is how forwarding Kademlia is implicitly aware of the underlay topology \citep{heep2010r}.
\subsection{Bootstrapping and maintaining Kademlia topology \statusgreen}\label{sec:bootstrapping}
\orange{rework based on new hive protocol}
This section discusses how a stable Kademlia topology can emerge. In particular, it outlines the exact bootstrapping protocol that each node must follow to reach and maintain a saturated Kademlia connectivity. Nodes joining a \gloss{decentralised network} are supposed to be initially naive, potentially initiating connection via only a single known peer with no prior knowledge. For this reason, the bootstrapping process needs to include an initial step that helps naive nodes to begin exchanging information about each other. This discovery process is called the \gloss{hive protocol}.
\subsubsection{Bootnodes}
Swarm has no distinct node type or operation mode for bootnodes. This means that naive nodes should be able to connect to any node on the network and bootstrap their desired connectivity. In order not to overburden any single node, electing one particular node as an initial connection should be avoided, and the role of being a bootnode for the newly connecting naive nodes should ideally be distributed among participant nodes. This is achieved either with an invite system, or a centralised bootnode service running a public gateway that responds to an API call with the bzz address of a randomly chosen node among online peers.
Once connected to a node in the network, the hive protocol kicks in and the naive node begins to learn about the bzz addresses of other nodes, and thus it can start bootstrapping its connectivity.
\subsubsection{Building up connections}
Initially, each node begins with zero as their \gloss{saturation depth}. Nodes keep advertising their saturation depth to their connected peers as it changes. When a node $A$ receives an attempt to establish a new connection from a node $B$, $A$ notifies each of its other peers about $B$ only if that peer's proximity order relative to $B$ is not lower than the peer's advertised saturation depth. The notification is also always sent to a peer that shares a proximity order bin with the new connection. Formally, when $y$ connects to $x$, $x$ notifies a subset of its connected peers. A peer $p$ belongs to this subset if $\mathit{PO}(x, p) = \mathit{PO}(x, y)$ or $d_p\leq \mathit{PO}(y, p)$. The notification takes the form of a protocol message and includes the full \gloss{overlay address} and \gloss{underlay address} information.%
%
\footnote{Light nodes that do not wish to relay messages and do not aspire to build up a healthy Kademlia are not included, see Section \ref{sec:light}. }
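In code, the notification rule could be sketched as follows; the \texttt{Peer} type with its advertised depth, as well as the \texttt{po} helper from earlier, are assumptions of this sketch:
\begin{verbatim}
// Peer is an assumed minimal view of a connected peer: its overlay
// address and the saturation depth it last advertised.
type Peer struct {
    Addr  []byte
    Depth int
}

// peersToNotify selects the connected peers of x to be notified when
// y connects: peers sharing y's bin with x, and peers whose advertised
// depth does not exceed their proximity order to y.
func peersToNotify(x, y []byte, peers []Peer) (out []Peer) {
    for _, p := range peers {
        if po(x, p.Addr) == po(x, y) || p.Depth <= po(y, p.Addr) {
            out = append(out, p)
        }
    }
    return out
}
\end{verbatim}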
% \begin{figure}[htbp]
% \centering
% \caption[Hive protocol: bootstrapping and maintaining Kademlia topology \statusred]{Hive protocol: bootstrapping and maintaining Kademlia topology}
% \label{fig:bootstrapping-kademlia}
% \end{figure}
\subsubsection{Mature connectivity}
After a sufficient number of nodes are connected, a bin becomes saturated and the node's neighbourhood depth can begin to increase. Nodes keep their peers up to date by advertising their current depth if it changes. As their depth increases, nodes will get notified of fewer and fewer peers. Once the node finds all their nearest neighbours and has saturated all the bins, no new peers are to be expected. For this reason, a node can conclude a saturated Kademlia state if it receives no new peers for some time.%
%
\footnote{Note that the node does not need to know the total number of nodes in the network. In fact, some time after the node stops receiving new peer addresses, it can effectively estimate the size of the network: the depth of the network is $\log_2(n+1)+d$ where $n$ is the number of remote peers in the nearest neighbourhood and $d$ is the depth of that neighbourhood. It then follows that the total number of nodes in the network can be estimated simply by raising 2 to this power.}
%
Instead of having a hard deadline and a binary state of saturation, we can quantify the certainty of saturation by considering the time elapsed since the arrival of the most recent new peer. Assuming stable connections, eventually each node online will get to know its nearest neighbours and connect to them while keeping each bin up to $d$ non-empty. Therefore each node will converge on the saturated state. To maintain a robust Kademlia topology in the face of changing peer connections, it is crucial to include multiple peers within each proximity order bin. This prevents the node from regressing to a lower saturation state, even when there are no new nodes joining the network.
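As a worked example of the size estimate in the footnote above (with illustrative numbers): a node that sees $n=7$ remote peers in its nearest neighbourhood at depth $d=10$ infers a network depth of $\log_2(7+1)+10=13$, i.e. approximately $2^{13}=8192$ nodes in the network.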
\section{Swarm storage\statusgreen}\label{sec:kademlia-storage}
In this section, in \ref{sec:disc}, we first show how a network with quasi-permanent connections in a Kademlia topology can support a \gloss{load balancing}, \gloss{distributed storage} of fixed-sized data blobs. In \ref{sec:chunks}, we detail the generic requirements on chunks and introduce actual chunk types. Finally, in \ref{sec:redundancy-by-local-replication}, we turn to \gloss{redundancy} by neighbourhood replication as a first line of defense against node churn.
\subsection{Distributed immutable store for chunks\statusgreen}\label{sec:disc}
In this section, we discuss how networks using Kademlia overlay routing are a suitable basis on which to implement a serverless storage solution using \glossplural{distributed hash table} (\glossplural{DHT}). Then we introduce the \gloss{DISC}%
%
\footnote{DISC is \gloss{distributed immutable store for chunks}. In earlier work, we have referred to this component as the 'distributed preimage archive' (DPA), however, this phrase became misleading since we now also allow chunks that are not the preimage of their address.}
%
model, Swarm's narrower interpretation of a DHT for storage. This model
imposes some requirements on chunks and necessitates 'upload' protocols.
As is customary in Swarm, we provide a few resolutions of this acronym, which summarise the most important features:
\begin{itemize}
\item {decentralised infrastructure for storage and communication},
\item {distributed immutable store for chunks},
\item {data integrity by signature or content address},
% \item \emph{downloadable incentivised secure communications}
\item {driven by incentives with smart contracts}.
\end{itemize}
\subsubsection{From DHT to DISC}
Swarm's \gloss{DISC} shares many similarities with widely known \glossplural{distributed hash table}. The most important difference is that Swarm does not keep track of \emph{where} files are to be found; instead, it \emph{stores pieces of the file itself} directly with the closest node(s).
In what follows, we review DHTs, as well as dive into the similarities and differences with DISC in more detail.
\glossupperplural{distributed hash table} use an overlay network to implement a key--value container distributed over the nodes (see Figure \ref{fig:DHT}). The basic idea is that the keyspace is mapped onto the overlay address space, and the value for a key in the container is to be found with nodes whose addresses are in the proximity of the key. In the simplest case, let us say that this is the single closest node to the key that stores the value. In a network with Kademlia connectivity, any node can route to a node whose address is closest to the key, therefore a \emph{lookup} (i.e.\ looking up the value belonging to a key) is reduced simply to routing a request.
\begin{figure}[htbp]
\centering
\includegraphics[width=0.7\textwidth]{fig/dht_new.pdf}
\caption[Distributed hash tables (DHTs) \statusgreen]{Distributed hash tables (DHTs) used for storage: node $D$ (downloader) uses Kademlia routing in Step $1$ to query nodes in the neighbourhood of the chunk address to retrieve seeder info in Step $2$. The seeder info is used to contact node $S$ (seeder) directly to request the chunk and deliver it in Steps $3$ and $4$.}
\label{fig:DHT}
\end{figure}
DHTs used for \gloss{distributed storage} typically associate content identifiers (as keys/addresses) with a changing list of seeders (as values) that can serve that content \citep{ipfs2014, crosby2007analysis}. However, the same structure can be used directly: in Swarm, it is not information about the location of content that is stored at the node closest to the address, but the content itself (see Figure \ref{fig:disc}).
\begin{figure}[htbp]
\centering
\includegraphics[width=0.7\textwidth]{fig/swarm disc_new.pdf}
\caption[Swarm DISC: Distributed Immutable Store for Chunks \statusgreen]{Swarm DISC: Distributed Immutable Store for Chunks. In Step $1$, downloader node $D$ uses Kademlia connectivity to send a request for the chunk to a peer storer node that is closer to the address. This peer then repeats the same until a node $S$ is found that has the chunk. In other words, peers relay the request recursively via live peer connections, ultimately reaching the neighbourhood of the chunk address (request forwarding). In Step $2$, the chunk is delivered along the same route by applying the forwarding steps in the opposite direction (response backwarding).}
\label{fig:disc}
\end{figure}
\subsubsection{Constraints}
The \gloss{DISC} storage model is opinionated about which nodes store what content and this implies the following restrictions:
\begin{labelledlist}
\item[\emph{fixed-size chunks}] Load balancing of content is required among nodes and is realised by splitting content into equal-sized units called \glossplural{chunk} (see \ref{sec:chunks}).
\item[\emph{syncing}] There must be a process whereby chunks get to where they are supposed to be stored, no matter which node uploads them (see \ref{sec:push-syncing}).
\item[\emph{plausible deniability}] Since nodes do not have a say in what they store, measures should be employed that serve as the basis of legal protection for node operators. They need to be able to plausibly deny knowing (or even being able to know) anything about the chunks' contents (see \ref{sec:chunk-encryption}).
\item[\emph{garbage collection}] Since nodes commit to store any data close to them, there needs to be a strategy to select which chunks are kept and which are discarded in the presence of storage space constraints.
\end{labelledlist}
\subsubsection{Chunks}\label{sec:chunks}
\glossupperplural{chunk} are the basic storage units used in Swarm's network layer. A chunk is an association of an address with content. Since retrieval in Swarm (\ref{sec:retrieval}) assumes that chunks are stored with nodes close to their address, fair and equal \gloss{load balancing} requires that chunk addresses be uniformly distributed in the address space and that chunk content be limited and roughly uniform in size.
When chunks are retrieved, the downloader must be able to verify the correctness of the content given the address. Such integrity translates to guaranteeing the uniqueness of the content associated with an address. In order to protect against frivolous network traffic, third-party \glossplural{forwarding node} should be able to verify the integrity of chunks using only information locally available to them.
The deterministic and collision-free nature of addressing implies that chunks are unique as a key--value association: If there exists a chunk with an address, then no other valid chunk can have the same address; this assumption is crucial as it makes the chunk store \gloss{immutable}, i.e.\ there is no replace/update operation on chunks. Immutability is beneficial in the context of relaying chunks as nodes can negotiate information about the possession of chunks simply by checking their addresses. This plays an important role in the stream protocol (see \ref{sec:pull-syncing}) and justifies the DISC resolution as a \emph{distributed immutable store for chunks}.
To sum up, chunk addressing needs to fulfill the following requirements:
\begin{labelledlist}
\item[\emph{deterministic}] To enable local validation.
\item[\emph{collision-free}] To provide integrity guarantee.
\item[\emph{uniformly distributed}] To deliver load balancing.
\end{labelledlist}
In the current version of Swarm, we support two types of chunks: \glossplural{content addressed chunk} and \glossplural{single owner chunk}.
\subsection{Content addressed chunks\statusgreen}\label{sec:content-addressed-chunks}
A \gloss{content addressed chunk} is not assumed to be a meaningful storage unit in itself, i.e. it can be just a blob of arbitrary data resulting from splitting a larger data blob, such as a file. The methods by which files are disassembled into chunks when uploading and then reassembled from chunks when downloading are detailed in \ref{sec:datastructures}. The data size of a content addressed Swarm chunk is limited to 4 kilobytes. One desirable consequence of this small chunk size is that concurrent retrieval is available even for relatively small files, reducing the latency of downloads.
\begin{figure}[htbp]
\centering
\includegraphics[width=\textwidth]{fig/content-addressed-chunk-3.pdf}
\caption[Content addressed chunk\statusgreen]{Content addressed chunk. An at most 4KB payload with a 64-bit little-endian encoded span prepended to it constitutes the chunk content used in transport. The content address of the chunk is the hash of the concatenation of the span and the BMT root of the payload.}
\label{fig:content-addressed-chunk}
\end{figure}
\subsubsection{Binary Merkle tree hash}
The canonical content addressed chunk in Swarm is called a \gloss{binary Merkle tree chunk} (\gloss{BMT chunk}).
The address of BMT chunks is calculated using the \gloss{binary Merkle tree hash} algorithm (\gloss{BMT hash}). The base hash used in \gloss{BMT} is Keccak256, properties of which such as uniformity, irreversibility, and collision resistance all carry over to the \gloss{BMT hash} algorithm. As a result of uniformity, a random set of chunked content will generate addresses evenly spread in the address space, i.e.\ imposing storage requirements balanced among nodes.
\begin{figure}[htbp]
\centering
\resizebox{1\textwidth}{!}{
\input{fig/bmthash.tex}}
\caption[BMT: Binary Merkle Tree hash used as chunk hash in Swarm \statusgreen]{Binary Merkle Tree chunk hash in Swarm: the 1337 bytes of chunk data are segmented into 32-byte segments. Zero padding is used to fill the rest of the chunk up to 4 kilobytes. Pairs of segments are hashed together using Keccak256 to build up the binary tree. On level 8, the binary Merkle root is prepended with the 8-byte span and hashed to yield the BMT chunk hash.}
\label{fig:BMT}
\end{figure}
The BMT chunk address is the hash of the 8-byte span and the root hash of a \gloss{binary Merkle tree} (\gloss{BMT}) built on the 32-byte segments of the underlying data (see Figure \ref{fig:BMT}). If the chunk content is less than 4 kilobytes, the hash is calculated as if the chunk were padded with zeros up to 4096 bytes.
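The following Go sketch renders this BMT address calculation in a minimal, unoptimised form, using the go-ethereum \texttt{crypto} package for Keccak256; it is illustrative rather than normative:
\begin{verbatim}
import (
    "encoding/binary"

    "github.com/ethereum/go-ethereum/crypto"
)

const (
    segmentSize = 32
    chunkSize   = 4096
)

// bmtAddress computes the BMT chunk address: the Keccak256 hash of
// the 8-byte little-endian span followed by the root of a binary
// Merkle tree over the zero-padded 32-byte segments of the payload.
func bmtAddress(span uint64, payload []byte) []byte {
    buf := make([]byte, chunkSize) // zero padding up to 4096 bytes
    copy(buf, payload)
    level := make([][]byte, chunkSize/segmentSize)
    for i := range level {
        level[i] = buf[i*segmentSize : (i+1)*segmentSize]
    }
    for len(level) > 1 { // hash sister nodes pairwise up to the root
        next := make([][]byte, len(level)/2)
        for i := range next {
            next[i] = crypto.Keccak256(level[2*i], level[2*i+1])
        }
        level = next
    }
    spanBytes := make([]byte, 8)
    binary.LittleEndian.PutUint64(spanBytes, span)
    return crypto.Keccak256(spanBytes, level[0])
}
\end{verbatim}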
This structure allows for compact \gloss{inclusion proofs} with a 32-byte resolution. An inclusion proof is a proof that one string is a substring of another string, for instance, that a string is included in a chunk. Inclusion proofs are defined on a data segment of a particular index, see Figure \ref{fig:chunk-inclusion}. Such Merkle proofs are also used as proof of custody when storer nodes provide evidence that they possess a chunk (see \ref{sec:redistribution}). Together with the Swarm file hash (see \ref{sec:files}), they allow for logarithmic inclusion proofs for files, i.e.,\ proof that a string is found to be part of a file.
\begin{figure}[htbp]
\centering
\includegraphics[width=\textwidth]{fig/inclusion-proof_new.pdf}
\caption[Compact segment inclusion proofs for chunks \statusgreen]{Compact segment inclusion proofs for chunks. Assume we need proof for segment 26 of a chunk (yellow). The orange hashes of the BMT are the sister nodes on the path from the data segment up to the root and constitute what needs to be part of a proof. When these are provided together with the root hash and the segment index, the proof can be verified. The side on which proof item $i$ needs to be applied depends on the $i$-th bit (starting from least significant) of the binary representation of the index. Finally, the span is prepended and the resulting hash should match the chunk root hash.}
\label{fig:chunk-inclusion}
\end{figure}
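A sketch of verifying such a segment inclusion proof, following the procedure in the caption of Figure \ref{fig:chunk-inclusion} and under the same assumptions as the BMT sketch above (the proof holds the sister hashes on the path from the data segment to the root):
\begin{verbatim}
import (
    "bytes"
    "encoding/binary"

    "github.com/ethereum/go-ethereum/crypto"
)

// verifySegmentProof checks that segment occurs at index in the chunk
// with the given address: each proof item is hashed in on the side
// given by the corresponding bit of the index, then the span is
// prepended and the result compared with the chunk address.
func verifySegmentProof(addr []byte, span uint64, index int,
    segment []byte, proof [][]byte) bool {
    h := segment
    for i, sister := range proof {
        if index>>i&1 == 0 {
            h = crypto.Keccak256(h, sister) // segment on the left
        } else {
            h = crypto.Keccak256(sister, h) // segment on the right
        }
    }
    spanBytes := make([]byte, 8)
    binary.LittleEndian.PutUint64(spanBytes, span)
    return bytes.Equal(crypto.Keccak256(spanBytes, h), addr)
}
\end{verbatim}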
\subsection{Single owner chunks\statusgreen}\label{sec:single-owner-chunks}
With \glossplural{single owner chunk}, a user can assign arbitrary data to an address and attest chunk integrity with their digital signature. The address is calculated as the hash of an \gloss{identifier} and an \gloss{owner}. The chunk content is presented in a structure composed of the identifier, the \gloss{payload}, and a signature attesting to the association of identifier and payload (see Figure \ref{fig:single-owner-chunks}).
\begin{labelledlist}
\item[\emph{content}] composed of the following fields:
\begin{labelledlist}
\item[\emph{identifier}] 32-byte arbitrary identifier,
\item[\emph{signature}] 65-byte $\langle r,s,v \rangle$ representation of an EC signature (32+32+1 bytes),
\item[\emph{span}] 8-byte little-endian binary of the uint64 chunk span,
\item[\emph{payload}] at most 4096 bytes of regular chunk data.
\end{labelledlist}
\item[\emph{address}] Keccak256 hash of the identifier and the owner's account.
\end{labelledlist}
\begin{figure}[htbp]
\centering
\includegraphics[width=\textwidth]{fig/single-owner-chunk.pdf}
\caption[Single owner chunk\statusgreen]{Single owner chunk. The chunk content is composed of headers followed by an at most 4KB payload. The last header field is the 8-byte span prepended just like in content addressed chunks. The first two header fields provide single-owner attestation of integrity: an identifier and a signature signing off on the identifier and the BMT hash of span and payload. The address is the hash of the ID and the signer account.}
\label{fig:single-owner-chunks}
\end{figure}
The validity of a \gloss{single owner chunk} is checked with the following process:
\begin{enumerate}
\item Deserialise the chunk content into fields for identifier, signature, and payload.
\item Construct the expected plain text composed of the identifier and the \gloss{BMT hash} of the span and payload.
\item Recover the owner's address from the signature using the plain text.
\item Check the hash of the identifier and the owner (expected address) against the chunk address.
\end{enumerate}
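These four steps translate into code roughly as follows. The field offsets follow Figure \ref{fig:single-owner-chunks}, \texttt{bmtAddress} is the sketch from the previous section, and the exact digest passed to signature recovery is an assumption of this sketch:
\begin{verbatim}
import (
    "bytes"
    "encoding/binary"

    "github.com/ethereum/go-ethereum/crypto"
)

// validSOC runs the four validation steps on a single owner chunk:
// deserialise the fields, reconstruct the signed digest, recover the
// owner from the signature and check the resulting chunk address.
func validSOC(addr, content []byte) bool {
    if len(content) < 105 {
        return false
    }
    id, sig := content[:32], content[32:97] // identifier, signature
    span := binary.LittleEndian.Uint64(content[97:105])
    payload := content[105:]
    // Digest over the identifier and the BMT hash of span and payload.
    digest := crypto.Keccak256(id, bmtAddress(span, payload))
    pub, err := crypto.SigToPub(digest, sig)
    if err != nil {
        return false
    }
    owner := crypto.PubkeyToAddress(*pub)
    return bytes.Equal(crypto.Keccak256(id, owner.Bytes()), addr)
}
\end{verbatim}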
Single owner chunks offer a virtual partitioning of part of the address space into subspaces associated with the single owner. Checking their validity is actually an authentication verifying that the owner has write access to the address with the correct identifier.
As suggested by the span and the length of the payload, a single owner chunk can encapsulate a regular content addressed chunk. Anyone can simply reassign a regular chunk to an address in their subspace designated by the identifier (see also \ref{sec:notification-requests}).
It should be noted that the notion of integrity is somewhat weaker for single owner chunks than for content addressed chunks: After all, it is in principle possible to assign and sign any payload for a given identifier. Nonetheless, since such a chunk can only be created by the single owner of the private key the signature requires, it is reasonable to expect uniqueness in practice, as owners must comply with application protocols to obtain the desired results. However, if the owner of the private key signs two different payloads with the same identifier and uploads both chunks to Swarm, the behaviour of the network is unpredictable. Measures to mitigate this exist in layer (3) and are discussed in detail in \ref{sec:feed-integrity}.
With two types of chunks, integrity is linked to collision-free hash digests, derived from either a single owner and an arbitrary identifier attested by a signature or directly from the content. This justifies the resolution of the DISC acronym as \emph{data integrity through signing or content address}.
\subsection{Chunk encryption\statusgreen}\label{sec:chunk-encryption}
Chunks should be encrypted by default. Beyond client needs for confidentiality, encryption has two further important roles. (1) Obfuscation of chunk content by encryption provides a degree of \gloss{plausible deniability}; using it across the board makes this defense stronger. (2) The ability to choose arbitrary encryption keys together with the property of uniform distribution offer predictable ways of \gloss{mining chunks}, i.e.,\ generating an encrypted variant of the same content so that the resulting chunk address satisfies certain constraints, e.g. is closer to or farther away from a particular address. This is an important property used in (1) price arbitrage (see \ref{sec:pricing}) and (2) efficient utilisation of postage stamps (see \ref{sec:postage-stamps}).
\begin{figure}[htbp]
\centering
\includegraphics[width=\textwidth]{fig/chunk-encryption.pdf}
\caption[Chunk encryption in Swarm \statusgreen]{Chunk encryption in Swarm. Symmetric encryption with a modified counter-mode block cipher. The plaintext input is the content padded with random bytes to 4 kilobytes. The span bytes are also encrypted as if they were continuations of the payload.}
\label{fig:chunk-encryption}
\end{figure}
Chunks shorter than 4 kilobytes are padded with random bytes (generated from the chunk encryption seed). The full chunk plaintext is encrypted and decrypted using a stream cipher: an XOR with a pseudo-random keystream seeded with the corresponding symmetric key. In order not to increase the attack surface by introducing additional cryptographic primitives, the stream cipher of choice uses Keccak256 in counter mode, i.e. hashing the key together with a counter for each consecutive 32-byte segment. In order to allow selective disclosure of individual segments of an encrypted file without leaking information about the rest of the file, we add an additional step of hashing to derive the encryption key for each segment within the chunk. This scheme is easy and cheap to implement in the \gloss{EVM}, lending itself to use in smart contracts that work with the plaintext of encrypted Swarm content.
The prepended metadata encoding the chunk span is also encrypted as if it were a continuation of the chunk, i.e. with counter 128. Encrypted chunk content is hashed using the \gloss{BMT hash} digest just as unencrypted content is. The fact that a chunk is encrypted may be guessed from the \gloss{span value}, but apart from this, encrypted chunks behave in the network layer in exactly the same way as unencrypted ones.
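As an illustration, encrypting a single 32-byte segment under this scheme might look as follows (decryption is the same XOR; the span is processed with counter 128 as described above). The counter encoding and key derivation details are assumptions of this sketch:
\begin{verbatim}
import (
    "encoding/binary"

    "github.com/ethereum/go-ethereum/crypto"
)

// encryptSegment XORs one 32-byte segment with a Keccak256
// counter-mode key stream. The extra hashing step derives a
// per-segment key that can be disclosed without revealing anything
// about the other segments.
func encryptSegment(key []byte, counter uint32, segment [32]byte) [32]byte {
    ctr := make([]byte, 4)
    binary.LittleEndian.PutUint32(ctr, counter)
    segmentKey := crypto.Keccak256(key, ctr) // disclosable per-segment key
    stream := crypto.Keccak256(segmentKey)   // actual key stream bytes
    var out [32]byte
    for i := range segment {
        out[i] = segment[i] ^ stream[i]
    }
    return out
}
\end{verbatim}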
\glossupperplural{single owner chunk} can also be encrypted, which simply means that they wrap an encrypted regular chunk. Their payload and span therefore reflect the chunk content encryption described above; the hash signed together with the identifier is the \gloss{BMT hash} of the encrypted span and payload, i.e. the same as that of the wrapped chunk.
\subsection{Redundancy by local replication\statusgreen}\label{sec:redundancy-by-local-replication}
It is important to have a resilient means of requesting data. To achieve this, Swarm implements the approach of defence in depth. In the case that a request fails due to a problem with forwarding, one can retry the request with another peer. Alternatively, to guard against these occurrences, a node can initiate concurrent \glossplural{retrieve request} right away. However, such fallback options are not available if the single last node that stores the chunk drops out from the network. Therefore, redundancy is of major importance to ensure data availability. If the closest node is the only storer of the requested data and it drops out of the network, then there is no way to retrieve the content. This basic scenario is handled by ensuring that each set of \gloss{nearest neighbours} hold replicas of each chunk that is closest to any one of them, duplicating the storage of chunks and therefore providing data redundancy.
\subsubsection{Size of nearest neighbourhoods}
If the Kademlia connectivity is defined over storer nodes, then in a network with Kademlia topology there exists a depth $d$ such that (1) each \gloss{proximity order bin} less than $d$ contains at least $k$ storer peers, and (2) all \glossplural{storer node} with \gloss{proximity order} $d$ or higher are actually connected peers. In order to ensure data redundancy, we can add to this definition a criterion that (3) the nearest neighbourhood defined by $d$ must contain at least $r$ peers.
Let us define \gloss{neighbourhood size} $\mathit{NHS}_x(d)$ as the cardinality of the neighbourhood defined by depth $d$ of node $x$.
Then, a node has Kademlia connectivity with redundancy factor $r$ if there exists a depth $d$ such that (1) each proximity order bin lower than $d$ contains at least $k$ storer peers ($k$ is the bin density parameter, see \ref{sec:bindensity}), and (2) all storer nodes with proximity order $d$ or higher are actually connected peers, and (3) $\mathit{NHS}_x(d)\geq r$.
We can then take the highest depth $d'$ such that (1) and (2) are satisfied. Such a $d'$ is guaranteed to exist, and the \gloss{hive protocol} is always able to bootstrap it. As the depth decreases, the size of the corresponding neighbourhood grows, so for any redundancy parameter not greater than the network size, $r\leq N=\mathit{NHS}_x(0)$, there will be a highest $0<d_r\leq d'$ such that $\mathit{NHS}_x(d_r)\geq r$. Therefore, \gloss{redundant Kademlia connectivity} is always achievable.
For a particular redundancy, the area of the fully connected neighbourhood defines an \gloss{area of responsibility}. The proximity order boundary of the area of responsibility defines a \gloss{radius of responsibility} for the node. A storer node is said to be \emph{responsible} for (storing) a chunk if the chunk address falls within the node's radius of responsibility.
It is already instructive at this point to show neighbourhoods and how they are structured, see Figure \ref{fig:nearest-neighbours}.
\begin{figure}[htbp]
\centering
\includegraphics[width=\textwidth]{fig/nearest-neighbours-2.pdf}\\
\includegraphics[width=\textwidth]{fig/asymmetric-nodes2.pdf}
\caption[Nearest neighbours \statusgreen]{Nearest neighbours. \textbf{Top}: Each PO defines a neighbourhood, the neighbourhood depth of the node (black circle) is defined as the highest PO such that the neighbourhood has at least R=4 peers (redundancy parameter) and all shallower bins are non-empty. \textbf{Bottom}: An asymmetric neighbourhood. Nearest neighbours of the orange node include the black node but not the other way round.}
\label{fig:nearest-neighbours}
\end{figure}
\subsubsection{Redundant retrievability}
A chunk is said to have \gloss{redundant retrievability} with degree $r$ if it is retrievable and would remain so even after any $r$ nodes responsible for it leave the network. The naive approach presented so far, requiring only the single closest node to keep the content, can be interpreted as retrievability with degree zero. If nodes in their area of responsibility fully replicate their content (see \ref{sec:pull-syncing}), then every chunk in the Swarm DISC is redundantly retrievable with degree $r$. Let us take the node $x$ that is closest to a chunk $c$. Since it has Kademlia connectivity with redundancy $r$, there are $r+1$ nodes responsible for the chunk in a fully connected neighbourhood that replicates content. After $r$ responsible nodes drop out, there is just one node remaining that still has the chunk. However, if Kademlia connectivity is maintained as the $r$ nodes leave, this node remains accessible to any other node in the network, and therefore the chunk is still retrievable. Now, for the network to ensure that all chunks remain redundantly retrievable with degree $r$, the nodes comprising the new neighbourhood formed as the network reorganises must respond by re-syncing their content to satisfy the protocol's replication criteria. This is called the guarantee of \gloss{eventual consistency}.
\subsubsection{Resource constraints}
Let us assume then (1) a forwarding strategy that relays requests along \glossplural{stable node}, and (2) a storage strategy whereby each node in the nearest neighbourhood (of $r$ storer nodes) stores all chunks whose addresses fall within its radius of responsibility. As long as these assumptions hold, each chunk is retrievable even if $r$ storer nodes drop offline simultaneously. As for (2), we still need to assume that every node in the nearest neighbour set can store each chunk. Realistically, however, all nodes have resource limitations. With time, the overall number of distinct chunks ever uploaded to Swarm will increase indefinitely. Unless the total storage capacity steadily increases, we should expect that the nodes in Swarm are able to store only a subset of chunks. Some nodes will reach the limit of their storage capacity and therefore face the decision whether to stop accepting new chunks via syncing or to make space by deleting some of their existing chunks.
The process that purges chunks from local storage is called \gloss{garbage collection}. The process that dictates which chunks are chosen for garbage collection is called the \gloss{garbage collection strategy}. For a profit-maximising node, it is always best to garbage-collect the chunks predicted to be the least profitable in the future, and it is in the node's interest to get this prediction right (see \ref{sec:postage-stamps}). So, in order to factor in these capacity constraints, we introduce the notion of \gloss{chunk value} and modify our definitions using a minimum value constraint:
In Swarm's DISC, at all times, there is a chunk value $v$ such that every chunk with a value greater than $v$ is both retrievable and eventually (after syncing) redundantly retrievable with degree $r$.
This value ideally corresponds to the relative importance of preserving the chunk, which uploaders need to indicate. In order for storer nodes to respect it, the value should also align with the profitability of a chunk and is therefore expressed in the pricing of uploads (see \ref{sec:depths}).
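As a minimal illustration of a value-driven garbage collection strategy, a node holding per-chunk values could purge everything at or below the cutoff $v$, as in the following Go sketch (identifiers are ours and purely illustrative):
\begin{verbatim}
// chunkEntry pairs a chunk address with the value assigned to it,
// e.g. derived from the pricing of its upload.
type chunkEntry struct {
    addr  string
    value uint64
}

// collectGarbage purges all chunks whose value does not exceed the
// cutoff v, returning the retained entries. A profit-maximising node
// would set v just high enough to respect its capacity limit.
func collectGarbage(store []chunkEntry, v uint64) []chunkEntry {
    kept := store[:0]
    for _, c := range store {
        if c.value > v {
            kept = append(kept, c)
        }
    }
    return kept
}
\end{verbatim}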
\section{Push and pull: chunk retrieval and syncing\statusgreen}\label{sec:push-and-pull}
\green{}
In this section, we demonstrate how chunks actually move around in the network: How they are pushed to the storer nodes in the neighbourhood they belong to when they are uploaded, as well as how they are pulled from the storer nodes when they are downloaded.
\subsection{Retrieval\statusgreen}\label{sec:retrieval}
In a distributed chunk store, we say that a chunk is an \gloss{accessible chunk} if a message is routable between the requester and the node that is closest to the chunk. Sending a retrieve request message to the chunk address will reach this node. Because of \gloss{eventual consistency}, the node closest to the chunk address will store the chunk. Therefore, in a \gloss{DISC} distributed chunk store with healthy Kademlia topology, all chunks are always accessible for every node.
\subsubsection{Chunk delivery}
For retrieval, accessibility needs to be complemented with a process to have the content delivered back to the requesting node, preferably using only the chunk address. There are at least three alternative ways to achieve this (see Figure \ref{fig:chunk-delivery}):
\begin{labelledlist}
\item[\gloss{direct delivery}] The chunk delivery is sent via a direct underlay connection.
\item[\gloss{routed delivery}] The chunk delivery is sent as a message using routing.
\item[\gloss{backwarding}] The chunk delivery response simply follows the route along which the request was forwarded, just backwards all the way to the originator.
\end{labelledlist}
\begin{figure}[htbp]
\centering
\includegraphics[width=\textwidth]{fig/chunk-delivery.pdf}
\caption[Alternative ways to deliver chunks: direct, routed and backward \statusgreen]{Alternative ways to deliver chunks. \textbf{Top:} \emph{direct delivery}: via direct underlay connection. \textbf{Centre:} \emph{routed delivery}: chunk is sent using Kademlia routing. \textbf{Bottom:} \gloss{backwarding} re-uses the exact peers on the path of the request route to relay the delivery response.}
\label{fig:chunk-delivery}
\end{figure}
Firstly, using the obvious direct delivery, the chunk is delivered in one step via a lower-level network protocol. This requires an ad hoc connection, trading improved latency for weakened privacy.%
%
\footnote{Beeline delivery has some merit, i.e. bandwidth saving and better latency, so we do not completely rule out the possibility of implementing it.
}
Secondly, using routed delivery, a chunk is delivered back to its requestor using ad hoc routing, as determined from the storer's perspective at the time of sending. Whether direct or routed, allowing deliveries to be routed independently of the request route presupposes that the requestor's address is (at least partially) known to the storer and the routing nodes. Consequently, these methods disclose information that can identify the requestor. With forwarding--backwarding Kademlia, however, this is not necessary: The storer node responds to its requesting peer with the delivery, while intermediate \glossplural{forwarding node} remember which of their peers requested which chunk. When the chunk is delivered, they pass it back to their immediate requestor, and so on, until it eventually arrives at the node that originally requested it. In other words, the chunk delivery response simply follows the request route back to the originator (see Figure \ref{fig:request-response}). Since it is the reverse of the forwarding, we can playfully call this \gloss{backwarding}. Swarm uses this option, which makes it possible to disclose no requestor identification in any form; Swarm thus implements completely \gloss{anonymous retrieval}.
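A minimal sketch of the bookkeeping that backwarding requires of a forwarding node might look as follows (Go; all identifiers are illustrative and the wire protocol is elided):
\begin{verbatim}
type peerID string
type chunkAddr string

// pending maps each requested chunk address to the peers the request
// came from, so that the delivery can later be passed back to them.
var pending = map[chunkAddr][]peerID{}

func onRetrieveRequest(from peerID, addr chunkAddr) {
    pending[addr] = append(pending[addr], from)
    // ... forward the request to the connected peer closest to addr ...
}

func onDelivery(addr chunkAddr, chunk []byte) {
    // Backward the chunk to every peer that asked for it, then forget
    // the request; a delivery with no entry here is unsolicited.
    for _, p := range pending[addr] {
        send(p, chunk)
    }
    delete(pending, addr)
}

func send(to peerID, chunk []byte) { /* wire protocol elided */ }
\end{verbatim}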
\begin{figure}[htbp]
\centering
\includegraphics[width=\textwidth]{fig/request-response-forwarding.pdf}
\caption[Backwarding: a pattern for anonymous request-response round-trips in forwarding Kademlia \statusgreen]{Backwarding: pattern for anonymous request--response round-trips in forwarding Kademlia. Here a node with overlay address $...0000...$ sends a request to target $...1111...$, to which the closest online node is $...1110...$. The leading ellipsis represents the prefix shared by the requestor and target and has a length of $n$ bits. The trailing ellipsis represents the part of the address that is not relevant for routing, as at that depth nodes are already unique. The request uses the usual Kademlia forwarding, but the relaying nodes on the way remember the peer the request came from, so that when the response arrives, they can \emph{backward} it (i.e.\ pass it back) along the same route.}
\label{fig:request-response}
\end{figure}
Ensuring requestor anonymity by default in the retrieval protocol is a crucial feature that Swarm insists upon. This feature aims to safeguard user privacy and enables censorship-resistant access.
\begin{figure}[htbp]
\centering
\includegraphics[width=\textwidth]{fig/retrieval.pdf}
\caption[Retrieval \statusgreen]{Retrieval. Node $D$ (Downloader) sends a retrieve request to the chunk's address. Retrieval uses forwarding Kademlia, so the request is relayed via forwarding nodes $F_0$, ..., $F_n$ all the way to node $S$, the storer node closest to the chunk address. The chunk is then delivered by being passed back along the same route to the downloader.}
\label{fig:retrieval}
\end{figure}
The generic solution of implementing retrieval by backwarding as depicted in Figure \ref{fig:retrieval} has further benefits relating to spam protection, scaling and incentivisation, which will be discussed in the remainder of this section.
\subsubsection{Protection against unsolicited chunks}
In order to remember requests, a \gloss{forwarding node} needs to allocate resources that come at a certain cost (pending requests occupy space in memory). Requests that are not followed by a corresponding delivery should eventually be garbage collected, so there needs to be a well-defined time period during which they remain active, and downstream peers need to be informed when a request times out. This fits naturally with the originator of the request attaching a time-to-live duration to it, indicating how long it will wait for a response.
Sending unsolicited chunks is an offence, as it can lead to \gloss{denial of service (DoS)}. By remembering requests, nodes are able to recognise unsolicited chunk deliveries and penalise the peers sending them. Chunks delivered after the request expires are treated as unsolicited. Since there may be some discrepancy between nodes in assessing the expiry time, there needs to be some tolerance for unsolicited chunk deliveries; but if they exceed a particular (still small) percentage of the requests forwarded, the offending peer is disconnected and blacklisted. Such local sanctions are the easiest and simplest way to incentivise adherence to the protocol (see \ref{sec:sanctions}).
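The tolerance just described could be tracked per peer along the following lines. This is a sketch with an assumed 1\% tolerance ratio; the actual threshold is a protocol parameter we do not fix here.
\begin{verbatim}
type peerStats struct {
    forwarded   int // requests forwarded to this peer
    unsolicited int // deliveries with no matching open request
}

const maxUnsolicitedRatio = 0.01 // assumed tolerance: 1% of requests

// shouldSanction reports whether the peer exceeded the tolerance for
// unsolicited (including expired) chunk deliveries and should be
// disconnected and blacklisted.
func shouldSanction(s peerStats) bool {
    if s.forwarded == 0 {
        return s.unsolicited > 0
    }
    ratio := float64(s.unsolicited) / float64(s.forwarded)
    return ratio > maxUnsolicitedRatio
}
\end{verbatim}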
\subsubsection{Re-requesting}
A large proportion of Swarm nodes may not be stably online at all times. Such high churn would be problematic under the naive strategy of forwarding requests to any one closer node: If a node on the path goes offline before delivery is completed, the request--response round trip is broken, effectively rendering the requested chunk irretrievable. The commitment to pay for a chunk is considered void if the connection to the peer the request was sent to is dropped, so there is no harm in re-requesting the chunk from another node.
\subsubsection{Timeout vs. not found}
Note that in Swarm there is no explicit negative response for chunks that are not found. In principle, the node closest to the retrieved address can determine the absence of a chunk at that address and could issue a "not found" response. However, this is not desirable for the following reason: While the closest node to a chunk can verify its absence from its expected location in the network, nodes further away cannot reliably conclude the same. They lack first-hand verification, and should positive evidence of the chunk's retrievability surface later, an earlier negative response would remain plausibly deniable.
All in all, as long as delivery has the potential to create earnings for the storer, the best strategy is to keep a pending request open until it times out and be prepared in case the chunk should appear. There are several ways the chunk could arrive after the request: (1) syncing from existing peers, (2) the appearance of a new node, or (3) a request preceding the upload, e.g. when the requestor has already "subscribed" to a single owner address (see \ref{sec:messaging}) to decrease the latency of retrieval. This is conceptually different from the usual server--client architectures, where it makes sense to expect a resource to be either on the host server or not.
\subsubsection{Opportunistic caching}
Using backwarding for chunk delivery responses to retrieve requests also enables \gloss{opportunistic caching}: a \gloss{forwarding node} receiving a chunk saves it in case it is requested again. This mechanism is crucial in ensuring that Swarm scales the storage and distribution of popular content automatically (see \ref{sec:caching}).
\subsubsection{Incentives}
So far, we have shown that by using the retrieval protocol and maintaining Kademlia connectivity, nodes in the network are capable of retrieving chunks. However, since forwarding is expending a scarce resource (bandwidth), without providing the ability to account for this bandwidth use, network reliability will be contingent on the proportion of freeriding and altruism. To address this, in Section \ref{sec:incentivisation} we will outline a system of economic incentives that align with the desired behaviour of nodes in the network. When these profit-maximising strategies are employed by node operators, they give rise to emergent behaviour that is beneficial for users of the network as a whole.
\subsection{Push syncing\statusgreen}\label{sec:push-syncing}
In the previous sections, we presented how a network of nodes maintaining a Kademlia overlay topology can be used as a distributed chunk store and how Forwarding Kademlia routing can be used to define a protocol for retrieving chunks.
When discussing retrieval, we assumed that chunks are located with the node whose address is closest to theirs. This section describes the protocol responsible for realising this assumption: ensuring delivery of the chunk to its prescribed storer after it has been uploaded to any arbitrary node.
This network protocol, called \gloss{push syncing}, is analogous to chunk retrieval: First, a chunk is relayed to the node closest to the chunk address via the same route as a retrieval request would be, and then in response a \gloss{statement of custody receipt} is passed back along the same path (see Figure \ref{fig:push-syncing}). The statement of custody sent back by the storer to the \gloss{uploader} indicates that the chunk has reached the neighbourhood from which it is universally retrievable. By tracking these responses for each constituent chunk of an upload, uploaders can make sure that their upload is fully retrievable by any node in the network before sharing or publishing the address of their upload. Keeping this count of chunks push-synced and receipts received serves as the back-end for a \emph{progress bar} that can be displayed to the uploader to give feedback of the successful propagation of their data across the network (see \ref{sec:upload}).
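The uploader-side tally behind such a progress bar could look like the following sketch (Go; identifiers are illustrative and chunk addresses stand in for full chunk records):
\begin{verbatim}
// uploadTracker counts push-synced chunks and the statements of
// custody received for them.
type uploadTracker struct {
    pending  map[string]bool // chunk address -> receipt outstanding
    received int
}

func newUploadTracker() *uploadTracker {
    return &uploadTracker{pending: map[string]bool{}}
}

func (t *uploadTracker) onPush(addr string) { t.pending[addr] = true }

func (t *uploadTracker) onReceipt(addr string) {
    if t.pending[addr] {
        delete(t.pending, addr)
        t.received++
    }
}

// progress is the fraction of chunks acknowledged by a receipt; the
// upload can be safely published once it reaches 1.
func (t *uploadTracker) progress() float64 {
    total := t.received + len(t.pending)
    if total == 0 {
        return 1
    }
    return float64(t.received) / float64(total)
}
\end{verbatim}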
\begin{figure}[htbp]
\centering
\includegraphics[width=\textwidth]{fig/push-sync.pdf}
\caption[Push syncing \statusgreen]{Push syncing. Node $U$ (Uploader) push-syncs a chunk to the chunk's address. Push-sync uses forwarding, so the chunk is relayed via forwarding nodes $F_0$, ..., $F_n$ all the way to node $S$, the storer node closest to the chunk address (the arrows represent transfer of the chunk via direct peer-to-peer connection). A statement of custody receipt signed by $S$ is then passed back along the same route as an acknowledgment to the uploader.}
\label{fig:push-syncing}
\end{figure}
Statements of custody are signed by the nodes that claim to be the closest to the address.
Just as the identity of downloaders is hidden in the retrieval protocol, the identity of uploaders can also remain hidden; hence forwarding Kademlia can implement \gloss{anonymous uploads}.
Another similarity is that in order to allow backwarding for responses, nodes should remember which peer sent a particular chunk. This record should persist for a short period while the statement of custody responses are expected. When this period ends, the record is removed. A statement of custody not matching a record is considered unsolicited and is allowed only up to a small percentage of all push-sync traffic with a peer. Going above this tolerance threshold is sanctioned with disconnection and blacklisting (see \ref{sec:sanctions}).
In this section, we described how the logistics of chunk uploads can be organised with a network protocol using Forwarding Kademlia routing with response backwarding. However, this solution is not complete until it is secured with aligned incentives: The strategy to follow this protocol should be incentivised and DoS abuse should be disincentivised. These are discussed later in detail in \ref{sec:postage-stamps} and \ref{sec:push-sync-incentives}.
\subsection{Pull syncing\statusgreen}\label{sec:pull-syncing}
\glossupper{pull syncing} is the protocol that is responsible for the following two properties:
\begin{labelledlist}
\item[\emph{eventual consistency}] Syncing neighbourhoods as and when the topology changes due to churn or new nodes joining.
\item[\emph{maximum resource utilisation}] Nodes can pull chunks from their peers to fill up their surplus storage.%
%
\footnote{Maximum storage utilisation may not be optimal in terms of the profitability of nodes. Put differently, storer nodes have an optimal storage capacity based on how often content is requested from them. This means that in practice, profit-optimised maximum utilisation of storage capacity requires operators to run multiple node instances.}
\end{labelledlist}
Pull syncing is node-centric as opposed to chunk-centric, i.e. it both ensures that a node's storage is filled when needed and syncs chunks within a neighbourhood. When two nodes are connected, they start syncing both ways, so that on each peer connection there is bidirectional chunk traffic. The two directions of syncing are managed by distinct and independent \emph{streams}. In the context of a stream, the consumer of the stream is called the \gloss{downstream peer} or client, while the provider is called the \gloss{upstream peer} or server.
When two nodes connect and engage in \gloss{chunk synchronisation}, the upstream peer offers all the chunks it stores locally in one data stream per proximity order bin. To receive the chunks that are closer to it than to the upstream peer, a downstream peer subscribes to the chunk stream of the proximity order bin that the upstream peer occupies in its Kademlia table. If the peer connection falls within the nearest neighbour depth $d$, the client subscribes to all streams with proximity order bin $d$ or greater. As a result, peers eventually replicate all chunks belonging to their area of responsibility.
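The subscription rule just described can be summed up in a few lines of Go (a sketch; \texttt{po} is the proximity order of the peer connection, \texttt{depth} the neighbourhood depth, and \texttt{maxPO} the number of bins, all names being illustrative):
\begin{verbatim}
// binsToSubscribe returns the proximity order bins whose streams a
// downstream peer subscribes to on a connection with proximity order
// po, given its neighbourhood depth.
func binsToSubscribe(po, depth, maxPO int) []int {
    if po < depth {
        // Peer outside the neighbourhood: subscribe only to the bin
        // the peer falls into.
        return []int{po}
    }
    // Peer within the neighbourhood: subscribe to all bins at the
    // depth or deeper.
    bins := []int{}
    for b := depth; b <= maxPO; b++ {
        bins = append(bins, b)
    }
    return bins
}
\end{verbatim}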
A pull syncing server's behaviour is that of a \gloss{stream provider} in the stream protocol. Nodes keep a record of when they stored each chunk locally by indexing chunks with an ascending storage count known as the \gloss{bin ID}. For each proximity order bin, upstream peers offer to stream chunks in descending order of storage timestamp. As a result of syncing streams on each peer connection, a chunk can be synced to a downstream peer from multiple upstream peers. In order to save bandwidth by not sending chunks to peers that already have them, the stream protocol implements a round-trip: Before sending chunks, the upstream peer presents a batch of chunks identified by their addresses. The downstream peer then responds by indicating which chunks from the offered batch it actually needs (see Figure \ref{fig:pull-syncing}). Note that the downstream peer determines whether it has a chunk based on the chunk address alone; this method therefore critically relies on the chunk integrity assumption discussed in \ref{sec:chunks}.
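The offer/want round-trip can be sketched as follows; the message shapes are our simplification of the stream protocol, not its wire format.
\begin{verbatim}
// offer is the upstream peer's batch of chunk addresses for one
// proximity order bin, in storage (bin ID) order.
type offer struct {
    bin   int
    addrs []string
}

// want filters an offered batch down to the chunks missing locally,
// relying on the address alone (the chunk integrity assumption):
// possession of an address is taken as possession of the content.
func want(o offer, have func(addr string) bool) []string {
    missing := []string{}
    for _, a := range o.addrs {
        if !have(a) {
            missing = append(missing, a)
        }
    }
    return missing
}
\end{verbatim}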
\begin{figure}[htbp]
\centering
\includegraphics[width=\textwidth]{fig/pull-sync.pdf}
\caption[Pull syncing \statusgreen]{Pull syncing. Nodes continuously synchronise their nearest neighbourhood. If they have free capacity, they also pull sync chunks belonging to shallower bins from peers falling outside the neighbourhood depth.}
\label{fig:pull-syncing}
\end{figure}
In the context of a peer connection, a client is said to be \emph{synced} if it has synced all the chunks of the upstream peer. Note that due to disk capacity limitations, nodes must impose a value cutoff, and as such, "all chunks" reads as shorthand for "all chunks having a value greater than $v$" (how the value $v$ is arrived at is discussed later in \ref{sec:reserve}). In order for a node to promise that it stores all chunks with value greater than $v$, all its neighbours must also store all chunks with value greater than $v$. In other words, syncing nodes inherit the maximum such value from among their storer peers.
If chunks are synced in the order they are stored, this may not result in the node always having the most profitable (most often requested) chunks. Thus it may be advisable to sync chunks starting with the most popular ones according to upstream peers and finish syncing when storage capacity is reached. In this way, a node's limited storage will be optimised. Syncing and garbage collection are discussed further in \ref{sec:postage-stamps} and \ref{sec:depths}.
To conclude this section, we show how the criteria of \gloss{eventual consistency} are met in a healthy Swarm. Chunks found in the local store of any node will become retrievable after being synced to their storers. This is because, as long as peers in the network pull chunks that are closer to them than to the upstream peer, each chunk travels a route that would also qualify as a valid forwarding path in the push-sync protocol. If new nodes are added and old nodes drop out, neighbourhoods change; but as long as local redundancy is high enough that churn cannot render previously retrievable chunks non-retrievable, neighbourhoods eventually replicate their content and redundancy is restored. Consider the unlikely event that a whole new neighbourhood is formed and the nodes that originally held the content belonging to this neighbourhood end up outside of it, resulting in a temporary unavailability of those chunks. Even in this scenario, as long as there is a chain of nodes running pull-syncing streams on the relevant bins, redundant retrievability is eventually restored.
\subsection{Light nodes\statusgreen}
\label{sec:light}
The concept of a \gloss{light node} refers to a special mode of operation necessitated by poor bandwidth environments, e.g. mobile devices on low throughput networks or devices allowing only transient or low-volume storage.
A node is said to be light by virtue of not participating fully in the usual protocols detailed in the previous sections, i.e. retrieval, push syncing, or pull syncing.
A node that has a restricted bandwidth environment, or whose capacity to maintain underlay connections is otherwise limited, cannot be expected to forward messages conforming to the rules of Kademlia routing. This needs to be communicated to its peers so that they do not relay messages to it.
As all protocols in Swarm are modular, a node may switch any protocol on or off independently (depending on its capacity and earnings requirements). To give an example: a node that has no storage space available but has spare bandwidth may participate as a forwarding node only. Of course, while switching off protocols is technically feasible, a node must at all times take into account that its peers expect whatever level of service it advertises; they may not accept that some services are switched off and may choose not to interact with that node.
Since forwarding can earn revenue, these nodes may still be incentivised to accept retrieve requests. However, if a light node has Kademlia connectivity above proximity order bin $p$ (i.e. it is connected to all storer nodes within its nearest neighbourhood of $r$ peers at depth $d$, and there is at least one peer in each of its proximity order bins from $p$ to $d$), it can advertise this and thereby participate in forwarding.
When such nodes want to retrieve or push chunks and the chunk address falls into a proximity order bin where they have no peers, they can simply pick a peer from another bin. Though this may result in a spurious hop (where the proximity of the message destination to the latest peer does not increase as a result of the relaying), the Kademlia assumption that routing can be completed in logarithmically many steps still holds.
A node that advertises itself as a storer/caching node is expected to store all chunks above a certain value. In order to maintain consistency, such nodes need to synchronise the content within their area of responsibility, which requires them to run the pull-sync protocol. The same applies to aspiring storer nodes that come online with available storage and open up pull-sync streams to fill their capacity. In these early stages, it does not make sense for such a node to sync to other full storer nodes; however, it can still be useful for it to sync with other similar newcomer nodes, especially if storer nodes are maxing out their bandwidth.
The crucial point here is that, for redundancy and hop counts to work, light nodes with incomplete, unsaturated Kademlia tables should not be considered by other peers when calculating the saturation level of the network.
\chapter{Incentives}\label{sec:incentivisation}
The Swarm network comprises many independent nodes, each running software which implements the Swarm protocol. It is important to realise that even though nodes run the same protocol, the emergent behaviour of the network is not guaranteed by the protocol alone: as nodes are autonomous, they are essentially "free" to react in any way they desire to the incoming messages of their peers.
It is, however, possible to make it profitable for a node to react in a way that is beneficial for the desired emergent behaviour of the network, while making it costly to act in a way that is detrimental. Broadly speaking, this is achieved in Swarm by enabling a transfer of value from the nodes that use the resources of the network (\glossplural{net user}) to those that provide them (\glossplural{net provider}).
\begin{figure}[htbp]
\centering
\includegraphics[width=\textwidth]{fig/incentive-design-3.pdf}
\caption[Incentive design \statusgreen]{Incentive design}
\label{fig:incentives}
\end{figure}
% The rest of this chapter is concerned with describing the desired emergent behaviour of the network (\ref{sec:emergent_behavior}), after which we describe the actions which incur costs and those that provide value (\ref{sec:cost_benefit}). Finally, we proceed by describing the incentive mechanisms which ensures that costs are borne as directly as possible by the initiator of the action, with benefits flowing to the nodes who provided the initiator with the expected outcome, ultimately facilitating the desired emergent behavior (see \ref{sec:incentive_mechanisms}).
% \section{Incentive design \statusred}
% \wip{foundational requirement + analysis}
% \subsection{WIP Desired emergent behavior \statusred}\label{sec:emergent_behavior}
% \subsection{WIP Analysis on expected costs and benefits of actions \statusred}\label{sec:cost_benefit}
% \subsection{WIP Proposed incentive mechanisms \statusred}\label{sec:incentive_mechanisms}
% This section will constitute most of the sections which are already described below.
\section{Sharing bandwidth\statusgreen}
\green{}
\subsection{Incentives for serving and relaying\statusgreen}\label{sec:incentives-relaying}
\green{}
\subsubsection{Forwarding Kademlia and repeated dealings}
The retrieval of a chunk is ultimately initiated by someone accessing content, and therefore all costs related to this retrieval should be borne by them. While paid retrievals may not sound like a popular idea when today's web is "free", many of the problems with the current web stem from consumers' inability to share the costs of hosting and distribution with content publishers directly. In principle, the retrieval of a chunk can be perceived as a functional unit where the storer acts as a service provider and the requestor as consumer. The provider renders a service to the consumer, and in return, the consumer should provide compensation. Such a direct transaction would normally require that transactors are known to each other, so if we are to maintain the anonymity requirement on downloads, we must conceptualise compensation in a novel way.
As we use forwarding Kademlia, chunk retrieval subsumes a series of relaying actions performed by forwarding nodes. Since these are independent actors, each act of relaying needs to be incentivised independently. Importantly, if only instances of relaying matter, then transactors are restricted to connected peers, regardless of the specifics of accounting and compensation (see \ref{sec:accounting}). Given that the set of ever-connected peers forms a quasi-permanent set across sessions, we are able to frame the interaction within the context of repeated dealings. Such a setting always creates extra incentive for the parties involved to play nice, and it is reasonable to exercise a preference for peers showing an untainted historical record. Moreover, since the set of connected peers is logarithmic in the network size, any book-keeping or blockchain contract that repeated interaction with a peer might necessitate is kept manageable, offering a scalable solution. Turning the argument around, we could say that keeping balances with a manageable number of peers, as well as the ambiguity of request origination, are the very reasons for nodes to have limited connectivity, i.e., to choose leaner Kademlia bins.
\subsubsection{Charging for backwarded response}
If accepting a retrieve request already constitutes revenue for forwarding nodes, i.e. an accounting event crediting the downstream peer is triggered before the response is delivered, then it creates a perverse incentive not to forward the requests. Conditioning the request revenue fulfilment on successful retrieval is the natural solution: The accounting event is triggered only when a requested chunk is delivered back to its requestor, see Figure \ref{fig:retrieval-payment}.
\begin{figure}[htbp]
\centering
\includegraphics[width=\textwidth]{fig/retrieval-payment.pdf}
\caption[Incentivising retrieval \statusgreen]{Incentivising retrieval. Node $D$ (Downloader) sends a retrieve request to the chunk's address. Retrieval uses forwarding, so the request is relayed via forwarding nodes $F_0$, ..., $F_n$ all the way to node $S$, the storer node closest to the chunk address. The chunk is delivered by being passed back along the same route to the downloader. Receiving the chunk response triggers an accounting event.}
\label{fig:retrieval-payment}
\end{figure}
If, however, there is no cost to a request, then sending many illegitimate requests for non-existing chunks (random addresses) becomes possible. This is easily mitigated by imposing sanctions on peers that send too many requests for chunks that do not exist (see \ref{sec:sanctions}).
Once a node initiates (starts or forwards) a request, it commits to pay for that chunk if it is delivered within the defined \gloss{time to live} (\gloss{TTL}), therefore there is never an incentive to block timely deliveries when the chunk is passed back. This commitment also dissuades nodes from frivolously asking too many peers for a chunk, since, if multiple peers respond with delivery, each must be paid.
\subsection{Pricing protocol for chunk retrieval\statusgreen}\label{sec:pricing}
\green{}
Next, we describe the protocol which nodes use to communicate their price for delivering chunks in the Swarm network. Building on top of this protocol, strategies can then be implemented by nodes who wish to compete in the market with other nodes in terms of quality of service and price.
\subsubsection{Price discovery}\label{sec:retrieval-price-discovery}
The main merit of the protocol is that it allows for the mechanisms of price discovery to be based only on local decisions, which is essential for the following reasons: (1) Bandwidth costs are not homogeneous around the world: Allowing nodes to express their cost structure via their price will enable competition on price and quality, ultimately benefiting the end-user. (2) The demand for bandwidth resource is constantly changing due to fluctuations in usage or connectivity. (3) Being able to react directly to changes creates a self-regulating system.
Practically, without this possibility, a node operator might decide to shut down their node when costs go up or, conversely, end-users might overpay for an extended period of time when costs or demand decrease and there is no competitive pressure for nodes to reduce their price accordingly.
Bandwidth is a service that comes with "instant gratification" and therefore immediate acknowledgement and accounting of its cost are justified. Since it is hard to conceive of any externalities or non-linearities in the overall demand and supply of bandwidth, a pricing mechanism which provides for both (1) efficient and immediate signalling and (2) competitive choice with minimal switching and discovery costs is most likely to accommodate strategies that result in a globally optimal resource allocation.
To facilitate this, we introduce a protocol message that can communicate these prices to upstream peers. We can conceptualise this message as an alternative response to a request. Nodes maintain the prices associated with each peer for each proximity order. Therefore, when they issue a retrieve request, they already know the price they commit to pay, as long as the downstream peer successfully delivers the valid chunk within the time-to-live period. However, there is no point in restricting the price signal to responses: If, for whatever reason, a peer decides to change its prices, it is in the interest of both parties to exchange this information even when there is no request to respond to. In order to prevent DoS attacks that flood upstream peers with price change messages, the rate of price messages is limited. Well-behaved and competitively priced nodes are favoured by their peers; if a node's prices are set too high or exhibit much higher volatility than others in the network, then peers will be less willing to request chunks from them.%
%
\footnote{While this suggests that unreasonable pricing is taken care of by market forces, in order to prevent catastrophic connectivity changes as a result of radical price fluctuations, limiting the rate of change may need to be enforced on the protocol level. }
For simplicity of reasoning, we posit that the default price is zero, corresponding to a free service (altruistic strategy).
\subsubsection{Differential pricing of proximities}\label{sec:diff-pricing-prox}
If the price of a chunk is the same at all proximities, then there is no real incentive for nodes to forward requests other than the potential to cache the chunk and earn revenue by reselling it. This option is hard to justify for new chunks, especially if they are in the shallow proximity orders of a node where they are unlikely to be requested. More importantly, if the pricing of chunks is uniform across proximity orders, colluding nodes can generate chunk traffic and pocket exactly as much as they send, virtually a free DoS attack (see Figure \ref{fig:ddos-uniform-price}).
\begin{figure}[htbp]
\centering
\includegraphics[width=\textwidth]{fig/ddos-uniform-price.pdf}
\caption[Uniform chunk price across proximities would allow a DoS \statusgreen]{Uniform chunk price across proximities would allow a DoS attack. An attacker can create a flow of traffic between two nodes $D$ and $S$ by sending retrieve requests towards $S$ which only $S$ can serve. If prices are the same across proximities, such an attack would incur no cost for the attacker.}
\label{fig:ddos-uniform-price}
\end{figure}
To mitigate this attack, the price a requestor pays for a chunk needs to be strictly greater than what the storer node receives as compensation when the request is routed from requestor to storer. In other words, we need a pricing scheme that rewards forwarding nodes, which necessitates differential pricing based on proximity. If a peer further from the chunk quoted a lower delivery price, requests could always be sent its way, with the forwarder pocketing the difference and making a profit. An effective differential scheme will therefore converge to a pricing model where delivery costs more the further the peer is from the chunk address, i.e. rewards for chunk deliveries are a decreasing function of proximity.
Due to competitive pressure along the delivery path and in the neighborhood, we expect that the differential a node is applying to the downstream price to converge towards the marginal cost of an instance of forwarding.
The downstream price is determined by the bin density of the node. Assuming balanced bins of cardinality $2^n$, a node can guarantee to increase the proximity order by $n$ in one hop. At the same time, this means it can spread the cost over $n$ proximity bins, pushing the overall price down.
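As an illustration only, assume a linear price table (the base price and slope below are arbitrary constants of our choosing, not protocol parameters): the margin earned by one act of forwarding then grows with the number of proximity orders covered in that hop.
\begin{verbatim}
const (
    basePrice = 100 // assumed price at proximity order 0
    slope     = 10  // assumed price decrease per proximity order
)

// price is what a peer charges for delivering a chunk at proximity
// order po: a decreasing function of proximity.
func price(po int) int {
    if p := basePrice - slope*po; p > 0 {
        return p
    }
    return 0
}

// margin is what a forwarder pockets when it receives a request at
// proximity order poSelf and forwards it to a peer at the deeper
// proximity order poPeer: the more POs covered in one hop, the more
// it earns.
func margin(poSelf, poPeer int) int {
    return price(poSelf) - price(poPeer)
}
\end{verbatim}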
\subsubsection{Uniformity of price across peers}
Take a node $A$ that needs to forward a request for a chunk which falls into $A$'s PO bin $n$. Notice that all other peers of $A$ in bins $n+1, n+2, \ldots$, just like $A$, also have the chunk in their PO bin $n$. If any of these peers, say $B$, offers a cheaper price for proximity order $n$ than $A$, then $A$ can lower its own price for PO bin $n$, forward all the increased traffic to $B$ and still pocket the difference, see Figure \ref{fig:price-arbitrage}. Note that this is not ideal for the network, as it introduces a \gloss{spurious hop} into routing, i.e., relaying without increasing the proximity.
\begin{figure}[htbp]
\centering
\includegraphics[width=\textwidth]{fig/price-arbitrage-3.pdf}
\caption[Price arbitrage \statusgreen]{Price arbitrage. Nodes keep a price table recording, for each peer, the price of every proximity order. The diagram shows node $0101$ trying to forward a retrieve request for $0000$. The arrows originate from the cell of the closest peer and point to cells where other peers, although further from the chunk, offer cheaper forwarding. Choosing the cheaper peer directs traffic away from the overpriced peer and creates pressure on both to adjust.}
\label{fig:price-arbitrage}
\end{figure}
Similarly, if peers of $A$ in shallower bins have a lower price than $A$ for their respective bins, e.g. $B$ in bin $n-1$ is cheaper than $A$ in bin $n$, then $A$ can always forward any request to $B$ and pocket the difference.
Let us now assume that all peers have price tables that decrease monotonically as PO increases. Assume further that prices are higher in bins shallower than $n$, and that all deeper peers, in bins higher than $n$, have the same prices for $n$. Let $B$, $C$, $D$ and $E$ be the peers in bin $n$, densely balanced. Suppose $A$ wants to forward a chunk to a peer such that the PO with the target address increases by 3. If peers $B$ and $C$ attempt to collude against $A$ and raise the price of forwarding chunks to bin $n+3$, they are still bound by $D$'s and $E$'s prices for PO bin $n+2$, in particular whenever those are lower than $B$'s and $C$'s for $n+3$.
Such price discrepancies offer nodes an arbitrage opportunity; the strategy to forward to the cheapest peer will direct traffic away from expensive peers and increase traffic for cheaper ones. As a consequence, prices will adjust.
All else being equal, this price arbitrage strategy will achieve (1) uniform prices for the same proximity order across the network, (2) prices that decrease linearly as a function of proximity, and (3) an incentive for nodes to increase connectivity and keep prices low. In this way, incentivisation is designed so that strategies that are beneficial to individual nodes are also neatly aligned to benefit the health of the system as a whole.
\subsubsection{Bin density}
Charging based on the downstream peer's proximity to the chunk has the important consequence that the net revenue earned from a single act of non-local delivery to a single requestor is a monotonically increasing function of the difference between the chunk's proximity to the node itself and to the peer the request was forwarded to. In other words, the more distance we can cover in one forward request, the more we earn.
This incentive aligns with downloaders' interest to save hops in serving their requests, leading to lower-latency delivery and reduced bandwidth overhead. This scheme incentivises nodes to keep a gap-free balanced set of addresses in their Kademlia bins as deep as possible (see Figure \ref{fig:bindensity}), i.e, it is better for a node to keep dense Kademlia bins than thin ones.
Nodes that maintain denser bins actually incur the same cost as those with thinner ones, but saving hops improves latency and makes the peer more efficient. This leads to the peer being preferred over other peers with the same prices. The increased traffic can in turn lead to bandwidth contention, which eventually allows prices to be raised.
Note that such arbitrage is more effective in \glossplural{shallow bin}, where the number of peers to choose from is higher. This stands in stark contrast to \glossplural{deep bin} in the area of responsibility. If a node does not replicate its neighbourhood's chunks, some of those chunks will have to be requested from a peer that is closer to the address but further from the node; this is only possible at a loss. An added incentive for neighbours to replicate their area of responsibility is discussed in \ref{sec:redistribution}. With the area of responsibility stored, however, a node can choose to set its price arbitrarily.
\subsubsection{Caching and auto-scaling}\label{sec:caching}
Nodes receive a reward every time they serve a chunk, therefore the profitability of a chunk is proportional to its popularity: the more often a chunk is requested, the higher the reward relative to the fixed cost of storage per time unit. When nodes reach storage capacity limits and need to decide which chunks to delete, a rational agent seeking to maximise profit would opt to remove chunks with the lowest profitability. A reasonably%
%
\footnote{Better metrics for predicting chunk profitability than the age of last request will continue to be identified and developed.}
good predictor for this is the age of last request. In order to maximise the set of chunks to select from, nodes engage in opportunistic caching of the deliveries they relay as well as the chunks they sync. This results in popular chunks being more widely spread and faster served, transforming the whole of Swarm into an auto-scaled and auto-balanced \emph{content distribution network}.
\subsubsection{Non-caching nodes}
Any scheme that ensures \glossplural{relaying node} make a profit creates a positive incentive for forwarding-only, non-caching nodes to enter the network. Such nodes are not inherently beneficial to the network, as they create unnecessary bandwidth overhead. On the one hand, their presence could, in principle, unburden storer nodes from relaying traffic, so using them in shallow bins may not be detrimental. On the other hand, closer to the neighbourhood depth, their peers will favour a caching/storing node over them because of their disadvantage, at least for chunks in their hypothetical area of responsibility. Non-caching nodes can also contribute to increased anonymity (see \ref{sec:retrieval}).
\subsection{Incentivising push-syncing\statusgreen}\label{sec:push-sync-incentives}
\green{}
The push-sync (see \ref{sec:push-syncing}) protocol ensures that chunks uploaded into the network arrive at their designated address. In what follows, we will explain how forwarding is incentivised.
%
%
\footnote{To complement our solution for bandwidth compensation, further measures are needed for spam protection and storage incentivisation which are discussed later in \ref{sec:postage-stamps} and \ref{sec:redistribution}, respectively.}
Push-syncing is analogous to the retrieval protocol in the sense that their respective message exchange sequences travel the same route.
The delivery of the chunk in the push-sync protocol is analogous to a retrieval request and, conversely, the statement of custody receipt in push-sync is analogous to the chunk delivery response in retrieval.
In principle, push-syncing could be left without explicit forwarding incentives. Due to the retrieval protocol, as nodes expect chunks to be found in the neighbourhood of their address, participants in Swarm are at least weakly incentivised to help deliver uploaded chunks to their destination. However, we need to allow for the possibility that chunks are uploaded via nodes further from the chunk than the requestor (light nodes or retries). Thus, if push-syncing were free, nodes could generate wasteful amounts of bandwidth.
Requiring payment only for push-sync delivery by downstream peers would put the forwarder in a position to bargain with a storer node regarding the delivery of the chunk. The possession of a chunk is valuable for the prospective storer node because there is also a system of rewards for storage (see \ref{sec:redistribution}). Based on this, the forwarder node could, in theory, hold onto the chunk unless the storer node pays marginally more than the value of possessing that chunk, factoring in the profit potential due to storage incentives. In particular, since forwarders on the route from the uploader are not numerous, any profits generated from a storage reward mechanism might be captured by these forwarding nodes.
Instead, in push-sync, by making the statement of custody receipt a paid message, the roles switch. The forwarder node is no longer in a position to bargain. To understand why, let us consider a scenario where a forwarding node tries to hold on to a chunk to negotiate a price for pushing it to a storer node. In this case, the uploader will not get a statement of custody receipt within the expected time frame. As a result, the uploader will assume that the attempt has failed and re-upload the chunk via a different route. Now, suddenly, the original forwarding node is forced to compete with another forwarding node for the compensation. Since all forwarding nodes are aware of this dynamic, emergent behaviour will produce a series of peers that are willing to forward the chunk to the storer node for a relatively small compensation on top of the bandwidth costs incurred. This eliminates the need for the original forwarding node to try and bargain with the storer node in the first place: Instead, it can generate a small profit immediately by simply returning the statement of custody receipt.
\begin{figure}[htbp]
\centering
\includegraphics[width=\textwidth]{fig/push-payment.pdf}
\caption[Incentives for push-sync protocol \statusgreen]{Incentives for push-sync protocol. Node $U$ (uploader) sends the chunk towards its address, the closest node to which is node $S$ (storer) via forwarding nodes $F_0, \ldots F_n$. The storer node responds with a statement of custody receipt which is passed back to the uploader via the same forwarding nodes $F_n, \ldots F_0$. Receiving the statement of custody receipt triggers an accounting event.}
\label{fig:syncing-swap}
\end{figure}
This scheme highlights why the incentivisation of the two protocols relies on the same premises: there are many sellers (forwarders) and only one buyer (uploader) for a homogeneous good (the statement of custody receipt). As a result, the price of the service (delivering the chunk to the storer) is determined by the sum of the marginal costs of forwarding for each node along the route. At the same time, the storer node can capture all the profits from the storage compensation scheme.
In this way, we can make sure that (1) storers actually respond with receipts, and (2) nodes have a way to detect timed-out or unsolicited receipt responses to protect against DoS, see Figure \ref{fig:syncing-swap}.
Similar to the retrieval protocol, the pricing in this scheme is expected to vary across proximities (see \ref{sec:diff-pricing-prox}). Additionally, as the costs of the nodes in the network fluctuate (depending on capacity utilisation and node efficiency), the pricing will also be subject to change over time. Given that in both protocols the accounting covers the transfer of one chunk and one short message (the retrieve request and the custody receipt, respectively), we can safely conclude that the price structure for forwarding is identical for both. Consequently, a unified pricing scheme for forwarding can be applied to both protocols, as discussed in \ref{sec:retrieval-price-discovery}.
What distinguishes push-sync from the retrieval protocol is that, unlike in retrieval where the chunk is delivered back and its integrity can be validated, the accounting event in push-sync is a statement of custody which can be spoofed. Due to the forwarding incentive, nodes may be motivated to withhold forwarding and impersonate a storer node by issuing the statement of custody. This makes it advisable to query (retrieve) a chunk via alternative routes. If such retrieval attempts fail, it may be necessary to try push-syncing chunks through alternative routes.
\section{Swap: accounting and settlement\statusgreen}\label{sec:accounting-and-settlement}
\green{}
This section covers aspects of incentivisation relating to bandwidth sharing.
In \ref{sec:accounting}, we introduce a mechanism to keep track of the data traffic between peers and offer peer-to-peer accounting for message relaying.
Subsequently, in \ref{sec:cheques}, we describe the conditions of compensating for unbalanced services and show how settlement can be achieved.
In particular, we introduce the concept of \glossplural{cheque} and the \gloss{chequebook contract}. In \ref{sec:waiver}, we discuss waivers as an optimisation that allows for additional savings on transaction costs. In \ref{sec:zero-cash-entry} we discuss how an incentivised service of sending in cheque-cashing transactions enables zero-cash entry to Swarm and, finally, in \ref{sec:sanctions} we delve into the fundamental set of sanctions that incentivise nodes to play nice and adhere to the protocols.
\subsection{Peer to peer accounting\statusgreen}\label{sec:accounting}
\citet{ethersphere2016sw3} introduce a protocol for peer-to-peer accounting, called \gloss{swap}. Swap is a tit-for-tat accounting scheme that scales microtransactions. The scheme allows directly connected peers to swap payments or payment commitments. The system's key characteristics are captured playfully with different mnemonic resolutions of the acronym SWAP:
\begin{labelledlist}
\item[\emph{Swarm accounting protocol}] A scheme used by Swarm for keeping a record of reciprocal sharing of bandwidth.
\item[\emph{service wanted and provided}] Allows service exchange.
\item[\emph{settle with automated payments}] Send a \gloss{cheque} when \gloss{payment threshold} is exceeded.
\item[\emph{send waiver as payment}] Debt can be waived in the value of un-cashed cheques.
\item[\emph{start without a penny}] Zero-cash entry is supported by unidirectional swap.
\end{labelledlist}
\subsubsection{Service for service}
\gloss{swap} allows service for service exchange between connected peers. When there is equal consumption with low variance over time, bidirectional services can be accounted for without the need for any payments. Data relaying is an example of such a service, making Swap well-suited for implementing bandwidth incentives in content delivery or mesh networks.
\begin{figure}[htbp]
\input{fig/swap-balance-thresholds.tex}
\caption[Swap balance and swap thresholds \statusgreen]{Swap balance and swap thresholds.
Zero balance in the middle indicates the equal consumption and provision of services.
The current channel balance represents the difference in uncompensated service provision:
If the balance is to the right of zero, it tilts in favour of A with peer B being in debt, whereas to the left,
the balance tilts in favour of B with A being in debt.
The orange interval represents loss tolerance. When the balance exceeds the payment threshold, the party in
debt sends a \gloss{cheque} to its peer. If it reaches the \gloss{disconnect threshold}, the peer in debt is disconnected.}
\label{fig:swap}
\end{figure}
\subsubsection{Settling with payments}
In situations where there is high variance or unequal consumption of services, the balance will eventually tilt significantly toward one peer. In such cases, the indebted party issues a payment to the creditor to restore the nominal balance to zero. This process is automatic and justifies the concept of swap as \emph{settle (the balance) with automated payments} (see Figure \ref{fig:swap}). These payments can be in the form of commitments rather than immediate transactions.
\subsubsection{Payment thresholds}
To quantify what counts as a "significant tilt", the swap protocol requires peers to advertise a \gloss{payment threshold} as part of the handshake: When their relative debt to a peer goes above this threshold, they send a message containing a payment to that peer. It is reasonable for any node to send a payment when the debt reaches this level, as there is also a disconnect threshold in place. The disconnect threshold can be set freely by any peer, but it is recommended to choose it so that the difference between the disconnect threshold and the payment threshold accommodates the usual variance in the accounting balance between the two peers.
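The threshold logic amounts to the following Go sketch; the sign convention and the returned actions are our illustrative choices, not part of the protocol.
\begin{verbatim}
// swapPeer tracks the swap balance with one peer: positive means the
// peer is in debt to us, negative means we are in debt to the peer.
type swapPeer struct {
    balance             int64
    paymentThreshold    int64 // advertised by the peer
    disconnectThreshold int64 // chosen locally
}

// update applies an accounting event (delta) and returns the action
// to take on this connection.
func (p *swapPeer) update(delta int64) string {
    p.balance += delta
    switch {
    case p.balance >= p.disconnectThreshold:
        return "disconnect" // peer's debt exceeded our tolerance
    case -p.balance >= p.paymentThreshold:
        p.balance = 0 // issue a cheque and reset the balance
        return "send cheque"
    default:
        return "ok"
    }
}
\end{verbatim}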
\subsubsection{Atomicity}
Sending the \gloss{cheque} and updating the balance on the receiving side cannot be made an atomic operation without substantial added complexity. For instance, a client could crash between receiving and processing the message, so even if the sending returns with no error, the sending peer cannot be sure the payment was received; this can result in discrepancies in the accounting on both sides. The tolerance expressed by the difference between the two thresholds ($\mathit{DisconnectThreshold}-\mathit{PaymentThreshold}$) guards against this: if such crashes are infrequent and happen with roughly equal probability for both peers, the resulting minor discrepancies are filtered out. This mechanism protects nodes from facing sanctions.
\begin{center}
\begin{figure}[htbp]
\input{fig/chequeswap.tex}
\caption[Cheque swap \statusgreen]{Peer B's swap balance (with respect to A) reaches the payment threshold (left),
B sends a cheque to peer A. B keeps the cheque and restores the swap balance to zero.}
\label{fig:chequeswap}
\end{figure}
\end{center}
\subsection{Cheques as off-chain commitments to pay\statusgreen}\label{sec:cheques}
One of the major challenges with conducting direct \glossplural{on-chain payment} in a blockchain network is the high transaction cost associated with processing each transaction by every participating node. It is, however, possible to create a payment without presenting it on-chain. Such payments are called \gloss{second-layer payment} strategies. One such strategy is deferring payments and processing them in bulk. In exchange for the reduced cost, the beneficiary must be willing to incur a higher risk of settlement failure. We argue that this is perfectly acceptable in the case of bandwidth incentivisation in Swarm, where peers engage in repeated dealings.
\subsubsection{The chequebook contract}
A simple smart contract called the \gloss{chequebook contract}, introduced in \citet{ethersphere2016sw3}, allows the beneficiary to determine the timing of payments. This contract acts as a wallet that can process \glossplural{cheque} issued by its owner. Similar to traditional financial transactions, the issuer signs a \emph{cheque} specifying the \emph{beneficiary}, \emph{date}, and \emph{amount}, providing it to the recipient as a token of promise to pay at a later date. The smart contract plays the role of the bank. When the recipient wishes to get paid, they "cash the cheque" by submitting it to the smart contract. The contract, after validating the signature, date and the amount specified on the cheque, transfers the amount to the beneficiary's account (see Figure \ref{fig:swap-chequebook}). Analogous to the person taking the cheque to the bank to cash it, anyone can send a digital cheque in a transaction to the owner's chequebook account, initiating the transfer.
The swap protocol specifies that when the \emph{payment threshold} is exceeded, a cheque is sent over by the creditor peer. Such cheques can be immediately cashed by sending them to the issuer's chequebook contract. Alternatively, cheques can also be held, which effectively serves as a form of lending on credit, enabling parties to save on transaction costs.
The amount deposited in the chequebook (\gloss{global balance}) serves as collateral for the cheques and is pooled over the beneficiaries of all outstanding cheques. In this simplest form, the chequebook provides the same guarantees as real-world cheques: None. Since funds can be freely moved out of the chequebook wallet at any time, solvency at the time of cashing can never be guaranteed: If the chequebook's balance is less than the amount specified in a submitted cheque, the cheque will bounce. This is the trade-off between transaction costs and risk of settlement failure.
While, strictly speaking, there are no guarantees for solvency, nor is there an explicit punitive measure in the case of insolvency, a bounced cheque can negatively impact the issuer's reputation as the chequebook contract records such incidents. On the premise that cheques are swapped in the context of repeating dealings, peers will refrain from issuing cheques beyond their balance. In other words, a node's interest in keeping a good reputation with their peers serves as a sufficient incentive to maintain its solvency.
\begin{figure}[htbp]
\centering
\input{fig/swap.tex}
\caption[The basic interaction sequence for swap chequebooks \statusgreen]{The basic interaction sequence for swap chequebooks}
\label{fig:swap-chequebook}
\end{figure}
\subsubsection{Double cashing}
Since these digital cheques are files and can therefore be copied, it is crucial to implement measures to prevent the cashing of the same cheque multiple times. Such "double cashing" can be prevented by assigning each cheque given to a particular beneficiary a serial number which the contract will store when the cheque is cashed. The chequebook contract can then rely on the serial number to make sure cheques are cashed in sequential order, thus needing to store only a single serial number per beneficiary.
Alternatively, to address repeated payments to the same beneficiary, the cheques can contain the \emph{cumulative} total amount ever credited to that beneficiary. The contract maintains a record of the total amount that has been cashed out for each beneficiary, and when a new cheque is submitted, the contract compares the amount on the cheque to the stored total. Cheques with an amount equal to or less than the stored total are ignored, while cheques with a higher total will result in the transfer of the difference to the beneficiary.
This simple trick also makes it possible to cash cheques in bulk because only the most recent "last cheque" needs to be processed, leading to a significant reduction of transaction costs.
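The cumulative-total mechanism mirrors, in a few lines, what the chequebook contract does on-chain. The following Go sketch elides signature validation; all names are illustrative.
\begin{verbatim}
// cheque carries the cumulative total ever credited to a beneficiary,
// not the amount of an individual payment.
type cheque struct {
    beneficiary string
    cumulative  uint64
}

// cashedSoFar plays the role of the contract's per-beneficiary record
// of the total amount already paid out.
var cashedSoFar = map[string]uint64{}

// cash pays out the difference between the cheque's cumulative total
// and the recorded payout; stale or replayed cheques pay nothing, and
// only the most recent cheque ever needs to be submitted.
func cash(c cheque) (payout uint64) {
    prev := cashedSoFar[c.beneficiary]
    if c.cumulative <= prev {
        return 0
    }
    cashedSoFar[c.beneficiary] = c.cumulative
    return c.cumulative - prev
}
\end{verbatim}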
\subsubsection{Cashing without Ether}\label{sec:zero_eth}
Not all peers in Swarm are expected to have the Ether needed to pay for the transaction costs to cash out a cheque. The chequebook allows third parties to cash cheques. The sender of the transaction is incentivised with a reward for the service performed.
\subsection{Waivers\statusgreen}\label{sec:waiver}
If the imbalance in the swap channel is due to high variance rather than unequal consumption, after a period of accumulating cheques, the channel balance starts tilting in the opposite direction. Normally, it is now up to the other party to issue cheques to its peer, resulting in uncashed cheques accumulating on both sides.
To allow for further savings in transaction costs, it might be desirable to be able to offset these cheques against each other.
Such a process is possible, but it requires certain important changes within the chequebook contract. In particular, cashing cheques can no longer be immediate and must incur a security delay, a concept familiar from other payment channel implementations \citep{poon2015bitcoin, diferrante2017payment, mcdonald2017payment, tremback2015universal}.
Let us imagine a system analogous to cheques being returned to the issuer. Assume peer $A$ issued cheques to $B$ and the balance was brought back to zero. Later, the balance tilts in $A$'s favour, but the cheques from $A$ to $B$ have not been cashed. In the traditional financial world, $B$ could either simply return the last cheque to $A$ or provably destroy it. In our case, it is not so simple; we need some other mechanism by which $B$ \emph{commits not to cash} that particular cheque. Such a commitment could take several forms: it could be implemented by $B$ signing a message allowing $A$ to issue a new ``last cheque'' with a lower cumulative total amount than before, or perhaps $B$ could issue a form of ``negative'' cheque for $A$'s chequebook, effectively offsetting the amount as if a cheque with the same amount had been paid.
These implementations share the characteristic of not allowing the instantaneous cashing of cheques in the chequebook. Upon receiving a cheque-cashing request, the contract must wait to allow the other party to submit potentially missing information about cancelled cheques or reduced totals. To accommodate (semi-)bidirectional payments using a single chequebook, we introduce the following modifications (a code sketch of the resulting validation rule follows the list):
\begin{itemize}[noitemsep]
\item All cheques from user $A$ to user $B$ must contain a serial number.
\item Each new cheque issued by $A$ to $B$ must have a serial number higher than the previous one.
\item $A$'s chequebook contract records the serial number of the last cheque that $B$ cashed.
\item During the cashing delay, any valid cheque with a higher serial number supersedes any previously submitted cheques, regardless of their face value.
\item Any submitted cheque that decreases the payout relative to a previously submitted cheque is only valid if it is signed by the beneficiary.
\end{itemize}
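The rules above boil down to a small piece of state-machine logic applied during the cashing delay. The sketch below is one possible rendering in Go, with assumed names (\texttt{PendingCheque}, \texttt{Supersede}) and with signature checks reduced to a boolean flag.
\begin{verbatim}
package chequebook

import "errors"

// PendingCheque models the cheque waiting out the security delay.
type PendingCheque struct {
    Serial              uint64
    Cumulative          uint64
    SignedByBeneficiary bool // set on waivers that reduce the payout
}

// Supersede applies the waiver rules during the cashing delay: a valid
// cheque with a higher serial number replaces the pending one, but a
// cheque lowering the payout must carry the beneficiary's signature.
func Supersede(pending, submitted *PendingCheque) (*PendingCheque, error) {
    if submitted.Serial <= pending.Serial {
        return pending, errors.New("stale serial number: ignored")
    }
    if submitted.Cumulative < pending.Cumulative && !submitted.SignedByBeneficiary {
        return pending, errors.New("payout reduction requires beneficiary signature")
    }
    return submitted, nil // higher serial supersedes, regardless of face value
}
\end{verbatim}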
\begin{figure}[htbp]
\centering
\includegraphics[width=\textwidth]{fig/waivers-diagram-2_new.pdf}
\caption[Example sequence of mixed cheques and waivers exchange \statusgreen]{Example sequence of mixed cheques and waivers exchange}
\label{fig:waivers-diagram}
\end{figure}
With these rules in place, it is easy to see how cheque cancellation would work. Let us consider the scenario where user $A$ has issued cheques $c_0 \ldots c_n$ with cumulative totals $t_0 \ldots t_n$ to user $B$. Suppose that the last cheque $B$ cashed was $c_i$. The chequebook contract has recorded that $B$ has received a payout of $t_i$ and that the last cheque cashed had serial number $i$.
Let us further suppose that the balance starts tilting in $A$'s favour by some amount $x$. If $B$ had already cashed cheque $c_n$, then $B$ would be required to issue a cheque of their own, using $B$'s chequebook as the source and naming $A$ as the beneficiary. However, since cheques $c_{i+1} \ldots c_n$ are still uncashed, $B$ can instead send to $A$ a cheque with $A$'s chequebook as the source, $B$ as the beneficiary, serial number $n+1$, and cumulative total $t_{n+1} = t_n - x$. Due to the rules enumerated above, $A$ will accept this as an equivalent payment of amount $x$ from $B$. In this scenario, instead of sending a cheque to $A$, $B$ waives part of their earlier entitlement. This justifies the concept of SWAP as \emph{send waiver as payment}.
This process can be repeated multiple times until the cumulative total is brought back to $t_i$. At this point, all outstanding debt has effectively been cancelled, and any further payments must be made in the form of a proper cheque from $B$'s chequebook to $A$ (see Figure \ref{fig:waivers-diagram}).
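To make the arithmetic concrete, here is a worked instance with illustrative numbers. Suppose $t_n = 100$, the last cheque $B$ cashed carried $t_i = 60$, and the balance tilts by $x = 15$ in $A$'s favour. $B$ then issues a waiver with serial number $n+1$ and cumulative total $t_{n+1} = 100 - 15 = 85$. If the balance later tilts by a further $25$, the next waiver carries $t_{n+2} = 85 - 25 = 60 = t_i$: $B$'s outstanding entitlement is now exhausted, and any further payment owed to $A$ must come as a proper cheque from $B$'s chequebook.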
% \subsection{Best effort settlement strategy}
% Abels text
\subsection{Zero cash entry\statusgreen}\label{sec:zero-cash-entry}
Swap accounting can also work in a one-directional manner. When a party enters the system with zero liquid capital (a \gloss{newcomer}) but connects to a peer with funds (an \gloss{insider}), the newcomer can begin to provide services (without consuming any) in order to earn a positive swap balance.
If the insider has a chequebook, they are able to simply pay the newcomer with a cheque. However, this has a caveat: the newcomer will be able to earn cheques for the services provided but will not have the means to cash them. Cashing a cheque requires sending a transaction to the blockchain, and therefore requires gas, unless the node can convince one of its peers to execute the transaction on its behalf. To facilitate this, a node can sign off on a cash-out request that it wants to be sent, and the swap contract is extended with a preprocessing step that pays the sender of the transaction out of the cashed amount, covering the transaction's gas cost plus a service fee. This way, the newcomer's cheque may be cashed by any insider (see Figure \ref{fig:zero-cash-entry}). This feature justifies the concept of SWAP as \emph{start without a penny, send with a peer}.
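A sketch of this arrangement, continuing the illustrative Go model from before (the \texttt{CashoutRequest} structure and the split performed in \texttt{Settle} are assumptions for this example, not the actual contract interface):
\begin{verbatim}
package chequebook

import "math/big"

// CashoutRequest is the structure a newcomer signs off-chain so that an
// insider can submit the cashing transaction on the newcomer's behalf.
type CashoutRequest struct {
    Cheque    Cheque   // the cheque to be cashed
    Fee       *big.Int // service fee promised to the transaction sender
    Signature []byte   // newcomer's signature over the request
}

// Settle models the preprocessing step: out of the cashed amount, the
// transaction sender recovers the gas cost plus the fee, and the rest
// goes to the newcomer.
func Settle(cashed, gasCost *big.Int, req *CashoutRequest) (toSender, toNewcomer *big.Int) {
    toSender = new(big.Int).Add(gasCost, req.Fee)
    toNewcomer = new(big.Int).Sub(cashed, toSender)
    return toSender, toNewcomer
}
\end{verbatim}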
\begin{figure}[htbp]
\centering
\input{fig/zero-cash-entry.tex}
\caption[Zero cash entry \statusorange]{Bootstrapping: how to launch a swap-capable node that consumes and provides services and earns money.}
\label{fig:zero-cash-entry}
\end{figure}
The ability to earn small amounts of money without starting capital is crucial, as it provides a way for new users to gain access to Swarm without needing to purchase the token. This benefit extends to the Ethereum ecosystem in general: using Swarm, anybody can earn small amounts of money to start paying the gas that fuels their dapps, without having to go through a painful process of acquiring tokens before onboarding.
% \subsection{Cashing out and risk of insolvency\statusred}
\subsection{Sanctions and blacklisting \statusgreen}\label{sec:sanctions}
This section complements the SWAP scheme with additional incentives and protection against foul play.
\subsubsection{Protocol breach}
In a peer-to-peer trustless setting, implementing nuanced sanctions against undesired peer behaviour can be challenging. However, when the basic rules of interaction are violated, the node that detects it can simply disconnect from that peer. In order to avoid deadlocks due to attempted reconnection, the sanctions imposed on transgressive nodes also include recording the peer's address into a blacklist. This simple measure is enough to provide a clear disincentive to nodes seeking to exploit the protocol.
\subsubsection{Excessive frivolity}
Both retrieval and push-sync protocols have an incentive structure where only the response to a request generates income. Although this creates a strong incentive to play ball, it may also be necessary to take measures to ensure that nodes are not able to spam the network with frivolous requests that have no associated cost. In the case of push-syncing, it is especially important not to allow chunks to expunge others at no cost. This will form the topic of a later section where we introduce postage stamps (see \ref{sec:postage-stamps}).
In the case of retrieval, the potential attack consists of requesting non-existing chunks, causing downstream peers to initiate a lot of network traffic, as well as some memory consumption due to requests being persisted during the time-to-live period.
Of course, a requestor may unknowingly request non-existing chunks; what is more, the requested chunk could have been garbage-collected in the network, in which case the requestor may have acted in good faith.
To mitigate this, each node maintains a record of the number of retrieve requests from each of its peers and updates the relative frequency of failed requests, i.e. requests that have timed out even though the node in question had forwarded them. If the proportion of failed to successful requests exceeds a certain threshold, sanctions are imposed on the peer: it is disconnected and blacklisted.
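As a minimal sketch (the counter names and the tolerance threshold are assumptions for this example, not values from the implementation), the per-peer bookkeeping could look like this:
\begin{verbatim}
package protocol

// peerStats tracks retrieve requests per peer so that a node can detect
// and sanction excessive frivolity.
type peerStats struct {
    requests int // retrieve requests forwarded on behalf of this peer
    failed   int // requests that timed out despite being forwarded
}

// maxFailureRatio is the assumed tolerance before sanctions apply.
const maxFailureRatio = 0.5

// record updates the counters and reports whether the peer should now
// be disconnected and blacklisted.
func (s *peerStats) record(timedOut bool) (sanction bool) {
    s.requests++
    if timedOut {
        s.failed++
    }
    return float64(s.failed)/float64(s.requests) > maxFailureRatio
}
\end{verbatim}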
%
% \footnote{Note that policing frivolous requestors is much easier than policing forwarders, since the latter requires positive evidence from other nodes while the former only requires forwarding.}
%
By remembering the requests they have forwarded, nodes can distinguish legitimate responses from a potential DoS attack: for retrieval, if the chunk delivered does not fulfil an open request, it is considered unsolicited; for push-sync, if a statement of custody response does not match an existing entry for a forwarded chunk, it is considered unsolicited.
Timeouts are crucial here. After the time-to-live period for a request has passed, the record of the open request can be removed. Any response received after this point is considered unsolicited, as it is indistinguishable from messages that were never requested.
To account for slight discrepancies in time measurement, once again a small percentage of illegitimate messages is tolerated from a peer before it is disconnected and blacklisted.
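The corresponding bookkeeping can be sketched as a table of open requests with expiries (the names and the TTL value below are illustrative assumptions):
\begin{verbatim}
package protocol

import "time"

// ttl is the assumed time-to-live of a forwarded request.
const ttl = 30 * time.Second

// openRequests remembers forwarded requests until their time-to-live
// expires; chunk addresses map to the expiry of their open request.
var openRequests = map[string]time.Time{}

// forwarded records that a request for the given address was passed on.
func forwarded(addr string) {
    openRequests[addr] = time.Now().Add(ttl)
}

// solicited reports whether a response fulfils a live open request and
// clears the entry; late or unknown responses count as unsolicited.
func solicited(addr string) bool {
    expiry, ok := openRequests[addr]
    delete(openRequests, addr) // the entry is spent either way
    return ok && time.Now().Before(expiry)
}
\end{verbatim}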
\subsubsection{Quality of service}
Beyond the rate of unsolicited messages, nodes can cause grievances in other ways, such as by setting high prices, having low network throughput, or exhibiting long response latencies. As with excessively frivolous requests, there is no need to distinguish between malicious attacks and sub-optimal (poor quality, overpriced) service provided in good faith. As a result, mitigating quality-of-service issues is discussed in the context of peer selection strategies in forwarding and connectivity.
\subsubsection{Blacklisting}
Blacklisting is a strategy that complements disconnection as a measure against peers. It extends the judgement expressed in the act of disconnection, namely that the peer is unfit for business.
Blacklists serve as a reference when accepting incoming connections as well as in the peer suggestion strategy of the connectivity driver. On the one hand, blacklisting can save the node from being deadlocked in a cycle of malicious peers trying to reconnect. On the other hand, care must be taken not to blacklist peers acting in good faith, as this could negatively impact network connectivity.
\input{storage-incentives}
\input{insurance}
\section{Summary}
In the first two chapters of the architecture part of the book, we introduced the core of Swarm: the peer-to-peer network layer described in Chapter \ref{sec:network} implements a distributed immutable storage for chunks, which is complemented by the incentive system described in the subsequent chapter. The resulting base-layer system provides:
\begin{itemize}
\item permissionless participation and access,
\item zero-cash entry for node operators,
\item maximum resource utilisation,
\item load-balanced distribution of data,
\item scalability,
\item censorship-resistance and privacy for storage and retrieval,
\item auto-scaling of popular content,
\item basic plausible deniability and confidentiality,
\item churn-resistance and eventual consistency in a dynamic network with node dropouts,
\item sustainability without intervention due to built-in economic incentives,
\item robust private peer-to-peer accounting,
\item incentivised bandwidth sharing,
\item off-chain micro-commitments with on-chain settlement,
\item DoS resistance and spam protection,
\item positive (i.e., motivated by reward) incentives for storage,
\item negative (i.e., discouraged through threat of punitive measures) incentives against data loss.
\end{itemize}
\chapter{Building on the DISC}\label{sec:high-level-functionality}
This chapter builds on the distributed chunk store and introduces data structures and processes that enable higher-level functionality, offering a rich experience handling data. In particular, we show how chunks can be organised to represent files (\ref{sec:files}) and how files can be organised to represent collections (\ref{sec:collections}). We also introduce key--value maps (\ref{sec:maps}) and briefly discuss the potential of arbitrary functional data structures. We then shift our attention to presenting our solution for providing confidentiality and access control (\ref{sec:access-control}).
In \ref{sec:feeds}, we introduce Swarm feeds, which are designed to represent a wide variety of sequential data, such as versioning updates of mutable resources or indexing messages for real-time data exchange: offering a system of persisted pull messaging. To implement push notifications of all kinds, \ref{sec:pss} introduces the novel concept of \glossplural{Trojan chunk} that allow messages to be disguised as chunks and directed to their intended recipient in the swarm. We explain how Trojan chunks and feeds can be combined to form a fully-fledged communication system with robust privacy features.
\section{Data structures\statusgreen} \label{sec:datastructures}
\green{}
In the first two chapters, we made the assumption that data is structured in the form of chunks, i.e. fixed-size data blobs. However, in this section, we will present the algorithms and structures that enable the representation of data of arbitrary length. We will introduce \glossplural{Swarm manifest}, which form the basis of representing collections, indexes, and routing tables, allowing Swarm to host websites and offer URL-based addressing.
\subsection{Files and the Swarm hash\statusgreen}\label{sec:files}
In this section, we introduce the concept of the \emph{Swarm hash}, which provides a mechanism for combining chunks to represent larger sets of structured data, such as files. The idea behind the Swarm hashing algorithm is that chunks can be arranged in a Merkle tree. In this structure, leaf nodes correspond to chunks from consecutive segments of input data, while intermediate nodes correspond to chunks composed of the references of their children, packaged together to form another chunk, as illustrated in Figure \ref{fig:Swarm-hash}.
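The following sketch shows the construction in Go. It is only an outline under simplifying assumptions: \texttt{sha256} stands in for the BMT chunk hash, a reference is taken to be the bare 32-byte hash, and encryption and chunk storage are ignored.
\begin{verbatim}
package swarm

import "crypto/sha256"

const (
    chunkSize = 4096 // maximum chunk payload in bytes
    refSize   = 32   // a reference here is just the 32-byte hash
)

// hash stands in for the BMT chunk hash used by Swarm.
func hash(data []byte) []byte {
    h := sha256.Sum256(data)
    return h[:]
}

// swarmHash builds the hash tree over the input and returns the root
// reference: leaves are the hashes of consecutive 4 KB segments of the
// data, and each intermediate chunk packs the references of its children.
func swarmHash(data []byte) []byte {
    var refs [][]byte
    for i := 0; i < len(data); i += chunkSize { // leaf level
        end := i + chunkSize
        if end > len(data) {
            end = len(data)
        }
        refs = append(refs, hash(data[i:end]))
    }
    if len(refs) == 0 {
        refs = append(refs, hash(nil)) // empty input: one empty chunk
    }
    for len(refs) > 1 { // intermediate levels up to the single root
        var next [][]byte
        var packed []byte
        for _, r := range refs {
            packed = append(packed, r...)
            if len(packed) == chunkSize { // a chunk's worth of references
                next = append(next, hash(packed))
                packed = nil
            }
        }
        if len(packed) > 0 {
            next = append(next, hash(packed))
        }
        refs = next
    }
    return refs[0]
}
\end{verbatim}
With 32-byte references and 4096-byte chunks, each intermediate chunk packs up to 128 children, so the depth of the tree grows logarithmically with the size of the file.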
\begin{figure}[htbp]
\centering
\resizebox{1\textwidth}{!}{
\input{fig/Swarm-hash.tex}
}
\caption[Swarm hash \statusgreen]{Swarm hash: the data input is segmented into 4-kilobyte chunks (grey), which are BMT hashed. Their hashes are packaged into intermediate chunks, starting on level $0$, all the way up until a single chunk remains on level $n$.}
\label{fig:Swarm-hash}
\end{figure}
\subsubsection{Branching factor and reference size}