Skip to content

Commit

Permalink
Merge pull request #99 from freespek/th/final-pass
Browse files Browse the repository at this point in the history
Final pass
  • Loading branch information
thpani authored Dec 17, 2024
2 parents 5fd3887 + f1236fc commit b2456e0
Show file tree
Hide file tree
Showing 13 changed files with 125 additions and 115 deletions.
27 changes: 26 additions & 1 deletion reports/3sf.tex
Original file line number Diff line number Diff line change
Expand Up @@ -74,4 +74,29 @@ \section{Basic 3SF concepts}\label{sec:3sf}
{by having access to all messages sent,} it is possible to slash at least $\frac{n}{3}$ of the validators.
% Finally, we also say that a chain is $f$-\emph{accountable} if the protocol outputting it has Accountable Safety with resilience $f$.
% If a protocol $\Pi$ comprises $k$ subprotocols, $\Pi_i$, each of which outputs a chain, $\Chain_i$, we say that $\Chain_i$ is $f$-accountable if $\Pi_i$ is $f$-accountable.
\end{definition}
\end{definition}

\subsection{Complexity of (model-checking) the protocol}

The 3SF protocol is intricate, with a high degree of combinatorial complexity,
making it challenging for automatic model checking. During our specification
work, we have observed multiple layers of complexity in the protocol:
\begin{itemize}
\item The Python specification considers all possible graphs over all proposed
blocks. From graph theory~\cite{cayley1878theorem}, we know that the number
of labelled rooted forests on $n$ vertices is ${(n+1)}^{n-1}$. (Observe that
this number grows faster than the factorial~$n!$.) This is the number of
possible block graphs that the model checker has to consider for $n$ blocks.
\item The protocol introduces a directed graph of checkpoints (pairs $(b,n)$
of a block $b$ and an integer $n$) \emph{on top} of the block graph.
Validator-signed votes form a third labeled directed graph over pairs of
checkpoints. In addition, all of these edges have to satisfy arithmetic
constraints.
\item Justified and finalized checkpoints introduce an inductive structure
that the model checker has to reason about. Essentially, the solvers have to
reason about chains of checkpoints on top of chains of blocks.
\item Finally, the protocol introduces set cardinalities, both for determining a
quorum of validators and as a threshold for \textit{AccountableSafety}.
Cardinalities are known to be a source of inefficiency in automated
reasoning.
\end{itemize}
42 changes: 19 additions & 23 deletions reports/discussion.tex
Original file line number Diff line number Diff line change
Expand Up @@ -28,26 +28,6 @@
remains accountably safe, even for system sizes substantially larger than
those we were able to exhaustively verify.

\paragraph{Advantages of human expertise over automated translation.} Applying
translation rules to derive checkable specifications from existing artifacts can
serve as a valuable starting point. However, such translations often introduce
inefficiencies because they cannot fully capture the nuances of the specific
context. This can result in suboptimal performance. Therefore, while
translations provide a baseline, manually crafting specifications from the
outset is usually more effective. When relying on translated specifications, it is
essential to apply manual optimizations to ensure both accuracy and efficiency.

\paragraph{Value of \tlap{}.} \tlap{} is a powerful language for specifying and
verifying distributed systems. Although our most promising experimental results
were derived from the Alloy specification, the insights gained through
iterative abstraction in \tlap{} were indispensable.\ \tlap{} enabled us to
start with an almost direct translation of the Python code and progressively
refine it into higher levels of abstraction. This iterative process provided a
deeper understanding of the protocol and laid the groundwork for the more
efficient Alloy specification. The connection between the Python specification
and the Alloy specification is definitely less obvious than the tower of
abstractions that we have built in~\tlap{}.

\paragraph{Value of producing examples.} Even though checking accountable
safety proved to be challenging, our specifications are not limited to proving
only accountable safety. They are also quite useful for producing examples. For
Expand All @@ -56,17 +36,33 @@
this unique value of specifications that are supported by model checkers:

\begin{itemize}

\item Executable specifications in Python require the user to provide program
inputs. These inputs can be also generated randomly, though in the case of
3SF, this would be challenging: We expect the probability of hitting
``interesting'' values, such as producing finalizing checkpoints, to be
quite low.

\item Specifications in the languages supported by proof systems usually do
not support model finding. The TLAPS Proof System is probably a unique
exception here, as~\tlap{} provides a common playground for the prover and the
model checkers~\cite{KonnovKM22}.

\end{itemize}

\paragraph{Advantages of human expertise over automated translation.} Applying
translation rules to derive checkable specifications from existing artifacts can
serve as a valuable starting point. However, such translations often introduce
inefficiencies because they cannot fully capture the nuances of the specific
context. This can result in suboptimal performance. Therefore, while
translations provide a baseline, manually crafting specifications from the
outset is usually more effective. When relying on translated specifications, it is
essential to apply manual optimizations to ensure both accuracy and efficiency.

\paragraph{Value of \tlap{}.} \tlap{} is a powerful language for specifying and
verifying distributed systems. Although our most promising experimental results
were derived from the Alloy specification, the insights gained through
iterative abstraction in \tlap{} were indispensable.\ \tlap{} enabled us to
start with an almost direct translation of the Python code and progressively
refine it into higher levels of abstraction. This iterative process provided a
deeper understanding of the protocol and laid the groundwork for the more
efficient Alloy specification. The connection between the Python specification
and the Alloy specification is definitely less obvious than the tower of
abstractions that we have built in~\tlap{}.
63 changes: 22 additions & 41 deletions reports/intro.tex
Original file line number Diff line number Diff line change
@@ -1,11 +1,17 @@
%! TeX root = report.tex

\section{Key Outcomes}\label{sec:outcomes}

% we embed the discussion right in the introduction
\input{discussion}

\section{Introduction}

At the time of writing this report, Ethereum is using
Gasper~\cite{buterin2020combining} as the underlying consensus protocol.
\subsection{Quick Intro to 3SF}

At the time of writing this report, Ethereum is using Gasper~\cite{buterin2020combining} as the underlying consensus protocol.
In Gasper, time is divided into slots, which represent intervals during which a new block is proposed to extend the blockchain and undergoes voting.
Finalizing a block--ensuring that it is permanently added to the blockchain and cannot be reversed--typically requires 64 to 95 slots.
Finalizing a block -- ensuring that it is permanently added to the blockchain and cannot be reversed -- typically requires 64 to 95 slots.
This delay in finality makes the network more vulnerable to potential
block reorganizations when the network conditions change,
e.g., during periods of asynchronous network conditions.
Expand Down Expand Up @@ -33,11 +39,15 @@ \section{Introduction}
The second confirmation rule defines the \emph{finalized chain}, and provides safety even under network partitions, but loses liveness
either under asynchrony or in case of fluctuation in the participation level.

In this research project, we targeted the 3SF specification in Python \footnote{Link to the Python specification:
\subsection{This Project: Model Checking of 3SF}

In this research project, we targeted the 3SF specification in Python\footnote{Link to the Python specification:
\href{https://github.com/saltiniroberto/ssf/blob/ad3ba2c21bc1cd554a870a6e0e4d87040558e129/high_level/common/ffg.py}{https://github.com/saltiniroberto/ssf/.../ffg.py}} as the case study focusing only on the finality gadget protocol, which is mostly specified in the file \texttt{ffg.py}.
Our main goal was to demonstrate
\emph{Accountable Safety} of this protocol by the means of model checking.
Accountable Safety is the property which ensures that if two conflicting chains (i.e. chains where neither is a prefix of the other) are finalized then, by having access to all messages sent, it is possible to identify at least $\frac{1}{3}$ responsible participants.
Accountable Safety is the property which ensures that if two conflicting chains
(i.e. chains where neither is a prefix of the other) are finalized, then -- by
having access to all messages sent -- it is possible to identify at least $\frac{1}{3}$ responsible participants.

% \paragraph{}\colorbox{yellow}{TODO:}

Expand All @@ -54,42 +64,14 @@ \section{Introduction}
specifications of Paxos~\cite{lamport2001paxos}, Raft~\cite{Ongaro14}, and
Tendermint~\cite{abs-1807-04938,TendermintSpec2020}. Since consensus algorithms
are quite challenging for classical model checkers like TLC, we choose
Apalache~\cite{Apalache2024,KT19,KonnovKM22}. This model checker utilizes the
the symbolic model checker Apalache~\cite{Apalache2024,KT19,KonnovKM22}. This model checker utilizes the
SMT solver~Z3~\cite{MouraB08} in the background. Apalache was used for model
checking of agreement and accountable safety of
Tendermint~\cite{TendermintSpec2020}. As added benefit, four of the project
participants developed Apalache in the past and know its strenghts and
participants have developed Apalache in the past and know its strenghts and
weaknesses.

\paragraph{Complexity of (model-checking) the protocol.} The 3SF protocol is
intricate, with a high degree of combinatorial complexity, making it challenging
for automatic model checking. We have observed multiple layers of complexity in the protocol:
\begin{itemize}
\item The Python specification considers all possible graphs over all proposed
blocks. From graph theory~\cite{cayley1878theorem}, we know that the number
of labelled rooted forests on $n$ vertices is ${(n+1)}^{n-1}$. (Observe that
this number grows faster than the factorial~$n!$.) This is the number of
possible block graphs that the model checker has to consider for $n$ blocks.
\item The protocol introduces a directed graph of checkpoints (pairs $(b,n)$
of a block $b$ and an integer $n$) \emph{on top} of the block graph.
Validator-signed votes form a third labeled directed graph over pairs of
checkpoints. In addition, all of these edges have to satisfy arithmetic
constraints.
\item Justified and finalized checkpoints introduce an inductive structure
that the model checker has to reason about. Essentially, the solvers have to
reason about chains of checkpoints on top of chains of blocks.
\item Finally, the protocol introduces set cardinalities, both for determining a
quorum of validators and as a threshold for \textit{AccountableSafety}.
Cardinalities are known to be a source of inefficiency in automated
reasoning.
\end{itemize}

\subsection{Key Outcomes}\label{sec:outcomes}

% we embed the discussion right in the introduction
\input{discussion}

\subsection{Structure of the report}
\subsection{Structure of the Report}

\begin{figure}
\input{artifacts}
Expand All @@ -100,7 +82,10 @@ \subsection{Structure of the report}
that we have produced in the project:

\begin{enumerate}
\item We have started with the executable specification in Python.
\setcounter{enumi}{0}
\item Section~\ref{sec:3sf} provides an introduction to the technical
details of the 3SF protocol. We start our specification work from the
executable specification in Python \texttt{ffg.py}.

\item \SpecOne{}: This is the specification
\texttt{spec1-2/ffg\_recursive.tla}. It is the result of a manual
Expand Down Expand Up @@ -154,7 +139,3 @@ \subsection{Structure of the report}

\end{enumerate}

\subsection{Potential extensions of this project}\label{sec:future}

\input{future}

2 changes: 1 addition & 1 deletion reports/meta.tex
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
\subsection{Meta-rules for translating recursive definitions}
\subsection{Meta-Rules for Translating Recursive Definitions}
\label{subsec:recrules}

So far, we have been dealing with the primitive building blocks. In order to
Expand Down
2 changes: 1 addition & 1 deletion reports/pyToTla.tex
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
\subsection{Translation rules}
\subsection{Translation Rules}

We give one translation rule per expression.

Expand Down
4 changes: 2 additions & 2 deletions reports/recursion-proofs.tex
Original file line number Diff line number Diff line change
@@ -1,9 +1,9 @@
\section{Detailed proofs}\label{proofs}
\section{Detailed Proofs}\label{proofs}

In the following, we show soundness of our translation rules for the recursive
operators.

\subsection{Additional definitions}
\subsection{Additional Definitions}
Let $f$ be any \tlap function. We use the shorthand $D_f \coloneqq \DOMAIN f$.
We use $\nat$ to refer to the set of all natural numbers (i.e. nonnegative integers).

Expand Down
72 changes: 40 additions & 32 deletions reports/report.tex
Original file line number Diff line number Diff line change
Expand Up @@ -12,23 +12,21 @@
\newcommand{\tth}[1]{\footnote{\textcolor{blue}{#1}}}
\newcommand{\tbh}[1]{\textsc{\textbf{#1}}}

\title{Technical Report:\\
Exploring Automatic Model-Checking of the Ethereum specification\footnote{%
The specifications, scripts, and experimental results can be accessed at:\\
\url{https://github.com/freespek/ssf-mc}
}}
\title{\vspace{-2em}Technical Report: Exploring Automatic Model-Checking of the Ethereum specification\footnote{%
The specifications, scripts, and experimental results can be accessed at: \url{https://github.com/freespek/ssf-mc}}}
\author{%
Igor Konnov\thanks{This work was supported by Ethereum Foundation
under grant FY24--1535.}\\ \small Independent Researcher \\
under grant FY24--1535.}\\ \footnotesize Independent Researcher%
\and
Jure Kukovec\footnotemark[2] \\ \small Independent Researcher \\
Jure Kukovec\footnotemark[2] \\ \footnotesize Independent Researcher%
\and
Thomas Pani\footnotemark[2] \\ \small Independent Researcher \\
Thomas Pani\footnotemark[2] \\ \footnotesize Independent Researcher%
\and
Roberto Saltini\thanks{This work was supported by Ethereum Foundation
under grant FY24--1536 and was partially done while at Consensys.} \\ \small Independent Researcher \\
under grant FY24--1536 and was partially done while at Consensys.} \\
\footnotesize Independent Researcher%
\and
Thanh Hai Tran\footnotemark[3] \\ \small Independent Researcher
Thanh Hai Tran\footnotemark[3] \\ \footnotesize Independent Researcher%
}
\date{}

Expand All @@ -38,27 +36,31 @@

\maketitle

\begin{abstract}

We explore automated model-checking of the Ethereum specification, specifically
the accountable safety of the 3SF protocol. We use~\tlap{} as our primary
specification language and the Apalache model checker with the Z3 SMT solver as
a backend. As a first step, we manually translate the Python specification
into~\tlap{}. This reveals the high combinatorial complexity related to the
definitions of accountable safety in the 3SF protocol.

To address these challenges, we apply multiple layers of manual abstractions,
such as (1) using folds instead of recursion, (2) integers in place of abstract
graphs, and (3) decomposing chain configurations. We also develop
specifications in SMT (CVC5) and Alloy. Despite the challenges, our results show that
exhaustive verification of accountable safety is feasible for small
instances with up to 7 checkpoints and 24 validator votes, and we found no
falsification of accountable safety in slightly larger configurations. Our
study underscores the value of manual abstraction and expertise in enhancing
model-checking efficiency and highlights \tlap{}'s flexibility in handling
abstractions.

\end{abstract}
\begin{abstract}%
We investigate automated model-checking of the Ethereum specification, focusing
on the \emph{Accountable Safety} property of the 3SF consensus protocol. We
select 3SF due to its relevance and the unique challenges it poses for formal
verification. Our primary tools are~\tlap{} for specification and the Apalache
model checker for verification.

Our formalization builds on the executable Python specification of 3SF\@. To
begin, we manually translate this specification into~\tlap{}, revealing
significant combinatorial complexity in the definition of Accountable Safety.
To address these challenges, we introduce several layers of manual abstraction:
(1) replacing recursion with folds, (2) substituting abstract graphs with
integers, and (3) decomposing chain configurations.
To cross-validate our results, we develop alternative encodings in SMT (CVC5)
and Alloy.

Despite the inherent complexity, our results demonstrate that exhaustive
verification of Accountable Safety is feasible for small instances --
supporting up to 7 checkpoints and 24 validator votes. Moreover, no violations
of Accountable Safety are observed, even in slightly larger configurations.
Beyond these findings, our study highlights the importance of manual
abstraction and domain expertise in enhancing model-checking efficiency and
showcases the flexibility of~\tlap{} for managing intricate specifications.%
\end{abstract}%
\newpage%

\setcounter{tocdepth}{2} % hide anything at or below \subsubsection
\tableofcontents
Expand All @@ -81,7 +83,7 @@
\input{spec4}
\input{spec4b}

\section{Summary}\label{sec:summary}
\section{Summary \& Future Work}\label{sec:summary}
We have presented a series of specifications modeling the 3SF protocol from
various perspectives. Initially, we developed a direct translation of the
protocol's Python specification into \tlap{}, but this approach proved
Expand All @@ -96,8 +98,14 @@ \section{Summary}\label{sec:summary}
while the Alloy specification demonstrated exceptional performance in
combination with the Kissat SAT solver.

\subsection{Key Outcomes of the Project}

To revisit the key outcomes of the project, see Section~\ref{sec:outcomes}.

\subsection{Potential Extensions of this Project}\label{sec:future}

\input{future}

\bibliographystyle{plain}
\bibliography{ref}

Expand Down
2 changes: 1 addition & 1 deletion reports/rules.tex
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
%! TeX root = report.tex

\section{Translating Python specifications to \tlap{}}\label{section3}
\section{Translating Python Specifications to \tlap{}}\label{section3}

In this section, we present our results on translating a subset of Python that
is used to write executable specifications in the projects such as
Expand Down
2 changes: 1 addition & 1 deletion reports/spec1.tex
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
%! TeX root = report.tex

\section{\SpecOne{}: Translation from Python}\label{sec:spec1}
\section{\SpecOne{}: Translation from Python to \tlap{}}\label{sec:spec1}

\begin{figure}
\centering
Expand Down
2 changes: 1 addition & 1 deletion reports/spec2.tex
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
%! TeX root = report.tex

\section{\SpecTwo{}: Fold-Based Specification}\label{sec:spec2}
\section{\SpecTwo{}: Fold-Based Specification in \tlap{}}\label{sec:spec2}

\SpecTwo{} addresses some of the limitations inherent in the straightforward
translation of \SpecOne{} from the Python executable specification.
Expand Down
8 changes: 4 additions & 4 deletions reports/spec3.tex
Original file line number Diff line number Diff line change
@@ -1,14 +1,14 @@
%! TeX root = report.tex

\section{\SpecThree{}: Adding a state machine}\label{sec:spec3}
\section{\SpecThree{}: Adding a State Machine}\label{sec:spec3}

In the course of writing~\SpecTwo{}, we realized
that the executable Python specification is essentially sequential. In other
words, even though the 3SF algorithm is distributed, its Python
specification as well as~\SpecOne{} and~\SpecTwo{} are encoding
all possible protocol states in a single specification state.

\subsection{State machines in \tlap{}}
\subsection{State Machines in \tlap{}}

Since \tlap{} is designed for reasoning about state machines, Apalache is tuned
towards incremental model checking of the executions. For instance, if a state
Expand All @@ -30,7 +30,7 @@ \subsection{State machines in \tlap{}}
states. Importantly, any counterexample discovered through symbolic simulation
is just as valid as one found via exhaustive model checking.

\subsection{Introducing a state machine}
\subsection{Introducing a State Machine}

Having the observations above in mind, we introduce~\SpecThree{} in which we
specify a state machine that incrementally builds possible protocol states by
Expand Down Expand Up @@ -80,7 +80,7 @@ \subsection{Introducing a state machine}
\caption{The transition predicate of~\SpecThree{}}\label{fig:abstract-ffg-next}
\end{figure}

\subsection{Model checking experiments}
\subsection{Model Checking Experiments}

We have conducted model checking experiments with this specification. They are
summarized in Table~\ref{tab:abstract-ffg-mc}. Two things to notice:
Expand Down
Loading

0 comments on commit b2456e0

Please sign in to comment.