DSL interface for high-level modules and BCB implementation #102

Merged
merged 9 commits into from
Jun 27, 2022


xosmig
Contributor

@xosmig xosmig commented Jun 17, 2022

The high-level goal is to make it possible to implement distributed protocols in Mir with a syntax that is very close to standard pseudocode notation, and to separate the protocol logic from the boilerplate.
This is done by embedding a domain-specific language in Go.

The implementation itself (file pkg/modules/dsl/dslmodule.go) is mostly ready albeit not yet tested, but the work on the motivating example (file pkg/cb/cbmodule.go) is still in progress.

I suggest first reading the example (file pkg/cb/cbmodule.go) to understand the main idea, then the core implementation (file pkg/modules/dsl/dslmodule.go), and only then the boilerplate (files pkg/modules/dsl/events.go, pkg/cb/cbdsl/dslmodule.go, pkg/cb/cbdsl/events.go, and the rest).

@sergefdrv
Contributor

@xosmig This is an interesting approach to using the Mir framework. Apart from the usual event-based notation, I would also like to explore something like the control-oriented notation mentioned in this lecture, but I'm not sure whether that is easy to do in the Mir architecture.

@xosmig
Contributor Author

xosmig commented Jun 19, 2022

@sergefdrv if I understand you correctly, this is similar to coroutines being executed on 1 thread (i.e., without any parallelism or preemption). For pseudocode, I also prefer this notation over event-based and often use it.

Note that this is an extension of the event-based notation rather than a replacement for it.

Here is a slightly bigger example than the one in the video (from this paper):

image

Note that it uses both "wait for" and "upon" constructions. If converted to actual code, it would probably need a few more "upon" handlers to manage the replies for lines 179 and 185.

It does seem to be doable in Mir: we would need to run a separate goroutine each time we execute a handler. The main goroutine would then act as a coordinator to make sure that at most 1 goroutine at a time is actually being executed. When the handler reaches a waiting statement, it tells the main goroutine the event type it is waiting for and blocks until it gets the event back from the main goroutine. The main goroutine, in turn, blocks until the handler tells it that it has either reached a waiting statement or finished.

The downside of this implementation is that it incurs the overhead of creating a goroutine for each event handler even if you don't use the "wait for" statement. Moreover, in the current design of dsl modules, each event may have multiple handlers (hence, there would be multiple goroutines each time an event arrives).
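The coordinator scheme described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not Mir code: `Event`, `Ctx`, `WaitFor`, `Coordinator`, `Upon`, and `Dispatch` are hypothetical names invented for the example. Each handler runs in its own goroutine, and the coordinator blocks until that handler either returns or parks on a "wait for".

```go
package main

import "fmt"

// Event is a hypothetical event: a type tag plus a payload.
type Event struct {
	Type    string
	Payload string
}

// waitReq is sent by a handler that has reached a "wait for" statement.
type waitReq struct {
	eventType string
	reply     chan Event
}

// Ctx lets a handler park itself until an event of a given type arrives.
type Ctx struct {
	waits chan waitReq  // handler -> coordinator: "I am blocked"
	done  chan struct{} // closed when the handler returns
}

// WaitFor tells the coordinator which event type the handler needs and
// blocks until the coordinator hands such an event back.
func (c *Ctx) WaitFor(eventType string) Event {
	reply := make(chan Event)
	c.waits <- waitReq{eventType: eventType, reply: reply}
	return <-reply
}

// Handler is an event handler that may call c.WaitFor.
type Handler func(c *Ctx, ev Event)

// waiter is a handler goroutine parked on a "wait for".
type waiter struct {
	ctx   *Ctx
	reply chan Event
}

// Coordinator runs handlers in their own goroutines, one at a time.
type Coordinator struct {
	handlers map[string][]Handler
	parked   map[string][]waiter
}

func NewCoordinator() *Coordinator {
	return &Coordinator{handlers: map[string][]Handler{}, parked: map[string][]waiter{}}
}

// Upon registers a handler for an event type.
func (co *Coordinator) Upon(eventType string, h Handler) {
	co.handlers[eventType] = append(co.handlers[eventType], h)
}

// await blocks until the handler behind c either finishes or parks.
func (co *Coordinator) await(c *Ctx) {
	select {
	case <-c.done:
	case w := <-c.waits:
		co.parked[w.eventType] = append(co.parked[w.eventType], waiter{ctx: c, reply: w.reply})
	}
}

// Dispatch first resumes handlers parked on ev.Type, then starts the
// handlers registered for it, waiting for each one before moving on.
func (co *Coordinator) Dispatch(ev Event) {
	resumed := co.parked[ev.Type]
	delete(co.parked, ev.Type)
	for _, w := range resumed {
		w.reply <- ev
		co.await(w.ctx)
	}
	for _, h := range co.handlers[ev.Type] {
		c := &Ctx{waits: make(chan waitReq), done: make(chan struct{})}
		go func(h Handler) {
			defer close(c.done)
			h(c, ev)
		}(h)
		co.await(c)
	}
}

func main() {
	co := NewCoordinator()
	co.Upon("Start", func(c *Ctx, ev Event) {
		fmt.Println("started with", ev.Payload)
		ack := c.WaitFor("Ack") // control-oriented "wait for"
		fmt.Println("resumed with", ack.Payload)
	})
	co.Dispatch(Event{Type: "Start", Payload: "a"})
	co.Dispatch(Event{Type: "Ack", Payload: "b"})
}
```

In this sketch a handler that is never resumed leaks its goroutine, and every handler invocation pays for a goroutine even if it never waits, which is exactly the overhead mentioned above.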

Here are some other ways to implement it I could think of. None of them seem to be practical:

  1. Use 1 goroutine. Manually save the goroutine context (the state of the CPU registers and so on) at a waiting statement and then manually restore it. However, I'm not sure whether the Go runtime exports an API that would allow us to do this (I couldn't find anything similar in the Go runtime package). If there were a coroutine library for Go, we could probably use it, but I couldn't find any (maybe that's a sign that it is in fact impossible in Go).

  2. Construct (or extract) the AST of the (pseudo)code at runtime and then interpret it at runtime. This would probably be very slow. Potential ways to get the AST:
    i. Use the parser package.
    ii. Replace all language operations with custom functions: instead of writing x = y, you would write ast.Assign(x, y), which would create a node in the AST for this assignment.
    iii. (not applicable to Go) In a language like C++, we could create fake objects that overload operators like = to create AST nodes instead of actually performing the assignment.

  3. Finally, there is always the option of using go generate to generate event-based code from the control-oriented (pseudo)code.
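To give a feel for option 2.i, here is a minimal sketch that parses a Go expression with the standard go/parser package and interprets the resulting AST at runtime. The `Evaluate` and `eval` helpers are hypothetical and handle only integer literals, variables, parentheses, and the operators +, -, and *; a real interpreter for protocol pseudocode would need far more.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"strconv"
)

// Evaluate parses src as a Go expression and interprets its AST,
// looking up identifiers in env.
func Evaluate(src string, env map[string]int) (int, error) {
	expr, err := parser.ParseExpr(src)
	if err != nil {
		return 0, err
	}
	return eval(expr, env)
}

// eval walks the AST recursively. Only a tiny subset of Go is supported.
func eval(e ast.Expr, env map[string]int) (int, error) {
	switch n := e.(type) {
	case *ast.BasicLit:
		if n.Kind != token.INT {
			return 0, fmt.Errorf("unsupported literal %s", n.Value)
		}
		return strconv.Atoi(n.Value)
	case *ast.Ident:
		v, ok := env[n.Name]
		if !ok {
			return 0, fmt.Errorf("undefined variable %s", n.Name)
		}
		return v, nil
	case *ast.ParenExpr:
		return eval(n.X, env)
	case *ast.BinaryExpr:
		l, err := eval(n.X, env)
		if err != nil {
			return 0, err
		}
		r, err := eval(n.Y, env)
		if err != nil {
			return 0, err
		}
		switch n.Op {
		case token.ADD:
			return l + r, nil
		case token.SUB:
			return l - r, nil
		case token.MUL:
			return l * r, nil
		}
		return 0, fmt.Errorf("unsupported operator %s", n.Op)
	}
	return 0, fmt.Errorf("unsupported expression %T", e)
}

func main() {
	v, err := Evaluate("x + 2*y", map[string]int{"x": 1, "y": 3})
	fmt.Println(v, err)
}
```

Every evaluation step goes through a type switch and a map lookup, which illustrates why interpreting the AST at runtime is likely to be much slower than compiled handlers.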

@xosmig
Contributor Author

xosmig commented Jun 19, 2022

Perhaps, the overhead could be lowered. Instead of creating a goroutine for each event handler, the main goroutine can create 1 worker goroutine and pass event handlers to it one by one. After passing a handler to the worker, the main goroutine waits until the worker tells it that it either has finished processing the handler or that it blocked on a "wait for" statement. In the latter case, the main goroutine creates a new worker and continues processing events using it.

In this case, goroutines will be created only when it is actually necessary.

@xosmig
Contributor Author

xosmig commented Jun 19, 2022

To optimize even further, the main goroutine can just pass the whole list of events to the worker. The worker executes them until it encounters a "wait for" statement, in which case it notifies the main goroutine where it stopped processing the list and blocks.

So, in the case of a module that never actually uses "wait for" statements, the overhead will be minimal.
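A rough sketch of this worker-reuse idea, under the assumption that handlers are modeled as plain functions: the main goroutine hands a whole batch to one long-lived worker and spawns a replacement only when the worker reports that it parked on a "wait for". All names here (`Task`, `Scheduler`, `RunBatch`, `Resume`, `Spawned`) are hypothetical and not part of Mir.

```go
package main

import "fmt"

// Task is one handler invocation. Calling waitFor models a "wait for"
// statement: it blocks until Resume is called with the same key.
type Task func(waitFor func(key string) string)

// report is what a worker sends to the main goroutine when it stops.
type report struct {
	next   int  // index of the first task that has not been started
	parked bool // true if the worker stopped inside a "wait for"
}

type batch struct {
	tasks []Task
	from  int
}

type Scheduler struct {
	batches chan batch             // main -> current worker
	reports chan report            // worker -> main
	waiting map[string]chan string // parked tasks by key
	Spawned int                    // number of worker goroutines created
}

func NewScheduler() *Scheduler {
	s := &Scheduler{
		batches: make(chan batch),
		reports: make(chan report),
		waiting: map[string]chan string{},
	}
	s.spawn()
	return s
}

// spawn starts a worker that runs batches until one of its tasks parks.
func (s *Scheduler) spawn() {
	s.Spawned++
	go func() {
		for b := range s.batches {
			parked := false
			for i := b.from; i < len(b.tasks) && !parked; i++ {
				b.tasks[i](func(key string) string {
					ch := make(chan string)
					s.waiting[key] = ch
					parked = true
					s.reports <- report{next: i + 1, parked: true}
					return <-ch
				})
			}
			if parked {
				s.reports <- report{} // tell Resume that the task finished
				return                // a replacement worker already exists
			}
			s.reports <- report{next: len(b.tasks)}
		}
	}()
}

// RunBatch executes all tasks, replacing the worker whenever it parks.
func (s *Scheduler) RunBatch(tasks []Task) {
	for from := 0; from < len(tasks); {
		s.batches <- batch{tasks: tasks, from: from}
		r := <-s.reports
		from = r.next
		if r.parked {
			s.spawn()
		}
	}
}

// Resume wakes the task parked on key and waits until it finishes or
// parks again. It returns false if no task is parked on key.
func (s *Scheduler) Resume(key, val string) bool {
	ch, ok := s.waiting[key]
	if !ok {
		return false
	}
	delete(s.waiting, key)
	ch <- val
	<-s.reports
	return true
}

func main() {
	s := NewScheduler()
	say := func(msg string) Task {
		return func(func(string) string) { fmt.Println(msg) }
	}
	s.RunBatch([]Task{say("a"), say("b")}) // no "wait for": still one worker
	s.RunBatch([]Task{
		say("c"),
		func(waitFor func(string) string) { fmt.Println("got", waitFor("reply")) },
		say("d"),
	})
	s.Resume("reply", "pong")
	fmt.Println("workers spawned:", s.Spawned)
}
```

In this sketch a batch without any "wait for" costs two channel handoffs and no new goroutine. The idle worker is never shut down, which a real implementation would have to handle.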

@matejpavlovic
Contributor

matejpavlovic commented Jun 19, 2022

Both ideas (DSL module and coroutines) are very interesting further developments of Mir that I'm all for pursuing.
AFAIU, the DSL module is a really nice add-on that seems basically implemented at this point and can directly be very useful for protocol implementation. The coroutine extension seems also very useful, but seems to involve much more implementation overhead if we wanted to implement it now.

Thus, I'd suggest moving the coroutine discussion to a separate issue and continuing to discuss there, so the ideas don't get lost. At this point, I think we should focus on the most immediate goal of implementing the basic availability layer (basic Narwhal / multisig certificate collector protocol) so we can reach the goal for our milestone that is coming closer and closer. It seems to me that the DSL module will probably accelerate this, as it seems almost ready. Then we can pick up the coroutines again and see whether to also implement them.

This is definitely not meant to kill the discussion on the coroutine-based programming model. On the contrary, I'm encouraging that, also in form of such spontaneous brainstorms. Just want to be on the same page about the immediate goals in terms of writing code :)

@sergefdrv
Contributor

> similar to coroutines being executed on 1 thread (i.e., without any parallelism or preemption)

Not necessarily, I think there are many ways to achieve this. I was actually considering something like communicating sequential processes. One could also do the waiting with traditional mutexes and condition variables, or Go's select statements, or STM, or whatever...

> It does seem to be doable in mir

I wouldn't try sticking to the Mir architecture when exploring possible ways to implement the alternative approach, but I don't have any specific idea at this point. Clearly, this is out of scope for now due to immediate higher-priority tasks.

> coroutine-based programming model

I'd still like to keep the focus more general: on control-oriented notation.

Contributor

@matejpavlovic matejpavlovic left a comment


Looks nice, just some minor comments. Didn't look through the context store, as I assume it to be the same as in #103

pkg/modules/dsl/dslmodule.go Outdated
pkg/cb/cbdsl/events.go Outdated
@xosmig
Contributor Author

xosmig commented Jun 20, 2022

> Didn't look through the context store, as I assume it to be the same as in #103

Oh, I think I committed it here by accident. Thank you for noticing.

@xosmig
Contributor Author

xosmig commented Jun 20, 2022

@sergefdrv sorry, apparently I didn't quite get what you meant by "control-oriented notation". The example in the video looks similar to the model I mentioned, but it seems that's a coincidence.

@xosmig
Contributor Author

xosmig commented Jun 20, 2022

> At this point, I think we should focus on the most immediate goal of implementing the basic availability layer

Yes, sure :)
I didn't really consider implementing the "wait for" statement in this PR or in the near future. It looks like a nice-to-have rather than a necessary feature.

As for control-oriented notation, it seems to be a big separate direction.

@xosmig
Contributor Author

xosmig commented Jun 21, 2022

Two major changes to the DSL module design:

  1. Introduced a standardized way to manage context for asynchronous operations such as signature verification and hash computation.

    • pros: writing code that uses such asynchronous functions now requires much less boilerplate.
    • cons: none? :)
    • I also had a couple of alternative designs for managing the context. I chose this one as it was the simplest.
  2. Replaced UponEvent[EvTp, Ev](m Module, handler func(ev *Ev) error) with RegisterEventHandler[EvTp](m Module, handler func(ev *EvTp) error).

    • a bit of context: EvTp is the generated "wrapper" type (e.g., eventpb.Event_Request) and Ev is the actual event type (e.g., eventpb.Request). EvTp is structurally identical to struct { ev *Ev } (unless Ev is a primitive type, in which case it is identical to struct { ev Ev }).
    • pros: due to some quirks of the code generated by the protobuf compiler, UponEvent had to verify at runtime, at the moment of handler registration, that EvTp and Ev are compatible. Moreover, the object of type Ev was extracted from the object of type EvTp in a rather hacky way (see this file). The implementation of RegisterEventHandler avoids these issues because it passes the EvTp object directly to the handler, without unwrapping it.
    • cons: RegisterEventHandler is less user-friendly and goes against the "declarative" nature of dsl modules.
    • motivation: in the initial design, UponEvent was supposed to be called directly in the protocol implementation. However, the design has evolved, and now the protocol implementation is not really supposed to use this function directly but rather wrappers like the ones in pkg/dsl/events.go. For the wrappers (which are separated from the protocol logic), it is OK to use a slightly less user-friendly and non-declarative function for the sake of type safety.
    • alternative solution: It would be possible to implement UponEvent in a type-safe way with a small protobuf plugin that would augment EvTp with a function Unwrap() *Ev. However, since we are not sure whether we are going to stick with protobufs in the future, it probably does not make sense to invest more time in it now.
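The split between a type-safe RegisterEventHandler and a friendlier wrapper can be sketched as follows. This is an illustration only, not the code from the PR: `Request`, `EventRequest`, and `EventTick` are hypothetical stand-ins for protobuf-generated types, and `Module`, `Apply`, and `UponRequest` are simplified assumptions about how such a module might dispatch on the wrapper type.

```go
package main

import (
	"fmt"
	"reflect"
)

// Hypothetical stand-ins for generated types: Request plays the role of
// Ev, and EventRequest the role of the wrapper type EvTp.
type Request struct{ Data string }
type EventRequest struct{ Request *Request }
type EventTick struct{}

// Module dispatches events to handlers keyed by the wrapper's pointer type.
type Module struct {
	handlers map[reflect.Type][]func(ev any) error
}

func NewModule() *Module {
	return &Module{handlers: map[reflect.Type][]func(ev any) error{}}
}

// RegisterEventHandler passes the wrapper object to the handler as is.
// Only one type parameter is involved, so there is no EvTp/Ev pair whose
// compatibility would have to be checked at runtime.
func RegisterEventHandler[EvTp any](m *Module, handler func(ev *EvTp) error) {
	t := reflect.TypeOf((*EvTp)(nil))
	m.handlers[t] = append(m.handlers[t], func(ev any) error {
		return handler(ev.(*EvTp))
	})
}

// Apply runs every handler registered for the dynamic type of ev.
func (m *Module) Apply(ev any) error {
	for _, h := range m.handlers[reflect.TypeOf(ev)] {
		if err := h(ev); err != nil {
			return err
		}
	}
	return nil
}

// UponRequest is the kind of wrapper protocol code would call: it hides
// RegisterEventHandler and unwraps the event by hand.
func UponRequest(m *Module, handler func(req *Request) error) {
	RegisterEventHandler(m, func(ev *EventRequest) error {
		return handler(ev.Request)
	})
}

func main() {
	m := NewModule()
	UponRequest(m, func(req *Request) error {
		fmt.Println("request:", req.Data)
		return nil
	})
	_ = m.Apply(&EventRequest{Request: &Request{Data: "hello"}})
	_ = m.Apply(&EventTick{}) // no handler registered: nothing happens
}
```

The unwrapping in `UponRequest` is ordinary field access checked by the compiler, which is the trade-off described above: the low-level function is less declarative, but the wrappers keep protocol code readable.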

@xosmig
Contributor Author

xosmig commented Jun 22, 2022

I think the design is more or less stable at this point. I've started adding some unit tests.
@matejpavlovic @sergefdrv it would be nice if you could do a high-level review before I invest too much time in polishing the current code.

@xosmig xosmig force-pushed the dsl-v3 branch 3 times, most recently from cafa380 to 753d806 Compare June 23, 2022 13:43
Contributor

@matejpavlovic matejpavlovic left a comment


Very nice, good job!

pkg/bcb/protobuf.go Outdated
pkg/dsl/dslmodule.go Outdated
pkg/dsl/dslmodule.go Outdated
pkg/dsl/dslmodule.go Outdated
pkg/dsl/dslmodule.go
pkg/dsl/dslmodule.go Outdated
@xosmig xosmig force-pushed the dsl-v3 branch 3 times, most recently from cb40f96 to 7a227c5 Compare June 23, 2022 19:50
@xosmig
Contributor Author

xosmig commented Jun 23, 2022

@matejpavlovic thanks for your review! Could you please also take a look at this commit? Do I interpret the NodeSigsVerified event correctly?

The rest of the commits after your review are mostly tests and comments.

@xosmig xosmig marked this pull request as ready for review June 23, 2022 19:54
@xosmig xosmig changed the title [WIP] DSL interface for high-level modules DSL interface for high-level modules and Byzantine Consistent Broadcast implementation Jun 23, 2022
@xosmig xosmig changed the title DSL interface for high-level modules and Byzantine Consistent Broadcast implementation DSL interface for high-level modules and BCB implementation Jun 23, 2022
pkg/dsl/events.go Outdated
pkg/dsl/events.go
The high-level goal is to make it possible to implement
distributed protocols in mir with a syntax that is very close to
the standard pseudocode notations as well as to separate the
protocol logic from the boilerplate. The way it is done is
basically by creating a domain-specific language inside go.

The motivating example can be found in pkg/bcb/bcbmodule.go.
The core implementation is in pkg/dsl/dslmodule.go.
The rest is mostly boilerplate and auxiliary functions.
The new interface is less user-friendly, but more type-safe.
@xosmig xosmig force-pushed the dsl-v3 branch 2 times, most recently from 0a3db43 to fe11b42 Compare June 27, 2022 15:19
The goal is to prevent the programmer from introducing a bug where
they pass the context by value when sending a request but accept it
by reference in the handler (or vice versa). This would make the
handler not match the response event.
This modification does not completely eliminate the issue, but
makes such a mistake much harder to make.
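A sketch of how this class of bug can be made harder to write, assuming (hypothetically) that contexts are stored as untyped values and matched to handlers by their dynamic type. `Module`, `Request`, `UponResult`, `Complete`, and `hashCtx` are invented names for illustration. Both the request and the handler take *C, so the caller only chooses C and the pointer is fixed by the signatures.

```go
package main

import "fmt"

// Module keeps the contexts of pending requests and the result handlers.
type Module struct {
	nextID   uint64
	contexts map[uint64]any
	handlers []func(result string, ctx any) bool // true if ctx matched
}

func NewModule() *Module {
	return &Module{contexts: map[uint64]any{}}
}

// Request stores the context of an asynchronous request and returns its
// ID. The context is always a *C, so passing a value does not compile.
func Request[C any](m *Module, ctx *C) uint64 {
	id := m.nextID
	m.nextID++
	m.contexts[id] = ctx
	return id
}

// UponResult registers a handler for results whose context is a *C.
func UponResult[C any](m *Module, handler func(result string, ctx *C)) {
	m.handlers = append(m.handlers, func(result string, ctx any) bool {
		c, ok := ctx.(*C)
		if !ok {
			return false // context of another type: not our response
		}
		handler(result, c)
		return true
	})
}

// Complete delivers a result and reports how many handlers matched.
func (m *Module) Complete(id uint64, result string) int {
	ctx, ok := m.contexts[id]
	if !ok {
		return 0
	}
	delete(m.contexts, id)
	matched := 0
	for _, h := range m.handlers {
		if h(result, ctx) {
			matched++
		}
	}
	return matched
}

// hashCtx is a hypothetical context for a hash request.
type hashCtx struct{ origin string }

func main() {
	m := NewModule()
	UponResult(m, func(result string, ctx *hashCtx) {
		fmt.Println("hash", result, "for", ctx.origin)
	})
	id := Request(m, &hashCtx{origin: "bcb"})
	fmt.Println("matched handlers:", m.Complete(id, "abc123"))

	// With an untyped API, a context stored by value would silently fail
	// to match a handler that expects a pointer:
	var stored any = hashCtx{origin: "bcb"}
	_, ok := stored.(*hashCtx)
	fmt.Println("value matches pointer handler:", ok)
}
```

As the commit message says, this does not rule the mistake out entirely: in this sketch one could still instantiate C as hashCtx on one side and *hashCtx on the other, but that now has to be written explicitly.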
@xosmig xosmig merged commit 65d3696 into consensus-shipyard:main Jun 27, 2022

3 participants