
Improve the EVA protocol: kozlovsky edition #6845

Closed

kozlovsky opened this issue Apr 4, 2022 · 3 comments

@kozlovsky (Contributor) commented Apr 4, 2022

@drew2a I reviewed PR #6831 and did not find any problems in it, but I think I found another significant issue in the EVA protocol, unrelated to the PR changes.

The problem is in the EVAProtocol.on_data method:

async def on_data(self, peer, payload):
    ...
    can_be_handled = transfer.block_number == payload.block_number - 1
    if not can_be_handled or transfer.nonce != payload.nonce:
        return
    ...

As I understand it, this method expects to receive incoming messages in strictly sequential order: each new message should have a block number that is one more than the previous one. At the same time, the on_acknowledgement method sends outgoing messages in batches of 16, where 16 is window_size:

async def on_acknowledgement(self, peer: Peer, payload: Acknowledgement):
    ...
    for block_number in range(transfer.block_number, transfer.block_number + transfer.window_size):
        ...
        self.community.eva_send_message(peer, Data(block_number, transfer.nonce, data))

As messages are UDP packets, the network can deliver the messages of the same batch in any order; it is incorrect to expect they will be received in the same sequence. The current implementation of the on_data method will reject many messages as "out-of-order", which can significantly slow down the data transfer. We could add a diagnostic counter of how many messages are received out of order; I suspect the number may be significant when there is some traffic on the network.

The on_data method should accept any message whose block number lies inside the current window. The method can put incoming messages into an array of length window_size, using the block number to calculate the message's offset in the array. The array can start filled with Nones; once all messages of the current window have been received, they can be joined together and appended to the result binary string. This way, EVAProtocol performance can potentially be much better under heavy load. A sketch of this approach follows below.
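To make the idea concrete, here is a rough sketch of such an on_data, with the diagnostic out-of-order counter included. All names beyond the payload fields (window_start, window, data_list, out_of_order_count, and the helper calls) are hypothetical, not the actual EVA attributes:

async def on_data(self, peer, payload):
    transfer = self.incoming.get(peer)  # illustrative lookup
    if transfer is None or transfer.nonce != payload.nonce:
        return

    offset = payload.block_number - transfer.window_start
    if not 0 <= offset < transfer.window_size:
        return  # duplicate, or a stray packet from a previous window

    if any(chunk is not None for chunk in transfer.window[offset + 1:]):
        transfer.out_of_order_count += 1  # a later block already arrived

    transfer.window[offset] = payload.data_binary

    # (the final, possibly partial window would need extra handling)
    if all(chunk is not None for chunk in transfer.window):
        transfer.data_list.extend(transfer.window)       # accumulate chunks
        transfer.window = [None] * transfer.window_size  # reset for next window
        transfer.window_start += transfer.window_size
        self.send_acknowledgement(peer, transfer)        # hypothetical helper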


Also, a couple of minor notes:

First:

transfer.window_size = max(self.MIN_WINDOWS_SIZE, min(payload.window_size, self.binary_size_limit))

It does not look correct to take the min of window_size and binary_size_limit, because window_size is a dimensionless quantity (a number of messages), while binary_size_limit is a number of bytes. Apples and oranges.
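If the intent is to cap the window size, the bound should itself be a message count; a minimal sketch, assuming a hypothetical MAX_WINDOW_SIZE constant:

# Clamp the peer-requested window size between message-count bounds.
# MAX_WINDOW_SIZE is hypothetical and, unlike binary_size_limit, is
# measured in messages, not bytes.
transfer.window_size = max(self.MIN_WINDOWS_SIZE,
                           min(payload.window_size, self.MAX_WINDOW_SIZE))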

Second:

transfer.data_binary += payload.data_binary

This is potentially slow because it copies the entire previous data each time a new piece is appended. The correct way is to accumulate binary chunks in a list:

transfer.data_list.append(payload.data_binary)

And join all the chunks into a single binary string at the end, in one step:

result = b"".join(transfer.data_list)
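A self-contained illustration of the difference between the two strategies (the sizes are arbitrary):

chunks = [b"x" * 1000 for _ in range(10_000)]

# Quadratic: every += builds a new bytes object and copies all data so far.
data = b""
for chunk in chunks:
    data += chunk

# Linear: collect the chunks, then copy everything exactly once at the end.
data_list = []
for chunk in chunks:
    data_list.append(chunk)
result = b"".join(data_list)

assert result == data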

Additional suggestions from @devos50:

It would be nice to have a ReadRequest along with the WriteRequest.
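A hypothetical ReadRequest payload mirroring WriteRequest might look like this (field names are illustrative, not taken from the actual EVA messages):

from dataclasses import dataclass

@dataclass
class ReadRequest:
    nonce: int   # transfer identifier, as in the other EVA messages
    info: bytes  # application-level description of the requested data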


Add a confirm_callback, like:

    def confirm_callback(self, peer: Peer, info: bytes, data_size: int):
        if info == b'accept_me':
            return True
        return False
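The protocol could then consult this callback before accepting an incoming transfer. A rough sketch, with all method and helper names hypothetical:

async def on_write_request(self, peer, payload):
    accept = self.confirm_callback is None or \
             self.confirm_callback(peer, payload.info, payload.data_size)
    if not accept:
        self.send_error(peer, payload.nonce, b"transfer rejected")  # hypothetical helper
        return
    self.start_incoming_transfer(peer, payload)  # hypothetical helper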
@devos50 (Contributor) commented Apr 4, 2022

Thanks for this review, very helpful!

As messages are UDP packets, the network can deliver the messages of the same batch in any order; it is incorrect to expect they will be received in the same sequence.

Yes, and I observed a few instances in my experiments where a retransmission was triggered. I assumed this was because one or more messages in the window batch were dropped, but it could also be because of this logic. To reduce its impact, I lowered the retransmit timeout.

Not related to the bugs, but I'm a bit unsure how to properly tune the EVA parameters to maximise throughput. There are quite a few interdependencies between the parameters, and the right values also depend on factors external to EVA, such as UDP packet loss. For example, I can increase the window size (which, assuming a perfect network, would decrease the transfer time), but this also increases the network load and eventually slows down the entire system. Analysing the impact of these dependencies would actually be a very interesting experiment 😁. Also, congestion control would be very helpful here, but I guess this is well beyond the scope of the current implementation (for now, we simply limit the number of simultaneous transfers). Maybe you guys have some ideas on picking the right parameters?
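For the congestion control direction, TCP-style AIMD would be the classic starting point: grow the window additively while windows complete cleanly, and cut it multiplicatively when a retransmission is triggered. A rough sketch, with all names and constants hypothetical:

MIN_WINDOW = 16
MAX_WINDOW = 1024

def adjust_window(transfer, retransmit_triggered: bool):
    if retransmit_triggered:
        # Loss observed: back off multiplicatively.
        transfer.window_size = max(MIN_WINDOW, transfer.window_size // 2)
    else:
        # Clean window: probe for more bandwidth additively.
        transfer.window_size = min(MAX_WINDOW, transfer.window_size + 1)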

@drew2a (Contributor) commented Apr 4, 2022

Ref: TFTP Windowsize Option: https://datatracker.ietf.org/doc/html/rfc7440#page-4

@drew2a (Contributor) commented Jun 28, 2022

We implemented everything from this issue except confirm_callback, which will be implemented if a real need arises.
