Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bundle messages at broadcast #4436

Merged
merged 29 commits into from
Aug 31, 2020

Conversation

chimp1984
Copy link
Contributor

We use BundleOfEnvelopes at the connection level to collect messages which are sent out with a short time interval. unfortunatelty that approach did not had much effect as it is per connection and per connection we rarely get jammed too much. This PR is applying the idea to the Broadcast class where we aggregate all messages for broadcast and send them as bundle every second to our connections.
If only 1 message arrived in that interval we do not wrap it into a BundleOfEnvelopes but send it directly.
We also improved the handling of a listener used for notifying about the stored in mailbox event. Previously we added a 3 sec. delay after we received the first successful broadcast. Now we trigger the event once we had at least 3 successful broadcasts.
As we have now mixed messages we had to change the handling of own messages. We treat the bundle with priority if at least one message was originated by ourself.
The notification of listeners handled is by message. To achieve that we use the BroadcastRequest value object.
There was a bug with the timeout delay where a value in seconds have been not converted to ms.
We also refactored the convoluted error handling.

For deploying that PR I recommend to start with 1 seed node and observe the statistics if it have led to lower number of sent messages compared to the current version.
The #4435 PR has added better logging for network statistics and is merged into that PR.

@chimp1984
Copy link
Contributor Author

chimp1984 commented Aug 27, 2020

Doing some statistics comparisions now:
Previous version sends about 300-800 msg/minute
Version of this PR sends about 100 msg/minute

Received messages will lower as well once the majority of the network is using this version. So we can expect a network traffic reduction from number of msg or factor 5-6 (see msg/sec).
Overall traffic in bytes might also be reduced. In the current comparision setup sent data is lower then received data in the new version. 18MB received/15MB sent. versus 12MB received and 17 MNB sent in the old version.

Here a screenshot from the test run of both after 10 minutes. Will post another one after a few hours (left side is using this PR).
Screen Shot 2020-08-27 at 09 40 57

@chimp1984 chimp1984 changed the title Bundle messages at broadcast [WIP] Bundle messages at broadcast Aug 27, 2020
@chimp1984
Copy link
Contributor Author

I added WIP prefix to signal that this PR should not be merged directly. We should first try it out at a seed node and analyse the results then.

@chimp1984
Copy link
Contributor Author

chimp1984 commented Aug 27, 2020

A weird observation is that we receive more bytes in the new version but less messages. We have to find out why this is the case. I don't have an explaination for that yet.

I reduced ping time as well in that PR. Maybe that causes less preferred connectivity to receive less data. So far I see in the logs its less RefreshOfferMessage, BundleOfEnvelopes, AddDataMessage, Ping and Pong. So either one node was connected to less resiliant set of network nodes or some of the changes caused this. I will try later to start again, might be just a random issue and will test with 2 apps with the old version to see if there is also that much difference.
Note that my internet connection is very low, that might be an influence as well. I do not have Tor debug logs enabled, so it might be also that one node gets throttled by tor.

New version:
numTotalReceivedMessages/totalReceivedMessages: 16368 / {NewDaoStateHashMessage=35, RefreshOfferMessage=13237, GetProposalStateHashesResponse=2, RemoveDataMessage=68, GetPeersRequest=1, GetBlindVoteStateHashesResponse=2, BundleOfEnvelopes=75, AddDataMessage=2688, GetDaoStateHashesResponse=2, AddPersistableNetworkPayloadMessage=32, GetPeersResponse=10, Ping=47, GetDataResponse=4, CloseConnectionMessage=3, NewBlockBroadcastMessage=30, RemoveMailboxDataMessage=127, Pong=4, GetBlocksResponse=1};

versus old version:
Number of received messages/Received messages: 19612 / {NewDaoStateHashMessage=45, RefreshOfferMessage=15418, RemoveDataMessage=65, GetPeersRequest=3, BundleOfEnvelopes=134, AddDataMessage=2988, GetDaoStateHashesResponse=2, AddPersistableNetworkPayloadMessage=33, GetPeersResponse=9, Ping=6, GetDataResponse=4, NewBlockBroadcastMessage=18, CloseConnectionMessage=2, RemoveMailboxDataMessage=128, Pong=756, GetBlocksResponse=1};
Number of sent messages of last sec/Sent messages of last sec: 1 / {PreliminaryGetDataRequest=0, NewDaoStateHashMessage=0, GetBlocksRequest=0, RefreshOfferMessage=0, RemoveDataMessage=0, GetPeersRequest=0, BundleOfEnvelopes=0, AddDataMessage=0, GetDaoStateHashesRequest=0, AddPersistableNetworkPayloadMessage=0, Ping=1, GetPeersResponse=0, NewBlockBroadcastMessage=0, GetUpdatedDataRequest=0, RemoveMailboxDataMessage=0, Pong=0};
Bytes received: 15935.169921875 kb

@chimp1984
Copy link
Contributor Author

Here are statistics after a few hours. Left side is the PR version, right side are 2 apps running old verison.

The difference in number of messages is even larger. About factor 10! Data size of sent data is not much different but also a bit lower (about 5%). Data received is more or less same.

Screen Shot 2020-08-27 at 15 15 47

It is important that we flush our queued requests
at shutdown and wait until broadcast is completed as a maker need to
remove his offers at shutdown.

- Add handling for the case that there are very few connections (as in
dev setup).

- Make BundleOfEnvelopes extend BroadcastMessage

- Add complete handler for broadCaster to shutdown in P2PService and
wait with shutdown of other services until broadcaster is completed.
- Remove case for repeated shutdown call on P2PService as it cannot
happen.
# Conflicts:
#	p2p/src/main/java/bisq/network/p2p/P2PService.java
…ndle-msg-at-broadcast

# Conflicts:
#	p2p/src/main/java/bisq/network/p2p/P2PService.java
@chimp1984
Copy link
Contributor Author

chimp1984 commented Aug 29, 2020

@Emzy did run the PR for 6 hours to get some log data and observe if all is ok.

Below the results of logs from PR version versus old version (using the PR with better logging).
It confirms that sent messages are much lower (factor 5) as well as less pings. Once the network has updated the number of received msg (and pongs) will go down as well.

I would suggest that @Emzy starts to use that PR on one seed for an extended test before we merge it.

current verison:
time period: 21 std
Bytes sent: 2385 MB; 113 MB/hour
Bytes received: 2636 MB; 125 MB/hour
Number of sent messages/Sent messages: 691509; 9,14 msg/sec; top: RefreshOfferMessage=552571, AddDataMessage=110788, Pong=58603
Number of received messages/Received messages: 1810464; 23,94 msg/sec

new version:
time period: 6 std
Bytes sent: 884 MB; 147 MB / hour
Bytes received: 920 MB; 153 MB / hour
Number of sent messages/Sent messages: 48861; 2,26 msg/sec; top: BundleOfEnvelopes=28393, RefreshOfferMessage=6507, AddDataMessage=1899, Pong=10752
Number of received messages/Received messages: 478111; 22,13 msg/sec

current verison:
Aug-28 18:14:25.263 [SeedNodeMain] INFO b.n.p.n.Statistic: Network statistics:
totalSentBytes: 21831.76953125 kb;
numTotalSentMessages/totalSentMessages: 29 / {PreliminaryGetDataRequest=3, GetDataResponse=1, GetUpdatedDataRequest=3, GetPeersRequest=8, Pong=1, BundleOfEnvelopes=15};
numTotalSentMessagesLastSec/totalSentMessagesPerSec: 2 / {PreliminaryGetDataRequest=0, GetDataResponse=0, GetUpdatedDataRequest=0, GetPeersRequest=0, Pong=0, BundleOfEnvelopes=2};
totalReceivedBytes: 14509.162109375 kb
numTotalReceivedMessages/totalReceivedMessages: 49 / {PreliminaryGetDataRequest=1, AddDataMessage=14, Ping=1, GetPeersResponse=6, GetDataResponse=4, RefreshOfferMessage=24};
numTotalReceivedMessagesLastSec/totalReceivedMessagesPerSec: 11 / {PreliminaryGetDataRequest=0, AddDataMessage=4, Ping=0, GetPeersResponse=0, GetDataResponse=1, RefreshOfferMessage=6};

new version:
Aug-29 00:45:42.564 [SeedNodeMain] INFO b.n.p.n.Statistic: Network statistics:
totalSentBytes: 906362.5458984375 kb;
numTotalSentMessages/totalSentMessages: 48890 / {PreliminaryGetDataRequest=3, NewDaoStateHashMessage=156, GetProposalStateHashesRequest=5, RefreshOfferMessage=6507, GetProposalStateHashesResponse=100, RemoveDataMessage=186, GetPeersRequest=61, GetBlindVoteStateHashesResponse=98, BundleOfEnvelopes=28393, AddDataMessage=1899, GetDaoStateHashesRequest=5, GetDaoStateHashesResponse=109, GetBlindVoteStateHashesRequest=5, AddPersistableNetworkPayloadMessage=66, GetPeersResponse=27, Ping=25, GetDataResponse=247, CloseConnectionMessage=37, NewBlockBroadcastMessage=102, GetUpdatedDataRequest=3, RemoveMailboxDataMessage=19, Pong=10752, GetBlocksResponse=85};
totalReceivedBytes: 935125.5732421875 kb
numTotalReceivedMessages/totalReceivedMessages: 478160 / {PreliminaryGetDataRequest=138, NewDaoStateHashMessage=1289, GetBlocksRequest=89, GetProposalStateHashesRequest=42, RefreshOfferMessage=373842, GetProposalStateHashesResponse=5, RemoveDataMessage=4425, GetPeersRequest=41, GetBlindVoteStateHashesResponse=5, BundleOfEnvelopes=3825, AddDataMessage=79920, GetDaoStateHashesRequest=107, GetDaoStateHashesResponse=5, AddPersistableNetworkPayloadMessage=1454, GetBlindVoteStateHashesRequest=41, Ping=10730, GetPeersResponse=61, GetDataResponse=6, CloseConnectionMessage=99, NewBlockBroadcastMessage=955, GetUpdatedDataRequest=116, RemoveMailboxDataMessage=984, Pong=20};

@chimp1984 chimp1984 changed the title [WIP] Bundle messages at broadcast Bundle messages at broadcast Aug 30, 2020
@chimp1984
Copy link
Contributor Author

Seed node sn3emzy56u3mxzsr4geysc52feoq5qt7ja56km6gygwnszkshunn2sid.onion:8000 (@Emzy ) is running the PR now. I will test with a local node connected only to that seed to observe behaviour. If anyone wants to test or run from the PR would be welcome. P2P network changes always carries some risk and we should test this very well.
I remove the WIP prefix and it is ready for review and merge from my point of view.

# Conflicts:
#	p2p/src/main/java/bisq/network/p2p/P2PService.java
I don't know why the tests failed as I just added an overloaded method
and it should not have any impact. There is also one exception which
makes it even more obscure. I guess its some test framework issue.

See comment at the exceptional handling
// If we remove the last argument (isNull()) tests fail. No idea why as the broadcast method has an
/ overloaded method with nullable listener. Seems a testframework issue as it should not matter if the
// method with listener is called with null argument or the other method with no listener. We removed the
// null value from all other calls but here we can't as it breaks the test.
Copy link
Member

@sqrrm sqrrm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks pretty rather solid. I have run my live node getting good stats with no issues seen as well
image

Just some small comments that would be good to tend to.

Copy link
Member

@sqrrm sqrrm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK

@sqrrm sqrrm merged commit c551adb into bisq-network:master Aug 31, 2020
@sqrrm sqrrm added this to the v1.3.8 milestone Aug 31, 2020
@chimp1984 chimp1984 deleted the bundle-msg-at-broadcast branch September 2, 2020 00:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants