Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[P2P] Refactor/consolidate P2P modules #576

Merged
merged 90 commits into from
Apr 17, 2023

Conversation

bryanchriswhite
Copy link
Contributor

@bryanchriswhite bryanchriswhite commented Mar 10, 2023

@Reviewer

This PR may be more digestible / reviewable on a commit-by-commit basis. Commits are organized logically and any given line is only modified in a single commit, with few exceptions.


Description

Re-consolidates Libp2p and P2P modules and types.

Issue

Fixes #554

Type of change

Please mark the relevant option(s):

  • New feature, functionality or library
  • Bug fix
  • Code health or cleanup
  • Major breaking change
  • Documentation
  • Other

List of changes

  • Moved peer & url conversion utils to p2p/utils package
  • Refactor getPeerIP to use net.DefaultResolver for easier testing
  • Moved & refactor libp2p host.Host setup util to p2p/utils
  • Consolidated Libp2p & P2P modules.Module implementations
  • Consolidated Libp2p & P2P stdnetwork typesP2P.Network implementations
  • Refactored raintree typesP2P.Network implementation to use libp2p
  • Moved shared/p2p package into p2p/types packages
  • Removed Trnasport interface and implementations
  • Removed ConnectionFactory type and related members
  • Added libp2p host.Host mock generator
  • Refactor debug CLI post P2P module re-consolidation
  • Removed temporary shared/p2p package
  • Removed runtime/configs.Config#UseLibp2p field
  • Use pod IP for validator DNS resolution tilt localnet
  • Add LIBP2P_DEBUG env var
  • Debug logging improvements
  • Set validator POCKET_P2P_HOSTNAME env var to the pod IP
  • Set validator p2p.hostname config value to empty string so that the env var applies

Testing

  • make develop_test
  • LocalNet w/ all of the steps outlined in the README

Required Checklist

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have tested my changes using the available tooling
  • I have updated the corresponding CHANGELOG

If Applicable Checklist

  • I have updated the corresponding README(s); local and/or global
  • I have added tests that prove my fix is effective or that my feature works
  • I have added, or updated, mermaid.js diagrams in the corresponding README(s)
  • I have added, or updated, documentation and mermaid.js diagrams in shared/docs/* if I updated shared/*README(s)

@bryanchriswhite bryanchriswhite added the p2p P2P specific changes label Mar 10, 2023
@bryanchriswhite bryanchriswhite self-assigned this Mar 10, 2023
@bryanchriswhite bryanchriswhite force-pushed the refactor/consolidate-p2p-modules branch from 807d771 to 2263be9 Compare March 14, 2023 20:03
bryanchriswhite added a commit that referenced this pull request Mar 15, 2023
## Description

This was a `TECHDEBT` item that became trivial to do as work progressed
on another task which also required use of `mockdns` (#576).

## Issue

Not related to any specific issue.

## Type of change

Please mark the relevant option(s):

- [ ] New feature, functionality or library
- [ ] Bug fix
- [x] Code health or cleanup
- [ ] Major breaking change
- [ ] Documentation
- [ ] Other <!-- add details here if it a different type of change -->

## List of changes

- Adds `mockdns` as a test dependency
- Mocks DNS resolution in `url_conversion_test.go`
- Adds regression tests to `url_conversion_test.go` for single- and
multi-record DNS responses

## Testing

- [ ] `make develop_test`
- [ ]
[LocalNet](https://github.com/pokt-network/pocket/blob/main/docs/development/README.md)
w/ all of the steps outlined in the `README`

## Required Checklist

- [x] I have performed a self-review of my own code
- [x] I have commented my code, particularly in hard-to-understand areas
- [x] I have tested my changes using the available tooling
- [x] I have updated the corresponding CHANGELOG

### If Applicable Checklist

- [ ] I have updated the corresponding README(s); local and/or global
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added, or updated,
[mermaid.js](https://mermaid-js.github.io) diagrams in the corresponding
README(s)
- [ ] I have added, or updated, documentation and
[mermaid.js](https://mermaid-js.github.io) diagrams in `shared/docs/*`
if I updated `shared/*`README(s)
@bryanchriswhite bryanchriswhite force-pushed the refactor/consolidate-p2p-modules branch from 9b1dff8 to 5fef7f3 Compare March 16, 2023 16:19
@bryanchriswhite bryanchriswhite mentioned this pull request Mar 17, 2023
16 tasks
@bryanchriswhite bryanchriswhite force-pushed the refactor/consolidate-p2p-modules branch 3 times, most recently from 7276db9 to 4bfa96b Compare March 18, 2023 22:56
@bryanchriswhite bryanchriswhite force-pushed the refactor/consolidate-p2p-modules branch 3 times, most recently from c24e361 to 79c81cc Compare March 20, 2023 13:47
@bryanchriswhite

This comment was marked as resolved.

@bryanchriswhite

This comment was marked as outdated.

@Olshansk

This comment was marked as resolved.

@bryanchriswhite
Copy link
Contributor Author

bryanchriswhite commented Mar 21, 2023

@bryanchriswhite Are you still blocked on this or getting support from @deblasis?

@Olshansk, still working through it and got some help from @deblasis. We paired for a bit this morning and thought through how to proceed. I put the notes from our session in notion.

UPDATE (resolved): As embarrassing as it is to admit, may it go as a reminder to others (because apparently we all need one from time to time 😅), to doubletriple-check your setup/teardown when experiencing flakiness in tests which exercise concurrency.

Postmortem

  1. As written (excerpt below), the P2P modules were being #Stop()ed in the main goroutine (perhaps by some twist of git fate). Meanwhile, the assertion which will time out on a wait group if it takes too long, was being called in a defer; also in the main goroutine.
  2. I didn't re-run every permutation to determine to what extent being more explicit about using libp2p/network.Stream#CloseRead() and #CloseWrite(), as opposed to just #Close(), had an effect but it's definitely relevant and I'm confident we're doing it more correctly now. 👍
// Wait for completion
defer waitForNetworkSimulationCompletion(t, &wg)  // <-- 

// Send the first message (by the originator) to trigger a RainTree broadcast
p := &anypb.Any{}
p2pMod := p2pModules[origNode]
require.NoError(t, p2pMod.Broadcast(p))

// Stop all p2p modules outside of loop
for _, p2pMod := range p2pModules {  
  err := p2pMod.Stop()                            // <--
  require.NoError(t, err)
}

🤦🤦🤦🤦🤦🤦

Base automatically changed from chore/p2p-interfaces to main March 21, 2023 17:33
@bryanchriswhite bryanchriswhite force-pushed the refactor/consolidate-p2p-modules branch from 6db30a9 to ec1a481 Compare March 23, 2023 12:20
github-actions[bot]

This comment was marked as outdated.

@bryanchriswhite bryanchriswhite force-pushed the refactor/consolidate-p2p-modules branch from ec1a481 to e11be8d Compare March 23, 2023 15:09
@github-actions github-actions bot dismissed their stale review March 23, 2023 15:10

The check succeeded, dismissing the review comment.

@bryanchriswhite bryanchriswhite force-pushed the refactor/consolidate-p2p-modules branch 2 times, most recently from e0038e7 to 9d88684 Compare March 23, 2023 15:31
@bryanchriswhite bryanchriswhite marked this pull request as ready for review March 23, 2023 15:32
github-actions[bot]

This comment was marked as resolved.

@github-actions github-actions bot dismissed their stale review April 14, 2023 08:53

The check succeeded, dismissing the review comment.

Copy link
Member

@Olshansk Olshansk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bryanchriswhite I appreciate all the comments you added after the last PR.

Please do a last sanity check with localnet (docker-compose and k8s) before merging to avoid breaking main, but otherwise, let's merge this in!

bryanchriswhite and others added 4 commits April 15, 2023 20:56
…p-modules

* pokt/main:
  [Peristence] Fix `TxIndexer` to index sender and recipient transactions with height and index suffixes (#677)
  Update e2e/docs/CHANGELOG.md (#681)
  Update devlog5.md (#672)
  [Infra] validator helm chart for LocalNet (#658)
  Update issue.md
github-actions[bot]

This comment was marked as outdated.

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changelog validation failed with the following output:
Missing changelog in module: charts/

Changelog verification failed. See error messages for more detail.

Please update the relevant CHANGELOG.md files and ensure they follow the correct format.

Copy link
Contributor Author

@bryanchriswhite bryanchriswhite left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please do a last sanity check with localnet (docker-compose and k8s) before merging to avoid breaking main, but otherwise, let's merge this in!

@Olshansk so I have good news and bad news.

Good news: both docker compose and tilt localnets were able to run in autonomous mode past 250 blocks before I shut them down. 🙌

Bad news: after merging the tilt validator refactor which introduces the validator helm chart (#658) there is a new class of error presenting with respect to DNS resolution. This error isn't present in main and it's not immediately clear to me how this issue wouldn't be presenting there as well as I've not modified genesis (where those hostnames come from) and both branches are resolving DNS and the diff there in the code is quite minimal (net.DefaultResolver#LookupHost instead of net.ResolveTCPAddr).

{"level":"fatal","error":"setting up network: converting peer info, pokt address: 00404a570febd061274f72b50d0a37f611dfe339: resolving peer IP for hostname: v1-validator004: resolving peer IP for hostname: v1-validator004, lookup v1-validator004 on 10.96.0.10:53: no such host; converting peer info, pokt address: 00104055c00bed7c983a48aac7dc6335d7c607a7: resolving peer IP for hostname: v1-validator001: resolving peer IP for hostname: v1-validator001, lookup v1-validator001 on 10.96.0.10:53: no such host; converting peer info, pokt address: 00204737d2a165ebe4be3a7d5b0af905b0ea91d8: resolving peer IP for hostname: v1-validator002: resolving peer IP for hostname: v1-validator002, lookup v1-validator002 on 10.96.0.10:53: no such host; converting peer info, pokt address: 00304d0101847b37fd62e7bebfbdddecdbb7133e: resolving peer IP for hostname: v1-validator003: resolving peer IP for hostname: v1-validator003, lookup v1-validator003 on 10.96.0.10:53: no such host","time":"2023-04-17T07:20:31Z","message":"Failed to start pocket node"}

Tapping @okdas once more for help. 😅

@Olshansk (or @okdas) I would like to ask if I may please delegate the responsibility of subsequent manual testing and merging of this PR after this (last last last) fix? It seems that due to the timezones, by the time I'm both online and have approval to merge, there is always something new on main that needs to be merged in and manually tested which more often than not leads to the need for more small changes in this PR, causing a cycle. That combined with the PR fatigue that everyone is feeling makes me feel like we should all push to merge this as soon as possible. Let's be catching these issues in other PRs instead of collecting them all here and perpetuating further divergence from main.

@@ -93,6 +92,7 @@ config:
max_conn_idle_time: 1m
health_check_period: 30s
p2p:
hostname: ""
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-applying #576 (comment) in the helm values.

Comment on lines +38 to +41
- name: POCKET_P2P_HOSTNAME
valueFrom:
fieldRef:
fieldPath: status.podIP
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-applying bb07e7e

Copy link
Member

@okdas okdas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the configuration changes look good.

@github-actions github-actions bot dismissed stale reviews from themself April 17, 2023 17:45

The check succeeded, dismissing the review comment.

@bryanchriswhite bryanchriswhite merged commit 49a3223 into main Apr 17, 2023
bryanchriswhite added a commit that referenced this pull request Apr 17, 2023
* pokt/main:
  [P2P] Refactor/consolidate P2P modules (#576)
bryanchriswhite added a commit that referenced this pull request Apr 17, 2023
* pokt/main:
  [P2P] Refactor/consolidate P2P modules (#576)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
p2p P2P specific changes
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

[P2P] Remove "legacy" P2P module & rename libp2p module directory
4 participants