This repository has been archived by the owner on May 26, 2022. It is now read-only.

clear out extra dial jobs after dial finishes #51

Merged
merged 1 commit into from
Jan 4, 2018

Conversation

whyrusleeping
Contributor

This really wouldn't be an issue if we didn't have tens of thousands of addresses hanging around for some reason.

@ghost ghost assigned whyrusleeping Jan 3, 2018
@ghost ghost added the status/in-progress In progress label Jan 3, 2018
Contributor

@vyzo vyzo left a comment

so we were leaking dial jobs...

Stebalien
Stebalien previously approved these changes Jan 3, 2018
Member

@Stebalien Stebalien left a comment

That won't fix the bug we discussed but may help a bit (maybe).

@whyrusleeping
Contributor Author

@Stebalien how does it not solve the problem?

@Stebalien Stebalien dismissed their stale review January 3, 2018 20:23

Misread something.

Member

@Stebalien Stebalien left a comment

Didn't look closely enough. This does fix the dial issue (although I'd still like to simplify all this code and remove the numPeersDialing^2 operation).

limiter.go Outdated

	arr := dl.waitingOnFd[:0]
	for _, j := range dl.waitingOnFd {
		if j.peer != p {
Member

These are currently considered "active". We'll need to decrement the activePerPeer counter for each one we remove.
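A minimal sketch of what this comment asks for, not the code that was merged. The `limiter` and `dialJob` types below are simplified stand-ins, peers are plain strings, and `clearWaitingOnFd` is a hypothetical helper name; only `waitingOnFd` and `activePerPeer` come from the review. Deleting the counter entry once it reaches zero is also an assumption.

```go
package main

import "fmt"

// dialJob and limiter are simplified stand-ins for the types in limiter.go.
// Peers are plain strings here; the real code uses a peer ID type.
type dialJob struct {
	peer string
	addr string
}

type limiter struct {
	waitingOnFd   []*dialJob
	activePerPeer map[string]int
}

// clearWaitingOnFd (hypothetical name) drops every queued job for p,
// filtering in place so the backing array is reused, and decrements
// activePerPeer once for each job it removes.
func (dl *limiter) clearWaitingOnFd(p string) {
	kept := dl.waitingOnFd[:0]
	for _, j := range dl.waitingOnFd {
		if j.peer != p {
			kept = append(kept, j)
			continue
		}
		// The removed job was counted as active, so give the slot back.
		dl.activePerPeer[p]--
	}
	// Nil out the unused tail so the removed jobs can be garbage collected.
	for i := len(kept); i < len(dl.waitingOnFd); i++ {
		dl.waitingOnFd[i] = nil
	}
	dl.waitingOnFd = kept

	// Assumption: an entry that has dropped to zero can be deleted.
	if dl.activePerPeer[p] <= 0 {
		delete(dl.activePerPeer, p)
	}
}

func main() {
	dl := &limiter{
		waitingOnFd: []*dialJob{
			{peer: "A", addr: "a1"},
			{peer: "B", addr: "b1"},
			{peer: "A", addr: "a2"},
		},
		activePerPeer: map[string]int{"A": 2, "B": 1},
	}
	dl.clearWaitingOnFd("A")
	fmt.Println(len(dl.waitingOnFd), dl.activePerPeer)
}
```

The scan is still O(n) in the length of `waitingOnFd` and would run under the limiter's lock, which is the cost discussed later in this thread.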

Member

@Stebalien Stebalien left a comment

So, this isn't buggy but still isn't perfect. However, it's better than nothing.

	delete(dl.waitingOnPeerLimit, p)
	// NB: the waitingOnFd list doesn't need to be cleaned out here, we will
	// remove them as we encounter them because they are 'cancelled' at this
	// point
Member

Same as the waitingOnPeerLimit list. The problem is that it could be a while before that happens if we're really backed up behind a bunch of slow dials. It won't be quite as bad because it's limited by dials*perPeerLimit but it will still be large and can still grow in the same way.

Contributor Author

@Stebalien Yeah, the tradeoff is between an O(n) cleanup under a held lock after each dial completes and this potential extra memory buffering.
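For illustration, the lazy side of that tradeoff might look like the sketch below: cancelled jobs stay in the queue and are only discarded when the consumer reaches them. The `dialJob` type, `nextLive`, and the `liveJob`/`cancelledJob` helpers are all hypothetical, and using a `context.Context` as the cancellation signal is an assumption, not something this thread confirms about limiter.go.

```go
package main

import (
	"context"
	"fmt"
)

// dialJob is a simplified stand-in; the ctx field is assumed to be how a
// job learns that its dial was cancelled.
type dialJob struct {
	ctx  context.Context
	addr string
}

// liveJob and cancelledJob are helpers for building example queues.
func liveJob(addr string) *dialJob {
	return &dialJob{ctx: context.Background(), addr: addr}
}

func cancelledJob(addr string) *dialJob {
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	return &dialJob{ctx: ctx, addr: addr}
}

// nextLive pops jobs from the front of the queue, discarding cancelled ones,
// until it finds a live job. It returns that job (or nil) and the remaining
// queue. Nothing is scanned at cancellation time; the cost is that dead jobs
// occupy memory until they reach the front.
func nextLive(queue []*dialJob) (*dialJob, []*dialJob) {
	for len(queue) > 0 {
		j := queue[0]
		queue[0] = nil // drop the reference so the job can be collected
		queue = queue[1:]
		if j.ctx.Err() == nil {
			return j, queue
		}
	}
	return nil, queue
}

func main() {
	queue := []*dialJob{
		cancelledJob("dead1"),
		cancelledJob("dead2"),
		liveJob("live"),
		cancelledJob("dead3"),
	}
	j, rest := nextLive(queue)
	fmt.Println(j.addr, len(rest))
}
```

If the queue is backed up behind slow dials, the cancelled entries can sit there for a long time, which is the growth concern raised above.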

Member

That could be solved with a doubly linked list (wait queues like this are really one of the only uses for them).

Contributor Author

Ah, great point. We never really need to iterate over it, so that makes perfect sense.
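A rough sketch of the linked-list idea, built on the standard library's `container/list`. It is not code from this PR: `waitQueue`, its methods, and the per-peer index of list elements are assumptions added for illustration. The index is what lets all of a peer's jobs be unlinked without walking the queue.

```go
package main

import (
	"container/list"
	"fmt"
)

// dialJob is a simplified stand-in for the real job type.
type dialJob struct {
	peer string
	addr string
}

// waitQueue (hypothetical) is a FIFO of jobs that also remembers, per peer,
// which list elements belong to that peer.
type waitQueue struct {
	jobs   *list.List
	byPeer map[string]map[*list.Element]struct{}
}

func newWaitQueue() *waitQueue {
	return &waitQueue{
		jobs:   list.New(),
		byPeer: make(map[string]map[*list.Element]struct{}),
	}
}

// push appends a job and records its list element under the job's peer.
func (q *waitQueue) push(j *dialJob) {
	e := q.jobs.PushBack(j)
	set := q.byPeer[j.peer]
	if set == nil {
		set = make(map[*list.Element]struct{})
		q.byPeer[j.peer] = set
	}
	set[e] = struct{}{}
}

// pop removes and returns the oldest job, or nil if the queue is empty.
func (q *waitQueue) pop() *dialJob {
	e := q.jobs.Front()
	if e == nil {
		return nil
	}
	j := q.jobs.Remove(e).(*dialJob)
	set := q.byPeer[j.peer]
	delete(set, e)
	if len(set) == 0 {
		delete(q.byPeer, j.peer)
	}
	return j
}

// removePeer unlinks every queued job for p and reports how many were
// removed. Each unlink is O(1), so the cost depends only on p's own jobs.
func (q *waitQueue) removePeer(p string) int {
	n := 0
	for e := range q.byPeer[p] {
		q.jobs.Remove(e)
		n++
	}
	delete(q.byPeer, p)
	return n
}

func main() {
	q := newWaitQueue()
	q.push(&dialJob{peer: "A", addr: "a1"})
	q.push(&dialJob{peer: "B", addr: "b1"})
	q.push(&dialJob{peer: "A", addr: "a2"})

	removed := q.removePeer("A")
	next := q.pop()
	fmt.Println(removed, next.addr, q.jobs.Len())
}
```

The returned count could be used to adjust a per-peer active counter in the same step, so cleanup after a finished dial no longer depends on the total queue length.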

@whyrusleeping
Contributor Author

So the final thing I need to do here is add metrics, as @lgierth requested.

@whyrusleeping
Contributor Author

Alright, so it turns out we don't have things set up yet to nicely add metrics from go-libp2p. I'm thinking that we should push to get this merged sooner, and work out the metrics later.

@ghost

ghost commented Jan 4, 2018

> I'm thinking that we should push to get this merged sooner, and work out the metrics later

Agreed

@whyrusleeping whyrusleeping merged commit f7c26f4 into master Jan 4, 2018
@ghost ghost removed the status/in-progress In progress label Jan 4, 2018
@whyrusleeping whyrusleeping deleted the fix/clear-cancelled-dials branch January 4, 2018 17:22
3 participants