Reset DefaultPinger to reconnect to server #228

Closed


minyukim
Contributor

closes #227

@@ -563,8 +571,10 @@ func (c *Client) close() {
close(c.stop)

c.debug.Println("client stopped")
c.config.PingHandler.Stop()
c.debug.Println("ping stopped")
if _, ok := e.(*pingerError); !ok {
Contributor


Consider what could happen if the connection drops when a ping is scheduled:

  1. c.error called by something else
  2. pinger exits with "failed to send PINGREQ" (having called reset)
  3. c.error call from step 1 runs and stops the refreshed pinger (meaning that all future calls to Run will fail).

I realise that the above sequence of events is pretty unlikely, but I believe it's possible (and it would be pretty hard to trace!). I think it's preferable for the reset to happen in Run, meaning Run will work for a new DefaultPinger or following a clean shutdown (the user must always wait for Run to terminate before calling it again). This should be documented in the interface to make it clear that the pinger is reusable.
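
A minimal sketch of the reset-in-Run idea, assuming a heavily simplified pinger. `reusablePinger`, `runAndStop` and the parameterless `Run` are hypothetical names used only for illustration; they are not the paho.golang API, and the real DefaultPinger tracks more state than this:

```go
package main

import (
	"errors"
	"fmt"
	"runtime"
	"sync"
)

// reusablePinger is a hypothetical, simplified stand-in for DefaultPinger.
type reusablePinger struct {
	mu      sync.Mutex
	running bool
	stop    chan struct{}
}

// Run re-creates its per-run state on entry, so it works both on a fresh
// pinger and after a previous Run has returned. It blocks until Stop is called.
func (p *reusablePinger) Run() error {
	p.mu.Lock()
	if p.running {
		p.mu.Unlock()
		return errors.New("Run called while already running")
	}
	p.running = true
	stop := make(chan struct{}) // the "reset": fresh state for this run
	p.stop = stop
	p.mu.Unlock()

	<-stop // a real pinger would send PINGREQs here

	p.mu.Lock()
	p.running = false
	p.mu.Unlock()
	return nil
}

// Stop ends the current run, if any. It is safe to call more than once.
func (p *reusablePinger) Stop() {
	p.mu.Lock()
	defer p.mu.Unlock()
	if p.stop != nil {
		close(p.stop)
		p.stop = nil
	}
}

// runAndStop starts Run in a goroutine and keeps calling Stop until Run
// returns (a Stop issued before Run has started is a no-op in this sketch).
func runAndStop(p *reusablePinger) error {
	done := make(chan error, 1)
	go func() { done <- p.Run() }()
	for {
		p.Stop()
		select {
		case err := <-done:
			return err
		default:
			runtime.Gosched()
		}
	}
}

func main() {
	p := &reusablePinger{}
	// The same pinger is run twice; no separate Reset call is needed.
	fmt.Println(runAndStop(p), runAndStop(p))
}
```

Because the fresh state is created under the same lock that Stop takes, a stale Stop can only affect the run that was active when it was called. Note that this sketch does nothing useful if Stop arrives before Run, which is a separate concern.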

Contributor Author


Yes I think that scenario is possible too. I'll think about it more.

Contributor Author


@MattBrittan What about adding Reset() to the Pinger interface, like this:

type Pinger interface {
	// Run() starts the pinger. It blocks until the pinger is stopped.
	// If the pinger stops due to an error, it returns the error.
	// If the keepAlive is 0, it returns nil immediately.
	// Run() must be called only once.
	Run(conn net.Conn, keepAlive uint16) error

	// Stop() gracefully stops the pinger.
	Stop()

	// Reset() resets the pinger to be reusable.
	Reset()

	// PacketSent() is called when a packet is sent to the server.
	PacketSent()

	// PingResp() is called when a PINGRESP is received from the server.
	PingResp()

	// SetDebug() sets the logger for debugging.
	// It is not thread-safe and must be called before Run() to avoid race conditions.
	SetDebug(log.Logger)
}

and calling it right after closing the client?

func (c *Client) close() {
	c.mu.Lock()
	defer c.mu.Unlock()

	defer c.config.PingHandler.Reset()

	select {
	case <-c.stop:
		// already shutting down, return when shutdown complete
		<-c.done
		return
	default:
	}

	close(c.stop)

	c.debug.Println("client stopped")
	c.config.PingHandler.Stop()
	c.debug.Println("ping stopped")
	_ = c.config.Conn.Close()
	c.debug.Println("conn closed")
	c.acksTracker.reset()
	c.debug.Println("acks tracker reset")
	c.config.Session.ConnectionLost(nil)
	if c.config.autoCloseSession {
		if err := c.config.Session.Close(); err != nil {
			c.errors.Println("error closing session", err)
		}
	}
	c.debug.Println("session updated, waiting on workers")
	c.workers.Wait()
	c.debug.Println("workers done")
	close(c.done)
}

Contributor


The problem I see with this is "what should the Pinger do if Reset() is called whilst the pinger is running". We may just say "this can't happen" (probably a mistake!) but then there is no real benefit to having a separate function (you can effectively call Reset from within Run). If we agree that Run could conceivably be called before Stop completes then the question becomes "what can you do about it" (the only thing I can come up with is to stop the old one and log a message).

As such I don't think there is much value in adding Reset and it's probably simplest/safest if Run:

  • Checks if another Run is active and, if so:
    • Stop it (the old Run should return an error so it will be logged etc). The new Run may need to wait for the old Run to terminate.
  • Reset things
  • Start the new process
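
The three steps above could be sketched roughly as follows. This is an illustration under assumptions, not the library's implementation: `restartablePinger`, `run`, `errSuperseded`, `registered` and `supersede` are hypothetical names, and the ping loop itself is replaced by a blocking receive.

```go
package main

import (
	"errors"
	"fmt"
	"runtime"
	"sync"
)

var errSuperseded = errors.New("pinger stopped: Run was called again")

// run holds the state of a single Run call (hypothetical type).
type run struct {
	stop chan error    // carries the reason the run should end (buffer of 1)
	done chan struct{} // closed once the run has fully terminated
}

// end asks the run to finish; it never blocks.
func (r *run) end(reason error) {
	select {
	case r.stop <- reason:
	default: // a stop request is already pending
	}
}

type restartablePinger struct {
	mu      sync.Mutex
	current *run
}

func (p *restartablePinger) Run() error {
	p.mu.Lock()
	if old := p.current; old != nil {
		old.end(errSuperseded) // stop any active run (it returns an error)...
		<-old.done             // ...and wait for it to terminate
	}
	r := &run{stop: make(chan error, 1), done: make(chan struct{})} // reset
	p.current = r
	p.mu.Unlock()

	defer close(r.done)
	return <-r.stop // start the new process (a real pinger would ping here)
}

// Stop ends the most recent run cleanly; Run then returns nil.
func (p *restartablePinger) Stop() {
	p.mu.Lock()
	defer p.mu.Unlock()
	if p.current != nil {
		p.current.end(nil)
	}
}

// registered reports whether Run has been entered at least once.
func (p *restartablePinger) registered() bool {
	p.mu.Lock()
	defer p.mu.Unlock()
	return p.current != nil
}

// supersede calls Run a second time while the first is still active and
// returns what each call reported.
func supersede(p *restartablePinger) (first, second error) {
	firstDone := make(chan error, 1)
	go func() { firstDone <- p.Run() }()
	for !p.registered() {
		runtime.Gosched()
	}
	secondDone := make(chan error, 1)
	go func() { secondDone <- p.Run() }()
	first = <-firstDone // returns only once the second Run has stopped it
	p.Stop()
	second = <-secondDone
	return first, second
}

func main() {
	first, second := supersede(&restartablePinger{})
	fmt.Println("first run:", first)
	fmt.Println("second run:", second)
}
```

In this sketch the superseded Run returns an error, so the caller can log it, while the replacement Run starts from fresh state and ends cleanly on Stop.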

Contributor


Sorry - ran out of time today so will try to have a look tomorrow.

Contributor Author


@MattBrittan No problem.
Are we 100% sure that Run() will always be called before Stop()?
Consider this scenario:

  1. c.close() is called right after c.Connect()
  2. Stop() is called before Run() is called in the goroutine

I think this is possible because c.Connect() doesn't wait for Run() to be called, although it will almost never happen.
To be 100% sure, I guess we need to split Run() into something like Start() and Wait(), as below, so that c.Connect() can wait for the PingHandler's Start().

// client.go
func (c *Client) Connect(ctx context.Context, cp *Connect) (*Connack, error) {
	...
	// pingerWait is closed once PingHandler.Start has returned.
	pingerWait := make(chan struct{})
	c.debug.Println("received CONNACK, starting PingHandler")
	c.workers.Add(1)
	go func() {
		defer c.workers.Done()
		defer c.debug.Println("returning from ping handler worker")
		if err := c.config.PingHandler.Start(c.config.Conn, keepalive); err != nil {
			...
		}
		close(pingerWait)
		if err := c.config.PingHandler.Wait(); err != nil {
			go c.error(fmt.Errorf("ping handler error: %w", err))
		}
	}()
	...

	// Do not return until the pinger has started.
	<-pingerWait

	return ca, nil
}

// pinger.go
func (p *DefaultPinger) Start(conn net.Conn, keepAlive uint16) error {
	...	
}
func (p *DefaultPinger) Wait() error {
	return <-p.errChan
}

Contributor


Yep, I had been wondering if it would make more sense to pass Run a Context and use that for termination (was going to mock this up today but ran out of time).
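
A rough sketch of what a context-driven Run might look like. Everything here is an assumption made for illustration: `ctxPinger`, the `sendPing` callback (standing in for writing a PINGREQ to the connection) and the two helper functions are hypothetical, and the signature is not taken from the library.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// ctxPinger is a hypothetical pinger whose lifetime is tied to a Context.
type ctxPinger struct{}

// Run blocks until ctx is cancelled (returning nil) or sendPing fails
// (returning the wrapped error). An interval of 0 returns nil immediately.
func (ctxPinger) Run(ctx context.Context, interval time.Duration, sendPing func() error) error {
	if interval <= 0 {
		return nil
	}
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return nil // cancelled by the caller: a clean shutdown
		case <-ticker.C:
			if err := sendPing(); err != nil {
				return fmt.Errorf("failed to send PINGREQ: %w", err)
			}
		}
	}
}

// cancelBeforeRun models the "Stop() before Run()" ordering: the context is
// already cancelled when Run starts, so Run still returns promptly.
func cancelBeforeRun() error {
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	return ctxPinger{}.Run(ctx, time.Hour, func() error { return nil })
}

// failingPing models a dropped connection: the first ping attempt fails.
func failingPing() error {
	fail := func() error { return errors.New("connection closed") }
	return ctxPinger{}.Run(context.Background(), time.Millisecond, fail)
}

func main() {
	fmt.Println("cancel before run:", cancelBeforeRun())
	fmt.Println("failing ping:", failingPing())
}
```

With this shape there is no Stop or Reset to order against Run: each call gets its own Context, so cancelling one run cannot affect a later one, and a cancellation that happens before Run starts is still observed.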

Contributor Author

PR #229 dealt with the same issue, so I'm closing this.

minyukim closed this Jan 18, 2024
minyukim deleted the bugfix/reset-defaultpinger branch January 18, 2024 23:49
Successfully merging this pull request may close these issues.

Autopaho set DefaultPinger on ClientConfig can't reconnect to server after disconnection