ts2phc: check for errors on polling the sink devices
The ts2phc_pps_sink_poll() function polls on the sink clock devices to
capture PPS events. It attempts to poll until every sink has at least one
event.

The function does not check POLLERR. If one of the sink clocks has an error
while polling, the error is ignored, and the ts2phc_pps_sink_poll() function
may iterate in an infinite loop. poll() will be called repeatedly, each time
reporting POLLERR on the descriptor for the bad clock.

The loop will never terminate, because the sink with a bad clock will never
get a sink event, and all_sinks_have_events will never be true.
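
The spin described above can be reproduced outside of ts2phc. The following
is a minimal sketch, not linuxptp code: both helper names are hypothetical,
and the write end of a pipe whose read end has been closed stands in for a
removed PTP clock (an assumption that relies on Linux pipe semantics). The
loop is bounded by max_iters so that the demonstration terminates.

```c
#include <poll.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of linuxptp): poll one descriptor up to
 * max_iters times and count the iterations in which poll() reports POLLERR
 * without POLLIN or POLLPRI. A loop that only waits for POLLIN|POLLPRI
 * makes no progress on any of those iterations.
 */
static int count_error_only_wakeups(int fd, int max_iters, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN | POLLPRI };
	int i, stuck = 0;

	for (i = 0; i < max_iters; i++) {
		pfd.revents = 0;
		if (poll(&pfd, 1, timeout_ms) < 0)
			return -1;
		if ((pfd.revents & POLLERR) &&
		    !(pfd.revents & (POLLIN | POLLPRI)))
			stuck++;
	}
	return stuck;
}

/*
 * Stand-in for a removed PTP clock (assumption: Linux pipe semantics).
 * The write end of a pipe with no remaining reader reports POLLERR.
 */
static int make_failing_fd(void)
{
	int fds[2];

	if (pipe(fds) < 0)
		return -1;
	close(fds[0]);
	return fds[1];
}
```

On Linux, every bounded iteration should come back immediately with only
POLLERR set, rather than waiting for the timeout. Without the bound, a loop
waiting for POLLIN|POLLPRI would spin in the same way.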

This is relatively easy to trigger by removing the PTP clock associated
with a running instance of ts2phc, for example by removing the driver of
the associated networking device.

You can see the poll() behavior via strace:

  poll([{fd=4, events=POLLIN|POLLPRI}], 1, 2000) = 1 ([{fd=4, revents=POLLERR}])
  poll([{fd=4, events=POLLIN|POLLPRI}], 1, 2000) = 1 ([{fd=4, revents=POLLERR}])
  poll([{fd=4, events=POLLIN|POLLPRI}], 1, 2000) = 1 ([{fd=4, revents=POLLERR}])
  poll([{fd=4, events=POLLIN|POLLPRI}], 1, 2000) = 1 ([{fd=4, revents=POLLERR}])
  poll([{fd=4, events=POLLIN|POLLPRI}], 1, 2000) = 1 ([{fd=4, revents=POLLERR}])
  poll([{fd=4, events=POLLIN|POLLPRI}], 1, 2000) = 1 ([{fd=4, revents=POLLERR}])
  poll([{fd=4, events=POLLIN|POLLPRI}], 1, 2000) = 1 ([{fd=4, revents=POLLERR}])
  poll([{fd=4, events=POLLIN|POLLPRI}], 1, 2000) = 1 ([{fd=4, revents=POLLERR}])
  poll([{fd=4, events=POLLIN|POLLPRI}], 1, 2000) = 1 ([{fd=4, revents=POLLERR}])
  poll([{fd=4, events=POLLIN|POLLPRI}], 1, 2000) = 1 ([{fd=4, revents=POLLERR}])
  poll([{fd=4, events=POLLIN|POLLPRI}], 1, 2000) = 1 ([{fd=4, revents=POLLERR}])
  poll([{fd=4, events=POLLIN|POLLPRI}], 1, 2000) = 1 ([{fd=4, revents=POLLERR}])

The ts2phc_pps_sink_poll() function repeatedly calls poll() and never
exits.

Worse, the signals SIGINT, SIGQUIT, SIGTERM, and SIGHUP all fail to stop
the program. These signals are all handled via handle_int_quit_term(),
which sets the global 'running' variable to 0. This is checked via
is_running(), but only in the main() function in ts2phc.c. Because the
ts2phc_pps_sink_poll() function does not exit, the is_running() check is
never reached.
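
To illustrate why the signal has no effect, here is a minimal sketch. It
reuses the names handle_int_quit_term(), is_running() and 'running' from the
description above, but their bodies are assumptions, not the ts2phc
implementation. The spin() function is a hypothetical stand-in for a loop
whose own exit condition never becomes true.

```c
#include <signal.h>

/* Assumed shape of the flag described above; the real code may differ. */
static volatile sig_atomic_t running = 1;

static void handle_int_quit_term(int sig)
{
	(void)sig;
	running = 0;
}

static int is_running(void)
{
	return running;
}

/*
 * Hypothetical stand-in for a loop that never sees its exit condition.
 * Returns the number of iterations executed. With check_flag == 0 the
 * signal flag is never consulted, so only the max_iters bound stops it.
 */
static int spin(int check_flag, int max_iters)
{
	int n = 0;

	while (n < max_iters) {
		if (check_flag && !is_running())
			break;
		n++;
	}
	return n;
}
```

Once the handler has run, spin(1, ...) stops immediately, while spin(0, ...)
runs to its bound. An unbounded loop that never consults the flag can only
be stopped by SIGKILL.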

Thus, if a user removes the PTP clock, they will be unable to kill the
ts2phc program via usual means and must resort to a SIGKILL.

If one of the configured clocks is no longer accessible, ts2phc should stop
attempting to synchronize it. In most other cases where a clock operation
fails unexpectedly, ts2phc reports an error message and exits.

Add a check for the POLLERR return event when iterating over the sink
clocks. When an error is detected, log an error message and exit. Note that
unlike sockets, there is no equivalent to sk_get_error() to determine the
specific cause of the polling error.
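
The check can be sketched in isolation as follows. This is a hypothetical
helper operating on a plain array of struct pollfd, not the linuxptp code:
the name check_poll_results() and the use of fprintf() in place of pr_err()
are assumptions made to keep the example self-contained.

```c
#include <errno.h>
#include <poll.h>
#include <stdio.h>

/*
 * Hypothetical helper (not the linuxptp code): inspect the results of a
 * completed poll() call. Returns -EIO as soon as any descriptor reports
 * POLLERR, otherwise the number of descriptors with POLLIN or POLLPRI set.
 */
static int check_poll_results(const struct pollfd *pfd, int n)
{
	int i, ready = 0;

	for (i = 0; i < n; i++) {
		/* Test for the error first so a bad descriptor is never skipped. */
		if (pfd[i].revents & POLLERR) {
			fprintf(stderr, "error polling on pfd[%d]\n", i);
			return -EIO;
		}
		if (pfd[i].revents & (POLLIN | POLLPRI))
			ready++;
	}
	return ready;
}
```

Returning a negative errno value lets the caller distinguish a failed
descriptor from "no events yet" and stop instead of polling again.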

Reported-by: Alexander Nowlin <Alexander.Nowlin@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
jacob-keller authored and richardcochran committed Sep 6, 2024
1 parent 0f8ff22 commit f848316
Showing 1 changed file with 8 additions and 0 deletions.
8 changes: 8 additions & 0 deletions ts2phc_pps_sink.c
@@ -406,6 +406,14 @@ int ts2phc_pps_sink_poll(struct ts2phc_private *priv)
 	}
 
 	for (i = 0; i < priv->n_sinks; i++) {
+		if (polling_array->pfd[i].revents & POLLERR) {
+			sink = polling_array->sink[i];
+
+			pr_err("%s: error polling on pfd[%d]\n",
+			       sink->name, i);
+			return -EIO;
+		}
+
 		if (polling_array->pfd[i].revents & (POLLIN|POLLPRI)) {
 			enum extts_result result;
 