Commit
ts2phc: check for errors on polling the sink devices
The ts2phc_pps_sink_poll() function polls the sink clock devices to capture PPS events. It keeps polling until every sink has at least one event.

The function does not check POLLERR. If one of the sink clocks has an error while polling, the error is ignored, and ts2phc_pps_sink_poll() may iterate in an infinite loop. poll() is called repeatedly, reporting POLLERR on the descriptor for the bad clock each time. The loop never terminates, because the sink with the bad clock never gets a sink event, so all_sinks_have_events is never true.

This is relatively easy to trigger by removing the associated PTP clock of a running instance of ts2phc, for example by removing the driver of the associated networking device.

You can see the poll() behavior via strace:

  poll([{fd=4, events=POLLIN|POLLPRI}], 1, 2000) = 1 ([{fd=4, revents=POLLERR}])
  poll([{fd=4, events=POLLIN|POLLPRI}], 1, 2000) = 1 ([{fd=4, revents=POLLERR}])
  poll([{fd=4, events=POLLIN|POLLPRI}], 1, 2000) = 1 ([{fd=4, revents=POLLERR}])
  poll([{fd=4, events=POLLIN|POLLPRI}], 1, 2000) = 1 ([{fd=4, revents=POLLERR}])
  poll([{fd=4, events=POLLIN|POLLPRI}], 1, 2000) = 1 ([{fd=4, revents=POLLERR}])
  poll([{fd=4, events=POLLIN|POLLPRI}], 1, 2000) = 1 ([{fd=4, revents=POLLERR}])
  poll([{fd=4, events=POLLIN|POLLPRI}], 1, 2000) = 1 ([{fd=4, revents=POLLERR}])
  poll([{fd=4, events=POLLIN|POLLPRI}], 1, 2000) = 1 ([{fd=4, revents=POLLERR}])
  poll([{fd=4, events=POLLIN|POLLPRI}], 1, 2000) = 1 ([{fd=4, revents=POLLERR}])
  poll([{fd=4, events=POLLIN|POLLPRI}], 1, 2000) = 1 ([{fd=4, revents=POLLERR}])
  poll([{fd=4, events=POLLIN|POLLPRI}], 1, 2000) = 1 ([{fd=4, revents=POLLERR}])
  poll([{fd=4, events=POLLIN|POLLPRI}], 1, 2000) = 1 ([{fd=4, revents=POLLERR}])

The ts2phc_pps_sink_poll() function repeatedly calls poll() and never exits. Worse, the signals SIGINT, SIGQUIT, SIGTERM, and SIGHUP all fail to stop the program.
These signals are all handled via handle_int_quit_term(), which sets the global 'running' variable to 0. This is checked via is_running(), but only in the main() function in ts2phc.c. Because the ts2phc_pps_sink_poll() function does not exit, the is_running() check is never reached.

Thus, if a user removes the PTP clock, they will be unable to stop the ts2phc program via the usual means and must resort to SIGKILL.

If one of the configured clocks is no longer accessible, ts2phc should stop attempting to synchronize it. In most other cases where a clock operation fails unexpectedly, ts2phc reports an error message and exits.

Add a check for the POLLERR return event when iterating over the sink clocks. When an error is detected, log an error message and exit. Note that unlike sockets, there is no equivalent to sk_get_error() to determine the specific cause of the polling error.

Reported-by: Alexander Nowlin <Alexander.Nowlin@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
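
For illustration only, the kind of check described in this message might look roughly like the sketch below. This is not the actual ts2phc change: poll_sinks_once() and make_errored_fd() are hypothetical names, and the return convention (1 when every descriptor has an event, 0 when the caller should poll again, -1 on error) is an assumption made for the example. The stand-in for a removed PTP clock relies on the Linux behavior that poll() sets POLLERR on the write end of a pipe whose read end has been closed.

```c
#include <errno.h>
#include <poll.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper, not the linuxptp code: poll all descriptors once.
 * Returns 1 if every descriptor reported an event, 0 if the caller
 * should poll again (timeout, EINTR, or only some events so far),
 * and -1 if poll() failed or any descriptor reported POLLERR.
 */
int poll_sinks_once(struct pollfd *fds, nfds_t n, int timeout_ms)
{
	nfds_t i, ready = 0;
	int cnt = poll(fds, n, timeout_ms);

	if (cnt < 0) {
		if (errno == EINTR)
			return 0; /* let the caller re-check its running flag */
		perror("poll");
		return -1;
	}
	for (i = 0; i < n; i++) {
		/* Without this check a bad descriptor makes the caller spin. */
		if (fds[i].revents & POLLERR) {
			fprintf(stderr, "sink %lu: error while polling\n",
				(unsigned long)i);
			return -1;
		}
		if (fds[i].revents & (POLLIN | POLLPRI))
			ready++;
	}
	return ready == n ? 1 : 0;
}

/*
 * Stand-in for a removed PTP clock (assumes Linux): the write end of
 * a pipe whose read end is closed reports POLLERR when polled.
 */
int make_errored_fd(void)
{
	int p[2];

	if (pipe(p) < 0)
		return -1;
	close(p[0]);
	return p[1];
}
```

A caller that loops on poll_sinks_once() would stop and report failure as soon as it returns -1, instead of polling again on a descriptor that can never deliver an event.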