select() blocks the FreeRTOS scheduler on Linux target (IDFGH-13498) #14395

Closed
3 tasks done
snake-4 opened this issue Aug 18, 2024 · 2 comments
Labels
Resolution: NA (Issue resolution is unavailable) · Status: Done (Issue is done internally) · Type: Bug (bugs in IDF)

Comments

@snake-4
Contributor

snake-4 commented Aug 18, 2024

Answers checklist.

  • I have read the documentation ESP-IDF Programming Guide and the issue is not addressed there.
  • I have updated my IDF branch (master or release) to the latest version and checked that the issue is present there.
  • I have searched the issue tracker for a similar issue and not found a similar issue.

IDF version.

v5.4-dev-2004-g8e4454b285

Espressif SoC revision.

Linux

Operating System used.

Linux

How did you build your project?

Command line with idf.py

If you are using Windows, please specify command line type.

None

What is the expected behavior?

The scheduler should continue executing other tasks as expected while the MQTT client is connected to a server.

What is the actual behavior?

The scheduler no longer switches between tasks when the MQTT client establishes a connection.

Steps to reproduce.

  • main/app_main.c
#include <esp_event.h>
#include <esp_log.h>
#include <esp_netif.h>
#include <freertos/FreeRTOS.h>
#include <freertos/timers.h>
#include <mqtt_client.h>
#include <Mockesp_timer.h>
#include <sys/time.h>

// Stub for the mocked esp_timer_get_time(): returns host wall-clock time in microseconds.
static int64_t mock_esp_timer_get_time(int cmock_num_calls) {
    (void)cmock_num_calls;
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return (int64_t)tv.tv_sec * 1000000 + tv.tv_usec;
}

// Logs on every timer period; the log stops appearing once the scheduler is blocked.
static void timer_cb(TimerHandle_t xTimer) { ESP_LOGI("timer_cb", "called"); }

void app_main(void) {
    esp_timer_get_time_StubWithCallback(mock_esp_timer_get_time);
    ESP_ERROR_CHECK(esp_netif_init());
    ESP_ERROR_CHECK(esp_event_loop_create_default());

    // Auto-reloading 500 ms software timer, serviced by the FreeRTOS timer task.
    xTimerStart(xTimerCreate("Timer", pdMS_TO_TICKS(500), pdTRUE, NULL, timer_cb), 0);
    // Let the timer fire a few times before the MQTT client starts.
    vTaskDelay(pdMS_TO_TICKS(2000));

    esp_mqtt_client_config_t cfg = {0};
    cfg.broker.address.uri = "mqtt://test.mosquitto.org";
    cfg.credentials.client_id = "espidf-bug-repro-1";
    esp_mqtt_client_start(esp_mqtt_client_init(&cfg));
}
  • main/CMakeLists.txt
idf_component_register(SRCS "app_main.c" REQUIRES "esp_event" "esp_netif" "esp_timer" "mqtt")
  • CMakeLists.txt
cmake_minimum_required(VERSION 3.16)
list(APPEND EXTRA_COMPONENT_DIRS "$ENV{IDF_PATH}/tools/mocks/esp_timer/")
set(COMPONENTS main)
include($ENV{IDF_PATH}/tools/cmake/project.cmake)
project(firmware)

When built for the Linux target, the above code blocks the FreeRTOS scheduler both while the MQTT client attempts to establish a connection and while it maintains that connection.

Debug Logs.

No response

More Information.

No response

@snake-4 snake-4 added the "Type: Bug" (bugs in IDF) label Aug 18, 2024
@github-actions github-actions bot changed the title from "ESP-MQTT blocks the FreeRTOS scheduler on Linux target" to "ESP-MQTT blocks the FreeRTOS scheduler on Linux target (IDFGH-13498)" Aug 18, 2024
@espressif-bot espressif-bot added the "Status: Opened" (Issue is new) label Aug 18, 2024
@snake-4
Contributor Author

snake-4 commented Aug 19, 2024

I've tracked it down to this line:

while ((ret = real_select(fd, rfds, wfds, efds, tv)) < 0 && errno == EINTR) {

The MQTT task seems to call esp_transport_poll_read in a loop, which ends up calling select() continuously. In this scenario the FreeRTOS scheduler is somehow unable to schedule other tasks.
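For readers unfamiliar with the quoted line, it is the usual "restart on EINTR" idiom around a blocking select(). Below is a minimal standalone sketch of that idiom in plain POSIX C. The helper name wait_readable_retry_eintr is hypothetical and this is not the ESP-IDF wrapper itself; it only illustrates that the calling host thread sleeps inside the kernel for up to the given timeout.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>

/* Hypothetical helper (not ESP-IDF code): wait until fd is readable,
 * restarting select() whenever a signal interrupts it. */
int wait_readable_retry_eintr(int fd, struct timeval *tv)
{
    int ret;
    do {
        /* Rebuild the set on every attempt: its contents are
         * unspecified after select() fails. */
        fd_set rfds;
        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        /* Blocks the calling host thread for up to *tv (forever if tv is NULL). */
        ret = select(fd + 1, &rfds, NULL, NULL, tv);
    } while (ret < 0 && errno == EINTR);
    return ret; /* > 0: readable, 0: timed out, < 0: error (see errno) */
}
```

While the thread is blocked in this call, nothing tells FreeRTOS that the task is idle, which matches the behavior described above.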

@snake-4
Contributor Author

snake-4 commented Aug 28, 2024

Here's what I've found so far:
CONFIG_FREERTOS_TIMER_TASK_PRIORITY is 1 by default
whereas CONFIG_MQTT_TASK_PRIORITY is 5.

When the code is compiled for the embedded targets, lwIP's FreeRTOS port correctly uses FreeRTOS to implement the timeout, so the task sleeps and the scheduler runs the lower-priority task.

However, when Linux's select syscall is used instead, the FreeRTOS scheduler doesn't know that the task is supposed to be sleeping, so it keeps scheduling the higher-priority task.

This problem will happen with every slow syscall on the Linux target. A simple solution would be to call the actual select without a blocking timeout and use vTaskDelay to simulate the timeout.
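The proposed approach could look roughly like the sketch below. It is an illustration only, not the code from the referenced commit: the names poll_readable and delay_ms are hypothetical, and delay_ms uses nanosleep as a stand-in for vTaskDelay(pdMS_TO_TICKS(ms)) so that the example builds without FreeRTOS. In a FreeRTOS task the delay would go through the scheduler, which is what lets lower-priority tasks run between polls.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <time.h>

/* Stand-in for vTaskDelay(pdMS_TO_TICKS(ms)); assumption for this sketch only. */
static void delay_ms(long ms)
{
    struct timespec ts = { ms / 1000, (ms % 1000) * 1000000L };
    nanosleep(&ts, NULL);
}

/* Hypothetical helper: poll fd for readability with a zero select() timeout,
 * sleeping step_ms between polls, until roughly timeout_ms has elapsed. */
int poll_readable(int fd, long timeout_ms, long step_ms)
{
    long waited_ms = 0;
    if (step_ms <= 0) {
        step_ms = 1; /* avoid a busy loop */
    }
    for (;;) {
        fd_set rfds;
        struct timeval zero = { 0, 0 }; /* never block inside the syscall */
        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        int ret = select(fd + 1, &rfds, NULL, NULL, &zero);
        if (ret > 0) {
            return ret;  /* fd is readable */
        }
        if (ret < 0 && errno != EINTR) {
            return ret;  /* real error, see errno */
        }
        if (waited_ms >= timeout_ms) {
            return 0;    /* simulated timeout expired */
        }
        delay_ms(step_ms); /* the task would yield to the scheduler here */
        waited_ms += step_ms;
    }
}
```

The trade-off in this sketch is latency: readiness is only noticed at the next poll, so step_ms bounds how late the caller can wake up.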

@snake-4 snake-4 changed the title from "ESP-MQTT blocks the FreeRTOS scheduler on Linux target (IDFGH-13498)" to "select() blocks the FreeRTOS scheduler on Linux target (IDFGH-13498)" Aug 28, 2024
snake-4 added a commit to snake-4/esp-idf that referenced this issue Aug 28, 2024
The select function wrapper was rewritten to be non-blocking on Linux systems, as it was stealing all the CPU time from lower priority tasks when called from a higher priority task. This is because the FreeRTOS scheduler does not know that the task thread is sleeping during the system call.

This issue manifests with all "slow" system calls on the Linux target, but handling the case of select fixes the problem for most ESP-IDF components.

The FreeRTOS POSIX port documentation lists this as a known issue, so user code is responsible for handling this case if other system calls are used, even if unknowingly.

This closes espressif#14395 "select() blocks the FreeRTOS scheduler on Linux target" (IDFGH-13498).
snake-4 added a commit to snake-4/esp-idf that referenced this issue Sep 4, 2024
@espressif-bot espressif-bot added the "Status: Done" (Issue is done internally) and "Resolution: NA" (Issue resolution is unavailable) labels and removed the "Status: Opened" (Issue is new) label Sep 12, 2024
@snake-4 snake-4 closed this as completed Sep 15, 2024