Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

esp-mqtt: With TLS enabled large messages cause memory corruption on ESP32-C3 (IDFGH-10684) #11910

Closed
3 tasks done
mistoll opened this issue Jul 19, 2023 · 6 comments
Closed
3 tasks done
Assignees
Labels
Resolution: Won't Do This will not be worked on Status: Done Issue is done internally Type: Bug bugs in IDF

Comments

@mistoll
Copy link

mistoll commented Jul 19, 2023

Answers checklist.

  • I have read the documentation ESP-IDF Programming Guide and the issue is not addressed there.
  • I have updated my IDF branch (master or release) to the latest version and checked that the issue is present there.
  • I have searched the issue tracker for a similar issue and not found a similar issue.

IDF version.

v5.1

Operating System used.

Windows

How did you build your project?

Command line with idf.py

If you are using Windows, please specify command line type.

PowerShell

Development Kit.

ESP32-C3-WROOM-02

Power Supply used.

Battery

What is the expected behavior?

When a large message is received I expect it to be dispatched to mqtt_event_handler in chunks. This is the way it behaves when TLS is disabled.

What is the actual behavior?

When a large message (larger than 16330bytes) is received the memory get corrupted. If the message is large enough a panic is raised.

Steps to reproduce.

  1. Setup an mqtt broker with TLS enabled
  2. Run an ESP32C3 with this program (change url, username, password and topic for your needs, app_wifi_init() is a generic wifi connect function.
#include <freertos/FreeRTOS.h>
#include <freertos/task.h>
#include <nvs_flash.h>
#include <sys/time.h>
#include <esp_crt_bundle.h>
#include <mqtt_client.h>
#include "esp_check.h"
#include "esp_log.h"
#include "app_wifi.h"

static const char* TAG = "main";

static struct {
  esp_mqtt_client_handle_t client;
} app_mqtt_ = {};

static void mqtt_event_handler(void *handler_args, esp_event_base_t base, int32_t event_id, void *event_data)
{
  ESP_LOGD(TAG, "Event dispatched from event loop base=%s, event_id=%ld", base, event_id);
  esp_mqtt_event_handle_t event = event_data;
  switch ((esp_mqtt_event_id_t)event_id) {
  case MQTT_EVENT_CONNECTED:
    ESP_LOGI(TAG, "MQTT_EVENT_CONNECTED");

    // subscribe
    if (-1 == esp_mqtt_client_subscribe(app_mqtt_.client, "topic", 1))
      ESP_LOGE(TAG, "Failed to subscibe to mqtt topic (image)");
    break;
  case MQTT_EVENT_DATA:
    ESP_LOGI(TAG, "MQTT_EVENT_DATA");
    break;
  case MQTT_EVENT_ERROR:
    ESP_LOGW(TAG, "MQTT_EVENT_ERROR");
    if (event->error_handle->error_type == MQTT_ERROR_TYPE_TCP_TRANSPORT) {
        ESP_LOGI(TAG, "reported from esp-tls 0x%x", event->error_handle->esp_tls_last_esp_err);
        ESP_LOGI(TAG, "reported from tls stack 0x%x", event->error_handle->esp_tls_stack_err);
        ESP_LOGI(TAG, "captured as transport's socket errno 0x%x",  event->error_handle->esp_transport_sock_errno);
        ESP_LOGI(TAG, "Last errno string (%s)", strerror(event->error_handle->esp_transport_sock_errno));
    }
    break;
  default:
    break;
  }
}

esp_err_t mqtt_install()
{
  esp_err_t ret = ESP_OK;
  esp_mqtt_client_handle_t client = NULL;

  // start client
  esp_mqtt_client_config_t mqtt_cfg = {
    .broker.address.uri = "mqtts://mymqtt:8883",
    .credentials.username = "user",
    .credentials.authentication.password = "pwd",
    .session.disable_clean_session = false, // do not keep message queue on broker

    // we need more time, because we have large packages.
    // broker appends SUBACK to PUBLISH, so SUBSCRIBE might end after image data received.
    // if we have this to short, SUBSCRIBE will be retransmitted (and also PUBLISH /image)
    .session.message_retransmit_timeout = 20000,

    // use embedded certificate bundle
    .broker.verification.crt_bundle_attach = &esp_crt_bundle_attach,
  };
  client = esp_mqtt_client_init(&mqtt_cfg);
  app_mqtt_.client = client;    

  ESP_GOTO_ON_ERROR(esp_mqtt_client_register_event(client, ESP_EVENT_ANY_ID, mqtt_event_handler, NULL), err, TAG, "Failed to register mqtt event handler");
  ESP_GOTO_ON_ERROR(esp_mqtt_client_start(client), err, TAG, "Failed to start mqtt client");
  client = NULL;

  return ret;
err:
  app_mqtt_.client = NULL;
  if (client != NULL) {
    esp_mqtt_client_destroy(client);
  }

  return ret;
}

static void setup() {
  // NVS
  esp_err_t err = nvs_flash_init();
  if (err == ESP_ERR_NVS_NO_FREE_PAGES || err == ESP_ERR_NVS_NEW_VERSION_FOUND) {
    ESP_ERROR_CHECK(nvs_flash_erase());
    err = nvs_flash_init();
  }
  ESP_ERROR_CHECK( err );

  // communication
  app_wifi_init();
  // Start the Wi-Fi.
  // If the node is provisioned, it will start connection attempts,
  // else, it will start Wi-Fi provisioning. The function will return
  // after a connection has been successfully established
  err = app_wifi_start(POP_TYPE_MAC);
  if (err != ESP_OK) {
      ESP_LOGE(TAG, "Could not start Wifi. Aborting!!!");
      vTaskDelay(5000/portTICK_PERIOD_MS);
      abort();
  }
}

void app_main() {
  esp_err_t ret = ESP_OK;
  
  setup();
  
  // system state and config
  ESP_LOGI(TAG, "Setup mqtt ");
  ret = mqtt_install(false, NULL, NULL);


  ESP_LOGI(TAG, "Exit main");
}
  1. Use a tool like "MQTT Explorer" to publish a message with 32kB (like 32k times "1")
  2. The esp will panic.

Debug Logs.

Panic output:

ssert failed: remove_free_block tlsf.c:331 (next && "next_free field can not be null")
Core  0 register dump:
MEPC    : 0x403806d2  RA      : 0x403867aa  SP      : 0x3fca8f90  GP      : 0x3fc90a00  
Stack dump detected
0x403806d2: panic_abort at D:/Espressif/frameworks/esp-idf-v5.1/components/esp_system/panic.c:452

0x403867aa: __ubsan_include at D:/Espressif/frameworks/esp-idf-v5.1/components/esp_system/ubsan.c:313

TP      : 0x3fc647c0  T0      : 0x37363534  T1      : 0x7271706f  T2      : 0x33323130  
S0/FP   : 0x00000070  S1      : 0x00000001  A0      : 0x3fca8fcc  A1      : 0x3fc922ed  
A2      : 0x00000001  A3      : 0x00000029  A4      : 0x00000001  A5      : 0x3fc98000  
A6      : 0x7a797877  A7      : 0x76757473  S2      : 0x00000009  S3      : 0x3fca90f7  
S4      : 0x3fc922ec  S5      : 0x3fcadf20  S6      : 0x3fcaff34  S7      : 0x3fcaff64  
S8      : 0x00000001  S9      : 0x3fc91d88  S10     : 0x3fc92000  S11     : 0x3fc92000  
T3      : 0x6e6d6c6b  T4      : 0x6a696867  T5      : 0x66656463  T6      : 0x62613938  
MSTATUS : 0x00001881  MTVEC   : 0x40380001  MCAUSE  : 0x00000007  MTVAL   : 0x00000000  
0x40380001: _vector_table at ??:?

MHARTID : 0x00000000  

More Information.

This looks like a buffer overflow to me. It could even be a security issue.

The rx buffer of mbedtls is 16384. This is pretty close to the message size. So together with some encryption overhead this could be the buffer which is overflowing.

@mistoll mistoll added the Type: Bug bugs in IDF label Jul 19, 2023
@espressif-bot espressif-bot added the Status: Opened Issue is new label Jul 19, 2023
@github-actions github-actions bot changed the title esp-mqtt: With TLS enabled large messages cause memory corruption esp-mqtt: With TLS enabled large messages cause memory corruption (IDFGH-10684) Jul 19, 2023
@KaeLL
Copy link
Contributor

KaeLL commented Jul 19, 2023

I dumped 1MiB on my vanilla ESP32 running the mqtt/ssl example and aside from the server disconnecting it due to lack of PINGREQ, it works fine.

@mistoll
Copy link
Author

mistoll commented Jul 20, 2023

I can confirm, it works on an ESP32 (AI ESP32-S module). Same program on ESP32C3 failed.

@mistoll mistoll changed the title esp-mqtt: With TLS enabled large messages cause memory corruption (IDFGH-10684) esp-mqtt: With TLS enabled large messages cause memory corruption on ESP32-C3 (IDFGH-10684) Jul 20, 2023
@euripedesrocha
Copy link
Collaborator

@mistoll I wasn't able to reproduce the issue.
If you can share a project that presents it behavior, it would be really useful. If not, the sdkconfig, and git commit hashes for IDF and esp-mqtt.

@espressif-bot espressif-bot added the Awaiting Response awaiting a response from the author label Aug 9, 2023
@mistoll
Copy link
Author

mistoll commented Aug 11, 2023

Here's my project stripped down to the minimum: mqtttest.zip

All private data has been replaced with "AAAAAA".
I'm currently in vacation and cannot test the files. I alredy used that code when posting this issue. However before uploading I removed some components. That might have an influence. I'm back to work on 21.08.23 and will try it then.

IDF is on cbce221 (v5.1)
esp-mqtt is on dffabb067fb3c39f486033d2e47eb4b1416f0c82

@mistoll
Copy link
Author

mistoll commented Aug 23, 2023

When testing the code I noticed that the error only happens with a single mqtt broker implementation. So I tried a view:

  • hive.mqtt.org works
  • Private Mosquitto Broker works
  • Private MQTTNet has the described problem
    Both private brokers run on the same server using the same Let's Encrypt certificate.

Other clients work with all brokers.
I currently assume the problem is caused by the broker. I'm not yet able to decrypt the traffic from MQTTNet, so it's hard to say what's happening.

I won't close this issue myself, as a panic caused by a faulty server is still an issue. However from my personal point of view, this can be closed. If you with to investigate further I can provide a public MQTTNet broker instance.

@euripedesrocha
Copy link
Collaborator

Closing since I couldn't reproduce.

@espressif-bot espressif-bot added Status: Done Issue is done internally Resolution: Won't Do This will not be worked on and removed Status: Opened Issue is new labels May 6, 2024
@espressif-bot espressif-bot removed the Awaiting Response awaiting a response from the author label May 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Resolution: Won't Do This will not be worked on Status: Done Issue is done internally Type: Bug bugs in IDF
Projects
None yet
Development

No branches or pull requests

4 participants