USB is randomly disconnected from ESP32-S3 #10000

Open
1 task done
sblantipodi opened this issue Jul 9, 2024 · 23 comments
Labels
Chip: ESP32-S3 Issue is related to support of ESP32-S3 Chip Resolution: Unable to reproduce With given information issue is unable to reproduce

Comments

@sblantipodi
Contributor

Board

Lolin ESP32-S3

Device Description

Plain Lolin ESP32-S3 board

Hardware Configuration

No GPIO used

Version

v3.0.1

IDE Name

PlatformIO

Operating System

Windows 11

Flash frequency

240MHz

PSRAM enabled

no

Upload speed

115200

Description

The sketch below hangs from time to time.
The S3's USB gets disconnected, and you need to power-cycle the ESP device to get USB working again.

Sketch

byte pre[CONFIG_PREFIX_LENGTH];

Serial.begin(115200);
Serial.setRxBufferSize(1500);
size_t prefixLength = Serial.readBytes((byte *) pre, 1500);

Debug Message

-

Other Steps to Reproduce

No response

I have checked existing issues, online documentation and the Troubleshooting Guide

  • I confirm I have checked existing issues, online documentation and Troubleshooting guide.
@sblantipodi sblantipodi added the Status: Awaiting triage Issue is waiting for triage label Jul 9, 2024
@me-no-dev
Member

requires minimal sketch that we can compile to reproduce.

@SuGlider
Collaborator

SuGlider commented Jul 9, 2024

What are the USB CDC settings: Hardware Serial JTAG or OTG TinyUSB?

How is it detecting that USB has disconnected?

Could it be a USB Cable/plug problem instead?

When Log Output Level is debug, do you see any messages in the UART0?

@sblantipodi
Contributor Author

requires minimal sketch that we can compile to reproduce.

it's difficult to give you a minimal sketch since there is a PC part also needed that sends the data to the ESP.

What are the USB CDC settings: Hardware Serial JTAG or OTG TinyUSB?

The problem happens with both Hardware CDC and TinyUSB.

How is it detecting that USB has disconnected?

I can hear the Windows sound when a USB device is disconnected. And after the sound no device is present in the windows device manager.

Could it be a USB Cable/plug problem instead?

I tried a lot of cables and USB ports, so I doubt it. My firmware is used by a lot of users, and they all report the same problem with different PCs and, obviously, different cables.

When Log Output Level is debug, do you see any messages in the UART0?

no
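
The disconnect described above (the Windows sound, then the device vanishing from Device Manager) could also be logged from the host side with timestamps, which may help correlate it with what the firmware was doing. The following is a minimal sketch, not something used in this thread: `watch_port` and `pyserial_port_names` are hypothetical helper names, and pyserial is assumed to be installed.

```python
import time
from datetime import datetime


def watch_port(port_name, list_ports, polls, interval=1.0, sleep=time.sleep):
    """Poll the host's serial port list and record presence changes.

    list_ports is any callable returning an iterable of port names; it is
    injected so the loop can be exercised without hardware.
    Returns a list of (timestamp, "connected" | "disconnected") tuples.
    """
    events = []
    present = None  # unknown until the first poll
    for _ in range(polls):
        now_present = port_name in set(list_ports())
        if now_present != present:
            state = "connected" if now_present else "disconnected"
            events.append((datetime.now(), state))
            print(f"{events[-1][0].isoformat()} {port_name} {state}")
            present = now_present
        sleep(interval)
    return events


def pyserial_port_names():
    """Enumerate ports with pyserial (assumed installed: pip install pyserial)."""
    from serial.tools import list_ports
    return [p.device for p in list_ports.comports()]
```

For example, `watch_port("COM5", pyserial_port_names, polls=3600)` would watch the port for roughly an hour; the port name is whatever the CDC device enumerates as on the test machine.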

@SuGlider
Collaborator

SuGlider commented Aug 3, 2024

@sblantipodi -
I have tested it with Arduino 3.0.3, using an ESP32-S3 + HW Serial JTAG USB port. It worked fine for about 3 hours receiving data from a Python script.

Sketch:

// using S3 devKit - RGB LED will indicate that CDC has been
// opened by the Python script.
// If S3 resets, Python script will hang/fail and LED will be kept RED

// Serial is the USB port - Enable CDC on Boot!
// Serial0 is the UART0 - Console -- Serial Monitor

void setup() {
  neopixelWrite(RGB_BUILTIN, RGB_BRIGHTNESS, 0, 0);  // Red
  Serial.begin();
  Serial.setRxBufferSize(1500);
  Serial.setTimeout(10); // reduces the time waiting for receiving bytes
  Serial0.begin(115200);
  Serial0.setDebugOutput(true);
  while (!Serial) delay(100);
  neopixelWrite(RGB_BUILTIN, 0, RGB_BRIGHTNESS, 0);  // Green
  Serial0.println("Starting... run the Python Script.");
  delay(2000);
}

#define CONFIG_PREFIX_LENGTH 1500
void loop() {
  byte pre[CONFIG_PREFIX_LENGTH];

  size_t prefixLength = Serial.readBytes((byte *) pre, 1500);
  if (prefixLength > 0) {
    Serial0.println(prefixLength);
  }
}

Python running on a Windows 11 computer:

import serial

print ("CDC App test for issue 10000")

try:
	# CDC same as SERIAL_8N1 - Arduino equivalent
	# Change 'com15' to what ever is your USB CDC serial device name (win/linux)
	# timeout=None means that it will work as a blocking read()
	# write() is blocking by default

	CDC = serial.Serial(port='com15', baudrate=115200, parity=serial.PARITY_NONE,
	stopbits=serial.STOPBITS_ONE, bytesize=serial.EIGHTBITS, timeout=None) 
except:
	print("COM15: port is busy or unavailable")
	exit()

# Configure Serial Out and In as necessary using UART and/or CDC with respective config in the sketch

count = 1
chars_100 = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUV"

while 1:
	# writes 100 bytes in blocking mode
	for x in range(count):
		CDC.write(chars_100.encode('utf_8'))
	print("#"+str(count * 100)+" bytes sent!")
	count = count + 1
	if (count > 50):
		count = 1

Findings:

No issues after running it non-stop for more than 3 hours. Throughput is about 45,000 bytes per second (about 360 kbit/s).
Problems appeared only when the Windows power plan (Control Panel) had a timeout set for blanking the screen or for suspending activity.
In that case the S3 doesn't reset, but the PC stops transmitting.

To make it run without failure, I had to set both timeouts (screen blanking and suspend) to "Never".
On a Linux computer this is not necessary, as the Python script keeps running in the background.
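
As an alternative to changing the power plan by hand, a test script on Windows could ask the OS to stay awake while it runs. This is a sketch under assumptions: it uses the Win32 `SetThreadExecutionState` call through `ctypes`, `keep_awake` is a hypothetical helper name, and nobody in this thread has verified that it has the same effect as setting both timeouts to "Never".

```python
import ctypes
import sys

# Flag values documented for the Win32 SetThreadExecutionState function.
ES_CONTINUOUS = 0x80000000
ES_SYSTEM_REQUIRED = 0x00000001
ES_DISPLAY_REQUIRED = 0x00000002


def keep_awake(enable=True):
    """Ask Windows not to sleep or blank the display while the test runs.

    Returns True if the request succeeded, False on non-Windows hosts,
    where the findings above suggest no such setting is needed.
    """
    if sys.platform != "win32":
        return False
    flags = ES_CONTINUOUS
    if enable:
        flags |= ES_SYSTEM_REQUIRED | ES_DISPLAY_REQUIRED
    # A return value of 0 means the call failed.
    return ctypes.windll.kernel32.SetThreadExecutionState(flags) != 0
```

Calling `keep_awake()` before the send loop and `keep_awake(False)` afterwards restores the normal idle behaviour; the request also ends when the script's thread exits.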

@SuGlider
Collaborator

SuGlider commented Aug 3, 2024

@sblantipodi - The sketch and script from #10000 (comment)
has been running for almost 24 hours. No issue so far.

Both the PC and the S3 are communicating over USB. No disconnection.
I think that the issue isn't on the Arduino/USB side.
It may be on the Windows/Linux application side.

@SuGlider SuGlider added Resolution: Unable to reproduce With given information issue is unable to reproduce Chip: ESP32-S3 Issue is related to support of ESP32-S3 Chip and removed Status: Awaiting triage Issue is waiting for triage labels Aug 3, 2024
@sblantipodi
Contributor Author

sblantipodi commented Aug 4, 2024

@sblantipodi - The sketch and script from #10000 (comment)
has been running for almost 24 hours. No issue so far.

Both the PC and the S3 are communicating over USB. No disconnection.
I think that the issue isn't on the Arduino/USB side.
It may be on the Windows/Linux application side.

@SuGlider that sketch lacks the WiFi connectivity part.
Please, just connect it to WiFi and test it again.
I'll do the same and report back.
I really appreciate what you are doing. Thanks.

@SuGlider
Collaborator

SuGlider commented Aug 4, 2024

Please, just connect it to WiFi and test it again.

No issues again.

// using S3 devKit - RGB LED will indicate that CDC has been
// opened by the Python script.
// If S3 resets, Python script will hang/fail and LED will be kept RED

// Serial is the USB port - Enable CDC on Boot!
// Serial0 is the UART0 - Console -- Serial Monitor

#include <WiFi.h>
#include <WiFiMulti.h>
#include <HTTPClient.h>

WiFiMulti wifiMulti;

void setup() {
  neopixelWrite(RGB_BUILTIN, RGB_BRIGHTNESS, 0, 0);  // Red

  Serial.begin();
  Serial.setRxBufferSize(1500);
  Serial.setTimeout(10); // reduces the time waiting for receiving bytes

  wifiMulti.addAP("SSID", "PWD");
  Serial0.begin(115200);
  Serial0.setDebugOutput(true);

  Serial0.println("Connecting Wifi...");
  Serial0.println();
  while (wifiMulti.run() != WL_CONNECTED) {
    Serial0.print(".");
    delay(100);
  }
  Serial0.println();

  if (wifiMulti.run() == WL_CONNECTED) {
    Serial0.println("");
    Serial0.println("WiFi connected");
    Serial0.println("IP address: ");
    Serial0.println(WiFi.localIP());
  }

  HTTPClient http;
  // testing connection...
  Serial0.print("[HTTP] begin...\n");
  http.begin("http://google.com/index.html");  //HTTP

  Serial0.print("[HTTP] GET...\n");
  // start connection and send HTTP header
  int httpCode = http.GET();

  // httpCode will be negative on error
  if (httpCode > 0) {
    // HTTP header has been sent and the server response header has been handled
    Serial0.printf("[HTTP] GET... code: %d\n", httpCode);

    // file found at server
    if (httpCode == HTTP_CODE_OK) {
      String payload = http.getString();
      Serial0.println(payload);
    }
  } else {
    Serial0.printf("[HTTP] GET... failed, error: %s\n", http.errorToString(httpCode).c_str());
  }

  http.end();

  Serial0.println("Starting... run the Python Script.");
  while (!Serial) delay(100);
  neopixelWrite(RGB_BUILTIN, 0, RGB_BRIGHTNESS, 0);  // Green
}

#define CONFIG_PREFIX_LENGTH 1500
uint32_t count = 0;
void loop() {
  byte pre[CONFIG_PREFIX_LENGTH];
  if (wifiMulti.run() != WL_CONNECTED) {
    Serial0.println(".|");
    return;
  }
  size_t prefixLength = Serial.readBytes((byte *) pre, 1500);
  if (prefixLength > 0) {
    Serial0.println(prefixLength);
  }
}

@SuGlider
Collaborator

SuGlider commented Aug 4, 2024

@sblantipodi - I think that the problem may be in the application... some problem with lack of RAM?
A failed malloc() or no space for a String to allocate memory....

@sblantipodi
Contributor Author

@SuGlider can I ask you why you use Serial for read and Serial0 for write?
I can't read Serial0, how am I supposed to read Serial0?

The sketch that crashes on me, uses Serial for both read/write.

@SuGlider
Collaborator

SuGlider commented Aug 4, 2024

The USB Serial can be read by the Python script, if necessary. The Windows COM port is opened by that script.
I use Serial0 as the console and open it with the Arduino Serial Monitor.

With regard to reading and writing from USB Serial, there is a caveat related to the TinyUSB / HW Serial JTAG driver and the tasks used to fill and drain the RX/TX buffers. The Arduino sketch runs at a very low task priority, while the USB tasks run at a very high priority. This may cause some problems.

@SuGlider
Collaborator

SuGlider commented Aug 5, 2024

I can't read Serial0, how am I supposed to read Serial0?

I see that your board is a Lolin ESP32-S3. It has no USB-UART chip on it.

It is possible to read the UART by using an external UART-USB converter, based on chips like the CP2102, CH340, etc.
It may also be possible to use another ESP32 board that has such a converter, either by using its chip directly or by running a sketch that reads UART1 and forwards it to UART0...

@sblantipodi
Contributor Author

sblantipodi commented Aug 5, 2024

@SuGlider thanks for the answer, I really, really appreciate it.

I see that your board is a Lolin ESP32-S3. It has no USB-UART chip on it.

I have a lot of boards from various manufacturers, and very few of them have both USB and UART.

I know very few boards that have both a USB and a UART chip, and those boards are more "development boards" than real ones...
I mean, what's the point of having both USB and UART on the same board?

I think that my problem is caused by the fact that I write and read on Serial (USB).

Will Espressif ever fix this problem? Having two different interfaces for read and write is not really a solution :)
Is there a workaround to read and write over USB serial without crashing the driver?

@sblantipodi
Contributor Author

I confirm that if I don't use the same Serial for read and write, it doesn't crash.
If I use the same Serial for both read and write, the USB driver crashes, the device is disconnected from the PC, the ESP hangs, and the only way to recover is to reboot it manually.

@SuGlider
Collaborator

SuGlider commented Aug 6, 2024

Thanks @sblantipodi for the confirmation.
If possible, just confirm that the crash happens in both USB Modes: TinyUSB and HWSerial JTAG.
It can be configured using the Arduino IDE menu, but it is necessary to build each mode and possibly upload it using BOOT+RESET buttons in order to put the S3 into download mode.

@sblantipodi
Contributor Author

sblantipodi commented Aug 6, 2024

Thanks @sblantipodi for the confirmation.
If possible, just confirm that the crash happens in both USB Modes: TinyUSB and HWSerial JTAG.
It can be configured using the Arduino IDE menu, but it is necessary to build each mode and possibly upload it using BOOT+RESET buttons in order to put the S3 into download mode.

Thank you for your time @SuGlider, I appreciate it.
Yes, I have tested it in both TinyUSB and HWSerial JTAG mode.
Sometimes the crash happens in the first 5 minutes, sometimes it takes longer, but there is no way to make it stable.

Same problem on different boards like the UE TinyS3 and Adafruit ESP32-S3 Feather.
The problem does not happen on the standard ESP32 using the CH340 chip.

I tried creating separate tasks for read and write, giving the tasks different priorities, and pinning them to different cores, but nothing solved or improved the problem.

The only way to make it stable is to stop writing to serial, but this clearly isn't a solution :)
The less I write, the longer it takes to crash, but if I write anything, a crash will happen sooner or later.
In normal conditions firmware can recover after a crash, but in this case it is impossible because, when the crash happens, only a manual reboot of the device makes it work again.

@SuGlider
Collaborator

SuGlider commented Aug 6, 2024

I see. I need a test case that I can use to investigate the issue. Let me know if you have a sketch/Python pair that I could use to reproduce it.
In the meantime, I'll try to create this test code and post it here later.

@sblantipodi
Contributor Author

sblantipodi commented Aug 6, 2024

@SuGlider here is something that may help you reproduce the problem.

Sketch


#include "Arduino.h"


#include <WiFi.h>
#include <WiFiMulti.h>
#include <HTTPClient.h>

WiFiMulti wifiMulti;

void setup() {
  neopixelWrite(RGB_BUILTIN, RGB_BRIGHTNESS, 0, 0);  // Red

  Serial.begin();
  Serial.setRxBufferSize(1500);
  Serial.setTimeout(10); // reduces the time waiting for receiving bytes

  wifiMulti.addAP("SSID", "PWD");
  Serial.begin(115200);
  Serial.setDebugOutput(true);

  Serial.println("Connecting Wifi...");
  Serial.println();
  while (wifiMulti.run() != WL_CONNECTED) {
    Serial.print(".");
    delay(100);
  }
  Serial.println();

  if (wifiMulti.run() == WL_CONNECTED) {
    Serial.println("");
    Serial.println("WiFi connected");
    Serial.println("IP address: ");
    Serial.println(WiFi.localIP());
  }

  HTTPClient http;
  // testing connection...
  Serial.print("[HTTP] begin...\n");
  http.begin("http://google.com/index.html");  //HTTP

  Serial.print("[HTTP] GET...\n");
  // start connection and send HTTP header
  int httpCode = http.GET();

  // httpCode will be negative on error
  if (httpCode > 0) {
    // HTTP header has been sent and the server response header has been handled
    Serial.printf("[HTTP] GET... code: %d\n", httpCode);

    // file found at server
    if (httpCode == HTTP_CODE_OK) {
      String payload = http.getString();
      Serial.println(payload);
    }
  } else {
    Serial.printf("[HTTP] GET... failed, error: %s\n", http.errorToString(httpCode).c_str());
  }

  http.end();

  Serial.println("Starting... run the Python Script.");
  while (!Serial) delay(100);
  neopixelWrite(RGB_BUILTIN, 0, RGB_BRIGHTNESS, 0);  // Green
}

#define CONFIG_PREFIX_LENGTH 1500
uint32_t count = 0;
unsigned long previousMillisA = 0;
const long intervalA = 1000;


void loop() {
  byte pre[CONFIG_PREFIX_LENGTH];
  if (wifiMulti.run() != WL_CONNECTED) {
    Serial.println(".|");
    return;
  }
  size_t prefixLength = Serial.readBytes((byte *) pre, 1500);
  if (prefixLength > 0) {
    Serial.println(prefixLength);
  }
  while(Serial.available() > 0) {
    char t = Serial.read();
  }

  unsigned long currentMillisA = millis();
  if (currentMillisA - previousMillisA >= intervalA) {
    previousMillisA = currentMillisA;
    Serial.println("MSG_sent_to_the_python_program_every_second msg can be pretty big sometimes. Lorem ipsum dolor");
    Serial.println("MSG_sent_to_the_python_program_every_second msg can be pretty big sometimes. Lorem ipsum dolor");
    Serial.println("MSG_sent_to_the_python_program_every_second msg can be pretty big sometimes. Lorem ipsum dolor");
    Serial.println("MSG_sent_to_the_python_program_every_second msg can be pretty big sometimes. Lorem ipsum dolor");
    Serial.println("MSG_sent_to_the_python_program_every_second msg can be pretty big sometimes. Lorem ipsum dolor");
    Serial.println("MSG_sent_to_the_python_program_every_second msg can be pretty big sometimes. Lorem ipsum dolor");
    Serial.println("MSG_sent_to_the_python_program_every_second msg can be pretty big sometimes. Lorem ipsum dolor");
    Serial.println("MSG_sent_to_the_python_program_every_second msg can be pretty big sometimes. Lorem ipsum dolor");
    Serial.println("MSG_sent_to_the_python_program_every_second msg can be pretty big sometimes. Lorem ipsum dolor");
    Serial.println("MSG_sent_to_the_python_program_every_second msg can be pretty big sometimes. Lorem ipsum dolor");
    Serial.println("MSG_sent_to_the_python_program_every_second msg can be pretty big sometimes. Lorem ipsum dolor");
  }
}

Python program:

import serial
import time

arduino = serial.Serial(port='COM5', baudrate=115200, timeout=5)

def send_message():
    while True:        
        arduino.write(b'Lorem ipsum dolor sit amet Lorem ipsum dolor sit amet Lorem ipsum dolor sit amet Lorem ipsum dolor sit amet Lorem ipsum dolor sit amet Lorem ipsum dolor sit amet Lorem ipsum dolor sit amet Lorem ipsum dolor sit amet Lorem ipsum dolor sit amet Lorem ipsum dolor sit amet Lorem ipsum dolor sit amet Lorem ipsum dolor sit amet Lorem ipsum dolor sit amet Lorem ipsum dolor sit amet Lorem ipsum dolor sit amet Lorem ipsum dolor sit amet Lorem ipsum dolor sit amet Lorem ipsum dolor sit amet Lorem ipsum dolor sit amet\n') 
        response = arduino.readline().decode('utf-8').strip() 
        if response:
            print(f"Received: {response}")
        time.sleep(0.008) 

if __name__ == "__main__":
    send_message()

If you reduce the length of the "Lorem ipsum dolor" string, it takes longer to crash; the bigger the string, the sooner it crashes.

I exaggerated the values to make it crash faster, but in a real-world application the program will crash too, sooner or later.
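
To put numbers on the relationship between payload size and time to crash, the send loop could record how many rounds and bytes it got through before the port failed. This is a minimal sketch, assuming the failure surfaces on the host as an exception from pyserial (it may instead show up as a blocked call); `run_until_failure` is a hypothetical helper name, not part of the program above.

```python
import time


def run_until_failure(port, payload, max_rounds, pause=0.008,
                      clock=time.monotonic, sleep=time.sleep):
    """Write payload repeatedly and report how far the exchange got.

    port is any object with write() and readline(), such as a pyserial
    Serial instance. Returns (rounds_completed, bytes_sent, elapsed_seconds).
    Stops early when the port raises OSError; pyserial's SerialException
    is a subclass of OSError, so a vanished COM port ends the loop.
    """
    start = clock()
    sent = 0
    rounds = 0
    try:
        for _ in range(max_rounds):
            sent += port.write(payload)
            port.readline()  # drain one line of the device's reply, if any
            rounds += 1
            sleep(pause)
    except OSError as exc:
        print(f"port failed after {rounds} rounds: {exc}")
    return rounds, sent, clock() - start
```

Running it with the same `serial.Serial(port='COM5', baudrate=115200, timeout=5)` object and payloads of different lengths would give comparable rounds, bytes, and seconds figures for each run.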

@sblantipodi
Contributor Author

hi @SuGlider were you able to reproduce the problem with my snippets?
if yes, can you remove the Resolution: Unable to reproduce label? :)

@vickash

vickash commented Aug 9, 2024

The only way to make it stable is to stop writing to serial, but this clearly isn't a solution :)
The less I write, the longer it takes to crash, but if I write anything, a crash will happen sooner or later.
In normal conditions firmware can recover after a crash, but in this case it is impossible because, when the crash happens, only a manual reboot of the device makes it work again.

I have a project using a call-and-response type protocol, implemented over Serial, and I'm having exactly the same issue when there's a lot of data being passed back and forth over USB CDC. Using a Lolin S3, just like @sblantipodi.

The problem does not happen on the standard ESP32 using the CH340 chip.

My S3 board has a CH340 on board, so I switched to that interface and have a test running. No issues so far after about 10 minutes. Will update if that changes.

I'm using version 3.0.4. The same issue occurs with CDC on my S2, and C3, which both worked fine at some point last year, when I put this project down for a while. Same on the new (to me) H2 and C6.

I will try rolling everything back to see if I can find a working state, and let you know which version of the core that was, if it's any help.

@sblantipodi
Contributor Author

sblantipodi commented Aug 9, 2024

@vickash I confirm that using the CH340 is currently the way to go for stability, but this is bad because most of the newer ESP boards don't use that chip by default.
Regarding the USB implementation, I have the same problem on both Arduino core 3.x and 2.x.

@vickash

vickash commented Aug 9, 2024

I let my S3 run for about an hour on the CH340, without issue, then stopped it. Now I'm trying CDC again on core version 2.0.14, which I think was the last version I used before 3. No issues so far, running about 10 minutes.

Are you using 2.0.17 @sblantipodi? Maybe try your example on 2.0.14?

@vickash

vickash commented Aug 10, 2024

Now I'm trying CDC again on core version 2.0.14

This is still running after 17 hours. It's definitely something that changed after 2.0.14. Maybe something going from IDF 4 to 5, not necessarily the Arduino core itself?
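
If 2.0.14 is good and a later release is bad, the first failing release could be narrowed down with a binary search over the ordered release list instead of testing each one. This is only a sketch: `first_bad` and `is_bad` are hypothetical names, the release list has to be supplied by whoever runs it, and because the disconnect is intermittent each `is_bad` check would need a soak test long enough to be trusted.

```python
def first_bad(versions, is_bad):
    """Binary-search an ordered release list for the first failing one.

    Assumes versions[0] is known good, versions[-1] is known bad, and
    that once a release fails every later one fails too.
    is_bad(version) runs the soak test and returns True on a disconnect.
    """
    lo, hi = 0, len(versions) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(versions[mid]):
            hi = mid
        else:
            lo = mid
    return versions[hi]
```

With n releases between the known-good and known-bad ends, this needs about log2(n) soak tests rather than n.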

@sblantipodi
Contributor Author

I let my S3 run for about an hour on the CH340, without issue, then stopped it. Now I'm trying CDC again on core version 2.0.14, which I think was the last version I used before 3. No issues so far, running about 10 minutes.

Are you using 2.0.17 @sblantipodi? Maybe try your example on 2.0.14?

I'll try it next week and report back. Thanks

4 participants