Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PX4 reports no logs even though there are logs on the SD card #14157

Closed
ndepal opened this issue Feb 13, 2020 · 7 comments · Fixed by #14188
Closed

PX4 reports no logs even though there are logs on the SD card #14157

ndepal opened this issue Feb 13, 2020 · 7 comments · Fixed by #14188

Comments

@ndepal
Copy link
Contributor

ndepal commented Feb 13, 2020

Describe the bug
We are seeing the issue that PX4 lists no logs when using QGC to request a log list or using some other mavlink application to send LOG_REQUEST_LIST (the response has num_logs = 0). This happens in situations where we are certain that there is at least one log file on the SD card. Soft-rebooting the flight controller fixes the issue.

The issue does not occur all the time and usually only after the vehicle has been on for an extended period of time (several hours). Potentially the issue also only occurs after having listed the logs repeatedly.

To Reproduce
Steps to reproduce the behavior:

  1. Drone switched on
  2. Create at least one log file on the sd card
  3. List the logs in QGC repeatedly
  4. Eventually, the list will be empty
  5. Reboot via QGC
  6. Log listing will be successful again

Additional context
This is happening on 1.10 but we also saw this in 1.7.4, so it's not a recent regression.

@julianoes
Copy link
Contributor

I looked for missing fclose or closedir in https://github.com/PX4/Firmware/blob/master/src/modules/mavlink/mavlink_log_handler.cpp but couldn't see anything wrong.

@ndepal
Copy link
Contributor Author

ndepal commented Feb 17, 2020

Thanks @julianoes for having a look.

I did some more investigating. Here is what I have learned so far:

  1. I cannot reproduce the issue simply by repeatedly requesting a log listing (LOG_REQUEST_LIST)
  2. I cannot reproduce the issue by repeatedly going through the sequence
    • arm (to create a new log file)
    • sleep
    • disarm
    • sleep
    • list logs
  3. I can reproduce the issue by repeatedly going though the sequence
    • arm
    • sleep
    • disarm
    • sleep
    • list logs
    • download all logs
    • delete all logs (optional, i.e. it happens with or without this step)
  4. The failure appears to always happen after successfully going through the above sequence 5 times. At the 6th iteration, no logs can be listed.

The failure happens at the line

https://github.com/PX4/Firmware/blob/07d656e971a72d1202651dfd3b4642736fb078d7/src/modules/mavlink/mavlink_log_handler.cpp#L464

Which fails with the errno 23: "File table overflow"

@ndepal
Copy link
Contributor Author

ndepal commented Feb 17, 2020

If I do the sequence without requesting all logs to be deleted at the end, the error does not occur during an attempt to list the log entries, but while trying to download a log at:

https://github.com/PX4/Firmware/blob/07d656e971a72d1202651dfd3b4642736fb078d7/src/modules/mavlink/mavlink_log_handler.cpp#L366

with the same errno 23.

@julianoes
Copy link
Contributor

It looks like somehow we run out of file descriptors on NuttX.

I've tried to debug this by wrapping fopen and fclose and it looks correct although sometimes 2 files are open at the same time. I've attached the patch to debug this. Maybe it would be worthwhile to try the same on NuttX.

And here is the output:

#List

fopen(./$log$.txt, 887475910) -> 1677757696 [../../src/modules/mavlink/mavlink_log_handler.cpp:481]]
fclose(1677757696) -> 0 [../../src/modules/mavlink/mavlink_log_handler.cpp:508]]
fopen(./logdata.txt, 887488848) -> 1677757696 [../../src/modules/mavlink/mavlink_log_handler.cpp:383]]
fclose(1677757696) -> 0 [../../src/modules/mavlink/mavlink_log_handler.cpp:407]]
fopen(./logdata.txt, 887488848) -> 1677757696 [../../src/modules/mavlink/mavlink_log_handler.cpp:383]]
fclose(1677757696) -> 0 [../../src/modules/mavlink/mavlink_log_handler.cpp:407]]
fopen(./logdata.txt, 887488848) -> 1677757696 [../../src/modules/mavlink/mavlink_log_handler.cpp:383]]
fclose(1677757696) -> 0 [../../src/modules/mavlink/mavlink_log_handler.cpp:407]]
fopen(./logdata.txt, 887488848) -> 1677757696 [../../src/modules/mavlink/mavlink_log_handler.cpp:383]]
fclose(1677757696) -> 0 [../../src/modules/mavlink/mavlink_log_handler.cpp:407]]
fopen(./logdata.txt, 887488848) -> 1677757696 [../../src/modules/mavlink/mavlink_log_handler.cpp:383]]
fclose(1677757696) -> 0 [../../src/modules/mavlink/mavlink_log_handler.cpp:407]]
fopen(./logdata.txt, 887488848) -> 1677757696 [../../src/modules/mavlink/mavlink_log_handler.cpp:383]]
fclose(1677757696) -> 0 [../../src/modules/mavlink/mavlink_log_handler.cpp:407]]
fopen(./logdata.txt, 887488848) -> 1677757696 [../../src/modules/mavlink/mavlink_log_handler.cpp:383]]
fclose(1677757696) -> 0 [../../src/modules/mavlink/mavlink_log_handler.cpp:407]]
fopen(./logdata.txt, 887488848) -> 1677757696 [../../src/modules/mavlink/mavlink_log_handler.cpp:383]]
fclose(1677757696) -> 0 [../../src/modules/mavlink/mavlink_log_handler.cpp:407]]
fopen(./logdata.txt, 887488848) -> 1677757696 [../../src/modules/mavlink/mavlink_log_handler.cpp:383]]
fclose(1677757696) -> 0 [../../src/modules/mavlink/mavlink_log_handler.cpp:407]]
fopen(./logdata.txt, 887488848) -> 1677757696 [../../src/modules/mavlink/mavlink_log_handler.cpp:383]]
fclose(1677757696) -> 0 [../../src/modules/mavlink/mavlink_log_handler.cpp:407]]

# Download 3
fopen(./logdata.txt, 887488848) -> 1677757696 [../../src/modules/mavlink/mavlink_log_handler.cpp:383]]
fclose(1677757696) -> 0 [../../src/modules/mavlink/mavlink_log_handler.cpp:407]]
fopen(./log/2020-02-19/17_46_04.ulg, 887492103) -> 1677757696 [../../src/modules/mavlink/mavlink_log_handler.cpp:422]]
fopen(./logdata.txt, 887488848) -> 1677729296 [../../src/modules/mavlink/mavlink_log_handler.cpp:383]]
fclose(1677729296) -> 0 [../../src/modules/mavlink/mavlink_log_handler.cpp:407]]
fclose(1677757696) -> 0 [../../src/modules/mavlink/mavlink_log_handler.cpp:418]]
fopen(./log/2020-02-19/17_46_29.ulg, 887492103) -> 1677757696 [../../src/modules/mavlink/mavlink_log_handler.cpp:422]]
fopen(./logdata.txt, 887488848) -> 1677729296 [../../src/modules/mavlink/mavlink_log_handler.cpp:383]]
fclose(1677729296) -> 0 [../../src/modules/mavlink/mavlink_log_handler.cpp:407]]
fclose(1677757696) -> 0 [../../src/modules/mavlink/mavlink_log_handler.cpp:418]]
fopen(./log/2020-02-19/17_46_38.ulg, 887492103) -> 1677757696 [../../src/modules/mavlink/mavlink_log_handler.cpp:422]]


# Download 1
fopen(./logdata.txt, 887488848) -> 1677729296 [../../src/modules/mavlink/mavlink_log_handler.cpp:383]]
fclose(1677729296) -> 0 [../../src/modules/mavlink/mavlink_log_handler.cpp:407]]
fclose(1677757696) -> 0 [../../src/modules/mavlink/mavlink_log_handler.cpp:418]]
fopen(./log/2020-02-19/17_46_04.ulg, 887492103) -> 1677757696 [../../src/modules/mavlink/mavlink_log_handler.cpp:422]]


# Download 1
fopen(./logdata.txt, 887488848) -> 1677729296 [../../src/modules/mavlink/mavlink_log_handler.cpp:383]]
fclose(1677729296) -> 0 [../../src/modules/mavlink/mavlink_log_handler.cpp:407]]
fclose(1677757696) -> 0 [../../src/modules/mavlink/mavlink_log_handler.cpp:418]]
fopen(./log/2020-02-19/17_46_29.ulg, 887492103) -> 1677757696 [../../src/modules/mavlink/mavlink_log_handler.cpp:422]]

patch.txt

@ndepal
Copy link
Contributor Author

ndepal commented Feb 19, 2020

Thanks for the patch. I'll try reproducing the issue with it applied.

@bkueng
Copy link
Member

bkueng commented Feb 19, 2020

It's an FD leak, and #14188 fixes it. Thanks for narrowing this down.

@julianoes
Copy link
Contributor

Thanks @bkueng, I did not see that!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants