pixhawk i2c deadlock in v1.6.5 #7951

Closed
bartslinger opened this issue Sep 11, 2017 · 8 comments

@bartslinger
Contributor

Hi guys,

After lots and lots of debugging, I found a reproducible way to get the pixhawk into a deadlock.
Basically, if there is at least one device on the I2C bus and the signal is corrupted, the Pixhawk can get into a deadlock. I tried it with an HMC5883 and with an airspeed sensor, both giving the same result.

I have the nsh terminal connected over a serial link, which stops responding. The mavlink usb connection gets lost. The LED on the pixhawk stops fading.

The silly video below shows how I reproduced the deadlock. I'm shorting the I2C SCL line to ground. However, I've also been able to break it by shorting SDA to GND or SCL to SDA.

https://www.youtube.com/watch?v=7U37GW4Qmrc

I'm not 100% sure, but I've never seen this issue on a PX4 version which was built on top of v1.5.5.

@LorenzMeier
Member

There were NuttX OS changes in between. @davids5 Could you audit the I2C changes done?

@davids5
Member

davids5 commented Sep 12, 2017

@bartslinger - I am trying to reproduce this. I am testing on FMUv2 HW (with no silicon errata) with the v3 build. I have a 3DR compass and an AUAV airspeed sensor connected to the I2C connector. I am rapidly shorting the I2C SCL line to ground on the cable. I have verified with a scope that the signal is shorted.

I think I have reproduced the same configuration as you have in the video. The signal I am shorting is connected to an I2C extender.

So far no luck, either on current master (v3 build) or on tag v1.6.5 (v2 build).

What is your value for SYS_AUTOSTART?
How long, on average, do you have to spend shorting the signal to get the failure?

Can you get current master to fail for you?

@bartslinger
Contributor Author

bartslinger commented Sep 12, 2017

I'm sorry, I have the SDA and SCL lines crossed in my setup, so I was actually shorting SDA to GND. SYS_AUTOSTART is 4001 for this test, but I have another airframe on a private branch with the same results.

Found another interesting clue: It only happens after disabling the internal compass by setting CAL_MAG1_ROT to 0. If I use both compasses, I get the following message:

INFO  [ekf2] Mag sensor ID changed to 131594
INFO  [ekf2] Mag sensor ID changed to 73225
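
The two sensor IDs in that message can be unpacked to see which compass is which. The sketch below assumes the usual PX4 device ID layout (bus type in the low 3 bits, bus number in the next 5 bits, then one byte of bus address and one byte of device type). That layout, the `DeviceId` struct and the `decode_device_id` name are assumptions made for illustration and are not taken from this thread.

```cpp
#include <cstdint>

// Assumed PX4 device ID layout (not stated in this thread):
//   bits 0-2   bus type (assumed: 1 = I2C, 2 = SPI)
//   bits 3-7   bus number
//   bits 8-15  address on the bus
//   bits 16-23 device type
struct DeviceId {
    uint8_t bus_type;
    uint8_t bus;
    uint8_t address;
    uint8_t devtype;
};

// Hypothetical helper: split a numeric sensor ID into its assumed fields.
constexpr DeviceId decode_device_id(uint32_t id) {
    return DeviceId{
        static_cast<uint8_t>(id & 0x07),
        static_cast<uint8_t>((id >> 3) & 0x1F),
        static_cast<uint8_t>((id >> 8) & 0xFF),
        static_cast<uint8_t>((id >> 16) & 0xFF),
    };
}
```

Under this assumed layout, 73225 (0x11E09) decodes to bus type 1, bus 1, address 0x1E, device type 1, and 131594 (0x2020A) decodes to bus type 2, bus 1, address 0x02, device type 2. If the assumption holds, the first ID is an I2C device at 0x1E, which is the usual HMC5883 address, and the second is a device on a different bus type, which would fit EKF2 switching between the external and the internal compass.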

@bartslinger
Contributor Author

bartslinger commented Sep 12, 2017

I got tired of jittering the cable, so I set up an Arduino to do it for me. It hangs as soon as a message gets interrupted in the middle. This is an example recorded with a logic analyzer:

[image: logicanalyzer]

Also tested with master, same behavior.

@bartslinger
Contributor Author

This is my current test setup with the Arduino (which breaks it really easily):

[image: arduino]

The Arduino runs this code, which makes it occasionally collide with a packet:

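void setup() {
  // Pin 3 is the output used to disturb the I2C bus; start with it LOW.
  pinMode(3, OUTPUT);
  digitalWrite(3, LOW);
}

void loop() {
  // Hold pin 3 LOW for 30 ms...
  digitalWrite(3, LOW);
  delay(30);
  // ...then pulse it HIGH for 1 ms, so the cycle repeats roughly every 31 ms.
  digitalWrite(3, HIGH);
  delay(1);
}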

@davids5
Member

davids5 commented Sep 12, 2017

@bartslinger - Thank you! I have got it now. I can see the error and will add a patch for it today.

@davids5
Member

davids5 commented Sep 12, 2017

@bartslinger - please give #7957 a test spin.

@bartslinger
Contributor Author

Works, thanks a lot!
