jMAVSim SITL with Windows Cygwin gets stuck #10098

Closed
amitmc19 opened this issue Jul 31, 2018 · 16 comments · Fixed by #10371

@amitmc19

amitmc19 commented Jul 31, 2018

I installed the Windows Cygwin toolchain following the guide at https://dev.px4.io/en/setup/dev_env_windows_cygwin.html and am trying to run jMAVSim.
After I start the build and run "make posix jmavsim", the SITL simulation gets stuck accessing the sensors. These are the errors I get on the console:
[Screenshot: jmavsim]

ERROR [sensors] Accel #0 fail:  TIMEOUT!
ERROR [sensors] Sensor Accel #0 failed. Reconfiguring sensor priorities.
WARN  [sensors] Remaining sensors after failover event 0: Accel #0 priority: 1
WARN  [sensors] Remaining sensors after failover event 0: Accel #1 priority: 1
ERROR [sensors] Gyro #0 fail:  TIMEOUT!
ERROR [sensors] Sensor Gyro #0 failed. Reconfiguring sensor priorities.
WARN  [sensors] Remaining sensors after failover event 0: Gyro #0 priority: 1
WARN  [ekf2] accel id changed, resetting IMU bias
ERROR [sensors] Accel #0 fail:  TIMEOUT!
ERROR [sensors] Sensor Accel #0 failed. Reconfiguring sensor priorities.
WARN  [sensors] Remaining sensors after failover event 0: Accel #0 priority: 1
WARN  [sensors] Remaining sensors after failover event 0: Accel #1 priority: 1
ERROR [sensors] Gyro #0 fail:  TIMEOUT!
ERROR [sensors] Sensor Gyro #0 failed. Reconfiguring sensor priorities.
WARN  [sensors] Remaining sensors after failover event 0: Gyro #0 priority: 1

Steps to reproduce the behavior:

  1. Install the Cygwin setup as documented here
  2. Build and run jMAVSim as documented. Basically, I ran "make posix jmavsim" in the Cygwin console

Expected behavior
I expect the drone to take off in SITL.

Log Files and Screenshots
Always provide a link to the flight log file:

  • Download the flight log file from the vehicle (tutorial).
  • Share the link to a log showing the problem on PX4 Flight Review.

I have attached a screenshot of the Cygwin console.

Laptop config: Intel Core i7-7700HQ CPU @ 2.8GHz, 4 cores, 16GB RAM

@MaEtUgR
Member

MaEtUgR commented Aug 12, 2018

@amitmc19 I freshly downloaded the 0.3 toolchain and something seems broken with the simulation on master 😞 I didn't get the errors you have but the simulation binary px4.exe fails to run.

As a quick fix, this is what I tested and got working on my machine:

  • open run-console.bat
  • execute
    cd Firmware
    git checkout stable
    git submodule update --init --recursive
    make posix jmavsim
    
    This switches to the last stable release version of PX4.

I need to look into what's broken on the most recent version.
Thanks for reporting!

@hamishwillee
Contributor

@MaEtUgR I just ran this using the "clone the PX4 repository, build and run simulation with jMAVSim" option. SIM started fine, but (as usual) the sky is displaying as black in jMAVSim.

@MaEtUgR
Member

MaEtUgR commented Aug 22, 2018

Update: @bkueng's suspicion was correct: the bash shell startup scripting introduced with #10173 broke the Cygwin simulation. I bisected to verify:
[Image: bisect2]
I haven't found out why yet, hopefully it won't take me too long 🤔

@MaEtUgR
Member

MaEtUgR commented Aug 23, 2018

0c5c741 likely introduces multiple problems...

  • Thanks to @julianoes's help we found out that the close() followed by open() for every client here https://github.com/PX4/Firmware/blob/master/platforms/posix/src/px4_daemon/server.cpp#L156-L157 breaks, with open() returning "device or resource busy". Recreating the pipe each time with unlink() and mkfifo() in between seems to fix it, but the next client may already run into an abandoned pipe and therefore has to retry. With that, the autostart script runs through and PX4 starts up, but:
  • Sensors always reports that all the sensor data is stale, exactly like in the OP's screenshot. When I run listener (which also works from an external shell) I see data, but its timestamp is always from the exact moment I last ran listener (e.g. if I wait 10 seconds and run listener again, the last message on the uORB topic is 10 seconds old).
  • Also, Ctrl+C or the shutdown command writes "Exiting..." and "Shuting down" and makes the shell unusable, but the px4 binary stays open and blocks until I kill it.
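
The recreate-and-retry idea from the first bullet can be sketched roughly as follows. This is a minimal Python illustration of the same POSIX calls (unlink(), mkfifo(), open()), not the code in server.cpp. The function names, the pipe name, the retry count and delay, and the choice of ENXIO/ENOENT as the "try again" errors are all assumptions made for the example.

```python
import errno
import os
import tempfile
import time


def recreate_fifo(path):
    """Server side: drop any existing pipe at `path` and create a fresh one."""
    try:
        os.unlink(path)
    except FileNotFoundError:
        pass  # nothing to remove on first use
    os.mkfifo(path)


def open_writer_with_retry(path, attempts=5, delay_s=0.05):
    """Client side: open the pipe for writing, retrying a few times.

    Assumed retryable errors (not taken from the PX4 code):
      ENXIO  - the FIFO exists but nobody has it open for reading
      ENOENT - the path is briefly missing between unlink() and mkfifo()
    Returns a file descriptor, or None if every attempt failed.
    """
    for _ in range(attempts):
        try:
            return os.open(path, os.O_WRONLY | os.O_NONBLOCK)
        except OSError as exc:
            if exc.errno not in (errno.ENXIO, errno.ENOENT):
                raise
            time.sleep(delay_s)
    return None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "client_pipe")  # hypothetical pipe name
        recreate_fifo(path)
        # Stand-in for the server end; non-blocking so the demo cannot hang.
        reader = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
        writer = open_writer_with_retry(path)
        os.write(writer, b"hello")
        print(os.read(reader, 5))
        os.close(writer)
        os.close(reader)
```

The retry lives on the client side because, as described above, a client can arrive while the pipe it expects is gone or has no reader yet.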

So there's not much good news yet, but I won't give up and will keep looking for the causes.

@MaEtUgR
Member

MaEtUgR commented Aug 23, 2018

Update: Ctrl+C and the shutdown command do not work because there's a thread of the px4 binary hanging in a system call in every server...

@LorenzMeier
Member

That might need additional signal handling. @bkueng I recall we're doing that in the PX4 posix shell?
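
For illustration only: one common way to keep a worker thread from sitting forever in a blocking call is to wait with a timeout and re-check a stop flag. The sketch below shows that generic pattern in Python with select() on an ordinary pipe. It is not taken from the PX4 posix shell, and the names serve_until_stopped, stop_event and poll_s are hypothetical.

```python
import os
import select
import threading


def serve_until_stopped(read_fd, stop_event, handled, poll_s=0.05):
    """Read from `read_fd` until `stop_event` is set or the writer closes.

    select() with a timeout stands in for a plain blocking read, so the loop
    wakes up regularly and can notice a shutdown request.
    """
    while not stop_event.is_set():
        ready, _, _ = select.select([read_fd], [], [], poll_s)
        if not ready:
            continue  # timed out with no data; re-check the stop flag
        data = os.read(read_fd, 4096)
        if not data:
            break  # end of file: the writer closed the pipe
        handled.append(data)


if __name__ == "__main__":
    read_fd, write_fd = os.pipe()
    stop_event = threading.Event()
    handled = []
    worker = threading.Thread(
        target=serve_until_stopped, args=(read_fd, stop_event, handled)
    )
    worker.start()
    os.write(write_fd, b"ping")
    stop_event.wait(0.2)  # give the worker a moment; the flag is not set yet
    stop_event.set()      # request shutdown
    worker.join(timeout=2.0)
    print("worker stopped:", not worker.is_alive())
```

Whether the signal handling suggested here would be enough on Cygwin is not something this sketch can answer; it only shows why a thread that never returns from a system call cannot react to a shutdown command.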

@MaEtUgR
Member

MaEtUgR commented Aug 23, 2018

I found something that fixed all the problems for a few tests: replacing every posix shell pipe open(..., O_RDONLY) with O_RDWR. Any ideas are welcome as to why, on Cygwin, open() system calls on pipes created using mkfifo() mostly fail when opened read-only and work when opened read-write. I'm now trying to find a way to fix it consistently.

EDIT: I found a hint: "Opening a FIFO for reading normally blocks until some other process opens the same FIFO for writing, and vice versa."
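
The quoted hint can be checked with a small probe. This is a sketch in Python using the same flags, with two assumptions stated up front: O_NONBLOCK is added to every open so the probe returns instead of hanging, and the results described are those of Linux (POSIX leaves O_RDWR on a FIFO undefined, and Cygwin may behave differently, which is the subject of this thread). The helper name probe_fifo_open and the FIFO name are hypothetical.

```python
import errno
import os
import tempfile


def probe_fifo_open(flags):
    """Create a fresh FIFO, try to open it with `flags`, and report the outcome.

    Returns "ok" on success or the symbolic errno name (e.g. "ENXIO") on failure.
    O_NONBLOCK is always added so the call cannot block waiting for a peer.
    """
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo_fifo")  # hypothetical name
        os.mkfifo(path)
        try:
            fd = os.open(path, flags | os.O_NONBLOCK)
        except OSError as exc:
            return errno.errorcode.get(exc.errno, str(exc.errno))
        os.close(fd)
        return "ok"


if __name__ == "__main__":
    # Write-only with no reader: a blocking open would wait here; the
    # non-blocking one fails instead.
    print("O_WRONLY:", probe_fifo_open(os.O_WRONLY))
    # Read-only with no writer: a blocking open would wait here; the
    # non-blocking one returns immediately.
    print("O_RDONLY:", probe_fifo_open(os.O_RDONLY))
    # Read-write: the process holds both ends itself, so there is no peer
    # to wait for.
    print("O_RDWR:", probe_fifo_open(os.O_RDWR))
```

On Linux the write-only probe reports ENXIO and the other two succeed, which matches the idea that an O_RDWR open never has to wait for the other side.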

@evgenee

evgenee commented Sep 7, 2018

Hi, I have the same problem with the sensors as @amitmc19.
Is there any progress with this issue?
Thank you.

@julianoes
Contributor

@evgenee It looks like there is a PR, you can try it out: #10371

@evgenee

evgenee commented Sep 10, 2018

Thank you @julianoes. As I understand from #10371, the bug source is in the cygwin and it was not solved yet, but patched...
I've just started to work with px4 (I worked with other flight stacks before). The question is whether previous versions of px4 sitl simulations on windows had similar problems with the cygwin?
Thanks

@MaEtUgR
Member

MaEtUgR commented Sep 18, 2018

@evgenee

The question is whether previous versions of px4 sitl simulations on windows had similar problems with the cygwin?

No, as written just above: #10098 (comment)

the bug source is in the cygwin

Yes, it doesn't fully support all the named pipe calls introduced with the posix shell changes: #10173. My PR #10371 aims to fix the startup script problem, please let me know if it works for you. You can read the detailed PR description; it states all my findings so far. I'll keep on debugging now that I'm back from holidays.

@evgenee

evgenee commented Sep 18, 2018

Hi, @MaEtUgR,
Thanks for your contribution. I followed your PR, so it's working now.
May be you know whether there is native windows build of px4 for the sitl runs?
Thank you

@hamishwillee
Contributor

@MaEtUgR @evgenee

May be you know whether there is native windows build of px4 for the sitl runs?

I use tagged release 1.8.0

@evgenee

evgenee commented Sep 19, 2018

Thank you @hamishwillee, @MaEtUgR

Can you please be more specific about the native Windows build in 1.8.0? I didn't find anything related to this on GitHub...
Thanks.

@hamishwillee
Contributor

hamishwillee commented Sep 19, 2018

@evgenee So all our instructions assume you want to get the latest version of the master branch from either the PX4/Firmware repo or your clone of that repo. But if that latest version does not work, you might want to go back in time and get an older version that does work.

The way to do that is to get a specific tag release. You can see all the tags and releases here.

So after doing:

git clone https://github.com/PX4/Firmware.git

To get PX4 code at the time a tag was created you do:

git checkout tags/v1.8.0

If you have already built this you might need to run make distclean or similar before building.

There is a note on this at the end of this section: http://dev.px4.io/en/setup/building_px4.html#get_px4_code

Hope that helps. If you get stuck, search Google for information on git and tags.

@MaEtUgR
Member

MaEtUgR commented Oct 16, 2018

@amitmc19 I'm sorry, I think I misinterpreted your problem. It seems the exact problem in your description is still not fixed. I made a workaround for the simulation not starting because of the bash shell startup script, but I still have performance issues, see #10693.

@evgenee The PX4 1.8 release should work fine on Cygwin. If by native Windows build you mean a Microsoft Visual C++ build: no, that does not exist and would require a lot of interface rewrite effort, so I don't see this happening anytime soon.
