Boot gets stuck with eGPU connected #93

Open
harvzor opened this issue Nov 6, 2022 · 7 comments

harvzor commented Nov 6, 2022

If I connect the eGPU and try to boot, the computer gets to a screen with the Kubuntu logo glowing, but then freezes forever. Boot is only possible with the eGPU disconnected.

Workaround

However, I can get the eGPU running by doing this:

  1. Log in.
  2. Run glxinfo|egrep "OpenGL vendor|OpenGL renderer*", which confirms my internal GPU is being used.
  3. Run sudo egpu-switcher switch, which tells me the switch is successful.
  4. Run glxinfo|egrep "OpenGL vendor|OpenGL renderer*" again, which confirms my internal GPU is still being used.
  5. Log out.
  6. Log in.
  7. Run glxinfo|egrep "OpenGL vendor|OpenGL renderer*", which now prints:
     OpenGL vendor string: NVIDIA Corporation
     OpenGL renderer string: NVIDIA GeForce RTX 2060 SUPER/PCIe/SSE2
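
A minimal sketch of the renderer check repeated in the steps above, assuming glxinfo prints a single line starting with "OpenGL vendor string:" as shown in step 7. The function names gl_vendor and using_nvidia are hypothetical helpers, not part of egpu-switcher.

```shell
# Print the value of the "OpenGL vendor string" line read from stdin.
gl_vendor() {
    sed -n 's/^OpenGL vendor string: *//p'
}

# Succeed if the vendor reported on stdin is NVIDIA.
using_nvidia() {
    case "$(gl_vendor)" in
        *NVIDIA*) return 0 ;;
        *) return 1 ;;
    esac
}
```

Possible usage after logging back in: glxinfo | using_nvidia && echo "eGPU is rendering".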

I'm pretty happy with the workaround; it just took me a while to figure it out 😄

Other troubleshooting I've tried

> Check whether the egpu.service is still enabled in systemd and check its output in journalctl.

When I check the journalctl after attempting to boot with the eGPU connected, there's no history of that boot.

> In case there is a race-condition happening between bolt and egpu-switcher, try enabling Pre-Boot ACL in the BIOS and re-authorize your eGPU. With this setting enabled, your eGPU gets connected much faster. Be aware that this setting only makes sense with the Thunderbolt Security set to user (see #50).

My BIOS does not have a Pre-Boot ACL option.

I tried using ExecStartPre=/bin/sleep 1 as mentioned in #77 but it doesn't help. I still need to try the 5s pause solution in #50.
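
For reference, one way such a delay could be applied is a systemd drop-in rather than an edit to the unit file itself. This is a sketch under assumptions: the drop-in file name delay.conf and the helper write_delay_dropin are hypothetical, and the unit is taken to be egpu.service as named above.

```shell
# Write a drop-in that adds a sleep before the unit's main command.
# $1: drop-in directory, e.g. /etc/systemd/system/egpu.service.d
# $2: delay in seconds
write_delay_dropin() {
    mkdir -p "$1" || return 1
    printf '[Service]\nExecStartPre=/bin/sleep %s\n' "$2" > "$1/delay.conf"
}
```

Run from a root shell, this would be write_delay_dropin /etc/systemd/system/egpu.service.d 5 followed by systemctl daemon-reload. sudo systemctl edit egpu.service reaches the same result interactively.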

General setup

Using internal monitor.

ie.

  • Did you install egpu-switcher via ppa or via git + make? Downloaded the 0.18.1 release manually from GitHub and installed it.
  • What Linux distribution (+ version) are you using? Kubuntu 22.10
  • What brand / model is your laptop? Lenovo Yoga 720-13IKB
  • What brand / model is your GPU (+ enclosure)? R43SG-TB3 with RTX 2060 Super
  • What drivers (+ version) are you using? 515.65.01
  • What Desktop-Environment do you use (+ Display-Manager)? KDE on X11
  • If you are not using a Desktop-Environment, what Window-Manager do you use?
@hertg
Owner

hertg commented Nov 6, 2022

Hmm, I don't have an immediate hunch what could be the problem here, but I can give you some tips to further troubleshoot.

> If I connect the eGPU and try to boot, the computer gets to a screen with the Kubuntu logo glowing, but then freezes forever

Does the computer actually freeze or are you just stuck in the bootup screen? I think on most systems it is possible to show kernel messages on startup by hitting Home or Esc during bootup, maybe there is some useful information hidden behind the Splashscreen. More info: https://wiki.archlinux.org/title/Plymouth

> When I check the journalctl after attempting to boot with the eGPU connected, there's no history of that boot.

That is strange 🤔 The following command does not give you any data about the attempted boot?

sudo journalctl -k -b -1

I find it strange that the Splashscreen shows up, indicating that it actually attempts to boot.
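
A small sketch for narrowing down the output of the command above, assuming the relevant kernel lines mention nvidia, NVRM, or thunderbolt. The pattern list and the helper name gpu_related are assumptions and may need adjusting.

```shell
# Keep only log lines from stdin that mention the GPU driver or Thunderbolt.
gpu_related() {
    grep -i -E 'nvidia|nvrm|thunderbolt'
}
```

Possible usage: sudo journalctl --list-boots shows which boots the journal has recorded, and sudo journalctl -k -b -1 --no-pager | gpu_related filters the previous boot's kernel messages.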

> My BIOS does not have a Pre-Boot ACL option.

Sometimes it's not called Pre-Boot ACL (that's just what my Lenovo X1 laptop calls it), so such an option could be available under a different name on your machine. But it is also possible that this option doesn't exist on your hardware: since you are also using a Lenovo, they would probably have named it the same.

> I still need to try the 5s pause solution in #50.

If you are on the latest version 0.18.0+ (which you seem to be on), the source code mentioned there no longer exists as it's been rewritten. But you can tweak the startup delay in the config /etc/egpu-switcher/config.yaml now by changing the detection retries and interval. More info: https://github.com/hertg/egpu-switcher#configuration

@harvzor
Author

harvzor commented Nov 10, 2022

Thanks for the help!

> Does the computer actually freeze or are you just stuck in the bootup screen? I think on most systems it is possible to show kernel messages on startup by hitting Home or Esc during bootup

Thanks, I've found the messages now. When booting with the eGPU attached, the messages stop after a few seconds. On a normal boot, hitting Escape again brings me back to the Kubuntu logo, but after the freeze it does nothing, so I think the system really is freezing.

The messages displayed right before freezing change every time I try booting, but here's one of the boot attempts:

[Screenshot of boot messages: 1668100964670]

That boot seems to mention some issues with the Nvidia GPU I'm using.

> sudo journalctl -k -b -1

Okay this does indeed give me the messages from the failed boot! I was looking in the wrong place before.

I can't find anything in the output mentioning egpu-switcher. Here's the output:

egpu-boot-journal.txt

The messages stop after Started Journal Service, which I confirmed with another bad eGPU boot. On a good boot (with the internal GPU), the messages continue after Started Journal Service. Here's a good boot example:

internal-gpu-journal.txt

My guess is that on a "bad boot", the system isn't actually freezing/crashing, but perhaps I'm just getting no visual feedback? I'm not sure.

> But you can tweak the startup delay in the config /etc/egpu-switcher/config.yaml

I checked /etc/egpu-switcher/ but there's no config.yaml inside, despite having run sudo egpu-switcher config and sudo egpu-switcher enable. Very strange. I also searched for any other folders named egpu-switcher and my system found /usr/share/egpu-switcher/, but this also has no config file. Maybe this is where the problem is coming from?

Steps I used to install

  1. Download the egpu-switcher-amd64 release from GitHub (0.18.1).
  2. Rename with sudo mv egpu-switcher-amd64 egpu-switcher (a very important step, as I believe the egpu.service assumes the binary is called egpu-switcher).
  3. Copy to the install directory with sudo cp egpu-switcher /opt/
  4. Link with sudo ln -s /opt/egpu-switcher /usr/bin
  5. Now, with a refreshed terminal, I can use sudo egpu-switcher from anywhere.
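
The steps above can be condensed into a short sketch. It assumes the downloaded file only needs to be executable and named egpu-switcher. The helper install_binary and its three parameters are hypothetical, and it uses install from coreutils in place of the mv and cp steps.

```shell
# Install a downloaded binary under a fixed name and link it onto PATH.
# $1: downloaded file, $2: install directory, $3: directory on PATH
install_binary() {
    # Copy, rename and mark executable in one step.
    install -m 0755 "$1" "$2/egpu-switcher" || return 1
    # Create or refresh the symlink that makes the command available.
    ln -sf "$2/egpu-switcher" "$3/egpu-switcher"
}
```

Run from a root shell, this would be install_binary egpu-switcher-amd64 /opt /usr/bin.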

@hertg
Owner

hertg commented Nov 11, 2022

Hmm, I don't know. It could very well be an issue unrelated to egpu-switcher.

It is strange that you don't find the config in /etc/egpu-switcher after running egpu-switcher config. Does it show up when you ls with root privileges? It is also strange that you don't have any files in /usr/share/egpu-switcher; that is where the egpu.service that gets symlinked is stored. Did you run egpu-switcher enable? Simply running config will not enable the service. Maybe also check there with sudo ls.
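
A quick way to run both checks at once might look like the sketch below. The helper report_paths is hypothetical, and the two paths in the usage note are the ones mentioned in this thread.

```shell
# Report whether each path given as an argument exists.
report_paths() {
    for p in "$@"; do
        if [ -e "$p" ]; then
            printf 'present: %s\n' "$p"
        else
            printf 'missing: %s\n' "$p"
        fi
    done
}
```

Possible usage from a root shell: report_paths /etc/egpu-switcher/config.yaml /usr/share/egpu-switcher/egpu.service, followed by systemctl is-enabled egpu.service to see whether the unit is enabled.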

If the system doesn't boot even with egpu.service disabled, egpu-switcher itself is probably not at fault here.

Does your laptop have a discrete GPU or only integrated graphics? Do you have an option to change Hybrid Graphics to Discrete Graphics in BIOS (It may be called differently)? I remember that changing that has resolved a similar freeze issue on my old laptop.

btw. If you add the --verbose flag when running egpu-switcher commands, you might get some additional debug output.

@Man-with-Arrow

It sounds like the same issue I ran into on Ubuntu 22.04, which was caused by upgrading to kernel 5.15.0-53. Booting 5.15.0-52 fixes it. @harvzor, which kernel version are you running?

@harvzor
Author

harvzor commented Nov 21, 2022

~ uname -r
5.19.0-23-generic

@Man-with-Arrow Looks like I'm quite a few versions ahead 👀
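
The version comparison can also be done mechanically. This is a sketch that assumes GNU sort with the -V (version sort) option is available. The helper kernel_newer_than is hypothetical.

```shell
# Succeed if kernel release $1 sorts strictly after release $2.
kernel_newer_than() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n 1)" = "$1" ]
}
```

Possible usage: kernel_newer_than "$(uname -r)" 5.15.0-53 && echo "running a newer kernel than 5.15.0-53".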

@Man-with-Arrow

> ~ uname -r
> 5.19.0-23-generic
>
> @Man-with-Arrow Looks like I'm quite a few versions ahead 👀

It seems there’s a regression, either in Ubuntu’s kernel or upstream. I don’t have the time for a reinstall, but if you do - try testing it out with Fedora. I remember it working well for me in the past.

@hertg
Owner

hertg commented Mar 2, 2023

@harvzor Is there any news on this by any chance? Do you still not have config files in /etc/egpu-switcher/ and no systemd unit in /usr/share/egpu-switcher/egpu.service despite running the config and enable commands?

I saw in your screenshot that you have AppArmor installed. Is it possible that egpu-switcher gets blocked by that?
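
One way to look for such a block is to search the kernel log for AppArmor denials. The sketch below assumes denials are logged with the usual apparmor="DENIED" marker. The helper apparmor_denials_for is hypothetical, and whether any AppArmor profile applies to egpu-switcher at all is not known from this thread.

```shell
# Keep AppArmor denial lines from stdin that mention a given string.
# $1: string to look for, e.g. a binary name or path
apparmor_denials_for() {
    grep -F 'apparmor="DENIED"' | grep -F -- "$1"
}
```

Possible usage: sudo journalctl -k -b --no-pager | apparmor_denials_for egpu-switcher. No output would suggest AppArmor did not deny anything matching that name during the current boot. sudo aa-status lists the loaded profiles.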

hertg moved this to Feedback in egpu-switcher Mar 2, 2023