[Lenovo Ideapad 330] System unresponsive on wakeup from suspend on battery mode #593

Closed
royarg02 opened this issue Oct 20, 2021 · 25 comments

Comments

@royarg02

royarg02 commented Oct 20, 2021

[x] I've read and accepted the Bug Reporting Howto
[x] I've attached all required tlp-stat outputs via Gist (see below)

Describe the bug

On battery mode, the laptop freezes when I try to wake it up (showing either a black screen with a stationary cursor at the top left, or a static lock screen) and becomes unresponsive to input.

Suspending manually through systemctl suspend exhibits the same issue, even from a tty. In that case I can see the console with the previous commands, their output, and the prompt, but I cannot type anything, as if I were just looking at a screenshot of the console.

This does not happen with previous TLP versions such as 1.3.1; it only appears in 1.4.0.

Expected behavior

On wakeup from suspend, the system actually wakes up and accepts input.

To Reproduce

Steps to reproduce the unexpected behavior:

  1. Does the problem occur on battery or AC or both? On battery mode only.
  2. Actions to reproduce the behaviour:
    • Run tlp in battery mode.
    • Suspend the system, by either closing the lid or issuing systemctl suspend.
    • On wakeup, the system freezes.
  3. Shell commands entered and their output: not applicable here.
  4. Full output of tlp-stat via https://gist.github.com/ for all
    matching cases of 1.
    Github gist
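
The reproduction steps above can be sketched as a small shell function. This is only an illustration: it assumes that "battery mode" is forced with tlp bat (TLP's manual mode), which the report does not state explicitly, and the function name repro_suspend and the DRY_RUN variable are made up for this example. By default it only prints the commands instead of running them.

```shell
# Sketch only: force TLP's battery profile, then suspend.
# Assumption: "battery mode" is entered via "tlp bat" (manual mode).
# DRY_RUN defaults to 1, so nothing is executed unless DRY_RUN=0 is set.
repro_suspend() {
    dry_run="${DRY_RUN:-1}"
    for cmd in "sudo tlp bat" "systemctl suspend"; do
        if [ "$dry_run" = "1" ]; then
            printf '%s\n' "$cmd"
        else
            # Unquoted on purpose so the command string is split into words.
            $cmd || return 1
        fi
    done
}
```

Running DRY_RUN=0 repro_suspend would actually suspend the machine, so save your work first; a hard reboot may be needed if the freeze occurs.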

Additional context

  • Interestingly, this issue also occurs while plugged in with TLP running in battery mode and, conversely, does not occur while running on battery with TLP in AC mode.
  • I've followed the troubleshooting guide; disabling runtime PM and USB autosuspend didn't help.
  • Pretty sure this is a distribution-agnostic issue; it occurs on both Manjaro and Artix Linux.
  • Similar behavior seems to be reported at [tlp 1.4.0-1] USB ports don't work on battery #587 (comment). Suffice it to say, the only way to recover from this issue is a hard reboot.
@royarg02
Author

royarg02 commented Nov 5, 2021

This issue isn't reproducible after I switched to LTS kernel (5.10.76-1-lts).

@royarg02
Author

royarg02 commented Nov 25, 2021

The issue has started to appear on the LTS kernel as well, even with version 1.3.1. It was only resolved after I uninstalled the NVIDIA drivers.

I went back to the stable kernel 5.15.4-artix1-1 and uninstalled the NVIDIA drivers there as well; now both 1.3.1 and 1.4.0 are working fine.

Following this, I reinstalled the NVIDIA driver and denylisted it (RUNTIME_PM_DRIVER_DENYLIST="nvidia") along with the PCIe device (RUNTIME_PM_DISABLE="xx:xx.x"), but that didn't resolve the issue.
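
For readers who want to try the same exclusion, the two settings could be placed in a drop-in file roughly as follows. This is a sketch: the file name 50-nvidia.conf is hypothetical, and xx:xx.x is the placeholder from the comment itself, not a real address.

```shell
# /etc/tlp.d/50-nvidia.conf  (hypothetical drop-in file name)

# Exclude devices bound to the nvidia driver from runtime PM.
RUNTIME_PM_DRIVER_DENYLIST="nvidia"

# Keep runtime PM disabled for one PCIe device.
# xx:xx.x is a placeholder; substitute the address that lspci reports for the GPU.
RUNTIME_PM_DISABLE="xx:xx.x"
```

Assigning RUNTIME_PM_DRIVER_DENYLIST replaces whatever driver list TLP would otherwise use, so it is worth checking the defaults for the installed version before overriding it.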

Edit: Nope. Still encountering the issue. Removing the drivers did help, though: it no longer freezes at every wakeup but roughly once every 30 wakeup attempts, which is still not zero.

@linrunner
Owner

I consider this a kernel bug - or worse: a bug in the proprietary nvidia driver - and I have no idea what I could do about it on the part of TLP. Therefore, I will close here.

Nevertheless, you are welcome to share further insights or the solution here.

@linrunner
Owner

In the meantime I have gained new insights into this problem. Does the workaround added to the FAQ solve your issue?

AHCI_RUNTIME_PM_ON_BAT=on
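
One way to check whether the setting took effect is to read the same sysfs files that tlp-stat -d prints under "Runtime PM". The helper below is a minimal sketch; the function name show_disk_runtime_pm is made up, and the sysfs root is a parameter only so that the function can be tried against a test directory.

```shell
# Print the runtime PM "control" value of each block device below a sysfs root.
# "on" means the device is kept powered (runtime PM disabled), "auto" allows runtime PM.
show_disk_runtime_pm() {
    sysroot="${1:-/sys}"
    for ctl in "$sysroot"/block/*/device/power/control; do
        # Skip devices without this file (and the unexpanded glob if nothing matches).
        [ -r "$ctl" ] || continue
        dev="${ctl#"$sysroot"/block/}"
        dev="${dev%%/*}"
        printf '%s = %s\n' "$dev" "$(cat "$ctl")"
    done
}
```

After adding the line to the configuration and running sudo tlp start on battery, show_disk_runtime_pm should list the affected disks with the value on if the workaround is active.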

@royarg02
Author

Thanks for your suggestion. However, as it is difficult to recreate this issue without the NVIDIA drivers, I had them installed for my first couple of tests. It didn't resolve the issue.

Your suggestion also made me notice that my laptop occasionally appears to freeze on resume until I hear the HDD spinning up, after which it responds to input.

I currently have runtime PM disabled with the NVIDIA drivers removed, and everything seems to be working fine. I plan to replace the HDD with an SSD sometime soon to see if that helps.

@timlag1305

I ran into this same issue on my Lenovo T530. I do not have an Nvidia GPU. Setting AHCI_RUNTIME_PM_ON_BAT=on resolved it for me.

@jpcloureiro

> I ran into this same issue on my Lenovo T530. I do not have an Nvidia GPU. Setting AHCI_RUNTIME_PM_ON_BAT=on resolved it for me.

Can confirm the exact behavior on a T450s

Kernel: 5.16.8
TLP: 1.5.0

@timlag1305

For me, I've tried both current kernel and LTS kernel with the same outcome:
Kernel: 5.16.8, 5.15.22
TLP: 1.5.0

@linrunner
Owner

linrunner commented Feb 15, 2022

@timlag1305, @jpcloureiro: too little information about your systems. Please show:

tlp-stat -s -d --cdiff

Btw: my T450s with kernel 5.16.8 works fine ...

@timlag1305

Sorry about that @linrunner. I missed that. Here is the gist. tlp_fixed.txt is when I have AHCI_RUNTIME_PM_ON_BAT="on" and tlp_broken.txt is when I have that commented out.

https://gist.github.com/timlag1305/01d5eacb5119b1e418a9a851aa522d14

@timlag1305

Also unfortunately I couldn't find any useful logs pointing to any particular issue when I ran journalctl -b -1. The last entries were from the kernel relating to suspending, but there were no associated errors.
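
When looking through the previous boot's log for this kind of freeze, it can help to narrow the kernel messages down to suspend, resume, and disk controller lines. The filter below is a sketch; the function name and the choice of keywords are assumptions made for illustration, not something TLP or the kernel prescribes.

```shell
# Keep only log lines that mention power management, suspend/resume, or SATA/AHCI.
# Reads from standard input and writes matching lines to standard output.
filter_suspend_lines() {
    grep -iE 'PM: |suspend|resume|ata[0-9]+|ahci'
}

# Example use on the kernel messages of the previous boot:
#   journalctl -k -b -1 --no-pager | filter_suspend_lines | tail -n 40
```

Since a hard reset is needed to recover, the last messages before the freeze may never have been written to disk, which would match the observation that the log shows no errors.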

@bymoz089

bymoz089 commented Oct 7, 2022

I encounter this same nasty freeze on my ThinkPad X230. Setting AHCI_RUNTIME_PM_ON_BAT=on prevents the freeze.

It took me weeks to locate the cause of this freeze, because:

  • the freeze happens only on the second wakeup from sleep when on battery and booted with AC, or on the first wakeup when the system was booted on battery only
  • there are absolutely no hints in the logs that this freeze occurred, or why
  • the system freezes with the following behavior: it is still possible to switch VTs (via CTRL+ALT+F[1..12]), but the command prompt (cursor) no longer blinks and it is impossible to type anything. Switching to a VT that runs X11 freezes the device entirely.
  • you can only get out of that freeze by a hard reset
  • this freeze happens, e.g., with the following combination of software:
    • TLP v 1.5.0 and Linux kernel 5.18.x or Linux kernel 5.10.x
  • it does not happen when using TLP 1.3.1 (with any Linux kernel), or when using Linux kernel v 4.19.x

@IOsetting

> I ran into this same issue on my Lenovo T530. I do not have an Nvidia GPU. Setting AHCI_RUNTIME_PM_ON_BAT=on resolved it for me.

I hit the same issue on my T450s. System info:

System         = LENOVO ThinkPad T450s 20BWS3V600
BIOS           = JBET73WW (1.37 )
OS Release     = Ubuntu 22.04.1 LTS
Kernel         = 5.15.0-56-generic #62-Ubuntu SMP Tue Nov 22 19:54:14 UTC 2022 x86_64
/proc/cmdline  = BOOT_IMAGE=/boot/vmlinuz-5.15.0-56-generic root=UUID=f987015c-7058-4890-a735-50582de22da3 ro quiet splash vt.handoff=7
Init system    = systemd v249 (249.11-0ubuntu3.6)
Boot mode      = UEFI

@linrunner
Owner

linrunner commented Dec 11, 2022

@IOsetting : My T450s with same OS/kernel is still not affected.

@IOsetting @bymoz089 My suspicion is that it is due to the particular SSD. You have not posted complete outputs. I would need at least

tlp-stat --cdiff -s -d

Via https://gist.github.com/ if you don't mind.

@IOsetting

Hi Linrunner, this is the command output https://gist.github.com/IOsetting/ab2be40564e7904fb697f87884667edf

@linrunner
Owner

linrunner commented Dec 13, 2022

@IOsetting something comes to my mind. Please add

X_TLP_SUSPEND_ACMODE=1

to your configuration and also comment out your change to AHCI_RUNTIME_PM_ON_BAT

#AHCI_RUNTIME_PM_ON_BAT=on

Then test whether suspend/resume works. Show

tlp-stat --cdiff -s -d

afterwards.
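
The configuration change requested above can be scripted. The two helpers below are a minimal sketch; the function names are made up for this example, and the config file to edit is passed in explicitly because the thread does not say which file holds the settings.

```shell
# Comment out every active assignment of KEY in FILE by prefixing it with "#".
comment_out_setting() {
    key="$1"; file="$2"
    tmp="$(mktemp)" || return 1
    if sed "s/^[[:space:]]*${key}=/#&/" "$file" > "$tmp"; then
        cat "$tmp" > "$file"
    fi
    rm -f "$tmp"
}

# Append KEY=VALUE to FILE unless an active assignment of KEY already exists.
set_setting() {
    key="$1"; value="$2"; file="$3"
    grep -q "^[[:space:]]*${key}=" "$file" || printf '%s=%s\n' "$key" "$value" >> "$file"
}
```

Applied to a TLP config file (as root), this would be comment_out_setting AHCI_RUNTIME_PM_ON_BAT FILE followed by set_setting X_TLP_SUSPEND_ACMODE 1 FILE, then sudo tlp start to apply the settings before testing suspend/resume.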

PS: Gloway is unknown to me as an SSD brand.

@IOsetting

Thank you, I have changed the configuration. Suspend and resume works with this configuration.
I appended the command output to the gist https://gist.github.com/IOsetting/ab2be40564e7904fb697f87884667edf

@linrunner
Owner

linrunner commented Dec 15, 2022

@IOsetting Great. I'll have to think about bringing this deprecated feature back to life with the next release.

@vhuuyt

vhuuyt commented Jan 26, 2023

I have the same issue on my laptop; here is the output. I also noticed that after setting AHCI_RUNTIME_PM_ON_BAT="on", my laptop remembers the screen brightness I last set (it behaves just as it did without TLP installed).

Hope this helps you locate the problem.

TLP is very helpful 👍, but I didn't discover this software until yesterday :(
Thanks for all the things you have done.

@linrunner
Owner

@IOsetting @vhuuyt @bymoz089 The workaround is now integrated directly into TLP. Could those affected please test with the main branch?

Before that, please neutralize the settings for the workaround by removing or commenting them:

#X_TLP_SUSPEND_ACMODE=1
#AHCI_RUNTIME_PM_ON_BAT=on
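
Before testing the main branch, it may be worth confirming that neither workaround is still active anywhere in the configuration. The check below is a sketch; the function name is made up, and which files to scan is left to the caller.

```shell
# Print every active (uncommented) workaround assignment found in the given files.
# Prints nothing if both settings are removed or commented out.
find_active_workarounds() {
    grep -hE '^[[:space:]]*(X_TLP_SUSPEND_ACMODE|AHCI_RUNTIME_PM_ON_BAT)=' "$@" 2>/dev/null
}

# Example: find_active_workarounds /etc/tlp.conf /etc/tlp.d/*.conf
```

An empty result means the test exercises the integrated fix rather than the manual workaround; tlp-stat --cdiff should then no longer list either setting.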

@noctuid

noctuid commented Apr 1, 2023

I disabled TLP a while ago because AHCI_RUNTIME_PM_ON_BAT=on did not fix this issue for my ThinkPad P52 (and seemed to cause flickering issues). I never came back to try X_TLP_SUSPEND_ACMODE=1, but on master it looks so far like resume on battery is now working.

@linrunner
Owner

linrunner commented Apr 1, 2023

@noctuid thanks, your feedback is very valuable. Could you please provide the output of

sudo tlp-stat -s -d --cdiff

@noctuid

noctuid commented Apr 1, 2023

--- TLP 1.6.0-alpha.0 --------------------------------------------

+++ Configured Settings (only differences to defaults):
/etc/tlp.d/my.conf L0008: START_CHARGE_THRESH_BAT0="75"
/etc/tlp.d/my.conf L0009: STOP_CHARGE_THRESH_BAT0="80"

+++ System Info
System         = LENOVO ThinkPad P52 20M9CTO1WW
BIOS           = N2CET63W (1.46 )
EC Firmware    = 1.16
OS Release     = Arch Linux
Kernel         = 6.2.8-zen1-1-zen #1 ZEN SMP PREEMPT_DYNAMIC Wed, 22 Mar 2023 22:52:38 +0000 x86_64
/proc/cmdline  = cryptdevice=LABEL=cryptlinux:cryptlinux:allow-discards resume=/dev/cryptlinux_group/swap root=/dev/cryptlinux_group/root rw rootflags=noatime scsi_mod.use_blk_mq=y quiet vga=current nosplash initrd=/intel-ucode.img initrd=\initramfs-linux-zen.img
Init system    = systemd
Boot mode      = UEFI
Suspend mode   = s2idle [deep]

+++ TLP Status
State          = enabled
RDW state      = not installed
Last run       = 08:42:51 AM,   1894 sec(s) ago
Mode           = AC
Power source   = AC

+++ Disks
Devices = nvme0n1 sda

/dev/nvme0n1:
  Type       = NVMe
  Disk ID    = nvme-CT4000P3PSSD8_2240E671DBBF
  Model      = CT4000P3PSSD8
  Firmware   = P9CR40A
  Temp       = 41 °C
  Scheduler  = none mq-deadline [kyber] bfq (multi queue)

  Runtime PM:
    /sys/block/nvme0n1/device/power/control = on, autosuspend_delay_ms = (not available)

  SMART info:
    Critical Warning:                   0x00
    Temperature:                        41 Celsius
    Available Spare:                    100%
    Available Spare Threshold:          5%
    Percentage Used:                    0%
    Data Units Written:                 387,988 [198 GB]
    Power Cycles:                       121
    Power On Hours:                     2,464
    Unsafe Shutdowns:                   5
    Media and Data Integrity Errors:    0

/dev/sda:
  Type       = SATA
  Disk ID    = ata-Samsung_SSD_860_EVO_M.2_1TB_S415NB0M305103P
  Model      = Samsung SSD 860 EVO M.2 1TB
  Firmware   = RVT22B6Q
  APM Level  = none/disabled
  Status     = active/idle
  TRIM       = supported
  Host       = host1
  Scheduler  = none mq-deadline kyber [bfq] (multi queue)

  Runtime PM:
    /sys/block/sda/device/power/control = on, autosuspend_delay_ms = 15000

  SMART info:
      5 Reallocated_Sector_Ct     =        0
      9 Power_On_Hours            =    16860 [h]
     12 Power_Cycle_Count         =     2645
    177 Wear_Leveling_Count       =       99 [%]
    179 Used_Rsvd_Blk_Cnt_Tot     =        0
    190 Airflow_Temperature_Cel   =       38 [°C]
    241 Total_LBAs_Written        =    2.629 [TB]
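
In the output above, lines such as "Suspend mode = s2idle [deep]" and "Scheduler = none mq-deadline [kyber] bfq" use brackets to mark the currently active choice. A small helper to extract it might look like this; the function name active_choice is made up for illustration.

```shell
# Print the bracketed (currently active) entry of a sysfs-style choice list.
# Prints nothing if the input contains no bracketed entry.
active_choice() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Example: active_choice "$(cat /sys/power/mem_sleep)"
```

For the suspend mode shown above, this yields deep, which is useful when comparing affected and unaffected machines.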

@vhuuyt

vhuuyt commented Apr 7, 2023

> @IOsetting @vhuuyt @bymoz089 The workaround is now integrated directly into TLP. Could those affected please test with the main branch?
>
> Before that, please neutralize the settings for the workaround by removing or commenting them:
>
> #X_TLP_SUSPEND_ACMODE=1
> #AHCI_RUNTIME_PM_ON_BAT=on

After commenting out AHCI_RUNTIME_PM_ON_BAT=on, it works as if nothing had ever been wrong. 🎉 Congratulations!

@linrunner
Owner

Hi @ALL : TLP 1.6 Beta 1 is out and contains a fix for this issue -> #700
