Clock skew issues megathread #10006

Closed
craigloewen-msft opened this issue Apr 21, 2023 · 171 comments
Labels
wsl2 Issue/feature applies to WSL 2

Comments

@craigloewen-msft
Member

craigloewen-msft commented Apr 21, 2023

Megathread

Current status: waiting on backport for kernel patch to mitigate issue.

We're creating this megathread to track the clock skew issues in WSL in one place, and will keep this parent comment current with any updates.

Background

Sometimes the WSL clock can become skewed after resume from sleep (specifically S0). See some example related issues for more info:
#8318
#8204
#7255

Potential workarounds

Use systemd to force clock sync

See this comment: #8204 (comment)

Set the hardware clock via a command

Run sudo hwclock -s. More info here.

Run ntpdate on distro startup

Edit /etc/wsl.conf to have this content:

[boot]
command="ntpdate ntp.ubuntu.com"

This will force a clock reset each time the distro starts.

Build a private kernel with this patch

@marceloid

This workaround using systemd was the best and only solution in my case: #8204 (comment)

@duaneking

The number of open issues that relate to this temporal distortion worries me, because it tells me there isn't a critical security focus on time by the teams involved in all these regressions and supposed fixes; so who is running the show for temporal security at MS? Anybody? Who is the Time Czar in the Security Org?

Or perhaps a better question to ask: Why aren't these consistent temporal anomalies being considered more of a security issue?

Everybody in security knows that time is a critical security component; so if time is not correct on the host system, then it is simply out of security compliance by default, right?

@haroldiedema

haroldiedema commented Apr 26, 2023

Here's a simple fix that invokes hwclock -s every time your machine wakes up from sleep. Note: This assumes you have configured your WSL environment so that it is always running (for daemons, web servers, etc.).

Assuming you've changed the default user to something other than root, you'll first need to allow passwordless sudo when invoking sudo hwclock -s by updating the sudoers file:

$ sudo visudo

Add the following line:

%sudo   ALL=(ALL) NOPASSWD: /usr/sbin/hwclock

Next, create a batch file somewhere on your machine, e.g.: C:\sync-clock.bat with the following contents:

@echo off
ubuntu.exe run "sudo hwclock -s"
exit

(Change "ubuntu.exe" if you need to)

Lastly, create a Task in the Task Scheduler that runs every time your computer wakes up from sleep.

  1. Open the Task Scheduler and create a new "Basic Task".
  2. Set the trigger to "When a specific event is logged" and click "Next".
  3. Under "Log", select System.
  4. Under "Source", select Kernel-Power.
  5. Under "Event ID", type "507".
  6. Under "Action", select "Start a program" and click "Next".
  7. Specify the batch file we just created: C:\sync-clock.bat.

If the task is not executed on your machine, it may be because your version of Windows emits a different Event ID. Open up the Event Viewer and check under System for any log entries that have the source "Kernel-Power" that match a timestamp when your machine has woken up from sleep mode. Verify the correct event by reading its description. It should state something along the lines of "The system exited sleep mode". The correct "Event ID" should be listed in the same window.

Hope this helps.

@esumii

esumii commented Apr 26, 2023

  1. Open the Task Scheduler and create a new "Basic Task".
  2. Set the trigger to "When a specific event is logged" and click "Next".
  3. Under "Log", select System.
  4. Under "Source", select Kernel-Power.
  5. Under "Event ID", type "507".

I'd add 107 as well.

  6. Under "action", select "Start a program" and click "next"
  7. Specify the batch file we just created: C:\sync-clock.bat.

If the task is not executed on your machine,

Another reason can be (lack of) power:

#5324 (comment)

(To run on battery power, "Start the task only if the computer is on AC power" must be unchecked in the Windows Task Scheduler.)

Also see the link above for a way to do this without sudo.

@gaia
Contributor

gaia commented Apr 26, 2023

Lastly, create a Task in the Task Scheduler that runs every time your computer wakes up from sleep.

Task Scheduler doesn't work to run tasks in WSL2, see #9231

@dboreham

I don't think we need any more workarounds do we? Someone needs to go into the hypervisor code and fix the bug, no? Hypervisor's job is to present correct hardware clock functionality to its guest kernels, presumably. Or have we moved on from that being the case?

@duaneking

duaneking commented Apr 26, 2023

I don't think we need any more workarounds do we?

No. An actual fix would be best for security and compliance as systems having the wrong time can create GDPR violations in the worst case, and MSFT is a globally GDPR compliant company, right? So the team has a clear mandate as part of being One Microsoft that they need to fix this, right?

Someone needs to go into the hypervisor code and fix the bug, no?

Yes, if that is where it is.

Hypervisor's job is to present correct hardware clock functionality to its guest kernels, presumably. Or have we moved on from that being the case?

I don't believe anybody's done that kind of due diligence. The correct people at that level don't seem to be aware of this issue, or I suspect it would have been resolved if they truly understood how big of an issue this was, so the fact nobody in technical leadership has freaked out and mandated a fix asap tells me that has not happened yet.

@Clockwork-Muse

if they truly understood how big of an issue this was, so the fact nobody in technical leadership has freaked out and mandated a fix asap tells me that has not happened yet.

I don't think this is as large of an issue as you're trying to make it out to be.

Keep in mind that WSL is primarily intended to be a developer tool, and not something you'd run a production-level server on (also - if your server is allowed to sleep you probably have much larger problems). It would be simpler, easier, and cheaper to just run whatever distro "natively" (either in a dedicated hypervisor like Hyper-V, or on bare metal).

Yes, it's annoying. Yes, there are security issues (although exploiting them still requires stealing a private key, which shouldn't be trivial). If it's the end of the world for you, though, you likely have larger problems.

@duaneking

I don't think this is as large of an issue as you're trying to make it out to be.

Then respectfully, you do not understand the issue or how this impacts the world at scale.

Keep in mind that WSL is primarily intended to be a developer tool, and not something you'd run a production-level server on

... and that's exactly why this is such a big issue. If you're making the bad assumption that because this is a developer's system it's not going to be attacked, then sadly I have some bad news for you. We developers get attacked every day, and even now people are trying to figure out ways to get on our machines. You ever hear of supply chain attacks? Developers are the supply chain.

(also - if your server is allowed to sleep you probably have much larger problems).

I agree but this is workstations, and that's even more heavily audited in some environments. If companies invested half as much in their production system security as they put into their corporate security, a lot of breaches would never happen.

It would be simpler, easier, and cheaper to just run whatever distro "natively" (either in a dedicated hypervisor like Hyper-V, or on bare metal).

Not for everybody. It would not be easier for me. I asked for many GB of RAM for a reason: Docker/K8s.

Yes, it's annoying. Yes, there are security issues (although exploiting them still requires stealing a private key, which shouldn't be trivial). If it's the end of the world for you, though, you likely have larger problems.

Because looking in "%SystemDrive%\Documents and Settings\All Users\Application Data\Microsoft\Crypto\RSA\MachineKeys" is hard, right? ;) I get what you're saying, but it's also clear to me that you don't have the same goals I do.

@gaia
Contributor

gaia commented Apr 27, 2023

Even apt install/update fails when the clock is off. You can't develop if you can't install/update the tools you need.

This is an important issue.
This should be basic to fix.
The systemd workaround is enough for me.

@esumii

esumii commented Apr 27, 2023

Task Scheduler doesn't work to run tasks in WSL2, see #9231

When the user is not logged on to the desktop, I suppose?

@asampal

asampal commented Apr 27, 2023

I think another point to note is that WSL2 didn't always have this problem. So if it could do the right thing after waking up at one point, how come it's such a problem fixing the issue now?

@dboreham

I don't think this is as large of an issue as you're trying to make it out to be.

Respectfully disagree. It's a very serious hypervisor bug. Presumably needs to be filed against the right product (not WSL, most likely) to get on the radar of someone who can fix it.

@duaneking

I think another point to note is that WSL2 didn't always have this problem. So if it could do the right thing after waking up at one point, how come it's such a problem fixing the issue now?

Yes, exactly. This is a regression in a formerly working product.

@haroldiedema

Task Scheduler doesn't work to run tasks in WSL2, see #9231

When the user is not logged on to the desktop, I suppose?

No, he's right. That's why creating batch files is important. Task Scheduler can run a batch file, which then invokes the WSL commands. That works just fine.

@ghost

ghost commented May 3, 2023

The number of open issues that relate to this temporal distortion worries me, because it tells me there isn't a critical security focus on time by the teams involved in all these regressions and supposed fixes; so who is running the show for temporal security at MS? Anybody? Who is the Time Czar in the Security Org?

Or perhaps a better question to ask: Why aren't these consistent temporal anomalies being considered more of a security issue?

Everybody in security knows that time is a critical security component; so if time is not correct on the host system, then it is simply out of security compliance by default, right?

@troyhunt can you get some traction on this?

@duaneking

I guarantee you that audit logs that have the wrong timestamp due to the host system times being wrong can create severe problems; mostly because these audit records containing false data are considered immutable proof of legal compliance in these systems.

The end result is that data is being logged that is not correct, and then every one of these systems that doesn't know the input is bad goes on to assert that the data as presented is correct.

I would like to see this issue fixed.

@chrisclapham

My team and I are also facing this issue. After a system restart or sleep, the clock is usually behind by ~40 minutes. For now, sudo hwclock -s seems to be working for some of us.

Eagerly looking forward to an official fix.

@0xabu

0xabu commented May 10, 2023

Running hwclock -s gets me much closer to reality, but it's still off by 5 minutes:

$ sudo hwclock -s; date; cmd.exe /c "time /t"
Wed May 10 09:46:17 CEST 2023
09:51

Update: the VM's "hardware clock" appears to be reporting the time off by 5 minutes. This is not a drift calculation in the guest:

cmd.exe /c "echo %time%" ; sudo hwclock -r --verbose
 9:55:21.25
hwclock from util-linux 2.37.2
System Time: 1683705037.069536
Trying to open: /dev/rtc0
Using the rtc interface to the clock.
Assuming hardware clock is kept in UTC time.
Waiting for clock tick...
...got clock tick
Time read from Hardware Clock: 2023/05/10 07:50:18
Hw clock time : 2023/05/10 07:50:18 = 1683705018 seconds since 1969
Time since last adjustment is 1683705018 seconds
Calculated Hardware Clock drift is 0.000000 seconds
2023-05-10 09:50:16.900289+02:00
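
To put a number on a gap like the one in the output above, the two clocks can be subtracted as Unix timestamps. This is only a minimal sketch and not something from the thread: clock_skew_seconds is a hypothetical helper name, and the commented usage assumes powershell.exe is reachable from the WSL PATH.

```sh
# clock_skew_seconds LOCAL_EPOCH REFERENCE_EPOCH
# Prints LOCAL minus REFERENCE in whole seconds; a negative result means
# the local (WSL) clock is behind the reference (Windows) clock.
# Hypothetical helper for illustration, not part of any WSL tooling.
clock_skew_seconds() {
    echo $(( $1 - $2 ))
}

# Possible usage inside WSL (assumes powershell.exe is on PATH; tr strips
# the carriage return that Windows tools append to their output):
#   win_epoch=$(powershell.exe -NoProfile -Command \
#       '[DateTimeOffset]::UtcNow.ToUnixTimeSeconds()' | tr -d '\r')
#   clock_skew_seconds "$(date +%s)" "$win_epoch"
```

Taking the hardware clock reading from the output above (1683705018, i.e. 07:50:18 UTC) and the Windows time of roughly 09:55:21 CEST (1683705321), clock_skew_seconds 1683705018 1683705321 prints -303, which is the five-minute gap.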

@lewissbaker

The sudo hwclock -s command sometimes results in a clock that is still hours off of the realtime for me.

I've found that the following snippet (heavily adapted and reduced from the wslact utility in the WSL Utilities project) works more reliably:

fix-time.sh

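#!/bin/bash

# Abort on the first failing command.
set -e

# Run a PowerShell 7 command on the Windows host, forcing UTF-8 console I/O.
function pwsh {
    local PowerShellExe="/mnt/c/Program Files/PowerShell/7/pwsh.exe"
    "$PowerShellExe" -NoProfile -NonInteractive -ExecutionPolicy Bypass -Command "[Console]::OutputEncoding = [System.Text.Encoding]::UTF8; [Console]::InputEncoding = [System.Text.Encoding]::UTF8; $*"
}

# Print the current WSL time as "YYYY-MM-DD HH:MM:SS".
function full_date {
    date +"%F %T"
}

echo "Prev date: $(full_date)"

# Ask Windows for the current UTC time and set the WSL clock to it.
sudo date -u -s "$(pwsh Get-Date -AsUTC -UFormat \"%FT%TZ\")" > /dev/null

echo "New date : $(full_date)"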

It just gets the current UTC time from Windows by running a PowerShell command, and then runs date to set the local WSL time to that time. It is still only to the nearest second, but that's good enough for my purposes.

The wslact time-sync command didn't work for me as it only outputs timezone information from the host to the nearest hour and so doesn't give the right result if you're on a timezone offset that isn't a whole number of hours.

@benc-uk

benc-uk commented May 10, 2023

Likewise, sudo hwclock -s still results in a clock that is hours out of sync. NTP is the only solution I've found to work, e.g. sudo ntpdate time.windows.com

@dboreham

imho posts of the form "I found I could run hwclock|ntp and it made my clock kind of right" should be prohibited here. The bug is that the hypervisor screws up the guest OS's time. There is no workaround for that. The hypervisor just needs to be fixed so that it presents a virtualized RTC to the guest that works.

(Sorry, no coffee yet this morning).

@ipalopezhentsev

One more example of how it screws up work: suppose you've downloaded some files from AWS S3, then walked away, and your computer went to sleep. Now you return and try to download more files from the same console. AWS starts rejecting your requests because of the skewed time.
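
A script could guard against this by checking the measured skew against a tolerance before it talks to a time-sensitive service. The sketch below is illustrative only: skew_within_tolerance is a hypothetical helper, and the 900-second limit in the usage comment is an assumption based on the 15-minute window commonly cited for S3, so check the service's own documentation for the real figure.

```sh
# skew_within_tolerance SKEW_SECONDS MAX_SECONDS
# Succeeds (exit status 0) when the absolute skew is at most MAX_SECONDS.
# Hypothetical helper for illustration.
skew_within_tolerance() {
    skew=$1
    # Use the absolute value so it does not matter which clock is ahead.
    if [ "$skew" -lt 0 ]; then
        skew=$(( -skew ))
    fi
    [ "$skew" -le "$2" ]
}

# Possible usage, with a skew in seconds obtained elsewhere
# (900 is an assumed limit, not a value taken from this thread):
#   if ! skew_within_tolerance "$skew_seconds" 900; then
#       echo "clock skew too large, resync before retrying" >&2
#   fi
```

A clock that is five minutes behind (-303 seconds) passes a 900-second check, while one that is 40 minutes behind (-2400 seconds) fails it.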

@ghost

ghost commented Dec 4, 2023

A patch has gone into the maintainer branch. It will be backported to our kernel.

@habaohaba I'm assuming that you're saying that the host clock doesn't agree with what you see from hwclock? That would be a different issue, please open a new bug for that.

@benhillis
Member

Thanks for your patience. This issue should be resolved with https://github.com/microsoft/WSL/releases/tag/2.1.1.

@dboreham

🕺

But...I know we should know this: what's the process for getting this code onto my laptop? Just wait for the next WSL update?

@davidfiala

🕺

But...I know we should know this: what's the process for getting this code onto my laptop? Just wait for the next WSL update?

https://github.com/microsoft/WSL/releases

Find version >= 2.1.1. Pull the installer under 'Assets' for your platform. I believe there are also CLI options for updating WSL to a pre-release from the command line, using wsl --update --pre-release, which are listed by wsl --help.

Before you do this though: Be sure you fully understand the consequences and risks of using the pre-release version and the update process as well as any new defaults or behaviors. It's probably out of scope to go any deeper on the topic of updates on this particular bug/thread. Open separate issues for that.


Big kudos to the team for getting this patched!

@dboreham

Find version >= 2.1.1. Pull the installer under 'Assets' for your platform.

Ahhh, my mistake. I assumed the assets from a release would be a kernel binary, not something usable by the end user. TIL!

@lrosenman

I just refreshed the Windows store and received 2.1.1 as an update. No magic is required.

@mbomb007

I just refreshed the Windows store and received 2.1.1 as an update. No magic is required.

You must already be on the pre-release. I installed updates with wsl --update, and it put me at 2.0.14.

@lrosenman

I am on the Insider Dev build, but I installed it from the store.

@king-11

king-11 commented Feb 11, 2024

I ran

wsl --update --pre-release

This broke my Fedora WSL installation. After running

wsl --update

things started working again.

@abtswath

This change keeps the clock synced with Windows all the time, and I can't manually change the time anymore unless I set the kernel parameter hv_utils.timesync_implicit=0 in C:\Users\<UserName>\.wslconfig.
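
For anyone wanting to try the same thing, a minimal sketch of what that .wslconfig entry might look like is below. It assumes the parameter is passed through the kernelCommandLine setting in the [wsl2] section; the comment above does not say which setting was used, so treat this as an illustration.

```ini
; C:\Users\<UserName>\.wslconfig
; Assumption: the kernel parameter is passed via kernelCommandLine.
[wsl2]
kernelCommandLine = hv_utils.timesync_implicit=0
```

Changes to .wslconfig only take effect after the WSL VM restarts, for example after wsl --shutdown.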

@mbomb007

If you consider it an issue, you should probably open a new issue, as this one is closed.

@asaf400

asaf400 commented May 8, 2024

I ran

wsl --update --pre-release

This broke my Fedora WSL installation. After running

wsl --update

things started working again.

I guess it's either fixed, or I'm using a more recent Fedora, but just wsl --update --pre-release still works for me:

[root@LAPTOP-ASAF-T14 ~]# cat /etc/os-release
NAME="Fedora Remix for WSL"
VERSION="39"
ID=fedoraremixforwsl
ID_LIKE=fedora
VERSION_ID=39
PLATFORM_ID="platform:f39"
PRETTY_NAME="Fedora Remix for WSL"
ANSI_COLOR="0;34"
CPE_NAME="cpe:/o:fedoraproject:fedora:39"
HOME_URL="https://github.com/WhitewaterFoundry/Fedora-Remix-for-WSL"
...
FEDORA_REMIX_VERSION=39.0.1


[root@LAPTOP-ASAF-T14 ~]# wsl.exe --version
WSL version: 2.2.4.0
Kernel version: 5.15.153.1-2
WSLg version: 1.0.61
MSRDC version: 1.2.5326
Direct3D version: 1.611.1-81528511
DXCore version: 10.0.26091.1-240325-1447.ge-release
Windows version: 10.0.22621.3447

Hopefully I won't need to manually run ntpdate time-a-g.nist.gov anymore 🤞

@rhoog-ine

Works like a charm. No more clock updating after sleep/inactivity. No more git commits hours/days off.
When will this trickle out of pre-release?

@compuguy

Works like a charm. No more clock updating after sleep/inactivity. No more git commits hours/days off.
When will this trickle out of pre-release?

If you have a WSL release that's 2.1.1 or newer, you should have the fix?
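
One way to check this from a script is to compare the installed version string against 2.1.1. The sketch below is hedged: version_at_least is a hypothetical helper, it relies on sort -V from GNU coreutils, and the commented usage assumes wsl.exe prints an English "WSL version:" line encoded as UTF-16, which may not hold on every system.

```sh
# version_at_least CURRENT MINIMUM
# Succeeds (exit status 0) when CURRENT is the same as or newer than MINIMUM.
# Relies on sort -V (version sort) from GNU coreutils.
# Hypothetical helper for illustration.
version_at_least() {
    # The lower of the two versions sorts first; CURRENT is new enough
    # exactly when that lowest entry is MINIMUM.
    lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$2" ]
}

# Possible usage inside WSL (assumptions: English output, UTF-16 text,
# so NUL bytes and carriage returns are stripped before parsing):
#   current=$(wsl.exe --version | tr -d '\000\r' | sed -n 's/^WSL version: //p')
#   if version_at_least "$current" 2.1.1; then echo "has the fix"; fi
```

With the versions mentioned in this thread, 2.2.4.0 passes the check against 2.1.1 and 2.0.14 does not.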

@lilltiger

lilltiger commented Oct 30, 2024

My issue was that the Windows VM synced its time correctly with an NTP server, while the underlying host did not sync its time and had the wrong time configured. WSL synced with the host and not with the Windows system running WSL.

After fixing the host time sync, it started working as it should.


It seems like this issue should be re-opened as the issue is still there:

WSL version: 2.3.24.0
Kernel version: 5.15.153.1-2

PRETTY_NAME="Ubuntu 22.04.5 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.5 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy

WSL: Wed Oct 30 14:01:55 CET 2024
Power Shell: den 30 oktober 2024 13:15:34

This is with the two commands run less than a second apart.

@fanyh

fanyh commented Dec 26, 2024

In WSL, there is a scheduling deviation with CLOCK_MONOTONIC.

Environment

WSL Ubuntu 24.04

wsl.conf

[boot]
systemd=true
command="ntpdate ntp.ubuntu.com"

Issue Description

In WSL, there is a scheduling deviation with CLOCK_MONOTONIC.

After restarting WSL, the issue improves slightly, but after a longer duration (approximately 100 seconds), the deviation reappears. The only way to resolve this issue temporarily is to restart Windows.

Steps to Reproduce

  1. Set up the wsl.conf as shown above.
  2. Observe the behavior of CLOCK_MONOTONIC and CLOCK_REALTIME.
  3. Restart WSL to see a temporary improvement.
  4. Wait for about 100 seconds to notice the reappearance of the deviation.
  5. Restart Windows to temporarily resolve the issue.

code

local skynet = require "skynet"
local start = os.time()
while true do
    skynet.sleep(100)
    skynet.error(skynet.now() .. "vs" .. os.time() - start)
end

log

[:01000009] 584vs5
[:01000009] 684vs6
[:01000009] 784vs7
[:01000009] 884vs8
[:01000009] 984vs9
[:01000009] 1084vs10
[:01000009] 1184vs14

c debug code

void skynet_timer_init(void) {
    ...
    struct timespec ti;
    clock_gettime(CLOCK_MONOTONIC, &ti);
    TI->t1 = (uint32_t)ti.tv_sec;
}

uint64_t skynet_now(void) {
    struct timespec ti;
    clock_gettime(CLOCK_MONOTONIC, &ti);

    struct timespec ti1;
    clock_gettime(CLOCK_REALTIME, &ti1);
    skynet_error(NULL, "pass:%d real:%d", ((uint32_t)ti.tv_sec - TI->t1), ((uint32_t)ti1.tv_sec - TI->starttime));
    return TI->current;
}

c log

[:01000009] 1084vs10
[:00000000] pass:10 real:10
...
[:00000000] pass:10 real:11
[:00000000] pass:10 real:11
[:00000000] pass:10 real:11
[:00000000] pass:10 real:11
[:00000000] pass:10 real:11
[:00000000] pass:10 real:11
[:00000000] pass:10 real:11
[:00000000] pass:10 real:11
[:00000000] pass:10 real:11
[:00000000] pass:11 real:11
...
[:00000000] pass:11 real:13
...
[:00000000] pass:11 real:14
...
[:01000009] 1184vs14
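
The same comparison can be approximated from a shell without skynet. This is a rough sketch with hypothetical helper names, and it carries one loud assumption: /proc/uptime follows CLOCK_BOOTTIME, which is used here as a stand-in for CLOCK_MONOTONIC, so it is not an exact reproduction of the C code above.

```sh
# elapsed_gap_ms UP_START WALL_START UP_END WALL_END   (all in milliseconds)
# Prints how much more wall-clock time than uptime elapsed between two
# samples; a steady 0 means both clocks advance at the same rate.
# Hypothetical helper for illustration.
elapsed_gap_ms() {
    echo $(( ($4 - $2) - ($3 - $1) ))
}

# Sampling helpers (Linux-specific; the %N format needs GNU date).
# Assumption: /proc/uptime (CLOCK_BOOTTIME) stands in for CLOCK_MONOTONIC.
uptime_ms() { awk '{ printf "%d\n", $1 * 1000 }' /proc/uptime; }
wall_ms() { date +%s%3N; }

# Possible usage: take one sample, wait, take another, then compare.
#   u0=$(uptime_ms); w0=$(wall_ms)
#   sleep 100
#   elapsed_gap_ms "$u0" "$w0" "$(uptime_ms)" "$(wall_ms)"
```

Fed with the last values in the log above (pass:11 and real:14, i.e. 11000 ms and 14000 ms from a common start), elapsed_gap_ms 0 0 11000 14000 prints 3000.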
