Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

High CPU usage - futex_wait, gettimeofday syscalls #8735

Closed
megastef opened this issue Sep 23, 2016 · 5 comments
Closed

High CPU usage - futex_wait, gettimeofday syscalls #8735

megastef opened this issue Sep 23, 2016 · 5 comments
Labels
question Issues that look for answers.

Comments

@megastef
Copy link

megastef commented Sep 23, 2016

Got situation with very high CPU usage on node 4.5 and 6.x.
The actual node.js process get stuck (no logs etc ...).
running ctrace prints this:

sudo ctrace -p 15084 -v
[15217] Trace on:  attach to process 15084
[pid:15094] [08:52:00.317489] futex, 202 -- fast user-space locking linux (0x1df27c4, FUTEX_WAIT_PRIVATE, 660, NULL <unfinished ...>
[pid:15090] [08:52:00.317855] futex, 202 -- fast user-space locking linux (0x2ab7f68, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid:15095] [08:52:00.328036] futex, 202 -- fast user-space locking linux (0x1df27c4, FUTEX_WAIT_PRIVATE, 660, NULL <unfinished ...>
[pid:15093] [08:52:00.328051] futex, 202 -- fast user-space locking linux (0x1df27c4, FUTEX_WAIT_PRIVATE, 660, NULL <unfinished ...>
[pid:15092] [08:52:00.328058] futex, 202 -- fast user-space locking linux (0x1df27c4, FUTEX_WAIT_PRIVATE, 660, NULL <unfinished ...>
[pid:15089] [08:52:00.328065] futex, 202 -- fast user-space locking linux (0x2ab7f68, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid:15088] [08:52:00.328072] futex, 202 -- fast user-space locking linux (0x2ab7f68, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid:15087] [08:52:00.328078] futex, 202 -- fast user-space locking linux (0x2ab7f68, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>````

Executing interrupted by signal ...
-----------------------------------------------------------------------------------------------------------------------------------------------------
SYSCALL         TIME %     SECOND     CALLS      DESCRIPTION             ERRORNO (COUNT)
*               100        0.000000   8          -                       (0)            
futex           NaN        0.000000   8          fast user-space locking -              
-----------------------------------------------------------------------------------------------------------------------------------------------------

Also seen often before:

futex           0.0        0.000438   827        fast user-space locking                          EAGAIN (Resource temporarily unavailable) (3)

After a few minutes the process contiunes to run. Then I get this in ctrace:

Executing interrupted by signal ...
-----------------------------------------------------------------------------------------------------------------------------------------------------
SYSCALL         TIME %     SECOND     CALLS      DESCRIPTION                                    ERRORNO (COUNT)                              
*               100        0.688409   15652      -                                              (6)                                          
gettimeofday    95.4       0.656960   14499      get / set time                                 -                                            
clock_gettime   3.9        0.026934   598        clock and time functions                       -                                            
mmap            0.2        0.001334   47         map or unmap files or devices into memory      -                                            
write           0.1        0.000379   9          write to a file descriptor                     -                                            
socket          0.1        0.000574   8          create an endpoint for communication           -                                            
pread           0.1        0.000500   1          read or write data into multiple               -                                            
getsockname     0.0        0.000028   6          get socket name                                -                                            
getdents        0.0        0.000094   2          get directory entries                          -                                            
close           0.0        0.000147   12         close a file descriptor                        -                                            
open            0.0        0.000194   5          open and possibly create a file                -                                            
read            0.0        0.000083   10         read from a file descriptor                    -                                            
epoll_wait      0.0        0.000033   3          wait for an I/O event on an epoll file         -                                            
stat            0.0        0.000040   5          get file status                                -                                            
bind            0.0        0.000013   2          bind a name to a socket                        -                                            
futex           0.0        0.000252   361        fast user-space locking                        EAGAIN (Resource temporarily unavailable) (4)
sendto          0.0        0.000063   4          send a message on a socket                     -                                            
recvmsg         0.0        0.000028   4          receive a message from a socket                -                                            
fstat           0.0        0.000015   3          get file status                                -                                            
getcwd          0.0        0.000080   1          get current working directory                  -                                            
munmap          0.0        0.000076   8          map or unmap files or devices into memory      -                                            
connect         0.0        0.000099   10         initiate a connection on a socket              EINPROGRESS (Operation now in progress) (2)  
poll            0.0        0.000012   6          wait for some event on a file descriptor       -                                            
ioctl           0.0        0.000034   6          control device                                 -                                            
recvfrom        0.0        0.000012   2          receive a message from a socket                -                                            
openat          0.0        0.000012   1          open and possibly create a file                -                                            
epoll_ctl       0.0        0.000047   2          control interface for an epoll file descriptor -                                            
madvise         0.0        0.000303   31         give advice about use of memory                -                                            
mprotect        0.0        0.000018   4          set protection on a region of memory           -                                            
getsockopt      0.0        0.000045   2          get and set options on sockets                 -                                            
-----------------------------------------------------------------------------------------------------------------------------------------------------

Any ideas to how I could find the "gettime" calls, or debug this further?

@bnoordhuis
Copy link
Member

Start node with --perf_basic_prof and attach perf with perf top -g -p <pid> once node is running. The gettimeofday system calls are almost certainly new Date() or Date.now() calls in JS code.

@claudiorodriguez claudiorodriguez added the question Issues that look for answers. label Sep 23, 2016
@megastef
Copy link
Author

Thx! Cool. Did not know "perf top". Works, so i can watch until it happens again :)
bildschirmfoto 2016-09-23 um 14 23 25

@megastef
Copy link
Author

Just FYI. It turned out it was one a regular expressions consuming all CPU.
Thank you very much @bnoordhuis !

@Xgamefactory
Copy link

thanks @bnoordhuis for --perf_basic_prof flag.
just perf top is enough to see script js file that causes high cpu usage.

@rambo-panda
Copy link

rambo-panda commented Nov 26, 2024

I am currently encountering a similar issue. my program is started in fork mode using pm2, and after running for a period of time, there is a small chance that it will appear to hang, but the CPU usage is 0. When I checked with gdb and printed the backtrace, the information is only as follows.

image

[<ffffffff92b0cfa6>] futex_wait_queue_me+0xc6/0x130
[<ffffffff92b0dc8b>] futex_wait+0x17b/0x280
[<ffffffff92b0f9d6>] do_futex+0x106/0x5a0
[<ffffffff92b0fef0>] SyS_futex+0x80/0x190
[<ffffffff93175d9b>] system_call_fastpath+0x22/0x27
[<ffffffffffffffff>] 0xffffffffffffffff
  System:
    OS: Linux 3.10 Ubuntu 22.04.1 LTS 22.04.1 LTS (Jammy Jellyfish)
    CPU: (40) x64 Intel Xeon Processor (Cascadelake)
    Memory: 59.85 GB / 78.66 GB
    Container: Yes
    Shell: 5.1.16 - /bin/bash
  Binaries:
    Node: 20.16.0 - /usr/bin/node
    npm: 10.8.1 - /usr/bin/npm

Could you give me some ideas on how to continue tracking the root cause?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Issues that look for answers.
Projects
None yet
Development

No branches or pull requests

5 participants