Node down with a error - Insufficient resources: could not allocate code memory: Cannot allocate memory (os error 12) #61

Closed
SNSMLN opened this issue Mar 11, 2024 · 7 comments

@SNSMLN

SNSMLN commented Mar 11, 2024

My node went down for no apparent reason. The neard log shows only this:

Mar 11 16:57:50 node01new neard[882540]: 2024-03-11T15:57:50.306463Z  INFO stats: #114493881 DnowrgBqgcK4Wo5CGufbhQbX4BpNeT4hGiw1q6a1FkbH Validator | 20 validators 43 peers ⬇ 19.3 MB/s ⬆ 48.3 MB/s 1.00 bps 36.9 Tgas/s CPU: 300%, Mem: 3.79 GB
Mar 11 16:57:54 node01new neard[882540]: 2024-03-11T15:57:54.167114Z  INFO near_network::peer_manager::connection: peer ed25519:2SVMYgTYcEgnxYGyr9XxHUZFHf7XfS43DaHNAHnFwgpw disconnected, while sending SyncAccountsData
Mar 11 16:57:54 node01new neard[882540]: 2024-03-11T15:57:54.167128Z  INFO near_network::peer_manager::connection: peer ed25519:2SVMYgTYcEgnxYGyr9XxHUZFHf7XfS43DaHNAHnFwgpw disconnected, while sending SyncAccountsData
Mar 11 16:57:55 node01new neard[882540]: thread '<unnamed>' panicked at runtime/runtime/src/actions.rs:172:13:
Mar 11 16:57:55 node01new neard[882540]: Contract runtime failed to load a contrct: Insufficient resources: could not allocate code memory: Cannot allocate memory (os error 12)
Mar 11 16:57:55 node01new neard[882540]: stack backtrace:
Mar 11 16:57:55 node01new neard[882540]: thread '<unnamed>' panicked at runtime/runtime/src/actions.rs:172:13:
Mar 11 16:57:55 node01new neard[882540]: Contract runtime failed to load a contrct: Insufficient resources: could not allocate code memory: Cannot allocate memory (os error 12)
Mar 11 16:57:55 node01new neard[882540]:    0: rust_begin_unwind                                                                                        
Mar 11 16:57:55 node01new neard[882540]:              at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panicking.rs:645:5             
Mar 11 16:57:55 node01new neard[882540]:    1: core::panicking::panic_fmt                                                                               
Mar 11 16:57:55 node01new neard[882540]:              at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/panicking.rs:72:14            
Mar 11 16:57:55 node01new neard[882540]:    2: node_runtime::actions::execute_function_call                                                             
Mar 11 16:57:55 node01new neard[882540]:    3: node_runtime::Runtime::apply_action                                                                      
Mar 11 16:57:55 node01new neard[882540]:    4: node_runtime::Runtime::apply_action_receipt                                                              
Mar 11 16:57:55 node01new neard[882540]:    5: node_runtime::Runtime::apply::{{closure}}                                                                
Mar 11 16:57:55 node01new neard[882540]:    6: node_runtime::Runtime::apply                                                                             
Mar 11 16:57:55 node01new neard[882540]:    7: <nearcore::runtime::NightshadeRuntime as near_chain::types::RuntimeAdapter>::apply_chunk                 
Mar 11 16:57:55 node01new neard[882540]:    8: near_chain::update_shard::apply_new_chunk                                                                
Mar 11 16:57:55 node01new neard[882540]:    9: <rayon_core::job::HeapJob<BODY> as rayon_core::job::Job>::execute                                        
Mar 11 16:57:55 node01new neard[882540]:   10: rayon_core::registry::WorkerThread::wait_until_cold                                                      
Mar 11 16:57:55 node01new neard[882540]: note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.                        
Mar 11 16:57:55 node01new neard[882540]: stack backtrace:                                                                                               
Mar 11 16:57:55 node01new neard[882540]:    0: rust_begin_unwind                                                                                        
Mar 11 16:57:56 node01new neard[882540]:              at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panicking.rs:645:5

In syslog:

Mar 11 16:57:56 node01new systemd[1]: neard.service: Main process exited, code=killed, status=6/ABRT                                                    
Mar 11 16:57:56 node01new systemd[1]: neard.service: Failed with result 'signal'.                                                                       
Mar 11 16:58:26 node01new systemd[1]: neard.service: Scheduled restart job, restart counter is at 1.                                                    
Mar 11 16:58:26 node01new systemd[1]: Stopped Near node.                                                                                                
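
For readers less familiar with systemd exit reporting: `code=killed, status=6/ABRT` means the main process was terminated by signal 6, which is `SIGABRT`. That is consistent with the process aborting right after the panic. A minimal sketch of decoding such a line follows. The helper name `decode_exit` is hypothetical, and the regular expression assumes the `code=..., status=N/NAME` layout seen in the line above.

```python
import re
import signal

# Assumed layout, taken from the syslog line above:
#   "... Main process exited, code=killed, status=6/ABRT"
EXIT_RE = re.compile(r"code=(?P<code>\w+), status=(?P<status>\d+)/(?P<name>\w+)")


def decode_exit(line):
    """Return (code, description) for a systemd exit line, or None if it does not match."""
    match = EXIT_RE.search(line)
    if match is None:
        return None
    code = match["code"]
    status = int(match["status"])
    if code in ("killed", "dumped"):
        # For signal-terminated processes the number is a signal number.
        return code, signal.Signals(status).name
    # Otherwise keep the symbolic name systemd printed (e.g. "FAILURE").
    return code, match["name"]
```

Feeding it the `neard.service: Main process exited` line from this report yields `("killed", "SIGABRT")`.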

Grafana is installed on the server, and nothing there stands out either.

Screenshot_2024-03-11_19-36-08
Screenshot_2024-03-11_19-35-47
Screenshot_2024-03-11_19-37-04
Screenshot_2024-03-11_19-37-19
Screenshot_2024-03-11_19-37-35
Screenshot_2024-03-11_19-37-51
Screenshot_2024-03-11_19-38-08

The only notable thing is an increased number of memory page faults:
Screenshot_2024-03-11_20-00-26

@SNSMLN changed the title from "Node down with a error - Insufficient resources: could not allocate code memory: Cannot alloc ate memory (os error 12)" to "Node down with a error - Insufficient resources: could not allocate code memory: Cannot allocate memory (os error 12)" on Mar 11, 2024
@walnut-the-cat
Collaborator

cc. @alexauroradev @marcelo-gonzalez

@pugachAG

This looks like an issue caused by insufficient memory, but the "Memory Basic" dashboard seems to indicate that the node still had enough memory. @nagisa this originated in the runtime; maybe you know what the source could be?
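
For context, `os error 12` is `ENOMEM`. On Linux, `mmap` can return it even when plenty of RAM is free, for example when the process reaches its limit on the number of memory mappings (`vm.max_map_count`). Whether that is what happened here is not established in this thread, so the sketch below is only a hypothetical diagnostic that compares a process's mapping count with that limit. The helper names are made up for illustration, and `report` assumes a Linux `/proc` filesystem.

```python
import errno
from pathlib import Path

# "os error 12" in the panic message corresponds to ENOMEM.
ENOMEM_CODE = errno.ENOMEM


def count_mappings(maps_text: str) -> int:
    """Count mappings in the text of /proc/<pid>/maps (one per non-empty line)."""
    return sum(1 for line in maps_text.splitlines() if line.strip())


def mapping_usage(maps_text: str, max_map_count: int) -> float:
    """Return the fraction of the mapping limit that is already in use."""
    return count_mappings(maps_text) / max_map_count


def report(pid: str = "self") -> str:
    """Summarise mapping usage for a process (Linux only).

    Reading another user's process usually requires elevated privileges.
    """
    maps_text = Path(f"/proc/{pid}/maps").read_text()
    limit = int(Path("/proc/sys/vm/max_map_count").read_text())
    used = count_mappings(maps_text)
    return f"pid {pid}: {used} mappings, limit {limit} ({used / limit:.1%} used)"
```

Calling `report("882540")` with the neard PID from the log would show how close the process is to the limit. A value near 100% would point at mapping exhaustion rather than RAM exhaustion, though other causes of `ENOMEM` remain possible.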

@pugachAG

pugachAG commented Mar 12, 2024

The issue should now be fixed with near/nearcore#10733 and near/nearcore#10736, so it should no longer happen once we release a new neard version for statelessnet.

@DDeAlmeida
Contributor

 Mem: 4.41 GB
2024-03-11T18:45:52.651473Z  INFO stats: #114503377 Bp2yKPEKsQREgTbijWLohffYip38gn2X412ipaBv7Ro3 Validator | 20 validators 32 peers ⬇ 14.3 MB/s ⬆ 1.49 MB/s 1.00 bps 0 gas/s CPU: 245%, Mem: 4.43 GB
2024-03-11T18:46:02.652847Z  INFO stats: #114503386 J52mEV3xWyeZtuj2KwS3stz1kFc8fTJa58Hg96935frv Validator | 20 validators 32 peers ⬇ 14.9 MB/s ⬆ 1.76 MB/s 0.90 bps 0 gas/s CPU: 249%, Mem: 4.42 GB
2024-03-11T18:46:12.654089Z  INFO stats: #114503396 C3uD39kbbvXhpFBYp5f85whmCpd32y5ExepeEzEVrzfj Validator | 20 validators 33 peers ⬇ 15.4 MB/s ⬆ 1.87 MB/s 1.00 bps 0 gas/s CPU: 239%, Mem: 4.39 GB
2024-03-11T18:46:20.461053Z  INFO near_network::peer_manager::connection: peer ed25519:AJJ7CC1GKyJAUKVd9xRt9YED99i176b9rm4NZRCYCM1s disconnected, while sending SyncAccountsData
2024-03-11T18:46:20.461069Z  INFO near_network::peer_manager::connection: peer ed25519:AJJ7CC1GKyJAUKVd9xRt9YED99i176b9rm4NZRCYCM1s disconnected, while sending SyncAccountsData
2024-03-11T18:46:20.461074Z  INFO near_network::peer_manager::connection: peer ed25519:AJJ7CC1GKyJAUKVd9xRt9YED99i176b9rm4NZRCYCM1s disconnected, while sending SyncAccountsData
2024-03-11T18:46:20.461078Z  INFO near_network::peer_manager::connection: peer ed25519:AJJ7CC1GKyJAUKVd9xRt9YED99i176b9rm4NZRCYCM1s disconnected, while sending SyncAccountsData
2024-03-11T18:46:22.655366Z  INFO stats: #114503406 4GZMm1cNMviUt1S5MymkJBUxmrEjjbH8E6tbaRX9MZ51 Validator | 20 validators 32 peers ⬇ 15.8 MB/s ⬆ 1.85 MB/s 1.00 bps 0 gas/s CPU: 259%, Mem: 4.38 GB
2024-03-11T18:46:32.657147Z  INFO stats: #114503416 F5vSuq2qWAWYMMbJSNDDAmmKfCResjMAyYgxPjaPdie8 Validator | 20 validators 32 peers ⬇ 15.7 MB/s ⬆ 1.86 MB/s 1.00 bps 0 gas/s CPU: 176%, Mem: 4.34 GB
2024-03-11T18:46:42.658467Z  INFO stats: #114503423 3zSq8mDZvSvSAHNHuyanFnMQBdQofaVdDJy4NyC6yudg Validator | 20 validators 32 peers ⬇ 14.3 MB/s ⬆ 1.75 MB/s 0.70 bps 0 gas/s CPU: 124%, Mem: 4.37 GB
2024-03-11T18:46:52.658933Z  INFO stats: #114503434 AjAA6m2Bow9jeHctRLwvcLDkdxDhv1wt82oEJ9ZoisvR Validator | 20 validators 33 peers ⬇ 13.9 MB/s ⬆ 1.82 MB/s 1.00 bps 0 gas/s CPU: 202%, Mem: 4.41 GB
thread '<unnamed>' panicked at runtime/runtime/src/actions.rs:172:13:
Contract runtime failed to load a contrct: Insufficient resources: could not allocate code memory: Cannot allocate memory (os error 12)
stack backtrace:
   0: rust_begin_unwind
             at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panicking.rs:645:5
   1: core::panicking::panic_fmt
             at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/panicking.rs:72:14
   2: node_runtime::actions::execute_function_call
   3: node_runtime::Runtime::apply_action
   4: node_runtime::Runtime::apply_action_receipt
   5: node_runtime::Runtime::apply::{{closure}}
   6: node_runtime::Runtime::apply
   7: <nearcore::runtime::NightshadeRuntime as near_chain::types::RuntimeAdapter>::apply_chunk
   8: near_chain::update_shard::apply_new_chunk
   9: <rayon_core::job::HeapJob<BODY> as rayon_core::job::Job>::execute
  10: rayon_core::registry::WorkerThread::wait_until_cold
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
Aborted

Same here
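
To check whether reported memory was climbing before the panic, the `INFO stats` lines can be parsed. This is a minimal sketch that assumes the line layout shown in the logs above (`#<height> ... CPU: N%, Mem: X GB`). `parse_stats` and `mem_range` are hypothetical helpers, and only the `GB` unit seen in this thread is handled.

```python
import re

# Assumed layout, based on the stats lines quoted in this thread.
STATS_RE = re.compile(
    r"INFO stats: #(?P<height>\d+) .*CPU: (?P<cpu>\d+)%, Mem: (?P<mem>[\d.]+) GB"
)


def parse_stats(line):
    """Return (block_height, cpu_percent, mem_gb) or None for non-stats lines."""
    match = STATS_RE.search(line)
    if match is None:
        return None
    return int(match["height"]), int(match["cpu"]), float(match["mem"])


def mem_range(lines):
    """Return (min_gb, max_gb) over all stats lines, or None if there are none."""
    readings = [parsed[2] for parsed in map(parse_stats, lines) if parsed is not None]
    if not readings:
        return None
    return min(readings), max(readings)
```

Applied to the complete stats lines in the log above, the readings stay between 4.34 and 4.43 GB, so process memory as reported by neard was essentially flat in the minute before the panic.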

@walnut-the-cat
Collaborator

walnut-the-cat commented Mar 21, 2024

@SNSMLN , @DDeAlmeida could you confirm if the issue is gone now?

@SNSMLN
Author

SNSMLN commented Mar 22, 2024

> @SNSMLN , @DDeAlmeida could you confirm if the issue is gone now?

After updating to build 1.36.1-298-g984f6ad71, the issue has not reappeared.

@DDeAlmeida
Contributor

Same here
