
[Bug]: Vllm0.6.2 UserWarning: resource_tracker: There appear to be 1 leaked shared_memory objects to clean up at shutdown #8933

Open
Clint-chan opened this issue Sep 29, 2024 · 14 comments
Labels: bug (Something isn't working)

Comments

@Clint-chan

Your current environment

(Environment details were posted as a screenshot only; no text output is available.)

Model Input Dumps

No response

🐛 Describe the bug

(demo_vllm) demo@dgx03:/raid/xinference/modelscope/hub/qwen/Qwen2-72B-Instruct/logs$ tail -f vllm_20240927.log
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x57 (0x7f5cf7ba9897 in /raid/demo/anaconda3/envs/vllm/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #1: c10d::ProcessGroupNCCL::WorkNCCL::checkTimeout(std::optional<std::chrono::duration<long, std::ratio<1l, 1000l> > >) + 0x1d2 (0x7f5cf8e82c62 in /raid/demo/anaconda3/envs/vllm/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #2: c10d::ProcessGroupNCCL::watchdogHandler() + 0x1a0 (0x7f5cf8e87a80 in /raid/demo/anaconda3/envs/vllm/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #3: c10d::ProcessGroupNCCL::ncclCommWatchdog() + 0x10c (0x7f5cf8e88dcc in /raid/demo/anaconda3/envs/vllm/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #4: + 0xdbbf4 (0x7f5d44931bf4 in /raid/demo/anaconda3/envs/vllm/bin/../lib/libstdc++.so.6)
frame #5: + 0x8609 (0x7f5d4618b609 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #6: clone + 0x43 (0x7f5d45f56353 in /lib/x86_64-linux-gnu/libc.so.6)

/raid/demo/anaconda3/envs/vllm/lib/python3.10/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 1 leaked shared_memory objects to clean up at shutdown
warnings.warn('resource_tracker: There appear to be %d '
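The leaked shared_memory objects the resource tracker complains about appear to be the POSIX shared-memory segments that vLLM's shm_broadcast message queue allocates; when the workers die from the NCCL timeout above, nothing unlinks them, so Python's resource_tracker reports them at shutdown. A minimal, hedged way to confirm and clean up after the server is fully stopped (the psm_* prefix is Python's default SharedMemory naming, so double-check that nothing else owns those segments before deleting):

# list POSIX shared-memory segments left behind after the crash
ls -lh /dev/shm
# once all vLLM processes are gone, stale psm_* segments can be removed
rm -i /dev/shm/psm_*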

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.
Clint-chan added the bug label on Sep 29, 2024

@Clint-chan
Author

LOG_DIR="/raid/xinference/modelscope/hub/qwen/Qwen2-72B-Instruct/logs"
mkdir -p "$LOG_DIR"
LOG_FILE="$LOG_DIR/vllm_$(date +%Y%m%d).log"
echo "Logging to $LOG_FILE" >&2
conda run -n vllm_062 vllm serve \
  /raid/Qwen2-72B-Instruct \
  --port 20005 \
  --dtype bfloat16 \
  --tensor-parallel-size 4 \
  --max-num-seqs 1024 \
  --gpu-memory-utilization 0.85 \
  --max-num-batched-tokens 8192 \
  --max-model-len 8192 \
  --block-size 32 \
  --enforce-eager \
  --enable_chunked_prefill=True \
  > "$LOG_FILE" 2>&1 &

@Clint-chan
Author

分析一下我的日志:INFO 09-29 08:48:42 logger.py:36] Received request chat-312348aecf9e42ec91de81da41e091db: prompt: '<|im_start|>system\n你是一位商品名词分析专家,请从描述内容中分析是否包含与输入的产品关键词含义相同。请使用Json 的格式返回如下结果,结果的值只允许有包含和不包含。格式 如下:{“result”:"结果"}<|im_end|>\n<|im_start|>user\nFIBER CABLE是否包含在哪下内容中,内容如下:None<|im_end|>\n<|im_start|>assistant\n', params: SamplingParams(n=1, best_of=1, presence_penalty=0.0, frequency_penalty=0.0, repetition_penalty=1.0, temperature=0.7, top_p=1.0, top_k=-1, min_p=0.0, seed=None, use_beam_search=False, length_penalty=1.0, early_stopping=False, stop=[], stop_token_ids=[], include_stop_str_in_output=False, ignore_eos=False, max_tokens=None, min_tokens=0, logprobs=None, prompt_logprobs=None, skip_special_tokens=True, spaces_between_special_tokens=True, truncate_prompt_tokens=None), prompt_token_ids: [151644, 8948, 198, 56568, 109182, 45943, 113046, 101042, 101057, 37945, 45181, 53481, 43815, 15946, 101042, 64471, 102298, 57218, 31196, 104310, 105291, 109091, 102486, 1773, 14880, 37029, 5014, 43589, 68805, 31526, 104506, 59151, 3837, 59151, 9370, 25511, 91680, 102496, 18830, 102298, 33108, 16530, 102298, 1773, 68805, 69372, 16872, 5122, 90, 2073, 1382, 854, 2974, 59151, 9207, 151645, 198, 151644, 872, 198, 37, 3256, 640, 356, 3494, 64471, 102298, 109333, 16872, 43815, 15946, 3837, 43815, 104506, 5122, 4064, 151645, 198, 151644, 77091, 198], lora_request: None, prompt_adapter_request: None.
INFO: 61.171.72.231:32674 - "POST /v1/chat/completions HTTP/1.1" 200 OK
ERROR: Exception in ASGI application
Traceback (most recent call last):
File "/raid/demo/anaconda3/envs/vllm/lib/python3.10/site-packages/starlette/responses.py", line 265, in call
await wrap(partial(self.listen_for_disconnect, receive))
File "/raid/demo/anaconda3/envs/vllm/lib/python3.10/site-packages/starlette/responses.py", line 261, in wrap
await func()
File "/raid/demo/anaconda3/envs/vllm/lib/python3.10/site-packages/starlette/responses.py", line 238, in listen_for_disconnect
message = await receive()
File "/raid/demo/anaconda3/envs/vllm/lib/python3.10/site-packages/uvicorn/protocols/http/httptools_impl.py", line 553, in receive
await self.message_event.wait()
File "/raid/demo/anaconda3/envs/vllm/lib/python3.10/asyncio/locks.py", line 214, in wait
await fut
asyncio.exceptions.CancelledError: Cancelled by cancel scope 7f5c49584a90

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/raid/demo/anaconda3/envs/vllm/lib/python3.10/site-packages/uvicorn/protocols/http/httptools_impl.py", line 399, in run_asgi
result = await app( # type: ignore[func-returns-value]
File "/raid/demo/anaconda3/envs/vllm/lib/python3.10/site-packages/uvicorn/middleware/proxy_headers.py", line 70, in call
return await self.app(scope, receive, send)
File "/raid/demo/anaconda3/envs/vllm/lib/python3.10/site-packages/fastapi/applications.py", line 1054, in call
await super().call(scope, receive, send)
File "/raid/demo/anaconda3/envs/vllm/lib/python3.10/site-packages/starlette/applications.py", line 123, in call
await self.middleware_stack(scope, receive, send)
File "/raid/demo/anaconda3/envs/vllm/lib/python3.10/site-packages/starlette/middleware/errors.py", line 186, in call
raise exc
File "/raid/demo/anaconda3/envs/vllm/lib/python3.10/site-packages/starlette/middleware/errors.py", line 164, in call
await self.app(scope, receive, _send)
File "/raid/demo/anaconda3/envs/vllm/lib/python3.10/site-packages/starlette/middleware/cors.py", line 85, in call
await self.app(scope, receive, send)
File "/raid/demo/anaconda3/envs/vllm/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 65, in call
await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
File "/raid/demo/anaconda3/envs/vllm/lib/python3.10/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
raise exc
File "/raid/demo/anaconda3/envs/vllm/lib/python3.10/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
await app(scope, receive, sender)
File "/raid/demo/anaconda3/envs/vllm/lib/python3.10/site-packages/starlette/routing.py", line 756, in call
await self.middleware_stack(scope, receive, send)
File "/raid/demo/anaconda3/envs/vllm/lib/python3.10/site-packages/starlette/routing.py", line 776, in app
await route.handle(scope, receive, send)
File "/raid/demo/anaconda3/envs/vllm/lib/python3.10/site-packages/starlette/routing.py", line 297, in handle
await self.app(scope, receive, send)
File "/raid/demo/anaconda3/envs/vllm/lib/python3.10/site-packages/starlette/routing.py", line 77, in app
await wrap_app_handling_exceptions(app, request)(scope, receive, send)
File "/raid/demo/anaconda3/envs/vllm/lib/python3.10/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
raise exc
File "/raid/demo/anaconda3/envs/vllm/lib/python3.10/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
await app(scope, receive, sender)
File "/raid/demo/anaconda3/envs/vllm/lib/python3.10/site-packages/starlette/routing.py", line 75, in app
await response(scope, receive, send)
File "/raid/demo/anaconda3/envs/vllm/lib/python3.10/site-packages/starlette/responses.py", line 258, in call
async with anyio.create_task_group() as task_group:
File "/raid/demo/anaconda3/envs/vllm/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 680, in aexit
raise BaseExceptionGroup(
exceptiongroup.ExceptionGroup: unhandled errors in a TaskGroup (1 sub-exception)
(VllmWorkerProcess pid=3223236) WARNING 09-29 08:48:42 shm_broadcast.py:404] No available block found in 60 second.
(VllmWorkerProcess pid=3223237) WARNING 09-29 08:48:42 shm_broadcast.py:404] No available block found in 60 second.
(VllmWorkerProcess pid=3223235) WARNING 09-29 08:48:42 shm_broadcast.py:404] No available block found in 60 second.
[rank2]:[E ProcessGroupNCCL.cpp:563] [Rank 2] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=29100340, OpType=GATHER, NumelIn=38016, NumelOut=0, Timeout(ms)=600000) ran for 600043 milliseconds before timing out.
[rank2]:[E ProcessGroupNCCL.cpp:1537] [PG 2 Rank 2] Timeout at NCCL work: 29100340, last enqueued NCCL work: 29100340, last completed NCCL work: 29100339.
[rank2]:[E ProcessGroupNCCL.cpp:577] [Rank 2] Some NCCL operations have failed or timed out. Due to the asynchronous nature of CUDA kernels, subsequent GPU operations might run on corrupted/incomplete data.
[rank2]:[E ProcessGroupNCCL.cpp:583] [Rank 2] To avoid data inconsistency, we are taking the entire process down.
[rank2]:[E ProcessGroupNCCL.cpp:1414] [PG 2 Rank 2] Process group watchdog thread terminated with exception: [Rank 2] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=29100340, OpType=GATHER, NumelIn=38016, NumelOut=0, Timeout(ms)=600000) ran for 600043 milliseconds before timing out.
Exception raised from checkTimeout at ../torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:565 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x57 (0x7f5cf7ba9897 in /raid/demo/anaconda3/envs/vllm/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #1: c10d::ProcessGroupNCCL::WorkNCCL::checkTimeout(std::optional<std::chrono::duration<long, std::ratio<1l, 1000l> > >) + 0x1d2 (0x7f5cf8e82c62 in /raid/demo/anaconda3/envs/vllm/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #2: c10d::ProcessGroupNCCL::watchdogHandler() + 0x1a0 (0x7f5cf8e87a80 in /raid/demo/anaconda3/envs/vllm/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #3: c10d::ProcessGroupNCCL::ncclCommWatchdog() + 0x10c (0x7f5cf8e88dcc in /raid/demo/anaconda3/envs/vllm/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #4: + 0xdbbf4 (0x7f5d44931bf4 in /raid/demo/anaconda3/envs/vllm/bin/../lib/libstdc++.so.6)
frame #5: + 0x8609 (0x7f5d4618b609 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #6: clone + 0x43 (0x7f5d45f56353 in /lib/x86_64-linux-gnu/libc.so.6)

[rank3]:[E ProcessGroupNCCL.cpp:563] [Rank 3] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=29100340, OpType=GATHER, NumelIn=38016, NumelOut=0, Timeout(ms)=600000) ran for 600051 milliseconds before timing out.
[rank3]:[E ProcessGroupNCCL.cpp:1537] [PG 2 Rank 3] Timeout at NCCL work: 29100340, last enqueued NCCL work: 29100340, last completed NCCL work: 29100339.
[rank3]:[E ProcessGroupNCCL.cpp:577] [Rank 3] Some NCCL operations have failed or timed out. Due to the asynchronous nature of CUDA kernels, subsequent GPU operations might run on corrupted/incomplete data.
[rank3]:[E ProcessGroupNCCL.cpp:583] [Rank 3] To avoid data inconsistency, we are taking the entire process down.
[rank3]:[E ProcessGroupNCCL.cpp:1414] [PG 2 Rank 3] Process group watchdog thread terminated with exception: [Rank 3] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=29100340, OpType=GATHER, NumelIn=38016, NumelOut=0, Timeout(ms)=600000) ran for 600051 milliseconds before timing out.
Exception raised from checkTimeout at ../torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:565 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x57 (0x7f5cf7ba9897 in /raid/demo/anaconda3/envs/vllm/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #1: c10d::ProcessGroupNCCL::WorkNCCL::checkTimeout(std::optional<std::chrono::duration<long, std::ratio<1l, 1000l> > >) + 0x1d2 (0x7f5cf8e82c62 in /raid/demo/anaconda3/envs/vllm/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #2: c10d::ProcessGroupNCCL::watchdogHandler() + 0x1a0 (0x7f5cf8e87a80 in /raid/demo/anaconda3/envs/vllm/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #3: c10d::ProcessGroupNCCL::ncclCommWatchdog() + 0x10c (0x7f5cf8e88dcc in /raid/demo/anaconda3/envs/vllm/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #4: + 0xdbbf4 (0x7f5d44931bf4 in /raid/demo/anaconda3/envs/vllm/bin/../lib/libstdc++.so.6)
frame #5: + 0x8609 (0x7f5d4618b609 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #6: clone + 0x43 (0x7f5d45f56353 in /lib/x86_64-linux-gnu/libc.so.6)

[rank1]:[E ProcessGroupNCCL.cpp:563] [Rank 1] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=29100340, OpType=GATHER, NumelIn=38016, NumelOut=0, Timeout(ms)=600000) ran for 600053 milliseconds before timing out.
[rank1]:[E ProcessGroupNCCL.cpp:1537] [PG 2 Rank 1] Timeout at NCCL work: 29100340, last enqueued NCCL work: 29100340, last completed NCCL work: 29100339.
[rank1]:[E ProcessGroupNCCL.cpp:577] [Rank 1] Some NCCL operations have failed or timed out. Due to the asynchronous nature of CUDA kernels, subsequent GPU operations might run on corrupted/incomplete data.
[rank1]:[E ProcessGroupNCCL.cpp:583] [Rank 1] To avoid data inconsistency, we are taking the entire process down.
[rank1]:[E ProcessGroupNCCL.cpp:1414] [PG 2 Rank 1] Process group watchdog thread terminated with exception: [Rank 1] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=29100340, OpType=GATHER, NumelIn=38016, NumelOut=0, Timeout(ms)=600000) ran for 600053 milliseconds before timing out.
Exception raised from checkTimeout at ../torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:565 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x57 (0x7f5cf7ba9897 in /raid/demo/anaconda3/envs/vllm/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #1: c10d::ProcessGroupNCCL::WorkNCCL::checkTimeout(std::optional<std::chrono::duration<long, std::ratio<1l, 1000l> > >) + 0x1d2 (0x7f5cf8e82c62 in /raid/demo/anaconda3/envs/vllm/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #2: c10d::ProcessGroupNCCL::watchdogHandler() + 0x1a0 (0x7f5cf8e87a80 in /raid/demo/anaconda3/envs/vllm/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #3: c10d::ProcessGroupNCCL::ncclCommWatchdog() + 0x10c (0x7f5cf8e88dcc in /raid/demo/anaconda3/envs/vllm/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #4: + 0xdbbf4 (0x7f5d44931bf4 in /raid/demo/anaconda3/envs/vllm/bin/../lib/libstdc++.so.6)
frame #5: + 0x8609 (0x7f5d4618b609 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #6: clone + 0x43 (0x7f5d45f56353 in /lib/x86_64-linux-gnu/libc.so.6)

[rank0]:[E ProcessGroupNCCL.cpp:563] [Rank 0] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=29100340, OpType=GATHER, NumelIn=38016, NumelOut=38016, Timeout(ms)=600000) ran for 600079 milliseconds before timing out.
[rank0]:[E ProcessGroupNCCL.cpp:1537] [PG 2 Rank 0] Timeout at NCCL work: 29100340, last enqueued NCCL work: 29100340, last completed NCCL work: 29100339.
[rank0]:[E ProcessGroupNCCL.cpp:577] [Rank 0] Some NCCL operations have failed or timed out. Due to the asynchronous nature of CUDA kernels, subsequent GPU operations might run on corrupted/incomplete data.
[rank0]:[E ProcessGroupNCCL.cpp:583] [Rank 0] To avoid data inconsistency, we are taking the entire process down.
[rank0]:[E ProcessGroupNCCL.cpp:1414] [PG 2 Rank 0] Process group watchdog thread terminated with exception: [Rank 0] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=29100340, OpType=GATHER, NumelIn=38016, NumelOut=38016, Timeout(ms)=600000) ran for 600079 milliseconds before timing out.
Exception raised from checkTimeout at ../torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:565 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x57 (0x7f5cf7ba9897 in /raid/demo/anaconda3/envs/vllm/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #1: c10d::ProcessGroupNCCL::WorkNCCL::checkTimeout(std::optional<std::chrono::duration<long, std::ratio<1l, 1000l> > >) + 0x1d2 (0x7f5cf8e82c62 in /raid/demo/anaconda3/envs/vllm/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #2: c10d::ProcessGroupNCCL::watchdogHandler() + 0x1a0 (0x7f5cf8e87a80 in /raid/demo/anaconda3/envs/vllm/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #3: c10d::ProcessGroupNCCL::ncclCommWatchdog() + 0x10c (0x7f5cf8e88dcc in /raid/demo/anaconda3/envs/vllm/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #4: + 0xdbbf4 (0x7f5d44931bf4 in /raid/demo/anaconda3/envs/vllm/bin/../lib/libstdc++.so.6)
frame #5: + 0x8609 (0x7f5d4618b609 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #6: clone + 0x43 (0x7f5d45f56353 in /lib/x86_64-linux-gnu/libc.so.6)

/raid/demo/anaconda3/envs/vllm/lib/python3.10/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 1 leaked shared_memory objects to clean up at shutdown
warnings.warn('resource_tracker: There appear to be %d '
/raid/demo/anaconda3/envs/vllm/lib/python3.10/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 1 leaked shared_memory objects to clean up at shutdown
warnings.warn('resource_tracker: There appear to be %d '
/raid/demo/anaconda3/envs/vllm/lib/python3.10/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 1 leaked shared_memory objects to clean up at shutdown
warnings.warn('resource_tracker: There appear to be %d '
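The shm_broadcast "No available block found in 60 second" warnings followed by the GATHER timeout suggest the workers stopped draining the shared-memory queue; the 600000 ms in the log matches the usual default 10-minute NCCL watchdog timeout, so the question is what stalled the collective rather than the timeout itself. One hedged way to get more signal on the next run is to relaunch with standard NCCL/CUDA debug variables (nothing vLLM-specific; note that CUDA_LAUNCH_BLOCKING=1 slows serving down considerably):

# same launch command as earlier in the thread, with debug variables prepended
NCCL_DEBUG=INFO \
CUDA_LAUNCH_BLOCKING=1 \
conda run -n vllm_062 vllm serve /raid/Qwen2-72B-Instruct \
  --port 20005 --dtype bfloat16 --tensor-parallel-size 4 --enforce-eager \
  > "$LOG_FILE" 2>&1 &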

@ruleGreen

same here

@hxhcreate

same here

@Guochry

Guochry commented Oct 6, 2024

same here

@double-vin

This problem occurs as soon as multiple GPU cards are used.
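If it only reproduces with tensor parallelism across several cards, it may be worth ruling out interconnect or GPU-health problems before digging into vLLM itself; a quick, hedged set of checks with standard NVIDIA tooling:

# show how the GPUs are linked (NVLink vs. PCIe, NUMA affinity)
nvidia-smi topo -m
# look for Xid errors in the kernel log that can stall a collective mid-flight
dmesg | grep -i xid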

@CREESTL

CREESTL commented Oct 9, 2024

+1

@Clint-chan
Author

I'm facing a new bug now:
INFO: 61.171.72.231:10464 - "POST /v1/chat/completions HTTP/1.1" 200 OK
INFO 10-16 11:05:06 model_runner_base.py:120] Writing input of failed execution to /tmp/err_execute_model_input_20241016-110506.pkl...
[rank0]:[E1016 11:05:06.676414458 ProcessGroupNCCL.cpp:1515] [PG 3 Rank 0] Process group watchdog thread terminated with exception: CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

Exception raised from c10_cuda_check_implementation at ../c10/cuda/CUDAException.cpp:43 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7f87881a0f86 in /raid/demo/anaconda3/envs/vllm_latest/lib/python3.12/site-packages/torch/lib/libc10.so)
frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::string const&) + 0x64 (0x7f878814fd10 in /raid/demo/anaconda3/envs/vllm_latest/lib/python3.12/site-packages/torch/lib/libc10.so)
frame #2: c10::cuda::c10_cuda_check_implementation(int, char const*, char const*, int, bool) + 0x118 (0x7f878827bf08 in /raid/demo/anaconda3/envs/vllm_latest/lib/python3.12/site-packages/torch/lib/libc10_cuda.so)
frame #3: c10d::ProcessGroupNCCL::WorkNCCL::finishedGPUExecutionInternal() const + 0x56 (0x7f87894983e6 in /raid/demo/anaconda3/envs/vllm_latest/lib/python3.12/site-packages/torch/lib/libtorch_cuda.so)
frame #4: c10d::ProcessGroupNCCL::WorkNCCL::isCompleted() + 0xa0 (0x7f878949d600 in /raid/demo/anaconda3/envs/vllm_latest/lib/python3.12/site-packages/torch/lib/libtorch_cuda.so)
frame #5: c10d::ProcessGroupNCCL::watchdogHandler() + 0x1da (0x7f87894a42ba in /raid/demo/anaconda3/envs/vllm_latest/lib/python3.12/site-packages/torch/lib/libtorch_cuda.so)
frame #6: c10d::ProcessGroupNCCL::ncclCommWatchdog() + 0x10c (0x7f87894a66fc in /raid/demo/anaconda3/envs/vllm_latest/lib/python3.12/site-packages/torch/lib/libtorch_cuda.so)
frame #7: + 0xdbbf4 (0x7f87d6c54bf4 in /raid/demo/anaconda3/envs/vllm_latest/bin/../lib/libstdc++.so.6)
frame #8: + 0x8609 (0x7f87d866d609 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #9: clone + 0x43 (0x7f87d8438353 in /lib/x86_64-linux-gnu/libc.so.6)

INFO 10-16 11:05:09 logger.py:36] Received request chat-af9601cd9bc24371bb0b11c4a36bc55f: prompt: '<|im_start|>system\n<|im_end|>\n<|im_start|>user\n\n## Role: 专业AI翻译助手\n- description: 你是一个专业的AI翻译助手,精通中文。\n\n## Skills\n1. 精通多种语言的翻译技巧\n2. 准确理解和传达原文含义\n3. 严格遵循输出格式要求\n4. 专业术语和品牌名称的处理能力\n\n## Rules\n1. 仅输出原文和翻译,不添加任何解释或评论。\n2. 严格按照指定的JSON格式提供翻译结果。\n3. 保持专业术语和品牌名称不变,除非有官方的中文译名。\n4. 确保翻译准确传达原文含义,同时保持中文的自然流畅。\n5. 如遇无法翻译的内容,保持原文不变,不做额外说明。\n6. 禁止在输出中包含任何额外的解释、注释或元数据。\n\n## Workflow\n1. 接收翻译任务:目标语言为中文,待翻译文本为"BATERIA RECARGABLE PARA FUENTE DE VOLTAJE UNINTERRUPTIBLE"。\n2. 分析文本,识别专业术语和品牌名称。\n3. 进行翻译,遵循翻译规则。\n4. 按照指定的JSON格式输出结果。\n\n## OutputFormat\n{\n "翻译后": ""\n}\n\n## Init\n你的任务是将给定的文本"BATERIA RECARGABLE PARA FUENTE DE VOLTAJE UNINTERRUPTIBLE"准确翻译成中文。请直接开始翻译工作,只返回要求的JSON格式结果。<|im_end|>\n<|im_start|>assistant\n', params: SamplingParams(n=1, best_of=1, presence_penalty=0.0, frequency_penalty=0.0, repetition_penalty=1.0, temperature=0.5, top_p=1.0, top_k=-1, min_p=0.0, seed=None, use_beam_search=False, length_penalty=1.0, early_stopping=False, stop=[], stop_token_ids=[], include_stop_str_in_output=False, ignore_eos=False, max_tokens=7868, min_tokens=0, logprobs=None, prompt_logprobs=None, skip_special_tokens=True, spaces_between_special_tokens=True, truncate_prompt_tokens=None), prompt_token_ids: [151644, 8948, 198, 151645, 198, 151644, 872, 271, 565, 15404, 25, 220, 99878, 15469, 105395, 110498, 198, 12, 4008, 25, 220, 56568, 101909, 104715, 15469, 105395, 110498, 3837, 114806, 104811, 3407, 565, 30240, 198, 16, 13, 10236, 110, 122, 31935, 101312, 102064, 9370, 105395, 102118, 198, 17, 13, 65727, 228, 33956, 115167, 107707, 103283, 109091, 198, 18, 13, 220, 100470, 106466, 66017, 68805, 101882, 198, 19, 13, 220, 99878, 116925, 33108, 100135, 29991, 9370, 54542, 99788, 271, 565, 22847, 198, 16, 13, 220, 99373, 66017, 103283, 33108, 105395, 3837, 16530, 42855, 99885, 104136, 57191, 85641, 8997, 17, 13, 220, 110439, 105146, 9370, 5370, 68805, 99553, 105395, 59151, 8997, 18, 13, 220, 100662, 99878, 116925, 33108, 100135, 29991, 105928, 3837, 106781, 18830, 100777, 9370, 104811, 102610, 13072, 8997, 19, 13, 10236, 94, 106, 32463, 105395, 102188, 107707, 103283, 109091, 3837, 91572, 100662, 104811, 9370, 99795, 110205, 8997, 20, 13, 69372, 99688, 101068, 105395, 104597, 3837, 100662, 103283, 105928, 3837, 109513, 108593, 66394, 8997, 21, 13, 10236, 99, 223, 81433, 18493, 66017, 15946, 102298, 99885, 108593, 9370, 104136, 5373, 25074, 68862, 57191, 23305, 20074, 3407, 565, 60173, 198, 16, 13, 46602, 98, 50009, 105395, 88802, 5122, 100160, 102064, 17714, 104811, 3837, 74193, 105395, 108704, 17714, 63590, 19157, 5863, 74136, 7581, 3494, 50400, 95512, 93777, 3385, 68226, 15204, 40329, 6643, 3221, 48584, 13568, 1, 8997, 17, 13, 58657, 97771, 108704, 3837, 102450, 99878, 116925, 33108, 100135, 29991, 8997, 18, 13, 32181, 249, 22243, 105395, 3837, 106466, 105395, 104190, 8997, 19, 13, 6567, 234, 231, 99331, 105146, 9370, 5370, 68805, 66017, 59151, 3407, 565, 9258, 4061, 198, 515, 220, 330, 105395, 33447, 788, 8389, 630, 565, 15690, 198, 103929, 88802, 20412, 44063, 89012, 22382, 9370, 108704, 63590, 19157, 5863, 74136, 7581, 3494, 50400, 95512, 93777, 3385, 68226, 15204, 40329, 6643, 3221, 48584, 13568, 1, 102188, 105395, 12857, 104811, 1773, 14880, 101041, 55286, 105395, 99257, 3837, 91680, 31526, 101882, 9370, 5370, 68805, 59151, 1773, 151645, 198, 151644, 77091, 198], lora_request: None, prompt_adapter_request: None.
INFO 10-16 11:05:10 logger.py:36] Received request chat-8e11f18d93b74c4da90afc8c102132da: prompt: '<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n<|im_start|>user\nYou need to decompose the user's input into "subject" and "intention" in order to accurately figure out what the user's input language actually is. \nNotice: the language type user use could be diverse, which can be English, Chinese, Español, Arabic, Japanese, French, and etc.\nMAKE SURE your output is the SAME language as the user's input!\nYour output is restricted only to: (Input language) Intention + Subject(short as possible)\nYour output MUST be a valid JSON.\n\nTip: When the user's question is directed at you (the language model), you can add an emoji to make it more fun.\n\n\nexample 1:\nUser Input: hi, yesterday i had some burgers.\n{\n "Language Type": "The user's input is pure English",\n "Your Reasoning": "The language of my output must be pure English.",\n "Your Output": "sharing yesterday's food"\n}\n\nexample 2:\nUser Input: hello\n{\n "Language Type": "The user's input is written in pure English",\n "Your Reasoning": "The language of my output must be pure English.",\n "Your Output": "Greeting myself☺️"\n}\n\n\nexample 3:\nUser Input: why mmap file: oom\n{\n "Language Type": "The user's input is written in pure English",\n "Your Reasoning": "The language of my output must be pure English.",\n "Your Output": "Asking about the reason for mmap file: oom"\n}\n\n\nexample 4:\nUser Input: www.convinceme.yesterday-you-ate-seafood.tv讲了什么?\n{\n "Language Type": "The user's input English-Chinese mixed",\n "Your Reasoning": "The English-part is an URL, the main intention is still written in Chinese, so the language of my output must be using Chinese.",\n "Your Output": "询问网站www.convinceme.yesterday-you-ate-seafood.tv"\n}\n\nexample 5:\nUser Input: why小红的年龄is老than小明?\n{\n "Language Type": "The user's input is English-Chinese mixed",\n "Your Reasoning": "The English parts are subjective particles, the main intention is written in Chinese, besides, Chinese occupies a greater "actual meaning" than English, so the language of my output must be using Chinese.",\n "Your Output": "询问小红和小明的年龄"\n}\n\nexample 6:\nUser Input: yo, 你今天咋样?\n{\n "Language Type": "The user's input is English-Chinese mixed",\n "Your Reasoning": "The English-part is a subjective particle, the main intention is written in Chinese, so the language of my output must be using Chinese.",\n "Your Output": "查询今日我的状态☺️"\n}\n\nUser Input: \n Fractal Gaming AB\n<|im_end|>\n<|im_start|>assistant\n', params: SamplingParams(n=1, best_of=1, presence_penalty=0.0, frequency_penalty=0.0, repetition_penalty=1.0, temperature=1.0, top_p=1.0, top_k=-1, min_p=0.0, seed=None, use_beam_search=False, length_penalty=1.0, early_stopping=False, stop=[], stop_token_ids=[], include_stop_str_in_output=False, ignore_eos=False, max_tokens=100, min_tokens=0, logprobs=None, prompt_logprobs=None, skip_special_tokens=True, spaces_between_special_tokens=True, truncate_prompt_tokens=None), prompt_token_ids: [151644, 8948, 198, 2610, 525, 264, 10950, 17847, 13, 151645, 198, 151644, 872, 198, 2610, 1184, 311, 28502, 2900, 279, 1196, 594, 1946, 1119, 330, 11501, 1, 323, 330, 396, 2939, 1, 304, 1973, 311, 29257, 7071, 700, 1128, 279, 1196, 594, 1946, 4128, 3520, 374, 13, 715, 34193, 25, 279, 4128, 943, 1196, 990, 1410, 387, 16807, 11, 892, 646, 387, 6364, 11, 8453, 11, 142658, 11, 34117, 11, 10769, 11, 8585, 11, 323, 4992, 624, 47082, 328, 4521, 697, 2550, 374, 279, 83490, 4128, 438, 279, 
1196, 594, 1946, 4894, 7771, 2550, 374, 21739, 1172, 311, 25, 320, 2505, 4128, 8, 1333, 2939, 488, 17450, 37890, 438, 3204, 340, 7771, 2550, 27732, 387, 264, 2697, 4718, 382, 16011, 25, 3197, 279, 1196, 594, 3405, 374, 15540, 518, 498, 320, 1782, 4128, 1614, 701, 498, 646, 912, 458, 42365, 311, 1281, 432, 803, 2464, 4192, 8687, 220, 16, 510, 1474, 5571, 25, 15588, 11, 13671, 600, 1030, 1045, 62352, 624, 515, 220, 330, 13806, 3990, 788, 330, 785, 1196, 594, 1946, 374, 10526, 6364, 756, 220, 330, 7771, 26759, 287, 788, 330, 785, 4128, 315, 847, 2550, 1969, 387, 10526, 6364, 10346, 220, 330, 7771, 9258, 788, 330, 83646, 13671, 594, 3607, 698, 630, 8687, 220, 17, 510, 1474, 5571, 25, 23811, 198, 515, 220, 330, 13806, 3990, 788, 330, 785, 1196, 594, 1946, 374, 5326, 304, 10526, 6364, 756, 220, 330, 7771, 26759, 287, 788, 330, 785, 4128, 315, 847, 2550, 1969, 387, 10526, 6364, 10346, 220, 330, 7771, 9258, 788, 330, 38, 43632, 7037, 144346, 30543, 698, 3733, 8687, 220, 18, 510, 1474, 5571, 25, 3170, 70866, 1034, 25, 297, 316, 198, 515, 220, 330, 13806, 3990, 788, 330, 785, 1196, 594, 1946, 374, 5326, 304, 10526, 6364, 756, 220, 330, 7771, 26759, 287, 788, 330, 785, 4128, 315, 847, 2550, 1969, 387, 10526, 6364, 10346, 220, 330, 7771, 9258, 788, 330, 2121, 10566, 911, 279, 2874, 369, 70866, 1034, 25, 297, 316, 698, 3733, 8687, 220, 19, 510, 1474, 5571, 25, 8438, 25961, 2840, 3894, 2384, 11282, 45417, 12, 349, 7806, 64, 13915, 14485, 99526, 34187, 99245, 94432, 515, 220, 330, 13806, 3990, 788, 330, 785, 1196, 594, 1946, 6364, 29553, 7346, 9519, 756, 220, 330, 7771, 26759, 287, 788, 330, 785, 6364, 28037, 374, 458, 5548, 11, 279, 1887, 14602, 374, 2058, 5326, 304, 8453, 11, 773, 279, 4128, 315, 847, 2550, 1969, 387, 1667, 8453, 10346, 220, 330, 7771, 9258, 788, 330, 105396, 100010, 2136, 25961, 2840, 3894, 2384, 11282, 45417, 12, 349, 7806, 64, 13915, 14485, 698, 630, 8687, 220, 20, 510, 1474, 5571, 25, 3170, 30709, 99425, 9370, 102185, 285, 91777, 53795, 30709, 30858, 94432, 515, 220, 330, 13806, 3990, 788, 330, 785, 1196, 594, 1946, 374, 6364, 29553, 7346, 9519, 756, 220, 330, 7771, 26759, 287, 788, 330, 785, 6364, 5479, 525, 43022, 18730, 11, 279, 1887, 14602, 374, 5326, 304, 8453, 11, 27758, 11, 8453, 75754, 264, 7046, 330, 11944, 7290, 1, 1091, 6364, 11, 773, 279, 4128, 315, 847, 2550, 1969, 387, 1667, 8453, 10346, 220, 330, 7771, 9258, 788, 330, 105396, 30709, 99425, 33108, 30709, 30858, 9370, 102185, 698, 630, 8687, 220, 21, 510, 1474, 5571, 25, 29496, 11, 220, 56568, 100644, 108806, 90885, 94432, 515, 220, 330, 13806, 3990, 788, 330, 785, 1196, 594, 1946, 374, 6364, 29553, 7346, 9519, 756, 220, 330, 7771, 26759, 287, 788, 330, 785, 6364, 28037, 374, 264, 43022, 18790, 11, 279, 1887, 14602, 374, 5326, 304, 8453, 11, 773, 279, 4128, 315, 847, 2550, 1969, 387, 1667, 8453, 10346, 220, 330, 7771, 9258, 788, 330, 51154, 102242, 97611, 44091, 144346, 30543, 698, 630, 1474, 5571, 25, 715, 2869, 80148, 30462, 14137, 198, 151645, 198, 151644, 77091, 198], lora_request: None, prompt_adapter_request: None.
INFO 10-16 11:05:10 logger.py:36] Received request chat-7a3f5c6924f44630ac4b60b68ddc61d7: prompt: '<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n<|im_start|>user\nYou need to decompose the user's input into "subject" and "intention" in order to accurately figure out what the user's input language actually is. \nNotice: the language type user use could be diverse, which can be English, Chinese, Español, Arabic, Japanese, French, and etc.\nMAKE SURE your output is the SAME language as the user's input!\nYour output is restricted only to: (Input language) Intention + Subject(short as possible)\nYour output MUST be a valid JSON.\n\nTip: When the user's question is directed at you (the language model), you can add an emoji to make it more fun.\n\n\nexample 1:\nUser Input: hi, yesterday i had some burgers.\n{\n "Language Type": "The user's input is pure English",\n "Your Reasoning": "The language of my output must be pure English.",\n "Your Output": "sharing yesterday's food"\n}\n\nexample 2:\nUser Input: hello\n{\n "Language Type": "The user's input is written in pure English",\n "Your Reasoning": "The language of my output must be pure English.",\n "Your Output": "Greeting myself☺️"\n}\n\n\nexample 3:\nUser Input: why mmap file: oom\n{\n "Language Type": "The user's input is written in pure English",\n "Your Reasoning": "The language of my output must be pure English.",\n "Your Output": "Asking about the reason for mmap file: oom"\n}\n\n\nexample 4:\nUser Input: www.convinceme.yesterday-you-ate-seafood.tv讲了什么?\n{\n "Language Type": "The user's input English-Chinese mixed",\n "Your Reasoning": "The English-part is an URL, the main intention is still written in Chinese, so the language of my output must be using Chinese.",\n "Your Output": "询问网站www.convinceme.yesterday-you-ate-seafood.tv"\n}\n\nexample 5:\nUser Input: why小红的年龄is老than小明?\n{\n "Language Type": "The user's input is English-Chinese mixed",\n "Your Reasoning": "The English parts are subjective particles, the main intention is written in Chinese, besides, Chinese occupies a greater "actual meaning" than English, so the language of my output must be using Chinese.",\n "Your Output": "询问小红和小明的年龄"\n}\n\nexample 6:\nUser Input: yo, 你今天咋样?\n{\n "Language Type": "The user's input is English-Chinese mixed",\n "Your Reasoning": "The English-part is a subjective particle, the main intention is written in Chinese, so the language of my output must be using Chinese.",\n "Your Output": "查询今日我的状态☺️"\n}\n\nUser Input: \n Fractal Gaming AB\n<|im_end|>\n<|im_start|>assistant\n', params: SamplingParams(n=1, best_of=1, presence_penalty=0.0, frequency_penalty=0.0, repetition_penalty=1.0, temperature=1.0, top_p=1.0, top_k=-1, min_p=0.0, seed=None, use_beam_search=False, length_penalty=1.0, early_stopping=False, stop=[], stop_token_ids=[], include_stop_str_in_output=False, ignore_eos=False, max_tokens=100, min_tokens=0, logprobs=None, prompt_logprobs=None, skip_special_tokens=True, spaces_between_special_tokens=True, truncate_prompt_tokens=None), prompt_token_ids: [151644, 8948, 198, 2610, 525, 264, 10950, 17847, 13, 151645, 198, 151644, 872, 198, 2610, 1184, 311, 28502, 2900, 279, 1196, 594, 1946, 1119, 330, 11501, 1, 323, 330, 396, 2939, 1, 304, 1973, 311, 29257, 7071, 700, 1128, 279, 1196, 594, 1946, 4128, 3520, 374, 13, 715, 34193, 25, 279, 4128, 943, 1196, 990, 1410, 387, 16807, 11, 892, 646, 387, 6364, 11, 8453, 11, 142658, 11, 34117, 11, 10769, 11, 8585, 11, 323, 4992, 624, 47082, 328, 4521, 697, 2550, 374, 279, 83490, 4128, 438, 279, 
1196, 594, 1946, 4894, 7771, 2550, 374, 21739, 1172, 311, 25, 320, 2505, 4128, 8, 1333, 2939, 488, 17450, 37890, 438, 3204, 340, 7771, 2550, 27732, 387, 264, 2697, 4718, 382, 16011, 25, 3197, 279, 1196, 594, 3405, 374, 15540, 518, 498, 320, 1782, 4128, 1614, 701, 498, 646, 912, 458, 42365, 311, 1281, 432, 803, 2464, 4192, 8687, 220, 16, 510, 1474, 5571, 25, 15588, 11, 13671, 600, 1030, 1045, 62352, 624, 515, 220, 330, 13806, 3990, 788, 330, 785, 1196, 594, 1946, 374, 10526, 6364, 756, 220, 330, 7771, 26759, 287, 788, 330, 785, 4128, 315, 847, 2550, 1969, 387, 10526, 6364, 10346, 220, 330, 7771, 9258, 788, 330, 83646, 13671, 594, 3607, 698, 630, 8687, 220, 17, 510, 1474, 5571, 25, 23811, 198, 515, 220, 330, 13806, 3990, 788, 330, 785, 1196, 594, 1946, 374, 5326, 304, 10526, 6364, 756, 220, 330, 7771, 26759, 287, 788, 330, 785, 4128, 315, 847, 2550, 1969, 387, 10526, 6364, 10346, 220, 330, 7771, 9258, 788, 330, 38, 43632, 7037, 144346, 30543, 698, 3733, 8687, 220, 18, 510, 1474, 5571, 25, 3170, 70866, 1034, 25, 297, 316, 198, 515, 220, 330, 13806, 3990, 788, 330, 785, 1196, 594, 1946, 374, 5326, 304, 10526, 6364, 756, 220, 330, 7771, 26759, 287, 788, 330, 785, 4128, 315, 847, 2550, 1969, 387, 10526, 6364, 10346, 220, 330, 7771, 9258, 788, 330, 2121, 10566, 911, 279, 2874, 369, 70866, 1034, 25, 297, 316, 698, 3733, 8687, 220, 19, 510, 1474, 5571, 25, 8438, 25961, 2840, 3894, 2384, 11282, 45417, 12, 349, 7806, 64, 13915, 14485, 99526, 34187, 99245, 94432, 515, 220, 330, 13806, 3990, 788, 330, 785, 1196, 594, 1946, 6364, 29553, 7346, 9519, 756, 220, 330, 7771, 26759, 287, 788, 330, 785, 6364, 28037, 374, 458, 5548, 11, 279, 1887, 14602, 374, 2058, 5326, 304, 8453, 11, 773, 279, 4128, 315, 847, 2550, 1969, 387, 1667, 8453, 10346, 220, 330, 7771, 9258, 788, 330, 105396, 100010, 2136, 25961, 2840, 3894, 2384, 11282, 45417, 12, 349, 7806, 64, 13915, 14485, 698, 630, 8687, 220, 20, 510, 1474, 5571, 25, 3170, 30709, 99425, 9370, 102185, 285, 91777, 53795, 30709, 30858, 94432, 515, 220, 330, 13806, 3990, 788, 330, 785, 1196, 594, 1946, 374, 6364, 29553, 7346, 9519, 756, 220, 330, 7771, 26759, 287, 788, 330, 785, 6364, 5479, 525, 43022, 18730, 11, 279, 1887, 14602, 374, 5326, 304, 8453, 11, 27758, 11, 8453, 75754, 264, 7046, 330, 11944, 7290, 1, 1091, 6364, 11, 773, 279, 4128, 315, 847, 2550, 1969, 387, 1667, 8453, 10346, 220, 330, 7771, 9258, 788, 330, 105396, 30709, 99425, 33108, 30709, 30858, 9370, 102185, 698, 630, 8687, 220, 21, 510, 1474, 5571, 25, 29496, 11, 220, 56568, 100644, 108806, 90885, 94432, 515, 220, 330, 13806, 3990, 788, 330, 785, 1196, 594, 1946, 374, 6364, 29553, 7346, 9519, 756, 220, 330, 7771, 26759, 287, 788, 330, 785, 6364, 28037, 374, 264, 43022, 18790, 11, 279, 1887, 14602, 374, 5326, 304, 8453, 11, 773, 279, 4128, 315, 847, 2550, 1969, 387, 1667, 8453, 10346, 220, 330, 7771, 9258, 788, 330, 51154, 102242, 97611, 44091, 144346, 30543, 698, 630, 1474, 5571, 25, 715, 2869, 80148, 30462, 14137, 198, 151645, 198, 151644, 77091, 198], lora_request: None, prompt_adapter_request: None.
INFO 10-16 11:05:10 logger.py:36] Received request chat-33cd06cd1f474b76a0bde5ecc2beb72a: prompt: '<|im_start|>system\n假如你是一个贸易情报助手,根据给定的词语给出十个英文近义词\n示例:\n\nquestion:\nled\n\nanswer:\n["Light Emitting Diode", "Indicator Light", "Backlight", "Display Screen", "Lamp", "Bulb", "Illuminator", "Glow Stick", "Neon Sign", "Fluorescent Tube"]\n\nquestion:\napple\n\nanswer:\n["Apple", "Fruit", "Malus domestica", "Red Delicious Apple", "Granny Smith Apple", "Gala Apple", "Fuji Apple", "Honeycrisp Apple", "Cripps Pink Apple", "Pink Lady Apple"]\n\n请严格按照这样的数组格式输出,切勿输出其他内容\n请注意,将用户输入的词语首字母大写不能被认定为,不要输出这样的内容\nValve不是valve的近义词\nApple不是apple的近义词\nFruit不是fruit的近义词<|im_end|>\n<|im_start|>user\n Fractal Gaming AB\n\n请生成这样格式的输出:\n其中数组中每个元素为近义词\n["","","","","","","","","",""]\n类似于\n["Apple", "Fruit", "Malus domestica", "Red Delicious Apple", "Granny Smith Apple", "Gala Apple", "Fuji Apple", "Honeycrisp Apple", "Cripps Pink Apple", "Pink Lady Apple"]\n注意,这些近义词全是英语\n注意,将问题的首字母进行大写不能为认定为近义词\n注意,只要返回这个数组即可,其他内容不要返回<|im_end|>\n<|im_start|>assistant\n', params: SamplingParams(n=1, best_of=1, presence_penalty=0.5, frequency_penalty=0.5, repetition_penalty=1.0, temperature=0.9, top_p=1.0, top_k=-1, min_p=0.0, seed=None, use_beam_search=False, length_penalty=1.0, early_stopping=False, stop=[], stop_token_ids=[], include_stop_str_in_output=False, ignore_eos=False, max_tokens=7859, min_tokens=0, logprobs=None, prompt_logprobs=None, skip_special_tokens=True, spaces_between_special_tokens=True, truncate_prompt_tokens=None), prompt_token_ids: [151644, 8948, 198, 105812, 56568, 101909, 100726, 108442, 110498, 3837, 100345, 89012, 22382, 9370, 113042, 107485, 110661, 105205, 59258, 64559, 99689, 198, 19793, 26355, 48443, 7841, 28311, 832, 271, 9217, 28311, 1183, 13911, 61537, 1280, 7767, 534, 497, 330, 19523, 8658, 497, 330, 3707, 4145, 497, 330, 7020, 13948, 497, 330, 43, 1121, 497, 330, 33, 360, 65, 497, 330, 40, 5448, 17272, 497, 330, 38, 10303, 46461, 497, 330, 8813, 263, 7075, 497, 330, 3882, 84, 4589, 1168, 29024, 25912, 7841, 510, 22377, 271, 9217, 510, 1183, 26567, 497, 330, 37, 21026, 497, 330, 29600, 355, 73422, 3001, 497, 330, 6033, 84688, 8162, 497, 330, 6464, 12888, 9082, 8162, 497, 330, 38, 6053, 8162, 497, 330, 76745, 7754, 8162, 497, 330, 39, 2534, 5082, 13090, 8162, 497, 330, 34, 54689, 82, 26119, 8162, 497, 330, 71352, 20621, 8162, 25912, 14880, 110439, 101893, 69824, 68805, 66017, 3837, 99322, 108710, 66017, 92894, 43815, 198, 118271, 3837, 44063, 20002, 31196, 9370, 113042, 59975, 110788, 26288, 61443, 53153, 99250, 104585, 17714, 3837, 100148, 66017, 101893, 43815, 198, 2208, 586, 99520, 831, 586, 9370, 59258, 64559, 99689, 198, 26567, 99520, 22377, 9370, 59258, 64559, 99689, 198, 37, 21026, 99520, 35598, 9370, 59258, 64559, 99689, 151645, 198, 151644, 872, 198, 2869, 80148, 30462, 14137, 271, 14880, 43959, 99654, 68805, 9370, 66017, 28311, 90919, 69824, 15946, 103991, 102268, 17714, 59258, 64559, 99689, 198, 1183, 59859, 59859, 59859, 59859, 2198, 7026, 113080, 198, 1183, 26567, 497, 330, 37, 21026, 497, 330, 29600, 355, 73422, 3001, 497, 330, 6033, 84688, 8162, 497, 330, 6464, 12888, 9082, 8162, 497, 330, 38, 6053, 8162, 497, 330, 76745, 7754, 8162, 497, 330, 39, 2534, 5082, 13090, 8162, 497, 330, 34, 54689, 82, 26119, 8162, 497, 330, 71352, 20621, 8162, 7026, 60533, 3837, 100001, 59258, 64559, 99689, 108743, 104105, 198, 60533, 3837, 44063, 86119, 9370, 59975, 110788, 71817, 26288, 61443, 53153, 17714, 104585, 17714, 59258, 64559, 99689, 198, 60533, 3837, 100671, 31526, 99487, 69824, 104180, 3837, 
92894, 43815, 100148, 31526, 151645, 198, 151644, 77091, 198], lora_request: None, prompt_adapter_request: None.
INFO: 61.171.72.231:17908 - "POST /v1/chat/completions HTTP/1.1" 200 OK
INFO 10-16 11:05:10 logger.py:36] Received request chat-8b4bcd8b55c348ceac242190083e0c6c: prompt: '<|im_start|>system\n根据用户输入的商品关键字,生成一个包含JSON对象列表的字符串。每个对象的键是一个行业类别,值是一个五个商品类别的列表。确保有三个不同的行业类别,每个类别包含五个商品类别。每个行业关键字对应的产品不能出现重复,也不能和用户的输入重复,如果出现重复,可以再生成一个产品关键字替代一下。\n请注意key字段对应的值必须是中文,value字段对应的值必须是英文。\n请不要返回代码块\n请严格按照下面的格式输出,不要输出其余内容,谢谢你\n以下是JSON格式示例:\n[\n {\n "key": "手机行业",\n "value": ["Computer Hardware", "IT Equipment", "Electronic Components", "Software Solutions", "Networking Devices"]\n },\n {\n "key": "消费电子行业",\n "value": ["Computer Hardware", "IT Equipment", "Electronic Components", "Software Solutions", "Networking Devices"]\n },\n {\n "key": "科技设备行业",\n "value": ["Computer Hardware", "IT Equipment", "Electronic Components", "Software Solutions", "Networking Devices"]\n }\n]<|im_end|>\n<|im_start|>user\n Fractal Gaming AB<|im_end|>\n<|im_start|>assistant\n', params: SamplingParams(n=1, best_of=1, presence_penalty=0.0, frequency_penalty=0.0, repetition_penalty=1.0, temperature=0.7, top_p=1.0, top_k=-1, min_p=0.0, seed=None, use_beam_search=False, length_penalty=1.0, early_stopping=False, stop=[], stop_token_ids=[], include_stop_str_in_output=False, ignore_eos=False, max_tokens=7947, min_tokens=0, logprobs=None, prompt_logprobs=None, skip_special_tokens=True, spaces_between_special_tokens=True, truncate_prompt_tokens=None), prompt_token_ids: [151644, 8948, 198, 100345, 20002, 31196, 109348, 118294, 3837, 43959, 46944, 102298, 5370, 64429, 44177, 9370, 66558, 1773, 103991, 64429, 9370, 60949, 101909, 99717, 107975, 3837, 25511, 101909, 105220, 45943, 21515, 102657, 44177, 1773, 103944, 18830, 101124, 101970, 99717, 107975, 3837, 103991, 107975, 102298, 105220, 45943, 107975, 1773, 103991, 99717, 118294, 103124, 104310, 53153, 100347, 105444, 3837, 105827, 33108, 107494, 31196, 105444, 3837, 62244, 100347, 105444, 3837, 73670, 87256, 43959, 46944, 82700, 118294, 105598, 100158, 8997, 118271, 792, 44931, 110019, 25511, 100645, 20412, 104811, 3837, 957, 44931, 110019, 25511, 100645, 20412, 105205, 8997, 14880, 100148, 31526, 46100, 99922, 198, 14880, 110439, 100431, 9370, 68805, 66017, 3837, 100148, 66017, 106177, 43815, 3837, 116642, 198, 114566, 5370, 68805, 19793, 26355, 28311, 9640, 220, 341, 262, 330, 792, 788, 330, 58405, 99717, 756, 262, 330, 957, 788, 4383, 37332, 36765, 497, 330, 952, 20236, 497, 330, 89643, 34085, 497, 330, 19250, 22676, 497, 330, 78007, 40377, 7026, 220, 1153, 220, 341, 262, 330, 792, 788, 330, 100030, 100382, 99717, 756, 262, 330, 957, 788, 4383, 37332, 36765, 497, 330, 952, 20236, 497, 330, 89643, 34085, 497, 330, 19250, 22676, 497, 330, 78007, 40377, 7026, 220, 1153, 220, 341, 262, 330, 792, 788, 330, 99602, 101044, 99717, 756, 262, 330, 957, 788, 4383, 37332, 36765, 497, 330, 952, 20236, 497, 330, 89643, 34085, 497, 330, 19250, 22676, 497, 330, 78007, 40377, 7026, 220, 456, 60, 151645, 198, 151644, 872, 198, 2869, 80148, 30462, 14137, 151645, 198, 151644, 77091, 198], lora_request: None, prompt_adapter_request: None.
INFO: 61.171.72.231:28573 - "POST /v1/chat/completions HTTP/1.1" 200 OK
INFO 10-16 11:05:11 logger.py:36] Received request chat-122f7dc0a4a04df6bbdb7f08a718ffbd: prompt: '<|im_start|>system\n<|im_end|>\n<|im_start|>user\n\n## Role: 专业AI翻译助手\n- description: 你是一个专业的AI翻译助手,精通中文。\n\n## Skills\n1. 精通多种语言的翻译技巧\n2. 准确理解和传达原文含义\n3. 严格遵循输出格式要求\n4. 专业术语和品牌名称的处理能力\n\n## Rules\n1. 仅输出原文和翻译,不添加任何解释或评论。\n2. 严格按照指定的JSON格式提供翻译结果。\n3. 保持专业术语和品牌名称不变,除非有官方的中文译名。\n4. 确保翻译准确传达原文含义,同时保持中文的自然流畅。\n5. 如遇无法翻译的内容,保持原文不变,不做额外说明。\n6. 禁止在输出中包含任何额外的解释、注释或元数据。\n\n## Workflow\n1. 接收翻译任务:目标语言为中文,待翻译文本为"CNNS SAPROPEL ORGANIC FERTILIZER, IN POWDER AND GRANULAR FORM. RATIO ORGANIC MATTER 60%; TOTAL PROTEIN 2.5; HUMIC ACID 1.8 C N 12; PHH20 5;\xa0HUMIDITY 30%.\xa0PACKAGING 500 KG BAG (GOODS WITHOUT LABELS) 100% NEW"。\n2. 分析文本,识别专业术语和品牌名称。\n3. 进行翻译,遵循翻译规则。\n4. 按照指定的JSON格式输出结果。\n\n## OutputFormat\n{\n "翻译后": ""\n}\n\n## Init\n你的任务是将给定的文本"CNNS SAPROPEL ORGANIC FERTILIZER, IN POWDER AND GRANULAR FORM. RATIO ORGANIC MATTER 60%; TOTAL PROTEIN 2.5; HUMIC ACID 1.8 C N 12; PHH20 5;\xa0HUMIDITY 30%.\xa0PACKAGING 500 KG BAG (GOODS WITHOUT LABELS) 100% NEW"准确翻译成中文。请直接开始翻译工作,只返回要求的JSON格式结果。<|im_end|>\n<|im_start|>assistant\n', params: SamplingParams(n=1, best_of=1, presence_penalty=0.0, frequency_penalty=0.0, repetition_penalty=1.0, temperature=0.5, top_p=1.0, top_k=-1, min_p=0.0, seed=None, use_beam_search=False, length_penalty=1.0, early_stopping=False, stop=[], stop_token_ids=[], include_stop_str_in_output=False, ignore_eos=False, max_tokens=7706, min_tokens=0, logprobs=None, prompt_logprobs=None, skip_special_tokens=True, spaces_between_special_tokens=True, truncate_prompt_tokens=None), prompt_token_ids: [151644, 8948, 198, 151645, 198, 151644, 872, 271, 565, 15404, 25, 220, 99878, 15469, 105395, 110498, 198, 12, 4008, 25, 220, 56568, 101909, 104715, 15469, 105395, 110498, 3837, 114806, 104811, 3407, 565, 30240, 198, 16, 13, 10236, 110, 122, 31935, 101312, 102064, 9370, 105395, 102118, 198, 17, 13, 65727, 228, 33956, 115167, 107707, 103283, 109091, 198, 18, 13, 220, 100470, 106466, 66017, 68805, 101882, 198, 19, 13, 220, 99878, 116925, 33108, 100135, 29991, 9370, 54542, 99788, 271, 565, 22847, 198, 16, 13, 220, 99373, 66017, 103283, 33108, 105395, 3837, 16530, 42855, 99885, 104136, 57191, 85641, 8997, 17, 13, 220, 110439, 105146, 9370, 5370, 68805, 99553, 105395, 59151, 8997, 18, 13, 220, 100662, 99878, 116925, 33108, 100135, 29991, 105928, 3837, 106781, 18830, 100777, 9370, 104811, 102610, 13072, 8997, 19, 13, 10236, 94, 106, 32463, 105395, 102188, 107707, 103283, 109091, 3837, 91572, 100662, 104811, 9370, 99795, 110205, 8997, 20, 13, 69372, 99688, 101068, 105395, 104597, 3837, 100662, 103283, 105928, 3837, 109513, 108593, 66394, 8997, 21, 13, 10236, 99, 223, 81433, 18493, 66017, 15946, 102298, 99885, 108593, 9370, 104136, 5373, 25074, 68862, 57191, 23305, 20074, 3407, 565, 60173, 198, 16, 13, 46602, 98, 50009, 105395, 88802, 5122, 100160, 102064, 17714, 104811, 3837, 74193, 105395, 108704, 17714, 1, 28668, 2448, 36221, 1285, 1740, 43, 2726, 58487, 1317, 434, 3399, 1715, 78141, 11, 1964, 69793, 11391, 3567, 14773, 1093, 7081, 27824, 13, 431, 54838, 2726, 58487, 1317, 24795, 4198, 220, 21, 15, 16175, 46962, 5308, 2446, 687, 220, 17, 13, 20, 26, 472, 2794, 1317, 10584, 915, 220, 16, 13, 23, 356, 451, 220, 16, 17, 26, 14659, 39, 17, 15, 220, 20, 26, 4102, 39, 2794, 915, 3333, 220, 18, 15, 14360, 4102, 17279, 79606, 220, 20, 15, 15, 70087, 425, 1890, 320, 69733, 50, 6007, 56874, 50, 8, 220, 16, 15, 15, 4, 16165, 1, 8997, 17, 13, 58657, 97771, 108704, 3837, 102450, 99878, 116925, 33108, 100135, 29991, 8997, 
18, 13, 32181, 249, 22243, 105395, 3837, 106466, 105395, 104190, 8997, 19, 13, 6567, 234, 231, 99331, 105146, 9370, 5370, 68805, 66017, 59151, 3407, 565, 9258, 4061, 198, 515, 220, 330, 105395, 33447, 788, 8389, 630, 565, 15690, 198, 103929, 88802, 20412, 44063, 89012, 22382, 9370, 108704, 1, 28668, 2448, 36221, 1285, 1740, 43, 2726, 58487, 1317, 434, 3399, 1715, 78141, 11, 1964, 69793, 11391, 3567, 14773, 1093, 7081, 27824, 13, 431, 54838, 2726, 58487, 1317, 24795, 4198, 220, 21, 15, 16175, 46962, 5308, 2446, 687, 220, 17, 13, 20, 26, 472, 2794, 1317, 10584, 915, 220, 16, 13, 23, 356, 451, 220, 16, 17, 26, 14659, 39, 17, 15, 220, 20, 26, 4102, 39, 2794, 915, 3333, 220, 18, 15, 14360, 4102, 17279, 79606, 220, 20, 15, 15, 70087, 425, 1890, 320, 69733, 50, 6007, 56874, 50, 8, 220, 16, 15, 15, 4, 16165, 1, 102188, 105395, 12857, 104811, 1773, 14880, 101041, 55286, 105395, 99257, 3837, 91680, 31526, 101882, 9370, 5370, 68805, 59151, 1773, 151645, 198, 151644, 77091, 198], lora_request: None, prompt_adapter_request: None.
INFO 10-16 11:05:11 logger.py:36] Received request chat-a0f6ea73195141e2a4e052f8fcad3161: prompt: '<|im_start|>system\n<|im_end|>\n<|im_start|>user\n\n## Role: 专业AI翻译助手\n- description: 你是一个专业的AI翻译助手,精通中文。\n\n## Skills\n1. 精通多种语言的翻译技巧\n2. 准确理解和传达原文含义\n3. 严格遵循输出格式要求\n4. 专业术语和品牌名称的处理能力\n\n## Rules\n1. 仅输出原文和翻译,不添加任何解释或评论。\n2. 严格按照指定的JSON格式提供翻译结果。\n3. 保持专业术语和品牌名称不变,除非有官方的中文译名。\n4. 确保翻译准确传达原文含义,同时保持中文的自然流畅。\n5. 如遇无法翻译的内容,保持原文不变,不做额外说明。\n6. 禁止在输出中包含任何额外的解释、注释或元数据。\n\n## Workflow\n1. 接收翻译任务:目标语言为中文,待翻译文本为"FROZEN CORN ON COB (1X12PKTS., 12.7KG EA"。\n2. 分析文本,识别专业术语和品牌名称。\n3. 进行翻译,遵循翻译规则。\n4. 按照指定的JSON格式输出结果。\n\n## OutputFormat\n{\n "翻译后": ""\n}\n\n## Init\n你的任务是将给定的文本"FROZEN CORN ON COB (1X12PKTS., 12.7KG EA"准确翻译成中文。请直接开始翻译工作,只返回要求的JSON格式结果。<|im_end|>\n<|im_start|>assistant\n', params: SamplingParams(n=1, best_of=1, presence_penalty=0.0, frequency_penalty=0.0, repetition_penalty=1.0, temperature=0.5, top_p=1.0, top_k=-1, min_p=0.0, seed=None, use_beam_search=False, length_penalty=1.0, early_stopping=False, stop=[], stop_token_ids=[], include_stop_str_in_output=False, ignore_eos=False, max_tokens=7854, min_tokens=0, logprobs=None, prompt_logprobs=None, skip_special_tokens=True, spaces_between_special_tokens=True, truncate_prompt_tokens=None), prompt_token_ids: [151644, 8948, 198, 151645, 198, 151644, 872, 271, 565, 15404, 25, 220, 99878, 15469, 105395, 110498, 198, 12, 4008, 25, 220, 56568, 101909, 104715, 15469, 105395, 110498, 3837, 114806, 104811, 3407, 565, 30240, 198, 16, 13, 10236, 110, 122, 31935, 101312, 102064, 9370, 105395, 102118, 198, 17, 13, 65727, 228, 33956, 115167, 107707, 103283, 109091, 198, 18, 13, 220, 100470, 106466, 66017, 68805, 101882, 198, 19, 13, 220, 99878, 116925, 33108, 100135, 29991, 9370, 54542, 99788, 271, 565, 22847, 198, 16, 13, 220, 99373, 66017, 103283, 33108, 105395, 3837, 16530, 42855, 99885, 104136, 57191, 85641, 8997, 17, 13, 220, 110439, 105146, 9370, 5370, 68805, 99553, 105395, 59151, 8997, 18, 13, 220, 100662, 99878, 116925, 33108, 100135, 29991, 105928, 3837, 106781, 18830, 100777, 9370, 104811, 102610, 13072, 8997, 19, 13, 10236, 94, 106, 32463, 105395, 102188, 107707, 103283, 109091, 3837, 91572, 100662, 104811, 9370, 99795, 110205, 8997, 20, 13, 69372, 99688, 101068, 105395, 104597, 3837, 100662, 103283, 105928, 3837, 109513, 108593, 66394, 8997, 21, 13, 10236, 99, 223, 81433, 18493, 66017, 15946, 102298, 99885, 108593, 9370, 104136, 5373, 25074, 68862, 57191, 23305, 20074, 3407, 565, 60173, 198, 16, 13, 46602, 98, 50009, 105395, 88802, 5122, 100160, 102064, 17714, 104811, 3837, 74193, 105395, 108704, 17714, 86555, 1285, 57, 953, 26465, 45, 6197, 7284, 33, 320, 16, 55, 16, 17, 22242, 9951, 2572, 220, 16, 17, 13, 22, 42916, 38362, 1, 8997, 17, 13, 58657, 97771, 108704, 3837, 102450, 99878, 116925, 33108, 100135, 29991, 8997, 18, 13, 32181, 249, 22243, 105395, 3837, 106466, 105395, 104190, 8997, 19, 13, 6567, 234, 231, 99331, 105146, 9370, 5370, 68805, 66017, 59151, 3407, 565, 9258, 4061, 198, 515, 220, 330, 105395, 33447, 788, 8389, 630, 565, 15690, 198, 103929, 88802, 20412, 44063, 89012, 22382, 9370, 108704, 86555, 1285, 57, 953, 26465, 45, 6197, 7284, 33, 320, 16, 55, 16, 17, 22242, 9951, 2572, 220, 16, 17, 13, 22, 42916, 38362, 1, 102188, 105395, 12857, 104811, 1773, 14880, 101041, 55286, 105395, 99257, 3837, 91680, 31526, 101882, 9370, 5370, 68805, 59151, 1773, 151645, 198, 151644, 77091, 198], lora_request: None, prompt_adapter_request: None.
INFO 10-16 11:05:14 logger.py:36] Received request chat-ce259a293f4947c690b5d0fa7a3746e8: prompt: '<|im_start|>system\n<|im_end|>\n<|im_start|>user\n\n## Role: 专业AI翻译助手\n- description: 你是一个专业的AI翻译助手,精通中文。\n\n## Skills\n1. 精通多种语言的翻译技巧\n2. 准确理解和传达原文含义\n3. 严格遵循输出格式要求\n4. 专业术语和品牌名称的处理能力\n\n## Rules\n1. 仅输出原文和翻译,不添加任何解释或评论。\n2. 严格按照指定的JSON格式提供翻译结果。\n3. 保持专业术语和品牌名称不变,除非有官方的中文译名。\n4. 确保翻译准确传达原文含义,同时保持中文的自然流畅。\n5. 如遇无法翻译的内容,保持原文不变,不做额外说明。\n6. 禁止在输出中包含任何额外的解释、注释或元数据。\n\n## Workflow\n1. 接收翻译任务:目标语言为中文,待翻译文本为"STAINLESS STEEL SCREEN 60 MESH *1.6(FOR INDONESIA)\n"。\n2. 分析文本,识别专业术语和品牌名称。\n3. 进行翻译,遵循翻译规则。\n4. 按照指定的JSON格式输出结果。\n\n## OutputFormat\n{\n "翻译后": ""\n}\n\n## Init\n你的任务是将给定的文本"STAINLESS STEEL SCREEN 60 MESH *1.6(FOR INDONESIA)\n"准确翻译成中文。请直接开始翻译工作,只返回要求的JSON格式结果。<|im_end|>\n<|im_start|>assistant\n', params: SamplingParams(n=1, best_of=1, presence_penalty=0.0, frequency_penalty=0.0, repetition_penalty=1.0, temperature=0.5, top_p=1.0, top_k=-1, min_p=0.0, seed=None, use_beam_search=False, length_penalty=1.0, early_stopping=False, stop=[], stop_token_ids=[], include_stop_str_in_output=False, ignore_eos=False, max_tokens=7858, min_tokens=0, logprobs=None, prompt_logprobs=None, skip_special_tokens=True, spaces_between_special_tokens=True, truncate_prompt_tokens=None), prompt_token_ids: [151644, 8948, 198, 151645, 198, 151644, 872, 271, 565, 15404, 25, 220, 99878, 15469, 105395, 110498, 198, 12, 4008, 25, 220, 56568, 101909, 104715, 15469, 105395, 110498, 3837, 114806, 104811, 3407, 565, 30240, 198, 16, 13, 10236, 110, 122, 31935, 101312, 102064, 9370, 105395, 102118, 198, 17, 13, 65727, 228, 33956, 115167, 107707, 103283, 109091, 198, 18, 13, 220, 100470, 106466, 66017, 68805, 101882, 198, 19, 13, 220, 99878, 116925, 33108, 100135, 29991, 9370, 54542, 99788, 271, 565, 22847, 198, 16, 13, 220, 99373, 66017, 103283, 33108, 105395, 3837, 16530, 42855, 99885, 104136, 57191, 85641, 8997, 17, 13, 220, 110439, 105146, 9370, 5370, 68805, 99553, 105395, 59151, 8997, 18, 13, 220, 100662, 99878, 116925, 33108, 100135, 29991, 105928, 3837, 106781, 18830, 100777, 9370, 104811, 102610, 13072, 8997, 19, 13, 10236, 94, 106, 32463, 105395, 102188, 107707, 103283, 109091, 3837, 91572, 100662, 104811, 9370, 99795, 110205, 8997, 20, 13, 69372, 99688, 101068, 105395, 104597, 3837, 100662, 103283, 105928, 3837, 109513, 108593, 66394, 8997, 21, 13, 10236, 99, 223, 81433, 18493, 66017, 15946, 102298, 99885, 108593, 9370, 104136, 5373, 25074, 68862, 57191, 23305, 20074, 3407, 565, 60173, 198, 16, 13, 46602, 98, 50009, 105395, 88802, 5122, 100160, 102064, 17714, 104811, 3837, 74193, 105395, 108704, 17714, 1, 784, 6836, 37773, 26499, 2749, 35073, 220, 21, 15, 386, 40004, 353, 16, 13, 21, 7832, 868, 19317, 60289, 5863, 340, 1, 8997, 17, 13, 58657, 97771, 108704, 3837, 102450, 99878, 116925, 33108, 100135, 29991, 8997, 18, 13, 32181, 249, 22243, 105395, 3837, 106466, 105395, 104190, 8997, 19, 13, 6567, 234, 231, 99331, 105146, 9370, 5370, 68805, 66017, 59151, 3407, 565, 9258, 4061, 198, 515, 220, 330, 105395, 33447, 788, 8389, 630, 565, 15690, 198, 103929, 88802, 20412, 44063, 89012, 22382, 9370, 108704, 1, 784, 6836, 37773, 26499, 2749, 35073, 220, 21, 15, 386, 40004, 353, 16, 13, 21, 7832, 868, 19317, 60289, 5863, 340, 1, 102188, 105395, 12857, 104811, 1773, 14880, 101041, 55286, 105395, 99257, 3837, 91680, 31526, 101882, 9370, 5370, 68805, 59151, 1773, 151645, 198, 151644, 77091, 198], lora_request: None, prompt_adapter_request: None.
INFO 10-16 11:05:16 logger.py:36] Received request chat-3a3770ba57c2481886726891062170d4: prompt: '<|im_start|>system\n<|im_end|>\n<|im_start|>user\n\n## Role: 专业AI翻译助手\n- description: 你是一个专业的AI翻译助手,精通中文。\n\n## Skills\n1. 精通多种语言的翻译技巧\n2. 准确理解和传达原文含义\n3. 严格遵循输出格式要求\n4. 专业术语和品牌名称的处理能力\n\n## Rules\n1. 仅输出原文和翻译,不添加任何解释或评论。\n2. 严格按照指定的JSON格式提供翻译结果。\n3. 保持专业术语和品牌名称不变,除非有官方的中文译名。\n4. 确保翻译准确传达原文含义,同时保持中文的自然流畅。\n5. 如遇无法翻译的内容,保持原文不变,不做额外说明。\n6. 禁止在输出中包含任何额外的解释、注释或元数据。\n\n## Workflow\n1. 接收翻译任务:目标语言为中文,待翻译文本为"1 X 40 CONTAINERS CONTAINING 163 CARTONS OF VCAPS PLUS SIZE 0 NONPRINTED HPMC "。\n2. 分析文本,识别专业术语和品牌名称。\n3. 进行翻译,遵循翻译规则。\n4. 按照指定的JSON格式输出结果。\n\n## OutputFormat\n{\n "翻译后": ""\n}\n\n## Init\n你的任务是将给定的文本"1 X 40 CONTAINERS CONTAINING 163 CARTONS OF VCAPS PLUS SIZE 0 NONPRINTED HPMC "准确翻译成中文。请直接开始翻译工作,只返回要求的JSON格式结果。<|im_end|>\n<|im_start|>assistant\n', params: SamplingParams(n=1, best_of=1, presence_penalty=0.0, frequency_penalty=0.0, repetition_penalty=1.0, temperature=0.5, top_p=1.0, top_k=-1, min_p=0.0, seed=None, use_beam_search=False, length_penalty=1.0, early_stopping=False, stop=[], stop_token_ids=[], include_stop_str_in_output=False, ignore_eos=False, max_tokens=7840, min_tokens=0, logprobs=None, prompt_logprobs=None, skip_special_tokens=True, spaces_between_special_tokens=True, truncate_prompt_tokens=None), prompt_token_ids: [151644, 8948, 198, 151645, 198, 151644, 872, 271, 565, 15404, 25, 220, 99878, 15469, 105395, 110498, 198, 12, 4008, 25, 220, 56568, 101909, 104715, 15469, 105395, 110498, 3837, 114806, 104811, 3407, 565, 30240, 198, 16, 13, 10236, 110, 122, 31935, 101312, 102064, 9370, 105395, 102118, 198, 17, 13, 65727, 228, 33956, 115167, 107707, 103283, 109091, 198, 18, 13, 220, 100470, 106466, 66017, 68805, 101882, 198, 19, 13, 220, 99878, 116925, 33108, 100135, 29991, 9370, 54542, 99788, 271, 565, 22847, 198, 16, 13, 220, 99373, 66017, 103283, 33108, 105395, 3837, 16530, 42855, 99885, 104136, 57191, 85641, 8997, 17, 13, 220, 110439, 105146, 9370, 5370, 68805, 99553, 105395, 59151, 8997, 18, 13, 220, 100662, 99878, 116925, 33108, 100135, 29991, 105928, 3837, 106781, 18830, 100777, 9370, 104811, 102610, 13072, 8997, 19, 13, 10236, 94, 106, 32463, 105395, 102188, 107707, 103283, 109091, 3837, 91572, 100662, 104811, 9370, 99795, 110205, 8997, 20, 13, 69372, 99688, 101068, 105395, 104597, 3837, 100662, 103283, 105928, 3837, 109513, 108593, 66394, 8997, 21, 13, 10236, 99, 223, 81433, 18493, 66017, 15946, 102298, 99885, 108593, 9370, 104136, 5373, 25074, 68862, 57191, 23305, 20074, 3407, 565, 60173, 198, 16, 13, 46602, 98, 50009, 105395, 88802, 5122, 100160, 102064, 17714, 104811, 3837, 74193, 105395, 108704, 17714, 1, 16, 1599, 220, 19, 15, 16120, 6836, 4321, 16120, 6836, 1718, 220, 16, 21, 18, 78032, 29526, 3008, 647, 31400, 50, 58453, 25341, 220, 15, 20575, 24372, 1479, 472, 92523, 330, 8997, 17, 13, 58657, 97771, 108704, 3837, 102450, 99878, 116925, 33108, 100135, 29991, 8997, 18, 13, 32181, 249, 22243, 105395, 3837, 106466, 105395, 104190, 8997, 19, 13, 6567, 234, 231, 99331, 105146, 9370, 5370, 68805, 66017, 59151, 3407, 565, 9258, 4061, 198, 515, 220, 330, 105395, 33447, 788, 8389, 630, 565, 15690, 198, 103929, 88802, 20412, 44063, 89012, 22382, 9370, 108704, 1, 16, 1599, 220, 19, 15, 16120, 6836, 4321, 16120, 6836, 1718, 220, 16, 21, 18, 78032, 29526, 3008, 647, 31400, 50, 58453, 25341, 220, 15, 20575, 24372, 1479, 472, 92523, 330, 102188, 105395, 12857, 104811, 1773, 14880, 101041, 55286, 105395, 99257, 3837, 91680, 31526, 101882, 9370, 5370, 68805, 59151, 1773, 151645, 198, 
151644, 77091, 198], lora_request: None, prompt_adapter_request: None.
ERROR 10-16 11:05:16 client.py:244] TimeoutError('No heartbeat received from MQLLMEngine')
ERROR 10-16 11:05:16 client.py:244] NoneType: None
INFO: 61.171.72.231:58483 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error
INFO: 61.171.72.231:32247 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error
INFO: 61.171.72.231:25138 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error
INFO: 61.171.72.231:10465 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error
INFO: 61.171.72.231:10466 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error
INFO: 61.171.72.231:32248 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error
INFO: 61.171.72.231:28574 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error
INFO: 61.171.72.231:21232 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error
INFO: 61.171.72.231:17909 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error
INFO: 61.171.72.231:40058 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error
ERROR: Exception in ASGI application
Traceback (most recent call last):
File "/raid/demo/anaconda3/envs/vllm_latest/lib/python3.12/site-packages/starlette/responses.py", line 257, in __call__
await wrap(partial(self.listen_for_disconnect, receive))
File "/raid/demo/anaconda3/envs/vllm_latest/lib/python3.12/site-packages/starlette/responses.py", line 253, in wrap
await func()
File "/raid/demo/anaconda3/envs/vllm_latest/lib/python3.12/site-packages/starlette/responses.py", line 230, in listen_for_disconnect
message = await receive()
^^^^^^^^^^^^^^^
File "/raid/demo/anaconda3/envs/vllm_latest/lib/python3.12/site-packages/uvicorn/protocols/http/httptools_impl.py", line 555, in receive
await self.message_event.wait()
File "/raid/demo/anaconda3/envs/vllm_latest/lib/python3.12/asyncio/locks.py", line 212, in wait
await fut
asyncio.exceptions.CancelledError: Cancelled by cancel scope 7f288c184c80

During handling of the above exception, another exception occurred:

  + Exception Group Traceback (most recent call last):
    | File "/raid/demo/anaconda3/envs/vllm_latest/lib/python3.12/site-packages/uvicorn/protocols/http/httptools_impl.py", line 401, in run_asgi
    | result = await app( # type: ignore[func-returns-value]
    | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    | File "/raid/demo/anaconda3/envs/vllm_latest/lib/python3.12/site-packages/uvicorn/middleware/proxy_headers.py", line 60, in __call__
    | return await self.app(scope, receive, send)
    | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    | File "/raid/demo/anaconda3/envs/vllm_latest/lib/python3.12/site-packages/fastapi/applications.py", line 1054, in __call__
    | await super().__call__(scope, receive, send)
    | File "/raid/demo/anaconda3/envs/vllm_latest/lib/python3.12/site-packages/starlette/applications.py", line 113, in __call__
    | await self.middleware_stack(scope, receive, send)
    | File "/raid/demo/anaconda3/envs/vllm_latest/lib/python3.12/site-packages/starlette/middleware/errors.py", line 187, in __call__
    | raise exc
    | File "/raid/demo/anaconda3/envs/vllm_latest/lib/python3.12/site-packages/starlette/middleware/errors.py", line 165, in __call__
    | await self.app(scope, receive, _send)
    | File "/raid/demo/anaconda3/envs/vllm_latest/lib/python3.12/site-packages/starlette/middleware/cors.py", line 85, in __call__
    | await self.app(scope, receive, send)
    | File "/raid/demo/anaconda3/envs/vllm_latest/lib/python3.12/site-packages/starlette/middleware/exceptions.py", line 62, in __call__
    | await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
    | File "/raid/demo/anaconda3/envs/vllm_latest/lib/python3.12/site-packages/starlette/_exception_handler.py", line 62, in wrapped_app
    | raise exc
    | File "/raid/demo/anaconda3/envs/vllm_latest/lib/python3.12/site-packages/starlette/_exception_handler.py", line 51, in wrapped_app
    | await app(scope, receive, sender)
    | File "/raid/demo/anaconda3/envs/vllm_latest/lib/python3.12/site-packages/starlette/routing.py", line 715, in __call__
    | await self.middleware_stack(scope, receive, send)
    | File "/raid/demo/anaconda3/envs/vllm_latest/lib/python3.12/site-packages/starlette/routing.py", line 735, in app
    | await route.handle(scope, receive, send)
    | File "/raid/demo/anaconda3/envs/vllm_latest/lib/python3.12/site-packages/starlette/routing.py", line 288, in handle
    | await self.app(scope, receive, send)
    | File "/raid/demo/anaconda3/envs/vllm_latest/lib/python3.12/site-packages/starlette/routing.py", line 76, in app
    | await wrap_app_handling_exceptions(app, request)(scope, receive, send)
    | File "/raid/demo/anaconda3/envs/vllm_latest/lib/python3.12/site-packages/starlette/_exception_handler.py", line 62, in wrapped_app
    | raise exc
    | File "/raid/demo/anaconda3/envs/vllm_latest/lib/python3.12/site-packages/starlette/_exception_handler.py", line 51, in wrapped_app
    | await app(scope, receive, sender)
    | File "/raid/demo/anaconda3/envs/vllm_latest/lib/python3.12/site-packages/starlette/routing.py", line 74, in app
    | await response(scope, receive, send)
    | File "/raid/demo/anaconda3/envs/vllm_latest/lib/python3.12/site-packages/starlette/responses.py", line 250, in __call__
    | async with anyio.create_task_group() as task_group:
    | File "/raid/demo/anaconda3/envs/vllm_latest/lib/python3.12/site-packages/anyio/_backends/_asyncio.py", line 736, in __aexit__
    | raise BaseExceptionGroup(
    | ExceptionGroup: unhandled errors in a TaskGroup (1 sub-exception)
    +-+---------------- 1 ----------------
    | Traceback (most recent call last):
    | File "/raid/demo/anaconda3/envs/vllm_latest/lib/python3.12/site-packages/starlette/responses.py", line 253, in wrap
    | await func()
    | File "/raid/demo/anaconda3/envs/vllm_latest/lib/python3.12/site-packages/starlette/responses.py", line 242, in stream_response
    | async for chunk in self.body_iterator:
    | File "/raid/demo/anaconda3/envs/vllm_latest/lib/python3.12/site-packages/vllm/entrypoints/openai/serving_chat.py", line 309, in chat_completion_stream_generator
    | async for res in result_generator:
    | File "/raid/demo/anaconda3/envs/vllm_latest/lib/python3.12/site-packages/vllm/utils.py", line 452, in iterate_with_cancellation
    | item = await awaits[0]
    | ^^^^^^^^^^^^^^^
    | File "/raid/demo/anaconda3/envs/vllm_latest/lib/python3.12/site-packages/vllm/engine/multiprocessing/client.py", line 486, in _process_request
    | raise request_output
    | vllm.engine.multiprocessing.MQEngineDeadError: Engine loop is not running. Inspect the stacktrace to find the original error: TimeoutError('No heartbeat received from MQLLMEngine').
    +------------------------------------
[The same ASGI exception-group traceback repeats for each of the remaining failed streaming requests, differing only in the cancel scope id (7f2794179eb0, 7f2a48015fd0, 7f265a456750, 7f265a4569f0); each one ends in vllm.engine.multiprocessing.MQEngineDeadError: Engine loop is not running. Inspect the stacktrace to find the original error: TimeoutError('No heartbeat received from MQLLMEngine').]
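
The 500 responses above appear to be the OpenAI-compatible frontend surfacing that same MQEngineDeadError to every in-flight request once the engine loop has died. Until the root cause is fixed, a minimal client-side sketch like the one below (hypothetical post_with_retry helper; the URL and payload are placeholders, not taken from this deployment) can at least keep batch jobs from failing hard while the server is restarted:

import time
import requests

CHAT_URL = "http://localhost:8000/v1/chat/completions"  # placeholder endpoint

def post_with_retry(payload, attempts=5, backoff_s=15):
    """POST a chat completion, retrying on 5xx while the engine is down or restarting."""
    for attempt in range(1, attempts + 1):
        try:
            resp = requests.post(CHAT_URL, json=payload, timeout=600)
            if resp.status_code < 500:
                return resp.json()
            print(f"attempt {attempt}: HTTP {resp.status_code}, retrying in {backoff_s}s")
        except requests.RequestException as exc:
            print(f"attempt {attempt}: request failed ({exc}), retrying in {backoff_s}s")
        time.sleep(backoff_s)
    raise RuntimeError("server still returning errors after all retries")

# usage: pass the usual chat-completions body, e.g.
# post_with_retry({"model": "<served model name>", "messages": [{"role": "user", "content": "ping"}]})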

@hxhcreate

Upgrading to v0.6.3 seems to solve this error.
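
If it helps, a quick way to confirm which version the serving environment actually ends up with after upgrading (assuming the same conda env is used to launch the server):

# e.g. after running `pip install -U vllm==0.6.3` inside that environment
import vllm
print(vllm.__version__)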

@Clint-chan

Still seeing the error after upgrading to v0.6.3.

@gkm0120

gkm0120 commented Nov 19, 2024

Same here with vllm==0.6.0, torch==2.4.0, and Qwen2.5-72B-Instruct.

@hiaoxui

hiaoxui commented Dec 3, 2024

I have the same issue with vLLM 0.6.4.post1. I was running on 4 A100 GPUs, and generation gradually slowed down until it hung completely. I'm not sure whether the hang is related to the memory leak.

@qingzhi-sandm

WARNING 01-12 08:36:16 config.py:1656] Casting torch.bfloat16 to torch.float16.
INFO 01-12 08:36:17 config.py:899] Defaulting to use mp for distributed inference
WARNING 01-12 08:36:17 arg_utils.py:940] The model has a long context length (131072). This may cause OOM errors during the initial memory profiling phase, or result in low performance due to small KV cache space. Consider setting --max-model-len to a smaller value.
INFO 01-12 08:36:17 llm_engine.py:226] Initializing an LLM engine (v0.6.1.dev238+ge2c6e0a82) with config: model='/home/qz/LLama3.1-8B-Instruct', speculative_config=None, tokenizer='/home/qz/LLama3.1-8B-Instruct', skip_tokenizer_init=False, tokenizer_mode=auto, revision=None, override_neuron_config=None, rope_scaling=None, rope_theta=None, tokenizer_revision=None, trust_remote_code=False, dtype=torch.float16, max_seq_len=131072, download_dir=None, load_format=LoadFormat.AUTO, tensor_parallel_size=4, pipeline_parallel_size=1, disable_custom_all_reduce=False, quantization=None, enforce_eager=False, kv_cache_dtype=auto, quantization_param_path=None, device_config=cuda, decoding_config=DecodingConfig(guided_decoding_backend='outlines'), observability_config=ObservabilityConfig(otlp_traces_endpoint=None, collect_model_forward_time=False, collect_model_execute_time=False), seed=0, served_model_name=/home/qz/LLama3.1-8B-Instruct, use_v2_block_manager=False, num_scheduler_steps=1, multi_step_stream_outputs=False, enable_prefix_caching=False, use_async_output_proc=True, use_cached_outputs=False, mm_processor_kwargs=None)
WARNING 01-12 08:36:17 multiproc_gpu_executor.py:53] Reducing Torch parallelism from 36 threads to 1 to avoid unnecessary CPU contention. Set OMP_NUM_THREADS in the external environment to tune this value as needed.
INFO 01-12 08:36:17 custom_cache_manager.py:17] Setting Triton cache manager to: vllm.triton_utils.custom_cache_manager:CustomCacheManager
INFO 01-12 08:36:17 selector.py:217] Cannot use FlashAttention-2 backend for Volta and Turing GPUs.
INFO 01-12 08:36:17 selector.py:116] Using XFormers backend.
(VllmWorkerProcess pid=2144909) INFO 01-12 08:36:17 selector.py:217] Cannot use FlashAttention-2 backend for Volta and Turing GPUs.
(VllmWorkerProcess pid=2144909) INFO 01-12 08:36:17 selector.py:116] Using XFormers backend.
(VllmWorkerProcess pid=2144910) INFO 01-12 08:36:17 selector.py:217] Cannot use FlashAttention-2 backend for Volta and Turing GPUs.
(VllmWorkerProcess pid=2144910) INFO 01-12 08:36:17 selector.py:116] Using XFormers backend.
(VllmWorkerProcess pid=2144911) INFO 01-12 08:36:17 selector.py:217] Cannot use FlashAttention-2 backend for Volta and Turing GPUs.
(VllmWorkerProcess pid=2144911) INFO 01-12 08:36:17 selector.py:116] Using XFormers backend.
/home/qz/.conda/envs/vllmtest/lib/python3.9/site-packages/xformers/ops/fmha/flash.py:211: FutureWarning: torch.library.impl_abstract was renamed to torch.library.register_fake. Please use that instead; we will remove torch.library.impl_abstract in a future version of PyTorch.
@torch.library.impl_abstract("xformers_flash::flash_fwd")
(VllmWorkerProcess pid=2144909) /home/qz/.conda/envs/vllmtest/lib/python3.9/site-packages/xformers/ops/fmha/flash.py:211: FutureWarning: torch.library.impl_abstract was renamed to torch.library.register_fake. Please use that instead; we will remove torch.library.impl_abstract in a future version of PyTorch.
(VllmWorkerProcess pid=2144909) @torch.library.impl_abstract("xformers_flash::flash_fwd")
(VllmWorkerProcess pid=2144910) /home/qz/.conda/envs/vllmtest/lib/python3.9/site-packages/xformers/ops/fmha/flash.py:211: FutureWarning: torch.library.impl_abstract was renamed to torch.library.register_fake. Please use that instead; we will remove torch.library.impl_abstract in a future version of PyTorch.
(VllmWorkerProcess pid=2144910) @torch.library.impl_abstract("xformers_flash::flash_fwd")
(VllmWorkerProcess pid=2144911) /home/qz/.conda/envs/vllmtest/lib/python3.9/site-packages/xformers/ops/fmha/flash.py:211: FutureWarning: torch.library.impl_abstract was renamed to torch.library.register_fake. Please use that instead; we will remove torch.library.impl_abstract in a future version of PyTorch.
(VllmWorkerProcess pid=2144911) @torch.library.impl_abstract("xformers_flash::flash_fwd")
/home/qz/.conda/envs/vllmtest/lib/python3.9/site-packages/xformers/ops/fmha/flash.py:344: FutureWarning: torch.library.impl_abstract was renamed to torch.library.register_fake. Please use that instead; we will remove torch.library.impl_abstract in a future version of PyTorch.
@torch.library.impl_abstract("xformers_flash::flash_bwd")
(VllmWorkerProcess pid=2144909) /home/qz/.conda/envs/vllmtest/lib/python3.9/site-packages/xformers/ops/fmha/flash.py:344: FutureWarning: torch.library.impl_abstract was renamed to torch.library.register_fake. Please use that instead; we will remove torch.library.impl_abstract in a future version of PyTorch.
(VllmWorkerProcess pid=2144909) @torch.library.impl_abstract("xformers_flash::flash_bwd")
(VllmWorkerProcess pid=2144910) /home/qz/.conda/envs/vllmtest/lib/python3.9/site-packages/xformers/ops/fmha/flash.py:344: FutureWarning: torch.library.impl_abstract was renamed to torch.library.register_fake. Please use that instead; we will remove torch.library.impl_abstract in a future version of PyTorch.
(VllmWorkerProcess pid=2144910) @torch.library.impl_abstract("xformers_flash::flash_bwd")
(VllmWorkerProcess pid=2144911) /home/qz/.conda/envs/vllmtest/lib/python3.9/site-packages/xformers/ops/fmha/flash.py:344: FutureWarning: torch.library.impl_abstract was renamed to torch.library.register_fake. Please use that instead; we will remove torch.library.impl_abstract in a future version of PyTorch.
(VllmWorkerProcess pid=2144911) @torch.library.impl_abstract("xformers_flash::flash_bwd")
(VllmWorkerProcess pid=2144909) INFO 01-12 08:36:18 multiproc_worker_utils.py:218] Worker ready; awaiting tasks
(VllmWorkerProcess pid=2144910) INFO 01-12 08:36:18 multiproc_worker_utils.py:218] Worker ready; awaiting tasks
(VllmWorkerProcess pid=2144911) INFO 01-12 08:36:18 multiproc_worker_utils.py:218] Worker ready; awaiting tasks
(VllmWorkerProcess pid=2144911) INFO 01-12 08:36:18 utils.py:992] Found nccl from library libnccl.so.2
INFO 01-12 08:36:18 utils.py:992] Found nccl from library libnccl.so.2
(VllmWorkerProcess pid=2144909) INFO 01-12 08:36:18 utils.py:992] Found nccl from library libnccl.so.2
INFO 01-12 08:36:18 pynccl.py:63] vLLM is using nccl==2.20.5
(VllmWorkerProcess pid=2144911) INFO 01-12 08:36:18 pynccl.py:63] vLLM is using nccl==2.20.5
(VllmWorkerProcess pid=2144910) INFO 01-12 08:36:18 utils.py:992] Found nccl from library libnccl.so.2
(VllmWorkerProcess pid=2144909) INFO 01-12 08:36:18 pynccl.py:63] vLLM is using nccl==2.20.5
(VllmWorkerProcess pid=2144910) INFO 01-12 08:36:18 pynccl.py:63] vLLM is using nccl==2.20.5
(VllmWorkerProcess pid=2144911) WARNING 01-12 08:36:19 custom_all_reduce.py:128] Custom allreduce is disabled because it's not supported on more than two PCIe-only GPUs. To silence this warning, specify disable_custom_all_reduce=True explicitly.
(VllmWorkerProcess pid=2144909) WARNING 01-12 08:36:19 custom_all_reduce.py:128] Custom allreduce is disabled because it's not supported on more than two PCIe-only GPUs. To silence this warning, specify disable_custom_all_reduce=True explicitly.
(VllmWorkerProcess pid=2144910) WARNING 01-12 08:36:19 custom_all_reduce.py:128] Custom allreduce is disabled because it's not supported on more than two PCIe-only GPUs. To silence this warning, specify disable_custom_all_reduce=True explicitly.
WARNING 01-12 08:36:19 custom_all_reduce.py:128] Custom allreduce is disabled because it's not supported on more than two PCIe-only GPUs. To silence this warning, specify disable_custom_all_reduce=True explicitly.
INFO 01-12 08:36:19 shm_broadcast.py:241] vLLM message queue communication handle: Handle(connect_ip='127.0.0.1', local_reader_ranks=[1, 2, 3], buffer=<vllm.distributed.device_communicators.shm_broadcast.ShmRingBuffer object at 0x7fe62a7b9e80>, local_subscribe_port=49407, remote_subscribe_port=None)
INFO 01-12 08:36:19 model_runner.py:1014] Starting to load model /home/qz/LLama3.1-8B-Instruct...
(VllmWorkerProcess pid=2144909) INFO 01-12 08:36:19 model_runner.py:1014] Starting to load model /home/qz/LLama3.1-8B-Instruct...
(VllmWorkerProcess pid=2144910) INFO 01-12 08:36:19 model_runner.py:1014] Starting to load model /home/qz/LLama3.1-8B-Instruct...
(VllmWorkerProcess pid=2144911) INFO 01-12 08:36:19 model_runner.py:1014] Starting to load model /home/qz/LLama3.1-8B-Instruct...
(VllmWorkerProcess pid=2144910) INFO 01-12 08:36:19 selector.py:217] Cannot use FlashAttention-2 backend for Volta and Turing GPUs.
(VllmWorkerProcess pid=2144910) INFO 01-12 08:36:19 selector.py:116] Using XFormers backend.
INFO 01-12 08:36:19 selector.py:217] Cannot use FlashAttention-2 backend for Volta and Turing GPUs.
INFO 01-12 08:36:19 selector.py:116] Using XFormers backend.
(VllmWorkerProcess pid=2144909) INFO 01-12 08:36:19 selector.py:217] Cannot use FlashAttention-2 backend for Volta and Turing GPUs.
(VllmWorkerProcess pid=2144909) INFO 01-12 08:36:19 selector.py:116] Using XFormers backend.
(VllmWorkerProcess pid=2144911) INFO 01-12 08:36:19 selector.py:217] Cannot use FlashAttention-2 backend for Volta and Turing GPUs.
(VllmWorkerProcess pid=2144911) INFO 01-12 08:36:19 selector.py:116] Using XFormers backend.
Loading safetensors checkpoint shards: 0% Completed | 0/4 [00:00<?, ?it/s]
Loading safetensors checkpoint shards: 25% Completed | 1/4 [00:01<00:03, 1.19s/it]
Loading safetensors checkpoint shards: 50% Completed | 2/4 [00:02<00:02, 1.15s/it]
Loading safetensors checkpoint shards: 75% Completed | 3/4 [00:03<00:01, 1.12s/it]
Loading safetensors checkpoint shards: 100% Completed | 4/4 [00:03<00:00, 1.15it/s]
Loading safetensors checkpoint shards: 100% Completed | 4/4 [00:03<00:00, 1.03it/s]

INFO 01-12 08:36:24 model_runner.py:1025] Loading model weights took 3.7710 GB
(VllmWorkerProcess pid=2144909) INFO 01-12 08:36:24 model_runner.py:1025] Loading model weights took 3.7710 GB
(VllmWorkerProcess pid=2144911) INFO 01-12 08:36:24 model_runner.py:1025] Loading model weights took 3.7710 GB
(VllmWorkerProcess pid=2144910) INFO 01-12 08:36:24 model_runner.py:1025] Loading model weights took 3.7710 GB
INFO 01-12 08:36:38 distributed_gpu_executor.py:57] # GPU blocks: 5402, # CPU blocks: 8192
(VllmWorkerProcess pid=2144909) ERROR 01-12 08:36:38 multiproc_worker_utils.py:233] Exception in worker VllmWorkerProcess while processing method initialize_cache: The model's max seq len (131072) is larger than the maximum number of tokens that can be stored in KV cache (86432). Try increasing gpu_memory_utilization or decreasing max_model_len when initializing the engine., Traceback (most recent call last):
(VllmWorkerProcess pid=2144909) ERROR 01-12 08:36:38 multiproc_worker_utils.py:233] File "/home/qz/.conda/envs/vllmtest/lib/python3.9/site-packages/vllm/executor/multiproc_worker_utils.py", line 226, in _run_worker_process
(VllmWorkerProcess pid=2144909) ERROR 01-12 08:36:38 multiproc_worker_utils.py:233] output = executor(*args, **kwargs)
(VllmWorkerProcess pid=2144909) ERROR 01-12 08:36:38 multiproc_worker_utils.py:233] File "/home/qz/.conda/envs/vllmtest/lib/python3.9/site-packages/vllm/worker/worker.py", line 258, in initialize_cache
[rank0]: Traceback (most recent call last):
[rank0]: File "/home/qz/LLM_attack/llm_test.py", line 17, in
[rank0]: llm = LLM(model=r"/home/qz/LLama3.1-8B-Instruct",dtype="float16",gpu_memory_utilization=1,tensor_parallel_size=4)
[rank0]: File "/home/qz/.conda/envs/vllmtest/lib/python3.9/site-packages/vllm/entrypoints/llm.py", line 216, in init
[rank0]: self.llm_engine = LLMEngine.from_engine_args(
[rank0]: File "/home/qz/.conda/envs/vllmtest/lib/python3.9/site-packages/vllm/engine/llm_engine.py", line 564, in from_engine_args
[rank0]: engine = cls(
[rank0]: File "/home/qz/.conda/envs/vllmtest/lib/python3.9/site-packages/vllm/engine/llm_engine.py", line 339, in init
[rank0]: self._initialize_kv_caches()
[rank0]: File "/home/qz/.conda/envs/vllmtest/lib/python3.9/site-packages/vllm/engine/llm_engine.py", line 487, in _initialize_kv_caches
[rank0]: self.model_executor.initialize_cache(num_gpu_blocks, num_cpu_blocks)
[rank0]: File "/home/qz/.conda/envs/vllmtest/lib/python3.9/site-packages/vllm/executor/distributed_gpu_executor.py", line 63, in initialize_cache
[rank0]: self._run_workers("initialize_cache",
[rank0]: File "/home/qz/.conda/envs/vllmtest/lib/python3.9/site-packages/vllm/executor/multiproc_gpu_executor.py", line 185, in _run_workers
[rank0]: driver_worker_output = driver_worker_method(*args, **kwargs)
[rank0]: File "/home/qz/.conda/envs/vllmtest/lib/python3.9/site-packages/vllm/worker/worker.py", line 258, in initialize_cache
[rank0]: raise_if_cache_size_invalid(num_gpu_blocks,
[rank0]: File "/home/qz/.conda/envs/vllmtest/lib/python3.9/site-packages/vllm/worker/worker.py", line 483, in raise_if_cache_size_invalid
[rank0]: raise ValueError(
[rank0]: ValueError: The model's max seq len (131072) is larger than the maximum number of tokens that can be stored in KV cache (86432). Try increasing gpu_memory_utilization or decreasing max_model_len when initializing the engine.
(VllmWorkerProcess pid=2144909) ERROR 01-12 08:36:38 multiproc_worker_utils.py:233] raise_if_cache_size_invalid(num_gpu_blocks,
(VllmWorkerProcess pid=2144909) ERROR 01-12 08:36:38 multiproc_worker_utils.py:233] File "/home/qz/.conda/envs/vllmtest/lib/python3.9/site-packages/vllm/worker/worker.py", line 483, in raise_if_cache_size_invalid
(VllmWorkerProcess pid=2144909) ERROR 01-12 08:36:38 multiproc_worker_utils.py:233] raise ValueError(
(VllmWorkerProcess pid=2144909) ERROR 01-12 08:36:38 multiproc_worker_utils.py:233] ValueError: The model's max seq len (131072) is larger than the maximum number of tokens that can be stored in KV cache (86432). Try increasing gpu_memory_utilization or decreasing max_model_len when initializing the engine.
(VllmWorkerProcess pid=2144909) ERROR 01-12 08:36:38 multiproc_worker_utils.py:233]
(VllmWorkerProcess pid=2144910) ERROR 01-12 08:36:38 multiproc_worker_utils.py:233] Exception in worker VllmWorkerProcess while processing method initialize_cache: The model's max seq len (131072) is larger than the maximum number of tokens that can be stored in KV cache (86432). Try increasing gpu_memory_utilization or decreasing max_model_len when initializing the engine., Traceback (most recent call last):
(VllmWorkerProcess pid=2144910) ERROR 01-12 08:36:38 multiproc_worker_utils.py:233] File "/home/qz/.conda/envs/vllmtest/lib/python3.9/site-packages/vllm/executor/multiproc_worker_utils.py", line 226, in _run_worker_process
(VllmWorkerProcess pid=2144910) ERROR 01-12 08:36:38 multiproc_worker_utils.py:233] output = executor(*args, **kwargs)
(VllmWorkerProcess pid=2144910) ERROR 01-12 08:36:38 multiproc_worker_utils.py:233] File "/home/qz/.conda/envs/vllmtest/lib/python3.9/site-packages/vllm/worker/worker.py", line 258, in initialize_cache
(VllmWorkerProcess pid=2144910) ERROR 01-12 08:36:38 multiproc_worker_utils.py:233] raise_if_cache_size_invalid(num_gpu_blocks,
(VllmWorkerProcess pid=2144910) ERROR 01-12 08:36:38 multiproc_worker_utils.py:233] File "/home/qz/.conda/envs/vllmtest/lib/python3.9/site-packages/vllm/worker/worker.py", line 483, in raise_if_cache_size_invalid
(VllmWorkerProcess pid=2144910) ERROR 01-12 08:36:38 multiproc_worker_utils.py:233] raise ValueError(
(VllmWorkerProcess pid=2144910) ERROR 01-12 08:36:38 multiproc_worker_utils.py:233] ValueError: The model's max seq len (131072) is larger than the maximum number of tokens that can be stored in KV cache (86432). Try increasing gpu_memory_utilization or decreasing max_model_len when initializing the engine.
(VllmWorkerProcess pid=2144910) ERROR 01-12 08:36:38 multiproc_worker_utils.py:233]
(VllmWorkerProcess pid=2144911) ERROR 01-12 08:36:38 multiproc_worker_utils.py:233] Exception in worker VllmWorkerProcess while processing method initialize_cache: The model's max seq len (131072) is larger than the maximum number of tokens that can be stored in KV cache (86432). Try increasing gpu_memory_utilization or decreasing max_model_len when initializing the engine., Traceback (most recent call last):
(VllmWorkerProcess pid=2144911) ERROR 01-12 08:36:38 multiproc_worker_utils.py:233] File "/home/qz/.conda/envs/vllmtest/lib/python3.9/site-packages/vllm/executor/multiproc_worker_utils.py", line 226, in _run_worker_process
(VllmWorkerProcess pid=2144911) ERROR 01-12 08:36:38 multiproc_worker_utils.py:233] output = executor(*args, **kwargs)
(VllmWorkerProcess pid=2144911) ERROR 01-12 08:36:38 multiproc_worker_utils.py:233] File "/home/qz/.conda/envs/vllmtest/lib/python3.9/site-packages/vllm/worker/worker.py", line 258, in initialize_cache
(VllmWorkerProcess pid=2144911) ERROR 01-12 08:36:38 multiproc_worker_utils.py:233] raise_if_cache_size_invalid(num_gpu_blocks,
(VllmWorkerProcess pid=2144911) ERROR 01-12 08:36:38 multiproc_worker_utils.py:233] File "/home/qz/.conda/envs/vllmtest/lib/python3.9/site-packages/vllm/worker/worker.py", line 483, in raise_if_cache_size_invalid
(VllmWorkerProcess pid=2144911) ERROR 01-12 08:36:38 multiproc_worker_utils.py:233] raise ValueError(
(VllmWorkerProcess pid=2144911) ERROR 01-12 08:36:38 multiproc_worker_utils.py:233] ValueError: The model's max seq len (131072) is larger than the maximum number of tokens that can be stored in KV cache (86432). Try increasing gpu_memory_utilization or decreasing max_model_len when initializing the engine.
(VllmWorkerProcess pid=2144911) ERROR 01-12 08:36:38 multiproc_worker_utils.py:233]
ERROR 01-12 08:36:38 multiproc_worker_utils.py:120] Worker VllmWorkerProcess pid 2144910 died, exit code: -15
Fatal Python error: _enter_buffered_busy: could not acquire lock for <_io.BufferedWriter name=''> at interpreter shutdown, possibly due to daemon threads
Python runtime state: finalizing (tstate=0x149b570)

Current thread 0x00007fe6efeb8280 (most recent call first):

/home/qz/.conda/envs/vllmtest/lib/python3.9/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 1 leaked shared_memory objects to clean up at shutdown
warnings.warn('resource_tracker: There appear to be %d '
Aborted (core dumped)
Same question here. How can this be solved?
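
For the ValueError at the end of that log, the message itself names the two knobs. Since gpu_memory_utilization is already at 1, the practical option is to pass a max_model_len that fits inside the reported KV cache budget (86432 tokens). A minimal sketch, with an illustrative value rather than a recommendation:

from vllm import LLM

# Cap the context length below the 86432-token KV cache capacity reported above,
# instead of inheriting the model's 131072-token default.
llm = LLM(
    model="/home/qz/LLama3.1-8B-Instruct",
    dtype="float16",
    tensor_parallel_size=4,
    gpu_memory_utilization=1,
    max_model_len=65536,
)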
