[Fix] Avoid pickling entire LLMEngine for Ray workers #3207

njhill · 2024-03-05T17:14:23Z

References to self were inadvertently introduced into the closure used to initialize the Ray workers, which causes the whole LLMEngine object to be pickled along with it.

This PR fixes that by having the closure reference lora_config and kv_cache_dtype directly instead of through self.
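To illustrate the capture problem described above, here is a minimal sketch using only the standard library. The `Engine` class, its `big_state` attribute, and the `make_init_bad` / `make_init_good` helpers are hypothetical stand-ins, not vLLM code. The sketch inspects which variables each closure holds, since whatever a closure holds is what a closure-aware serializer would have to ship to the workers.

```python
class Engine:
    """Hypothetical stand-in for LLMEngine; not the vLLM class."""

    def __init__(self, lora_config, kv_cache_dtype):
        self.lora_config = lora_config
        self.kv_cache_dtype = kv_cache_dtype
        # Placeholder for state that would be expensive to serialize.
        self.big_state = bytearray(1_000_000)

    def make_init_bad(self):
        # Reading self.* inside the lambda makes the closure capture `self`,
        # so the whole engine travels with the function.
        return lambda: (self.lora_config, self.kv_cache_dtype)

    def make_init_good(self):
        # Copy the needed values into locals first; only these are captured.
        lora_config = self.lora_config
        kv_cache_dtype = self.kv_cache_dtype
        return lambda: (lora_config, kv_cache_dtype)


def captured_names(fn):
    """Return the set of variable names a function closes over."""
    return set(fn.__code__.co_freevars)


engine = Engine(lora_config=None, kv_cache_dtype="auto")
bad = engine.make_init_bad()
good = engine.make_init_good()

print(captured_names(bad))   # contains only 'self'
print(captured_names(good))  # contains 'lora_config' and 'kv_cache_dtype'
```

Both closures return the same values when called; the difference is only in what they drag along when serialized.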

Yard1

Looks good, can you add a short description?

simon-mo · 2024-03-05T19:57:45Z

You can implement the __reduce__ method in LLMEngine and throw an error in there so this won't happen again in the future.

njhill · 2024-03-05T21:45:01Z

Good idea @simon-mo, now added.
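A rough sketch of the guard suggested above, assuming a hypothetical `Engine` class in place of LLMEngine (the error message and the `try_pickle` helper are illustrative, not taken from the merged code). Overriding `__reduce__` to raise means any attempt to pickle the object, directly or as part of a container, fails loudly instead of silently serializing it.

```python
import pickle


class Engine:
    """Hypothetical stand-in for LLMEngine; not the vLLM class."""

    def __reduce__(self):
        # pickle consults __reduce__ to learn how to serialize the object;
        # raising here aborts the pickling attempt.
        raise RuntimeError("Engine should not be pickled!")


def try_pickle(obj):
    """Return the error message if pickling fails, else None."""
    try:
        pickle.dumps(obj)
    except RuntimeError as exc:
        return str(exc)
    return None


print(try_pickle(Engine()))              # the guard's error message
print(try_pickle({"engine": Engine()}))  # also fails: the engine is nested inside
print(try_pickle({"dtype": "auto"}))     # None: ordinary data pickles fine
```

The guard turns an accidental capture into an immediate error at the point of serialization, which is easier to notice than a slow or oversized worker initialization.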

vllm/engine/llm_engine.py

[Fix] Avoid pickling entire LLMEngine for Ray workers

96a0707

njhill mentioned this pull request Mar 5, 2024

Async tokenization using thread pool #3206

Closed

Yard1 approved these changes Mar 5, 2024

Add __reduce__ method to LLMEngine per @simon-mo's suggestion

0893465

Yard1 reviewed Mar 5, 2024

vllm/engine/llm_engine.py (outdated, resolved)

Update vllm/engine/llm_engine.py

ee9644f

Yard1 enabled auto-merge (squash) March 5, 2024 22:47

Yard1 reviewed Mar 5, 2024

vllm/engine/llm_engine.py (outdated, resolved)

Update vllm/engine/llm_engine.py

3085f15

Yard1 merged commit 2efce05 into vllm-project:main Mar 6, 2024
22 checks passed

njhill deleted the dont-pickle-engine branch March 6, 2024 00:38

grandiose-pizza mentioned this pull request Mar 6, 2024

Added support for Jais models #3183

Merged

dtransposed pushed a commit to afeldman-nm/vllm that referenced this pull request Mar 26, 2024

[Fix] Avoid pickling entire LLMEngine for Ray workers (vllm-project#3207)

Co-authored-by: Antoni Baum <antoni.baum@protonmail.com>

4c5cc5f
