[ Frontend ] Multiprocessing for OpenAI Server with zeromq #6883

Merged: 84 commits, Aug 3, 2024

Commits on Jul 25, 2024

  1. ⚗️ add backend proto file

    Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
    joerunde committed Jul 25, 2024
    bed649a
  2. ♻️ move proto to grpc/pb

    Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
    joerunde committed Jul 25, 2024
    7de9d49
  3. ✨ add proto compilation

    Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
    joerunde committed Jul 25, 2024
    9394a62
  4. updated

    robertgshaw2-neuralmagic committed Jul 25, 2024
    dd8bf96
  5. 5c7fbff
  6. 🚧 more wip

    Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
    joerunde committed Jul 25, 2024
    952e8ef
  7. fixed

    robertgshaw2-neuralmagic committed Jul 25, 2024
    e8eac95
  8. 🐛 fixup race condition

    Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
    joerunde committed Jul 25, 2024
    938a843
  9. 🐛 remove timeout

    Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
    joerunde committed Jul 25, 2024
    2b8d7cd

Commits on Jul 26, 2024

  1. format

    robertgshaw2-neuralmagic committed Jul 26, 2024
    ea02d39
  2. streaming

    robertgshaw2-neuralmagic committed Jul 26, 2024
    4a2dc46
  3. 30f2bc9
  4. c718b68
  5. ⚗️ try unix sockets

    Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
    joerunde committed Jul 26, 2024
    b3d25c6
  6. ⚡ no background loop

    Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
    joerunde committed Jul 26, 2024
    2765b17
  7. b219778
  8. 932ea23
  9. f029114
  10. 6854758
  11. 🐛 whoops

    Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
    joerunde committed Jul 26, 2024
    3b5ff66
  12. 📝 log stuff

    Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
    joerunde committed Jul 26, 2024
    79247c3
  13. stash

    robertgshaw2-neuralmagic committed Jul 26, 2024
    a39ebc0
  14. pushing up

    robertgshaw2-neuralmagic committed Jul 26, 2024
    ef257f1

Commits on Jul 28, 2024

  1. stash

    robertgshaw2-neuralmagic committed Jul 28, 2024
    a6c9bc5
  2. d7490bc
  3. cleanup

    robertgshaw2-neuralmagic committed Jul 28, 2024
    f68fd60
  4. more cleanup

    robertgshaw2-neuralmagic committed Jul 28, 2024
    38b5b9c
  5. cleanup

    robertgshaw2-neuralmagic committed Jul 28, 2024
    bc54311
  6. stash

    robertgshaw2-neuralmagic committed Jul 28, 2024
    3cccebb
  7. more cleanup

    robertgshaw2-neuralmagic committed Jul 28, 2024
    4b78e29
  8. setup

    robertgshaw2-neuralmagic committed Jul 28, 2024
    345bfdd
  9. cleanup

    robertgshaw2-neuralmagic committed Jul 28, 2024
    cfbb001
  10. format

    robertgshaw2-neuralmagic committed Jul 28, 2024
    d811b42
  11. cleaning up

    robertgshaw2-neuralmagic committed Jul 28, 2024
    852534e
  12. zlib

    robertgshaw2-neuralmagic committed Jul 28, 2024
    e42be96
  13. Revert "zlib"

    This reverts commit e42be96.
    robertgshaw2-neuralmagic committed Jul 28, 2024
    5202a59
  14. 71b1bf9

Commits on Jul 29, 2024

  1. a499079
  2. format

    robertgshaw2-neuralmagic committed Jul 29, 2024
    88a1d08
  3. format

    robertgshaw2-neuralmagic committed Jul 29, 2024
    13ce2f1
  4. bb8ac06
  5. cleaning

    robertgshaw2-neuralmagic committed Jul 29, 2024
    6ebdb3d
  6. cleaning

    robertgshaw2-neuralmagic committed Jul 29, 2024
    24c8100
  7. cleaning

    robertgshaw2-neuralmagic committed Jul 29, 2024
    e707049
  8. add stubs

    robertgshaw2-neuralmagic committed Jul 29, 2024
    baaf6bc
  9. format

    robertgshaw2-neuralmagic committed Jul 29, 2024
    9d19d92
  10. f1be4b8
  11. 8e417ad
  12. 🥅 handle shutdown and request errors

    Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
    joerunde committed Jul 29, 2024
    4c16c5e
  13. 🎨 fmt and clean up shutdown handler

    Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
    joerunde committed Jul 29, 2024
    6ddd4a7
  14. 🐛 fixup type hint for queue

    Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
    joerunde committed Jul 29, 2024
    6d7da74
  15. ✨ update chat endpoint

    Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
    joerunde committed Jul 29, 2024
    97ea04d
  16. 🐛 fixup zmq constant types

    Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
    joerunde committed Jul 29, 2024
    6d753a4
  17. ✨ hook up de/tokenize

    Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
    joerunde committed Jul 29, 2024
    38e308e
  18. ♻️ add VLLMBackend protocol

    Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
    joerunde committed Jul 29, 2024
    ec19a7b

Commits on Jul 30, 2024

  1. Frontend mp flag (#384)

    @robertgshaw2-neuralmagic 
    
    This adds the `--disable-frontend-multiprocessing` flag and should also
    correctly detect embedding models and disable multiprocessing for them.
    (Also includes some unrelated formatting changes.)
    
    The backend is wrapped in a context manager that handles process
    startup, and shutdown at exit, so that we don't have to change much
    in the existing server lifecycle code.
    
    ---------
    
    Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
    joerunde authored Jul 30, 2024
    453939b
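
The context-manager approach described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: only the `--disable-frontend-multiprocessing` flag name comes from the commit message, while `parse_args`, `managed_backend`, and the sleeping child process are hypothetical stand-ins for the real backend startup.

```python
import argparse
import contextlib
import subprocess
import sys
from typing import Iterator, List, Optional


def parse_args(argv: List[str]) -> argparse.Namespace:
    parser = argparse.ArgumentParser()
    # Flag name taken from the commit message; the rest is illustrative.
    parser.add_argument("--disable-frontend-multiprocessing",
                        action="store_true")
    return parser.parse_args(argv)


@contextlib.contextmanager
def managed_backend(enabled: bool) -> Iterator[Optional[subprocess.Popen]]:
    """Start a placeholder backend process and stop it when the block exits."""
    if not enabled:
        # Multiprocessing disabled: no child process is started.
        yield None
        return
    # Hypothetical stand-in for the real backend: a child that just sleeps.
    proc = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(3600)"])
    try:
        yield proc
    finally:
        # Always stop the child, even if the body raised.
        proc.terminate()
        try:
            proc.wait(timeout=5)
        except subprocess.TimeoutExpired:
            proc.kill()
            proc.wait()
```

A caller would wrap the server's lifetime in `with managed_backend(not args.disable_frontend_multiprocessing) as backend:`, so startup and cleanup stay in one place and the surrounding lifecycle code does not need to know whether a child process exists.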

Commits on Jul 31, 2024

  1. Features / Cleanup for MP Frontend (#387)

    SUMMARY:
    * refactor to use single socket
    * cleanup comments / logging
    * add `do_log_stats`
    * add `abort`
    robertgshaw2-neuralmagic authored Jul 31, 2024
    1f33286
  2. Use random port for backend (#390)

    Picks an open port to use and boots both the client and server with it
    
    ---------
    
    Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
    joerunde authored Jul 31, 2024
    5362952
  3. 7214fb8
  4. ✨ health check round 2 (#392)

    With all the extra fun refactors
    
    Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
    joerunde authored Jul 31, 2024
    98a7dab
  5. Add tokenizer (#394)

    SUMMARY:
    * add endpoints to request `ModelConfig`, `SchedulerConfig`,
    `LoRAConfig`, `ParallelConfig`
    * factor out tokenizer group creation function to be a utility function
    * create tokenizer_group on client side
    robertgshaw2-neuralmagic authored Jul 31, 2024
    f5f0b45
  6. Socket context (#393)

    Ensures no sockets are leaked on the client-side
    
    Also postpones the server shutdown await so that the backend can
    shut down concurrently, and all connections can be cleaned up at the same
    time. This prevents hangs where the frontend blocks on remaining
    connections but the backend has not yet initiated shutdown.
    
    ---------
    
    Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
    joerunde authored Jul 31, 2024
    0b351c0
  7. Logit bias (#395)

    SUMMARY:
    * fix issue with logit bias loading
    robertgshaw2-neuralmagic authored Jul 31, 2024
    79fcc44
  8. 9da8c4a
  9. 🐛 messed up the revert in the merge commit :(

    Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
    joerunde committed Jul 31, 2024
    4c65f74
  10. fix (#396)

    SUMMARY:
    * passed clamped
    robertgshaw2-neuralmagic authored Jul 31, 2024
    9bc97f1
  11. 68d8612
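
Two of the Jul 31 commits describe picking an open port for the backend and making sure client-side sockets are never leaked. The sketch below illustrates both ideas under stated assumptions: it uses plain standard-library TCP sockets as a stand-in for the ZeroMQ sockets the PR actually uses, and `get_open_port` and `client_socket` are hypothetical names, not functions confirmed to exist in the PR.

```python
import contextlib
import socket
from typing import Iterator


def get_open_port() -> int:
    """Ask the OS for a currently free TCP port on localhost."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))  # port 0 lets the OS choose a free port
        return s.getsockname()[1]


@contextlib.contextmanager
def client_socket(port: int) -> Iterator[socket.socket]:
    """Yield a connected client socket and close it when the block exits."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.connect(("127.0.0.1", port))
        yield sock
    finally:
        # Runs on normal exit and on exceptions, so the socket cannot leak.
        sock.close()
```

The port is released before the caller binds to it, so another process could in principle claim it in between; a real deployment would need to tolerate or retry on that. The same port value would then be handed to both the client and the server at startup.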

Commits on Aug 1, 2024

  1. format

    robertgshaw2-neuralmagic committed Aug 1, 2024
    4337fe7
  2. stash

    robertgshaw2-neuralmagic committed Aug 1, 2024
    779d9bd
  3. Fix failed tests (#398)

    SUMMARY:
    * hack
    robertgshaw2-neuralmagic authored Aug 1, 2024
    a6044a3
  4. 100189f
  5. 0fc8545
  6. updated

    robertgshaw2-neuralmagic committed Aug 1, 2024
    6383091
  7. cleaning

    robertgshaw2-neuralmagic committed Aug 1, 2024
    a09f57f
  8. ✅ add test for multiprocessing flag (#399)

    Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
    joerunde authored Aug 1, 2024
    1bdbfcb
  9. ✨ pipe tracing flag (#400)

    (plus rounding out the protocol with an error on `.encode`)
    
    ---------
    
    Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
    joerunde authored Aug 1, 2024
    f3c0f1c
  10. 9c415ad
  11. rename

    robertgshaw2-neuralmagic committed Aug 1, 2024
    62036ad
  12. cleaning

    robertgshaw2-neuralmagic committed Aug 1, 2024
    a177d87
  13. ordering

    robertgshaw2-neuralmagic committed Aug 1, 2024
    9ca3b93
  14. f8b5fb1
  15. Update vllm/entrypoints/openai/rpc/server.py

    Co-authored-by: Simon Mo <simon.mo@hey.com>
    robertgshaw2-neuralmagic and simon-mo authored Aug 1, 2024
    fca5a71
  16. format

    robertgshaw2-neuralmagic committed Aug 1, 2024
    5f07f86

Commits on Aug 2, 2024

  1. bd0fd76