Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Core][Frontend][Doc] Initial support for LLaVA-NeXT and GPT-4V Chat Completions API #3978

Closed

Commits on Apr 10, 2024

  1. Add basic support for OpenAI image input API

    - Refactor `OpenAIServingChat` and add function for loading image
    - Move `pillow` dev dependency to common
    - Add example chat template for LLaVA model
    DarkLight1337 committed Apr 10, 2024
    Configuration menu
    Copy the full SHA
    874a581 View commit details
    Browse the repository at this point in the history
  2. Update documentation

    - Add general guide for using VLMs
    - Add LLavA to list of supported models
    DarkLight1337 committed Apr 10, 2024
    Configuration menu
    Copy the full SHA
    607434e View commit details
    Browse the repository at this point in the history
  3. Add tests for OpenAI image input API and image loader

    - Move `ServerRunner` to common file
    DarkLight1337 committed Apr 10, 2024
    Configuration menu
    Copy the full SHA
    aaa6bfe View commit details
    Browse the repository at this point in the history

Commits on Apr 11, 2024

  1. Configuration menu
    Copy the full SHA
    26e7b2a View commit details
    Browse the repository at this point in the history
  2. Apply formatter

    DarkLight1337 committed Apr 11, 2024
    Configuration menu
    Copy the full SHA
    44829b5 View commit details
    Browse the repository at this point in the history
  3. Configuration menu
    Copy the full SHA
    bccb367 View commit details
    Browse the repository at this point in the history
  4. Configuration menu
    Copy the full SHA
    b9302e8 View commit details
    Browse the repository at this point in the history
  5. Fix errors in CI/CD

    - Incorrect loading of config (also rename `openai_api` to `image_openai`)
    - Incorrect await of stream generator
    DarkLight1337 committed Apr 11, 2024
    Configuration menu
    Copy the full SHA
    a44d7d1 View commit details
    Browse the repository at this point in the history

Commits on Apr 12, 2024

  1. Configuration menu
    Copy the full SHA
    561ad49 View commit details
    Browse the repository at this point in the history
  2. Configuration menu
    Copy the full SHA
    4479605 View commit details
    Browse the repository at this point in the history
  3. Improve async behaviour of loading images

    - Also, use the type definitions from `openai` directly
    DarkLight1337 committed Apr 12, 2024
    Configuration menu
    Copy the full SHA
    20852d9 View commit details
    Browse the repository at this point in the history
  4. Configuration menu
    Copy the full SHA
    ce770f4 View commit details
    Browse the repository at this point in the history
  5. Configuration menu
    Copy the full SHA
    6b016bc View commit details
    Browse the repository at this point in the history
  6. Some more fixes

    DarkLight1337 committed Apr 12, 2024
    Configuration menu
    Copy the full SHA
    7620354 View commit details
    Browse the repository at this point in the history
  7. Apply formatter

    DarkLight1337 committed Apr 12, 2024
    Configuration menu
    Copy the full SHA
    7c3e6d9 View commit details
    Browse the repository at this point in the history
  8. Configuration menu
    Copy the full SHA
    e74b0a7 View commit details
    Browse the repository at this point in the history
  9. Configuration menu
    Copy the full SHA
    9925dcb View commit details
    Browse the repository at this point in the history
  10. Configuration menu
    Copy the full SHA
    ceb4e35 View commit details
    Browse the repository at this point in the history
  11. Refactor prompt parsing so that it can be shared between Chat Complet…

    …ions API and legacy Completions API
    DarkLight1337 committed Apr 12, 2024
    Configuration menu
    Copy the full SHA
    7bdc84e View commit details
    Browse the repository at this point in the history
  12. Make code more readable

    DarkLight1337 committed Apr 12, 2024
    Configuration menu
    Copy the full SHA
    a7d1098 View commit details
    Browse the repository at this point in the history
  13. Configuration menu
    Copy the full SHA
    8b9d636 View commit details
    Browse the repository at this point in the history
  14. Configuration menu
    Copy the full SHA
    9754142 View commit details
    Browse the repository at this point in the history
  15. Add code documentation

    DarkLight1337 committed Apr 12, 2024
    Configuration menu
    Copy the full SHA
    c48c13a View commit details
    Browse the repository at this point in the history
  16. Configuration menu
    Copy the full SHA
    3530362 View commit details
    Browse the repository at this point in the history
  17. Configuration menu
    Copy the full SHA
    b8feec9 View commit details
    Browse the repository at this point in the history
  18. Configuration menu
    Copy the full SHA
    9cae113 View commit details
    Browse the repository at this point in the history

Commits on Apr 13, 2024

  1. Configuration menu
    Copy the full SHA
    89d9086 View commit details
    Browse the repository at this point in the history
  2. Configuration menu
    Copy the full SHA
    cc1a5b3 View commit details
    Browse the repository at this point in the history
  3. Configuration menu
    Copy the full SHA
    f9c1135 View commit details
    Browse the repository at this point in the history

Commits on Apr 14, 2024

  1. Configuration menu
    Copy the full SHA
    ecc2d50 View commit details
    Browse the repository at this point in the history
  2. Configuration menu
    Copy the full SHA
    f2e8180 View commit details
    Browse the repository at this point in the history
  3. Configuration menu
    Copy the full SHA
    ce04842 View commit details
    Browse the repository at this point in the history
  4. Load image processor from HuggingFace

    - Note that multi modal processing logic has been moved from `LLM` to `LLMEngine`
    DarkLight1337 committed Apr 14, 2024
    Configuration menu
    Copy the full SHA
    cdbf08a View commit details
    Browse the repository at this point in the history
  5. Configuration menu
    Copy the full SHA
    9a336ec View commit details
    Browse the repository at this point in the history
  6. Allow disabling image processor

    - Also fix missing arguments to config in `test_llava.py`
    DarkLight1337 committed Apr 14, 2024
    Configuration menu
    Copy the full SHA
    5722dd8 View commit details
    Browse the repository at this point in the history

Commits on Apr 15, 2024

  1. Configuration menu
    Copy the full SHA
    6e1fa67 View commit details
    Browse the repository at this point in the history
  2. Configuration menu
    Copy the full SHA
    7ce44da View commit details
    Browse the repository at this point in the history

Commits on Apr 16, 2024

  1. Configuration menu
    Copy the full SHA
    9804604 View commit details
    Browse the repository at this point in the history
  2. Configuration menu
    Copy the full SHA
    21434df View commit details
    Browse the repository at this point in the history
  3. Configuration menu
    Copy the full SHA
    a5907b0 View commit details
    Browse the repository at this point in the history

Commits on Apr 17, 2024

  1. Configuration menu
    Copy the full SHA
    f08ff10 View commit details
    Browse the repository at this point in the history
  2. Configuration menu
    Copy the full SHA
    c126646 View commit details
    Browse the repository at this point in the history

Commits on Apr 18, 2024

  1. Configuration menu
    Copy the full SHA
    49ba216 View commit details
    Browse the repository at this point in the history
  2. Add TODO to test

    DarkLight1337 committed Apr 18, 2024
    Configuration menu
    Copy the full SHA
    11e9921 View commit details
    Browse the repository at this point in the history
  3. Configuration menu
    Copy the full SHA
    7ae80a2 View commit details
    Browse the repository at this point in the history
  4. Configuration menu
    Copy the full SHA
    2610bea View commit details
    Browse the repository at this point in the history
  5. Configuration menu
    Copy the full SHA
    5ad2b67 View commit details
    Browse the repository at this point in the history
  6. Refactor image processing, MultiModalData and LLaVA model

    - Remove channel conversion and resizing from OpenAI server preprocessing since the image processor in HuggingFace should be able to handle that
    - `MultiModalData` is now an abstract class that outputs additional kwargs to be input into the model. This was intially done to support LLaVA-NeXT's `image_size` parameter but can be extended to other models as well.
    - The application of image processor is now defined inside `MultiModalData` so that there is no need to extensively edit the engine to support other types of data
    - New `MultiModalData` subclasses: `ImagePixelData` and `ImageFeatureData` to better differentiate the two cases of image input
    - Refactored LLaVA-1.5 model to make it easier to inherit for defining LLaVA-NeXT model
    DarkLight1337 committed Apr 18, 2024
    Configuration menu
    Copy the full SHA
    696357b View commit details
    Browse the repository at this point in the history
  7. Fix image processing not working directly, due to tensor being passed

    - Now, `ImagePixelData` only accepts `PIL.Image` input
    - Also move `torch` import out of `TYPE_CHECKING` as it is loaded anyways when importing `SamplingParams`
    DarkLight1337 committed Apr 18, 2024
    Configuration menu
    Copy the full SHA
    483b190 View commit details
    Browse the repository at this point in the history
  8. Configuration menu
    Copy the full SHA
    3e22017 View commit details
    Browse the repository at this point in the history
  9. Configuration menu
    Copy the full SHA
    0b6af35 View commit details
    Browse the repository at this point in the history
  10. Get LLaVA-Next to work with fixed-size images

    - Note the patch in `ImagePixelData`. To fully leverage the potential of LLaVA-Next, we should allow image of any size, but the feature size would then be variable.
    DarkLight1337 committed Apr 18, 2024
    Configuration menu
    Copy the full SHA
    e4c3502 View commit details
    Browse the repository at this point in the history
  11. Configuration menu
    Copy the full SHA
    21aaf3d View commit details
    Browse the repository at this point in the history
  12. Configuration menu
    Copy the full SHA
    ac95b79 View commit details
    Browse the repository at this point in the history
  13. Configuration menu
    Copy the full SHA
    9a9a4e7 View commit details
    Browse the repository at this point in the history
  14. Configuration menu
    Copy the full SHA
    176ad2c View commit details
    Browse the repository at this point in the history

Commits on Apr 19, 2024

  1. Configuration menu
    Copy the full SHA
    91ea044 View commit details
    Browse the repository at this point in the history
  2. Fix LLaVA example and test w.r.t. image processing refactor

    - Note that we now load the images directly instead of from `.pt` files
    DarkLight1337 committed Apr 19, 2024
    Configuration menu
    Copy the full SHA
    cb19743 View commit details
    Browse the repository at this point in the history
  3. Configuration menu
    Copy the full SHA
    019f473 View commit details
    Browse the repository at this point in the history
  4. Fix circular import and set return type

    - These changes are propagated to the child PRs
    DarkLight1337 committed Apr 19, 2024
    Configuration menu
    Copy the full SHA
    f882d99 View commit details
    Browse the repository at this point in the history