
Add support for OpenAI API : offline batch(file) processing #699

Merged
merged 11 commits into from
Jul 29, 2024

Conversation

yichuan520030910320
Collaborator

@yichuan520030910320 yichuan520030910320 commented Jul 22, 2024

Thank you for your contribution, we really appreciate it. The following instructions will help improve your pull request and make it easier to receive feedback. If there are any items you don't understand, don't worry. Just submit the pull request and ask the maintainers for help.

Motivation

I want to support the OpenAI Batch API functionality; see the OpenAI Batch API Doc.

Running example

With this PR, the SGLang backend can seamlessly run the example code provided by OpenAI in the link above.

The frontend code is

from openai import OpenAI
import openai
import time
import json
import os


class OpenAIBatchProcessor:
    def __init__(self, api_key):
        # client = OpenAI(api_key=api_key)
        # Point the client at the local SGLang server instead of api.openai.com
        client = openai.Client(base_url="http://127.0.0.1:30000/v1", api_key="EMPTY")

        self.client = client

    def process_batch(self, input_file_path, endpoint, completion_window):

        # Upload the input file
        with open(input_file_path, "rb") as file:
            uploaded_file = self.client.files.create(file=file, purpose="batch")

        # Create the batch job
        batch_job = self.client.batches.create(
            input_file_id=uploaded_file.id,
            endpoint=endpoint,
            completion_window=completion_window,
        )

        # Monitor the batch job status
        while batch_job.status not in ["completed", "failed", "cancelled"]:
            time.sleep(3)  # Wait for 3 seconds before checking the status again
            print(
                f"Batch job status: {batch_job.status}...trying again in 3 seconds..."
            )
            batch_job = self.client.batches.retrieve(batch_job.id)

        # Check the batch job status and errors
        if batch_job.status == "failed":
            print(f"Batch job failed with status: {batch_job.status}")
            print(f"Batch job errors: {batch_job.errors}")
            return None

        # If the batch job is completed, process the results
        if batch_job.status == "completed":

            # print result of batch job
            print("batch", batch_job.request_counts)

            result_file_id = batch_job.output_file_id
            # Retrieve the file content from the server
            file_response = self.client.files.content(result_file_id)
            result_content = file_response.read()  # Read the content of the file

            # Save the content to a local file
            result_file_name = "batch_job_chat_results.jsonl"
            with open(result_file_name, "wb") as file:
                file.write(result_content)  # Write the binary content to the file
            # Load data from the saved JSONL file
            results = []
            with open(result_file_name, "r", encoding="utf-8") as file:
                for line in file:
                    json_object = json.loads(
                        line.strip()
                    )  # Parse each line as a JSON object
                    results.append(json_object)

            return results
        else:
            print(f"Batch job failed with status: {batch_job.status}")
            return None


# Initialize the OpenAIBatchProcessor
api_key = os.environ.get("OPENAI_API_KEY")
processor = OpenAIBatchProcessor(api_key)

# Process the batch job
input_file_path = "input.jsonl"
endpoint = "/v1/chat/completions"
completion_window = "24h"

# Process the batch job
results = processor.process_batch(input_file_path, endpoint, completion_window)

# Print the results
print(results)
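For reference, the input.jsonl consumed by the script above follows the OpenAI batch input file format: one JSON object per line with a custom_id, method, url, and request body. The sketch below writes a minimal two-request example (the prompts and max_tokens value are illustrative):

```python
import json

# Two hypothetical chat-completion requests in OpenAI batch input format.
requests = [
    {
        "custom_id": "request-1",
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": "gpt-3.5-turbo-0125",
            "messages": [
                {"role": "system", "content": "You are a helpful assistant."},
                {"role": "user", "content": "List 3 NBA players."},
            ],
            "max_tokens": 256,
        },
    },
    {
        "custom_id": "request-2",
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": "gpt-3.5-turbo-0125",
            "messages": [
                {"role": "system", "content": "You are a helpful assistant."},
                {"role": "user", "content": "List 3 capital cities."},
            ],
            "max_tokens": 256,
        },
    },
]

# Write one JSON object per line (JSONL).
with open("input.jsonl", "w", encoding="utf-8") as f:
    for req in requests:
        f.write(json.dumps(req) + "\n")
```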

and the output is:

Batch job status: validating...trying again in 3 seconds...
Batch job status: in_progress...trying again in 3 seconds...
Batch job status: in_progress...trying again in 3 seconds...
batch BatchRequestCounts(completed=2, failed=0, total=2)
[{'id': 'batch_req_bee263f2-0734-44cb-96c4-7fb8c4ad1661', 'custom_id': 'request-1', 'response': {'status_code': 200, 'request_id': '2a93b0ea158c42faa4efb9578ad75064', 'body': {'id': '2a93b0ea158c42faa4efb9578ad75064', 'object': 'chat.completion', 'created': 1721670579, 'model': 'gpt-3.5-turbo-0125', 'choices': [{'index': 0, 'message': {'role': 'assistant', 'content': "  Hello there! 👋 It's my pleasure to assist you. Here are three NBA players:\n\n1. LeBron James - He is a four-time NBA champion and four-time NBA Most Valuable Player (MVP) who has played for the Cleveland Cavaliers, Miami Heat, and Los Angeles Lakers.\n2. Stephen Curry - He is a three-time NBA champion and two-time NBA MVP who has played for the Golden State Warriors. He is known for his incredible shooting ability and has won multiple awards for his skills.\n3. Kevin Durant - He is a two-time NBA champion and two-time NBA MVP who has played for the Oklahoma City Thunder and Golden State Warriors. He is known for his scoring ability and is considered one of the best players in the NBA.\n\nI hope this helps! Let me know if you have any other questions. 😊"}, 'logprobs': None, 'finish_reason': 'FINISH_MATCHED_TOKEN: 2'}], 'usage': {'prompt_tokens': 37, 'completion_tokens': 203, 'total_tokens': 240}, 'system_fingerprint': None}}, 'error': None}, {'id': 'batch_req_db9eff3b-aead-4641-bb7b-8fba9fc2626c', 'custom_id': 'request-2', 'response': {'status_code': 200, 'request_id': '3c4bf3e052184f7a8f3e37dc71c79496', 'body': {'id': '3c4bf3e052184f7a8f3e37dc71c79496', 'object': 'chat.completion', 'created': 1721670581, 'model': 'gpt-3.5-turbo-0125', 'choices': [{'index': 0, 'message': {'role': 'assistant', 'content': "  Hello there! As an assistant, I'm happy to help. Here are three capital cities:\n\n1. Tokyo, Japan\n2. New York City, USA\n3. 
London, United Kingdom"}, 'logprobs': None, 'finish_reason': 'FINISH_MATCHED_TOKEN: 2'}], 'usage': {'prompt_tokens': 34, 'completion_tokens': 44, 'total_tokens': 78}, 'system_fingerprint': None}}, 'error': None}]

Notes

  • Here we can also support text completion; this feature (Batch API) can work together with parallel sampling.

  • Beyond adding the Batch API, we can further reorder/reschedule the requests in the file for additional optimization.

The basic design is that the output line order may not match the input line order. We can use the custom_id field, which is present in each line of the output file, to map requests in the input to results in the output.

  • Support managing uploaded files in the server (e.g., the implementation of @app.post("/v1/files") and @app.get("/v1/files/{file_id}") for creating and retrieving files in the server)
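Since the output order is not guaranteed, callers can re-align results by custom_id; a minimal sketch (the sample results below are hypothetical):

```python
# Hypothetical out-of-order batch results.
results = [
    {"custom_id": "request-2", "response": {"status_code": 200}},
    {"custom_id": "request-1", "response": {"status_code": 200}},
]

# Index each result line by its custom_id.
by_id = {r["custom_id"]: r for r in results}

# Restore the original input order.
input_order = ["request-1", "request-2"]
ordered = [by_id[cid] for cid in input_order]
```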
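The server-side file management could look roughly like the following in-memory sketch; the FileStorage class and its fields are illustrative and not the actual adapter.py implementation, though the metadata fields mirror the OpenAI file object:

```python
import time
import uuid


class FileStorage:
    """Illustrative in-memory store backing /v1/files-style endpoints."""

    def __init__(self):
        self._files = {}  # file_id -> (metadata, content)

    def create(self, filename: str, content: bytes, purpose: str = "batch") -> dict:
        # Assign a unique id and record OpenAI-style file metadata.
        file_id = f"file-{uuid.uuid4().hex}"
        metadata = {
            "id": file_id,
            "object": "file",
            "bytes": len(content),
            "created_at": int(time.time()),
            "filename": filename,
            "purpose": purpose,
        }
        self._files[file_id] = (metadata, content)
        return metadata

    def retrieve(self, file_id: str) -> dict:
        # Corresponds to GET /v1/files/{file_id}: return metadata only.
        return self._files[file_id][0]

    def content(self, file_id: str) -> bytes:
        # Corresponds to GET /v1/files/{file_id}/content: return raw bytes.
        return self._files[file_id][1]
```

A batch job would then reference the uploaded input file by its id and register its own output file the same way.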

Modification

  • Add file and batch OpenAI APIs in python/sglang/srt/server.py
  • Implement the logic for creating/retrieving/querying files and batches in python/sglang/srt/openai_api/adapter.py and manage the relationship between files and batch requests (main modification)
  • Add some data structures in python/sglang/srt/openai_api/protocol.py for convenience (all data structures/formats follow the Batch API reference and File API reference)
  • Refactor the code in python/sglang/srt/openai_api/adapter.py for function reuse
  • Add new parameters in python/sglang/srt/server.py to specify the file used to store batch serving results on the server

cc @merrymercy @Ying1123 @hnyls2002 for CR

Checklist

  1. Ensure pre-commit (pre-commit run --all-files) or other linting tools are used to fix potential lint issues.
  2. Confirm that modifications are covered by complete unit tests. If not, please add more unit tests for correctness.
  3. Modify documentation as needed, such as docstrings or example tutorials.

@yichuan520030910320
Collaborator Author

Finished the accuracy and throughput tests; it is ready to merge.

@merrymercy merrymercy merged commit 084fa54 into sgl-project:main Jul 29, 2024
1 check passed