
Supporting Cline requests style. #2659

Open
gahoo opened this issue Dec 12, 2024 · 15 comments
Labels: feature, help wanted, pr welcome
Milestone: v1.x

Comments

gahoo commented Dec 12, 2024

Feature request

Cline requests are quite different in the content section, which may be a list of text parts instead of a str:

{
	"model": "qwen-2.5-coder-instruct",
	"messages": [{
		"role": "system",
		"content": "You are Cline, a highly skilled software engineer with extensive knowledge in many programming languages, frameworks, design patterns, and best practices......"
	}, {
		"role": "user",
		"content": [{
			"type": "text",
			"text": "<task>......</task>"
		}, {
			"type": "text",
			"text": "<environment_details>.....</environment_details>"
		}, {
			"type": "text",
			"text": "[TASK RESUMPTION] ....."
		}, {
			"type": "text",
			"text": "<environment_details>.......</environment_details>"
		}, {
			"type": "text",
			"text": "[TASK RESUMPTION] ......"
		}, {
			"type": "text",
			"text": "<environment_details>......</environment_details>"
		}]
	}],
	"temperature": 0,
	"stream": true,
	"stream_options": {
		"include_usage": true
	}
}

So the default chat_template might cause the following error:

TypeError: [address=0.0.0.0:33357, pid=544] can only concatenate str (not "list") to str
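
For illustration only (this is not Xinference's actual template code), any str-assuming prompt construction fails in exactly this way on such a request:

    # Illustrative only: concatenating str-assuming prompt pieces breaks
    # when the content field is a list of parts instead of a str.
    message = {
        "role": "user",
        "content": [{"type": "text", "text": "<task>...</task>"}],
    }
    prompt = "<|im_start|>user\n" + message["content"]
    # TypeError: can only concatenate str (not "list") to str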

Motivation

It would be great if Cline and other similar tools could use Xinference's API.

A similar issue happened with the DeepSeek API and appears to be fixed; see cline/cline#230.

Your contribution

None

gahoo added the feature label Dec 12, 2024
XprobeBot added this to the v1.x milestone Dec 12, 2024
qinxuye (Contributor) commented Dec 12, 2024

To be honest I have no clue about Cline; this is the first time I've heard of it.

But contributions for this feature are welcome.

qinxuye added the help wanted and pr welcome labels Dec 12, 2024
gahoo (Author) commented Dec 13, 2024

Cline is a VS Code extension, previously named Claude Dev.

I think it would be a lot of work, since every LLM has its own chat_template. Do you think a custom chat_template is a solution?

qinxuye (Contributor) commented Dec 13, 2024

Does Cline have its own API route like the OpenAI /v1 ones? If so, we can tell that a message is from Cline and write a function to convert the messages.

gahoo (Author) commented Dec 13, 2024

Cline supports API providers like OpenRouter, Anthropic, OpenAI, Google Gemini, AWS Bedrock, Azure, and GCP Vertex. You can also configure any OpenAI compatible API, or use a local model through LM Studio/Ollama. If you're using OpenRouter, the extension fetches their latest model list, allowing you to use the newest models as soon as they're available.

Cline is designed to work with Anthropic Claude 3.5 Sonnet, so it should be OpenAI API compatible. I think it is just a client that constructs requests and sends them to an API. It's hard to tell whether a request is from Cline: the constructed request does not contain any information saying it comes from Cline.

qinxuye (Contributor) commented Dec 13, 2024

That would be a bit nastier: we would have to analyze the content of the messages and, if it contains a structure like the example, convert it to flattened, normal OpenAI messages.

gahoo (Author) commented Dec 13, 2024

Yes, and it will put an extra burden on the server for each request. Do you think it's worth it? Are there any other clients that send similar requests?

qinxuye (Contributor) commented Dec 13, 2024

Yeah, it costs some. A simple way is to just pass the messages to the model; if a TypeError is raised, we then check and convert. IMO, that would be better.
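
A rough sketch of that fallback, with hypothetical _do_chat and _flatten_content helpers standing in for the real internals:

    # Sketch only: _do_chat and _flatten_content are hypothetical names,
    # not actual Xinference APIs.
    async def chat_with_fallback(self, messages, generate_config=None):
        try:
            # Fast path: pass the messages through unchanged.
            return await self._do_chat(messages, generate_config)
        except TypeError:
            # Likely Cline-style list content: flatten each message's
            # text parts into a single str, then retry once.
            messages = self._flatten_content(messages)
            return await self._do_chat(messages, generate_config)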

gahoo (Author) commented Dec 16, 2024

> That would be a bit nastier: we would have to analyze the content of the messages and, if it contains a structure like the example, convert it to flattened, normal OpenAI messages.

liteLLM uses this strategy:

https://github.com/BerriAI/litellm/blob/fd583e715ec1b183cade08ac29c1cbe73e5c94f6/litellm/llms/mistral/mistral_chat_transformation.py#L170

https://github.com/BerriAI/litellm/blob/fd583e715ec1b183cade08ac29c1cbe73e5c94f6/litellm/litellm_core_utils/prompt_templates/common_utils.py#L27

https://github.com/BerriAI/litellm/blob/fd583e715ec1b183cade08ac29c1cbe73e5c94f6/litellm/litellm_core_utils/prompt_templates/common_utils.py#L63

> Yeah, it costs some. A simple way is to just pass the messages to the model; if a TypeError is raised, we then check and convert. IMO, that would be better.

If the contents of the requests are all lists, handling the errors will also cost extra resources, even more than the first strategy would.

qinxuye (Contributor) commented Dec 16, 2024

OK, I think doing the conversion on the messages is fine.

github-actions bot commented Dec 23, 2024

This issue is stale because it has been open for 7 days with no activity.

github-actions bot added the stale label Dec 23, 2024
qinxuye removed the stale label Dec 24, 2024
qinxuye (Contributor) commented Dec 24, 2024

@gahoo Are you interested in contributing this feature?

hwzhuhao (Contributor) commented
I have tried two methods, and both test results are okay.

Method 1: Modify the chat_template to handle list-type content. (The original comment shows this change only as a screenshot, not reproduced here.)
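
Since the screenshot is not preserved, a rough sketch of what such a template change typically looks like follows, demonstrated with jinja2 in Python. This is an assumption about the approach, not the actual content of the screenshot:

    # Assumed sketch, not the actual screenshot content: a chat template
    # fragment that accepts both str content and Cline-style text-part lists.
    from jinja2 import Template

    fragment = Template(
        "{% if message['content'] is string %}{{ message['content'] }}"
        "{% else %}{% for part in message['content'] %}"
        "{% if part['type'] == 'text' %}{{ part['text'] }}{% endif %}"
        "{% endfor %}{% endif %}"
    )

    cline_message = {
        "role": "user",
        "content": [{"type": "text", "text": "<task>...</task>"}],
    }
    print(fragment.render(message=cline_message))  # -> <task>...</task>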

Method 2: Add a static method _handle_messages_with_content_list_to_str_conversion in the ChatModelMixin class in llm/utils.py to implement message conversion, as detailed below:

    @staticmethod
    def _handle_messages_with_content_list_to_str_conversion(
        messages: List[Dict],
    ) -> List[Dict]:
        """
        Handle messages whose content is a list of text parts (e.g. Cline
        requests) by concatenating the parts into a single str.
        """
        for message in messages:
            texts = ""
            msg_content = message.get("content")
            if msg_content:
                if isinstance(msg_content, str):
                    texts = msg_content
                elif isinstance(msg_content, list):
                    # Concatenate every text part in order.
                    for c in msg_content:
                        text_content = c.get("text")
                        if text_content:
                            texts += text_content
            if texts:
                # Only overwrite when some text was extracted, so non-text
                # content (e.g. images) is left untouched.
                message["content"] = texts
        return messages
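
For example (illustrative values only), a Cline-style message collapses to plain str content:

    msgs = [{
        "role": "user",
        "content": [
            {"type": "text", "text": "<task>...</task>"},
            {"type": "text", "text": "<environment_details>...</environment_details>"},
        ],
    }]
    ChatModelMixin._handle_messages_with_content_list_to_str_conversion(msgs)
    # msgs[0]["content"] is now:
    # "<task>...</task><environment_details>...</environment_details>"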

Additionally, modify the async_chat method in the VLLMChatModel class to call _handle_messages_with_content_list_to_str_conversion, as detailed below:

    @vllm_check
    async def async_chat(
        self,
        messages: List[Dict],
        generate_config: Optional[Dict] = None,
        request_id: Optional[str] = None,
    ) -> Union[ChatCompletion, AsyncGenerator[ChatCompletionChunk, None]]:
        messages = self._handle_messages_with_content_list_to_str_conversion(messages)
        tools = generate_config.pop("tools", []) if generate_config else None
        ... ...

qinxuye (Contributor) commented Dec 24, 2024

Thanks @hwzhuhao, nice work. I prefer method 2, which is more general. Willing to hear your opinions.

hwzhuhao (Contributor) commented

I also agree with using method 2. The chat template is usually provided by the model provider, so it is not suitable for modification.

ccly1996 commented

> I also agree with using method 2. The chat template is usually provided by the model provider, so it is not suitable for modification.

I modified the code in the Docker image using method 2 above, but it didn't work.
