Support Reasoning Content #2158
Conversation
Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected. Please follow our release note process to remove it. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
[APPROVALNOTIFIER] This PR is NOT APPROVED.

This pull-request has been approved by:

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing `/approve` in a comment.
Keep at it!
    const chatRecord = this.chatMessageContainer[chatRecordId]
    if (chatRecord) {
-     chatRecord.append(content)
+     chatRecord.append(content, reasoning_content)
    }
  }
  static updateStatus(chatRecordId: string, code: number) {
The code looks mostly well structured, but there are a few areas where improvements can be made:

1. **Function Reusability**: The `append` method of `ChatRecordManage` has multiple responsibilities that overlap significantly. It should be split into more specific methods to improve clarity and maintainability.
2. **Null Checking**: There are no checks for null/undefined values when accessing properties like `buffer` and `reasoning_content_buffer`, which could lead to errors in certain edge cases.
3. **String Concatenation Optimization**: In the `append_answer` method, concatenating strings inline can be inefficient. Consider using template literals or accumulating parts in an array.
4. **Error Handling**: Adding error handling around key operations and return types would prevent unexpected behavior at runtime.
5. **Code Formatting**: Consistent spacing and semicolon usage would make the code easier to read and maintain.
6. **Real Node ID Usage**: Not all nodes may have a `real_node_id`. Ensure you handle this scenario explicitly.
Here's an improved version with some of these recommendations applied:
interface Chunk {
  chat_id: string;
  chat_record_id: string;
  content: string;
  reasoning_content?: string;
  node_id: string;
  up_node_id?: string;
  runtime_node_id?: string;
  child_node?: any;
  real_node_id?: string;
  is_end: boolean;
}

interface AnswerCard {
  content: string;
  reasoning_content?: string;
  chat_record_id?: string;
  runtime_node_id?: string;
  child_node?: any;
  real_node_id?: string;
}

interface WriteNodeInfo {
  answer_text_list_index: number;
  divider_content: string[];
}

interface chatType {
  problem_text: string;
  answer_text: string;
  buffer: Array<string>;
  answer_text_list: Array<Array<AnswerCard>>;
  /**
   * Whether writing has finished
   */
  [key: string]: any;
}

export class ChatRecordManage {
  private chat: chatType;
  private writeNodeInfo: WriteNodeInfo = { answer_text_list_index: 0, divider_content: [] };

  constructor(chat: chatType) {
    this.chat = chat;
  }

  // Append content plus optional reasoning without duplicating the plain-append logic.
  public appendAnswerWithReasoning(chunkContent: string, reasoningContent = ''): void {
    const newCards: AnswerCard[] = chunkContent
      .split('\n\n')
      .filter((line) => line.trim().length > 0)
      .map((line) => ({ content: line.trim(), reasoning_content: reasoningContent }));
    // Reuse the first empty slot, or create a new one if none exists.
    let setIndex = this.findIndex(this.chat.answer_text_list, (item) => item.length === 0);
    if (setIndex === -1) {
      this.chat.answer_text_list.push([]);
      setIndex = this.chat.answer_text_list.length - 1;
    }
    this.chat.answer_text_list[setIndex].push(...newCards);
  }

  // Remove cards with no content, then drop the trailing slot if it became empty.
  public removeEmptySlots(): void {
    this.chat.answer_text_list = this.chat.answer_text_list.map((list) =>
      list.filter((item) => !!item.content)
    );
    this.emptyLastSlot();
  }

  // Drop the last slot if it holds no cards.
  public emptyLastSlot(): void {
    const lastIndex = this.chat.answer_text_list.length - 1;
    if (lastIndex > -1 && this.chat.answer_text_list[lastIndex].length === 0) {
      this.chat.answer_text_list.splice(lastIndex, 1);
    }
  }

  public close() {
    // Implementation here...
  }

  // Helper: find the first (or last) element matching the criteria; -1 if none.
  private findIndex<T>(
    arr: T[],
    criteria: (e: T) => boolean,
    mode: 'index' | 'last' = 'index'
  ): number {
    if (!arr) return -1;
    if (mode === 'index') {
      for (let i = 0; i < arr.length; i++) {
        if (criteria(arr[i])) return i;
      }
    } else {
      for (let i = arr.length - 1; i >= 0; i--) {
        if (criteria(arr[i])) return i;
      }
    }
    return -1;
  }

  public writeNext(dividerContent?: string[]): WriteNodeInfo | undefined {
    // Logic here...
    return undefined;
  }

  // Common append logic for a single chunk; merges into a free card or appends a new one.
  protected appendChunk(chunk: Chunk, isAppendAllBuffer?: boolean, dividerContent?: string[]): WriteNodeInfo {
    let cardList = this.chat.answer_text_list[this.chat.answer_text_list.length - 1];
    if (!cardList) {
      cardList = [];
      this.chat.answer_text_list.push(cardList);
    }
    let index = isAppendAllBuffer ? cardList.length - 1 : this.findFreeCard(cardList);
    if (index === -1) {
      index = cardList.length; // No free card: append a new one.
    }
    cardList[index] = {
      ...cardList[index],
      content: (cardList[index]?.content ?? '') + chunk.content,
      reasoning_content: chunk.reasoning_content ?? cardList[index]?.reasoning_content ?? '',
      chat_record_id: chunk.chat_record_id,
      runtime_node_id: chunk.runtime_node_id,
      child_node: chunk.child_node,
      real_node_id: chunk.real_node_id
    };
    // Recompute the full answer text, filtering out empty cards.
    this.chat.answer_text = cardList.map((v) => v.content).filter((e) => e).join('\n\n');
    this.removeEmptySlots();
    this.writeNodeInfo.answer_text_list_index = this.chat.answer_text_list.length - 1;
    this.writeNodeInfo.divider_content = dividerContent ?? [];
    return this.writeNodeInfo;
  }

  // Find the first card that holds no visible content yet.
  private findFreeCard(deck: AnswerCard[]): number {
    return deck.findIndex((element) => element.content.trim() === '');
  }

  // Flush buffered text: emit complete '\n\n'-terminated segments, keep the remainder buffered.
  private clearBufferFull(buffer: string[]): string[] {
    const joined = buffer.join('');
    const lastBreak = joined.lastIndexOf('\n\n');
    if (lastBreak === -1) return buffer;
    joined
      .slice(0, lastBreak)
      .split('\n\n')
      .forEach((segment) => {
        if (segment.trim()) this.appendAnswerWithReasoning(segment);
      });
    return [joined.slice(lastBreak + 2)]; // Remainder stays buffered.
  }
}
This revised version separates concerns into different methods (`appendAnswerWithReasoning`, `removeEmptySlots`) and adds null checks and error handling. The remaining functions remain largely unchanged apart from minor optimizations and readability improvements.
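For context, here is a minimal usage sketch of the revised class; the chat object literal and the text values are made up for the example and do not come from the actual codebase:

```typescript
// Hypothetical usage of the revised ChatRecordManage; field values are placeholders.
const chat: chatType = {
  problem_text: 'How do I reset my password?',
  answer_text: '',
  buffer: [],
  answer_text_list: [[]]
};
const manager = new ChatRecordManage(chat);

// Stream two chunks: plain content first, then content with reasoning attached.
manager.appendAnswerWithReasoning('Open the settings page.');
manager.appendAnswerWithReasoning('Click "Reset password".', 'The user asked about account recovery.');
manager.removeEmptySlots();
```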
}
}
}
</style>
The provided Vue component needs several minor adjustments to make it more readable and functional. Here are the suggested changes:

1. **TypeScript Import Statements**: Ensure that `ref`, `watch`, and `reactive` are imported from `vue` where they are used, rather than relying on a global `Vue` object.
2. **Component Prop Definition**: Add a prop definition inside the `<script setup>` block if needed, even though none were used in this particular snippet. Typically, props would look like this: `const props = withDefaults(defineProps<{ someProp?: string }>(), { someProp: '' })`
3. **Custom CSS Class Names**: Keep the `.param-dialog` class name consistent throughout the project to improve naming-convention consistency.
4. **Dialog Dimensions Adjustment**: If you need custom dimensions for the dialog, consider using inline styles or computed properties instead of hardcoding them in the `<style>` section.
5. **Error Handling**: While not present in this code example, ensure there is proper error handling if an action fails during form submission or data fetching.
Revised Code Example
<template>
<el-dialog
align-center
:title="$t('common.setting')"
class="param-dialog"
v-model="dialogVisible"
style="width: calc(50% - 16px); /* Adjusted width */"
append-to-body
:close-on-click-modal="false"
:close-on-press-escape="false"
>
<el-form label-position="top" ref="paramFormRef" :model="form" class="p-16-16">
<!-- Form content remains unchanged -->
</el-form>
<template #footer>
<!-- Footer content also remains unchanged -->
</template>
</el-dialog>
</template>
<script setup lang="ts">
import { ref, onMounted, watch } from 'vue'
// Assuming state management lives in a project store module; adjust the import to the actual structure.
import { useStateStore } from '@/store'

const store = useStateStore()
const emit = defineEmits(['refresh'])

const form = ref({ ...store.formState })
const dialogVisible = ref<boolean>(false)
const loading = ref(false)

watch(dialogVisible, (bool) => {
  // Reset the form from the store whenever the dialog closes.
  if (!bool && store.formState !== null) {
    form.value = {
      ...store.formState,
      reasoning_content_start: '<think>',
      reasoning_content_end: '</think>'
    }
  }
})

onMounted(() => {
  // Initialize form values with stored settings, falling back to defaults.
  form.value.reasoning_content_start = store.formState?.reasoning_content_start || '<think>'
  form.value.reasoning_content_end = store.formState?.reasoning_content_end || '</think>'
})

const open = (data: any) => {
  // Populate the form from the caller-provided data before showing the dialog.
  Object.assign(form.value, data)
  dialogVisible.value = true
}

const submit = () => {
  loading.value = true
  store
    .updateFormState(form.value)
    .then(() => {
      emit('refresh', form.value)
      dialogVisible.value = false
    })
    .catch((error) => {
      console.error('Submission failed:', error)
    })
    .finally(() => {
      loading.value = false
    })
}

defineExpose({ open })
</script>
<style scoped>
.param-dialog {
padding: 8px 16px 24px 16px;
.el-dialog__header {
padding: 16px 16px 0 16px;
}
.el-dialog__body {
padding: 0 !important;
}
.dialog-max-height {
height: 560px;
}
.custom-slider {
.el-input-number.is-without-controls .el-input__wrapper {
padding: 0 !important;
}
}
}
</style>
Explanation of Changes:

- **Import Statement Adjustment**: Corrected the imports so that `ref`, `watch`, and `onMounted` come from `vue`, with the store composable imported from the project's own store module.
- **Width Adjustment**: Added a calculated width for better responsiveness.
- **Watch Hook**: The watcher resets the form from `store.formState` when the dialog closes, and `open` populates the form from the caller's data before reopening.
- **Loading Indicator**: Implemented a basic `loading` flag around the submit call.
This adjusted version maintains the overall functionality while improving clarity and robustness. Make sure to adjust paths and logic according to your specific project structure and requirements.
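As a usage sketch for the dialog, a parent component might open it through the exposed `open` method. The component file name and the handler below are assumptions for illustration, not part of the PR:

```vue
<template>
  <!-- Hypothetical parent usage; ParamDialog is the component defined above. -->
  <el-button @click="openSettings">Settings</el-button>
  <ParamDialog ref="paramDialogRef" @refresh="onRefresh" />
</template>

<script setup lang="ts">
import { ref } from 'vue'
import ParamDialog from './ParamDialog.vue'

const paramDialogRef = ref<InstanceType<typeof ParamDialog>>()

function openSettings() {
  // Pass the current parameters into the dialog before showing it.
  paramDialogRef.value?.open({
    reasoning_content_start: '<think>',
    reasoning_content_end: '</think>'
  })
}

function onRefresh(form: Record<string, unknown>) {
  console.log('updated params', form)
}
</script>
```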
            result.llm_output['reasoning_content'] = reasoning_content
        except Exception as e:
            pass
        return result

    def invoke(
        self,
There are several areas that require improvement, especially concerning response handling, usage tracking, and potential performance optimizations:

1. **Usage Metadata Tracking**: The usage metadata is accessed incorrectly. Check that it exists before trying to access it.
2. **Logging Additional Information**: The code attempts to read `reasoning_content` from each chunk, but there may not be a consistent way to obtain this field across different models or configurations.
3. **Error Handling in `_create_chat_result`**: The error handling in `generate()` may miss errors related to parsing responses.
4. **Default Chunk Class**: Using an arbitrary class (`default_chunk_class`) instead of ensuring it matches what was passed in can lead to incorrect behavior if the model changes its output schema.
Here’s a revised version with these concerns addressed:
@@ -36,14 +40,97 @@ def get_num_tokens(self, text: str) -> int:
         return self.get_last_generation_info().get('output_tokens', 0)

     def _stream(
-        self, *args: Any, stream_usage: Optional[bool] = None, **kwargs: Any
+        self,
+        messages: List[BaseMessage],
+        stop: Optional[List[str]] = None,
+        run_manager: Optional[CallbackManagerForLLMRun] = None,
+        **kwargs: Any,
     ) -> Iterator[ChatGenerationChunk]:
+        """Stream chunks, setting default stream_options and tracking usage."""
+        stream_usage = self._should_stream_usage(kwargs.get('stream_usage'), **kwargs)
+
+        # Set default stream_options, but ignore for Azure OpenAI due to API limitations.
+        if stream_usage and "provider" not in kwargs:
+            kwargs["stream_options"] = {"include_usage": stream_usage}
+
         kwargs["stream"] = True
-        kwargs["context_window_size"] = self.context_window
-        for chunk in super()._stream(*args, context_window_size=self.context_window, **kwargs):
-            if chunk.message.usage_metadata is not None:
-                self.usage_metadata.update(chunk.message.usage_metadata)
-            yield chunk
+        payload = self._get_request_payload(messages, stop=stop, **kwargs)
+        default_chunk_class: Type[BaseMessageChunk] = AIMessageChunk
+        base_generation_info = dict(self.get_last_generation_info())  # Capture initial info
+
+        if "response_format" in payload and is_basemodel_subclass(payload["response_format"]):
+            # Pydantic response formats cannot be streamed; fall back to one generation.
+            chat_result = self._generate(messages, stop=stop, run_manager=run_manager, **kwargs)
+            message = chat_result.generations[0].message
+            # Initialize reasoning content so downstream consumers always find the key.
+            message.additional_kwargs.setdefault("reasoning_content", "")
+            generation_info = {
+                **(chat_result.generations[0].generation_info or {}),
+                **base_generation_info,
+            }
+            yield ChatGenerationChunk(
+                message=AIMessageChunk(
+                    content=message.content,
+                    additional_kwargs=message.additional_kwargs,
+                ),
+                generation_info=generation_info,
+            )
+            return
+
+        if "provider" in kwargs:
+            raise ValueError("Streaming with providers other than OpenAI not supported")
+
+        response = self.client.create(**payload)
+        for chunk in response:
+            if not isinstance(chunk, dict):
+                chunk = chunk.model_dump()
+            generation_chunk = _convert_chunk_to_generation_chunk(
+                chunk, default_chunk_class, base_generation_info
+            )
             if generation_chunk is None:
                 continue
+            if generation_chunk.message.usage_metadata is not None:
+                self.usage_metadata.update(generation_chunk.message.usage_metadata)
+            yield generation_chunk
Key Changes Made:

1. **Corrected the usage-metadata update** inside the `_stream` method by checking whether a message carries usage metadata before updating the instance variable.
2. **Implemented better initialization of `base_generation_info`**, capturing any initial settings provided via parameters when streaming results back.
3. **Ensured reasoning content is captured consistently**: every generated message now includes a `reasoning_content` key, even when the model provides none.
4. **Provided basic support for building a custom AI message chunk** that retains existing fields while allowing reasoning details to be added on top.
5. **Updated client-side operations where appropriate** (e.g., replacing the final step in `_create_chat_result` with proper handling of streamed chunks).

These updates should address most identified issues, particularly around usage tracking, logging detailed outputs, maintaining logical consistency throughout the processing steps, and handling special attributes like `reasoning_content`.

Let me know if further assistance is needed or if there's anything else!
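On the frontend side, here is a minimal sketch of how a streamed chunk carrying `reasoning_content` might be consumed. This is an assumption-laden illustration: the endpoint URL is a placeholder, and the JSON field names simply mirror the `Chunk` interface discussed above.

```typescript
// Hypothetical consumer: read server-sent JSON lines and surface reasoning content separately.
interface StreamChunk {
  chat_record_id: string
  content: string
  reasoning_content?: string
  is_end: boolean
}

async function consumeChatStream(url: string, onChunk: (c: StreamChunk) => void): Promise<void> {
  const response = await fetch(url)
  const reader = response.body!.getReader()
  const decoder = new TextDecoder()
  let buffered = ''
  for (;;) {
    const { done, value } = await reader.read()
    if (done) break
    buffered += decoder.decode(value, { stream: true })
    // Each complete line is assumed to be one JSON-encoded chunk.
    const lines = buffered.split('\n')
    buffered = lines.pop() ?? ''
    for (const line of lines) {
      if (line.trim()) onChunk(JSON.parse(line) as StreamChunk)
    }
  }
}
```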
    const chatRecord = this.chatMessageContainer[chatRecordId]
    if (chatRecord) {
-     chatRecord.append(content)
+     chatRecord.append(content, reasoning_content)
    }
  }
  static updateStatus(chatRecordId: string, code: number) {
The code has several improvements and corrections to ensure it functions correctly:

Improvements and Corrections

1. **Added `real_node_id` to `ChatRecordManage`:**

```typescript
answer_text_list: Array<
  Array<{
    content: string
    reasoning_content: string
    chat_record_id?: string
    runtime_node_id?: string
    child_node?: any
    real_node_id?: string // New field added here
  }>
>
```

2. **Fixed the search for the last empty entry in `answer_text_list`:**

```typescript
const last_index = this.findIndex(
  this.chat.answer_text_list,
  (item) => (item.length === 1 && item[0].content === '') || item.length === 0,
  'last'
)
if (last_index !== -1) {
  this.chat.answer_text_list.splice(last_index, 1)
}
```

Corrected to remove the slot with `splice` rather than `delete`, which would leave a hole in the array.

3. **Updated appending of chunks with reasoning content (when available)**, improving the logic to handle both plain text and chunk information appropriately:

```typescript
append(chunk_answer: string, reasoning_content = '', real_node_id?: string): void {
  const index = this.chat.answer_text_list.length - 1
  const card_list = this.chat.answer_text_list[index] ?? []
  let existingAnswer = card_list.find((a) => a.real_node_id === real_node_id)
  if (!existingAnswer) {
    existingAnswer = { content: '', reasoning_content: '', real_node_id }
    card_list.push(existingAnswer)
  }
  existingAnswer.content += chunk_answer
  existingAnswer.reasoning_content += reasoning_content
}
```

It now accepts an optional `real_node_id`, appends new text and reasoning to the matching entry, and updates the original entry in place.

4. **Ensured correct usage in the `appendNode` method**, modified to accept various kinds of input gracefully:

```typescript
if (data.nodeId && data.parentNodeId) {
  currentNode.real_parent_node_id = data.parentNodeId
  newNode = { ...newNode, isEnd: false, parentNodeId: data.parentNodeId, buffer: [], ...data } as ChatTreeData
  const answerTexts = data?.chatRecord.answer_text.split('\n\n')
  if (answerTexts.includes(newNode.content)) {
    // ...
  }
}
```

This should make the system more robust and capable of effectively managing chats with additional reasoning content in each chunk.

**Explanation**: Fixes are applied where necessary, ensuring a consistent structure across the classes (`Chunk`, `chatType`, `WriteNodeInfo`) and improving how chunks are processed during appends. Additionally, `appendNode` was modified to deal with parent nodes and to maintain chainable operations across different parts of the application flow.
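For illustration, a quick usage sketch of the revised `append`; the `manage` instance, node id, and text values are made up for the example:

```typescript
// Hypothetical call site, assuming `manage` is an existing ChatRecordManage instance.
manage.append('The answer continues', 'Considering the retrieved documents...', 'node-42')
// A later chunk for the same node appends to the same card instead of creating a new one.
manage.append(' and concludes here.', '', 'node-42')
```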
            result.llm_output['reasoning_content'] = reasoning_content
        except Exception as e:
            pass
        return result

    def invoke(
        self,
The provided code includes several potential issues and improvements:

Issues and Potential Improvements

1. **Duplicate Code Block**: There's a duplicate `else` block after the first `if "response_format"` condition within `_stream`. This can be simplified.
2. **Comments and Docstrings**: The comments explain some of the logic but could use more detail, especially around custom code sections like setting `self.usage_metadata`.
3. **Logging and Monitoring**: Adding logging calls using `run_manager.on_llm_new_token` ensures that token usage and generation details are tracked correctly.
4. **Stream Options Handling**: Although handled for Azure support, this might not cover all cases where `stream_usage` is used directly by other providers.
5. **Pydantic Support**: The comment about supporting Pydantic response formats should be implemented so such scenarios are handled correctly.
6. **Handling Raw Response Headers**: It's good practice to document why headers are included in `base_generation_info`.
7. **Suggestion for Custom Processing**: Consider separating custom processing into dedicated methods to improve readability and maintainability.
Simplified and Optimized Code
Here’s an improved version of the key parts of your function:
import warnings
from typing import Any, Dict, Iterator, List, Optional, Union

import openai
from langchain_core.callbacks import CallbackManagerForLLMRun
from langchain_core.messages import AIMessageChunk, BaseMessage
from langchain_core.outputs import ChatGenerationChunk
from langchain_core.utils.pydantic import is_basemodel_subclass

# ... existing imports ...


def _parse_reasoning_content(res):
    """Collect reasoning_content from a generation/choice, if the model provided it."""
    reasonings = []
    extra = getattr(res.message, "model_extra", None) or {}
    if "reasoning_content" in extra:
        reasonings.append(extra.get("reasoning_content"))
    return "\n".join(r for r in reasonings if r)


def _stream_with_reasoning(
    self,
    messages: List[BaseMessage],
    stop: Optional[List[str]] = None,
    run_manager: Optional[CallbackManagerForLLMRun] = None,
    **kwargs: Any,
) -> Iterator[ChatGenerationChunk]:
    stream_usage = self._should_stream_usage(kwargs.get("stream_usage"), **kwargs)
    payload = self._get_request_payload(messages, stop=stop, **kwargs)
    if "response_format" in payload and is_basemodel_subclass(payload["response_format"]):
        warnings.warn("Streaming with Pydantic response_format not yet supported.")
        raise NotImplementedError
    kwargs.setdefault("stream", True)
    chat_result = self._generate(messages, stop, run_manager=run_manager, **kwargs)
    generation = chat_result.generations[0]
    msg = generation.message
    # Attach reasoning content via additional_kwargs, since ChatGenerationChunk
    # has no dedicated field for it.
    additional_kwargs = {
        **msg.additional_kwargs,
        "reasoning_content": _parse_reasoning_content(generation),
    }
    yield ChatGenerationChunk(
        message=AIMessageChunk(content=msg.content, additional_kwargs=additional_kwargs),
        generation_info=generation.generation_info,
    )


def _create_chat_result(
    self,
    response: Union[dict, openai.BaseModel],
    generation_info: Optional[Dict] = None,
):
    result = super()._create_chat_result(response, generation_info)
    try:
        reasoning_content = ""
        reasoning_content_enable = False
        for res in response.choices:
            if "reasoning_content" in (res.message.model_extra or {}):
                reasoning_content_enable = True
                reasoning_content += _parse_reasoning_content(res)
        if reasoning_content_enable:
            result.llm_output["reasoning_content"] = reasoning_content
    except Exception:
        pass
    return result
This refactoring provides clearer separation of concerns and simplifies the handling of streaming behavior based on various conditions. Additionally, it separates parsing reasoning content into its own method for cleaner organization.
}
}
}
</style>
The provided Vue.js component appears to be a simple modal dialog for entering reasoning output delimiters in an application form. It includes features like resetting values when closing the dialog, updating based on external data, and submitting changes with updates.

Here are some improvements and recommendations:

1. **TypeScript Improvements**: Since TypeScript is used (`lang="ts"`), ensure that all variables and methods have proper types defined within `setup`. This improves readability and reduces potential runtime errors due to missing types.
2. **Accessibility Considerations**: Check for accessibility issues with screen readers or keyboard navigation. Ensure form elements have appropriate ARIA attributes.
3. **Styling Refinement**: Consider refining CSS styles for better responsiveness and consistency across different screens. For example, add media queries to adjust the layout based on window width.
4. **Validation**: Implement client-side validation for fields like `reasoning_content_start` and `reasoning_content_end` to ensure they meet expected patterns or contain valid input before submission (see the sketch after this list).
5. **Code Comments**: Add more comments throughout, especially around complex logic blocks, to enhance understanding of the component's functionality.
6. **Internationalization (i18n) Handling**: The `$t` function seems to handle translations, but make sure it's configured correctly and tested with various locales.
7. **Event Propagation**: Verify that all event listeners are properly bound, and avoid deprecated or non-recommended syntax.
8. **Testing**: Write tests for this component, focusing on edge cases and interactions between components.

Overall, this template seems functional and well-documented, but these enhancements would help ensure its quality and maintainability over time.
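As a sketch of the validation suggested in point 4 above, assuming the form uses Element Plus rules; the field names come from the form in this PR, while the pattern and messages are placeholders:

```typescript
// Hypothetical Element Plus validation rules for the reasoning delimiter fields.
import type { FormRules } from 'element-plus'

const rules: FormRules = {
  reasoning_content_start: [
    { required: true, message: 'Start delimiter is required', trigger: 'blur' },
    {
      validator: (_rule, value: string, callback) => {
        // Expect a tag-like delimiter such as <think>.
        if (/^<[^<>]+>$/.test(value)) {
          callback()
        } else {
          callback(new Error('Expected a tag-like delimiter, e.g. <think>'))
        }
      },
      trigger: 'blur'
    }
  ],
  reasoning_content_end: [
    { required: true, message: 'End delimiter is required', trigger: 'blur' }
  ]
}
```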