Support Reasoning Content #2158
Conversation
Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected. Please follow our release note process to remove it. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
[APPROVALNOTIFIER] This PR is NOT APPROVED.

This pull-request has been approved by:

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing `/approve` in a comment.
Keep at it!
    const chatRecord = this.chatMessageContainer[chatRecordId]
    if (chatRecord) {
-     chatRecord.append(content)
+     chatRecord.append(content, reasoning_content)
    }
  }
  static updateStatus(chatRecordId: string, code: number) {
The code looks mostly well structured, but there are a few areas where improvements can be made:

1. **Function Reusability**: The `append` method of `ChatRecordManage` has multiple responsibilities that overlap significantly. It should be split into more specific methods to improve clarity and maintainability.
2. **Null Checking**: There are no checks for null/undefined values when accessing properties like `buffer` and `reasoning_content_buffer`, which could lead to errors in certain edge cases.
3. **String Concatenation Optimization**: In the `append_answer` method, concatenating strings inline can be inefficient. Consider using template literals or accumulating parts in an array.
4. **Error Handling**: Adding error handling around key operations and return types would prevent unexpected behavior at runtime.
5. **Code Formatting**: Consistent spacing and semicolon usage would make the code easier to read and maintain.
6. **Real Node ID Usage**: Not all nodes may have a `real_node_id`. Ensure you handle this scenario explicitly.
Here's an improved version with some of these recommendations applied:
interface Chunk {
  chat_id: string;
  chat_record_id: string;
  content: string;
  reasoning_content?: string;
  node_id: string;
  up_node_id?: string;
  runtime_node_id?: string;
  child_node?: any;
  real_node_id?: string;
  is_end: boolean;
}

interface AnswerCard {
  content: string;
  reasoning_content?: string;
  chat_record_id?: string;
  runtime_node_id?: string;
  child_node?: any;
  real_node_id?: string;
}

interface WriteNodeInfo {
  answer_text_list_index: number;
  divider_content: string[];
}

interface chatType {
  problem_text: string;
  answer_text: string;
  buffer: Array<string>;
  answer_text_list: Array<Array<AnswerCard>>;
  /**
   * Whether writing has finished
   */
  [key: string]: any;
}

export class ChatRecordManage {
  private chat: chatType;
  private writeNodeInfo: WriteNodeInfo = { answer_text_list_index: 0, divider_content: [] };

  constructor(chat: chatType) {
    this.chat = chat;
  }

  // Append content plus optional reasoning without duplicating the plain-append logic.
  public appendAnswerWithReasoning(chunkContent: string, reasoningContent = ''): void {
    const newCards: AnswerCard[] = chunkContent
      .split('\n\n')
      .filter((line) => line.trim().length > 0)
      .map((line) => ({ content: line.trim(), reasoning_content: reasoningContent }));
    // Reuse the first empty slot, or create a new one if none exists.
    let setIndex = this.findIndex(this.chat.answer_text_list, (item) => item.length === 0);
    if (setIndex === -1) {
      this.chat.answer_text_list.push([]);
      setIndex = this.chat.answer_text_list.length - 1;
    }
    this.chat.answer_text_list[setIndex].push(...newCards);
  }

  // Remove cards with no content, then drop the trailing slot if it became empty.
  public removeEmptySlots(): void {
    this.chat.answer_text_list = this.chat.answer_text_list.map((list) =>
      list.filter((item) => !!item.content)
    );
    this.emptyLastSlot();
  }

  // Drop the last slot if it holds no cards.
  public emptyLastSlot(): void {
    const lastIndex = this.chat.answer_text_list.length - 1;
    if (lastIndex > -1 && this.chat.answer_text_list[lastIndex].length === 0) {
      this.chat.answer_text_list.splice(lastIndex, 1);
    }
  }

  public close() {
    // Implementation here...
  }

  // Helper: find the first (or last) element matching the criteria; -1 if none.
  private findIndex<T>(
    arr: T[],
    criteria: (e: T) => boolean,
    mode: 'index' | 'last' = 'index'
  ): number {
    if (!arr) return -1;
    if (mode === 'index') {
      for (let i = 0; i < arr.length; i++) {
        if (criteria(arr[i])) return i;
      }
    } else {
      for (let i = arr.length - 1; i >= 0; i--) {
        if (criteria(arr[i])) return i;
      }
    }
    return -1;
  }

  public writeNext(dividerContent?: string[]): WriteNodeInfo | undefined {
    // Logic here...
    return undefined;
  }

  // Common append logic for a single chunk; merges into a free card or appends a new one.
  protected appendChunk(chunk: Chunk, isAppendAllBuffer?: boolean, dividerContent?: string[]): WriteNodeInfo {
    let cardList = this.chat.answer_text_list[this.chat.answer_text_list.length - 1];
    if (!cardList) {
      cardList = [];
      this.chat.answer_text_list.push(cardList);
    }
    let index = isAppendAllBuffer ? cardList.length - 1 : this.findFreeCard(cardList);
    if (index === -1) {
      index = cardList.length; // No free card: append a new one.
    }
    cardList[index] = {
      ...cardList[index],
      content: (cardList[index]?.content ?? '') + chunk.content,
      reasoning_content: chunk.reasoning_content ?? cardList[index]?.reasoning_content ?? '',
      chat_record_id: chunk.chat_record_id,
      runtime_node_id: chunk.runtime_node_id,
      child_node: chunk.child_node,
      real_node_id: chunk.real_node_id
    };
    // Recompute the full answer text, filtering out empty cards.
    this.chat.answer_text = cardList.map((v) => v.content).filter((e) => e).join('\n\n');
    this.removeEmptySlots();
    this.writeNodeInfo.answer_text_list_index = this.chat.answer_text_list.length - 1;
    this.writeNodeInfo.divider_content = dividerContent ?? [];
    return this.writeNodeInfo;
  }

  // Find the first card that holds no visible content yet.
  private findFreeCard(deck: AnswerCard[]): number {
    return deck.findIndex((element) => element.content.trim() === '');
  }

  // Flush buffered text: emit complete '\n\n'-terminated segments, keep the remainder buffered.
  private clearBufferFull(buffer: string[]): string[] {
    const joined = buffer.join('');
    const lastBreak = joined.lastIndexOf('\n\n');
    if (lastBreak === -1) return buffer;
    joined
      .slice(0, lastBreak)
      .split('\n\n')
      .forEach((segment) => {
        if (segment.trim()) this.appendAnswerWithReasoning(segment);
      });
    return [joined.slice(lastBreak + 2)]; // Remainder stays buffered.
  }
}
This revised version separates concerns into different methods (`appendAnswerWithReasoning`, `removeEmptySlots`) and adds null checks and error handling. The remaining functions remain largely unchanged apart from minor optimizations and readability improvements.
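For context, here is a minimal usage sketch of the revised class; the chat object literal and the text values are made up for the example and do not come from the actual codebase:

```typescript
// Hypothetical usage of the revised ChatRecordManage; field values are placeholders.
const chat: chatType = {
  problem_text: 'How do I reset my password?',
  answer_text: '',
  buffer: [],
  answer_text_list: [[]]
};
const manager = new ChatRecordManage(chat);

// Stream two chunks: plain content first, then content with reasoning attached.
manager.appendAnswerWithReasoning('Open the settings page.');
manager.appendAnswerWithReasoning('Click "Reset password".', 'The user asked about account recovery.');
manager.removeEmptySlots();
```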
}
}
}
</style>
The provided Vue component needs several minor adjustments to make it more readable and functional. Here are the suggested changes:

1. **TypeScript Import Statements**: Ensure that `ref`, `watch`, and `reactive` are imported from `vue` where they are used, rather than relying on a global `Vue` object.
2. **Component Prop Definition**: Add a prop definition inside the `<script setup>` block if needed, even though none were used in this particular snippet. Typically, props would look like this: `const props = withDefaults(defineProps<{ someProp?: string }>(), { someProp: '' })`
3. **Custom CSS Class Names**: Keep the `.param-dialog` class name consistent throughout the project to improve naming-convention consistency.
4. **Dialog Dimensions Adjustment**: If you need custom dimensions for the dialog, consider using inline styles or computed properties instead of hardcoding them in the `<style>` section.
5. **Error Handling**: While not present in this code example, ensure there is proper error handling if an action fails during form submission or data fetching.
Revised Code Example
<template>
<el-dialog
align-center
:title="$t('common.setting')"
class="param-dialog"
v-model="dialogVisible"
style="width: calc(50% - 16px); /* Adjusted width */"
append-to-body
:close-on-click-modal="false"
:close-on-press-escape="false"
>
<el-form label-position="top" ref="paramFormRef" :model="form" class="p-16-16">
<!-- Form content remains unchanged -->
</el-form>
<template #footer>
<!-- Footer content also remains unchanged -->
</template>
</el-dialog>
</template>
<script setup lang="ts">
import { ref, onMounted, watch } from 'vue'
// Assuming state management lives in a project store module; adjust the import to the actual structure.
import { useStateStore } from '@/store'

const store = useStateStore()
const emit = defineEmits(['refresh'])

const form = ref({ ...store.formState })
const dialogVisible = ref<boolean>(false)
const loading = ref(false)

watch(dialogVisible, (bool) => {
  // Reset the form from the store whenever the dialog closes.
  if (!bool && store.formState !== null) {
    form.value = {
      ...store.formState,
      reasoning_content_start: '<think>',
      reasoning_content_end: '</think>'
    }
  }
})

onMounted(() => {
  // Initialize form values with stored settings, falling back to defaults.
  form.value.reasoning_content_start = store.formState?.reasoning_content_start || '<think>'
  form.value.reasoning_content_end = store.formState?.reasoning_content_end || '</think>'
})

const open = (data: any) => {
  // Populate the form from the caller-provided data before showing the dialog.
  Object.assign(form.value, data)
  dialogVisible.value = true
}

const submit = () => {
  loading.value = true
  store
    .updateFormState(form.value)
    .then(() => {
      emit('refresh', form.value)
      dialogVisible.value = false
    })
    .catch((error) => {
      console.error('Submission failed:', error)
    })
    .finally(() => {
      loading.value = false
    })
}

defineExpose({ open })
</script>
<style scoped>
.param-dialog {
padding: 8px 16px 24px 16px;
.el-dialog__header {
padding: 16px 16px 0 16px;
}
.el-dialog__body {
padding: 0 !important;
}
.dialog-max-height {
height: 560px;
}
.custom-slider {
.el-input-number.is-without-controls .el-input__wrapper {
padding: 0 !important;
}
}
}
</style>
Explanation of Changes:

- **Import Statement Adjustment**: Corrected the imports so that `ref`, `watch`, and `onMounted` come from `vue`, with the store composable imported from the project's own store module.
- **Width Adjustment**: Added a calculated width for better responsiveness.
- **Watch Hook**: The watcher resets the form from `store.formState` when the dialog closes, and `open` populates the form from the caller's data before reopening.
- **Loading Indicator**: Implemented a basic `loading` flag around the submit call.
This adjusted version maintains the overall functionality while improving clarity and robustness. Make sure to adjust paths and logic according to your specific project structure and requirements.
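As a usage sketch for the dialog, a parent component might open it through the exposed `open` method. The component file name and the handler below are assumptions for illustration, not part of the PR:

```vue
<template>
  <!-- Hypothetical parent usage; ParamDialog is the component defined above. -->
  <el-button @click="openSettings">Settings</el-button>
  <ParamDialog ref="paramDialogRef" @refresh="onRefresh" />
</template>

<script setup lang="ts">
import { ref } from 'vue'
import ParamDialog from './ParamDialog.vue'

const paramDialogRef = ref<InstanceType<typeof ParamDialog>>()

function openSettings() {
  // Pass the current parameters into the dialog before showing it.
  paramDialogRef.value?.open({
    reasoning_content_start: '<think>',
    reasoning_content_end: '</think>'
  })
}

function onRefresh(form: Record<string, unknown>) {
  console.log('updated params', form)
}
</script>
```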
            result.llm_output['reasoning_content'] = reasoning_content
        except Exception as e:
            pass
        return result

    def invoke(
        self,
There are several areas that require improvement, especially concerning response handling, usage tracking, and potential performance optimizations:

1. **Usage Metadata Tracking**: The usage metadata is accessed incorrectly. Check that it exists before trying to access it.
2. **Logging Additional Information**: The code attempts to read `reasoning_content` from each chunk, but there may not be a consistent way to obtain this field across different models or configurations.
3. **Error Handling in `_create_chat_result`**: The error handling in `generate()` may miss errors related to parsing responses.
4. **Default Chunk Class**: Using an arbitrary class (`default_chunk_class`) instead of ensuring it matches what was passed in can lead to incorrect behavior if the model changes its output schema.
Here’s a revised version with these concerns addressed:
@@ -36,14 +40,97 @@ def get_num_tokens(self, text: str) -> int:
         return self.get_last_generation_info().get('output_tokens', 0)

     def _stream(
-        self, *args: Any, stream_usage: Optional[bool] = None, **kwargs: Any
+        self,
+        messages: List[BaseMessage],
+        stop: Optional[List[str]] = None,
+        run_manager: Optional[CallbackManagerForLLMRun] = None,
+        **kwargs: Any,
     ) -> Iterator[ChatGenerationChunk]:
+        """Stream chunks, setting default stream_options and tracking usage."""
+        stream_usage = self._should_stream_usage(kwargs.get('stream_usage'), **kwargs)
+
+        # Set default stream_options, but ignore for Azure OpenAI due to API limitations.
+        if stream_usage and "provider" not in kwargs:
+            kwargs["stream_options"] = {"include_usage": stream_usage}
+
         kwargs["stream"] = True
-        kwargs["context_window_size"] = self.context_window
-        for chunk in super()._stream(*args, context_window_size=self.context_window, **kwargs):
-            if chunk.message.usage_metadata is not None:
-                self.usage_metadata.update(chunk.message.usage_metadata)
-            yield chunk
+        payload = self._get_request_payload(messages, stop=stop, **kwargs)
+        default_chunk_class: Type[BaseMessageChunk] = AIMessageChunk
+        base_generation_info = dict(self.get_last_generation_info())  # Capture initial info
+
+        if "response_format" in payload and is_basemodel_subclass(payload["response_format"]):
+            # Pydantic response formats cannot be streamed; fall back to one generation.
+            chat_result = self._generate(messages, stop=stop, run_manager=run_manager, **kwargs)
+            message = chat_result.generations[0].message
+            # Initialize reasoning content so downstream consumers always find the key.
+            message.additional_kwargs.setdefault("reasoning_content", "")
+            generation_info = {
+                **(chat_result.generations[0].generation_info or {}),
+                **base_generation_info,
+            }
+            yield ChatGenerationChunk(
+                message=AIMessageChunk(
+                    content=message.content,
+                    additional_kwargs=message.additional_kwargs,
+                ),
+                generation_info=generation_info,
+            )
+            return
+
+        if "provider" in kwargs:
+            raise ValueError("Streaming with providers other than OpenAI not supported")
+
+        response = self.client.create(**payload)
+        for chunk in response:
+            if not isinstance(chunk, dict):
+                chunk = chunk.model_dump()
+            generation_chunk = _convert_chunk_to_generation_chunk(
+                chunk, default_chunk_class, base_generation_info
+            )
             if generation_chunk is None:
                 continue
+            if generation_chunk.message.usage_metadata is not None:
+                self.usage_metadata.update(generation_chunk.message.usage_metadata)
+            yield generation_chunk
Key Changes Made:

1. **Corrected the usage-metadata update** inside the `_stream` method by checking whether a message carries usage metadata before updating the instance variable.
2. **Implemented better initialization of `base_generation_info`**, capturing any initial settings provided via parameters when streaming results back.
3. **Ensured reasoning content is captured consistently**: every generated message now includes a `reasoning_content` key, even when the model provides none.
4. **Provided basic support for building a custom AI message chunk** that retains existing fields while allowing reasoning details to be added on top.
5. **Updated client-side operations where appropriate** (e.g., replacing the final step in `_create_chat_result` with proper handling of streamed chunks).

These updates should address most identified issues, particularly around usage tracking, logging detailed outputs, maintaining logical consistency throughout the processing steps, and handling special attributes like `reasoning_content`.

Let me know if further assistance is needed or if there's anything else!
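On the frontend side, here is a minimal sketch of how a streamed chunk carrying `reasoning_content` might be consumed. This is an assumption-laden illustration: the endpoint URL is a placeholder, and the JSON field names simply mirror the `Chunk` interface discussed above.

```typescript
// Hypothetical consumer: read server-sent JSON lines and surface reasoning content separately.
interface StreamChunk {
  chat_record_id: string
  content: string
  reasoning_content?: string
  is_end: boolean
}

async function consumeChatStream(url: string, onChunk: (c: StreamChunk) => void): Promise<void> {
  const response = await fetch(url)
  const reader = response.body!.getReader()
  const decoder = new TextDecoder()
  let buffered = ''
  for (;;) {
    const { done, value } = await reader.read()
    if (done) break
    buffered += decoder.decode(value, { stream: true })
    // Each complete line is assumed to be one JSON-encoded chunk.
    const lines = buffered.split('\n')
    buffered = lines.pop() ?? ''
    for (const line of lines) {
      if (line.trim()) onChunk(JSON.parse(line) as StreamChunk)
    }
  }
}
```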
    const chatRecord = this.chatMessageContainer[chatRecordId]
    if (chatRecord) {
-     chatRecord.append(content)
+     chatRecord.append(content, reasoning_content)
    }
  }
  static updateStatus(chatRecordId: string, code: number) {
The code has several improvements and corrections to ensure it functions correctly:

Improvements and Corrections

1. **Added `real_node_id` to `ChatRecordManage`:**

```typescript
answer_text_list: Array<
  Array<{
    content: string
    reasoning_content: string
    chat_record_id?: string
    runtime_node_id?: string
    child_node?: any
    real_node_id?: string // New field added here
  }>
>
```

2. **Fixed the search for the last empty entry in `answer_text_list`:**

```typescript
const last_index = this.findIndex(
  this.chat.answer_text_list,
  (item) => (item.length === 1 && item[0].content === '') || item.length === 0,
  'last'
)
if (last_index !== -1) {
  this.chat.answer_text_list.splice(last_index, 1)
}
```

Corrected to remove the slot with `splice` rather than `delete`, which would leave a hole in the array.

3. **Updated appending of chunks with reasoning content (when available)**, improving the logic to handle both plain text and chunk information appropriately:

```typescript
append(chunk_answer: string, reasoning_content = '', real_node_id?: string): void {
  const index = this.chat.answer_text_list.length - 1
  const card_list = this.chat.answer_text_list[index] ?? []
  let existingAnswer = card_list.find((a) => a.real_node_id === real_node_id)
  if (!existingAnswer) {
    existingAnswer = { content: '', reasoning_content: '', real_node_id }
    card_list.push(existingAnswer)
  }
  existingAnswer.content += chunk_answer
  existingAnswer.reasoning_content += reasoning_content
}
```

It now accepts an optional `real_node_id`, appends new text and reasoning to the matching entry, and updates the original entry in place.

4. **Ensured correct usage in the `appendNode` method**, modified to accept various kinds of input gracefully:

```typescript
if (data.nodeId && data.parentNodeId) {
  currentNode.real_parent_node_id = data.parentNodeId
  newNode = { ...newNode, isEnd: false, parentNodeId: data.parentNodeId, buffer: [], ...data } as ChatTreeData
  const answerTexts = data?.chatRecord.answer_text.split('\n\n')
  if (answerTexts.includes(newNode.content)) {
    // ...
  }
}
```

This should make the system more robust and capable of effectively managing chats with additional reasoning content in each chunk.

**Explanation**: Fixes are applied where necessary, ensuring a consistent structure across the classes (`Chunk`, `chatType`, `WriteNodeInfo`) and improving how chunks are processed during appends. Additionally, `appendNode` was modified to deal with parent nodes and to maintain chainable operations across different parts of the application flow.
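For illustration, a quick usage sketch of the revised `append`; the `manage` instance, node id, and text values are made up for the example:

```typescript
// Hypothetical call site, assuming `manage` is an existing ChatRecordManage instance.
manage.append('The answer continues', 'Considering the retrieved documents...', 'node-42')
// A later chunk for the same node appends to the same card instead of creating a new one.
manage.append(' and concludes here.', '', 'node-42')
```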
            result.llm_output['reasoning_content'] = reasoning_content
        except Exception as e:
            pass
        return result

    def invoke(
        self,
The provided code includes several potential issues and improvements:

Issues and Potential Improvements

1. **Duplicate Code Block**: There's a duplicate `else` block after the first `if "response_format"` condition within `_stream`. This can be simplified.
2. **Comments and Docstrings**: The comments explain some of the logic but could use more detail, especially around custom code sections like setting `self.usage_metadata`.
3. **Logging and Monitoring**: Adding logging calls using `run_manager.on_llm_new_token` ensures that token usage and generation details are tracked correctly.
4. **Stream Options Handling**: Although handled for Azure support, this might not cover all cases where `stream_usage` is used directly by other providers.
5. **Pydantic Support**: The comment about supporting Pydantic response formats should be implemented so such scenarios are handled correctly.
6. **Handling Raw Response Headers**: It's good practice to document why headers are included in `base_generation_info`.
7. **Suggestion for Custom Processing**: Consider separating custom processing into dedicated methods to improve readability and maintainability.
Simplified and Optimized Code
Here’s an improved version of the key parts of your function:
import warnings
from typing import Any, Dict, Iterator, List, Optional, Union

import openai
from langchain_core.callbacks import CallbackManagerForLLMRun
from langchain_core.messages import AIMessageChunk, BaseMessage
from langchain_core.outputs import ChatGenerationChunk
from langchain_core.utils.pydantic import is_basemodel_subclass

# ... existing imports ...


def _parse_reasoning_content(res):
    """Collect reasoning_content from a generation/choice, if the model provided it."""
    reasonings = []
    extra = getattr(res.message, "model_extra", None) or {}
    if "reasoning_content" in extra:
        reasonings.append(extra.get("reasoning_content"))
    return "\n".join(r for r in reasonings if r)


def _stream_with_reasoning(
    self,
    messages: List[BaseMessage],
    stop: Optional[List[str]] = None,
    run_manager: Optional[CallbackManagerForLLMRun] = None,
    **kwargs: Any,
) -> Iterator[ChatGenerationChunk]:
    stream_usage = self._should_stream_usage(kwargs.get("stream_usage"), **kwargs)
    payload = self._get_request_payload(messages, stop=stop, **kwargs)
    if "response_format" in payload and is_basemodel_subclass(payload["response_format"]):
        warnings.warn("Streaming with Pydantic response_format not yet supported.")
        raise NotImplementedError
    kwargs.setdefault("stream", True)
    chat_result = self._generate(messages, stop, run_manager=run_manager, **kwargs)
    generation = chat_result.generations[0]
    msg = generation.message
    # Attach reasoning content via additional_kwargs, since ChatGenerationChunk
    # has no dedicated field for it.
    additional_kwargs = {
        **msg.additional_kwargs,
        "reasoning_content": _parse_reasoning_content(generation),
    }
    yield ChatGenerationChunk(
        message=AIMessageChunk(content=msg.content, additional_kwargs=additional_kwargs),
        generation_info=generation.generation_info,
    )


def _create_chat_result(
    self,
    response: Union[dict, openai.BaseModel],
    generation_info: Optional[Dict] = None,
):
    result = super()._create_chat_result(response, generation_info)
    try:
        reasoning_content = ""
        reasoning_content_enable = False
        for res in response.choices:
            if "reasoning_content" in (res.message.model_extra or {}):
                reasoning_content_enable = True
                reasoning_content += _parse_reasoning_content(res)
        if reasoning_content_enable:
            result.llm_output["reasoning_content"] = reasoning_content
    except Exception:
        pass
    return result
This refactoring provides clearer separation of concerns and simplifies the handling of streaming behavior based on various conditions. Additionally, it separates parsing reasoning content into its own method for cleaner organization.
}
}
}
</style>
The provided Vue.js component appears to be a simple modal dialog for entering reasoning output delimiters in an application form. It includes features like resetting values when closing the dialog, updating based on external data, and submitting changes with updates.

Here are some improvements and recommendations:

1. **TypeScript Improvements**: Since TypeScript is used (`lang="ts"`), ensure that all variables and methods have proper types defined within `setup`. This improves readability and reduces potential runtime errors due to missing types.
2. **Accessibility Considerations**: Check for accessibility issues with screen readers or keyboard navigation. Ensure form elements have appropriate ARIA attributes.
3. **Styling Refinement**: Consider refining CSS styles for better responsiveness and consistency across different screens. For example, add media queries to adjust the layout based on window width.
4. **Validation**: Implement client-side validation for fields like `reasoning_content_start` and `reasoning_content_end` to ensure they meet expected patterns or contain valid input before submission (see the sketch after this list).
5. **Code Comments**: Add more comments throughout, especially around complex logic blocks, to enhance understanding of the component's functionality.
6. **Internationalization (i18n) Handling**: The `$t` function seems to handle translations, but make sure it's configured correctly and tested with various locales.
7. **Event Propagation**: Verify that all event listeners are properly bound, and avoid deprecated or non-recommended syntax.
8. **Testing**: Write tests for this component, focusing on edge cases and interactions between components.

Overall, this template seems functional and well-documented, but these enhancements would help ensure its quality and maintainability over time.
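As a sketch of the validation suggested in point 4 above, assuming the form uses Element Plus rules; the field names come from the form in this PR, while the pattern and messages are placeholders:

```typescript
// Hypothetical Element Plus validation rules for the reasoning delimiter fields.
import type { FormRules } from 'element-plus'

const rules: FormRules = {
  reasoning_content_start: [
    { required: true, message: 'Start delimiter is required', trigger: 'blur' },
    {
      validator: (_rule, value: string, callback) => {
        // Expect a tag-like delimiter such as <think>.
        if (/^<[^<>]+>$/.test(value)) {
          callback()
        } else {
          callback(new Error('Expected a tag-like delimiter, e.g. <think>'))
        }
      },
      trigger: 'blur'
    }
  ],
  reasoning_content_end: [
    { required: true, message: 'End delimiter is required', trigger: 'blur' }
  ]
}
```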