Conversational Feedback #12590
@@ -0,0 +1,86 @@
# Chat Feedback Template

This template captures implicit feedback from human behavior in a simple chat bot. It instructs an LLM to reference a user's responses within a conversation to evaluate the chat bot's previous replies.
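As a rough illustration of the idea in this README paragraph, the sketch below scores the most recent bot reply from the human message that follows it. It is a minimal sketch under stated assumptions, not the template's code: `Message`, `score_from_followup`, and `evaluate_previous_reply` are hypothetical names, and the keyword heuristic is only a stand-in for the LLM grader the template describes.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Message:
    role: str  # "human" or "ai"
    content: str


def score_from_followup(followup: str) -> float:
    """Hypothetical stand-in for the LLM grader: map a follow-up to a score in [0, 1]."""
    text = followup.lower()
    if any(cue in text for cue in ("thanks", "perfect", "that works")):
        return 1.0
    if any(cue in text for cue in ("that's wrong", "not what i asked", "doesn't work")):
        return 0.0
    return 0.5  # no clear signal either way


def evaluate_previous_reply(history: List[Message]) -> Optional[float]:
    """Score the most recent AI reply using the human message that follows it."""
    if len(history) < 2:
        return None
    previous, latest = history[-2], history[-1]
    # Only an (ai reply, human follow-up) pair carries implicit feedback.
    if previous.role != "ai" or latest.role != "human":
        return None
    return score_from_followup(latest.content)
```

The point of the shape is that no explicit rating is requested: the follow-up itself is the evidence, and the score attaches to the reply before it rather than to the current turn.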
Nice.
1/ IIUC, this is performing chat evaluations w/o explicit user-feedback, which is very useful. We might create a top-level shortened summary that just states this clearly. It's the first eval template, so very cool to have.
2/ We might explicitly mention that your chat app should be implemented (or called) in chain.py and call out specifically where, as a placeholder. AFAICT, any chat runnable can simply append:
.with_config(
    run_name="ChatBot",
    callbacks=[
        EvaluatorCallbackHandler(
            evaluators=[
                ResponseEffectivenessEvaluator(evaluate_response_effectiveness)
            ]
        )
    ],
)
3/ Where do you go to fetch the evals in LangSmith? It may be nice to show a screenshot.
Cool cool. For (1), are you saying on top of the README that's here? (is it too roundabout)?
For (2) - yes, albeit it needs a 'last_run_id' to be passed around so that the feedback can be assigned to the previous response's trace. If we didn't care about the exact credit assignment, or if we had a better way of tracking conversations, this would be easier/better.
For (3) def. I'll do that.
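To make the 'last_run_id' point in (2) concrete, here is a minimal sketch of the credit assignment: each turn first records feedback against the previous run, then answers under a fresh run id that the caller passes back on the next turn. Everything here is an assumption for illustration: `FeedbackStore` is an in-memory stand-in for a tracing backend such as LangSmith, and `chat_turn`, `respond`, `grade`, and the `"response_effectiveness"` key are hypothetical names, not the template's API.

```python
import uuid
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class FeedbackStore:
    """In-memory stand-in for a tracing backend; maps run id -> feedback records."""
    records: Dict[str, List[dict]] = field(default_factory=dict)

    def create_feedback(self, run_id: str, key: str, score: float) -> None:
        self.records.setdefault(run_id, []).append({"key": key, "score": score})


def chat_turn(
    user_message: str,
    last_run_id: Optional[str],
    respond: Callable[[str], str],
    grade: Callable[[str], float],
    store: FeedbackStore,
) -> dict:
    """Handle one turn: credit the previous run, then answer under a fresh run id."""
    if last_run_id is not None:
        # The new user message is evidence about the *previous* reply,
        # so the score is attached to last_run_id, not to this turn's run.
        store.create_feedback(last_run_id, "response_effectiveness", grade(user_message))
    run_id = str(uuid.uuid4())  # hypothetical id; a real tracer would supply its own
    return {"reply": respond(user_message), "run_id": run_id}
```

The client keeps the returned `run_id` and sends it back as `last_run_id` with the next message; the first turn passes `None`, so nothing is scored until a follow-up arrives.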
Context in the README. Show how to score chat responses based on a follow-up from the user and then log that as feedback in LangSmith.