Draft: Add token counting per message #460
Conversation
Force-pushed from 562cb30 to 13dbd5f
Implement token counting per message by using the usage stats returned by the LLM API backend. Anthropic already sends token stats by default; for OpenAI, the stream first needs to be configured to return them.

- Add a migration for the new Message attributes `input_token_count` and `output_token_count`
- Add the concern `Message::TokenCount` for the Message model
- Capture the token counts returned by Anthropic & OpenAI and set the Message attributes inside the async API-specific jobs. For OpenAI, configure the stream to return usage stats.

Signed-off-by: Daniel Schnell <dschnell@grammatek.com>
Co-authored-by: Keith Schacht <krschacht@gmail.com>
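A minimal sketch of the OpenAI side of this, assuming a ruby-openai-style streaming client; the names `client`, `history`, `message`, and the model string are placeholders, not the PR's actual code. The key detail is `stream_options: { include_usage: true }`, without which OpenAI omits usage stats from streamed responses:

```ruby
client.chat(
  parameters: {
    model: "gpt-4o",
    messages: history,
    # Ask OpenAI to append a final chunk containing usage stats to the stream.
    stream_options: { include_usage: true },
    stream: proc do |chunk, _bytesize|
      # Only the final chunk carries "usage"; earlier chunks carry content deltas.
      if (usage = chunk["usage"])
        message.update!(
          input_token_count:  usage["prompt_tokens"],
          output_token_count: usage["completion_tokens"]
        )
      end
    end
  }
)
```

Anthropic, by contrast, reports `input_tokens`/`output_tokens` in its streamed `message_start` and `message_delta` events without any extra configuration.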
Add a drop-down showing the input/output tokens reported by the LLM API for the ongoing conversation.

Signed-off-by: Daniel Schnell <dschnell@grammatek.com>
Force-pushed from f270bdc to 7de810f
<p class="py-1 px-2 text-sm text-gray-700 dark:text-gray-300">Output tokens: <%= output_tokens %></p>
    </div>
  </div>
<% end %>
@lumpidu I tried out this PR and it's looking good! But I just realized a fairly significant issue which I didn't consider before: the `_message` partial updates a lot because we stream continual fragments, so having this partial trigger extra SQL queries doesn't seem like a good idea.
I spent a little while looking at the UI and made room for this next to the conversation title, in the left column (screenshot attached in the original comment).

What do you think? It's this PR #463 (I just hard-coded those token numbers as placeholders, which you can make dynamic).
One implication of putting it in the left column is that we need to make sure it updates itself. I already have the front-end wired up with Turbo so that if a conversation model changes at all, it broadcasts an update to the left side and replaces the full `_conversation` partial. Every time a message finishes streaming, it does `message.conversation.touch` to update the timestamp of the conversation object, and this also causes the conversation title to update because it has an `after_touch` hook (all of this is within `conversation.rb`). So as long as we continually update the cost on the conversation model, it will automagically stay updated on the front-end.
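To make that chain concrete, here is a rough sketch of how the wiring could look; the hook and method names below are illustrative, not copied from `conversation.rb`:

```ruby
class Conversation < ApplicationRecord
  belongs_to :user
  has_many :messages

  # Fires whenever the record is touched, e.g. by message.conversation.touch
  # at the end of streaming; Turbo Streams pushes the re-rendered partial
  # to every subscribed client.
  after_touch :broadcast_conversation

  private

  def broadcast_conversation
    broadcast_replace_to user, :conversations,
      partial: "conversations/conversation",
      locals: { conversation: self }
  end
end
```

With something like this in place, any column added to `conversations` (token counts, cost) gets re-rendered for free whenever the record is touched or saved.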
If you like the direction I'm heading with this, I can merge in that PR #463 and then you can pull main back into this PR. You can then update this PR to:
- Add 3 token columns to the conversation table: `input_token_count`, `output_token_count`, and `token_cost`
- Add an `after_save` hook within message which updates these three columns on the conversation (a rough sketch of these first two items follows below)
- Within `get_next_ai_message_job`, find the `conversation.touch` and remove it; it's no longer necessary now that `message.save` always triggers the conversation to update
- Within `conversation.rb`, change the `after_touch` to an `after_save`
- Update the `_conversation` partial to remove "hidden" on that element and wire up those numbers
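As referenced in the list above, here is a hedged sketch of the first two items; the column names come from the list, but the migration version, cost logic, and method names are placeholders rather than the final code:

```ruby
class AddTokenCountsToConversations < ActiveRecord::Migration[7.1]
  def change
    add_column :conversations, :input_token_count,  :integer, default: 0, null: false
    add_column :conversations, :output_token_count, :integer, default: 0, null: false
    add_column :conversations, :token_cost, :decimal, precision: 10, scale: 6, default: 0, null: false
  end
end

class Message < ApplicationRecord
  belongs_to :conversation

  # Roll the per-message counts up onto the conversation after every save;
  # conversation.update! then fires the conversation's own callbacks, which
  # is what makes the explicit conversation.touch unnecessary.
  after_save :update_conversation_token_counts

  private

  def update_conversation_token_counts
    conversation.update!(
      input_token_count:  conversation.messages.sum(:input_token_count),
      output_token_count: conversation.messages.sum(:output_token_count),
      token_cost:         0 # placeholder: derive from per-model pricing
    )
  end
end
```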
Yes, this looks good to me. I will merge your changes and continue as proposed.
Great! It's now in main so you can pull it into this branch.
I can't push some fixes to this branch and I want to get it merged in, so I needed to create a new PR: #495
This PR is related to #402