Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(backend): Track LLM token usage + LLM blocks cleanup #8367

Merged
merged 10 commits into from
Oct 22, 2024

Conversation

majdyz
Copy link
Contributor

@majdyz majdyz commented Oct 17, 2024

Background

We are not tracking LLM's token usage.

Changes 🏗️

  • Added LLM token usage tracking on AIStructuredResponseGeneratorBlock
  • Make other LLM block utilizes AIStructuredResponseGeneratorBlock as the base LLM call.

Testing 🔍

Note

Only for the new autogpt platform, currently in autogpt_platform/

  • Create from scratch and execute an agent with at least 3 blocks
  • Import an agent from file upload, and confirm it executes correctly
  • Upload agent to marketplace
  • Import an agent from marketplace and confirm it executes correctly
  • Edit an agent from monitor, and confirm it executes correctly

@majdyz majdyz requested a review from a team as a code owner October 17, 2024 13:32
Copy link

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 4 🔵🔵🔵🔵⚪
🧪 No relevant tests
🔒 No security concerns identified
⚡ Recommended focus areas for review

Potential Performance Issue
The AIStructuredResponseGeneratorBlock's llm_call method now returns token usage information, but this information is not being used effectively in all cases. For example, in the Ollama provider case, it returns placeholder values (0, 0) for input and output tokens. This could lead to inaccurate token usage tracking.

Code Duplication
The AITextGeneratorBlock, AISummarizerBlock, and AIConversationBlock all implement similar llm_call methods that create an AIStructuredResponseGeneratorBlock and call its run_once method. This duplication could be refactored into a shared method or base class.

Potential Bug
The merge_stats method in the Block class doesn't handle nested dictionaries correctly. It only updates the top-level keys, which could lead to data loss or unexpected behavior if stats contain nested structures.

@majdyz majdyz requested review from ntindle and removed request for Torantulino October 18, 2024 03:25
@majdyz majdyz requested a review from kcze October 19, 2024 05:30
ntindle
ntindle previously approved these changes Oct 22, 2024
autogpt_platform/backend/backend/blocks/llm.py Outdated Show resolved Hide resolved
@majdyz majdyz force-pushed the zamilmajdy/add-token-count-as-stats branch from f845a7e to dcf1396 Compare October 22, 2024 23:05
@majdyz majdyz requested a review from ntindle October 22, 2024 23:12
@majdyz majdyz enabled auto-merge (squash) October 22, 2024 23:13
@majdyz majdyz merged commit f4dac22 into dev Oct 22, 2024
7 checks passed
@majdyz majdyz deleted the zamilmajdy/add-token-count-as-stats branch October 22, 2024 23:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

3 participants