Abstract out token usage numbers #610
Comments
The most complex form of token accounting right now is OpenAI - their usage block looks like this:
{
  "completion_tokens": 9,
  "prompt_tokens": 8,
  "total_tokens": 17,
  "prompt_tokens_details": {
    "cached_tokens": 0,
    "audio_tokens": 0
  },
  "completion_tokens_details": {
    "reasoning_tokens": 0,
    "audio_tokens": 0,
    "accepted_prediction_tokens": 0,
    "rejected_prediction_tokens": 0
  }
} |
If my eventual goal is to support code that can calculate dollar spend, I'll also need to consider how I store the fact that a prompt was executed in batch mode, which often provides a 50% discount. LLM doesn't have a mechanism for batch mode yet, but it could definitely be a useful feature. |
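To make the batch-mode point concrete, here is a minimal sketch of a dollar-cost calculation. The function name `estimate_cost`, the per-million-token prices, and the flat 50% `batch_discount` default are all assumptions for illustration. They are not real pricing and not part of LLM.

```python
def estimate_cost(
    input_tokens,
    output_tokens,
    input_price_per_million,
    output_price_per_million,
    batch=False,
    batch_discount=0.5,  # assumption: batch mode halves the price
):
    """Return an estimated cost in dollars for a single prompt."""
    cost = (
        input_tokens * input_price_per_million
        + output_tokens * output_price_per_million
    ) / 1_000_000
    if batch:
        # The discount only applies if we recorded that batch mode was used
        cost *= 1 - batch_discount
    return cost


# Hypothetical prices: $2.50 per million input tokens, $10.00 per million output
regular = estimate_cost(8, 9, 2.50, 10.00)
batched = estimate_cost(8, 9, 2.50, 10.00, batch=True)
```

The calculation needs a `batch` flag as an input, so that fact would have to be stored alongside the token counts.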
Given the complexity of the OpenAI response I'm tempted to add a JSON column to store this. But the above example would actually be a waste of JSON, since the only things that actually matter in there are I could have columns for |
Gemini 1.5 Pro usage looks like this:
"usageMetadata": {
  "promptTokenCount": 14,
  "candidatesTokenCount": 26,
  "totalTokenCount": 40
}
Anthropic is even simpler:
"usage": {
  "input_tokens": 8,
  "output_tokens": 18
} |
"usage": {
  "completion_tokens": 349,
  "prompt_tokens": 12,
  "total_tokens": 361
}
"usage": {
  "prompt_tokens": 15,
  "completion_tokens": 1,
  "total_tokens": 16,
  "prompt_tokens_details": null,
  "completion_tokens_details": null
} |
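The usage shapes shown so far differ mainly in key names, so one possible abstraction is a small lookup that maps each shape to an input/output pair. This is only a sketch: `normalize_usage` is a hypothetical helper, and it assumes the dict passed in is the inner usage object (the value of `usage` or `usageMetadata`).

```python
def normalize_usage(usage):
    """Map a provider usage dict to an (input_tokens, output_tokens) tuple."""
    key_pairs = [
        ("prompt_tokens", "completion_tokens"),        # OpenAI-style
        ("promptTokenCount", "candidatesTokenCount"),  # Gemini usageMetadata
        ("input_tokens", "output_tokens"),             # Anthropic
    ]
    for input_key, output_key in key_pairs:
        if input_key in usage and output_key in usage:
            return usage[input_key], usage[output_key]
    raise ValueError("Unrecognized usage format")


openai_style = {"completion_tokens": 9, "prompt_tokens": 8, "total_tokens": 17}
gemini = {"promptTokenCount": 14, "candidatesTokenCount": 26, "totalTokenCount": 40}
anthropic = {"input_tokens": 8, "output_tokens": 18}

print(normalize_usage(openai_style))  # (8, 9)
print(normalize_usage(gemini))        # (14, 26)
print(normalize_usage(anthropic))     # (8, 18)
```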
Maybe start with something like this:
from dataclasses import dataclass
from typing import Dict

@dataclass
class Usage:
    model_id: str
    input_tokens: int
    output_tokens: int
    details: Dict[str, int]
(Update: I ditched this idea) |
I'm going to add a method with this signature:
def set_usage(input: int, output: int, details: dict = None) -> None: |
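As a rough illustration of how such a method might store its arguments, here is a minimal sketch. `ExampleResponse` and its attribute names are hypothetical stand-ins rather than LLM's actual class. The `self` parameter is added on the assumption that this is an instance method.

```python
from typing import Optional


class ExampleResponse:
    """Hypothetical stand-in for whatever object ends up owning usage data."""

    def __init__(self) -> None:
        self.input_tokens: Optional[int] = None
        self.output_tokens: Optional[int] = None
        self.token_details: Optional[dict] = None

    def set_usage(self, input: int, output: int, details: Optional[dict] = None) -> None:
        # Store the two headline counts plus any provider-specific extras
        self.input_tokens = input
        self.output_tokens = output
        self.token_details = details


response = ExampleResponse()
response.set_usage(8, 18)
response_with_details = ExampleResponse()
response_with_details.set_usage(30791, 421, {"prompt_tokens_details": {"cached_tokens": 30592}})
```

A model plugin would call this once it has parsed the provider's usage block, so code built on top only has to read three attributes.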
OpenAI detailed usage blocks are pretty lengthy:
{
  "completion_tokens": 462,
  "prompt_tokens": 11,
  "total_tokens": 473,
  "prompt_tokens_details": {
    "cached_tokens": 0,
    "audio_tokens": 0
  },
  "completion_tokens_details": {
    "reasoning_tokens": 0,
    "audio_tokens": 0,
    "accepted_prediction_tokens": 0,
    "rejected_prediction_tokens": 0
  }
}
I'm going to trim out any keys that have a value of 0, and any nested blocks where everything is zero. I'm also going to pull out |
{
  "completion_tokens": 421,
  "prompt_tokens": 30791,
  "total_tokens": 31212,
  "prompt_tokens_details": {
    "cached_tokens": 30592,
    "audio_tokens": 0
  },
  "completion_tokens_details": {
    "reasoning_tokens": 0,
    "audio_tokens": 0,
    "accepted_prediction_tokens": 0,
    "rejected_prediction_tokens": 0
  }
}
Becomes:
{
  "prompt_tokens_details": {"cached_tokens": 30592}
}
Using code I got Code Interpreter to write and test for me: https://chatgpt.com/share/673d4727-a148-8006-a11a-485bbe2822d0
def simplify_usage_dict(d):
    # Recursively remove keys with value 0 and empty dictionaries
    def remove_empty_and_zero(obj):
        if isinstance(obj, dict):
            cleaned = {
                k: remove_empty_and_zero(v)
                for k, v in obj.items()
                if v != 0 and v != {}
            }
            return {k: v for k, v in cleaned.items() if v is not None and v != {}}
        return obj
    return remove_empty_and_zero(d) or {} |
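For reference, the sketch below shows one way the whole transformation in the example above could fit together: take the headline counts off the top level, then prune zeros from what remains. `split_usage` and `prune_zeros` are hypothetical names, and `prune_zeros` is a condensed variant written for this illustration, not the function above. It assumes an OpenAI-style block with `prompt_tokens` and `completion_tokens` keys.

```python
def prune_zeros(value):
    """Recursively drop keys whose value is 0 or an empty dict."""
    if not isinstance(value, dict):
        return value
    pruned = {k: prune_zeros(v) for k, v in value.items()}
    return {k: v for k, v in pruned.items() if v != 0 and v != {}}


def split_usage(usage):
    """Return (input_tokens, output_tokens, remaining_details)."""
    remaining = dict(usage)  # shallow copy so the caller's dict is untouched
    input_tokens = remaining.pop("prompt_tokens")
    output_tokens = remaining.pop("completion_tokens")
    remaining.pop("total_tokens", None)  # derivable from the other two
    return input_tokens, output_tokens, prune_zeros(remaining)


example = {
    "completion_tokens": 421,
    "prompt_tokens": 30791,
    "total_tokens": 31212,
    "prompt_tokens_details": {"cached_tokens": 30592, "audio_tokens": 0},
    "completion_tokens_details": {
        "reasoning_tokens": 0,
        "audio_tokens": 0,
        "accepted_prediction_tokens": 0,
        "rejected_prediction_tokens": 0,
    },
}

input_tokens, output_tokens, details = split_usage(example)
print(details)  # {'prompt_tokens_details': {'cached_tokens': 30592}}
```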
Idea: add |
Most APIs return the number of input and output tokens used (and sometimes a more detailed breakdown of those categories). I want to provide an abstraction over those to make it easier to implement tools on top of LLM that do token accounting.