llm schemas family of commands, similar to llm templates #781

Closed
simonw opened this issue Feb 27, 2025 · 9 comments
Labels
enhancement, schemas

Comments

Owner

simonw commented Feb 27, 2025

For viewing and managing saved schemas.

simonw added the enhancement and schemas labels Feb 27, 2025
simonw added this to the 0.23 (schemas) milestone Feb 27, 2025

simonw commented Feb 27, 2025

Originally I had planned to have these as schema files saved on disk, for consistency with how llm templates works.

On further thought though, the way the schemas database table works (with content hash IDs) makes this much more tempting as an in-database thing. Maybe templates should have been that way already.

llm/docs/logging.md

Lines 224 to 227 in f5c2cfb

CREATE TABLE [schemas] (
[id] TEXT PRIMARY KEY,
[content] TEXT
);

Maybe add a new unique but nullable alias column to that table? That way people could configure aliases for their stored schemas, much nicer than pasting around those hex IDs (4d1da05ad315ee72537b4f2c1f2361c4 etc).
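
A rough sketch of how the alias idea could sit next to content-hash IDs. Everything beyond the two-column `schemas` table is an assumption made for illustration: the `alias` column does not exist yet, `schema_id`, `store_schema` and `resolve_schema` are hypothetical helper names, and the 16-byte BLAKE2b digest was picked only because it yields 32 hex characters like the IDs above (the actual hashing scheme is not stated in this thread).

```python
import hashlib
import json
import sqlite3


def schema_id(schema: dict) -> str:
    # Assumption: the ID is a hash of the compact JSON serialization.
    compact = json.dumps(schema, separators=(",", ":"))
    return hashlib.blake2b(compact.encode("utf-8"), digest_size=16).hexdigest()


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE [schemas] ([id] TEXT PRIMARY KEY, [content] TEXT)")
# Hypothetical alias column. SQLite cannot add a UNIQUE column with
# ALTER TABLE, so uniqueness comes from a separate index. A unique index
# still allows any number of NULL aliases.
db.execute("ALTER TABLE [schemas] ADD COLUMN [alias] TEXT")
db.execute("CREATE UNIQUE INDEX [idx_schemas_alias] ON [schemas] ([alias])")


def store_schema(db, schema, alias=None):
    """Insert the schema if it is new, optionally attach an alias, return its ID."""
    sid = schema_id(schema)
    content = json.dumps(schema, separators=(",", ":"))
    # Same content always hashes to the same ID, so re-storing is a no-op.
    db.execute(
        "INSERT OR IGNORE INTO [schemas] (id, content) VALUES (?, ?)",
        (sid, content),
    )
    if alias is not None:
        db.execute("UPDATE [schemas] SET alias = ? WHERE id = ?", (alias, sid))
    return sid


def resolve_schema(db, id_or_alias):
    """Look a schema up by hex ID or by alias; return None if nothing matches."""
    row = db.execute(
        "SELECT content FROM [schemas] WHERE id = ? OR alias = ?",
        (id_or_alias, id_or_alias),
    ).fetchone()
    return json.loads(row[0]) if row else None
```

With something like this, `resolve_schema(db, "dogs")` and `resolve_schema(db, "<hex id>")` would return the same schema, and trying to reuse an alias for a different schema would raise `sqlite3.IntegrityError`.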


simonw commented Feb 27, 2025

As a reminder, the template commands are:

llm templates --help

Usage: llm templates [OPTIONS] COMMAND [ARGS]...

  Manage stored prompt templates

Options:
  --help  Show this message and exit.

Commands:
  list*  List available prompt templates
  edit   Edit the specified prompt template using the default $EDITOR
  path   Output the path to the templates directory
  show   Show the specified prompt template

llm templates list --help

Usage: llm templates list [OPTIONS]

  List available prompt templates

Options:
  --help  Show this message and exit.

llm templates show --help

Usage: llm templates show [OPTIONS] NAME

  Show the specified prompt template

Options:
  --help  Show this message and exit.

llm templates edit --help

Usage: llm templates edit [OPTIONS] NAME

  Edit the specified prompt template using the default $EDITOR

Options:
  --help  Show this message and exit.


simonw commented Feb 27, 2025

  • llm schemas list - lists all schemas stored in the database. Looks like llm templates list. Supports multiple -q options.
  • llm schemas show ID - shows that schema

I'm not going to have edit because schemas are immutable once stored.

In the future this command will grow aliases support but I'm going to skip that for the initial release.


simonw commented Feb 27, 2025

Built a prototype:

llm schemas
[
  {
    "id": "b39546ea507992de3de97f7db00c5e16",
    "content": "{\"$schema\":\"http://json-schema.org/draft-07/schema#\",\"type\":\"object\",\"properties\":{\"dogs\":{\"type\":\"array\",\"items\":{\"type\":\"object\",\"properties\":{\"name\":{\"type\":\"string\",\"minLength\":1},\"bio\":{\"type\":\"string\",\"minLength\":1}},\"required\":[\"name\",\"bio\"],\"additionalProperties\":false}}},\"required\":[\"dogs\"],\"additionalProperties\":false}"
  },
  {
    "id": "4d1da05ad315ee72537b4f2c1f2361c4",
    "content": "{\"type\":\"object\",\"properties\":{\"dogs\":{\"type\":\"array\",\"items\":{\"type\":\"object\",\"properties\":{\"name\":{\"type\":\"string\"},\"bio\":{\"type\":\"string\"}}}}}}"
  },
  {
    "id": "1291b39a7c4c12ab8cbc22de576d6613",
    "content": "{\"type\":\"object\",\"properties\":{\"ui_elements\":{\"type\":\"array\",\"items\":{\"type\":\"object\",\"properties\":{\"name\":{\"type\":\"string\"},\"detailed_description\":{\"type\":\"string\"}}}}}}"
  },
  {
    "id": "dc0de7a644d7cf3b6a43fedc34b0354d",
    "content": "{\"type\":\"object\",\"properties\":{\"properties\":{\"name\":{\"type\":\"string\"},\"bio\":{\"type\":\"string\"}}}}"
  },
  {
    "id": "ac43af6c272634e387d2b8116dbc6ab9",
    "content": "{\"type\":\"object\",\"properties\":{\"name\":{\"type\":\"string\"},\"bio\":{\"type\":\"string\"}}}"
  },
  {
    "id": "5e49eb108d7f9951a350ea63ce87ec24",
    "content": "{\"type\":\"object\",\"properties\":{\"segments\":{\"type\":\"array\",\"items\":{\"type\":\"object\",\"properties\":{\"speaker_name\":{\"type\":\"string\"},\"spoken_text\":{\"type\":\"string\"},\"timestamp_hh_mm_ss\":{\"type\":\"string\"}}}}}}"
  },
  {
    "id": "b888a05042c403ae5d22e2f099f221b7",
    "content": "{\"type\":\"object\",\"properties\":{\"segments\":{\"type\":\"array\",\"items\":{\"type\":\"object\",\"properties\":{\"speaker_name\":{\"type\":\"string\"},\"spoken_text\":{\"type\":\"string\"},\"timestamp_mm_ss\":{\"type\":\"string\"}},\"required\":[\"speaker_name\",\"spoken_text\",\"timestamp_mm_ss\"]}}}}"
  },
  {
    "id": "9efb6fb9edc49fdd48bdbccdf34b1ecc",
    "content": "{\"type\":\"object\",\"properties\":{\"segments\":{\"type\":\"array\",\"items\":{\"type\":\"object\",\"properties\":{\"speaker_name\":{\"type\":\"string\"},\"spoken_text\":{\"type\":\"string\"},\"timestamp_mm_ss\":{\"type\":\"string\"}}}}}}"
  },
  {
    "id": "a75b7b3f00e065247e6e364304338aa5",
    "content": "{\"$schema\":\"http://json-schema.org/draft-07/schema#\",\"type\":\"object\",\"properties\":{\"dogs\":{\"type\":\"array\",\"items\":{\"type\":\"object\",\"properties\":{\"name\":{\"type\":\"string\",\"minLength\":1},\"ten_word_bio\":{\"type\":\"string\",\"minLength\":1}},\"required\":[\"name\",\"ten_word_bio\"],\"additionalProperties\":false}}},\"required\":[\"dogs\"],\"additionalProperties\":false}"
  },
  {
    "id": "1a89f66bee077d318fa38e99c6fb7abe",
    "content": "{\"properties\":{\"bio\":{\"type\":\"string\"},\"name\":{\"type\":\"string\"}},\"type\":\"object\"}"
  },
  {
    "id": "520f7aabb121afd14d0c6c237b39ba2d",
    "content": "{\"properties\":{\"dogs\":{\"items\":{\"properties\":{\"bio\":{\"type\":\"string\"},\"name\":{\"type\":\"string\"}},\"type\":\"object\"},\"type\":\"array\"}},\"type\":\"object\"}"
  }
]

Which shows me that even just truncating the schema content isn't going to be useful enough - truncate most of those and you get `{"$schema":"http://json-schema.org/draft-07/schema#","type":"object","properties":{"dogs":{"type":"array"` which is mostly boilerplate.


simonw commented Feb 27, 2025

I got Claude to build me a JSON schema summary function: https://claude.ai/share/f794bdc8-94ca-4d33-94f6-3d59cca58ebb

Looks like this:

b39546ea507992de3de97f7db00c5e16: {dogs: [{name, bio}]}
4d1da05ad315ee72537b4f2c1f2361c4: {dogs: [{name, bio}]}
1291b39a7c4c12ab8cbc22de576d6613: {ui_elements: [{name, detailed_description}]}
dc0de7a644d7cf3b6a43fedc34b0354d: {properties}
ac43af6c272634e387d2b8116dbc6ab9: {name, bio}
5e49eb108d7f9951a350ea63ce87ec24: {segments: [{speaker_name, spoken_text, timestamp_hh_mm_ss}]}
b888a05042c403ae5d22e2f099f221b7: {segments: [{speaker_name, spoken_text, timestamp_mm_ss}]}
9efb6fb9edc49fdd48bdbccdf34b1ecc: {segments: [{speaker_name, spoken_text, timestamp_mm_ss}]}
a75b7b3f00e065247e6e364304338aa5: {dogs: [{name, ten_word_bio}]}
1a89f66bee077d318fa38e99c6fb7abe: {bio, name}
520f7aabb121afd14d0c6c237b39ba2d: {dogs: [{bio, name}]}
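
The generated function itself isn't reproduced in this thread. As a minimal sketch of the idea, assuming only `type`, `properties` and `items` need to be understood, a recursive walk along these lines produces the same shape of summary. `schema_summary` is a hypothetical name, not necessarily what ended up in the codebase.

```python
import json


def schema_summary(schema):
    """Return a compact one-line summary of a JSON schema dict."""
    if not isinstance(schema, dict):
        return ""
    schema_type = schema.get("type")
    if schema_type == "object":
        parts = []
        for name, prop in schema.get("properties", {}).items():
            nested = schema_summary(prop)
            # Scalars (and anything without a recognised type) show just the name.
            parts.append(f"{name}: {nested}" if nested else name)
        return "{" + ", ".join(parts) + "}"
    if schema_type == "array":
        return "[" + schema_summary(schema.get("items", {})) + "]"
    # Strings, numbers, booleans and so on add nothing to the summary.
    return ""


content = (
    '{"type":"object","properties":{"dogs":{"type":"array","items":'
    '{"type":"object","properties":{"name":{"type":"string"},'
    '"ten_word_bio":{"type":"string"}}}}}}'
)
print(schema_summary(json.loads(content)))  # {dogs: [{name, ten_word_bio}]}
```

This also explains the `{properties}` line above: that schema has a property literally called `properties` whose value carries no `type`, so only the name is shown.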


simonw commented Feb 27, 2025

I'm going to add a count of responses for each schema, and a most recently used.


simonw commented Feb 27, 2025

Output currently looks like this, not beautiful but good enough for the moment:

- id: 4d1da05ad315ee72537b4f2c1f2361c4
  summary: |
    {dogs: [{name, bio}]}
  usage: |
    6 times, most recently 2025-02-27T01:37:40.117678+00:00
- id: 1291b39a7c4c12ab8cbc22de576d6613
  summary: |
    {ui_elements: [{name, detailed_description}]}
  usage: |
    2 times, most recently 2025-02-27T03:04:34.326711+00:00
- id: dc0de7a644d7cf3b6a43fedc34b0354d
  summary: |
    {properties}
  usage: |
    1 time, most recently 2025-02-27T04:09:59.834447+00:00
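
For reference, the usage line can be computed with a single aggregate query. This is only a sketch under assumptions: it supposes that the `responses` table has a `schema_id` column pointing at `schemas.id` and a `datetime_utc` column holding ISO 8601 strings (so `MAX()` on text picks the latest one), and `schema_usage` and `usage_line` are hypothetical helper names.

```python
import sqlite3


def schema_usage(db):
    """Return (schema_id, times_used, most_recent) for every stored schema."""
    return db.execute(
        """
        SELECT schemas.id,
               COUNT(responses.id) AS times_used,
               MAX(responses.datetime_utc) AS most_recent
        FROM schemas
        LEFT JOIN responses ON responses.schema_id = schemas.id
        GROUP BY schemas.id
        ORDER BY most_recent
        """
    ).fetchall()


def usage_line(times_used, most_recent):
    """Format a count and timestamp the way the usage field reads above."""
    if times_used == 0:
        return "never used"
    noun = "time" if times_used == 1 else "times"
    return f"{times_used} {noun}, most recently {most_recent}"


# Tiny in-memory demo with made-up rows (assumed table layout).
db = sqlite3.connect(":memory:")
db.executescript(
    """
    CREATE TABLE schemas (id TEXT PRIMARY KEY, content TEXT);
    CREATE TABLE responses (id TEXT PRIMARY KEY, schema_id TEXT, datetime_utc TEXT);
    """
)
db.executemany(
    "INSERT INTO schemas (id, content) VALUES (?, ?)",
    [("aaa", "{}"), ("bbb", "{}")],
)
db.executemany(
    "INSERT INTO responses (id, schema_id, datetime_utc) VALUES (?, ?, ?)",
    [
        ("r1", "aaa", "2025-02-27T01:37:40+00:00"),
        ("r2", "aaa", "2025-02-27T03:04:34+00:00"),
    ],
)
for sid, times_used, most_recent in schema_usage(db):
    print(sid, usage_line(times_used, most_recent))
```

The `LEFT JOIN` keeps schemas that have never been used in the listing; whether those should be shown at all is a separate design question.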

And you can search:

llm schemas list -q ten
- id: a75b7b3f00e065247e6e364304338aa5
  summary: |
    {dogs: [{name, ten_word_bio}]}
  usage: |
    3 times, most recently 2025-02-27T15:02:14.520661+00:00
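
A sketch of how multiple `-q` terms might translate into SQL. The thread doesn't say whether several terms are combined with AND or OR; this assumes AND, matches against the raw `content` column, and uses SQLite's `instr()` rather than `LIKE` so that underscores in property names are not treated as wildcards. `search_schemas` is a hypothetical name.

```python
import sqlite3


def search_schemas(db, queries):
    """Return (id, content) rows whose content contains every search term."""
    sql = "SELECT id, content FROM schemas"
    params = list(queries)
    if params:
        # One substring test per term, all of which must match (assumed AND).
        sql += " WHERE " + " AND ".join("instr(content, ?) > 0" for _ in params)
    sql += " ORDER BY id"
    return db.execute(sql, params).fetchall()


# In-memory demo with two sample rows; the IDs are placeholders.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE schemas (id TEXT PRIMARY KEY, content TEXT)")
db.executemany(
    "INSERT INTO schemas (id, content) VALUES (?, ?)",
    [
        ("s1", '{"type":"object","properties":{"name":{"type":"string"},"bio":{"type":"string"}}}'),
        ("s2", '{"type":"object","properties":{"segments":{"type":"array","items":{"type":"object","properties":{"speaker_name":{"type":"string"}}}}}}'),
    ],
)
for row in search_schemas(db, ["name", "bio"]):
    print(row[0])
```

Note that `instr()` is case-sensitive, so a case-insensitive search would need the content and the terms lowercased first.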

simonw added a commit that referenced this issue Feb 27, 2025

simonw commented Feb 27, 2025

% llm schemas show a75b7b3f00e065247e6e364304338aa5
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "properties": {
    "dogs": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "name": {
            "type": "string",
            "minLength": 1
          },
          "ten_word_bio": {
            "type": "string",
            "minLength": 1
          }
        },
        "required": [
          "name",
          "ten_word_bio"
        ],
        "additionalProperties": false
      }
    }
  },
  "required": [
    "dogs"
  ],
  "additionalProperties": false
}


simonw commented Feb 27, 2025

Idea: add -q multi option to llm schemas show and show all schemas that match that search?

Better would be an option on llm schemas list which causes the full schemas to be output.

simonw added a commit that referenced this issue Feb 27, 2025
simonw closed this as completed in edc9e2d Feb 27, 2025
simonw added a commit that referenced this issue Feb 28, 2025