Added supportedInput and supportedOutput fields #10

Open: 2 commits to be merged into base branch main

Conversation

marco-agile
Contributor

πŸ“ Summary

This pull request introduces two new optional fields to the AgentDescriptor schema:
β€’ supportedInput: defines which input types the agent can handle.
β€’ supportedOutput: defines which output types the agent can produce.

Both fields accept a checklist of the following values:
β€’ text
β€’ images
β€’ video
β€’ files

📌 Motivation

These additions aim to make agent capabilities more explicit and machine-readable, enabling better discoverability, filtering, and compatibility validation across agentic systems.

🔧 Changes
• Added supportedInput and supportedOutput as arrays of strings with fixed enums to components.schemas.AgentDescriptor.

✅ Example

supportedInput: ["text", "images"]
supportedOutput: ["text", "files"]

📚 Documentation

The OpenAPI specification has been updated accordingly in the AgentDescriptor schema block.
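
As a rough sketch of what this change could look like in the spec (the full diff is only partly visible in this thread), the two fields might be declared as below. The `uniqueItems` constraint and the singular `example` keyword are assumptions, not taken from the PR; OpenAPI 3.0 schema objects use `example` rather than `examples`.

```yaml
components:
  schemas:
    AgentDescriptor:
      type: object
      properties:
        # Optional: input types the agent can handle.
        supportedInput:
          type: array
          uniqueItems: true        # assumption: duplicates carry no meaning
          items:
            type: string
            enum: [text, images, video, files]
          example: [text, images]
        # Optional: output types the agent can produce.
        supportedOutput:
          type: array
          uniqueItems: true        # assumption, as above
          items:
            type: string
            enum: [text, images, video, files]
          example: [text, files]
```

Neither field appears under `required`, which keeps both optional as the summary states.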

Contributor

tmnd1991 left a comment

The proposal makes sense. Thinking a bit more broadly, though: shouldn't we rely on HTTP Content-Types for this kind of thing, so that we don't reinvent the wheel? What's your position?

@@ -46,6 +46,8 @@ The agent descriptor follows an **OpenAPI 3.0-based** schema to enable easy docu
- `ModelDrivenWorkflow`: the agent is implement as a workflow. The execution through the workflow is controlled by LLMs.
- `toolsUse` *(string)* – Define if the system can use tools in order to execute its task. Values: true/false.
- `learningCapability` *(string)* – Learning approach (None, Reinforcement Learning, Fine Tuning).
- `supportedInput` *(array)* – List of supported output formats or content types (e.g., text, images, video, files).
Contributor

Suggested change
- `supportedInput` *(array)* – List of supported output formats or content types (e.g., text, images, video, files).
- `supportedInput` *(array)* – List of supported input formats or content types (e.g., text, images, video, files).

Comment on lines +263 to +265
type: string
enum: [text, images, video, files]
examples: [["text", "images"]]
Contributor

Can you declare this as a reusable type and reference it from both the input and output fields?
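
One way to read this request, sketched under assumptions: the enum moves into its own schema under `components.schemas`, and both fields point to it with `$ref`. The name `ContentCategory` is hypothetical; the thread does not name the shared type.

```yaml
components:
  schemas:
    # Hypothetical shared type holding the enum once.
    ContentCategory:
      type: string
      enum: [text, images, video, files]

    AgentDescriptor:
      type: object
      properties:
        supportedInput:
          type: array
          items:
            $ref: '#/components/schemas/ContentCategory'
        supportedOutput:
          type: array
          items:
            $ref: '#/components/schemas/ContentCategory'
```

With this layout, adding a new category later means editing a single enum rather than two copies that can drift apart.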

@marco-agile
Contributor Author

I agree that using standard MIME types makes sense in principle.
That said, I think relying only on them might be too rigid, especially for agents that handle broad categories like video or images, where formats can vary and evolve.
A possible compromise could be to define a category field (e.g. video, text, image) and optionally list specific MIME types when needed. What do you think about it?
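
A minimal sketch of that compromise, for discussion only: each entry carries a required `category` and an optional list of MIME types. The names `SupportedContent` and `mimeTypes` are hypothetical, and the category values are limited to the three named in the comment above.

```yaml
components:
  schemas:
    # Hypothetical type: a broad category plus optional concrete formats.
    SupportedContent:
      type: object
      required: [category]
      properties:
        category:
          type: string
          enum: [text, image, video]   # illustrative, not a final list
        mimeTypes:
          type: array                  # optional; omit to mean "any format in the category"
          items:
            type: string
          example: [video/mp4, video/webm]

    AgentDescriptor:
      type: object
      properties:
        supportedInput:
          type: array
          items:
            $ref: '#/components/schemas/SupportedContent'
          example:
            - category: text
            - category: video
              mimeTypes: [video/mp4, video/webm]
```

An agent that only cares about broad categories would list `category` alone, while one with strict format requirements could add standard MIME types such as `video/mp4`.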


2 participants