version 3.0: additional formats #607

Closed
ralfhandl opened this issue Mar 21, 2016 · 27 comments

@ralfhandl
Contributor

ralfhandl commented Mar 21, 2016

As there are several issues proposing new formats, here is a list showing the possible full future picture:

| Common Name | Type | Format | Comments |
| --- | --- | --- | --- |
| octet/(unsigned) byte | integer | uint8 | new: unsigned 8 bits |
| signed byte | integer | int8 | new: signed 8 bits |
| short | integer | int16 | new: signed 16 bits |
| integer | integer | int32 | signed 32 bits |
| long | integer | int64 | signed 64 bits |
| big integer | integer | | |
| float/single | number | float | |
| double | number | double | |
| decimal | number | decimal | new: decimal floating-point number, recipient-side internal representation as a binary floating-point number may lead to rounding errors |
| big decimal | number | | |
| string | string | | |
| byte | string | byte | base64 encoded characters |
| url-safe binary | string | base64url | new: base64url encoded characters - #606 |
| binary | string | binary | any sequence of octets |
| boolean | boolean | | |
| date | string | date | As defined by full-date - RFC3339 |
| dateTime | string | date-time | As defined by date-time - RFC3339 |
| time (of day) | string | time | new: As defined by partial-time - RFC3339 - #358 |
| duration | string | duration | new: As defined by xs:dayTimeDuration - XML Schema 1.1 - #359 |
| uuid | string | uuid | new: Universally Unique Identifier (UUID) RFC4122 |
| password | string | password | Used to hint UIs the input needs to be obscured. |

2023-03-24: all of the above-mentioned formats are now registered in the OpenAPI Initiative Formats Registry.

@amarzavery

👍

@DavidBiesack

The format modifier is optional. Please add integer with no format -- not confined to 32/64 bit (a.k.a. BigInteger) and also number without format which is also not constrained to floating point 32/64 bit (a.k.a. BigDecimal).

@DavidBiesack

Rather than tightly couple to uuid format, I suggest just a generic id format that means an opaque identifier string. Whether an ID is a UUID, a hash, a database primary key, or something else seems more like an implementation detail that should be hidden from the API specification. Many APIs make extensive use of id parameters/members but do not over-specify them as UUID strings (often because they are not UUIDs - look at bit.ly hashes, for example).

Slightly related, I would prefer not overloading format with what is really role or attribute. While Swagger 2.0 has password, it is not a format but a role that is orthogonal to format. Ditto for other PII like social security number, government id number, etc. (A more generic role for these might be masked which is a UI hint.)

Thus, while uuid is a format, id (if it were to replace uuid) is a role, not a format.

@ralfhandl
Contributor Author

@DavidBiesack I actually intended uuid as a format, i.e. a string that has the pattern

uuid = 8HEXDIG "-" 4HEXDIG "-" 4HEXDIG "-" 4HEXDIG "-" 12HEXDIG

No objections to adding the concept of a role and an id role. Still would require a format uuid.
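The uuid pattern quoted above can be transcribed into a regular expression. This is a minimal sketch in Python, not taken from any specification or tool; the names `UUID_PATTERN` and `is_uuid_format` are illustrative only. It checks the 8-4-4-4-12 layout and nothing else (no version or variant bits).

```python
import re

# Transcription of the ABNF rule: 8-4-4-4-12 hexadecimal digits.
# ABNF literals are case-insensitive, so both letter cases are accepted.
UUID_PATTERN = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)


def is_uuid_format(value: str) -> bool:
    """Return True if value has the 8-4-4-4-12 hex layout."""
    return UUID_PATTERN.fullmatch(value) is not None
```

An opaque id such as a short hash would fail this check, which is the distinction between the uuid format and an id role discussed in this thread.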

@IvanGoncharov
Contributor

@ralfhandl Such things should be added into JSON Schema standard or extracted into a separate spec.
I can imagine a situation where JSON Schema Draft 5 adds a format with the same name but different validation rules.
Another problem arises when, for example, you want to validate your DB or user input on the client side. For that purpose you would use pure JSON Schema, and it would be strange to use OpenAPI-specific types there.
And last one, tooling support for such formats will be limited to only OpenAPI tools.

I'm not against extending formats; I'm just saying that the spec should reference some external doc and not define them internally.
IMHO, OpenAPI should describe API-specific stuff and reuse existing data validation specs.

@ralfhandl
Contributor Author

@IvanGoncharov Looking at the list of formats supported by Swagger 2.0, we only find one format that is defined by JSON Schema: date-time. The proposed new formats are in line with the existing swagger-specific formats, so adding them would not break new ground.

Formats are an explicit extension point of JSON Schema for semantic validation, and the OpenAPI Specification could be one of the "authoritative resources that accurately describes interoperable semantic validation".

I'm not aware of other external documents describing formats for semantic validation in JSON Schema.

@DavidBiesack

I oppose requiring id strings to be UUID. (I'm not sure if that was what you meant by "Still would require a format uuid.")

As noted, I also think it is not a good idea to expose internal implementation details such as UUID format in an API definition. id strings (path parameters, query parameters, fields) should be no more than opaque string IDs. Over-specifying them as UUID is fragile and does not allow for non-breaking changes if the underlying implementation changes. Again, look at the bit.ly API which uses hash ID strings, not UUIDs.

@ralfhandl
Contributor Author

I don't require id strings to be UUIDs, I only require uuid strings to be UUIDs. I see the string format uuid similar to the string format date-time - as a validation rule that restricts the allowed / possible values of a string parameter or property. It tells the client that some string values will be accepted, and others will be refused.

As you pointed out above the concept of an "id" is a role and not a format. So we should introduce this new concept in a new, specific way and not mix it up with format.

Could it be that your concept of an "id" is related to the concept of a "primary key"? See #587

@whitlockjc
Member

What do you expect to gain by formally supporting more formats? I realize the format could play a role in code generation, mock data generation, validation and potentially more, so I figured I'd ask. I also ask because while writing Swagger tooling in the past, custom formats were easy to support without OpenAPI/Swagger being involved, especially since OpenAPI/Swagger does not dictate or limit which formats you can/cannot use.

Here are a few examples of Node.js code registering custom JSON Schema formats for various reasons:

@DavidBiesack

Thanks, @ralfhandl for confirming -- makes sense for uuid to be an (optional) format, and id to be a role.

To answer your second question, an id may be a primary key, or there may be a mapping between the two. I want the resource and representation to remain decoupled from the implementation. I'll quote Mike Amundsen:

"Your storage model is not your object model is not your resource model is not your representation model."

@whitlockjc
Member

"Your storage model is not your object model is not your resource model is not your representation model."

👍

@ralfhandl
Contributor Author

@whitlockjc Code generation, mock data generation, validation, easier use of tools that know these formats out-of-the-box, better interoperability due to common agreement on what is e.g. a time or duration, ...

There seems to be demand for more pre-defined formats, see #358, #359, #606, and https://github.com/json-schema/json-schema/wiki/%22format%22-suggestions.

@ePaul
Contributor

ePaul commented Mar 23, 2016

We are currently using type: number, format: decimal for money values (to make it explicit that these ought to not be mapped to some binary floating point number). Not sure if this needs standardizing.
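The concern about mapping money values to binary floating point can be illustrated with a short sketch. This is an illustration in Python using only the standard library, not a description of any particular tool; the `amount` field and the sample payload are made up for the example.

```python
import json
from decimal import Decimal

# Binary floating point cannot represent 0.1 or 0.2 exactly,
# so the sum is not exactly 0.3.
float_is_exact = (0.1 + 0.2 == 0.3)

# Decimal arithmetic on the same literals is exact.
decimal_is_exact = (Decimal("0.10") + Decimal("0.20") == Decimal("0.30"))

# A consumer that honours `format: decimal` could parse JSON numbers
# straight into Decimal instead of float.
payload = json.loads('{"amount": 19.99}', parse_float=Decimal)
```

Here `float_is_exact` is False and `decimal_is_exact` is True, and `payload["amount"]` is a `Decimal` that keeps the digits as written in the JSON text.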

@ralfhandl
Contributor Author

@ePaul We came up with the same solution for numeric values with decimal mantissa when mapping primitive types to JSON Schema types and formats. If we can find a third person who did this, it's a pattern :-)

We also intended to add a precision extension keyword in our JSON Schema representation for conveying the length of the decimal mantissa, e.g. precision: 34 for a 128-bit decimal floating-point type.

Is that something that you'd also find useful?

@amarzavery

Some of our customers needed decimal format for specifying monetary values. Hence we ended up supporting format decimal in our project AutoRest.

@whitlockjc
Member

I think we're in agreement that there are many people using many formats outside of the documented ones. The question is whether this belongs in the OpenAPI specification as some sort of "formal support" or whether this is a tooling problem. It could very well be both.

I will tag this appropriately so we can discuss.

@amarzavery

For code generation we need well defined formats. Since swagger spec defines the REST API, it becomes a contract that server and client need to abide by. It is always nice if your contract is explicit about everything.

Just making an analogy to make my point:
Imagine leasing a house where the contract has many loose ends left for the owner and tenant to interpret as per their choice. This wouldn't be a good scenario.

@fehguy
Contributor

fehguy commented Mar 23, 2016

It seems that type must be a constrained type. format can be interpreted by the code generation. For example:

type: string
format: uuid

may fall back to String if UUID is not supported. But you cannot invent a type.

If that's not the mentality, then we must constrain all formats to a fixed set, which may be hard to support inside the OAI.
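The fallback idea can be sketched as a lookup keyed on the (type, format) pair that drops back to the bare type when the format is unknown. This is a hypothetical sketch in Python; the mapping tables, the target type names, and `resolve_type` are assumptions made for illustration, not how any existing code generator works.

```python
# Hypothetical (type, format) -> target-language type names.
FORMAT_TYPES = {
    ("string", "uuid"): "UUID",
    ("string", "date-time"): "OffsetDateTime",
    ("integer", "int32"): "int",
    ("integer", "int64"): "long",
}

# Fallbacks keyed on the constrained `type` alone (also hypothetical).
BASE_TYPES = {
    "string": "String",
    "integer": "BigInteger",
    "number": "BigDecimal",
    "boolean": "boolean",
}


def resolve_type(schema_type, schema_format=None):
    """Pick a target type; unknown formats fall back to the base type."""
    if schema_type not in BASE_TYPES:
        # `type` is a closed set: an invented type is an error.
        raise ValueError(f"unknown type: {schema_type!r}")
    return FORMAT_TYPES.get((schema_type, schema_format), BASE_TYPES[schema_type])
```

With these tables, `resolve_type("string", "uuid")` yields `"UUID"`, an unrecognized format on a string yields `"String"`, and an invented type raises an error.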

@whitlockjc
Member

Code generation, like validation, is a tooling concern to me and does not necessarily need OpenAPI changes, for the reasons I mentioned above. But one thing I just thought of that could make supporting this worthwhile would be if OpenAPI wanted to dictate a minimum set of formats all tools must support. I could see that being useful.

@fehguy
Contributor

fehguy commented Mar 23, 2016

@whitlockjc yes, and with a fallback to primitive types, if not supported. I don't think we should be inventing types.

In general, if we specify a format, we should dictate exactly what that is supposed to be. If a user expects a different behavior from a defined format, well, that violates the spec.

So... to make this concrete:

type: string
format: uuid

Should have a very specific format defined in the spec, specifically what @DavidBiesack mentioned:

uuid = 8HEXDIG "-" 4HEXDIG "-" 4HEXDIG "-" 4HEXDIG "-" 12HEXDIG

type: string
format: date-time

quite specifically says RFC3339 format (https://github.com/OAI/OpenAPI-Specification/blob/master/versions/2.0.md#data-types)

If I want to invent tonys-date-time then pretty much no tools will know what the heck to do with it, and would fall back to type: string.
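On the validation side, the same behaviour might look like the sketch below: known formats get a check, unknown ones impose no constraint beyond `type: string`. This is a hypothetical Python sketch; `FORMAT_CHECKS` and `check_string_format` are illustrative names, and the regular expression only checks the lexical shape of an RFC3339 date-time (it would not reject month 13, for instance).

```python
import re

# Lexical shape of an RFC3339 date-time: date, "T", time, optional
# fractional seconds, then "Z" or a numeric offset. Shape only, no ranges.
RFC3339_DATE_TIME = re.compile(
    r"[0-9]{4}-[0-9]{2}-[0-9]{2}[Tt]"
    r"[0-9]{2}:[0-9]{2}:[0-9]{2}(\.[0-9]+)?"
    r"([Zz]|[+-][0-9]{2}:[0-9]{2})"
)

# Hypothetical registry of the formats this tool knows about.
FORMAT_CHECKS = {
    "date-time": lambda s: RFC3339_DATE_TIME.fullmatch(s) is not None,
}


def check_string_format(value, fmt=None):
    """Apply the check for a known format; ignore unknown formats."""
    check = FORMAT_CHECKS.get(fmt)
    if check is None:
        return True  # unknown or absent format: only `type: string` applies
    return check(value)
```

A value tagged with an invented format such as tonys-date-time passes here simply because the tool has nothing to check it against.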

@mspiegel

mspiegel commented Jul 7, 2016

@ralfhandl What about cases where one wants to specify not the precision, which is equivalent to the number of significant digits (correct?), but the scale, i.e. the number of digits following the decimal point? I believe the scale is more appropriate for fixed-point arithmetic and the precision is more appropriate for arbitrary-precision arithmetic. Sanity check: does any of this make sense?

@ralfhandl
Contributor Author

ralfhandl commented Jul 12, 2016

@mspiegel This absolutely makes sense to me, a complete description of a decimal data type needs two facets:

  • precision - the maximum number of significant decimal digits in the mantissa
  • scale - the maximum number of decimal digits to the right of the decimal point - may be specified as variable

This covers the SQL data type DECIMAL(p,s) - precision: p, scale: s - as well as decimal floating-point types such as DECFLOAT34 - precision: 34, scale: variable.

I'd love to have both precision and scale as new keywords for specifying numeric types in addition to the existing minimum, maximum, and multipleOf, see #602.
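The two facets could be checked roughly as follows. This is a sketch in Python under stated assumptions: `fits_decimal` is a hypothetical helper, `scale=None` stands in for "variable", trailing zeros are not counted as significant, and with a fixed scale the integer part is limited to `precision - scale` digits as in SQL DECIMAL(p,s).

```python
from decimal import Decimal


def fits_decimal(value: Decimal, precision: int, scale=None) -> bool:
    """Check a Decimal against precision/scale facets (scale=None: variable)."""
    _sign, digits, exponent = value.as_tuple()
    if not isinstance(exponent, int):
        return False  # NaN and infinities have no digit count
    digits = list(digits)
    # Strip trailing zeros by hand; Decimal.normalize() would round
    # to the context precision and lose digits on long values.
    while len(digits) > 1 and digits[-1] == 0:
        digits.pop()
        exponent += 1
    if digits == [0]:
        return True  # zero fits any precision and scale
    fractional_digits = max(0, -exponent)
    integer_digits = max(0, len(digits) + exponent)
    if scale is None:
        # Variable scale: only the count of significant digits is limited.
        return len(digits) <= precision
    return fractional_digits <= scale and integer_digits <= precision - scale
```

For example, `Decimal("123.45")` fits precision 5 and scale 2, while `Decimal("123.456")` does not because it has three fractional digits.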

@webron
Member

webron commented Jul 21, 2016

Tackling PR: #741

@noirbizarre
Contributor

I think null is missing from this list

@webron
Member

webron commented Feb 2, 2017

Closing this in favor of #845.

@mma5997

mma5997 commented Mar 10, 2021

Is "format": "byte" any different from "format": "base64"?

Link1 ==> where it says to use byte
Link2 ==> has an example using base64

Which one should be used where, exactly?

@MikeRalphson
Member

The spec text is normative, except for the examples.

byte is correct for OAS 3.0.x - though as format is an open-ended field, base64 is also an allowable value.
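For reference, the byte format carries base64 encoded data, and the base64url format proposed at the top of this issue uses the URL-safe alphabet instead. A small Python sketch using the standard `base64` module shows the difference; the sample bytes are chosen only because they hit the two characters that differ between the alphabets.

```python
import base64

# Three bytes whose encoding uses the alphabet's last two symbols.
raw = bytes([0xFB, 0xFF, 0xFE])

# Standard alphabet (what `format: byte` carries): uses "+" and "/".
standard = base64.b64encode(raw).decode("ascii")

# URL-safe alphabet (base64url): uses "-" and "_" instead.
url_safe = base64.urlsafe_b64encode(raw).decode("ascii")

# Both decode back to the same octets.
round_trip_standard = base64.b64decode(standard)
round_trip_url_safe = base64.urlsafe_b64decode(url_safe)
```

Here `standard` is `"+//+"` and `url_safe` is `"-__-"`, so a consumer needs to know which of the two alphabets a string property uses.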
