Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

W-16837997 - Avro union fixes #2048

Merged
merged 3 commits into from
Sep 25, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
16 changes: 8 additions & 8 deletions adrs/0014-avro-support.md
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,7 @@ Accepted

## Context

Async 2.x supports AVRO Schemas, and we currently don't.
Async 2.x supports AVRO Schemas, and we currently don't.
We want to add support of AVRO Schemas:
- inside Async APIs
- as a standalone document
Expand Down Expand Up @@ -54,12 +54,12 @@ We've also added 3 AVRO-specific fields to the `AnyShape` Model via the `AvroFie

### Where we support and DON'T support AVRO Schemas
We Support AVRO Schemas (inline or inside a `$ref`):
- as a standalone document or file
- we encourage users to use the `.avsc` file type to indicate that's an avro file, for better suggestions and so on in the platform)
- as a standalone document or file
- we encourage users to use the `.avsc` file type to indicate that's an avro file
- must use the specific `AvroConfiguration`
- inside a message payload in an AsyncAPI
- the key `schemaFormat` MUST be declared and specify it's an AVRO payload
- we only support avro schema version 1.9.0
- **we only support avro schema version 1.9.0**
- the avro specific document `AvroSchemaDocument` can only be parsed with the specific `AvroConfiguration`

We don't support AVRO Schemas:
Expand All @@ -71,16 +71,16 @@ We'll use the Apache official libraries for JVM and JS, in the version 1.11.3, d

## Consequences / Constraints

The validation plugins differ in interfaces and implementations, and each has some constraints:
The validation libraries differ in interfaces and implementations, and each has some constraints:

### JVM avro validation plugin
### JVM avro validation library
- validation per se is not supported, we try to parse an avro schema and throw parsing results if there are any
- this means it's difficult to have location of where the error is thrown, we may give an approximate location from our end post-validation
- when a validation is thrown, the rest of the file is not being searched for more validations
- this is particularly important in large avro schemas, where many errors can be found but only one is shown

### Both JVM & JS validation plugins
### Both JVM & JS validation libraries
- `"default"` values are not being validated when the type is `bytes`, `map`, or `array`
- the validator treats as invalid an empty array as the default value for arrays (`"default": []`) even though the [Avro Schema Specification](https://avro.apache.org/docs/1.12.0/specification) has some examples with it
- if an avro record has a field that is a union that includes the root record itself (recursive reference) we fail to validate payloads (default value or payload validation) against that 'recursive union field' correctly. The root record works correctly. In the future we'll try to ignore the cases that we are now failing and/or show a warning instead.
- avro schemas of type union are not valid at the root level of the schema, but they are accepted as field types in a record or items in an array. For this reason, it is not possible to generate a `PayloadValidator` for an avro union shape.
- the avro libraries are very restrictives with the allowed characters in naming definition (names of records for example). They only allow letters, numbers and `_` chars. We are not modifying this behavior.
Original file line number Diff line number Diff line change
Expand Up @@ -11,7 +11,9 @@ case class AvroArrayShapeParser(map: YMap)(implicit ctx: AvroSchemaContext)
override def setMembers(anyShape: AnyShape): Unit =
shape.setWithoutId(ArrayShapeModel.Items, anyShape, Annotations.inferred())

override def parseMembers(e: YMapEntry): AnyShape = AvroTextParser(e.value).parse()
override def parseMembers(e: YMapEntry): AnyShape = {
AvroInlineTypeParser(e.value).parse()
}

override def parseSpecificFields(): Unit = {}
}
Original file line number Diff line number Diff line change
Expand Up @@ -2,17 +2,20 @@ package amf.apicontract.internal.spec.avro.parser.domain

import amf.apicontract.internal.spec.avro.parser.context.AvroSchemaContext
import amf.shapes.client.scala.model.domain.AnyShape
import org.yaml.model.{YMap, YNode, YScalar}
import org.yaml.model.{YMap, YNode, YScalar, YSequence}

case class AvroTextParser(node: YNode)(implicit ctx: AvroSchemaContext) extends AvroKeyExtractor {
case class AvroInlineTypeParser(node: YNode)(implicit ctx: AvroSchemaContext) extends AvroKeyExtractor {
def parse(): AnyShape = {
node.value match {
case scalar: YScalar =>
val avroType = scalar.text
val parsedShape = AvroTextTypeParser(scalar.text, None).parse()
annotatedAvroShape(parsedShape, avroType, node)
case map: YMap => // todo: putting an empty AnyShape when the union type is incorrect is kinda weird behavior
new AvroShapeParser(map).parse().getOrElse(AnyShape())
new AvroShapeParser(map).parse().getOrElse(AnyShape(map))
case seq: YSequence => // The only valid case here for a seq is a Union
AvroInlineUnionShapeParser(seq).parse()
case _ => AnyShape(node)
}
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -3,16 +3,34 @@ package amf.apicontract.internal.spec.avro.parser.domain
import amf.apicontract.internal.spec.avro.parser.context.AvroSchemaContext
import amf.core.client.scala.model.domain.AmfArray
import amf.core.internal.parser.domain.Annotations
import amf.shapes.client.scala.model.domain.UnionShape
import amf.shapes.client.scala.model.domain.{AnyShape, UnionShape}
import amf.shapes.internal.domain.metamodel.UnionShapeModel
import org.yaml.model.{YMap, YNode}
import org.yaml.model.{YMap, YNode, YSequence, YValue}

case class AvroUnionShapeParser(members: Seq[YNode], node: YNode)(implicit ctx: AvroSchemaContext)
extends AvroComplexShapeParser(node.as[YMap]) {
extends AvroComplexShapeParser(node.as[YMap])
with AvroUnionLikeShapeParser {
override val shape: UnionShape = UnionShape(node).withSynthesizeName("union")

override def parseSpecificFields(): Unit = {
val parsedMembers = members.map(node => AvroTextParser(node).parse())
val parsedMembers = parseMembers(members)
shape.setWithoutId(UnionShapeModel.AnyOf, AmfArray(parsedMembers, Annotations(node)), Annotations(node))
}
}

case class AvroInlineUnionShapeParser(seq: YSequence)(implicit ctx: AvroSchemaContext)
extends AvroUnionLikeShapeParser {

def parse(): AnyShape = {
val shape: UnionShape = UnionShape(seq).withSynthesizeName("union")
val parsedMembers = parseMembers(seq.nodes)
shape.setWithoutId(UnionShapeModel.AnyOf, AmfArray(parsedMembers, Annotations(seq)), Annotations(seq))
}

}

trait AvroUnionLikeShapeParser {
def parseMembers(members: Seq[YNode])(implicit ctx: AvroSchemaContext): Seq[AnyShape] = {
members.map(node => AvroInlineTypeParser(node).parse())
}
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
ModelId: file://amf-cli/shared/src/test/resources/avro/schemas/union-nullable-record-invalid.json
Profile:
Conforms: false
Number of results: 1

Level: Violation

- Constraint: http://a.ml/vocabularies/amf/parser#invalid-avro-schema
Message: Internal error during Avro validation: Error: invalid "null": 1234
Severity: Violation
Target: file://amf-cli/shared/src/test/resources/avro/schemas/union-nullable-record-invalid.json#/shape/unionNullableRecord
Property:
Range:
Location: file://amf-cli/shared/src/test/resources/avro/schemas/union-nullable-record-invalid.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
ModelId: file://amf-cli/shared/src/test/resources/avro/schemas/union-nullable-record-invalid.json
Profile:
Conforms: false
Number of results: 1

Level: Violation

- Constraint: http://a.ml/vocabularies/amf/parser#invalid-avro-schema
Message: invalid schema type: Invalid default for field nullableUnion: 1234 not a ["null","string"]
Severity: Violation
Target: file://amf-cli/shared/src/test/resources/avro/schemas/union-nullable-record-invalid.json#/shape/unionNullableRecord
Property:
Range:
Location: file://amf-cli/shared/src/test/resources/avro/schemas/union-nullable-record-invalid.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
ModelId: file://amf-cli/shared/src/test/resources/avro/schemas/union-root-invalid.json
Profile:
Conforms: false
Number of results: 1

Level: Violation

- Constraint: http://a.ml/vocabularies/amf/parser#invalid-avro-schema
Message: Internal error during Avro validation: Error: unknown type: ["string","int"]
Severity: Violation
Target: file://amf-cli/shared/src/test/resources/avro/schemas/union-root-invalid.json#/union/simpleUnion
Property:
Range:
Location: file://amf-cli/shared/src/test/resources/avro/schemas/union-root-invalid.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
ModelId: file://amf-cli/shared/src/test/resources/avro/schemas/union-root-invalid.json
Profile:
Conforms: false
Number of results: 1

Level: Violation

- Constraint: http://a.ml/vocabularies/amf/parser#invalid-avro-schema
Message: No type: {"name":"simpleUnion","type":["string","int"],"default":"something"}
Severity: Violation
Target: file://amf-cli/shared/src/test/resources/avro/schemas/union-root-invalid.json#/union/simpleUnion
Property:
Range:
Location: file://amf-cli/shared/src/test/resources/avro/schemas/union-root-invalid.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
ModelId: file://amf-cli/shared/src/test/resources/avro/schemas/union-simple-array-invalid.json
Profile: Avro
Conforms: false
Number of results: 1

Level: Violation

- Constraint: http://a.ml/vocabularies/amf/validation#example-validation-error
Message: '132' is not a valid value (of type '"string"') for '0.string'
Severity: Violation
Target: file://amf-cli/shared/src/test/resources/avro/schemas/union-simple-array-invalid.json#/array/default-array/array_1
Property: file://amf-cli/shared/src/test/resources/avro/schemas/union-simple-array-invalid.json#/array/default-array/array_1
Range: [(4,13)-(4,49)]
Location: file://amf-cli/shared/src/test/resources/avro/schemas/union-simple-array-invalid.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
ModelId: file://amf-cli/shared/src/test/resources/avro/schemas/union-simple-array-invalid.json
Profile: Avro
Conforms: false
Number of results: 1

Level: Violation

- Constraint: http://a.ml/vocabularies/amf/validation#schema-exception
Message: invalid schema type: Expected string. Got VALUE_NUMBER_INT
Severity: Violation
Target: null
Property: Expected string. Got VALUE_NUMBER_INT
Range: [(1,0)-(5,1)]
Location: file://amf-cli/shared/src/test/resources/avro/schemas/union-simple-array-invalid.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
ModelId: file://amf-cli/shared/src/test/resources/avro/schemas/union-simple-record-invalid.json
Profile:
Conforms: false
Number of results: 1

Level: Violation

- Constraint: http://a.ml/vocabularies/amf/parser#invalid-avro-schema
Message: Internal error during Avro validation: Error: invalid "string": true
Severity: Violation
Target: file://amf-cli/shared/src/test/resources/avro/schemas/union-simple-record-invalid.json#/shape/unionSimpleRecord
Property:
Range:
Location: file://amf-cli/shared/src/test/resources/avro/schemas/union-simple-record-invalid.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
ModelId: file://amf-cli/shared/src/test/resources/avro/schemas/union-simple-record-invalid.json
Profile:
Conforms: false
Number of results: 1

Level: Violation

- Constraint: http://a.ml/vocabularies/amf/parser#invalid-avro-schema
Message: invalid schema type: Invalid default for field simpleUnion: true not a ["string","int"]
Severity: Violation
Target: file://amf-cli/shared/src/test/resources/avro/schemas/union-simple-record-invalid.json#/shape/unionSimpleRecord
Property:
Range:
Location: file://amf-cli/shared/src/test/resources/avro/schemas/union-simple-record-invalid.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
{
"type": "record",
"name": "unionNullableRecord",
"namespace": "root",
"fields": [
{
"name": "nullableUnion",
"type": [ "null", "string"],
"default": 1234
}
]
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
{
"type": "record",
"name": "unionNullableRecord",
"namespace": "root",
"fields": [
{
"name": "nullableUnion",
"type": [ "null", "string"],
"default": null
}
]
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
{
"name": "simpleUnion",
"type": [ "string", "int"],
"default": "something"
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
{
"type": "array",
"items": [ "string", "int"],
"default": [{"string": 132}, {"string": "b"}]
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
{
"type": "array",
"items": [ "string", "int"],
"default": [{"string": "a"}, {"string": "b"}]
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
{
"type": "record",
"name": "unionSimpleRecord",
"namespace": "root",
"fields": [
{
"name": "simpleUnion",
"type": [ "string", "int"],
"default": true
}
]
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
{
"type": "record",
"name": "unionSimpleRecord",
"namespace": "root",
"fields": [
{
"name": "simpleUnion",
"type": [ "string", "int"],
"default": "something"
}
]
}
Original file line number Diff line number Diff line change
Expand Up @@ -165,6 +165,63 @@ class AvroSchemaValidationTest extends MultiPlatformReportGenTest {
)
}

test("validate simple union in record - valid") {
validate(
"union-simple-record-valid.json",
None,
configOverride = Some(config)
)
}

test("validate simple union in record - invalid") {
validate(
"union-simple-record-invalid.json",
Some("union-simple-record-invalid.report"),
configOverride = Some(config)
)
}

test("validate nullable union in record - valid") {
validate(
"union-nullable-record-valid.json",
None,
configOverride = Some(config)
)
}

test("validate nullable union in record - invalid") {
validate(
"union-nullable-record-invalid.json",
Some("union-nullable-record-invalid.report"),
configOverride = Some(config)
)
}

test("validate simple union in array - valid") {
validate(
"union-simple-array-valid.json",
None,
configOverride = Some(config)
)
}

test("validate simple union in array - invalid") {
validate(
"union-simple-array-invalid.json",
Some("union-simple-array-invalid.report"),
configOverride = Some(config)
)
}

test("validate union at root level is invalid") {
validate(
"union-root-invalid.json",
Some("union-root-invalid.report"),
configOverride = Some(config)
)
}


// TODO We need to see how implement this in with AVRO 1.11.3

// if (platform.name == "jvm") { // We were able only to change this behavior in JVM validator. JS one is still strict (only letter, numbers and '_')
Expand Down
Loading