Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[avro] Regression due to changed namespace of inner enum types #58

Closed
osi opened this issue Mar 9, 2017 · 9 comments
Closed

[avro] Regression due to changed namespace of inner enum types #58

osi opened this issue Mar 9, 2017 · 9 comments
Milestone

Comments

@osi
Copy link

osi commented Mar 9, 2017

This change:

7ed5465#diff-f1d5995cc2009c62f822a9e5da3ebd00R27

modified the namespace for inner enumerations. It used to just be the containing package, now it has the type. Since the AvroAlias annotation isn't yet supported, there's no way to make the schema generated from 2.8.7 compatible (in terms of avro's SchemaCompatibility.checkReaderWriterCompatibility) with 2.8.6

@cowtowncoder
Copy link
Member

Sigh. I should have known better than to merge such changes in a patch.

Do you have a suggestion as to what to do at this point? (both regarding 2.8.8, and 2.9)
Does the problem only affect case of reader/writer schema usage, which is new, or is it more general?

@osi
Copy link
Author

osi commented Mar 9, 2017

for 2.8.8, i'd say revert, as it was an unexpected breakage for a point release.

for 2.9, if the AvroAlias annotation was supported, they can be made backwards compatible by adding AvroAlias with the older name. then i'd just put the fact that inner class type names changed in the release notes.

it is this check in Avro's SchemaCompatibility that is failing, so aliases will fix.

  public static boolean schemaNameEquals(final Schema reader, final Schema writer) {
    final String writerFullName = writer.getFullName();
    if (objectsEqual(reader.getFullName(), writerFullName)) {
      return true;
    }
    // Apply reader aliases:
    if (reader.getAliases().contains(writerFullName)) {
      return true;
    }
    return false;
  }

I discovered this because I have two services that communicate with Avro (via Jackson), and do a schema exchange and compatibility test as part of their handshake.

@cowtowncoder
Copy link
Member

/cc @baharclerode Looks like we need to do bit of revert here.

@osi Agreed on need for partial (?) revert, was just hoping to figure out a minimal change for that.
I am bit worried about testing, just because (if I understand correctly) it'd be essentially comparing artifacts produced across versions, which is bit tricky to automate properly.

@osi
Copy link
Author

osi commented Mar 9, 2017

I added a test like the below to my codebase. The "expected" is just hardcoded. In my case, I care that the newer one can read what is produced by the older (so I can upgrade). One could just run the check with the parameters flipped to assert both ways.

I'm fine with the generated schemas changing as the code is developed, as long as the migration aids are in place.

AvroMapper mapper;
String expectedSchemaJson; 
AvroSchema expected = mapper.schemaFrom(expectedSchemaJson);
AvroSchema actual = mapper.schemaFor(TypeToTest.class);

SchemaCompatibility.SchemaPairCompatibility compatibility =
        SchemaCompatibility.checkReaderWriterCompatibility(
                actual.getAvroSchema(),
                expected.getAvroSchema());

assertEquals(compatibility.toString(),
             SchemaCompatibility.SchemaCompatibilityType.COMPATIBLE,
             compatibility.getType());

@baharclerode
Copy link
Contributor

baharclerode commented Mar 9, 2017

@cowtowncoder Minimal change would be to revert those 5 lines and disable the failing tests on 2.8, and then un-revert the changes in master along with making sure that aliasing is properly supported so that there's an upgrade path where devs can alias back to the old names when they upgrade to 2.9.

The vast majority of the failing tests are related to using the Apache implementation to deserialize with a Jackson schema, and can be ignored by adding the following:

@Before
public void setup() {
    // 2.8 doesn't generate schemas with compatible namespaces for Apache deserializer
    Assume.assumeTrue(deserializeFunctor != apacheDeserializer || schemaFunctor != getJacksonSchema);
}

to the following test files:

  • AvroNameTest
  • AvroIgnoreTest
  • ListWithComplexTest
  • MapWithComplexTest
  • RecordWithPrimitiveArrayTest
  • RecordWithPrimitiveTest
  • RecordWithPrimitiveWrapperArrayTest
  • RecordWithPrimitiveWrapperTest

The RecordWithComplexTest will need in addition to the above assumption the following assumption:

// 2.8 doesn't generate schemas with compatible namespaces for Apache serializer, so there are problems resolving unions
Assume.assumeTrue(serializeFunctor != apacheSerializer || schemaFunctor != getJacksonSchema);

Assuming the enums are not root objects in your schema, another work around would be to apply @AvroSchema to the enum fields in your records and set the schema that way, but this isn't a very good developer experience.

@baharclerode
Copy link
Contributor

Example revert: 2.8...baharclerode:bah.InnerNamespaceRevert

@cowtowncoder
Copy link
Member

@baharclerode Thanks! Ok, I'll work on that as base.

@cowtowncoder cowtowncoder added this to the 2.8.8 milestone Mar 10, 2017
@cowtowncoder cowtowncoder changed the title [avro] 2.8.7 changed namespace of inner enum types [avro] Regression due to changed namespace of inner enum types Mar 10, 2017
@cowtowncoder
Copy link
Member

Ok, revert for 2.8.8, but retained for 2.9.0.pr2.
Isolated assume calls in base class.

At this point I suspect some more work is needed to avoid skipping tests, since we get:

Tests run: 1054, Failures: 0, Errors: 0, Skipped: 139

but not quite sure what to remove from 2.9 (if anything).

@baharclerode
Copy link
Contributor

baharclerode commented Mar 10, 2017

There's 30 tests that are skipped because they cover bugs in the Apache implementation, and there were 86 failing tests that broke due to the revert (now skipped), and the remaining were the result of a few of the Assumes being a bit more broader than necessary in an effort to minimize changes (but are testing things that can't be reliably used right now anyways).

As to 2.9, I would just remove the assumes that were added for this revert and you should be down to 42 skipped tests (there were some more tests added covering features that the Apache Implementation didn't fully support)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants