Define a Map Data Type for XDM #494

Closed
kstreeter opened this issue Aug 24, 2018 · 2 comments

@kstreeter
Collaborator

One consequence of our use of JSON Schema (and therefore JSON) to represent XDM is that the concepts of an "object" (a structure with named fields that each have a value) and a "map" (a container that maintains a relationship between keys and values) are not distinct.

While objects and maps can be treated (nearly) interchangeably in JSON, this is not true in XDM, because we interpret XDM descriptions as describing a schema. In non-JSON environments (for example, a Parquet file), the properties of objects are mapped to columns, and the values of those properties are mapped to rows or entries.

While this is usually how we model data, there are many cases where we actually want the semantics of a map, where the keys are not columns but are themselves data elements that are stored in the rows/entries/etc. Environments such as a Parquet file support a logical map type, but we don't have any way to express in XDM that an object should be treated as a map.
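To make the distinction concrete, here is a minimal sketch in plain Python (no Parquet library involved). The records and the names `attrs`, `as_columns`, and `as_map_entries` are hypothetical and only illustrate the two layouts described above: with object semantics every distinct key becomes part of the schema, while with map semantics the schema stays fixed and the keys are stored as data.

```python
# Hypothetical records; the field names are illustrative, not from any XDM schema.
records = [
    {"id": 1, "attrs": {"color": "red", "size": "L"}},
    {"id": 2, "attrs": {"weight": "2kg"}},
]


def as_columns(rows):
    """Object semantics: every distinct key becomes a column in the schema."""
    columns = sorted({key for row in rows for key in row["attrs"]})
    # Rows that lack a key get None in that column (a sparse, wide table).
    table = [[row["attrs"].get(col) for col in columns] for row in rows]
    return columns, table


def as_map_entries(rows):
    """Map semantics: the schema is a fixed (key, value) pair; keys are data."""
    return [sorted(row["attrs"].items()) for row in rows]


columns, table = as_columns(records)
entries = as_map_entries(records)
```

In the column layout, a new key in the data changes the schema; in the map layout, it only adds another entry to a row.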

Note that this issue depends on #493. We should extend the current set of XDM types (which just need to be documented) with the map type.

What are the schemas that are affected by the issue

All

What are examples of products that are impacted by the issue

Adobe Audience Manager

@lrosenthol
Collaborator

I have to disagree on this one. Object == Map. Having "sub-objects" is perfectly acceptable.

@kstreeter
Collaborator Author

@lrosenthol unfortunately, many of the data processing technologies we rely on heavily (such as Spark and Parquet) simply don't work that way. Object properties get mapped to columns in the schema, while maps are a distinct construct represented in the data. Maps are far less efficient in these environments, so for most usages we want the column-based representation, but we do need maps for some use cases. We need something in XDM that allows us to signal when a map is desired rather than columns.
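As a rough illustration of what such a signal could look like to a converter, the sketch below classifies a JSON Schema object node using only the standard `properties` and `additionalProperties` keywords. This is an assumption for discussion, not an XDM mechanism: the function name `container_kind` and the rule itself (no named properties plus a schema-valued `additionalProperties` means "map") are hypothetical.

```python
def container_kind(node):
    """Guess whether a JSON Schema node should become columns or a map.

    Hypothetical heuristic, not part of XDM or JSON Schema itself.
    """
    if node.get("type") != "object":
        return "not-an-object"
    has_named_properties = bool(node.get("properties"))
    extra = node.get("additionalProperties")
    # No fixed fields, but a schema for arbitrary values: treat keys as data.
    if not has_named_properties and isinstance(extra, dict):
        return "map"
    # Otherwise fall back to the column-based (struct) representation.
    return "struct"


struct_like = {"type": "object", "properties": {"name": {"type": "string"}}}
map_like = {"type": "object", "additionalProperties": {"type": "string"}}
```

A heuristic like this is implicit and easy to get wrong, which is one argument for an explicit map type in XDM instead.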
