Support Struct data type #346

jtaylor-sfdc · 2013-07-27T20:57:00Z

Although supporting an ARRAY type helps, there are some use cases in which the data is not homogeneous. We should support a Struct data type for these cases.

anoopsjohn · 2013-07-28T18:05:11Z

Thanks James for filing this.

elilevine · 2013-08-02T17:42:36Z

+1
This would extend Phoenix's support for semi-structured data.

nmaillard · 2013-08-04T11:31:42Z

+1
structs would be great
how would this work with #19 and #239

apurtell · 2013-08-13T18:55:18Z

> how would this work with #19 and #239

I have the same question.

To what extent could this be similar to (or borrow from?) kiji-schema?

jtaylor-sfdc · 2013-08-13T21:38:26Z

This one is a dup of #239. I think it'd be similar in concept to kiji-schema, as it would define the structure of a single KeyValue column in your schema (i.e. an instantiation of the struct would be stored in a single KeyValue), but there could be other sibling KeyValue columns that aren't structs.

I think the schema of the struct would be defined in the Phoenix metadata table (SYSTEM.TABLE) using a new struct type to differentiate it. We'd need to allow references in queries using a dotted notation. At upsert/insert time, you'd need to provide the struct in its entirety.
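To make the idea above concrete, here is a minimal sketch, not Phoenix code: the class `StructSketch`, its methods, and the length-prefixed string encoding are all hypothetical. It only illustrates the three properties described: the field list plays the role of the schema kept in metadata, the whole struct must be supplied at write time, and a dotted reference such as `address.city` resolves to one field inside a single stored value.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in Phoenix.
final class StructSketch {
    // Ordered field names: stands in for the struct schema stored in metadata.
    private final List<String> fieldNames;

    StructSketch(String... fieldNames) {
        this.fieldNames = Arrays.asList(fieldNames);
    }

    // Packs every field into one byte[] (the value of a single KeyValue).
    // The struct must be provided in its entirety; values must be non-null.
    byte[] encode(Map<String, String> values) {
        if (!values.keySet().equals(new HashSet<>(fieldNames))) {
            throw new IllegalArgumentException("struct must supply exactly " + fieldNames);
        }
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            for (String name : fieldNames) {
                out.writeUTF(values.get(name)); // 2-byte length prefix + modified UTF-8
            }
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Resolves the field part of a dotted reference (e.g. "city" in address.city)
    // by skipping over the preceding fields in the encoded value.
    String resolve(byte[] encoded, String fieldName) {
        int index = fieldNames.indexOf(fieldName);
        if (index < 0) {
            throw new IllegalArgumentException("unknown field: " + fieldName);
        }
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(encoded));
            String value = null;
            for (int i = 0; i <= index; i++) {
                value = in.readUTF();
            }
            return value;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With a struct declared as `new StructSketch("street", "city")`, encoding a map that contains both fields yields one byte array, and `resolve(cell, "city")` reads the second field back out of it. Encoding a map that omits a field is rejected, which mirrors the "in its entirety" requirement.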

Other than using less space (you avoid the overhead of an entire KeyValue for each value), I'm not convinced this adds a whole lot of value. You can essentially model the same thing with multiple columns. I'd rather see HBase come up with better/more condensed block encodings and have a condensed memory model that can better leverage these encodings.
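The space argument can be estimated with a rough sketch. It assumes the classic HBase KeyValue layout (4-byte key length, 4-byte value length, 2-byte row length, 1-byte family length, 8-byte timestamp, 1-byte type, plus the row, family, qualifier, and value bytes) and ignores tags, MVCC data, block encoding, and compression. The class name and the 4-byte per-field length prefix inside the packed struct are assumptions made for illustration.

```java
// Hypothetical back-of-the-envelope calculator; not part of HBase or Phoenix.
final class KeyValueSizeSketch {
    // Fixed bytes per cell: key length (4) + value length (4) + row length (2)
    // + family length (1) + timestamp (8) + type (1).
    static final int FIXED_OVERHEAD = 4 + 4 + 2 + 1 + 8 + 1;

    static long cellSize(int rowLen, int familyLen, int qualifierLen, int valueLen) {
        return FIXED_OVERHEAD + rowLen + familyLen + qualifierLen + valueLen;
    }

    // One KeyValue per field: the row key, family, and qualifier repeat for every value.
    static long asSeparateColumns(int rowLen, int familyLen, int qualifierLen, int[] valueLens) {
        long total = 0;
        for (int valueLen : valueLens) {
            total += cellSize(rowLen, familyLen, qualifierLen, valueLen);
        }
        return total;
    }

    // All fields packed into one KeyValue, each with an assumed 4-byte length prefix.
    static long asSingleStruct(int rowLen, int familyLen, int qualifierLen, int[] valueLens) {
        int packedLen = 0;
        for (int valueLen : valueLens) {
            packedLen += 4 + valueLen;
        }
        return cellSize(rowLen, familyLen, qualifierLen, packedLen);
    }
}
```

For a 16-byte row key, a 1-byte family, 2-byte qualifiers, and three 8-byte values, this gives 141 bytes as separate columns versus 75 bytes as one packed struct, before any block encoding. Prefix-style block encodings shrink the repeated row key and family, which is why better encodings narrow this gap.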

As for #19, that one is different. It's for cases where you'd want to have very wide rows in which value information is encoded in the column qualifier. In this case, you'd define this set of columns as a "nested table" which you could join against the row that contains them. So a set of column qualifiers would look like another row to Phoenix.

jtaylor-sfdc · 2013-12-20T16:56:53Z

We should investigate using Parquet as our underlying storage format for these structs (and potentially for JSON as well, #497).

ghost assigned anoopsjohn Jul 30, 2013

jtaylor-sfdc mentioned this issue Aug 13, 2013

Support complex data types #239

Closed

ghost assigned ramkrish86 Dec 20, 2013

jtaylor-sfdc unassigned ramkrish86 Feb 9, 2015
