By default, Hive tables do not take column types from the metastore; they use the column information provided by the deserializer. Many popular and built-in SerDes are exceptions, and they are enumerated explicitly (see SERDESUSINGMETASTOREFORSCHEMA in HiveConf.java). Hive used to copy the types that the SerDe provides into the metastore, but those copies could become inconsistent. For example, until recently, large type strings (e.g. for Avro) could be truncated in the metastore columns relative to the original schema, and for some SerDes it is hypothetically possible for users to change the schema by changing the contents of the external schema file. Hive was therefore recently changed to stop storing types in the metastore for SerDes whose metastore schema Hive itself does not use.
Only the SerDes that rely on the metastore for their schema now store the schema in the metastore.
We are seeing errors from Presto that seem to indicate Presto uses the column type strings from the metastore directly, at least when describing a table. Because these types may be inconsistent with the SerDe-provided types, they should not be used this way.
For reference, here's what Hive does (hasMetastoreBasedSchema just checks the config setting):
    if (hasMetastoreBasedSchema(...)) {
      return tTable.getSd().getCols();
      ...
    } else {
      return MetaStoreUtils.getFieldsFromDeserializer(getTableName(), getDeserializer());
    }
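To make the branch above concrete for a client such as Presto, here is a minimal, self-contained sketch of the same decision. It is not Hive's or Presto's actual code: the class `SchemaSourceSketch`, the method `resolveColumns`, the two-entry SerDe set, and the use of plain strings for columns are all assumptions made for illustration. In Hive the list comes from the config setting and is longer, and the deserializer columns come from `MetaStoreUtils.getFieldsFromDeserializer`.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Supplier;

// Hypothetical illustration only; names and values here are not taken from Hive or Presto.
final class SchemaSourceSketch {

    // Stand-in for the configured list of SerDes whose schema lives in the metastore.
    // Assumption: a shortened, illustrative list; the real one is read from configuration.
    static final Set<String> SERDES_USING_METASTORE = new HashSet<>(Arrays.asList(
            "org.apache.hadoop.hive.ql.io.orc.OrcSerde",
            "org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe"));

    // Mirrors the role of hasMetastoreBasedSchema: a lookup against the configured list.
    static boolean hasMetastoreBasedSchema(String serdeClassName) {
        return SERDES_USING_METASTORE.contains(serdeClassName);
    }

    // Returns metastore columns only when the SerDe is listed; otherwise asks the
    // deserializer. The supplier is lazy so the deserializer is not touched needlessly.
    static List<String> resolveColumns(String serdeClassName,
                                       List<String> metastoreColumns,
                                       Supplier<List<String>> deserializerColumns) {
        if (hasMetastoreBasedSchema(serdeClassName)) {
            return metastoreColumns;
        }
        return deserializerColumns.get();
    }

    public static void main(String[] args) {
        // Simulated case: the metastore holds a stale or truncated type string.
        List<String> fromMetastore = Arrays.asList("payload:struct<a:int,b:str");
        List<String> fromSerDe = Arrays.asList("payload:struct<a:int,b:string>");

        // Assumption for this sketch: the Avro SerDe is not in the list above.
        System.out.println(resolveColumns(
                "org.apache.hadoop.hive.serde2.avro.AvroSerDe",
                fromMetastore, () -> fromSerDe));
    }
}
```

The point of the sketch is that the metastore columns are consulted only for listed SerDes; for any other SerDe they are ignored even when present, which is the behavior being requested of Presto here.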