Skip to content
This repository has been archived by the owner on Feb 12, 2022. It is now read-only.

Support declaration of DEFAULT in CREATE statement #345

Open
jtaylor-sfdc opened this issue Jul 27, 2013 · 6 comments
Open

Support declaration of DEFAULT in CREATE statement #345

jtaylor-sfdc opened this issue Jul 27, 2013 · 6 comments

Comments

@jtaylor-sfdc
Copy link
Contributor

Support the declaration of a default value in the CREATE TABLE/VIEW statement like this:

CREATE TABLE Persons (
    Pid int NOT NULL PRIMARY KEY,
    LastName varchar(255) NOT NULL,
    FirstName varchar(255),
    Address varchar(255),
    City varchar(255) DEFAULT 'Sandnes'
)

To implement this, we'd need to:

  1. add a new DEFAULT_VALUE key value column in SYSTEM.TABLE and pass through the value when the table is created (in MetaDataClient).
  2. always set NULLABLE to ResultSetMetaData.columnNoNulls if a default value is present, since the column will never be null.
  3. add a getDefaultValue() accessor in PColumn
  4. for a row key column, during UPSERT use the default value if no value was specified for that column. This could be done in the PTableImpl.newKey method.
  5. for a key value column with a default value, we can get away without incurring any storage cost. Although a little bit of extra effort than if we persisted the default value on an UPSERT for key value columns, this approach has the benefit of not incurring any storage cost for a default value.
  • serialize any default value into KeyValueColumnExpression
  • in the evaluate method of KeyValueColumnExpression, conditionally use the default value if the column value is not present. If doing partial evaluation, you should not yet return the default value, as we may not have encountered the the KeyValue for the column yet (since a filter evaluates each time it sees each KeyValue, and there may be more than one KeyValue referenced in the expression). Partial evaluation is determined by calling Tuple.isImmutable(), where false means it is NOT doing partial evaluation, while true means it is.
  • modify EvaluateOnCompletionVisitor by adding a visitor method for RowKeyColumnExpression and KeyValueColumnExpression to set evaluateOnCompletion to true if they have a default value specified. This will cause filter evaluation to execute one final time after all KeyValues for a row have been seen, since it's at this time we know we should use the default value.
@jtaylor-sfdc
Copy link
Contributor Author

One complication is that a default value might be a constant expression, like CURRENT_DATE() which forces us to evaluate later. So we'll want to

  • Have a getDefaultExpression() in PColumn that returns an Expression. In most cases, this would be a LiteralExpression, but it might also be a function to get the current date, or the next sequence number.
  • Come up with a scheme to evaluate these (hopefully on the client-side only, otherwise we'd need to pass through the current date, etc in a scan) and pass them through. I think at compile time, we can wrap a ColumnExpression in a CoalesceFunction(ColumnExpression,DefaultExpression).

@ghost ghost assigned mravi Dec 4, 2013
@mravi
Copy link
Contributor

mravi commented Dec 14, 2013

I made little progress and would like few clarifications.
a) Modify the ColumDef syntax within PhoenixSQL.g to specify an expression for the DEFAULT literal or function. The new syntax would be something like this.

column_def returns [ColumnDef ret]
: c=column_name dt=identifier (LPAREN l=NUMBER (COMMA s=NUMBER)? RPAREN)? (n=NOT? NULL)? (pk=PRIMARY KEY (order=ASC|order=DESC)?)? (d=DEFAULT exp=expression)?
{ $ret = factory.columnDef(c, dt, n==null,
l == null ? null : Integer.parseInt( l.getText() ),
s == null ? null : Integer.parseInt( s.getText() ),
pk != null,
order == null ? null : ColumnModifier.fromDDLValue(order.getText()), exp ); }
;
b) Modify the ColumnDef constructor signature to accept ParseNode . Also, since we need to construct the expression from the ParseNode, would the following approach of transferring to a Expression be ok within the constructor code.

if(node instanceof LiteralParseNode){
LiteralParseNode literalNode = (LiteralParseNode)node;
defaultExpression = LiteralExpression.newConstant(literalNode.getValue());
} else if( node instanceof FunctionParseNode){
}

c) Since the default values for a column need to be stored in SYSTEM.TABLE, how do we plan to support function definitions like CURRENT_DATE() . Though this is easy for LiteralExpressions, FunctionExpressions need a special look.

d) Should we allow adding default values to dynamic columns also?

@jtaylor-sfdc
Copy link
Contributor Author

  • Store the default as a ParseNode on ColumnDef (no need to convert a LiteralParseNode)
    • Verify that defaultNode.isConstant() is true, otherwise throw a new SQLException as a default value must be a constant expression.
  • Serialize the defaultValue as a string by adding and setting parameter to MetaDataClient.INSERT_COLUMN based on ColumnDef.getDefaultValue().toString()
    • In MetaDataEndPointImpl, store the defaultValue string in the PhoenixDatabaseMetaData.COLUMN_DEF column on the SYSTEM.TABLE row representing the column
  • In MetaDataEndPointImpl, get the COLUMN_DEF value and pass it through when a PColumnImpl is created, storing it in a new defaultValue string member variable
    • Modify SQLParser by adding a parseExpression() method that takes a String and returns a ParseNode. See the parseQuery() method as an example - you'd call parser.expression() which will return a ParseNode.
    • Have an accessor in PColumn that calls new SQLParse(defaultValue).parseExpression() to return the ParseNode for the defaultValue.
  • In ExpressionCompiler, modify visit(ColumnParseNode node) by wrapping any column expression that has a default value with a CoalesceFunction(ColumnExpression,Expression defaultExpression).
    • Create the defaultExpression by calling defaultNode.visit(this)

This should cause the default value to be used when a column resolves to null. There are likely some edge cases for UPSERT SELECT and DELETE when they're executed on the server side that we'll have to work though, but hopefully this is enough to get you started.

@mravi
Copy link
Contributor

mravi commented Dec 29, 2013

Hi James,
For constant expressions like CURRENT_DATE() , shouldn't we not evaluate the value for the column during UPSERT to avoid displaying different values for each SELECT query for the column. In this case, we are better off storing the value.
Can you please share your opinion.

@jtaylor-sfdc
Copy link
Contributor Author

Good point, @mravi. We'll need to treat these differently, evaluating them at UPSERT time and storing them. This will include CURRENT_DATE() and NEXT VALUE FOR (i.e. CurrentDateParseNode and NextSequenceValueParseNode). Need to come up with a way of differentiating these from constants - I'll give this some thought.

@jtaylor-sfdc
Copy link
Contributor Author

@mravi - I've added an Expression.isDeterministic() for this case, and renamed isConstant() to be isStateless(). Note that ParseNode also has an isStateless() method, but doesn't need an isDeterministic method. An expression must be stateless in order to be a default expression. If a default expression for a KeyValue column is stateless and deterministic, then we can use the trick I mentioned about not storing the KeyValue with every row. If, however it is stateless and non deterministic, then we need to evaluate it upfront and store the value in the KeyValue at creation time.

Thanks again for recognizing that we need to differentiate these cases!

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

2 participants