feat(glue-alpha): add cfn-glue-table-tableinput-parameters to Glue table construct (#27643)

Add
[cfn-glue-table-tableinput-parameters](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-glue-table-tableinput.html#cfn-glue-table-tableinput-parameters)
to the Glue Table construct as an optional prop.

Users can specify additional table parameters when creating a Glue
table. Any key/value pair can be set, depending on each user's
requirements, for example additional table metadata or statistics. Some
parameters, such as `skip.header.line.count`, are used by AWS services
and third-party tools when they read the table.
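
As a rough illustration of the string-only shape of these parameters, the sketch below builds a map that could be passed as `parameters`. The prop is typed as `{ [key: string]: string }` in this commit, so a numeric setting such as `skip.header.line.count` has to be written as a string. `toTableParameters` is a hypothetical helper introduced only for this example; it is not part of `aws-glue-alpha`.

```typescript
// Hypothetical helper (an assumption, not part of aws-glue-alpha): converts
// mixed values into the string-only map that the `parameters` prop expects.
type ParameterValue = string | number | boolean;

function toTableParameters(
  input: { [key: string]: ParameterValue },
): { [key: string]: string } {
  const result: { [key: string]: string } = {};
  for (const key of Object.keys(input)) {
    // Every value is stored as a string, e.g. 1 becomes '1'.
    result[key] = String(input[key]);
  }
  return result;
}

// `key1`/`val1` mirror the placeholder pair used in this commit's README example.
const parameters = toTableParameters({
  'skip.header.line.count': 1,
  'key1': 'val1',
});
console.log(parameters);
```

The resulting object could then be passed as the `parameters` prop of `glue.S3Table` or `glue.ExternalTable`.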

Closes #14159.

---
All Submissions:
- [x] Have you followed the guidelines in our [Contributing
guide?](https://github.com/aws/aws-cdk/blob/main/CONTRIBUTING.md)
Adding new Unconventional Dependencies:
- [ ] This PR adds new unconventional dependencies following the process
described
[here](https://github.com/aws/aws-cdk/blob/main/CONTRIBUTING.md/#adding-new-unconventional-dependencies)
New Features
- [x] Have you added the new feature to an [integration
test](https://github.com/aws/aws-cdk/blob/main/INTEGRATION_TESTS.md)?
- [x] Did you use yarn integ to deploy the infrastructure and generate
the snapshot (i.e. yarn integ without --dry-run)?
---
By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache-2.0 license

---------

Co-authored-by: Vinayak Kukreja <78971045+vinayak-kukreja@users.noreply.github.com>
Co-authored-by: Sumu Pitchayan <35242245+sumupitchayan@users.noreply.github.com>
3 people authored Dec 27, 2023
1 parent 832e29a commit 8e15482
Showing 12 changed files with 309 additions and 21 deletions.
18 changes: 18 additions & 0 deletions packages/@aws-cdk/aws-glue-alpha/README.md
@@ -263,6 +263,24 @@ new glue.S3Table(this, 'MyTable', {
});
```

Glue tables can also be configured to contain user-defined table properties through the [`parameters`](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-glue-table-tableinput.html#cfn-glue-table-tableinput-parameters) property:

```ts
declare const myDatabase: glue.Database;
new glue.S3Table(this, 'MyTable', {
  parameters: {
    key1: 'val1',
    key2: 'val2',
  },
  database: myDatabase,
  columns: [{
    name: 'col1',
    type: glue.Schema.STRING,
  }],
  dataFormat: glue.DataFormat.JSON,
});
```

### Partition Keys

To improve query performance, a table can specify `partitionKeys` on which data is stored and queried separately. For example, you might partition a table by `year` and `month` to optimize queries based on a time window:
1 change: 1 addition & 0 deletions packages/@aws-cdk/aws-glue-alpha/lib/external-table.ts
@@ -69,6 +69,7 @@ export class ExternalTable extends TableBase {
    'has_encrypted_data': true,
    'partition_filtering.enabled': props.enablePartitionFiltering,
    'connectionName': props.connection.connectionName,
    ...props.parameters,
  },
  storageDescriptor: {
    location: props.externalDataLocation,
1 change: 1 addition & 0 deletions packages/@aws-cdk/aws-glue-alpha/lib/s3-table.ts
@@ -141,6 +141,7 @@ export class S3Table extends TableBase {
    'classification': props.dataFormat.classificationString?.value,
    'has_encrypted_data': true,
    'partition_filtering.enabled': props.enablePartitionFiltering,
    ...this.parameters,
  },
  storageDescriptor: {
    location: `s3://${this.bucket.bucketName}/${this.s3Prefix}`,
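
The ordering in the two hunks above matters: the user-supplied map is spread after the construct's built-in keys, so a user key that collides with a built-in one (for example `classification`) replaces it. A minimal sketch of that behavior in plain TypeScript follows; `mergeTableParameters` and the `TableParameters` type are hypothetical stand-ins for the inline object literal, not CDK API.

```typescript
// Hypothetical stand-in for the object literal built in s3-table.ts and
// external-table.ts; these names are assumptions made for illustration.
type TableParameters = { [key: string]: string | boolean | undefined };

function mergeTableParameters(
  builtIn: TableParameters,
  user: { [key: string]: string } = {}, // mirrors `props.parameters ?? {}`
): TableParameters {
  // Later spreads win, so user keys replace built-in keys of the same name.
  return { ...builtIn, ...user };
}

const merged = mergeTableParameters(
  { classification: 'json', has_encrypted_data: true },
  { key1: 'val1', classification: 'custom' },
);
console.log(merged);
```

If overriding built-in keys is not desired, the spread order would need to be reversed; the commit as written lets the user's values take precedence.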
17 changes: 17 additions & 0 deletions packages/@aws-cdk/aws-glue-alpha/lib/table-base.ts
@@ -147,6 +147,16 @@ export interface TableBaseProps {
   * @default - The parameter is not defined
   */
  readonly storageParameters?: StorageParameter[];

  /**
   * The key/value pairs define properties associated with the table.
   * The key/value pairs that are allowed to be submitted are not limited, however their functionality is not guaranteed.
   *
   * @see https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-glue-table-tableinput.html#cfn-glue-table-tableinput-parameters
   *
   * @default - The parameter is not defined
   */
  readonly parameters?: { [key: string]: string }
}

/**
@@ -214,6 +224,12 @@ export abstract class TableBase extends Resource implements ITable {
   */
  public readonly storageParameters?: StorageParameter[];

  /**
   * The tables' properties associated with the table.
   * @see https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-glue-table-tableinput.html#cfn-glue-table-tableinput-parameters
   */
  protected readonly parameters: { [key: string]: string }

  /**
   * Partition indexes must be created one at a time. To avoid
   * race conditions, we store the resource and add dependencies
@@ -236,6 +252,7 @@ export abstract class TableBase extends Resource implements ITable {
    this.columns = props.columns;
    this.partitionKeys = props.partitionKeys;
    this.storageParameters = props.storageParameters;
    this.parameters = props.parameters ?? {};

    this.compressed = props.compressed ?? false;
  }
41 changes: 41 additions & 0 deletions packages/@aws-cdk/aws-glue-alpha/test/external-table.test.ts
@@ -1066,6 +1066,47 @@ test('can associate an external location with the glue table', () => {
  });
});

test('can specify table parameter', () => {
  const app = new cdk.App();
  const stack = new cdk.Stack(app, 'Stack');
  const database = new glue.Database(stack, 'Database');
  const connection = new glue.Connection(stack, 'Connection', {
    connectionName: 'my_connection',
    type: glue.ConnectionType.JDBC,
    properties: {
      JDBC_CONNECTION_URL: 'jdbc:server://server:443/connection',
      USERNAME: 'username',
      PASSWORD: 'password',
    },
  });
  new glue.ExternalTable(stack, 'Table', {
    database,
    tableName: 'my_table',
    connection,
    columns: [{
      name: 'col',
      type: glue.Schema.STRING,
    }],
    dataFormat: glue.DataFormat.JSON,
    externalDataLocation,
    parameters: {
      key1: 'val1',
      key2: 'val2',
    },
  });

  Template.fromStack(stack).hasResourceProperties('AWS::Glue::Table', {
    TableInput: {
      Parameters: {
        key1: 'val1',
        key2: 'val2',
        classification: 'json',
        has_encrypted_data: true,
      },
    },
  });
});

function createTable(props: Pick<glue.S3TableProps, Exclude<keyof glue.S3TableProps, 'database' | 'dataFormat'>>): void {
const stack = new cdk.Stack();
const connection = new glue.Connection(stack, 'Connection', {

@@ -623,6 +623,72 @@
    }
   }
  },
  "MyTableWithParametersTable39568AB8": {
   "Type": "AWS::Glue::Table",
   "Properties": {
    "CatalogId": {
     "Ref": "AWS::AccountId"
    },
    "DatabaseName": {
     "Ref": "MyDatabase1E2517DB"
    },
    "TableInput": {
     "Description": "table_with_parameters generated by CDK",
     "Name": "table_with_parameters",
     "Parameters": {
      "classification": "json",
      "has_encrypted_data": true,
      "key1": "val1",
      "key2": "val2"
     },
     "StorageDescriptor": {
      "Columns": [
       {
        "Name": "col1",
        "Type": "string"
       },
       {
        "Comment": "col2 comment",
        "Name": "col2",
        "Type": "string"
       },
       {
        "Name": "col3",
        "Type": "array<string>"
       },
       {
        "Name": "col4",
        "Type": "map<string,string>"
       },
       {
        "Name": "col5",
        "Type": "struct<col1:string>"
       }
      ],
      "Compressed": false,
      "InputFormat": "org.apache.hadoop.mapred.TextInputFormat",
      "Location": {
       "Fn::Join": [
        "",
        [
         "s3://",
         {
          "Ref": "DataBucketE3889A50"
         },
         "/"
        ]
       ]
      },
      "OutputFormat": "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat",
      "SerdeInfo": {
       "SerializationLibrary": "org.openx.data.jsonserde.JsonSerDe"
      },
      "StoredAsSubDirectories": false
     },
     "TableType": "EXTERNAL_TABLE"
    }
   }
  },
  "MyDeprecatedTableAA0364FD": {
   "Type": "AWS::Glue::Table",
   "Properties": {

12 changes: 12 additions & 0 deletions packages/@aws-cdk/aws-glue-alpha/test/integ.table.ts
@@ -124,6 +124,18 @@ new glue.S3Table(stack, 'MyTableWithStorageDescriptorParameters', {
  ],
});

new glue.S3Table(stack, 'MyTableWithParameters', {
  database,
  bucket,
  tableName: 'table_with_parameters',
  columns,
  dataFormat: glue.DataFormat.JSON,
  parameters: {
    key1: 'val1',
    key2: 'val2',
  },
});

new glue.Table(stack, 'MyDeprecatedTable', {
  database,
  bucket,