feat(redshift): column compression encodings and comments can now be customised #23597

Merged: 26 commits, Feb 10, 2023
0f50d3f
addition: initial testing suite
Rizxcviii Jan 6, 2023
20fe7fa
addition: initial column encoding methods
Rizxcviii Jan 6, 2023
7517f8d
addition: docstring for ColumnEncoding
Rizxcviii Jan 6, 2023
8669223
addition: assigning enums to string variables
Rizxcviii Jan 6, 2023
38cfb4a
addition: adding encoding on creation of table
Rizxcviii Jan 6, 2023
656a307
addition: updates on column encoding
Rizxcviii Jan 6, 2023
7f2c6b9
addition: table comment and column comment
Rizxcviii Jan 6, 2023
5a72700
modification, addition
Rizxcviii Jan 6, 2023
77fe42c
Merge branch 'main' into feature/commentting-encoding
Rizxcviii Jan 6, 2023
68d8107
modification: integ test
Rizxcviii Jan 6, 2023
8689510
addition: docuementation for encoding and commentting
Rizxcviii Jan 6, 2023
73366fd
addition, modification:
Rizxcviii Jan 23, 2023
6f73d7a
modification: removing table comments code
Rizxcviii Jan 26, 2023
6ec1f26
modification: removing table comments code
Rizxcviii Jan 26, 2023
033bdc7
modification: integ test snapshot
Rizxcviii Jan 26, 2023
6a8ede8
modification: reverting import cleanup
Rizxcviii Jan 26, 2023
c8ac796
modification: removing table comments from README
Rizxcviii Jan 26, 2023
6ce5fa9
modification: bugfix, nested and incorrect test
Rizxcviii Jan 26, 2023
a96840c
modification: using private enum
Rizxcviii Jan 26, 2023
c4582e9
addition: line break on EOF
Rizxcviii Jan 26, 2023
2c13a23
Merge branch 'main' into feature/commentting-encoding
Rizxcviii Feb 6, 2023
07b6126
modification: using an actual compression encoding used by VARCHAR
Rizxcviii Feb 9, 2023
452d8ea
modification: rosetta fixing, was probably not run using the yarn com…
Rizxcviii Feb 9, 2023
461a93d
removal: lock file
Rizxcviii Feb 9, 2023
c78e737
modification: typo
Rizxcviii Feb 9, 2023
a626917
Merge branch 'main' into feature/commentting-encoding
mergify[bot] Feb 10, 2023
31 changes: 25 additions & 6 deletions packages/@aws-cdk/aws-redshift/README.md
@@ -200,17 +200,32 @@ new Table(this, 'Table', {
});
```

Tables can also be configured with a comment:
Tables and their respective columns can be configured to contain comments:

```ts fixture=cluster
new Table(this, 'Table', {
tableColumns: [
{ name: 'col1', dataType: 'varchar(4)' },
{ name: 'col2', dataType: 'float' }
{ name: 'col1', dataType: 'varchar(4)', comment: 'This is a column comment' },
{ name: 'col2', dataType: 'float', comment: 'This is another column comment' }
],
cluster: cluster,
databaseName: 'databaseName',
tableComment: 'This is a table comment',
});
```

Table columns can be configured to use a specific compression encoding:

```ts fixture=cluster
import { ColumnEncoding } from '@aws-cdk/aws-redshift';

new Table(this, 'Table', {
tableColumns: [
{ name: 'col1', dataType: 'varchar(4)', encoding: ColumnEncoding.TEXT32K },
{ name: 'col2', dataType: 'float', encoding: ColumnEncoding.DELTA32K },
],
cluster: cluster,
databaseName: 'databaseName',
comment: 'This is a comment',
});
```

@@ -417,14 +432,16 @@ Some Amazon Redshift features require Amazon Redshift to access other AWS servic
When you create an IAM role and set it as the default for the cluster using the console, you don't have to provide the IAM role's Amazon Resource Name (ARN) to perform authentication and authorization.

```ts
import * as ec2 from '@aws-cdk/aws-ec2';
import * as iam from '@aws-cdk/aws-iam';
declare const vpc: ec2.Vpc;

const defaultRole = new iam.Role(this, 'DefaultRole', {
assumedBy: new iam.ServicePrincipal('redshift.amazonaws.com'),
},
);

new Cluster(stack, 'Redshift', {
new Cluster(this, 'Redshift', {
masterUser: {
masterUsername: 'admin',
},
@@ -437,14 +454,16 @@ new Cluster(stack, 'Redshift', {
A default role can also be added to a cluster using the `addDefaultIamRole` method.

```ts
import * as ec2 from '@aws-cdk/aws-ec2';
import * as iam from '@aws-cdk/aws-iam';
declare const vpc: ec2.Vpc;

const defaultRole = new iam.Role(this, 'DefaultRole', {
assumedBy: new iam.ServicePrincipal('redshift.amazonaws.com'),
},
);

const redshiftCluster = new Cluster(stack, 'Redshift', {
const redshiftCluster = new Cluster(this, 'Redshift', {
masterUser: {
masterUsername: 'admin',
},
@@ -18,3 +18,5 @@ export async function handler(event: AWSLambda.CloudFormationCustomResourceEvent
}
return subHandler(event.ResourceProperties, event);
}

export { ColumnEncoding } from './types';
@@ -1,7 +1,7 @@
/* eslint-disable-next-line import/no-unresolved */
import * as AWSLambda from 'aws-lambda';
import { executeStatement } from './redshift-data';
import { ClusterProps, TableAndClusterProps, TableSortStyle } from './types';
import { ClusterProps, ColumnEncoding, TableAndClusterProps, TableSortStyle } from './types';
import { areColumnsEqual, getDistKeyColumn, getSortKeyColumns } from './util';
import { Column } from '../../table';

@@ -40,7 +40,7 @@ async function createTable(
tableAndClusterProps: TableAndClusterProps,
): Promise<string> {
const tableName = tableNamePrefix + tableNameSuffix;
const tableColumnsString = tableColumns.map(column => `${column.name} ${column.dataType}`).join();
const tableColumnsString = tableColumns.map(column => `${column.name} ${column.dataType}${getEncodingColumnString(column)}`).join();

let statement = `CREATE TABLE ${tableName} (${tableColumnsString})`;

@@ -61,6 +61,11 @@ async function createTable(

await executeStatement(statement, tableAndClusterProps);

for (const column of tableColumns) {
if (column.comment) {
await executeStatement(`COMMENT ON COLUMN ${tableName}.${column.name} IS '${column.comment}'`, tableAndClusterProps);
}
}
if (tableAndClusterProps.tableComment) {
await executeStatement(`COMMENT ON TABLE ${tableName} IS '${tableAndClusterProps.tableComment}'`, tableAndClusterProps);
}
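
Review note: the `COMMENT ON` statements above interpolate the comment text directly between single quotes, so a comment that itself contains a single quote would end the literal early. The following is a minimal sketch of one way to guard against that, assuming the standard SQL convention of doubling embedded single quotes. `quoteLiteral` and `commentOnColumn` are hypothetical names used only for illustration and are not part of this PR.

```typescript
// Hypothetical helper (not in the PR): quote a string as a SQL literal
// by doubling any embedded single quotes.
function quoteLiteral(value: string): string {
  return `'${value.replace(/'/g, "''")}'`;
}

// Hypothetical builder that mirrors the COMMENT ON COLUMN statement in the diff.
// An empty or missing comment clears the existing one with IS NULL.
function commentOnColumn(tableName: string, columnName: string, comment?: string): string {
  const literal = comment ? quoteLiteral(comment) : 'NULL';
  return `COMMENT ON COLUMN ${tableName}.${columnName} IS ${literal}`;
}

console.log(commentOnColumn('orders', 'col1', "customer's note"));
// COMMENT ON COLUMN orders.col1 IS 'customer''s note'
```

The same quoting would apply to the `COMMENT ON TABLE` statement. Table and column names are identifiers rather than literals and would need separate handling.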
@@ -107,6 +112,20 @@ async function updateTable(
alterationStatements.push(...columnAdditions.map(addition => `ALTER TABLE ${tableName} ${addition}`));
}

const columnEncoding = tableColumns.filter(column => {
return oldTableColumns.some(oldColumn => column.name === oldColumn.name && column.encoding !== oldColumn.encoding);
}).map(column => `ALTER COLUMN ${column.name} ENCODE ${column.encoding || ColumnEncoding.AUTO}`);
if (columnEncoding.length > 0) {
alterationStatements.push(`ALTER TABLE ${tableName} ${columnEncoding.join(', ')}`);
}

const columnComments = tableColumns.filter(column => {
return oldTableColumns.some(oldColumn => column.name === oldColumn.name && column.comment !== oldColumn.comment);
}).map(column => `COMMENT ON COLUMN ${tableName}.${column.name} IS ${column.comment ? `'${column.comment}'` : 'NULL'}`);
if (columnComments.length > 0) {
alterationStatements.push(...columnComments);
}

const oldDistStyle = oldResourceProperties.distStyle;
if ((!oldDistStyle && tableAndClusterProps.distStyle) ||
(oldDistStyle && !tableAndClusterProps.distStyle)) {
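
Review note: the encoding and comment handling added to `updateTable` in this hunk can be read in isolation as a small planning function. The sketch below is a simplified, standalone approximation of that logic, not the handler itself: `DiffColumn` and `planColumnAlterations` are hypothetical names, encodings are plain strings rather than the `ColumnEncoding` enum, and nothing is executed against a cluster.

```typescript
// Simplified stand-in for the Column interface in table.ts (assumption).
interface DiffColumn {
  name: string;
  dataType: string;
  encoding?: string;
  comment?: string;
}

// True when a column with the same name existed before and the given property differs.
function changed(oldColumns: DiffColumn[], column: DiffColumn, key: 'encoding' | 'comment'): boolean {
  return oldColumns.some(old => old.name === column.name && old[key] !== column[key]);
}

// Returns the statements the update would issue for encoding and comment changes.
function planColumnAlterations(tableName: string, oldColumns: DiffColumn[], newColumns: DiffColumn[]): string[] {
  const statements: string[] = [];

  // All encoding changes are folded into a single ALTER TABLE statement;
  // a removed encoding falls back to AUTO, as in the diff.
  const encodingChanges = newColumns
    .filter(column => changed(oldColumns, column, 'encoding'))
    .map(column => `ALTER COLUMN ${column.name} ENCODE ${column.encoding || 'AUTO'}`);
  if (encodingChanges.length > 0) {
    statements.push(`ALTER TABLE ${tableName} ${encodingChanges.join(', ')}`);
  }

  // Each comment change is its own statement; a removed comment becomes IS NULL.
  const commentChanges = newColumns
    .filter(column => changed(oldColumns, column, 'comment'))
    .map(column => `COMMENT ON COLUMN ${tableName}.${column.name} IS ${column.comment ? `'${column.comment}'` : 'NULL'}`);
  statements.push(...commentChanges);

  return statements;
}

const before: DiffColumn[] = [
  { name: 'col1', dataType: 'varchar(4)', comment: 'old' },
  { name: 'col2', dataType: 'float', encoding: 'RAW' },
];
const after: DiffColumn[] = [
  { name: 'col1', dataType: 'varchar(4)', encoding: 'TEXT32K' },
  { name: 'col2', dataType: 'float', encoding: 'RAW', comment: 'new' },
];
console.log(planColumnAlterations('t', before, after));
```

Only columns present in both the old and new definitions are considered here. Newly added columns go through the separate column-addition path earlier in `updateTable`.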
@@ -162,3 +181,10 @@ async function updateTable(
function getSortKeyColumnsString(sortKeyColumns: Column[]) {
return sortKeyColumns.map(column => column.name).join();
}

function getEncodingColumnString(column: Column): string {
if (column.encoding) {
return ` ENCODE ${column.encoding}`;
}
return '';
}
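
Review note: to make the effect of `getEncodingColumnString` on the generated `CREATE TABLE` statement concrete, here is a minimal standalone sketch. `CreateColumn`, `encodeSuffix` and `buildCreateTable` are hypothetical names, and encodings are plain strings; only the string-building pattern is taken from the diff.

```typescript
// Simplified stand-in for the Column interface (assumption).
interface CreateColumn {
  name: string;
  dataType: string;
  encoding?: string;
}

// Mirrors getEncodingColumnString: emit an ENCODE clause only when an encoding is set.
function encodeSuffix(column: CreateColumn): string {
  return column.encoding ? ` ENCODE ${column.encoding}` : '';
}

// Mirrors the column-list construction in createTable.
// Array.prototype.join() with no argument separates entries with a comma.
function buildCreateTable(tableName: string, columns: CreateColumn[]): string {
  const columnsString = columns
    .map(column => `${column.name} ${column.dataType}${encodeSuffix(column)}`)
    .join();
  return `CREATE TABLE ${tableName} (${columnsString})`;
}

console.log(buildCreateTable('t', [
  { name: 'col1', dataType: 'varchar(4)', encoding: 'TEXT32K' },
  { name: 'col2', dataType: 'float' },
]));
// CREATE TABLE t (col1 varchar(4) ENCODE TEXT32K,col2 float)
```

Columns without an `encoding` produce no `ENCODE` clause, which leaves the choice to Amazon Redshift.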
@@ -24,3 +24,112 @@ export enum TableSortStyle {
*/
INTERLEAVED = 'INTERLEAVED',
}

/**
* The compression encoding of a column.
*
* @see https://docs.aws.amazon.com/redshift/latest/dg/c_Compression_encodings.html
*/
export enum ColumnEncoding {
/**
* Amazon Redshift assigns an optimal encoding based on the column data.
* This is the default.
*/
AUTO = 'AUTO',

/**
* The column is not compressed.
*
* @see https://docs.aws.amazon.com/redshift/latest/dg/c_Raw_encoding.html
*/
RAW = 'RAW',

/**
* The column is compressed using the AZ64 algorithm.
*
* @see https://docs.aws.amazon.com/redshift/latest/dg/az64-encoding.html
*/
AZ64 = 'AZ64',

/**
* The column is compressed using a separate dictionary for each block of column values on disk.
*
* @see https://docs.aws.amazon.com/redshift/latest/dg/c_Byte_dictionary_encoding.html
*/
BYTEDICT = 'BYTEDICT',

/**
* The column is compressed based on the difference between values in the column.
* This records differences as 1-byte values.
*
* @see https://docs.aws.amazon.com/redshift/latest/dg/c_Delta_encoding.html
*/
DELTA = 'DELTA',

/**
* The column is compressed based on the difference between values in the column.
* This records differences as 2-byte values.
*
* @see https://docs.aws.amazon.com/redshift/latest/dg/c_Delta_encoding.html
*/
DELTA32K = 'DELTA32K',

/**
* The column is compressed using the LZO algorithm.
*
* @see https://docs.aws.amazon.com/redshift/latest/dg/lzo-encoding.html
*/
LZO = 'LZO',

/**
* The column is compressed to a smaller storage size than the original data type.
* The compressed storage size is 1 byte.
*
* @see https://docs.aws.amazon.com/redshift/latest/dg/c_MostlyN_encoding.html
*/
MOSTLY8 = 'MOSTLY8',

/**
* The column is compressed to a smaller storage size than the original data type.
* The compressed storage size is 2 bytes.
*
* @see https://docs.aws.amazon.com/redshift/latest/dg/c_MostlyN_encoding.html
*/
MOSTLY16 = 'MOSTLY16',

/**
* The column is compressed to a smaller storage size than the original data type.
* The compressed storage size is 4 bytes.
*
* @see https://docs.aws.amazon.com/redshift/latest/dg/c_MostlyN_encoding.html
*/
MOSTLY32 = 'MOSTLY32',

/**
* The column is compressed by recording the number of consecutive occurrences of each value in the column.
*
* @see https://docs.aws.amazon.com/redshift/latest/dg/c_Runlength_encoding.html
*/
RUNLENGTH = 'RUNLENGTH',

/**
* The column is compressed by recording the first 245 unique words and then using a 1-byte index to represent each word.
*
* @see https://docs.aws.amazon.com/redshift/latest/dg/c_Text255_encoding.html
*/
TEXT255 = 'TEXT255',

/**
* The column is compressed by recording the first 32K unique words and then using a 2-byte index to represent each word.
*
* @see https://docs.aws.amazon.com/redshift/latest/dg/c_Text255_encoding.html
*/
TEXT32K = 'TEXT32K',

/**
* The column is compressed using the ZSTD algorithm.
*
* @see https://docs.aws.amazon.com/redshift/latest/dg/zstd-encoding.html
*/
ZSTD = 'ZSTD',
}
17 changes: 17 additions & 0 deletions packages/@aws-cdk/aws-redshift/lib/table.ts
@@ -3,6 +3,7 @@ import { Construct, IConstruct } from 'constructs';
import { ICluster } from './cluster';
import { DatabaseOptions } from './database-options';
import { DatabaseQuery } from './private/database-query';
import { ColumnEncoding } from './private/database-query-provider';
import { HandlerName } from './private/database-query-provider/handler-name';
import { getDistKeyColumn, getSortKeyColumns } from './private/database-query-provider/util';
import { TableHandlerProps } from './private/handler-props';
@@ -79,6 +80,20 @@ export interface Column {
* @default - column is not a SORTKEY
*/
readonly sortKey?: boolean;

/**
* The encoding to use for the column.
*
* @default - Amazon Redshift determines the encoding based on the data type.
*/
readonly encoding?: ColumnEncoding;

/**
* A comment to attach to the column.
*
* @default - no comment
*/
readonly comment?: string;
}

/**
@@ -344,3 +359,5 @@ export enum TableSortStyle {
*/
INTERLEAVED = 'INTERLEAVED',
}

export { ColumnEncoding } from './private/database-query-provider';