
Deploy DynamoDB Connector without Serverless Application Repository or CloudFormation Permissions


This page is for Athena Federation users who want to use the DynamoDB connector but do not have the IAM permissions in their organization to use the Serverless Application Repository (SAR) or CloudFormation (CFN). The following is a script you can use to manually perform the actions that SAR and CFN would otherwise do for you.

Follow this guide to set up your development environment. You should clone the repository, but you do not need to build the modules. This page assumes you are using the same Java and gh CLI versions as the development environment in the guide; a quick way to check is shown below.
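
If you want to confirm your toolchain before continuing, a quick sanity check (compare the output against the versions listed in the setup guide):

java -version
gh --version
aws --version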

Please fulfill the following prerequisites first:

  • Export AWS credentials as environment variables in your terminal (or just use your ~/.aws/credentials file) with permissions to create IAM roles/policies, upload to S3, and create a Lambda function (see the example after this list).
  • cd into your Athena Federation repository.
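
If you choose to export credentials directly, a minimal sketch looks like the following. The placeholder values are yours to fill in; AWS_SESSION_TOKEN is only needed if you are using temporary credentials.

export AWS_ACCESS_KEY_ID=<your access key id>
export AWS_SECRET_ACCESS_KEY=<your secret access key>
export AWS_SESSION_TOKEN=<your session token, if using temporary credentials>
export AWS_DEFAULT_REGION=<customer region>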

Paste the following into your terminal. Then use your text editor of choice to set the variables in env_vars.sh.

cat <<EOF > env_vars.sh 
export REGION=<customer region>
export ACCOUNT_ID=<customer account id>
export BUCKET_NAME=<s3 bucket to use for hosting jar and spilled data. do not include the s3:// prefix. you will need to make this if it does not exist!>
export BUCKET_KEY=<path to file in s3, for example jars/ddb>
export ROLE_NAME=<name of role in IAM for your connector to use>
export POLICY_NAME=<name of policy in IAM for your connector to use>
export LAMBDA_FUNCTION_NAME=<function name you want>
EOF
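
If the bucket you named in BUCKET_NAME does not exist yet, create it before continuing. A hedged example (bucket names are globally unique, so pick one specific to you):

source env_vars.sh
aws s3 mb s3://$BUCKET_NAME --region $REGION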

Now you can just paste the following. If you are curious about what each step is doing, there are comments in the script.

# source environment variables
source env_vars.sh

# make a directory for the resources we need
mkdir manual_upload
cd manual_upload

# Step 1 - write the JSON for the IAM policy the connector will use
cat <<EOF > iam_policy.json
{
    "Version": "2012-10-17",
            "Statement": [
                {
                    "Action": [
                        "logs:CreateLogGroup",
                        "logs:CreateLogStream",
                        "logs:PutLogEvents"
                    ],
                    "Effect": "Allow",
                    "Resource": [
                        "arn:aws:logs:$REGION:$ACCOUNT_ID:*"
                    ]
                },
                {
                    "Action": [
                        "dynamodb:DescribeTable",
                        "dynamodb:ListSchemas",
                        "dynamodb:ListTables",
                        "dynamodb:Query",
                        "dynamodb:Scan",
                        "glue:GetTableVersions",
                        "glue:GetPartitions",
                        "glue:GetTables",
                        "glue:GetTableVersion",
                        "glue:GetDatabases",
                        "glue:GetTable",
                        "glue:GetPartition",
                        "glue:GetDatabase",
                        "athena:GetQueryExecution",
                        "s3:ListAllMyBuckets"
                    ],
                    "Resource": "*",
                    "Effect": "Allow"
                },
                {
                    "Action": [
                        "s3:GetObject",
                        "s3:ListBucket",
                        "s3:GetBucketLocation",
                        "s3:GetObjectVersion",
                        "s3:PutObject",
                        "s3:PutObjectAcl",
                        "s3:GetLifecycleConfiguration",
                        "s3:PutLifecycleConfiguration",
                        "s3:DeleteObject"
                    ],
                    "Resource": [
                        "arn:aws:s3:::$BUCKET_NAME",
                        "arn:aws:s3:::$BUCKET_NAME/*"
                    ],
                    "Effect": "Allow"
                }
            ]
}
EOF

# write the JSON trust policy that allows Lambda to assume the role
cat <<EOF > assume_role.json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal":
                {
                    "Service": [
                        "lambda.amazonaws.com"
                    ]
                },
            "Action": [
                "sts:AssumeRole"
            ]
        }
    ]
}
EOF
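
# optional sanity check (assumes python3 is on your PATH): confirm both documents are valid JSON
python3 -m json.tool iam_policy.json > /dev/null && python3 -m json.tool assume_role.json > /dev/null && echo "policy JSON looks valid"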

# Step 2 - create the role and policy in IAM and attach the policy to the role

aws iam create-role --role-name $ROLE_NAME --assume-role-policy-document file://assume_role.json --region $REGION
aws iam create-policy --policy-name $POLICY_NAME --policy-document file://iam_policy.json
aws iam attach-role-policy --role-name $ROLE_NAME --policy-arn arn:aws:iam::$ACCOUNT_ID:policy/$POLICY_NAME
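
# optional sanity check: confirm the role exists and the policy is attached
aws iam get-role --role-name $ROLE_NAME --query 'Role.Arn' --output text
aws iam list-attached-role-policies --role-name $ROLE_NAME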

# Step 3 - download the latest Athena DynamoDB connector JAR from GitHub and upload it to S3
cd ..
export LATEST_VERSION=$(gh release list --exclude-drafts -L 1 | sed 's/.*\s\+Latest\s\+v\(.*\)\s\+.*/\1/g')
export RELEASE_NAME=v$LATEST_VERSION
gh release download $RELEASE_NAME -p "athena-dynamodb-$LATEST_VERSION.jar" -D manual_upload
aws s3 cp manual_upload/athena-dynamodb-$LATEST_VERSION.jar s3://$BUCKET_NAME/$BUCKET_KEY
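
# optional sanity check: confirm the jar landed at the expected key
aws s3 ls s3://$BUCKET_NAME/$BUCKET_KEY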

# Step 4 - create lambda
cd manual_upload
cat <<EOF > code.json
{
    "S3Bucket": "$BUCKET_NAME",
    "S3Key": "$BUCKET_KEY"
}
EOF

# wait a few seconds for the IAM role to propagate; otherwise the trust policy is sometimes not ready when the Lambda is created.
sleep 5

aws lambda create-function --function-name $LAMBDA_FUNCTION_NAME \
--runtime java11 \
--code file://code.json \
--handler com.amazonaws.athena.connectors.dynamodb.DynamoDBCompositeHandler \
--role arn:aws:iam::$ACCOUNT_ID:role/$ROLE_NAME \
--memory-size 3008 \
--timeout 900 \
--environment "Variables={spill_bucket=$BUCKET_NAME,spill_prefix=athena-spill,disable_spill_encryption=false}"
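
# optional sanity check: the function should report "Active" once creation finishes
aws lambda get-function-configuration --function-name $LAMBDA_FUNCTION_NAME --query 'State' --output text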

# now clean up
cd ..
rm -rf manual_upload

Now you can start running queries in Athena by pointing a new data source at the Lambda function you just created, either in the Athena console or with the CLI as sketched below.
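
If you prefer to register the data source from the CLI as well, here is a hedged sketch. The catalog name dynamodb_connector is just an example; use whatever name you want to reference in your queries.

source env_vars.sh
aws athena create-data-catalog \
    --name dynamodb_connector \
    --type LAMBDA \
    --parameters function=arn:aws:lambda:$REGION:$ACCOUNT_ID:function:$LAMBDA_FUNCTION_NAME \
    --region $REGION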