This repository has been archived by the owner on Feb 19, 2022. It is now read-only.

S3 nginx proxy where a provided link -> S3 bucket

fartbagxp/s3-proxy

License: MIT

Overview

This is an nginx proxy that serves large binary data (e.g. PDFs) from an S3 bucket.

Given an S3 bucket (private or public), the proxy simply re-routes the requested URL (via proxy_pass) to the corresponding S3 bucket resource.

The proxy routes all requests to a specific path on an S3 bucket, defined by NGINX_S3_BUCKET (such as http://<xxx>.s3-website-us-east-1.amazonaws.com/).

For simplicity, this is one way it can look:

[Figure: nginx s3 proxy]

AWS has a more extensive architectural diagram of what this can look like.

Setup

To try this out locally, you should have the following:

  • A presigned URL for an AWS S3 resource.
  • Certificates and keys for SSL, and a Diffie–Hellman key for forward secrecy (use the ones in the config folder for local development, if you're too lazy to generate your own).
  • Docker (optional)

Configuration

Configuration is handled by environment variables. The following variables are available:

  • NGINX_SERVER_NAME: The port on which this nginx proxy listens
  • NGINX_S3_BUCKET: The S3 bucket URL where the resources are stored (e.g. http://<xxx>.s3-website-us-east-1.amazonaws.com/)
  • NGINX_SSL_CERT_PATH: The SSL certificate to identify the server
  • NGINX_SSL_KEY_PATH: The server's private key, used to encrypt traffic between the server and client
  • NGINX_SSL_DH_PATH: The Diffie–Hellman key for generating session keys for perfect forward secrecy
  • NGINX_DNS_IP_1: The primary DNS resolver to use
  • NGINX_DNS_IP_2: The secondary DNS resolver to use (in case the first one fails)
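
Since every setting above comes from the environment, it can help to confirm that nothing is missing before building the configuration. The following is a minimal sketch, not part of the repository: missing_vars is a hypothetical helper that prints the name of each variable from the list above that is unset or empty.

```shell
# Hypothetical helper (not part of this repository): print the name of every
# variable from the list above that is unset or empty, one per line.
missing_vars() {
    for name in NGINX_SERVER_NAME NGINX_S3_BUCKET NGINX_SSL_CERT_PATH \
                NGINX_SSL_KEY_PATH NGINX_SSL_DH_PATH NGINX_DNS_IP_1 NGINX_DNS_IP_2; do
        # Indirect lookup of the variable called $name (POSIX sh has no ${!name}).
        eval "value=\${$name:-}"
        [ -n "$value" ] || printf '%s\n' "$name"
    done
}
```

Running missing_vars with everything set prints nothing; otherwise it lists the names that still need a value.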

Presign URL for AWS S3 resources

The normal use case is that a presigned URL will be generated so that the proxy does not need to know anything about authentication with AWS.

By default, a presigned URL expires in an hour (3600 seconds).

aws s3 presign s3://mybucket/myobject

The following generates a URL that expires in 300 seconds.

aws s3 presign s3://mybucket/myobject --expires-in 300

Generated keys and certs

  • To generate a 2048-bit private key and a self-signed certificate, run:
openssl req -newkey rsa:2048 -nodes -keyout config/domain.key -x509 -days 365 -out config/domain.crt

The domain.crt and domain.key files will appear in the config directory.

  • To generate a Diffie-Hellman (DH) key for TLS, run:
openssl dhparam -out config/dhparam.pem 4096
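
If nginx later refuses to start because the certificate and key do not belong together, the pair can be checked by comparing their RSA moduli. This is a minimal sketch under the assumption that the key is an unencrypted RSA key, as produced by the command above; cert_matches_key is a hypothetical helper, not part of this repository.

```shell
# Hypothetical helper: succeed when the certificate ($1) and the private key ($2)
# carry the same RSA modulus, i.e. they belong to the same key pair.
cert_matches_key() {
    cert_mod=$(openssl x509 -noout -modulus -in "$1" 2>/dev/null) || return 1
    key_mod=$(openssl rsa -noout -modulus -in "$2" 2>/dev/null) || return 1
    [ "$cert_mod" = "$key_mod" ]
}
```

For example, cert_matches_key config/domain.crt config/domain.key returns success for a matching pair. To skip the interactive prompts when generating the certificate, openssl req also accepts a subject on the command line, for instance -subj "/CN=localhost" (an assumed subject for local testing).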

Building the configuration

The build.sample.sh script builds the nginx.conf file.

Essentially, it fills in the nginx configuration with the values of your environment variables.

cp build.sample.sh build.sh
  • Modify build.sh with your environment variables.
sh build.sh

The nginx.conf will appear in the local directory.
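
The contents of build.sample.sh are not reproduced here. As an illustration of the general technique only, the sketch below fills placeholders in a template with values from the environment. The @NAME@ placeholder convention and the render_template helper are assumptions made for this example, not the repository's actual mechanism.

```shell
# Hypothetical sketch: read a template on stdin and replace @NAME@ placeholders
# (an assumed convention) with the values of the matching environment variables.
# Values must not contain "|", "&" or backslashes, which are special to sed here.
render_template() {
    sed -e "s|@NGINX_S3_BUCKET@|${NGINX_S3_BUCKET}|g" \
        -e "s|@NGINX_DNS_IP_1@|${NGINX_DNS_IP_1}|g" \
        -e "s|@NGINX_DNS_IP_2@|${NGINX_DNS_IP_2}|g"
}
```

With such a helper, a template line like "resolver @NGINX_DNS_IP_1@ @NGINX_DNS_IP_2@;" would come out with both resolver addresses filled in.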

Deployment

Local

For local deployment, I simply use a slim Docker image, mostly for testing purposes.

Here's a quick and dirty script for testing (using the port from build.sample.sh).

sh build.sh
docker run --name nginx-s3-proxy \
    -p 3000:80 \
    -p 443:443 \
    -v ${PWD}/nginx.conf:/etc/nginx/nginx.conf \
    -v ${PWD}/config/domain.crt:/etc/nginx/domain.crt \
    -v ${PWD}/config/domain.key:/etc/nginx/domain.key \
    -v ${PWD}/config/dhparam.pem:/etc/nginx/dhparam.pem \
    -d nginx:1.13.12-alpine

At this point, you should be able to go to localhost:3000, which automatically redirects you to https://localhost. Once you accept your own self-signed certificate, you will get an Access Denied message.

To get to your file, append the resource part of the presigned URL to the proxy address: localhost:3000/<resource>.
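
As a rough illustration of that last step, the sketch below derives the proxy URL from a presigned URL. It assumes that <resource> means everything after the host in the presigned URL (object path plus query string); resource_of is a hypothetical helper and the URL shown is a placeholder, not real `aws s3 presign` output.

```shell
# Hypothetical helper: strip the scheme and host from a URL, keeping the path
# and query string (assumed here to be what <resource> refers to).
resource_of() {
    rest="${1#*://}"                        # drop the scheme, e.g. "https://"
    case "$rest" in
        */*) printf '%s\n' "${rest#*/}" ;;  # drop the host up to the first "/"
        *)   printf '\n' ;;                 # the URL had no path at all
    esac
}

# Placeholder URL for illustration; a real one comes from `aws s3 presign`.
presigned='https://mybucket.s3.amazonaws.com/myobject?X-Amz-Expires=300&X-Amz-Signature=EXAMPLE'
echo "localhost:3000/$(resource_of "$presigned")"
```

Keeping the query string intact matters, since the signature parameters travel in it.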

Other Considerations

One alternative for fetching data from a private AWS S3 bucket is to use AWS API Gateway to proxy the HTTP request.

The API gateway can act as either an HTTP proxy or an AWS Lambda proxy.

It can expose a URL that maps a GET https://your-api-host/stage/ request to the backend GET https://your-s3-host/.

Because it is a managed service, AWS handles scaling automatically.

The major drawback of this approach is the integration timeout that API Gateway imposes on HTTP proxy and AWS Lambda integrations, which is currently set to 30 seconds.

To download a 1GB binary file before the timeout, a client would need to sustain roughly 286.3 Mbps (2^30 bytes × 8 bits ÷ 30 seconds).
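
The throughput requirement can be reproduced with a one-line calculation. This sketch assumes that "1GB" means 2^30 bytes and that the whole 30 seconds is available for the transfer; min_mbps is a hypothetical helper introduced only for this example.

```shell
# Hypothetical helper: minimum sustained rate in Mbps (10^6 bits per second)
# needed to transfer $1 bytes within $2 seconds.
min_mbps() {
    awk -v bytes="$1" -v secs="$2" \
        'BEGIN { printf "%.1f\n", bytes * 8 / secs / 1000000 }'
}

min_mbps 1073741824 30   # 1 GiB (2^30 bytes) in 30 seconds
min_mbps 1000000000 30   # 1 GB (10^9 bytes) in 30 seconds
```

The two calls print 286.3 and 266.7 respectively, so the requirement depends on whether the file size is counted in binary or decimal units.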

Troubleshooting

  • A "SignatureDoesNotMatch" error appears in the response when you try to hit the server.

    The bucket name is currently set to a non-existent bucket, so make sure to check the NGINX_S3_BUCKET environment variable before proceeding.
