S3PyPI is a CLI for creating a Python Package Repository in an S3 bucket.
The official Python Package Index (PyPI) is a public
repository of Python software. It's used by pip to download packages.
If you work at a company, you may wish to publish your packages somewhere
private instead, and still have them be accessible via pip install. This
requires hosting your own repository.
S3PyPI enables hosting a private repository at a low cost. It requires only an S3 bucket for storage, and some way to serve files over HTTPS (e.g. Amazon CloudFront).
Publishing packages and index pages to S3 is done using the s3pypi CLI.
Creating the S3 bucket and CloudFront distribution is done using a provided
Terraform configuration, which you can tailor to your own needs.
- AWS CodeArtifact is a fully managed service that integrates with IAM.
Install s3pypi using pip:
$ pip install s3pypi
Before you can start using s3pypi, you must set up an S3 bucket for storing
packages, and a CloudFront distribution for serving files over HTTPS. Both of
these can be created using the Terraform configuration provided in the
terraform/ directory:
$ git clone https://github.com/gorilla-co/s3pypi.git
$ cd s3pypi/terraform/
$ terraform init
$ terraform apply
You will be asked to enter your desired AWS region, S3 bucket name, and domain
name for CloudFront. You can also enter these in a file named
config.auto.tfvars:
region = "eu-west-1"
bucket = "example-bucket"
domain = "pypi.example.com"
The Terraform configuration assumes that a Route 53 hosted zone exists for
your domain, with a matching (wildcard) certificate in AWS Certificate
Manager. If your certificate is a wildcard certificate, add
use_wildcard_certificate = true to config.auto.tfvars.
To ensure that concurrent invocations of s3pypi do not overwrite each other's
changes, the objects in S3 can be locked via an optional DynamoDB table (using
the --lock-indexes option). To create this table, add
enable_dynamodb_locking = true to config.auto.tfvars.
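The general idea behind such a lock can be sketched in a few lines. The example below uses a hypothetical in-memory `LockTable` as a stand-in for the DynamoDB table, and a hypothetical `index_lock` helper; it only illustrates the "write the lock item only if it does not exist yet" pattern and is not s3pypi's actual implementation.

```python
import threading
from contextlib import contextmanager


class LockHeldError(RuntimeError):
    """Raised when another invocation already holds the lock."""


class LockTable:
    """Hypothetical in-memory stand-in for a DynamoDB lock table."""

    def __init__(self):
        self._items = set()
        self._mutex = threading.Lock()

    def put_if_absent(self, lock_id: str) -> bool:
        # Mimics a conditional write: succeeds only if no item
        # with this ID exists yet.
        with self._mutex:
            if lock_id in self._items:
                return False
            self._items.add(lock_id)
            return True

    def delete(self, lock_id: str) -> None:
        with self._mutex:
            self._items.discard(lock_id)


@contextmanager
def index_lock(table: LockTable, lock_id: str):
    """Hold the lock for `lock_id` while the caller updates an index page."""
    if not table.put_if_absent(lock_id):
        raise LockHeldError(f"{lock_id} is locked by another invocation")
    try:
        yield
    finally:
        # Always release the lock, even if the update failed.
        table.delete(lock_id)
```

A caller would wrap its read-modify-write of an index page in `with index_lock(table, "your-project"): ...`, so that a second invocation fails fast (or retries) instead of silently overwriting the first one's changes.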
To enable basic authentication, add enable_basic_auth = true to
config.auto.tfvars. This will attach a Lambda@Edge function to your
CloudFront distribution that reads user passwords from AWS Systems Manager
Parameter Store. Users and passwords can be configured using the put_user.py
script:
$ basic_auth/put_user.py pypi.example.com alice
Password:
This creates a parameter named /s3pypi/pypi.example.com/users/alice. Passwords
are hashed with a random salt, and stored as JSON objects:
{
  "password_hash": "7364151acc6229ec1468f54986a7614a8b215c26",
  "password_salt": "RRoCSRzvYJ1xRra2TWzhqS70wn84Sb_ElKxpl49o3Y0"
}
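To make the stored format concrete, here is a minimal sketch of how such a record could be produced and checked. The 40-character hex hash in the example above is consistent with SHA-1, but the hash function, the order in which password and salt are combined, and all function names below are assumptions; consult basic_auth/put_user.py for the actual scheme.

```python
import hashlib
import hmac
import json
import secrets


def parameter_name(domain: str, username: str) -> str:
    """Build the Parameter Store name shown above."""
    return f"/s3pypi/{domain}/users/{username}"


def hash_password(password: str, salt: str) -> str:
    # ASSUMPTION: SHA-1 over password + salt. The real script may differ.
    return hashlib.sha1((password + salt).encode()).hexdigest()


def make_user_record(password: str) -> str:
    """Return a JSON record with a fresh random salt."""
    salt = secrets.token_urlsafe(32)
    record = {
        "password_hash": hash_password(password, salt),
        "password_salt": salt,
    }
    return json.dumps(record, indent=2)


def verify_password(password: str, record_json: str) -> bool:
    """Re-hash the candidate password with the stored salt and compare."""
    record = json.loads(record_json)
    candidate = hash_password(password, record["password_salt"])
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(candidate, record["password_hash"])
```

Because each record carries its own salt, two users with the same password end up with different hashes.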
The Terraform configuration can also be included in your own project as a module:
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 4.0"
    }
  }
}

provider "aws" {
  region = "eu-west-1"
}

provider "aws" {
  alias  = "us_east_1"
  region = "us-east-1"
}

module "s3pypi" {
  source = "github.com/gorilla-co/s3pypi//terraform/modules/s3pypi"

  bucket = "example-bucket"
  domain = "pypi.example.com"

  use_wildcard_certificate = true
  enable_dynamodb_locking  = true
  enable_basic_auth        = true

  providers = {
    aws.us_east_1 = aws.us_east_1
  }
}
Existing resources created using the CloudFormation templates from s3pypi 0.x can be imported into Terraform and removed from CloudFormation. For example:
$ terraform init
$ terraform import module.s3pypi.aws_s3_bucket.pypi example-bucket
$ terraform import module.s3pypi.aws_cloudfront_distribution.cdn EDFDVBD6EXAMPLE
$ terraform apply
In this new configuration, CloudFront uses the S3 REST API endpoint as its
origin, not the S3 website endpoint. This allows the bucket to remain private,
with CloudFront accessing it through an Origin Access Identity (OAI). To make
this work with your existing S3 bucket, all <package>/index.html objects must
be renamed to <package>/. You can do so using the provided script:
$ scripts/migrate-s3-index.py example-bucket
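The renaming rule itself is simple and can be illustrated without touching S3. The sketch below only computes the new key for each existing key; the function names are hypothetical, and it is not the code of the provided migration script, which also has to perform the copy and delete in the bucket.

```python
from typing import Iterable, List, Optional, Tuple

INDEX_NAME = "index.html"


def migrated_key(key: str) -> Optional[str]:
    """Return the new key for a `<package>/index.html` object.

    Returns None for keys that do not need to be renamed.
    """
    if not key.endswith("/" + INDEX_NAME):
        return None
    # Drop the file name but keep the trailing slash: "pkg/index.html" -> "pkg/"
    return key[: -len(INDEX_NAME)]


def plan_renames(keys: Iterable[str]) -> List[Tuple[str, str]]:
    """List (old_key, new_key) pairs for every index object in `keys`."""
    plan = []
    for key in keys:
        new_key = migrated_key(key)
        if new_key is not None:
            plan.append((key, new_key))
    return plan
```

Package files such as archives and wheels are left where they are; only the index objects move.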
To instead keep using the old configuration with a publicly accessible S3 website endpoint, pass the following options when uploading packages:
$ s3pypi upload ... --index.html --s3-put-args='ACL=public-read'
The s3pypi CLI requires the following IAM permissions to access S3 and
(optionally) DynamoDB. Replace example-bucket with your S3 bucket name.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject"
      ],
      "Resource": "arn:aws:s3:::example-bucket/*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket"
      ],
      "Resource": "arn:aws:s3:::example-bucket"
    },
    {
      "Effect": "Allow",
      "Action": [
        "dynamodb:GetItem",
        "dynamodb:PutItem",
        "dynamodb:DeleteItem"
      ],
      "Resource": "arn:aws:dynamodb:*:*:table/example-bucket-locks"
    }
  ]
}
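If you manage several buckets, it may be convenient to generate this policy rather than edit it by hand. The following is a minimal sketch with a hypothetical `s3pypi_policy` helper. It assumes the lock table is named `<bucket>-locks`, which is inferred from the example above rather than confirmed; check your Terraform output for the real table name.

```python
import json


def s3pypi_policy(bucket: str, with_locking: bool = True) -> dict:
    """Build the IAM policy document shown above for a given bucket."""
    statements = [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
            "Resource": f"arn:aws:s3:::{bucket}/*",
        },
        {
            "Effect": "Allow",
            "Action": ["s3:ListBucket"],
            "Resource": f"arn:aws:s3:::{bucket}",
        },
    ]
    if with_locking:
        # ASSUMPTION: the lock table is named "<bucket>-locks".
        statements.append(
            {
                "Effect": "Allow",
                "Action": [
                    "dynamodb:GetItem",
                    "dynamodb:PutItem",
                    "dynamodb:DeleteItem",
                ],
                "Resource": f"arn:aws:dynamodb:*:*:table/{bucket}-locks",
            }
        )
    return {"Version": "2012-10-17", "Statement": statements}


if __name__ == "__main__":
    print(json.dumps(s3pypi_policy("example-bucket"), indent=2))
```

Pass `with_locking=False` to omit the DynamoDB statement when you do not use the --lock-indexes option.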
You can now use s3pypi to upload packages to S3:
$ cd /path/to/your-project/
$ python setup.py sdist bdist_wheel
$ s3pypi upload dist/* --bucket example-bucket [--prefix PREFIX]
See s3pypi --help for a description of all options.
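When uploading from a build script rather than a shell, the same command can be assembled as an argument list. This is a sketch with a hypothetical `upload_command` helper; it uses only the upload subcommand and the --bucket and --prefix options shown above.

```python
from typing import List, Optional, Sequence


def upload_command(
    dists: Sequence[str], bucket: str, prefix: Optional[str] = None
) -> List[str]:
    """Assemble the argument list for `s3pypi upload`."""
    if not dists:
        raise ValueError("no distribution files to upload")
    cmd = ["s3pypi", "upload", *dists, "--bucket", bucket]
    if prefix:
        cmd += ["--prefix", prefix]
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`. Note that no shell is involved in that case, so a pattern like dist/* is not expanded for you; collect the files first, for example with `glob.glob("dist/*")`.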
Install your packages using pip by pointing the --extra-index-url to your
CloudFront domain. If you used --prefix while uploading, then add the prefix
here as well:
$ pip install your-project --extra-index-url https://pypi.example.com/PREFIX/
Alternatively, you can configure the index URL in ~/.pip/pip.conf:
[global]
extra-index-url = https://pypi.example.com/PREFIX/
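The index URL is the CloudFront domain followed by the optional prefix. As a small sketch, with hypothetical helper names, it can be derived like this so that the command line and the configuration file stay in sync:

```python
def index_url(domain: str, prefix: str = "") -> str:
    """Build the index URL for pip from the domain and optional prefix."""
    url = f"https://{domain}/"
    # Tolerate prefixes written as "PREFIX", "/PREFIX" or "PREFIX/".
    prefix = prefix.strip("/")
    if prefix:
        url += f"{prefix}/"
    return url


def pip_conf(domain: str, prefix: str = "") -> str:
    """Render the pip.conf contents shown above."""
    return f"[global]\nextra-index-url = {index_url(domain, prefix)}\n"
```

For example, `index_url("pypi.example.com", "PREFIX")` gives the URL used in the pip install command above.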
Currently there are no plans to add new features to s3pypi. If you have any ideas for new features, check out our contributing guidelines on how to get these on our roadmap.
Got any questions or ideas? We'd love to hear from you. Check out our contributing guidelines for ways to offer feedback and contribute.
Copyright (c) Gorillini NV. All rights reserved.
Licensed under the MIT License.