StarCluster
You need to install StarCluster on your local machine. Please follow the directions in the StarCluster installation guide to install it on your workstation.
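For reference, StarCluster is a Python package, so on most workstations it can be installed from PyPI and will offer to write a template configuration file the first time you run it (the exact commands may vary with your Python setup):
# install StarCluster from PyPI (may require sudo)
pip install StarCluster
# running any command without a config offers to create ~/.starcluster/config
starcluster help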
Follow the StarCluster Quick Tutorial to start your first cluster. Please note that you will need to enable experimental features to use the StarCluster get and put commands.
# in the global options, set ENABLE_EXPERIMENTAL to True
ENABLE_EXPERIMENTAL=True
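This setting lives in the [global] section of your StarCluster configuration file (~/.starcluster/config by default), for example:
[global]
# other global options ...
ENABLE_EXPERIMENTAL = True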
We provide an EBS snapshot (snap-1f73906b) of RUM version 2.0.2_07 along with the pre-compiled RUM genome indexes. You can create a volume from this snapshot in two ways:
If you have the EC2 API Tools installed, you can use them to create the volume. Again, please pay attention to the availability zone so that it matches the one you will be requesting in the StarCluster config.
$ ec2-create-volume --private-key pk-XXXX.pem --cert cert-XXXX.pem --availability-zone us-east-1a --snapshot snap-1f73906b --size 100
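You can then confirm that the volume was created and note its volume ID with ec2-describe-volumes (part of the same EC2 API Tools; the region shown here is only an example):
$ ec2-describe-volumes --private-key pk-XXXX.pem --cert cert-XXXX.pem --region us-east-1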
Alternatively (and this is the approach we use below), you can use the Python boto library to create a volume pre-populated with RUM v2.0.2_07 and all of the indexes.
First make sure that boto is installed.
# you may need to use sudo to do this
pip install boto
Then use the following script to create a new EBS volume from the public RUM snapshot. Pay particular attention to the availability zone to make sure it matches StarCluster's configured zone. The script also assumes that you have exported your AWS credentials to your shell:
export AWS_ACCESS_KEY_ID=XXXXXXXXXXXXX
export AWS_SECRET_ACCESS_KEY=XXXXXXXXXXXXXXXXXXXXXXXXXXXXX
The script is available as a gist on GitHub, but its contents are also shown below:
#!/usr/bin/env python
import boto
from os import environ

# connect to EC2 using the credentials exported above
conn = boto.connect_ec2(environ["AWS_ACCESS_KEY_ID"], environ["AWS_SECRET_ACCESS_KEY"])

# volume size in GB
size = 100
# adjust to the same availability zone as your StarCluster config
zone = "us-east-1a"
# public snapshot of RUM v2.0.2_07 and its genome indexes
snapshot = "snap-1f73906b"

v = conn.create_volume(size, zone, snapshot)
print(v)
Save this script to your filesystem and run it (in this case we named it create_rum_ebs.py):
python create_rum_ebs.py
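The script prints the newly created volume object, but the volume can take a minute or two to become available. If you want to wait for it from Python, the following optional helper (our own sketch, not part of RUM or StarCluster; the script name and volume ID are placeholders) polls the volume until boto reports it as available:
#!/usr/bin/env python
# wait_for_volume.py -- optional helper script (hypothetical, not part of RUM)
import sys
import time
import boto
from os import environ

conn = boto.connect_ec2(environ["AWS_ACCESS_KEY_ID"], environ["AWS_SECRET_ACCESS_KEY"])

# pass the volume ID printed by create_rum_ebs.py, e.g. vol-a1b2c3d4
volume_id = sys.argv[1]
v = conn.get_all_volumes([volume_id])[0]

# Volume.update() refreshes the object and returns the current status string
while v.update() != "available":
    time.sleep(5)
print("%s is now available" % v.id)
For example: python wait_for_volume.py vol-a1b2c3d4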
Now add the newly created volume to your StarCluster configuration file.
#############################
## Configuring EBS Volumes ##
#############################
[vol rum]
volume_id = vol-a1b2c3d4
mount_path = /rum
Next, we create and format the volumes for storing input reads and results. RUM needs lots of scratch space. It is recommended that you allow for 1TB of space per Illumina flowcell analysis. We will use StarCluster for this task:
starcluster createvolume --name=ngsdata -d -m "mkfs.ext4" 1024 us-east-1a
starcluster createvolume --name=rumresults 1024 us-east-1a
NOTE: the --name option is for tagging the volume on AWS. If you are booting multiple clusters, it is good practice to prefix that name with the cluster's ID. For example, if you have two clusters called "rum" and "physics", then the above would be changed to:
starcluster createvolume --name=rum-ngsdata 1024 us-east-1a
starcluster createvolume --name=rum-rumresults 1024 us-east-1a
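Either way, you can verify the new volumes and look up their IDs (needed for the volume_id values below) with StarCluster's listvolumes command:
$ starcluster listvolumes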
Now add those volumes to your configuration:
[vol rumresults]
volume_id = vol-a123bcd5
mount_path = /rumresults
[vol ngsdata]
volume_id = vol-a123bcd6
mount_path = /ngsdata
Last but not least, add all of the volumes to your cluster definition. In the following, we create a cluster template named rum-small:
[cluster rum-small]
# other cluster options ...
VOLUMES = rum, rumresults, ngsdata
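A fuller rum-small template might look something like the following; the key name, AMI, instance type, and cluster size are placeholders that you should replace with your own values:
[cluster rum-small]
KEYNAME = mykey
CLUSTER_SIZE = 3
NODE_IMAGE_ID = ami-xxxxxxxx
NODE_INSTANCE_TYPE = m1.large
VOLUMES = rum, rumresults, ngsdata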
Finally, we are able to fire up StarCluster. In this example, we have defined a cluster profile called rum-small with two execution nodes.
$ starcluster start rum-small
To use spot pricing, you can either set the SPOT_BID parameter in the config or use the --bid option:
# set a max spot bid price of $1.50
$ starcluster start --bid 1.50 rum-small
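To choose a sensible bid, you can look at recent spot prices for your instance type with StarCluster's spothistory command (m1.large is just an example):
$ starcluster spothistory m1.large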
Once StarCluster reports that the cluster is started, you should be able to ssh to the master node:
$ starcluster sshmaster rum-small
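With ENABLE_EXPERIMENTAL set as described above, you can also copy reads onto the cluster and fetch results back with the put and get commands (the file and directory names here are only examples):
# copy reads from your workstation to the shared ngsdata volume
$ starcluster put rum-small reads_1.fastq /ngsdata/
# copy results back to your workstation when the run finishes
$ starcluster get rum-small /rumresults/my_run .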
If you mounted the rum volume under a mount point other than /rum, you will need to edit the RUM configuration files as appropriate to define absolute paths to the indexes and executables.