# Cloud Tier
Cloud storage is an ideal place to back up warm data. It scales easily, and its cost is usually low compared to on-premise storage servers. Uploading to the cloud is usually free; however, accessing cloud storage is usually slow and not free.
SeaweedFS is fast, but its capacity is limited by the number of available volume servers.
A good approach is to combine SeaweedFS's fast local access speed with cloud storage's elastic capacity.
Assuming hot data is 20% and warm data is 80%, we can move the warm data to cloud storage. Access to the warm data will be slower, but this frees up 80% of the servers, or lets them be repurposed for faster local access instead of just storing rarely accessed warm data. The integration is completely transparent to SeaweedFS users.
With a fixed number of servers, this transparent cloud integration gives SeaweedFS effectively unlimited capacity in addition to its fast speed. Just add more local SeaweedFS volume servers to increase throughput.
If a volume is tiered to the cloud:
- The volume is marked as read-only.
- The index file is still local.
- The `.dat` file is moved to the cloud.
- The same O(1) disk read applies to the remote file: when a file entry is requested, a single range request retrieves the entry's content (see the sketch below).
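
To make the O(1) read concrete: suppose the local index says a needle starts at offset 1048576 and is 4096 bytes long. A single HTTP range request for bytes 1048576-1052671 fetches exactly the needle's content. Here is a hypothetical illustration with `curl`, assuming a path-style URL, an illustrative object key, and a publicly readable object for simplicity; SeaweedFS itself issues the equivalent authenticated request through the S3 API:

```
# hypothetical needle at offset 1048576, size 4096 => bytes 1048576..1052671
curl -r 1048576-1052671 https://s3.us-west-1.amazonaws.com/one_bucket/37.dat
```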
- Use `weed scaffold -conf=master` to generate `master.toml`, tweak it, and start the master server with the `master.toml` (see the example after this list).
- Use `volume.tier.upload` in `weed shell` to move volumes to the cloud.
- Use `volume.tier.download` in `weed shell` to move volumes back to the local cluster.
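
For example, a minimal sketch of the first step; this assumes the master picks up `master.toml` from its working directory (SeaweedFS also checks `~/.seaweedfs/` and `/etc/seaweedfs/`), and the `-mdir` path is illustrative:

```bash
# print the default master configuration and save it
weed scaffold -conf=master > master.toml
# edit the [storage.backend] section, then (re)start the master
weed master -mdir=./master-data
```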
(Currently only S3 is supported. More backends are coming soon.)
Multiple S3 buckets are supported. Usually you just need to configure one backend, in `master.toml`:
```toml
[storage.backend]
  [storage.backend.s3.default]
    enabled = true
    aws_access_key_id = ""      # if empty, loads from the shared credentials file (~/.aws/credentials).
    aws_secret_access_key = ""  # if empty, loads from the shared credentials file (~/.aws/credentials).
    region = "us-west-1"
    bucket = "one_bucket"       # an existing bucket

  [storage.backend.s3.name2]
    enabled = true
    aws_access_key_id = ""      # if empty, loads from the shared credentials file (~/.aws/credentials).
    aws_secret_access_key = ""  # if empty, loads from the shared credentials file (~/.aws/credentials).
    region = "us-west-2"
    bucket = "one_bucket_two"   # an existing bucket
```
After this is configured, you can use the following commands in `weed shell` to upload the `.dat` file content to the cloud.
```
// move the volume 37.dat to the s3 cloud
volume.tier.upload -dest=s3 -collection=benchmark -volumeId=37

// or
volume.tier.upload -dest=s3.default -collection=benchmark -volumeId=37

// if for any reason you want to move the volume to a different bucket
volume.tier.upload -dest=s3.name2 -collection=benchmark -volumeId=37
```
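
To move a tiered volume back to local disks, `volume.tier.download` mirrors the upload command. A sketch, based on the flags shown above; run it in `weed shell`:

```
// move the volume 37 back from the cloud to the local cluster
volume.tier.download -collection=benchmark -volumeId=37
```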