Cirrus is a STAC-based geospatial processing pipeline platform, implemented using a scalable architecture deployed on AWS. Cirrus provides the generic infrastructure for processing, allowing a user to focus on implementing the specific processing logic for their data.
As input, Cirrus takes a STAC ItemCollection, with a process definition block.
That input is called a Cirrus ProcessPayload (CPP).
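To make the shape of a CPP concrete, here is a minimal sketch in Python. It assumes the payload is a GeoJSON `FeatureCollection` carrying the STAC Items in `features` plus a top-level `process` block. The `workflow` key, the Item contents, and the `make_payload` helper are illustrative assumptions, not the authoritative schema; see the docs for the real payload format.

```python
import json


def make_payload(items, process):
    """Wrap STAC Items and a process block into an ItemCollection-shaped dict.

    Hypothetical helper for illustration; not part of cirrus-geo.
    """
    return {
        "type": "FeatureCollection",  # GeoJSON type used by STAC ItemCollections
        "features": list(items),
        "process": process,  # assumed location of the process definition block
    }


# A deliberately minimal STAC Item (null geometry, no assets).
item = {
    "type": "Feature",
    "stac_version": "1.0.0",
    "id": "example-item",
    "collection": "example-collection",
    "geometry": None,
    "properties": {"datetime": "2024-01-01T00:00:00Z"},
    "assets": {},
    "links": [],
}

# "workflow" and its value are assumed names, used only for this example.
payload = make_payload([item], {"workflow": "example-workflow"})
print(json.dumps(payload, indent=2))
```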
An input is run through a workflow that generates one or more output STAC Items. These output Items are added to the Cirrus static STAC catalog in S3, and are also broadcast via an SNS topic. Subscriptions to that topic can trigger additional workflows or external processes, such as indexing into a STAC API catalog (e.g., stac-server).
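As an illustration of an external process subscribed to that topic, the sketch below shows an AWS Lambda handler that unpacks an SNS event. The `Records[].Sns.Message` layout is the standard SNS-to-Lambda event shape. That each message body is a single STAC Item serialized as JSON is an assumption made for this example, and `items_from_sns_event` and `handler` are hypothetical names.

```python
import json


def items_from_sns_event(event):
    """Extract STAC Items from an SNS-triggered Lambda event.

    Assumes each SNS message body is one STAC Item encoded as JSON.
    """
    items = []
    for record in event.get("Records", []):
        message = record["Sns"]["Message"]  # SNS delivers the body as a string
        items.append(json.loads(message))
    return items


def handler(event, context=None):
    """Hypothetical subscriber: report each published Item and return their IDs."""
    ids = []
    for item in items_from_sns_event(event):
        print(f"received {item['id']} from collection {item.get('collection')}")
        ids.append(item["id"])
    return ids
```

A real subscriber would replace the `print` call with its own work, for example posting the Item to a STAC API.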
Cirrus workflows range from the simple publishing of unmodified input Items to the complex transformation of input Items and generation of wholly new output Items. The current state of CPP processing is tracked in a state database, which prevents duplicate processing and lets a user follow the state of any input through the pipeline.
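The idea of state tracking can be sketched with a toy in-memory tracker. This is not how Cirrus implements its state database; the class, its methods, and the state names used here are assumptions chosen only to show how recording state lets a pipeline skip duplicate work and report progress.

```python
class StateTracker:
    """Toy in-memory stand-in for a state database (illustration only)."""

    def __init__(self):
        self._states = {}  # payload ID -> state name

    def claim(self, payload_id):
        """Return True if the payload should run, False if it is a duplicate."""
        if self._states.get(payload_id) in ("PROCESSING", "COMPLETED"):
            return False  # already in flight or done, so skip it
        self._states[payload_id] = "PROCESSING"
        return True

    def finish(self, payload_id, ok=True):
        """Record the outcome of a run."""
        self._states[payload_id] = "COMPLETED" if ok else "FAILED"

    def state(self, payload_id):
        """Look up the current state, or None if the payload was never seen."""
        return self._states.get(payload_id)


tracker = StateTracker()
first = tracker.claim("payload-1")   # new payload, so it may run
second = tracker.claim("payload-1")  # duplicate while still processing
tracker.finish("payload-1")
print(first, second, tracker.state("payload-1"))
```

In this sketch a failed payload can be claimed again, which models retrying after an error.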
As shown in this high-level overview of Cirrus, users input data to Cirrus through the use of feeders. Feeders are simply programs that get or generate some type of STAC metadata, combine it with processing parameters, and pass it into Cirrus as a CPP.
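A feeder can be sketched in a few lines of Python. The version below takes STAC Items, pairs each with the same processing parameters, and hands the resulting payload to a caller-supplied `submit` function. The `feed` function, the `submit` callback, and the `workflow` key are hypothetical; how a payload is actually delivered to a Cirrus deployment is covered in the docs and is not shown here.

```python
def feed(items, process, submit):
    """Build one payload per STAC Item and pass each to `submit`.

    `submit` is any callable accepting a payload dict; in a real feeder it
    would deliver the payload to Cirrus. Returns the number of payloads sent.
    """
    count = 0
    for item in items:
        payload = {
            "type": "FeatureCollection",
            "features": [item],
            "process": process,  # assumed location of the process block
        }
        submit(payload)
        count += 1
    return count


# Usage: collect payloads in a list instead of sending them anywhere.
submitted = []
stac_items = [{"type": "Feature", "id": "scene-1"}, {"type": "Feature", "id": "scene-2"}]
sent = feed(stac_items, {"workflow": "example-workflow"}, submitted.append)
```

Passing the delivery step in as a callable keeps the feeder logic testable without any AWS resources.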
If you are developing new code for cirrus-geo, check out the Contributing Guide.
Documentation for deploying, using, and customizing Cirrus is contained within the docs directory:
- Learn how to get started
- Understand the architecture of Cirrus and key concepts
- Use Cirrus to process input data and publish resulting STAC Items
- Learn about the Cirrus component types, each of which represents a specific role within the Cirrus architecture
Cirrus is an open-source pipeline for processing geospatial data in AWS. It was originally developed by Element 84 under a NASA ACCESS project called Community Tools for Analysis of NASA Earth Observation System Data in the Cloud.