Skip to content

ACAPSproject/SOPHIA_v.0.1.1

Repository files navigation

The Solution Platform for Humanitarian Information Analysis (SOPHIA)

The Solutions Platform for Humanitarian Information Analysis, or SOPHIA, is a suite of services that aim to support data collection and humanitarian information analysis. The system is supported by ACAPS.

This repository hosts code and supplemental documentation for current version of SOPHIA, which allows users to interact with the UN OCHA ReliefWeb platform. This version is a first iteration in product development, supporting limited functionality.

While the SOPHIA platform is delivered through AirTable, the project can easily be modified to deliver its output as a JSON response to your preferred location.

Setting up environment variables and requirements

  • AWS: To develop using this repository, you must set up your environment variable according to the config.py file after installing the AWS CLI and your AWS credentials on your system (AWS CLI documentation)

  • Serverless: Sophia uses a serverless architecture and the Serverless Framework. To change Serverless configurations, change the settings in serverless.yml file.

  • AirTable API: SOPHIA delivers data to AirTable for ACAPS’ needs. To use the API and generate your API key, please follow the instructions in the AirTable documentation.

Serverless Framework Python Scheduled Cron on AWS

This template demonstrates how to develop and deploy a simple cron-like service running on AWS Lambda using the traditional Serverless Framework.

Schedule event type

This examples defines two functions, rateHandler and cronHandler, both of which are triggered by an event of schedule type, which is used for configuring functions to be executed at specific time or in specific intervals. For detailed information about schedule event, please refer to corresponding section of Serverless docs.

When defining schedule events, we need to use rate or cron expression syntax.

Rate expressions syntax

rate(value unit)

value - A positive number

unit - The unit of time. ( minute | minutes | hour | hours | day | days )

In below example, we use rate syntax to define schedule event that will trigger our rateHandler function every minute

functions:
  rateHandler:
    handler: handler.run
    events:
      - schedule: rate(1 minute)

Detailed information about rate expressions is available in official AWS docs.

Cron expressions syntax

cron(Minutes Hours Day-of-month Month Day-of-week Year)

All fields are required and time zone is UTC only.

Field Values Wildcards
Minutes 0-59 , - * /
Hours 0-23 , - * /
Day-of-month 1-31 , - * ? / L W
Month 1-12 or JAN-DEC , - * /
Day-of-week 1-7 or SUN-SAT , - * ? / L #
Year 192199 , - * /

In below example, we use cron syntax to define schedule event that will trigger our cronHandler function every second minute every Monday through Friday

functions:
  cronHandler:
    handler: handler.run
    events:
      - schedule: cron(0/2 * ? * MON-FRI *)

Detailed information about cron expressions in available in official AWS docs.

Usage

Deployment

This example is made to work with the Serverless Framework dashboard, which includes advanced features such as CI/CD, monitoring, metrics, etc.

In order to deploy with dashboard, you need to first login with:

serverless login

and then perform deployment with:

serverless deploy

After running deploy, you should see output similar to:

Deploying aws-python-scheduled-cron-project to stage dev (us-east-1)

✔ Service deployed to stack aws-python-scheduled-cron-project-dev (205s)

functions:
  rateHandler: aws-python-scheduled-cron-project-dev-rateHandler (2.9 kB)
  cronHandler: aws-python-scheduled-cron-project-dev-cronHandler (2.9 kB)

There is no additional step required. Your defined schedules becomes active right away after deployment.

Local invocation

In order to test out your functions locally, you can invoke them with the following command:

serverless invoke local --function rateHandler

After invocation, you should see output similar to:

INFO:handler:Your cron function aws-python-scheduled-cron-dev-rateHandler ran at 15:02:43.203145

Bundling dependencies

In case you would like to include 3rd party dependencies, you will need to use a plugin called serverless-python-requirements. You can set it up by running the following command:

serverless plugin install -n serverless-python-requirements

Running the above will automatically add serverless-python-requirements to plugins section in your serverless.yml file and add it as a devDependency to package.json file. The package.json file will be automatically created if it doesn't exist beforehand. Now you will be able to add your dependencies to requirements.txt file (Pipfile and pyproject.toml is also supported but requires additional configuration) and they will be automatically injected to Lambda package during build process. For more details about the plugin's configuration, please refer to official documentation.

Version

  • V 0.1.1 (release date): This version of SOPHIA collects daily reports published on Reliefweb, translates titles, summary and pdfs to English, splits the text into sentences and sends these sentences to a machine learning model for classification according to ACAPS’ definition of Humanitarian Access, Protection, Seasonal and Information Landscape humanitarian frameworks. The resulting classification is then sent to AirTable.

Contribution

If you wish to contribute to this project, please fork the repository and submit a pull request from your branch.

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages