Update the calibration workflow, clean up readme
dbarrous-navteca committed Jan 8, 2024
1 parent b096611 commit 7ea3b05
Showing 4 changed files with 32 additions and 222 deletions.
16 changes: 14 additions & 2 deletions .github/workflows/calibration.yml
@@ -28,8 +28,20 @@ jobs:

- name: Test Lambda Function with curl
run: |
curl -XPOST "http://localhost:9000/2015-03-31/functions/function/invocations" -d @lambda_function/tests/test_data/test_eea_event.json
# Run curl and write the HTTP status code to a variable
HTTP_STATUS=$(curl -X POST "http://localhost:9000/2015-03-31/functions/function/invocations" \
-d @lambda_function/tests/test_data/test_eea_event.json \
-o response.json -w '%{http_code}')
# Check if the HTTP status is 200 (OK)
if [ "$HTTP_STATUS" -eq 200 ]; then
echo "Success: HTTP status is 200"
exit 0 # Exit with success
else
echo "Error or unexpected HTTP status: $HTTP_STATUS"
exit 1 # Exit with failure
fi

- name: Copy Processed Files from Container
run: |
container_id=$(docker ps -qf "ancestor=processing_function:latest")
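For reference, the updated check can be exercised locally before pushing. The sketch below is an assumption-laden reproduction: it reuses the `processing_function:latest` image name from the copy step above and assumes the container runs the Lambda Runtime Interface Emulator on its standard internal port 8080, mapped to port 9000 as in the curl URL.

```
# Minimal local reproduction of the workflow's status check (a sketch,
# assuming the image is tagged processing_function:latest and exposes the
# Lambda Runtime Interface Emulator on port 8080 inside the container).
docker run -d -p 9000:8080 processing_function:latest

# Same pattern as the workflow: write the body to response.json and
# capture only the HTTP status code in a variable.
HTTP_STATUS=$(curl -X POST "http://localhost:9000/2015-03-31/functions/function/invocations" \
  -d @lambda_function/tests/test_data/test_eea_event.json \
  -o response.json -w '%{http_code}')

if [ "$HTTP_STATUS" -eq 200 ]; then
  echo "Success: HTTP status is 200"
else
  echo "Error or unexpected HTTP status: $HTTP_STATUS"
fi
```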
60 changes: 5 additions & 55 deletions README.md
@@ -6,8 +6,9 @@
### **Base Image Used For Container:** https://github.com/HERMES-SOC/docker-lambda-base

### **Description**:
This repository defines the image used for the SWSOC file processing Lambda function container. This container will be built and stored in an ECR Repo.
The container will contain the latest release code as the production environment and the latest code on master as the development environment. Files with the appropriate naming convention will be handled in production, while files prefixed with `dev_` will be handled in the development environment.
This repository defines the image used for the SWSOC file processing Lambda function container. This container will be built and stored in the appropriate development/production ECR Repo.

The container will contain the latest release code as the production environment and the latest code on master as the development environment.

### **Testing Locally (Using own Test Data)**:
1. Build the Lambda container image you'd like to test (from within the lambda_function folder):
@@ -35,56 +36,5 @@ The container will contain the latest release code as the production environment

`curl -XPOST "http://localhost:9000/2015-03-31/functions/function/invocations" -d @lambda_function/tests/test_data/test_eea_event.json`

# Information on working with a CDK Project

The `cdk.json` file tells the CDK Toolkit how to execute your app.

This project is set up like a standard Python project. The initialization
process also creates a virtualenv within this project, stored under the `.venv`
directory. To create the virtualenv it assumes that there is a `python3`
(or `python` for Windows) executable in your path with access to the `venv`
package. If for any reason the automatic creation of the virtualenv fails,
you can create the virtualenv manually.

To manually create a virtualenv on MacOS and Linux:

```
$ python3 -m venv .venv
```

After the init process completes and the virtualenv is created, you can use the following
step to activate your virtualenv.

```
$ source .venv/bin/activate
```

If you are on a Windows platform, you would activate the virtualenv like this:

```
% .venv\Scripts\activate.bat
```

Once the virtualenv is activated, you can install the required dependencies.

```
$ pip install -r requirements.txt
```

At this point you can now synthesize the CloudFormation template for this code.

```
$ cdk synth
```

To add additional dependencies, for example other CDK libraries, just add
them to your `setup.py` file and rerun the `pip install -r requirements.txt`
command.

## Useful commands for CDK

* `cdk ls` list all stacks in the app
* `cdk synth` emits the synthesized CloudFormation template
* `cdk deploy` deploy this stack to your default AWS account/region
* `cdk diff` compare deployed stack with current state
* `cdk docs` open CDK documentation
### **How this Lambda Function is deployed**
This Lambda function is part of the main SWxSOC Pipeline ([Architecture Repo Link](https://github.com/HERMES-SOC/sdc_aws_pipeline_architecture)). It is deployed via AWS CodeBuild within that repository. The image is first built and tagged in the appropriate production or development repository (depending on whether the trigger is a release or a commit). View the CodeBuild CI/CD file [here](buildspec.yml).
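As a rough illustration of the build-and-tag step described above, a typical ECR push looks like the sketch below. The account ID, region, and repository name are placeholders, not values taken from buildspec.yml.

```
# Hypothetical ECR push (placeholders only; the real values are defined
# in buildspec.yml and the pipeline architecture repository).
aws ecr get-login-password --region us-east-1 \
  | docker login --username AWS --password-stdin 123456789012.dkr.ecr.us-east-1.amazonaws.com

docker build -t processing_function:latest lambda_function/
docker tag processing_function:latest 123456789012.dkr.ecr.us-east-1.amazonaws.com/processing_function:latest
docker push 123456789012.dkr.ecr.us-east-1.amazonaws.com/processing_function:latest
```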
1 change: 0 additions & 1 deletion lambda_function/requirements.txt
@@ -3,5 +3,4 @@ hermes_spani @ git+https://github.com/HERMES-SOC/hermes_spani.git
hermes_eea @ git+https://github.com/HERMES-SOC/hermes_eea.git
hermes_nemisis @ git+https://github.com/HERMES-SOC/hermes_nemisis.git
hermes_merit @ git+https://github.com/HERMES-SOC/hermes_merit.git
cdftracker @ git+https://github.com/HERMES-SOC/CDFTracker.git
psycopg2-binary==2.9.7
177 changes: 13 additions & 164 deletions lambda_function/src/file_processor/file_processor.py
@@ -17,11 +17,9 @@
get_instrument_bucket,
)
from sdc_aws_utils.aws import (
create_s3_client_session,
object_exists,
download_file_from_s3,
upload_file_to_s3,
create_s3_file_key,
parse_file_key,
get_science_file,
push_science_file,
)

# Configure logger
@@ -118,55 +116,32 @@ def _process_file(self) -> None:
)

# Parse file key to needed information
(
parsed_file_key,
this_instr,
destination_bucket,
) = self._parse_file(self.file_key, self.environment)
parsed_file_key = parse_file_key(self.file_key)

# Parse the science file name
science_file = science_filename_parser(parsed_file_key)
this_instr = science_file["instrument"]
destination_bucket = get_instrument_bucket(this_instr, self.environment)

# Download file from S3 or get local file path
file_path = self._get_file(
file_path = get_science_file(
self.instrument_bucket_name,
self.file_key,
parsed_file_key,
self.dry_run,
)

# Calibrate/Process file with Instrument Package
calibrated_filename = self._calibrate_file(this_instr, file_path, self.dry_run)

# Push file to S3 Bucket
self._put_file(
push_science_file(
science_filename_parser,
destination_bucket,
calibrated_filename,
self.dry_run,
)

@staticmethod
def _parse_file(file_key, environment):
"""
Parses the file key to extract the instrument name,
and determines the destination bucket based on the instrument and environment.
:param file_key: The key of the file in the S3 bucket.
:type file_key: str
:param environment: The current running environment (e.g., DEVELOPMENT).
:type environment: str
:return: A tuple containing key, instrument and bucket.
:rtype: tuple
"""
# Parse file key to get instrument name
file_key_array = file_key.split("/")
parsed_file_key = file_key_array[-1]

# Parse the science file name
science_file = science_filename_parser(parsed_file_key)
this_instr = science_file["instrument"]
destination_bucket = get_instrument_bucket(this_instr, environment)

return parsed_file_key, this_instr, destination_bucket

@staticmethod
def _calibrate_file(instrument, file_path, dry_run=False):
"""
@@ -234,130 +209,4 @@ def _calibrate_file(instrument, file_path, dry_run=False):
return calibrated_filename

except ValueError as e:
log.error(e)

@staticmethod
def _get_file(instrument_bucket_name, file_key, parsed_file_key, dry_run=False):
"""
Downloads the file from the specified S3 bucket, if not in a dry run.
If a file path is specified in the environment variables, it uses that instead.
:param instrument_bucket_name: The instrument bucket name.
:type instrument_bucket_name: str
:param file_key: The key of the file in the S3 bucket.
:type file_key: str
:param parsed_file_key: The parsed name of the file.
:type parsed_file_key: str
:param dry_run: Indicates whether the operation is a dry run.
:type dry_run: bool
:return: The path to the downloaded file or None if in a dry run.
:rtype: Path or None
"""
# Download file from instrument bucket if not a dry run
# or use the specified file path
if not dry_run:
# Check if using test data in instrument package
if os.getenv("USE_INSTRUMENT_TEST_DATA") == "True":
log.info("Using test data from instrument package")
return None

# Check if file path is specified in environment variables
if os.getenv("SDC_AWS_FILE_PATH"):
log.info(
"Using file path specified in environment variables"
f"{os.getenv('SDC_AWS_FILE_PATH')}"
)
file_path = Path(os.getenv("SDC_AWS_FILE_PATH"))
return file_path

# Initialize S3 Client
s3_client = create_s3_client_session()

# Verify object exists in instrument bucket
if not (
object_exists(
s3_client=s3_client,
bucket=instrument_bucket_name,
file_key=file_key,
)
or dry_run
):
raise FileNotFoundError(
f"File {file_key} does not exist in bucket {instrument_bucket_name}"
)

# Download file from S3 bucket if no file path is specified
file_path = download_file_from_s3(
s3_client,
instrument_bucket_name,
file_key,
parsed_file_key,
)

return file_path
else:
log.info("Dry Run - File will not be downloaded")
return None

@staticmethod
def _put_file(
science_filename_parser, destination_bucket, calibrated_filename, dry_run=False
):
"""
Uploads a file to the specified destination bucket in S3, if not in a dry run.
Generates the file key for the new file using the given parser.
:param science_filename_parser: The parser function to generate a file key.
:type science_filename_parser: function
:param destination_bucket: The name of the destination S3 bucket.
:type destination_bucket: str
:param calibrated_filename: The pathname of the new file to be uploaded.
:type calibrated_filename: str
:param dry_run: Indicates whether the operation is a dry run.
:type dry_run: bool
:return: The key of the newly uploaded file.
:rtype: str
"""
# Generate file key for new file
new_file_key = create_s3_file_key(science_filename_parser, calibrated_filename)

# Upload file to destination bucket if not a dry run
if dry_run:
log.info("Dry Run - File will not be uploaded")
return new_file_key

if os.getenv("USE_INSTRUMENT_TEST_DATA") == "True":
log.info("Using test data from instrument package")
return new_file_key

if not os.getenv("SDC_AWS_FILE_PATH"):
# Initialize S3 Client
s3_client = create_s3_client_session()

# Verify object does not exist in instrument bucket
if object_exists(
s3_client=s3_client,
bucket=destination_bucket,
file_key=new_file_key,
):
log.warning(
f"File {new_file_key} already exists in bucket {destination_bucket}"
)
return new_file_key

# Upload file to destination bucket
upload_file_to_s3(
s3_client=s3_client,
destination_bucket=destination_bucket,
filename=calibrated_filename,
file_key=new_file_key,
)

else:
log.info(
"File Processed Locally - File will not be uploaded,"
"available in mounted volume as:"
f"{Path(calibrated_filename).as_posix()}"
)

return new_file_key
log.error(e)
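Taken together, the refactor replaces the three private static methods with calls into `sdc_aws_utils`. A minimal sketch of the resulting flow is below; the `sdc_aws_utils.aws` imports match this diff, while the origins of `science_filename_parser` and `get_instrument_bucket` sit outside the visible hunks and are marked as assumptions.

```
# A sketch of the refactored pipeline, not the module itself.
from sdc_aws_utils.aws import (
    parse_file_key,
    get_science_file,
    push_science_file,
)


def process(file_key, instrument_bucket_name, environment, dry_run=False):
    # Reduce the full S3 key to the bare file name
    parsed_file_key = parse_file_key(file_key)

    # Parse the science file name to find the instrument and its bucket
    science_file = science_filename_parser(parsed_file_key)  # import path assumed
    destination_bucket = get_instrument_bucket(  # import path assumed
        science_file["instrument"], environment
    )

    # Download the file (or resolve a local/test path), then calibrate
    file_path = get_science_file(instrument_bucket_name, file_key, parsed_file_key, dry_run)
    calibrated_filename = calibrate(science_file["instrument"], file_path, dry_run)  # stands in for _calibrate_file

    # Upload the calibrated product to the destination bucket
    return push_science_file(science_filename_parser, destination_bucket, calibrated_filename, dry_run)
```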
