🤝 Implement AWS Transfer Family Server #3501

Closed
6 tasks done
Gary-H9 opened this issue Feb 29, 2024 · 21 comments
Labels: data-platform-apps-and-tools, enhancement

Comments

@Gary-H9
Contributor

Gary-H9 commented Feb 29, 2024

User Story

As a… User of the Analytical Platform
I want to be able to ingest data into the platform
So that… I can use all of the tooling etc within the platform

As an AP Product Engineer
I want the platform to be able to ingest data in a controlled, precise and monitored fashion
So that we can provide a better service for our users and build a foundational offering within the platform.

This ticket builds on this previously raised Feature Request.

Value / Purpose

Data ingestion will be a foundational part of the AP offering going forward. This piece of work will create the foundations which this offering will be built on.

Useful Contacts

Jacob W / Julia / Gary

Proposal

Create the ingestion route as outlined here.

Additional Information

image

Electronic Monitoring already use AWS Transfer Family Server.

Definition of Done

  • 📝 Documentation has been written / updated
  • Modernisation Platform AWS Account created (WIP)
  • AWS Transfer Family Server Terraform'ed
  • IAM Roles defined + Terraform'ed
  • User testing arranged
  • Testing completed
@Gary-H9 Gary-H9 added enhancement enhancing an existing feature data-platform-apps-and-tools This issue is owned by Data Platform Apps and Tools labels Feb 29, 2024
@jacobwoffenden jacobwoffenden moved this from 👀 TODO to 🚀 In Progress in Analytical Platform Mar 4, 2024
@jacobwoffenden
Member

29/02/24 summary:

@julialawrence julialawrence moved this from 🚀 In Progress to 🚫 Blocked in Analytical Platform Mar 4, 2024
@julialawrence julialawrence moved this from 🚫 Blocked to 🚀 In Progress in Analytical Platform Mar 4, 2024
@jacobwoffenden
Member

jacobwoffenden commented Mar 5, 2024

05/03/24 summary:

@jacobwoffenden
Member

jacobwoffenden commented Mar 6, 2024

06/03/24:

  • Successfully tested flow for updating and then consuming ClamAV definitions
  • Successfully tested scanning files using S3 bucket notifications
    • Clean files are moved to a processed bucket, while infected files are moved to quarantine
      • Files are deleted from landing once moved
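
The flow tested above could be sketched roughly as follows. This is a minimal illustration, not the code from the branch: the bucket names and function names are hypothetical, the S3 client is passed in (for example `boto3.client("s3")`), and the scan result is assumed to be a `clamscan` exit code (0 = clean, 1 = infected, 2 = error).

```python
from urllib.parse import unquote_plus

# Hypothetical bucket names; the issue does not state the real ones.
PROCESSED_BUCKET = "example-processed"
QUARANTINE_BUCKET = "example-quarantine"

# clamscan exit codes: 0 = no virus found, 1 = virus found, 2 = error.
CLEAN, INFECTED = 0, 1


def destination_for(scan_exit_code):
    """Map a clamscan exit code to the bucket the object should move to."""
    if scan_exit_code == CLEAN:
        return PROCESSED_BUCKET
    if scan_exit_code == INFECTED:
        return QUARANTINE_BUCKET
    # Any other code means the scan itself failed, so leave the file in landing.
    raise RuntimeError(f"scan failed with exit code {scan_exit_code}")


def objects_in(event):
    """Yield (bucket, key) pairs from an S3 event notification payload."""
    for record in event.get("Records", []):
        s3 = record["s3"]
        # Object keys arrive URL-encoded in S3 notifications.
        yield s3["bucket"]["name"], unquote_plus(s3["object"]["key"])


def move_object(s3_client, source_bucket, key, scan_exit_code):
    """Copy the object to its destination, then delete it from landing."""
    target = destination_for(scan_exit_code)
    s3_client.copy_object(
        Bucket=target,
        Key=key,
        CopySource={"Bucket": source_bucket, "Key": key},
    )
    s3_client.delete_object(Bucket=source_bucket, Key=key)
    return target
```

Raising on a failed scan, rather than guessing a destination, keeps the file in landing so it can be rescanned.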

TODO:

  • Add object policy to deny access to quarantined objects
  • Configure Transfer Family server
    • Home directory
    • Move into VPC and give an EIP
    • Custom hostname?
    • Restrict users to specific IP ranges
  • Notification for quarantined objects

Notes:

  • Electronic Monitoring create a server per provider. We don't want to do this, as it could be hard to scale and manage via the AP dashboard

EDIT @jacobwoffenden:

I've managed to get this working by editing the IAM policy for the user to include more S3 permissions and adding KMS permissions

Screenshot 2024-03-06 at 18 37 42

EDIT 2 @jacobwoffenden:

  • Figure out how/where to run aws transfer update-server to set DirectoryListingOptimization to ENABLED; this is not exposed in Terraform. Update: MP ClickOps'd it
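
For reference, the same change can be made through the UpdateServer API rather than the console. The sketch below assumes a boto3 Transfer Family client is passed in (for example `boto3.client("transfer")`); the function name and the server ID in the usage note are placeholders, not values from this issue.

```python
def enable_directory_listing_optimization(transfer_client, server_id):
    """Call UpdateServer to turn on optimised directory listings for S3."""
    return transfer_client.update_server(
        ServerId=server_id,
        # S3StorageOptions is the UpdateServer parameter that carries this flag.
        S3StorageOptions={"DirectoryListingOptimization": "ENABLED"},
    )
```

Usage would look something like `enable_directory_listing_optimization(boto3.client("transfer"), "s-0123456789abcdef0")`, run with credentials that are allowed to update the server.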

@Gary-H9
Contributor Author

Gary-H9 commented Mar 7, 2024

07/03/24:

  • Converted the Lambda function from Python to Bash. Noted faster runs as a result. Tested both scenarios successfully.
  • Planned architecture relating to notifications in a variety of scenarios.

@jacobwoffenden
Member

jacobwoffenden commented Mar 11, 2024

11/03/24 summary:

  • More work on scan functionality
    • sends a message to SNS when quarantining a file
  • Started work on notify functionality
    • Triggers on SNS publish
    • Plumbed into Data Platform's GOV.UK Notify account, have tested basic functionality
  • Drafted architecture for "supplier data", i.e. information about suppliers that we don't want to store in public Terraform. This will eventually be superseded by the AP dashboard
  • Liaised with GDS about getting access to Analytical Platform's Notify account
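
As an illustration of the notify side, a Lambda function subscribed to the SNS topic receives each published message under `Records[].Sns`. The sketch below assumes the scan function publishes a JSON message with `bucket` and `key` fields; that message shape and the function names are assumptions, and the `send` parameter stands in for whatever call is made to GOV.UK Notify.

```python
import json


def quarantine_notices(event):
    """Extract one notice per SNS record delivered to a Lambda function."""
    notices = []
    for record in event.get("Records", []):
        # Assumed message body: JSON with "bucket" and "key" fields.
        message = json.loads(record["Sns"]["Message"])
        notices.append(
            {
                "subject": record["Sns"].get("Subject") or "File quarantined",
                "bucket": message["bucket"],
                "key": message["key"],
            }
        )
    return notices


def handler(event, context, send=print):
    """Lambda entry point; `send` is a placeholder for the Notify call."""
    notices = quarantine_notices(event)
    for notice in notices:
        send(notice)
    return {"sent": len(notices)}
```

Keeping the parsing separate from the sending makes the message handling easy to test without a Notify API key.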

TODO:

  • SNS redrive logic
  • clarify naming/terminology (data contact, data owner, etc.)

@Gary-H9
Contributor Author

Gary-H9 commented Mar 14, 2024

14th March summary:

  • Refactored existing code + added code for new lambda and required components
  • Removed calls to SNS for this iteration of the product - this can be looked at after MVP - ticket created for this
  • Created ingestion-transfer handler.py
  • Created a test bucket in analytical-platform-dev called dev-ingestion-testing to test cross-account file moves from the above Lambda. Manually updated the policy on this, but it needs further work.
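
A cross-account move of this kind generally needs a bucket policy on the destination bucket that trusts the Lambda's execution role. The sketch below builds such a policy document; the role ARN, the statement ID and the choice of `s3:PutObject` as the only action are assumptions for illustration, and the real policy may well need more (for example KMS-related permissions handled elsewhere).

```python
import json


def cross_account_bucket_policy(bucket, role_arn):
    """Build a bucket policy letting one external role write objects."""
    statement = {
        "Sid": "AllowIngestionTransferPut",  # hypothetical statement ID
        "Effect": "Allow",
        "Principal": {"AWS": role_arn},
        "Action": ["s3:PutObject"],
        "Resource": f"arn:aws:s3:::{bucket}/*",
    }
    return {"Version": "2012-10-17", "Statement": [statement]}


if __name__ == "__main__":
    # Placeholder role ARN; the real Lambda role is not given in this issue.
    policy = cross_account_bucket_policy(
        "dev-ingestion-testing",
        "arn:aws:iam::111122223333:role/example-ingestion-transfer",
    )
    print(json.dumps(policy, indent=2))
```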

@jacobwoffenden
Member

jacobwoffenden commented Mar 18, 2024

@Gary-H9
Contributor Author

Gary-H9 commented Mar 20, 2024

  • Closed development branch/PR in data-platform repository
  • Migrated code from the above branch into the Modernisation Platform Environments repository. Merged; development and production have been built.
  • Updated DNS record for both environments in analytical-platform-production R53.

@Gary-H9
Contributor Author

Gary-H9 commented Mar 26, 2024

Created documentation relating to the solution in user-guidance (🚧) and in our new runbooks documentation.

@jacobwoffenden
Member

jacobwoffenden commented Mar 27, 2024

@ministryofjustice/modernisation-platform have enabled optimised directories manually in both transfer servers (dev and prod)

@jacobwoffenden
Member

@Ed-Bajo Ed-Bajo closed this as completed Mar 28, 2024
@github-project-automation github-project-automation bot moved this from 🚀 In Progress to 🎉 Done in Analytical Platform Mar 28, 2024
@Gary-H9 Gary-H9 reopened this Mar 28, 2024
@Gary-H9
Contributor Author

Gary-H9 commented Mar 28, 2024

Awaiting user information to allow testing. In the meantime egress has been completed.

@Gary-H9 Gary-H9 moved this from 🎉 Done to 🚀 In Progress in Analytical Platform Apr 2, 2024
@Gary-H9
Contributor Author

Gary-H9 commented Apr 2, 2024

@jacobwoffenden jacobwoffenden moved this from 🚀 In Progress to 🚫 Blocked in Analytical Platform Apr 3, 2024
@jacobwoffenden
Member

Pending details from BOLD to begin onboarding

@jacobwoffenden
Member

Data Engineering's https://github.com/moj-analytical-services/iam_builder needs updating to support kms, which is required to add KMS permissions to their Airflow role

@jacobwoffenden
Member

jacobwoffenden commented Apr 15, 2024

Analytical Platform team to update Airflow IAM role with permissions to access KMS key

  • arn:aws:iam::593291632749:role/airflow_dev_bold_rr_essex_police

@jacobwoffenden jacobwoffenden moved this from 🚫 Blocked to 🚀 In Progress in Analytical Platform Apr 17, 2024
@jacobwoffenden
Member

@jacobwoffenden jacobwoffenden moved this from 🚀 In Progress to 🚫 Blocked in Analytical Platform Apr 18, 2024
@jacobwoffenden
Member

jacobwoffenden commented Apr 18, 2024

Moving to blocked:

@michaeljcollinsuk michaeljcollinsuk moved this from 🚫 Blocked to 🚀 In Progress in Analytical Platform Apr 19, 2024
@michaeljcollinsuk
Contributor

Moving back to in progress as @julialawrence is working on a new request from BOLD

@jacobwoffenden
Member

Blocked by #3765

@jacobwoffenden jacobwoffenden moved this from 🚀 In Progress to 🚫 Blocked in Analytical Platform May 1, 2024
@jacobwoffenden
Member

Closing as we've tested end-to-end. We're waiting on BOLD to perform their own end-to-end testing, which is out of scope for this issue.

@github-project-automation github-project-automation bot moved this from 🚫 Blocked to 🎉 Done in Analytical Platform May 1, 2024