
🚚 Spike: Investigate adding AWS DataSync capability to Platform #1309

Closed
3 tasks
Tracked by #1827
bagg3rs opened this issue Aug 29, 2023 · 10 comments

@bagg3rs
Contributor

bagg3rs commented Aug 29, 2023

User Story

As an Analytical Platform user
I want to sync unstructured data from a network share
So that we can perform NLP (Natural Language Processing) on that data to gain insight.

Slack thread

Value

We have had a few requests to have data synced to the S3 data warehouse for processing.
This is not a current capability of the Analytical Platform. Because the data sits on a legacy SMB file server (managed by a third party, which cannot be changed), we would need a swing location to be created for this service to overcome network routing issues.

If we can provide a platform feature that lets other teams set up and maintain their own sync transfer tasks, teams can stop using suboptimal methods, e.g. Remote Desktop or laptops, which have issues around sleeping/terminating.

Questions / Assumptions / Hypothesis

Hypothesis

If we add the AWS DataSync service
Then teams can use it to import data easily, rather than relying on laptops and virtual desktop sessions.

Proposal

Deploy AWS DataSync in Modernisation Platform
Allow teams to manage their sync requirements
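
As a rough illustration of what a team-managed sync task could involve, here is a minimal sketch using the boto3 DataSync client. It assumes a DataSync agent is already activated and can reach the SMB server; every hostname, ARN, and name passed in is a hypothetical placeholder, not a value from this issue. The builder is kept separate from the AWS calls so the request shapes can be checked without credentials.

```python
from typing import Any


def build_datasync_requests(
    smb_hostname: str,
    smb_subdirectory: str,
    smb_user: str,
    smb_password: str,
    agent_arn: str,
    bucket_arn: str,
    bucket_role_arn: str,
    task_name: str,
) -> dict[str, dict[str, Any]]:
    """Build keyword arguments for the three boto3 DataSync calls used below."""
    return {
        # Source: the SMB share, reached through an already-activated agent.
        "create_location_smb": {
            "ServerHostname": smb_hostname,
            "Subdirectory": smb_subdirectory,
            "User": smb_user,
            "Password": smb_password,  # assumption: fetched from a secret store
            "AgentArns": [agent_arn],
        },
        # Destination: an S3 bucket, written to via an IAM role DataSync assumes.
        "create_location_s3": {
            "S3BucketArn": bucket_arn,
            "S3Config": {"BucketAccessRoleArn": bucket_role_arn},
        },
        # Task options: verify only the files transferred in each run.
        "create_task": {
            "Name": task_name,
            "Options": {"VerifyMode": "ONLY_FILES_TRANSFERRED"},
        },
    }


def create_sync_task(requests: dict[str, dict[str, Any]], region_name: str) -> str:
    """Create both locations and the task; returns the new task ARN."""
    import boto3  # imported lazily so the builder works without boto3 installed

    client = boto3.client("datasync", region_name=region_name)
    source = client.create_location_smb(**requests["create_location_smb"])
    destination = client.create_location_s3(**requests["create_location_s3"])
    task = client.create_task(
        SourceLocationArn=source["LocationArn"],
        DestinationLocationArn=destination["LocationArn"],
        **requests["create_task"],
    )
    return task["TaskArn"]
```

Calling `create_sync_task(build_datasync_requests(...), region_name=...)` would only succeed once the networking and the service account for the SMB share are in place, which is the part this spike needs to investigate.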

Definition of done

  • AWS DataSync deployed and tested
  • Metadata applied to transferred data
  • Findings documented

Reference

How to write good user stories

@bagg3rs
Contributor Author

bagg3rs commented Aug 30, 2023

Slack discussion

@jacobwoffenden jacobwoffenden added the data-platform-apps-and-tools label and removed the Data Platform Core Infrastructure label Sep 22, 2023
@jhpyke
Contributor

jhpyke commented Oct 24, 2023

Refinement (24/10/23): Look for a steer from Project Management on priority/delivery

@YvanMOJdigital

Iteration 4 objective: For this sprint we would like to know the estimated effort/time cost to deliver this functionality, as well as the potential compute costs the users would incur to meet their needs. We want both us and the users to be able to understand the cost/benefit of implementing the solution and doing the processing.
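
One way to frame the transfer side of that estimate is sketched below. It assumes the cost scales linearly with the volume of data copied; the per-GB price is a parameter to be filled in from current AWS pricing, and the function name and inputs are hypothetical. Agent hosting, S3 storage, and downstream processing costs are not covered.

```python
def estimate_transfer_cost(
    initial_gb: float,
    monthly_change_gb: float,
    months: int,
    price_per_gb: float,
) -> float:
    """Estimate the transfer cost of an initial sync plus monthly incremental syncs.

    Assumption: cost is charged per GB copied, at a flat rate.
    """
    total_gb = initial_gb + monthly_change_gb * months
    return total_gb * price_per_gb
```

The volumes would need to come from the requesting team, since the size of the share is not stated in this issue.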

@jhpyke
Contributor

jhpyke commented Oct 24, 2023

Refinement (24/10/23): Spike to be carried out on implementation (and what we'd need from ATOS to implement it)

@julialawrence julialawrence changed the title from "🚚 Investigate adding AWS DataSync capability to Platform" to "🚚 Spike: Investigate adding AWS DataSync capability to Platform" Oct 26, 2023
@julialawrence julialawrence moved this to 🧐 To Do in Analytical Platform Oct 31, 2023
@jhpyke jhpyke moved this from 🧐 To Do to 💨 In Progress in Analytical Platform Nov 2, 2023
@jhpyke
Contributor

jhpyke commented Nov 2, 2023

Looked at as of 02/11/23:

Identified the home of the target SMB server, a non-resolvable intranet address (http://dom1.infra.int/data/HQ/PGO/Shared/Group/Investigations/). Identified the intended PoC as per the following diagram:

Image

The final product may want to sync directly to a bucket in the Data Account, but getting the SMB connection/networking working seems to be the primary challenge with this ticket.

@julialawrence julialawrence moved this from 💨 In Progress to ✋ Blocked in Analytical Platform Nov 9, 2023
@jhpyke jhpyke moved this from ✋ Blocked to 💨 In Progress in Analytical Platform Nov 13, 2023
@julialawrence julialawrence moved this from 💨 In Progress to ✋ Blocked in Analytical Platform Nov 21, 2023
@julialawrence
Contributor

We have heard from Atos that creation of the service account is chargeable, so it is now back with the requestor to reassess.

@bagg3rs
Contributor Author

bagg3rs commented Dec 7, 2023

Funding has been granted!

Contributor

github-actions bot commented Feb 6, 2024

This issue is being marked as stale because it has been open for 60 days with no activity. Remove stale label or comment to keep the issue open.

@github-actions github-actions bot added the stale label Feb 6, 2024
Contributor

This issue is being closed because it has been open for a further 7 days with no activity. If this is still a valid issue, please reopen it, Thank you!

@github-actions github-actions bot closed this as not planned Feb 13, 2024
@jacobwoffenden jacobwoffenden moved this from 🚫 Blocked to 🎉 Done in Analytical Platform Feb 15, 2024
@bagg3rs
Contributor Author

bagg3rs commented Aug 29, 2024

relates to #5175
