📐 Add ADR proposals #3107

Merged
merged 35 commits into from
Feb 15, 2024
6421207
:memo: Change 000 to use page page.title
bagg3rs Jan 17, 2024
0acbd58
Add ADR-007 and ADR-008 for consideration
bagg3rs Jan 17, 2024
ad1f39e
Update QuickSight documentation and add AWS Bedrock for language mode…
bagg3rs Jan 19, 2024
f6ea2c7
Update documentation and use separate AWS accounts for data storage
bagg3rs Jan 23, 2024
bbde7f5
Update formatting
bagg3rs Jan 23, 2024
cc31be8
Add SCP and OU info
bagg3rs Jan 27, 2024
0c7330d
Refactor AWS account structure for improved governance and security
bagg3rs Jan 27, 2024
78dc87d
Update vendor or partner access in ADR-011
bagg3rs Jan 27, 2024
8780929
Merge branch 'main' into add-adrs
bagg3rs Jan 27, 2024
62108e5
📝 ⬆️ ⬇️ move and 🔥 words
bagg3rs Jan 27, 2024
3d37b42
🔥 remove temp file
bagg3rs Jan 27, 2024
8f0027b
📝 clear up context
bagg3rs Jan 27, 2024
7b05da8
📝 tidy up
bagg3rs Jan 27, 2024
74e918d
🔄 accept bedrock 🤔 and formatting and clarity for quicksight
bagg3rs Jan 28, 2024
d03f740
🔄 clean up the mess a little more
bagg3rs Jan 28, 2024
eaffe9e
spelling
bagg3rs Jan 28, 2024
2516b30
Update ADR-008 AWS Bedrock documentation
bagg3rs Jan 28, 2024
f7c052b
fix linter issues
bagg3rs Jan 30, 2024
fcd9477
Bedrock status ✅ -> 🤔
bagg3rs Jan 31, 2024
9744ea6
Update ADR-009: Use AWS SageMaker for analytical tooling
bagg3rs Feb 2, 2024
218af22
Update ADR-009 to include benefits of using AWS SageMaker for analyti…
bagg3rs Feb 2, 2024
83edb30
Update tooling from EKS to AWS SageMaker for improved efficiency and …
bagg3rs Feb 2, 2024
7488003
Update ADR-009 remove duplicate consequences
bagg3rs Feb 3, 2024
06d5a5c
clarify
bagg3rs Feb 3, 2024
07ce3d8
⬆️ update review date
bagg3rs Feb 5, 2024
b5c5bc9
✅ accepted identity and updated consequences
bagg3rs Feb 5, 2024
6e024ee
spelling
bagg3rs Feb 5, 2024
07f9607
update review dates and rename Azure to EntraID
bagg3rs Feb 6, 2024
ba51c22
🚨 fix space
bagg3rs Feb 6, 2024
a922406
📝 Review and Update ADR-007 @julialawrence
bagg3rs Feb 12, 2024
ad67304
📝 Update ADR-009 to use AWS SageMaker for analytical tooling
bagg3rs Feb 13, 2024
09442f5
Update ADR-008: Add information about Amazon Bedrock
bagg3rs Feb 15, 2024
52f2b74
Add data sovereignty issue note to ADR-008 AWS Bedrock
bagg3rs Feb 15, 2024
68bdeef
Update ADR-009: Use AWS SageMaker for analytical tooling
bagg3rs Feb 15, 2024
51e6dd7
Merge branch 'main' into add-adrs
julialawrence Feb 15, 2024
@@ -5,7 +5,7 @@ last_reviewed_on: 2023-08-17
review_in: 6 months
---

# ADR-000 Record Architecture Decisions
# <%= current_page.data.title %>

## Status

@@ -1,7 +1,7 @@
---
owner_slack: "#data-platform-notifications"
title: ADR-001 Use Cloud Platform for hosting infrastructure
last_reviewed_on: 2023-10-24
last_reviewed_on: 2023-02-05
review_in: 3 months
---

@@ -19,7 +19,7 @@ you must use [Cloud Platform](https://user-guide.cloud-platform.service.justice.

## Decision

We will use the Cloud Platform for hosting our digital services, which will include containerised code, apps and managed storage options (RDS, S3) where appropriate. It is a managed public cloud platform endorsed by
We will use the Cloud Platform for hosting our digital services, which will include containerised code, apps and managed storage options (RDS, S3) where appropriate. It is a managed public cloud platform endorsed by
Justice Digital.

## Consequences
This file was deleted.

@@ -0,0 +1,35 @@
---
owner_slack: "#data-platform-notifications"
title: ADR-003 Use EntraID (formerly AzureAD) for Identity and Access Management
last_reviewed_on: 2024-02-06
review_in: 3 months
---

# <%= current_page.data.title %>

## Status

✅ Accepted

## Context

The Data Platform will need a way to verify users and provide access to resources.
We want to simplify access for users by reducing the number of identities they need for services.

We do not want to run an identity service.

## Decision

We will use [EntraID](https://www.microsoft.com/en-us/security/business/identity-access/microsoft-entra-id) (formerly AzureAD) for [Identity and Access Management](https://en.wikipedia.org/wiki/Identity_management) (IDAM).
Our users are already using a `@justice.gov.uk` account as their primary login.
Our users can take advantage of their existing identity to gain access to the Data Platform and access services.

## Consequences

- We will not have to run an identity service or manage the logging and security of that system
- Because we won't be managing our own identity service, we will need to work with the end user compute team to improve identity operations (versioning and automating changes)
- Reduce our support requirements for joiners, movers and leavers (JML), e.g. issues with multi-factor authentication and password resets
- Guest accounts are possible but not managed, which means we will need an alternative solution
- There is no systematic way to create and manage AzureAD groups to provide authN, so we will need to work with the end user compute team
- Cross-government AzureAD federation is not yet formalised, but in the future we could give other departments access to resources with their existing credentials
- We can look to unlock [SCIM](https://docs.github.com/en/enterprise-cloud@latest/admin/identity-and-access-management/provisioning-user-accounts-for-enterprise-managed-users/configuring-scim-provisioning-for-enterprise-managed-users) to create, manage, and deactivate GitHub accounts based on [EntraID](https://www.microsoft.com/en-us/security/business/identity-access/microsoft-entra-id) group membership
@@ -0,0 +1,47 @@
---
owner_slack: "#data-platform-notifications"
title: ADR-007 Use AWS QuickSight for data visualisation
last_reviewed_on: 2024-01-17
review_in: 6 months
---

# <%= current_page.data.title %>

## Status

🤔 Proposed

## Context

We do not offer a managed data visualisation and reporting tool. Users need to build and run these applications themselves using [R](https://en.wikipedia.org/wiki/R_(programming_language)) or [Python](https://en.wikipedia.org/wiki/Python_(programming_language)).

[PowerBI](https://en.wikipedia.org/wiki/Microsoft_Power_BI) comes as part of our Microsoft 365 subscription, but connecting to data on our platform requires additional [infrastructure](https://docs.aws.amazon.com/whitepapers/latest/using-power-bi-with-aws-cloud/connecting-the-microsoft-power-bi-service-to-aws-data-sources.html).

If we offer AWS QuickSight, we can reduce our current support burden from new RShiny deployments, and give new and existing users simpler visualisation and reporting capabilities.

## Decision

- _proposed - We will offer AWS QuickSight to our users. QuickSight is fully managed and can be integrated into our identity management system._

## Consequences

### General consequences

- Users can build and share dashboards from data stored on our platform
- Operates on a pay-as-you-go [pricing](https://aws.amazon.com/quicksight/pricing/) model, which means we are billed based on actual usage
- QuickSight is designed to be user-friendly (no coding required), but users might face issues when dealing with more advanced or complex use cases
- We will need to start a QuickSight community for users to help and share their experiences and knowledge
- There is already a public [QuickSight community](https://community.amazonquicksight.com/)

### Advantages

- Serverless BI service, meaning we do not need to patch or maintain it, and [security and compliance](https://docs.aws.amazon.com/quicksight/latest/user/QS-compliance.html) are maintained by AWS
- User-friendly interface and extensive online training materials: we won't need to produce extensive documentation, because AWS provides many resources for building and sharing dashboards
- Reduced operational cost and complexity for users to create reports and visualisations
- AWS provided immersion days and free training for our users
- Cost transparency: the total cost of ownership and management of RShiny and other hosted solutions is hard to calculate

### Disadvantages

- [Application configuration as code](https://github.com/aws-samples/amazon-quicksight-assets-as-code-sample) does not support all resources
- The internal permission management is complex and difficult to understand
42 changes: 42 additions & 0 deletions docs/source/documentation/adrs/adr-008-aws-bedrock.html.md.erb
@@ -0,0 +1,42 @@
---
owner_slack: "#data-platform-notifications"
title: ADR-008 AWS Bedrock
last_reviewed_on: 2024-01-17
review_in: 6 months
---

# <%= current_page.data.title %>

## Status

🤔 Proposed

## Context

Our users want to explore and leverage [large language models](https://en.wikipedia.org/wiki/Large_language_model) (LLM) to solve business problems.

Our platform lacks the resources required to run these models.

## Decision

- _proposed - We will offer [Amazon Bedrock](https://aws.amazon.com/bedrock/) to our users.
Amazon Bedrock is a fully managed large language model platform, which offers many foundation models that can be customised privately using techniques such as fine tuning and [retrieval-augmented generation (RAG)](https://docs.aws.amazon.com/sagemaker/latest/dg/jumpstart-foundation-models-customize-rag.html)._

## Consequences

### General consequences

- Bedrock provides pre-trained models for generation and embeddings
- Bedrock pricing is based on usage, so it can vary significantly month-to-month depending on application traffic, and costs could spike unexpectedly. Usage is metered and billed per inference request, based on factors such as the model used, input length, and response length
- Bedrock models are accessed via an API using AWS permissions
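
To make the usage-based pricing point concrete, here is a minimal sketch of how monthly cost scales with traffic. It assumes billing per 1,000 input and output tokens, which is an assumption of this sketch rather than something stated in this ADR. The model names and prices are hypothetical placeholders, not real Bedrock prices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPrice:
    """Hypothetical on-demand price per 1,000 tokens, in USD."""

    input_per_1k: float
    output_per_1k: float


# Placeholder figures for illustration only; these are NOT real Bedrock prices.
PRICES = {
    "example-small-model": ModelPrice(input_per_1k=0.001, output_per_1k=0.002),
    "example-large-model": ModelPrice(input_per_1k=0.010, output_per_1k=0.030),
}


def estimate_request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one inference request from its input and response length."""
    price = PRICES[model]
    input_cost = (input_tokens / 1000) * price.input_per_1k
    output_cost = (output_tokens / 1000) * price.output_per_1k
    return input_cost + output_cost


def estimate_monthly_cost(
    model: str,
    requests_per_month: int,
    avg_input_tokens: int,
    avg_output_tokens: int,
) -> float:
    """Scale the per-request estimate by expected monthly traffic."""
    per_request = estimate_request_cost(model, avg_input_tokens, avg_output_tokens)
    return requests_per_month * per_request
```

Under these assumptions cost is linear in request volume, so a tenfold traffic spike produces a tenfold bill. This is why budget alerts would be worth considering alongside any pilot.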

### Advantages

- Serverless access to large language models meaning that our platform and users don't need to manage and maintain infrastructure
- Because this is a fully managed service, the compute is managed by AWS and overcomes resourcing limits that currently constrain the platform

### Disadvantages

- Limited model selection: Bedrock offers a few pre-trained models, and new models can take time to reach all AWS regions
- The Frankfurt region doesn't currently offer functionality such as fine-tuning and model training
- Service is currently only available in Frankfurt and Virginia which raises data sovereignty issues
@@ -0,0 +1,38 @@
---
owner_slack: "#data-platform-notifications"
title: ADR-009 Use AWS SageMaker for analytical tooling
last_reviewed_on: 2024-01-17
review_in: 6 months
---

# <%= current_page.data.title %>

## Status

🤔 Proposed

## Context

Our users want analytical features not available on our existing platform. The types of tools and underlying compute are changing rapidly. SageMaker provides a managed service for these tools and provides instances with higher resources and GPUs to speed up and aid research.

There is also a lot of interest in using LLMs, e.g. for semantic search of free text. SageMaker in VPC isolation mode makes sure sensitive workloads are secured and stay within the [instance](https://docs.aws.amazon.com/sagemaker/latest/dg/nbi.html). We can further secure data using a [private VPC](https://aws.amazon.com/blogs/machine-learning/securing-amazon-sagemaker-studio-connectivity-using-a-private-vpc/) and [PrivateLink](https://aws.amazon.com/privatelink/).

## Decision

- _proposed - We will look to offer [Amazon SageMaker](https://aws.amazon.com/sagemaker/) to our users_

## Proposal General Consequences

- [SageMaker costs](https://aws.amazon.com/sagemaker/pricing/) are based on usage and can vary significantly month-to-month depending on usage and instance type. We can provide [proactive notifications](https://aws.amazon.com/blogs/mt/setting-up-an-amazon-cloudwatch-billing-alarm-to-proactively-monitor-estimated-charges/) that will, for the first time, allow our users to understand the cost of the work that they are doing
- Reduced [operational cost](https://aws.amazon.com/blogs/machine-learning/lowering-total-cost-of-ownership-for-machine-learning-and-increasing-productivity-with-amazon-sagemaker/) and complexity
- Agility and change readiness: additional analytical services can be offered as they become available, without the considerable effort (extensive development of front and backend services, e.g. [Control Panel](https://controlpanel.services.analytical-platform.service.justice.gov.uk/)) that has led to users having to work elsewhere
- Better cost transparency: we will understand our tooling compute costs, which are currently very difficult to calculate
- For Foundation Models, SageMaker JumpStart does not download models from a public model zoo, so it can be used in fully locked-down environments, e.g. with **no internet access**
- Network access can be limited and scoped down for SageMaker JumpStart models, which helps teams improve the security posture of the environment
- Due to the VPC boundaries, access to the endpoint can also be limited via subnets and security groups, which adds an extra layer of security
- Leverage managed services like [SageMaker Studio](https://aws.amazon.com/sagemaker/studio/)
- If successful, we can close down our current tooling [EKS](https://aws.amazon.com/eks/). Although managed, EKS still requires a considerable amount of effort to run because of its endless upgrades; closing it down would let us get on with more useful tasks
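
The proactive cost notifications mentioned above could take the form of a CloudWatch billing alarm. The sketch below only builds the alarm parameters and does not call AWS. The alarm name, threshold and SNS topic ARN are hypothetical, and the `ServiceName` value used to scope the alarm to SageMaker is an assumption to verify against the billing metrics in your own account.

```python
from typing import Any, Dict, Optional


def build_billing_alarm(
    alarm_name: str,
    threshold_usd: float,
    sns_topic_arn: str,
    service_name: Optional[str] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for a CloudWatch alarm on estimated charges."""
    # Billing metrics are published under the AWS/Billing namespace in USD.
    dimensions = [{"Name": "Currency", "Value": "USD"}]
    if service_name is not None:
        # Optionally narrow the alarm to a single service's charges.
        dimensions.append({"Name": "ServiceName", "Value": service_name})
    return {
        "AlarmName": alarm_name,
        "Namespace": "AWS/Billing",
        "MetricName": "EstimatedCharges",
        "Dimensions": dimensions,
        "Statistic": "Maximum",
        "Period": 21600,  # six hours, in seconds
        "EvaluationPeriods": 1,
        "Threshold": float(threshold_usd),
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [sns_topic_arn],
    }


# Hypothetical values for illustration only.
alarm_params = build_billing_alarm(
    alarm_name="example-sagemaker-monthly-spend",
    threshold_usd=500,
    sns_topic_arn="arn:aws:sns:us-east-1:111122223333:example-billing-alerts",
    service_name="AmazonSageMaker",  # assumed dimension value; confirm before use
)
```

A dictionary like this could then be passed to the boto3 CloudWatch client's `put_metric_alarm` call. Billing metrics are published in `us-east-1`, so the client would need to target that region.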

### Disadvantages

- RStudio on Amazon SageMaker is a paid product and requires that each user is appropriately [licensed](https://docs.aws.amazon.com/sagemaker/latest/dg/rstudio-license.html). As part of the pilot we will need to understand our users' need for RStudio
@@ -0,0 +1,41 @@
---
owner_slack: "#data-platform-notifications"
title: ADR-010 Documentation
last_reviewed_on: 2024-01-17
review_in: 6 months
---

# <%= current_page.data.title %>

## Status

🤔 Proposed

## Context

We need to document how our platform is built, and provide guidance where needed for our users.

We have many places to store documents and this creates a challenge for existing and new members of our team.

Google Workspace is being retired in favour of Office365, so we need an alternative for Google Docs.

## Decision

The following locations will be used for documenting all things related to the Data Platform.

### Team and Technical

>Team information, ways of working and ADRs should be stored in the open in our technical documentation [here](https://technical-documentation.data-platform.service.justice.gov.uk/)
>Documentation directly relating to code should be stored in a `README.MD` next to the code in its repository

### Sensitive information

>Sensitive information, or information on users of the platform, should be stored in our internal repository [here](https://github.com/ministryofjustice/data-platform-internal-documentation)

### Diagrams

>_diagrams_

## Consequences

>Since the majority of our documentation and guidance is published in the open, we need to ensure that we do not publish any sensitive details or user data in text or screenshots.
@@ -0,0 +1,60 @@
---
owner_slack: "#data-platform-notifications"
title: ADR-011 Use separate AWS accounts for data domains and products
last_reviewed_on: 2024-01-17
review_in: 6 months
---

# <%= current_page.data.title %>

## Status

🤔 Proposed

## Context

The Data Platform will need to provide a secure location to store and share data to those who have been granted access. The use of a multi-account strategy will give the Data Platform a scalable storage architecture which adheres to the [AWS Well-Architected Framework](https://aws.amazon.com/architecture/well-architected/) pillars on operational excellence, security, reliability, and cost optimisation.

**/tldr**
Our current architecture is overly permissive in design and makes understanding responsibility and cost difficult.

Using separate AWS (Amazon Web Services) accounts for storing data will serve several purposes for MoJ, each contributing to improved governance, security and manageability.

## Decision

- _proposed_

## Proposal Consequences

### General consequences

- A shift in ownership and responsibility of cloud resources back to the teams that own the data
- We will need to understand what account owners need outside of single sign on, and account bootstrap
- Cost will be visible to owners, which aligns with the Technology Code of Practice point 12, [make your technology sustainable](https://www.gov.uk/guidance/the-technology-code-of-practice#make-your-technology-sustainable)
- Align with [NCSC cloud security guidance](https://www.ncsc.gov.uk/collection/cloud/the-cloud-security-principles/principle-3-separation-between-customers) on separation between customers (in our case domains) to defend against another customer having e.g. malicious code execution
- We will need to work with Modernisation Platform on improving our ability to dispense data accounts and ensure we do not impact their support
- We will define [Service Control Policies](https://docs.aws.amazon.com/organizations/latest/userguide/orgs_manage_policies_scps.html) against [AWS Organizations](https://docs.aws.amazon.com/organizations/latest/userguide/orgs_introduction.html)
- We will need functionality for users to request access to data and for [Data Owners](https://www.gov.uk/government/publications/essential-shared-data-assets-and-data-ownership-in-government/data-ownership-in-government-html#data-owner-2) to approve
- We will be able to give teams access to a project or temporary accounts for research (this could include other managed analytical tooling e.g. SageMaker) which then can be securely closed down with all associated resources removed
- Observability of accounts and data is simplified for account owners
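
As an illustration of the kind of Service Control Policy that could be attached to an organisational unit of data accounts, the sketch below builds a policy that denies requests outside an allowed set of regions. The choice of guardrail and the region names are assumptions made for this example, not decisions recorded in this ADR. A real policy would normally also exempt global services.

```python
import json
from typing import Any, Dict, Iterable


def build_region_restriction_scp(allowed_regions: Iterable[str]) -> Dict[str, Any]:
    """Return an SCP document that denies requests made outside the allowed regions."""
    regions = sorted(set(allowed_regions))
    if not regions:
        raise ValueError("at least one allowed region is required")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyOutsideAllowedRegions",
                "Effect": "Deny",
                # Simplification: production policies usually use NotAction
                # to exempt global services such as IAM.
                "Action": "*",
                "Resource": "*",
                "Condition": {
                    "StringNotEquals": {"aws:RequestedRegion": regions}
                },
            }
        ],
    }


# Example regions are illustrative only.
policy = build_region_restriction_scp(["eu-west-2", "eu-west-1"])
policy_json = json.dumps(policy, indent=2)
```

Because SCPs only set the maximum permissions available in an account, a guardrail like this would sit alongside the IAM permissions that account owners manage themselves, not replace them.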

Using separate AWS (Amazon Web Services) accounts for storing data will serve several purposes for MoJ, each contributing to improved governance, security, manageability, and efficiency.

Other reasons for using separate AWS accounts for data storage:

1. **Security Isolation:**
- **Data Segmentation:** Different types of data may have varying sensitivity levels. By using separate accounts, you can isolate highly sensitive data from less critical information, reducing the risk of unauthorised access or data breaches.
- **Access Control:** AWS Identity and Access Management (IAM) allows fine-grained control over who can access resources within an AWS account. Using separate accounts allows for better control and segregation of access permissions, limiting potential security vulnerabilities.
2. **Compliance Requirements:**
- **Regulatory Compliance:** Certain industries and regions have specific regulatory requirements regarding data storage and processing. Using separate AWS accounts can help you adhere to these compliance standards by providing clear boundaries and controls around data.
3. **Resource Management:**
- **Isolation of Resources:** Different business units or projects within an organisation may require their own set of AWS resources. Using separate accounts makes it easier to manage and isolate these resources, preventing interference or resource contention.
- **Resource Scaling:** Each AWS account has its own resource limits and can be independently scaled. This allows for better resource optimisation and avoids the risk of reaching account-wide limits.
4. **Cost Management:**
- **Billing and Budgeting:** AWS provides detailed billing reports for each account. By using separate accounts, you can better track and allocate costs to specific projects, teams, or departments. This facilitates more accurate budgeting and financial management. Tags provide some of these capabilities but are limited in their scope as they cannot be applied to all resources.
5. **Disaster Recovery:**
- **Isolation for Redundancy:** In the event of a disaster, having data stored in separate AWS accounts can act as a form of redundancy. If one account experiences issues, the others may remain unaffected, providing a level of data resilience.
6. **Third-Party Access:**
- **Vendor or Partner Access:** If external vendors or partners need access to specific data or services, setting up a separate account for them can facilitate controlled and secure access without compromising other data. If further restrictions on data access are required, [AWS Clean Rooms](https://docs.aws.amazon.com/clean-rooms/latest/userguide/what-is.html) can be explored.
7. **Ownership:**
- **Responsibility:** We need our users to take responsibility for storing data, in line with point 12 of the Technology Code of Practice, [Make your technology sustainable](https://www.gov.uk/guidance/the-technology-code-of-practice#make-your-technology-sustainable). We also need to inform our users of the cost associated with storing data, which is very difficult to work out in our current architecture.
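
The cost management point above can be sketched with a small comparison of account-based and tag-based cost allocation. The line items, account IDs and tag values below are entirely hypothetical and exist only to show the mechanics.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical billing line items; some resources cannot carry tags.
LINE_ITEMS: List[dict] = [
    {"account_id": "111111111111", "service": "S3", "cost": 120.0, "tags": {"team": "team-a"}},
    {"account_id": "111111111111", "service": "DataTransfer", "cost": 30.0, "tags": {}},
    {"account_id": "222222222222", "service": "S3", "cost": 80.0, "tags": {"team": "team-b"}},
    {"account_id": "222222222222", "service": "Athena", "cost": 20.0, "tags": {}},
]


def cost_by_account(items: List[dict]) -> Dict[str, float]:
    """Total cost per AWS account; every line item belongs to exactly one account."""
    totals: Dict[str, float] = defaultdict(float)
    for item in items:
        totals[item["account_id"]] += item["cost"]
    return dict(totals)


def cost_by_tag(items: List[dict], tag_key: str) -> Dict[str, float]:
    """Total cost per tag value; items without the tag fall into 'untagged'."""
    totals: Dict[str, float] = defaultdict(float)
    for item in items:
        totals[item["tags"].get(tag_key, "untagged")] += item["cost"]
    return dict(totals)
```

In this made-up data, grouping by account attributes every charge to an owner, while grouping by the `team` tag leaves 50.0 unattributed.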