From 9bb8371c356d84dfb84d141f1f689cfbd1bb1d95 Mon Sep 17 00:00:00 2001 From: clarotech Date: Thu, 17 Nov 2022 14:12:35 +0000 Subject: [PATCH 1/2] Minor Typos in Documentation (#2881) --- .../importing-exporting-data-airlock.md | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/docs/using-tre/tre-for-research/importing-exporting-data-airlock.md b/docs/using-tre/tre-for-research/importing-exporting-data-airlock.md index ad33d6d001..8444db2679 100644 --- a/docs/using-tre/tre-for-research/importing-exporting-data-airlock.md +++ b/docs/using-tre/tre-for-research/importing-exporting-data-airlock.md @@ -3,9 +3,9 @@ This guide will take you through the process of importing data into a TRE workspace, and exporting data from a workspace to the outside world, using the Airlock feature. -The Airlock feature is intended for ad-hoc use when you need to bring in and export out files that you need for your research, and it ensures that when you import or export this data, all the appropriate approvals and procedures configured by your organisation take place. +The Airlock feature is intended for ad-hoc use when you need to bring in and export out files that you need for your research. It ensures that when you import or export this data all the appropriate approvals and procedures configured by your organisation take place. -You can read more about the Airock feature in the [Airlock documentation](../../azure-tre-overview/airlock.md). +You can read more about the Airlock feature in the [Airlock documentation](../../azure-tre-overview/airlock.md). ## Importing data to a workspace @@ -21,7 +21,7 @@ To bring in external data to a secure TRE workspace so you can use it for your r 1. Fill in a suitable **Title** for your request (make this short but descriptive to help you and others identify it in a list of many other requests) -1. Provide a **Business Jutification** for bringing the data into the workspace (this will be used to help your organisation's data stewards decide whether to approve or reject the request) +1. Provide a **Business Justification** for bringing the data into the workspace (this will be used to help your organisation's data stewards decide whether to approve or reject the request) 1. Click *Create* when ready. This will create your draft request and allow you to proceed with adding the data you'd like to import @@ -31,7 +31,7 @@ To bring in external data to a secure TRE workspace so you can use it for your r ### Step 2: Add data to your import request -1. The request you've just created should pop up automatically; however you can return to it at any time within the Airlock page by finding it in the list of requests. (Use the *My requests* quick filter to find it more easily) +1. The request you've just created should pop up automatically; however, you can return to it at any time within the Airlock page by finding it in the list of requests. (Use the *My requests* quick filter to find it more easily) 2. Click *Generate* in the **Files** section to generate a Storage SAS URL to use for uploading your data. @@ -50,7 +50,7 @@ To bring in external data to a secure TRE workspace so you can use it for your r ### Step 3: Get your approved data -The request will be in an *In Review* state until it's either approved or rejected by your Airlock Manager(s) manually or by an automated workflow, depending on your organisation's specific configuration. +The request will be in an *In Review* state until it is either approved or rejected by your Airlock Manager(s) manually or by an automated workflow (depending on your organisation's specific configuration). !!! note Your organisation may have the Airlock Notifier service configured which will send email notifications when your request has been approved/rejected, or you may have another mechanism in place. Please check with your TRE administrator. @@ -70,7 +70,7 @@ If your request is approved, you can follow the below steps to get your data fro - With the Azure CLI, you can use `az storage blob download --file /path/to/write/to --blob-url SAS_URL`. [More info](https://docs.microsoft.com/en-us/cli/azure/storage/blob?view=azure-cli-latest#az-storage-blob-download) !!! tip - If you're using a Workspace VM that uses one of the standard TRE Data Science VM images, you'll likely have both Storage Explorer and the Azure CLI pre-installed. + If you are using a Workspace VM that uses one of the standard TRE Data Science VM images, you will likely have both Storage Explorer and the Azure CLI pre-installed. ## Exporting data from a workspace From d61b3ec8025101005dc84c8620e5bba08b4af13b Mon Sep 17 00:00:00 2001 From: clarotech Date: Thu, 17 Nov 2022 14:45:45 +0000 Subject: [PATCH 2/2] Multiple edits to address Typos/Grammar in text (#2878) --- docs/azure-tre-overview/airlock.md | 76 +++++++++++++++--------------- 1 file changed, 38 insertions(+), 38 deletions(-) diff --git a/docs/azure-tre-overview/airlock.md b/docs/azure-tre-overview/airlock.md index 0a7f8c0e0d..a73b20d209 100644 --- a/docs/azure-tre-overview/airlock.md +++ b/docs/azure-tre-overview/airlock.md @@ -1,8 +1,8 @@ # Airlock -In a Trusted Research Environment (TRE) the workspaces represent a security boundary that enables researchers to access data, execute analysis, apply algorithms and collect reports. The airlock capability brings the only mechanism that allows users to `import` or `export` data, tools or other file based artefacts in a secure fashion with a human approval. -This constitutes the mechanism focused in preventing data exfiltration and securing TRE and its workspaces from inappropriate data, while allowing researchers to work on their projects and execute their tasks. -The airlock feature brings several actions: ingress/egress Mechanism; Data movement; Security gates; Approval mechanism and Notifications. As part of TRE's Safe settings all activity must be tracked for auditiing purpose. +In a Trusted Research Environment (TRE) the workspaces represent a security boundary that enables researchers to access data, execute analysis, apply algorithms and collect reports. The airlock capability is the only mechanism that allows users to `import` or `export` data, tools or other file based artefacts in a secure fashion with a human approval. +This constitutes the mechanism focused on preventing data exfiltration and securing TRE and its workspaces from inappropriate data, while allowing researchers to work on their projects and execute their tasks. +The airlock feature brings several actions: ingress/egress Mechanism; Data movement; Security gates; Approval mechanism and Notifications. As part of TRE's Safe settings all activity must be tracked for auditing purposes. The Airlock feature aims to address these goals: @@ -10,7 +10,7 @@ The Airlock feature aims to address these goals: * Provide a process to allow approved data to be imported through the security boundary of a TRE Workspace. -* TRE provides functionality to track requests and decisions, supporting cycle of revision, approval or rejection. +* TRE provides functionality to track requests and decisions, supporting cycles of revision, approval or rejection. * Data being imported with an airlock import process can be automatically scanned for security issues. @@ -20,24 +20,24 @@ The Airlock feature aims to address these goals: * All steps within the airlock process are audited. -Typically in a TRE, the Airlock feature would be used to allow a researcher to export outputs of a research project such as summary results. With the airlock, data to export must go though a human review, typically done by a data governance team. +Typically in a TRE, the Airlock feature would be used to allow a researcher to export the outputs of a research project such as summary results. With the airlock, data to be exported must go through a human review, typically undertaken by a data governance team. -The Airlock feature will create events on every meaningful step of the processes. This will enable increased flexibility by allowing an organization to extend the notification mechanism. +The Airlock feature will create events on every meaningful step of the process. This will enable increased flexibility by allowing an organization to extend the notification mechanism. ## Ingress/Egress Mechanism -The Airlock allows a TRE user to start `import` or `export` process to a given workspace. A number of milestones must be reached in order to complete a successful import or export. These milestones are defined using the following states: +The Airlock allows a TRE user to start the `import` or `export` process to a given workspace. A number of milestones must be reached in order to complete a successful import or export. These milestones are defined using the following states: -1. **Draft**: An Airlock request has been created but has not yet started. The TRE User/Researcher has now access to a storage location and he must identify the data to be processed. At this point the airlock import/export processes allow a single file to be processed. However a compressed file may be used (zip). -2. **Submitted**: The request was submitted by the ressearcher (not yet processed). -3. **In-Review**: The request is ready to be reviewed. This state can be reached directly from Submitted state or after going through a successful security scan (found clean). +1. **Draft**: An Airlock request has been created but has not yet started. The TRE User/Researcher now has access to a storage location and he must identify the data to be processed. At this point the airlock import/export processes allow a single file to be processed. However a compressed file may be used (zip). +2. **Submitted**: The request was submitted by the researcher (not yet processed). +3. **In-Review**: The request is ready to be reviewed. This state can be reached directly from the Submitted state or after going through a successful security scan (found clean). 4. **Approval In-progress**: The Airlock request has been approved, however data movement is still ongoing. -5. **Approved**: The Airlock request has been approved. At this state, data has been securely verified and manually reviewed. The data is now in its final location. For an import process the data is now available in the TRE workspace. It can be accessed by the requestor from within the workspace. +5. **Approved**: The Airlock request has been approved. At this state, data has been securely verified and manually reviewed. The data is now in its final location. For an import process the data is now available in the TRE workspace, it can be accessed by the requestor from within the workspace. 6. **Rejection In-progress**: The Airlock request has been rejected, however data movement is still ongoing. 7. **Rejected**: The Airlock request has been rejected. The data in the process was rejected manually by the Airlock Manager. 8. **Cancelled**: The Airlock request was manually cancelled by the requestor TRE user, a Workspace owner or a TRE administrator. The cancelation is only allowed when the request is not actively changing (i.e. **Draft** or **In-Review** state). 9. **Blocking In-progress**: The Airlock request has been blocked, however data movement is still ongoing. -10. **Blocked By Scan**: The Airlock request has been blocked. The security analysis found issues in the submitted data, and consequently quarantined the data. +10. **Blocked By Scan**: The Airlock request has been blocked. The security analysis found issues in the submitted data and consequently quarantined the data. ```mermaid graph TD @@ -60,28 +60,28 @@ graph TD ``` > Airlock state flow diagram for an Airlock export request -When an airlock process is created the initial state is **Draft** and the required infrastructure will get created providing a single container to isolate the data in the request. Once completed, the user user will be able to get a link for this container inside the storage account (URL + SAS token) that he can use to upload the desired data to be processed (import or export). +When an airlock process is created the initial state is **Draft** and the required infrastructure will get created providing a single container to isolate the data in the request. Once completed, the user will be able to get a link for this container inside the storage account (URL + SAS token) that he can use to upload the desired data to be processed (import or export). This storage location is external for import (`stalimex`) or internal for export (`stalexint`), however only accessible to the requestor (ex: a TRE user/researcher). -The user will be able to upload a file to the provided storage location, using any tool of their preference: [Azure Storage Explorer](https://azure.microsoft.com/en-us/features/storage-explorer/) or [AzCopy](https://docs.microsoft.com/en-us/azure/storage/common/storage-use-azcopy-v10) which is a command line. +The user will be able to upload a file to the provided storage location, using any tool of their preference: [Azure Storage Explorer](https://azure.microsoft.com/en-us/features/storage-explorer/) or [AzCopy](https://docs.microsoft.com/en-us/azure/storage/common/storage-use-azcopy-v10) which is a command line tool. The user Submits the request (TRE API call) starting the data movement (to the `stalimip` - import in-progress or `stalexip` - export in-progress). The airlock request is now in state **Submitted**. -If enabled, the Security Scanning is started. In case security flaws are found, the request state becomes **Blocking In-progress** while the data is moved to blocked storage (either import blocked `stalimblocked` or export blocked`stalexblocked`). In this case, the request is finalized with the state **Blocked By Scan**. -If the Security Scanning does not identify security flaws, the request state becomes **In-Review**. Simultaneously a notification is sent to the Airlock Manager user. The user needs to ask for the container url using the TRE API (SAS token + URL with READ permission). +If enabled, the Security Scanning is started. In the case that security flaws are found, the request state becomes **Blocking In-progress** while the data is moved to blocked storage (either import blocked `stalimblocked` or export blocked `stalexblocked`). In this case, the request is finalized with the state **Blocked By Scan**. +If the Security Scanning does not identify any security flaws, the request state becomes **In-Review**. Simultaneously, a notification is sent to the Airlock Manager user. The user needs to ask for the container URL using the TRE API (SAS token + URL with READ permission). -> The Security Scanning can be disabled, chaging the request state from **Submitted** straight to **In-Review**. +> The Security Scanning can be disabled, changing the request state from **Submitted** straight to **In-Review**. The Airlock Manager will manually review the data using the tools of their choice available in the TRE workspace. Once review is completed, the Airlock Manager will have to *Approve* or *Reject* the airlock proces, though a TRE API call. At this point, the request will change state to either **Approval In-progress** or **Rejection In-progress**, while the data movement occurs moving afterwards to **Approved** or **Rejected** accordingly. The data will now be in the final storage destination: `stalexapp` - export approved or `stalimapp` - import approved. -With this state change a notification will be triggered to the requestor including the location for the processed data in the form of an URL + SAS token. +With this state change, a notification will be triggered to the requestor including the location of the processed data in the form of an URL + SAS token. ## Data movement -For any airlock process there is data movement either **into** a TRE workspace (in import process) or **from** a TRE workspace (in export process). Being the TRE a Workspace boundary, there are networking configurations designed to achieve this goal. The data movement will guarantee that the data is automatically verified for securty flaws and manually reviewed, before placing data inside the TRE Workspace. -Also, the process guarantees that data is not tampered with through out the process. +For any airlock process, there is data movement either **into** a TRE workspace (in import process) or **from** a TRE workspace (in export process). Being a TRE Workspace boundary, there are networking configurations designed to achieve this goal. The data movement will guarantee that the data is automatically verified for security flaws and manually reviewed, before placing data inside the TRE Workspace. +Also, the process guarantees that data is not tampered with throughout the process. -In an import process, data will transition from more public locations (yet confined to the requestor) to TRE workspace storage, after guaranteeding security automatically and by manual review. +In an import process, data will transition from more public locations (yet confined to the requestor) to TRE workspace storage, after guaranteeing security automatically and by manual review. -In an export process data will transition from internal locations (available to the requestor) to public locations in the TRE, after going through a manual review. +In an export process, data will transition from internal locations (available to the requestor) to public locations in the TRE, after going through a manual review. Considering that the Airlock requests may require large data movements, the operations can have longer durations, hence becoming the operations asynchronous. This is why states like **Approval In-progress**, **Rejection In-progress** or **Blocking In-progress** will be set while there are data movement operations. @@ -89,37 +89,37 @@ Considering that the Airlock requests may require large data movements, the oper ## Security Scan -The identified data in a airlock proces, will be submited to a security scan. If the security scan identifies issues the data is quarantined, and a report is added to the process metadata. Both the requestor and Workspace Owner are notified. For successful security scan, the data will remain in state **In-progress**, and accessible to the Workspace Owner. +The identified data in a airlock proces, will be submited to a security scan. If the security scan identifies issues the data is quarantined and a report is added to the process metadata. Both the requestor and Workspace Owner are notified. For a successful security scan, the data will remain in state **In-progress**, and accessible to the Workspace Owner. -> * The Security scan will be optional, behind a feature flag, enabled by script -> * The outcome of security scan will be either the in-progress (`stalexip`) storage or blocked (`stalexblocked`) -> * An airlock process will guarantee that the content being imported/exported is secure. It is envisioned that a set of **security gates** are identified to be executed successfully for a process to be approve. +> * The Security scan will be optional, behind a feature flag enabled by a script +> * The outcome of the security scan will be either the in-progress (`stalexip`) storage or blocked (`stalexblocked`) +> * An airlock process will guarantee that the content being imported/exported is secure. It is envisioned that a set of **security gates** are identified to be executed successfully for a process to be approved. ## Approval mechanism -The approval mechanism, is bundled with any airlock process, providing a specific way to `approve` or `reject` the data. This mechanism will allow the Airlock Managers to explicitly approve/reject the process, after having acess to the data. The Airlock Manager users will be able to execute a manual review on the data using the tools available to him in a reviewal TRE Workspace. -Once this manual review is executed, Airlock Managers can proactivelly approve or reject the airlock request. +The approval mechanism, is bundled with any airlock process, providing a specific way to `approve` or `reject` the data. This mechanism will allow the Airlock Managers to explicitly approve/reject the process, after having access to the data. The Airlock Manager users will be able to execute a manual review of the data using the tools available to him in a reviewal TRE Workspace. +Once this manual review is executed, the Airlock Managers can proactively approve or reject the airlock request. -The only goal of the Approval mechanism is to provide a cycle of revision, approval or rejection, while tracking the decision. +The only goal of the Approval mechanism is to provide a cycle of revision, approval or rejection while tracking the decision. This mechanism will provide access to the data in the airlock process, and will be able to use a VM in TRE workspace. The data review will be the Airlock Manager responsibility > * It is envisioned that this mechanism to be more flexible and extensible. -> * The `Airlock Manager` is a role defined at workspace instance level and assigned to identities. Initially the `Owner` role will be used. +> * The `Airlock Manager` is a role defined at the workspace instance level and assigned to identities. Initially, the `Owner` role will be used. ## Notifications -Throughout the airlock process, the notification mechanism will notify the relevant people to the process. Both the requestor (TRE User/Researcher) and the Workspace Owner will be notified by email, of the relevant process events. +Throughout the airlock process, the notification mechanism will notify the relevant people of the process. Both the requestor (TRE User/Researcher) and the Workspace Owner will be notified by email of the relevant process events. -Whenever the airlock process changes to a state of **Draft**, **Submitted**, **Approved**, **Rejected**, **Approval In-progress**,**Rejection In-progress**, **Blocked By Scan** or **Cancelled**, the process requestor gets notified. -When the state changes to `In-progress` the Workspace Onwer (Airlock Manager) gets notified. +Whenever the airlock process changes to a state of **Draft**, **Submitted**, **Approved**, **Rejected**, **Approval In-progress**, **Rejection In-progress**, **Blocked By Scan** or **Cancelled**, the process requestor gets notified. +When the state changes to `In-progress` the Workspace Owner (Airlock Manager) gets notified. > * The Notification mechanism is also data-driven, allowing an organization to extend the notifications behavior. The mechanism is exemplified with a Logic App determining the notifications logic. > * Notifications will work with All TRE users being AAD users (guests or not), with email defined – if not, notifications will not be sent. ## Architecture -The Airlock feature is supported by infrastructure at the TRE and workspace level, containing a set of storage accounts. Each Airlock request, will provision and use unique storage containers with the request id in it's name. +The Airlock feature is supported by infrastructure at the TRE and workspace level, containing a set of storage accounts. Each Airlock request will provision and use unique storage containers with the request id in its name. ```mermaid graph LR @@ -156,7 +156,7 @@ Workspace: * `stalexrej` - workspace storage (st) airlock (al) export (ex) rejected (rej) * `stalexblocked` - workspace storage (st) airlock (al) export (ex) blocked -> * The external storage accounts (`stalimex`, `stalexapp`), are not bound to any vnet and accessible (with SAS token) via internet. +> * The external storage accounts (`stalimex`, `stalexapp`), are not bound to any vnet and are accessible (with SAS token) via the internet > * The internal storage account (`stalexint`) is bound to the workspace vnet, so ONLY TRE Users/Researchers on that workspace can access it > * The (export) in-progress storage account (`stalexip`) is bound to the workspace vnet > * The (export) blocked storage account (`stalexblocked`) is bound to the workspace vnet @@ -173,12 +173,12 @@ In the TRE Core, the TRE API will provide the airlock API endpoints allowing to | Method | Endpoint | Description | |---|---|---| | `POST` | `/api/workspaces/{workspace_id}/requests` | Create an Airlock request (in **Draft**) | -| `POST` | `/api/workspaces/{workspace_id}/requests/{airlock_request_id}/link` | Get the url and token to acccess Airlock Request | `POST` | `/api/workspaces/{workspace_id}/requests/{airlock_request_id}/submit` | Submits an Airlock request | +| `POST` | `/api/workspaces/{workspace_id}/requests/{airlock_request_id}/link` | Get the url and token to access an Airlock Request | `POST` | `/api/workspaces/{workspace_id}/requests/{airlock_request_id}/submit` | Submits an Airlock request | | `POST` | `/api/workspaces/{workspace_id}/requests/{airlock_request_id}/review` | Reviews an Airlock request | | `POST` | `/api/workspaces/{workspace_id}/requests/{airlock_request_id}/cancel` | Cancels an Airlock request | -container | -Also in the airlock feature we have the **Airlock Processor** which will handle the events that are created throughout the process, signalling state changes from blobs created, status changed or security scans finalized. + +Also in the airlock feature there is the **Airlock Processor** which handles the events that are created throughout the process, signalling state changes from blobs created, status changed or security scans finalized. ## Airlock flow