feat: Updated GCP dbt (#154)
* feat: Added dbt GCP Configuration
* Added GCP dbt config with IAM, KMS, BigQuery, and Compute sections for improved data modeling and analytics in GCP.

* feat: Added Compute and DNS sections for GCP

* feat: Added Logging section for GCP dbt

* feat: Added Storage section for GCP dbt

* feat: Added SQL section for GCP dbt
* Converted DNS files to text format (resource_id).
* Added the complete GCP Compliance CIS v1.2.0 view, including all sections.
* Rearranged the order of queries.

* feat: Updated CTE queries within the sections in GCP dbt

* feat: Updated GCP dbt
ronsh12 authored Oct 24, 2023
1 parent 72adb62 commit 600378c
Showing 88 changed files with 1,716 additions and 0 deletions.
4 changes: 4 additions & 0 deletions gcp_compliance/.gitignore
@@ -0,0 +1,4 @@

target/
dbt_packages/
logs/
78 changes: 78 additions & 0 deletions gcp_compliance/README.md
@@ -0,0 +1,78 @@
# CloudQuery × dbt: GCP Compliance Package

## Overview

### Requirements

- [dbt](https://docs.getdbt.com/docs/installation)
- [PostgreSQL](https://www.postgresql.org/download/) or any other destination supported by both CloudQuery and dbt
- [CloudQuery](https://www.cloudquery.io/docs/quickstart) with the [GCP](https://www.cloudquery.io/docs/plugins/sources/gcp/overview) source and [PostgreSQL](https://www.cloudquery.io/docs/plugins/destinations/postgresql/overview) destination plugins

See the [quick guide](https://www.cloudquery.io/integrations/gcp/postgresql) for the GCP-Postgres integration.

#### dbt Installation

Below is an example of installing dbt for use with PostgreSQL.

First, install `dbt`:

```bash
pip install dbt-postgres
```

Create the profile directory:

```bash
mkdir -p ~/.dbt
```

Create a `profiles.yml` file in your profile directory (e.g. `~/.dbt/profiles.yml`):

```yaml
gcp_compliance: # This should match the name in your dbt_project.yml
  target: dev
  outputs:
    dev:
      type: postgres
      host: 127.0.0.1
      user: postgres
      pass: pass
      port: 5432
      dbname: postgres
      schema: public # default schema where dbt will build the models
      threads: 1 # number of threads to use when running in parallel
```

#### Testing the Connection

After setting up your `profiles.yml`, test the connection to ensure everything is configured correctly:

```bash
dbt debug
```

This command will tell you if dbt can successfully connect to your PostgreSQL instance.

#### Running Your dbt Project

Navigate to your dbt project directory, where your `dbt_project.yml` resides.

Before executing the `dbt run` command, it might be useful to check for any potential issues:

```bash
dbt compile
```

If everything compiles without errors, you can then execute:

```bash
dbt run
```

This command will run your `dbt` models and create tables/views in your PostgreSQL database as defined in your models.

### Usage

- Sync your data from GCP: `cloudquery sync gcp.yml postgres.yml`

- Run dbt: `dbt run`
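
The sync step above assumes `gcp.yml` and `postgres.yml` spec files. A minimal sketch of what they might contain (plugin versions, table selection, and the connection string are placeholders to adjust for your environment):

```yaml
# gcp.yml -- CloudQuery GCP source spec (version is a placeholder; pin to a current release)
kind: source
spec:
  name: gcp
  path: cloudquery/gcp
  version: "v9.0.0"
  tables: ["*"] # or only the gcp_* tables the models query
  destinations: ["postgresql"]
```

```yaml
# postgres.yml -- CloudQuery PostgreSQL destination spec (version and credentials are placeholders)
kind: destination
spec:
  name: postgresql
  path: cloudquery/postgresql
  version: "v7.0.0"
  spec:
    connection_string: "postgresql://postgres:pass@localhost:5432/postgres"
```

The destination should point at the same database and schema that `~/.dbt/profiles.yml` targets, so dbt can read the synced `gcp_*` tables.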
40 changes: 40 additions & 0 deletions gcp_compliance/dbt_project.yml
@@ -0,0 +1,40 @@

# Name your project! Project names should contain only lowercase characters
# and underscores. A good package name should reflect your organization's
# name or the intended use of these models
name: 'gcp_compliance'
version: '1.0.0'
config-version: 2

# This setting configures which "profile" dbt uses for this project.
profile: 'gcp_compliance'

# These configurations specify where dbt should look for different types of files.
# The `model-paths` config, for example, states that models in this project can be
# found in the "models/" directory. You probably won't need to change these!
model-paths: ["models"]
analysis-paths: ["analyses"]
test-paths: ["tests"]
seed-paths: ["seeds"]
macro-paths: ["macros"]
snapshot-paths: ["snapshots"]

clean-targets: # directories to be removed by `dbt clean`
- "target"
- "dbt_packages"

# Configuring models
# Full documentation: https://docs.getdbt.com/docs/configuring-models

# In this example config, we tell dbt to build all models in the example/
# directory as views. These settings can be overridden in the individual model
# files using the `{{ config(...) }}` macro.
models:
gcp_compliance:
# Config indicated by + and applies to all files under models/example/
# example:
# +materialized: view

17 changes: 17 additions & 0 deletions gcp_compliance/macros/bigquery/datasets_publicly_accessible.sql
@@ -0,0 +1,17 @@
{% macro bigquery_datasets_publicly_accessible(framework, check_id) %}
SELECT DISTINCT
d.id AS resource_id,
d._cq_sync_time AS sync_time,
'{{framework}}' AS framework,
'{{check_id}}' AS check_id,
'Ensure that BigQuery datasets are not anonymously or publicly accessible (Automated)' AS title,
d.project_id AS project_id,
CASE
WHEN
a->>'role' = 'allUsers'
OR a->>'role' = 'allAuthenticatedUsers'
THEN 'fail'
ELSE 'pass'
END AS status
FROM gcp_bigquery_datasets d, JSONB_ARRAY_ELEMENTS(d.access) AS a
{% endmacro %}
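
Each macro renders a complete `SELECT`, so a model can invoke it directly. A hypothetical model file (the path, framework label, and check ID here are illustrative, not taken from this commit):

```sql
-- models/bigquery/datasets_publicly_accessible.sql (hypothetical path)
{{ bigquery_datasets_publicly_accessible('cis_v1.2.0', '7.1') }}
```

dbt materializes this as a view or table named after the file, with one `pass`/`fail` row per dataset access entry.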
18 changes: 18 additions & 0 deletions gcp_compliance/macros/bigquery/datasets_without_default_cmek.sql
@@ -0,0 +1,18 @@
{% macro bigquery_datasets_without_default_cmek(framework, check_id) %}
SELECT DISTINCT
d.id AS resource_id,
d._cq_sync_time AS sync_time,
'{{framework}}' AS framework,
'{{check_id}}' AS check_id,
'Ensure that a Default Customer-managed encryption key (CMEK) is specified for all BigQuery Data Sets (Automated)' AS title,
d.project_id AS project_id,
CASE
WHEN
d.default_encryption_configuration->>'kmsKeyName' = ''
OR d.default_encryption_configuration->>'kmsKeyName' IS NULL -- TODO check if valid
THEN 'fail'
ELSE 'pass'
END AS status
FROM gcp_bigquery_datasets d
{% endmacro %}
20 changes: 20 additions & 0 deletions gcp_compliance/macros/bigquery/tables_not_encrypted_with_cmek.sql
@@ -0,0 +1,20 @@
{% macro bigquery_tables_not_encrypted_with_cmek(framework, check_id) %}
SELECT DISTINCT
d.id AS resource_id,
d._cq_sync_time AS sync_time,
'{{framework}}' AS framework,
'{{check_id}}' AS check_id,
'Ensure that all BigQuery Tables are encrypted with Customer-managed encryption key (CMEK) (Automated)' AS title,
d.project_id AS project_id,
CASE
WHEN
(t.encryption_configuration->>'kmsKeyName' IS NULL OR t.encryption_configuration->>'kmsKeyName' = '')
AND (d.default_encryption_configuration->>'kmsKeyName' IS NULL OR d.default_encryption_configuration->>'kmsKeyName' = '') -- fail only when neither a table key nor a dataset default key is set
THEN 'fail'
ELSE 'pass'
END AS status
FROM gcp_bigquery_datasets d
JOIN gcp_bigquery_tables t ON
d.dataset_reference->>'datasetId' = t.table_reference->>'datasetId' AND d.dataset_reference->>'projectId' = t.table_reference->>'projectId'
{% endmacro %}
19 changes: 19 additions & 0 deletions gcp_compliance/macros/compute/allow_traffic_behind_iap.sql
@@ -0,0 +1,19 @@
{% macro compute_allow_traffic_behind_iap(framework, check_id) %}
SELECT DISTINCT
gcf.name AS resource_id,
gcf._cq_sync_time AS sync_time,
'{{framework}}' AS framework,
'{{check_id}}' AS check_id,
'GCP CIS3.10 Ensure Firewall Rules for instances behind Identity Aware Proxy (IAP) only allow the traffic from Google Cloud Loadbalancer (GCLB) Health Check and Proxy Addresses (Manual)' AS title,
gcf.project_id AS project_id,
CASE
WHEN
NOT ARRAY['35.191.0.0/16', '130.211.0.0/22'] <@ gcf.source_ranges
AND NOT (gcf.value->>'I_p_protocol' = 'tcp'
AND ARRAY(SELECT JSONB_ARRAY_ELEMENTS_TEXT(gcf.value->'ports')) @> ARRAY['80'])
THEN 'fail'
ELSE 'pass'
END AS status
FROM {{ ref('expanded_firewalls') }} AS gcf
{% endmacro %}
16 changes: 16 additions & 0 deletions gcp_compliance/macros/compute/default_network_exist.sql
@@ -0,0 +1,16 @@
{% macro compute_default_network_exist(framework, check_id) %}
SELECT
"name" AS resource_id,
_cq_sync_time AS sync_time,
'{{framework}}' AS framework,
'{{check_id}}' AS check_id,
'Ensure that the default network does not exist in a project (Automated)' AS title,
project_id AS project_id,
CASE
WHEN
"name" = 'default'
THEN 'fail'
ELSE 'pass'
END AS status
FROM gcp_compute_networks
{% endmacro %}
19 changes: 19 additions & 0 deletions gcp_compliance/macros/compute/disks_encrypted_with_csek.sql
@@ -0,0 +1,19 @@
{% macro compute_disks_encrypted_with_csek(framework, check_id) %}
SELECT
"name" AS resource_id,
_cq_sync_time AS sync_time,
'{{framework}}' AS framework,
'{{check_id}}' AS check_id,
'Ensure VM disks for critical VMs are encrypted with Customer-Supplied Encryption Keys (CSEK) (Automated)' AS title,
project_id AS project_id,
CASE
WHEN
disk_encryption_key->>'sha256' IS NULL
OR disk_encryption_key->>'sha256' = ''
OR source_image_encryption_key->>'kms_key_name' IS NULL
OR source_image_encryption_key->>'kms_key_name' = ''
THEN 'fail'
ELSE 'pass'
END AS status
FROM gcp_compute_disks
{% endmacro %}
19 changes: 19 additions & 0 deletions gcp_compliance/macros/compute/flow_logs_disabled_in_vpc.sql
@@ -0,0 +1,19 @@
{% macro compute_flow_logs_disabled_in_vpc(framework, check_id) %}
SELECT DISTINCT
gcn.name AS resource_id,
gcn._cq_sync_time AS sync_time,
'{{framework}}' AS framework,
'{{check_id}}' AS check_id,
'Ensure that VPC Flow Logs is enabled for every subnet in a VPC Network (Automated)' AS title,
gcn.project_id AS project_id,
CASE
WHEN
gcs.enable_flow_logs = FALSE
THEN 'fail'
ELSE 'pass'
END AS status
FROM gcp_compute_networks gcn
JOIN gcp_compute_subnetworks gcs ON
gcn.self_link = gcs.network
{% endmacro %}
16 changes: 16 additions & 0 deletions gcp_compliance/macros/compute/instance_ip_forwarding_enabled.sql
@@ -0,0 +1,16 @@
{% macro compute_instance_ip_forwarding_enabled(framework, check_id) %}
SELECT
"name" AS resource_id,
_cq_sync_time AS sync_time,
'{{framework}}' AS framework,
'{{check_id}}' AS check_id,
'Ensure that IP forwarding is not enabled on Instances (Automated)' AS title,
project_id AS project_id,
CASE
WHEN
can_ip_forward = TRUE
THEN 'fail'
ELSE 'pass'
END AS status
FROM gcp_compute_instances
{% endmacro %}
@@ -0,0 +1,20 @@
{% macro compute_instances_with_default_service_account(framework, check_id) %}
SELECT DISTINCT
gci.name AS resource_id,
gci._cq_sync_time AS sync_time,
'{{framework}}' AS framework,
'{{check_id}}' AS check_id,
'Ensure that instances are not configured to use the default service account (Automated)' AS title,
gci.project_id AS project_id,
CASE
WHEN
gci."name" NOT LIKE 'gke-%'
AND gcisa->>'email' = (SELECT default_service_account
FROM gcp_compute_projects
WHERE project_id = gci.project_id)
THEN 'fail'
ELSE 'pass'
END AS status
FROM gcp_compute_instances gci, JSONB_ARRAY_ELEMENTS(gci.service_accounts) gcisa
{% endmacro %}
@@ -0,0 +1,20 @@
{% macro compute_instances_with_default_service_account_with_full_access(framework, check_id) %}
SELECT DISTINCT
gci.name AS resource_id,
gci._cq_sync_time AS sync_time,
'{{framework}}' AS framework,
'{{check_id}}' AS check_id,
'Ensure that instances are not configured to use the default service account with full access to all Cloud APIs (Automated)' AS title,
gci.project_id AS project_id,
CASE
WHEN
gcisa->>'email' = (SELECT default_service_account
FROM gcp_compute_projects
WHERE project_id = gci.project_id)
AND ARRAY['https://www.googleapis.com/auth/cloud-platform'] <@ ARRAY(SELECT JSONB_ARRAY_ELEMENTS_TEXT(gcisa->'scopes'))
THEN 'fail'
ELSE 'pass'
END AS status
FROM gcp_compute_instances gci, JSONB_ARRAY_ELEMENTS(gci.service_accounts) gcisa
{% endmacro %}
20 changes: 20 additions & 0 deletions gcp_compliance/macros/compute/instances_with_public_ip.sql
@@ -0,0 +1,20 @@
{% macro compute_instances_with_public_ip(framework, check_id) %}
SELECT DISTINCT
gci.name AS resource_id,
gci._cq_sync_time AS sync_time,
'{{framework}}' AS framework,
'{{check_id}}' AS check_id,
'Ensure that Compute instances do not have public IP addresses (Automated)' AS title,
gci.project_id AS project_id,
CASE
WHEN
gci."name" NOT LIKE 'gke-%'
AND ((ac4->>'nat_i_p' IS NOT NULL AND ac4->>'nat_i_p' != '') OR (ac6->>'nat_i_p' IS NOT NULL AND ac6->>'nat_i_p' != ''))
THEN 'fail'
ELSE 'pass'
END AS status
FROM gcp_compute_instances gci, JSONB_ARRAY_ELEMENTS(gci.network_interfaces) AS ni
LEFT JOIN JSONB_ARRAY_ELEMENTS(ni->'access_configs') AS ac4 ON TRUE
LEFT JOIN JSONB_ARRAY_ELEMENTS(ni->'ipv6_access_configs') AS ac6 ON TRUE
{% endmacro %}
@@ -0,0 +1,18 @@
{% macro compute_instances_with_shielded_vm_disabled(framework, check_id) %}
SELECT
"name" AS resource_id,
_cq_sync_time AS sync_time,
'{{framework}}' AS framework,
'{{check_id}}' AS check_id,
'Ensure Compute instances are launched with Shielded VM enabled (Automated)' AS title,
project_id AS project_id,
CASE
WHEN
(shielded_instance_config->>'enable_integrity_monitoring')::boolean = FALSE
OR (shielded_instance_config->>'enable_vtpm')::boolean = FALSE
OR (shielded_instance_config->>'enable_secure_boot')::boolean = FALSE
THEN 'fail'
ELSE 'pass'
END AS status
FROM gcp_compute_instances
{% endmacro %}
@@ -0,0 +1,18 @@
{% macro compute_instances_without_block_project_wide_ssh_keys(framework, check_id) %}
SELECT
gci.name AS resource_id,
gci._cq_sync_time AS sync_time,
'{{framework}}' AS framework,
'{{check_id}}' AS check_id,
'Ensure "Block Project-wide SSH keys" is enabled for VM instances (Automated)' AS title,
gci.project_id AS project_id,
CASE
WHEN
gcmi->>'key' IS NULL OR
NOT gcmi->>'value' = ANY ('{1,true,True,TRUE,y,yes}')
THEN 'fail'
ELSE 'pass'
END AS status
FROM gcp_compute_instances gci
LEFT JOIN JSONB_ARRAY_ELEMENTS(gci.metadata->'items') gcmi ON gcmi->>'key' = 'block-project-ssh-keys'
{% endmacro %}
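
The commit message mentions a complete GCP Compliance CIS v1.2.0 view covering all sections; since every macro emits the same column set (`resource_id`, `sync_time`, `framework`, `check_id`, `title`, `project_id`, `status`), a model can stitch them together with `union all`. A sketch under that assumption (model name and check IDs are illustrative):

```sql
-- models/gcp_compliance_cis_v1_2_0.sql (hypothetical)
{{ compute_default_network_exist('cis_v1.2.0', '3.1') }}
union all
{{ compute_instance_ip_forwarding_enabled('cis_v1.2.0', '4.6') }}
union all
{{ bigquery_datasets_publicly_accessible('cis_v1.2.0', '7.1') }}
```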
