feat: Adding support for Spanner with PG Dialect in Database Retriever Service #469

Merged
13 changes: 7 additions & 6 deletions docs/datastore/spanner.md → docs/datastore/spanner_gsql.md
@@ -39,9 +39,10 @@
1. Set environment variables.

```bash
export INSTANCE=my-spanner-instance
export DATABASE=my-spanner-database
export INSTANCE=my-spanner-gsql-instance
export DATABASE=assistantdemo
export REGION=regional-us-central1
export INSTANCE_DESCRIPTION="My Spanner GSQL Instance"
```

1. Create a Cloud Spanner instance:
@@ -50,7 +51,7 @@
gcloud spanner instances create $INSTANCE \
--config=$REGION \
--nodes=1 \
--description="My Spanner Instance"
--description="$INSTANCE_DESCRIPTION"
```
1. Create a database within the Cloud Spanner instance:

@@ -120,8 +121,8 @@
# Example for Spanner
kind: "spanner-gsql"
project: <YOUR_PROJECT_ID>
instance: my-spanner-instance
database: my-spanner-database
instance: my-spanner-gsql-instance
database: assistantdemo
service_account_key_file: <PATH_TO_SERVICE_ACCOUNT_KEY_FILE>
```

@@ -139,4 +140,4 @@ Clean up after completing the demo.

```bash
gcloud spanner instances delete $INSTANCE
```
144 changes: 144 additions & 0 deletions docs/datastore/spanner_pg.md
@@ -0,0 +1,144 @@
# Set up and configure Spanner

## Before you begin

1. Make sure you have a Google Cloud project and billing is enabled.

1. Set your `PROJECT_ID` environment variable:

```bash
export PROJECT_ID=<YOUR_PROJECT_ID>
```

1. [Install](https://cloud.google.com/sdk/docs/install) the gcloud CLI.

1. Set the gcloud project:

```bash
gcloud config set project $PROJECT_ID
```

1. Enable APIs:

```bash
gcloud services enable spanner.googleapis.com
```

1. [Install Python][install-python] and set up a Python [virtual environment][venv].

1. Make sure you have Python 3.11 or later installed:

```bash
python -V
```
[install-python]: https://cloud.google.com/python/docs/setup#installing_python
[venv]: https://cloud.google.com/python/docs/setup#installing_and_using_virtualenv
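
The version requirement above can also be checked from inside Python, which is convenient in a setup script. This is a minimal sketch; the helper name `meets_minimum` is hypothetical and not part of the repository.

```python
import sys

# Minimum interpreter version stated in the step above.
MIN_VERSION = (3, 11)


def meets_minimum(version_info=None, minimum=MIN_VERSION):
    """Return True if the given (or running) interpreter is at least `minimum`."""
    if version_info is None:
        version_info = sys.version_info
    # Compare only (major, minor); tuple comparison is lexicographic.
    return tuple(version_info[:2]) >= tuple(minimum)


# Reports whether the interpreter running this snippet is new enough.
print("Python version OK:", meets_minimum())
```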

## Create a Cloud Spanner instance

1. Set environment variables.

```bash
export INSTANCE=my-spanner-pg-instance
export DATABASE=assistantdemo
export REGION=regional-us-central1
export DATABASE_DIALECT=POSTGRESQL
export INSTANCE_DESCRIPTION="My Spanner PG Instance"
```

1. Create a Cloud Spanner instance:

```bash
gcloud spanner instances create $INSTANCE \
--config=$REGION \
--nodes=1 \
--description="$INSTANCE_DESCRIPTION"
```
1. Create a database within the Cloud Spanner instance:

```bash
gcloud spanner databases create $DATABASE --instance=$INSTANCE --database-dialect=$DATABASE_DIALECT
```
1. Verify that the database was created by running a query with the `gcloud` tool:

```bash
gcloud spanner databases execute-sql $DATABASE \
--instance=$INSTANCE \
--sql="SELECT 1"
```
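
Because `INSTANCE_DESCRIPTION` contains spaces, it has to reach `gcloud` as a single argument. The sketch below builds the same two create commands as argument lists, which sidesteps shell word splitting. The function names are hypothetical, and only the `gcloud` flags already shown in the steps above are used.

```python
import shlex


def create_instance_cmd(instance, config, description, nodes=1):
    """Build the argv for `gcloud spanner instances create` as a list."""
    return [
        "gcloud", "spanner", "instances", "create", instance,
        f"--config={config}",
        f"--nodes={nodes}",
        # One list element, so spaces in the description cannot split it.
        f"--description={description}",
    ]


def create_database_cmd(database, instance, dialect="POSTGRESQL"):
    """Build the argv for `gcloud spanner databases create`."""
    return [
        "gcloud", "spanner", "databases", "create", database,
        f"--instance={instance}",
        f"--database-dialect={dialect}",
    ]


cmd = create_instance_cmd(
    "my-spanner-pg-instance", "regional-us-central1", "My Spanner PG Instance"
)
# shlex.join quotes each argument so the line is safe to paste into a shell.
print(shlex.join(cmd))
print(shlex.join(create_database_cmd("assistantdemo", "my-spanner-pg-instance")))
# To actually run a command (assumes gcloud is installed and authenticated):
#   import subprocess; subprocess.run(cmd, check=True)
```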

## Create a Service Account

1. Set environment variables.

```bash
export SA_NAME=spanner-service
export SA_EMAIL=$SA_NAME@$PROJECT_ID.iam.gserviceaccount.com
```

1. Create a service account with the `gcloud iam service-accounts create` command:
```bash
gcloud iam service-accounts create $SA_NAME --description="Service account for Cloud Spanner" --display-name="Cloud Spanner Service Account"
```

1. Grant the required permissions by assigning roles to the service account. For Cloud Spanner read and write access, grant the `roles/spanner.databaseUser` and `roles/spanner.databaseAdmin` roles with the `gcloud projects add-iam-policy-binding` command:

```bash
gcloud projects add-iam-policy-binding $PROJECT_ID --member="serviceAccount:$SA_EMAIL" --role="roles/spanner.databaseUser" --condition=None
gcloud projects add-iam-policy-binding $PROJECT_ID --member="serviceAccount:$SA_EMAIL" --role="roles/spanner.databaseAdmin" --condition=None
```

1. Generate a key file for the service account. This key file will be used for authentication when accessing GCP resources programmatically.
```bash
gcloud iam service-accounts keys create key.json --iam-account $SA_EMAIL
```
1. Use the generated key file (`key.json`) to authenticate your application when accessing Cloud Spanner.
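
Before pointing the application at `key.json`, it can help to confirm that the file is a service account key for the expected account. This is a minimal sketch, assuming the standard key file fields `type` and `client_email`; the helper names are hypothetical.

```python
import json


def service_account_email(sa_name, project_id):
    """Mirror the SA_EMAIL variable: <name>@<project>.iam.gserviceaccount.com."""
    return f"{sa_name}@{project_id}.iam.gserviceaccount.com"


def check_key_file(path, expected_email):
    """Return the parsed key, or raise ValueError if it does not match."""
    with open(path, encoding="utf-8") as f:
        key = json.load(f)
    if key.get("type") != "service_account":
        raise ValueError(f"{path}: not a service account key")
    if key.get("client_email") != expected_email:
        raise ValueError(
            f"{path}: key belongs to {key.get('client_email')!r}, "
            f"expected {expected_email!r}"
        )
    return key
```

A typical call would be `check_key_file("key.json", service_account_email("spanner-service", project_id))`, where `project_id` holds the same value as `$PROJECT_ID`.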

## Initialize data

1. Change into the retrieval service directory:

```bash
cd genai-databases-retrieval-app/retrieval_service
```

1. Install requirements:

```bash
pip install -r requirements.txt
```

1. Make a copy of `example-config.yml` and name it `config.yml`.

```bash
cp example-config.yml config.yml
```

1. Update `config.yml` with your database information.

```yaml
host: 0.0.0.0
datastore:
    # Example for Spanner
    kind: "spanner-postgres"
    project: <YOUR_PROJECT_ID>
    instance: my-spanner-pg-instance
    database: assistantdemo
    service_account_key_file: <PATH_TO_SERVICE_ACCOUNT_KEY_FILE>
```

1. Populate the database with data:

```bash
python run_database_init.py
```
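
A typo in `config.yml` is easy to miss, so a quick check of the `datastore` section can save a failed run. The sketch below validates an already parsed mapping against the five fields shown in the example config. The `SpannerPgConfig` dataclass and `parse_datastore` helper are hypothetical stand-ins, not the service's actual `Config` class, and reading the YAML file itself (for example with PyYAML's `yaml.safe_load`) is left out to keep the example standard-library only.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class SpannerPgConfig:
    """Hypothetical mirror of the `datastore` section of config.yml."""
    kind: str
    project: str
    instance: str
    database: str
    service_account_key_file: str


def parse_datastore(raw):
    """Validate a dict (the parsed `datastore` mapping) and build a config."""
    expected = {f.name for f in fields(SpannerPgConfig)}
    missing = expected - set(raw)
    unknown = set(raw) - expected
    if missing or unknown:
        raise ValueError(f"missing={sorted(missing)} unknown={sorted(unknown)}")
    if raw["kind"] != "spanner-postgres":
        raise ValueError(f"unexpected kind: {raw['kind']!r}")
    return SpannerPgConfig(**raw)


config = parse_datastore({
    "kind": "spanner-postgres",
    "project": "my-project",  # placeholder for <YOUR_PROJECT_ID>
    "instance": "my-spanner-pg-instance",
    "database": "assistantdemo",
    "service_account_key_file": "key.json",  # placeholder path
})
print(config.instance, config.database)
```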

## Clean up resources

Clean up after completing the demo.

1. Delete the Cloud Spanner instance:

```bash
gcloud spanner instances delete $INSTANCE
```
1 change: 1 addition & 0 deletions retrieval_service/datastore/__init__.py
@@ -22,6 +22,7 @@
providers.postgres.Config,
providers.cloudsql_postgres.Config,
providers.spanner_gsql.Config,
providers.spanner_postgres.Config,
providers.alloydb.Config,
providers.cloudsql_mysql.Config,
providers.neo4j_graph.Config,
2 changes: 2 additions & 0 deletions retrieval_service/datastore/providers/__init__.py
@@ -20,6 +20,7 @@
neo4j_graph,
postgres,
spanner_gsql,
spanner_postgres,
)

__ALL__ = [
@@ -29,5 +30,6 @@
cloudsql_postgres,
firestore,
spanner_gsql,
spanner_postgres,
neo4j_graph,
]
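
Both hunks register the new provider module next to `spanner_gsql`. How the service then selects a provider is not shown in this diff. One common pattern, sketched here purely as an assumption, is to pick the config class whose `kind` matches the value in `config.yml`. All class and function names below are hypothetical.

```python
from dataclasses import dataclass
from typing import ClassVar


@dataclass
class SpannerGsqlConfig:
    # ClassVar keeps `kind` a class-level tag rather than an instance field.
    kind: ClassVar[str] = "spanner-gsql"
    instance: str
    database: str


@dataclass
class SpannerPostgresConfig:
    kind: ClassVar[str] = "spanner-postgres"
    instance: str
    database: str


# Stand-in for the list of provider Config classes extended in this change.
CONFIG_CLASSES = [SpannerGsqlConfig, SpannerPostgresConfig]


def config_for(kind, **settings):
    """Instantiate the config class whose `kind` tag matches."""
    for cls in CONFIG_CLASSES:
        if cls.kind == kind:
            return cls(**settings)
    raise ValueError(f"unknown datastore kind: {kind!r}")


cfg = config_for(
    "spanner-postgres", instance="my-spanner-pg-instance", database="assistantdemo"
)
print(type(cfg).__name__)
```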