[SNOW-103] Create a streamlit app template #68

Merged
merged 32 commits into from
Aug 13, 2024
4fc91ad
Initial commit for streamlit app template
jaymedina Jul 19, 2024
4bf59d4
example_secrets.toml
jaymedina Jul 19, 2024
df7a308
refactor main app script
jaymedina Jul 19, 2024
619d2c3
update example_secrets.toml
jaymedina Jul 19, 2024
efad37b
New queries.py. Working with snowflake data. Moved out sample data.
jaymedina Jul 19, 2024
40693fd
New widgets.py
jaymedina Jul 20, 2024
ad77817
New utils.py. Some reformatting of app.py
jaymedina Jul 22, 2024
d61dc38
Fixing imports
jaymedina Jul 22, 2024
387ffb6
Turn queries.py vars into global vars. Finish appy.py
jaymedina Jul 22, 2024
41ce43c
New tests/ and toolkit/ folder. Moving files
jaymedina Jul 22, 2024
dbd08c2
New requirements.txt. Small updates to app.py
jaymedina Jul 22, 2024
bb6e076
New Dockerfile
jaymedina Jul 22, 2024
76e86cb
Updated requirements.txt
jaymedina Jul 23, 2024
865eba4
Updated Dockerfile to use specific server address and port
jaymedina Jul 24, 2024
63818e5
Create README.md
jaymedina Jul 24, 2024
957d67e
Add examples
jaymedina Jul 24, 2024
5c262b9
New .dockerignore to prevent secrets.toml from being added to docker …
jaymedina Jul 24, 2024
5836a7e
Update README.md
jaymedina Jul 24, 2024
9c03639
Add docker-compose file
jaymedina Jul 24, 2024
b2017e3
Adding steps to "Launch your app"
jaymedina Jul 24, 2024
11e6044
Update EC2 instructions
jaymedina Jul 24, 2024
7207fde
Update to build your app section
jaymedina Jul 25, 2024
e53376e
.
jaymedina Jul 31, 2024
748a814
New test suite. Updated documentation
jaymedina Aug 3, 2024
02bf472
Final updates to README
jaymedina Aug 3, 2024
3ecf2c7
Ignore pycache
jaymedina Aug 3, 2024
68cd091
Adding pre-commit hook for black and isort
jaymedina Aug 13, 2024
c741ba1
Introduce SYNID global var for queries.py
jaymedina Aug 13, 2024
a72097b
Updated example in README.md
jaymedina Aug 13, 2024
903a3df
Separate .gitignore
jaymedina Aug 13, 2024
36aaa59
Untracked __pycache__ directories as specified in .gitignore
jaymedina Aug 13, 2024
d96317e
Updated syntax in .gitignore
jaymedina Aug 13, 2024
2 changes: 1 addition & 1 deletion .gitignore
Original file line number Diff line number Diff line change
@@ -2,4 +2,4 @@
.terraform*
terraform.tfstate*
*.csv
.DS_Store
.DS_Store
2 changes: 2 additions & 0 deletions streamlit_template/.dockerignore
@@ -0,0 +1,2 @@
# .dockerignore
.streamlit/secrets.toml
2 changes: 2 additions & 0 deletions streamlit_template/.gitignore
@@ -0,0 +1,2 @@
.streamlit/secrets.toml
__pycache__/
12 changes: 12 additions & 0 deletions streamlit_template/.pre-commit-config.yaml
@@ -0,0 +1,12 @@
repos:
  - repo: https://github.com/pycqa/isort
    rev: 5.13.2
    hooks:
      - id: isort
        name: isort (python)

  - repo: https://github.com/psf/black
    rev: 24.3.0
    hooks:
      - id: black
        language_version: python3
7 changes: 7 additions & 0 deletions streamlit_template/.streamlit/config.toml
@@ -0,0 +1,7 @@
[theme]
#primaryColor="#F63366"
primaryColor="#47C7DA"
backgroundColor="#FFFFFF"
secondaryBackgroundColor="#F0F2F6"
textColor="#262730"
font="sans serif"
4 changes: 4 additions & 0 deletions streamlit_template/.streamlit/example_secrets.toml
@@ -0,0 +1,4 @@
[snowflake]
user = "EXAMPLE_USER"
password = "EXAMPLE_PASSWORD"
account = "example-0000000"
22 changes: 22 additions & 0 deletions streamlit_template/Dockerfile
@@ -0,0 +1,22 @@
# Use the official Python base image
FROM python:3.11-slim

# Set environment variables
ENV PYTHONUNBUFFERED=1
ENV PYTHONDONTWRITEBYTECODE=1

# Copy requirements file
COPY requirements.txt .

# Install dependencies
RUN pip install --upgrade pip \
&& pip install -r requirements.txt

# Copy the rest of the application code
COPY . .

# Expose the port Streamlit runs on
EXPOSE 8501

# Command to run the app
CMD ["streamlit", "run", "app.py", "--server.port=8501", "--server.address=0.0.0.0"]
136 changes: 136 additions & 0 deletions streamlit_template/README.md
@@ -0,0 +1,136 @@
## Introduction
This area of the repository serves as a template for developing your own Streamlit application for internal use within Sage Bionetworks.
The template is designed to source data from the databases in Snowflake and compose a dashboard using the various tools provided by [Streamlit](https://docs.streamlit.io/)
and plotly.

Below is the directory structure for all the components within `streamlit_template`. The following sections break down the purpose of
each component within `streamlit_template`, and show how to use these components to design your own application and deploy it on an AWS EC2 instance.

```
streamlit_template/
├── .streamlit/
│ ├── config.toml
│ └── example_secrets.toml
├── tests/
│ ├── __init__.py
│ └── test_app.py
├── toolkit/
│ ├── __init__.py
│ ├── queries.py
│ ├── utils.py
│ └── widgets.py
├── Dockerfile
├── app.py
├── requirements.txt
└── style.css
```

## Create your own Streamlit application

### 1. Setup and Enable Access to Snowflake

- Create a fork of this repository under your GitHub user account.
- Within the `.streamlit` folder, you will need a file called `secrets.toml`, which Streamlit reads before connecting to Snowflake.
Use the contents of `example_secrets.toml` as a syntax guide for setting up `secrets.toml`. See the [Snowflake documentation](https://docs.snowflake.com/en/user-guide/admin-account-identifier#using-an-account-name-as-an-identifier) for how to find your
account name.
- Test your connection to Snowflake by running the example Streamlit app from the base of this directory. This launches the application on port 8501, the default port for Streamlit applications.

```
streamlit run app.py
```

> [!CAUTION]
> Do not commit your `secrets.toml` file to your forked repository. Keep your credentials secure and do not expose them to the public.

### 2. Build your Queries

Once you've completed the setup above, you can begin working on your SQL queries.
- Navigate to `queries.py` under the `toolkit/` folder.
- Your queries will be string objects. Assign each of them an easy-to-remember variable name, as they will be imported into `app.py` later on.
- We encourage you to test these queries in a SQL Worksheet in Snowflake's Snowsight before running them in your application.

Example:
```
QUERY_NUMBER_OF_FILES = """

select
count(*) as number_of_files
from
synapse_data_warehouse.synapse.node_latest
where
project_id = '53214489'
and
node_type = 'file';
"""
```
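
If several queries target the same project, the project ID can live in one module-level constant. The commit history mentions a `SYNID` global variable in `queries.py`; the exact layout below is an assumption for illustration, and the ID is simply the one from the example above.

```python
# Hypothetical layout for toolkit/queries.py. How SYNID is actually used in
# the template is not shown in this document, so treat this as a sketch.
SYNID = "53214489"  # project ID taken from the example query above

QUERY_NUMBER_OF_FILES = f"""
select
    count(*) as number_of_files
from
    synapse_data_warehouse.synapse.node_latest
where
    project_id = '{SYNID}'
and
    node_type = 'file';
"""
```

Interpolating with an f-string is reasonable for a trusted constant like this one; it should not be used for values that come from user input.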

### 3. Build your Widgets

Your widgets will be the main visual component of your Streamlit application.

- Navigate to `widgets.py` under the `toolkit/` folder.
- Modify the imports as necessary. By default we are using `plotly` to design our widgets.
- Create a function for each widget. For guidance, follow one of the examples in `widgets.py`.

### 4. Build your Application

Here is where all your work on `queries.py` and `widgets.py` comes together.
- Navigate to `app.py` to begin developing.
- Import the queries you developed in Step 2.
- Import the widgets you developed in Step 3.
- Begin developing! Use the pre-existing `app.py` in the template as a guide for structuring your application.

> [!TIP]
> `utils.py` houses the functions used to connect to Snowflake and run your SQL queries. Make sure to reserve an area
> in the script for using `get_data_from_snowflake` with your queries from Step 2.
>
> Example:
>
> ```
> from toolkit.queries import (QUERY_ENTITY_DISTRIBUTION, QUERY_PROJECT_SIZES,
>                              QUERY_PROJECT_DOWNLOADS, QUERY_UNIQUE_USERS)
> from toolkit.utils import get_data_from_snowflake
>
> entity_distribution_df = get_data_from_snowflake(QUERY_ENTITY_DISTRIBUTION)
> project_sizes_df = get_data_from_snowflake(QUERY_PROJECT_SIZES)
> project_downloads_df = get_data_from_snowflake(QUERY_PROJECT_DOWNLOADS)
> unique_users_df = get_data_from_snowflake(QUERY_UNIQUE_USERS)
> ```
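
The body of `get_data_from_snowflake` is not shown here, so the sketch below only illustrates the general "query in, rows out" shape of such a helper. It is an assumption-laden stand-in: `fetch_rows` is a hypothetical name, an in-memory SQLite database with made-up values replaces Snowflake (both expose a Python DB-API cursor), and it returns a list of dicts, whereas the template's `app.py` indexes the result like a pandas DataFrame.

```python
import sqlite3


def fetch_rows(connection, query: str) -> list[dict]:
    """Run a query on a DB-API connection and return each row as a dict."""
    cursor = connection.cursor()
    try:
        cursor.execute(query)
        # cursor.description holds one tuple per column; item 0 is the name.
        columns = [description[0] for description in cursor.description]
        return [dict(zip(columns, row)) for row in cursor.fetchall()]
    finally:
        cursor.close()


# Stand-in for Snowflake: an in-memory SQLite table with made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("create table node_latest (project_id text, node_type text)")
conn.executemany(
    "insert into node_latest values (?, ?)",
    [("1", "file"), ("1", "file"), ("1", "folder")],
)
rows = fetch_rows(
    conn,
    "select count(*) as number_of_files from node_latest where node_type = 'file'",
)
print(rows)  # prints [{'number_of_files': 2}]
```

Keeping the connection handling in one helper means `app.py` only deals with query constants and returned data.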

### 5. Test your Application

We encourage implementing unit and regression tests in your application, particularly if there are components that involve interacting with the application
to display and/or transform data (e.g. buttons, dropdown menus, sliders, and so on).

- Navigate to `tests/test_app.py` to modify the existing script.
- The default tests use [Streamlit's AppTest tool](https://docs.streamlit.io/develop/api-reference/app-testing/st.testing.v1.apptest#run-an-apptest-script) to launch the application and retrieve its components. Please modify these existing tests or create brand new ones
as you see fit.

> [!TIP]
> Make sure to launch the test suite from the base directory of `streamlit_template/` (i.e. `pytest tests/test_app.py`)
> to avoid import issues.

### 6. Dockerize your Application

- Update the `requirements.txt` file with the packages used in any of the scripts above.
- Ensure you have pushed all your changes to your fork of the repository that you are working in (remember not to commit your `secrets.toml` file).
- **_(Optional)_** You can choose to push a Docker image to the GitHub Container Registry to pull it directly from the container registry when ready to deploy.
For instructions on how to deploy your Docker image to the GitHub Container Registry, [see here](https://docs.github.com/en/packages/working-with-a-github-packages-registry/working-with-the-container-registry).

### 7. Launch your Application on AWS EC2

- Create an EC2: Linux Docker product from the Sage Service Catalog.
- Go to _Provisioned Products_ in the menu on the left-hand side.
- Once your EC2 product's `status` is set to `Available`, click it and navigate to the _Events_ tab.
- Click the URL next to `ConnectionURI` to launch a shell session in your instance.
- Navigate to your home directory (`cd ~`).
- Clone your repository in your desired working directory.
- Create your `secrets.toml` file again. The Docker image of your Streamlit application will not have the `secrets.toml` for security reasons.
- Build your Docker image from the Dockerfile in the repository.
- Run your Docker container from the image, and make sure to have your `secrets.toml` mounted and the 8501 port specified, like so:
```
docker run -p 8501:8501 \
-v "$PWD/secrets.toml:/.streamlit/secrets.toml" \
<image name>
```
> [!TIP]
> If you would like to leave the app running after you close your shell session, be sure to run the container detached (i.e. include `-d` in the `docker run` command, before the image name).
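
The `docker run` invocation above can also be assembled from a small script, which makes the port, the mount, and the detach flag explicit. This is a sketch under stated assumptions: `docker_run_args` is a hypothetical helper, and the container path `/.streamlit/secrets.toml` assumes the app runs from `/` because the Dockerfile sets no `WORKDIR` (Docker requires the container side of a bind mount to be an absolute path).

```python
from pathlib import Path


def docker_run_args(
    image: str, secrets_file: str, port: int = 8501, detach: bool = True
) -> list[str]:
    """Build the argument list for running the app container."""
    host_path = Path(secrets_file).resolve()  # bind mounts need an absolute host path
    args = ["docker", "run"]
    if detach:
        args.append("-d")  # keep the container running after the shell closes
    args += [
        "-p", f"{port}:{port}",
        # Assumed container path; ":ro" mounts the secrets file read-only.
        "-v", f"{host_path}:/.streamlit/secrets.toml:ro",
        image,
    ]
    return args


print(" ".join(docker_run_args("my-image", "secrets.toml")))
```

The resulting list can be passed to `subprocess.run(..., check=True)` on the EC2 instance; `my-image` is a placeholder for whatever name you gave your image.
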
Empty file.
60 changes: 60 additions & 0 deletions streamlit_template/app.py
@@ -0,0 +1,60 @@
import numpy as np
import streamlit as st
from toolkit.queries import (
    QUERY_ENTITY_DISTRIBUTION,
    QUERY_PROJECT_DOWNLOADS,
    QUERY_PROJECT_SIZES,
    QUERY_UNIQUE_USERS,
)
from toolkit.utils import get_data_from_snowflake
from toolkit.widgets import plot_download_sizes, plot_unique_users_trend

# Custom CSS for styling
with open("style.css") as f:
    st.markdown(f"<style>{f.read()}</style>", unsafe_allow_html=True)


def main():

    # 1. Retrieve the data using your queries in queries.py
    entity_distribution_df = get_data_from_snowflake(QUERY_ENTITY_DISTRIBUTION)
    project_sizes_df = get_data_from_snowflake(QUERY_PROJECT_SIZES)
    project_downloads_df = get_data_from_snowflake(QUERY_PROJECT_DOWNLOADS)
    unique_users_df = get_data_from_snowflake(QUERY_UNIQUE_USERS)

    # 2. Transform the data as needed
    convert_to_gib = 1024 * 1024 * 1024
    project_sizes = dict(
        PROJECT_ID=list(project_sizes_df["PROJECT_ID"]),
        TOTAL_CONTENT_SIZE=list(project_sizes_df["TOTAL_CONTENT_SIZE"]),
    )
    # Convert the byte total so it matches the unit shown in the metric below
    total_data_size = round(
        sum(project_sizes["TOTAL_CONTENT_SIZE"]) / convert_to_gib, 2
    )
    average_project_size = round(
        np.mean(project_sizes["TOTAL_CONTENT_SIZE"]) / convert_to_gib, 2
    )

    # 3. Format the app, and visualize the data with your widgets in widgets.py
    # -------------------------------------------------------------------------
    # Row 1 -------------------------------------------------------------------
    st.markdown("### Monthly Overview :calendar:")
    col1, col2, col3 = st.columns([1, 1, 1])
    col1.metric("Total Storage Occupied", f"{total_data_size} GB", "7.2 GB")
    col2.metric("Avg. Project Size", f"{average_project_size} GB", "8.0 GB")
    col3.metric("Annual Cost", "102,000 USD", "10,000 USD")

    # Row 2 -------------------------------------------------------------------
    st.markdown("### Unique Users Report :bar_chart:")
    st.plotly_chart(plot_unique_users_trend(unique_users_df))

    # Row 3 -------------------------------------------------------------------
    st.plotly_chart(plot_download_sizes(project_downloads_df, project_sizes_df))

    # Row 4 -------------------------------------------------------------------
    st.markdown("### Entity Trends :pencil:")
    st.dataframe(entity_distribution_df)


if __name__ == "__main__":
    main()
10 changes: 10 additions & 0 deletions streamlit_template/requirements.txt
@@ -0,0 +1,10 @@
black==24.3.0
isort==5.13.2
numpy==1.26.3
streamlit==1.36.0
pandas==2.2.2
plotly==5.22.0
pytest==8.3.2
pre-commit==3.6.0
snowflake-connector-python==3.9.1
snowflake-snowpark-python==1.15.0
52 changes: 52 additions & 0 deletions streamlit_template/style.css
@@ -0,0 +1,52 @@
/* Logo */
/* Adapted from Zachary Blackwood */
/* [data-testid="stSidebar"] {
background-image: url(https://streamlit.io/images/brand/streamlit-logo-secondary-colormark-darktext.png);
background-size: 200px;
background-repeat: no-repeat;
background-position: 4px 20px;
} */


/* Card */
/* Adapted from https://startbootstrap.com/theme/sb-admin-2 */
div.css-1r6slb0.e1tzin5v2 {
background-color: #FFFFFF;
border: 1px solid #CCCCCC;
padding: 5% 5% 5% 10%;
border-radius: 5px;

border-left: 0.5rem solid #9AD8E1 !important;
box-shadow: 0 0.15rem 1.75rem 0 rgba(58, 59, 69, 0.15) !important;

}

label.css-mkogse.e16fv1kl2 {
color: #36b9cc !important;
font-weight: 700 !important;
text-transform: uppercase !important;
}


/* Move block container higher */
div.block-container.css-18e3th9.egzxvld2 {
margin-top: -5em;
}


/* Hide hamburger menu and footer */
div.css-r698ls.e8zbici2 {
display: none;
}

footer.css-ipbk5a.egzxvld4 {
display: none;
}

footer.css-12gp8ed.eknhn3m4 {
display: none;
}

div.vg-tooltip-element {
display: none;
}
Empty file.
67 changes: 67 additions & 0 deletions streamlit_template/tests/test_app.py
@@ -0,0 +1,67 @@
"""A suite of unit tests for the streamlit app in the base directory. We encourage
adding tests into this suite to ensure functionality within your streamlit app, particularly
for the components that allow users to interact with the app (buttons, dropdown menus, etc).

These tests were all written using Streamlit's AppTest class. See here for more details:
https://docs.streamlit.io/develop/api-reference/app-testing/st.testing.v1.apptest#run-an-apptest-script

A few considerations:

1. This suite is meant to be run from the base directory, not from the tests directory.
2. The streamlit app is meant to be run from the base directory.
3. The streamlit app is assumed to be called ``app.py``.
"""

import os
import sys

import pytest
from streamlit.testing.v1 import AppTest

# Ensure that the base directory is in PYTHONPATH so ``toolkit`` and other tools can be found
sys.path.append(os.path.abspath(os.path.join(os.path.dirname(__file__), "..")))

# The timeout limit to wait for the app to load before shutdown (in seconds)
DEFAULT_TIMEOUT = 30


@pytest.fixture(scope="module")
def app():
    return AppTest.from_file(
        "app.py", default_timeout=DEFAULT_TIMEOUT
    ).run()  # Point to your main Streamlit app file


def test_monthly_overview(app):
    """
    Ensure that the Monthly Overview section is being displayed
    with the appropriate labels in the right order.
    """

    # Access the Monthly Overview columns in Row 1
    total_storage_occupied = app.columns[0].children[0]
    avg_project_size = app.columns[1].children[0]
    annual_cost = app.columns[2].children[0]

    # Check that the labels are correct for each metric
    assert total_storage_occupied.label == "Total Storage Occupied"
    assert avg_project_size.label == "Avg. Project Size"
    assert annual_cost.label == "Annual Cost"


def test_plotly_charts(app):
    """Ensure both plotly charts are being displayed."""

    plotly_charts = app.get("plotly_chart")

    assert plotly_charts is not None
    assert len(plotly_charts) == 2


def test_dataframe(app):
    """Ensure that the dataframe is being displayed."""

    dataframe = app.dataframe

    assert dataframe is not None
    assert len(dataframe) == 1
1 change: 1 addition & 0 deletions streamlit_template/toolkit/__init__.py
@@ -0,0 +1 @@
import toolkit