Commit
Merge branch 'main' into feat/edit-pub-comment
Xavier Medrano authored and Xavier Medrano committed Dec 11, 2024
2 parents e9d542a + 29d8da2 commit d6dc68c
Showing 19 changed files with 440 additions and 67 deletions.
35 changes: 35 additions & 0 deletions .github/workflows/publish_docs.yml
@@ -0,0 +1,35 @@
name: Publish documentation to GitHub Pages

on:
  workflow_dispatch:
  push:
    branches:
      - main

jobs:
  build-and-deploy:
    name: Build and publish documentation
    runs-on: ubuntu-latest
    permissions:
      contents: write
    steps:
      - name: Check out repository
        uses: actions/checkout@v4

      - uses: tj-actions/changed-files@v45
        id: docs-changed
        with:
          files: |
            docs/*
      - name: Set up Quarto
        if: steps.docs-changed.outputs.any_changed == 'true'
        uses: quarto-dev/quarto-actions/setup@v2

      - name: Render and Publish
        if: steps.docs-changed.outputs.any_changed == 'true'
        uses: quarto-dev/quarto-actions/publish@v2
        with:
          path: docs
          target: gh-pages
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
1 change: 1 addition & 0 deletions .gitignore
@@ -25,6 +25,7 @@ debug.log
lametro/secrets.py
.env
.env.local
.env

.venv
media/
5 changes: 3 additions & 2 deletions Dockerfile
@@ -3,7 +3,7 @@ LABEL maintainer "DataMade <info@datamade.us>"

RUN apt-get update && \
apt-get install -y libpq-dev gcc gdal-bin gnupg && \
apt-get install -y libxml2-dev libxslt1-dev antiword unrtf poppler-utils \
apt-get install -y libxml2-dev libxslt1-dev antiword unrtf poppler-utils postgresql-client \
tesseract-ocr flac ffmpeg lame libmad0 libsox-fmt-mp3 \
sox libjpeg-dev swig libpulse-dev curl git && \
apt-get clean && \
@@ -19,6 +19,7 @@ RUN pip install pip==24.0 && \

COPY . /app

RUN DJANGO_SETTINGS_MODULE=councilmatic.minimal_settings python manage.py collectstatic --no-input
ENV DJANGO_SECRET_KEY 'foobar'
RUN python manage.py collectstatic --no-input

ENTRYPOINT ["/app/docker-entrypoint.sh"]
22 changes: 22 additions & 0 deletions README.md
@@ -216,6 +216,28 @@ docker-compose -f docker-compose.yml -f docker-compose.locust.yml run --service-
This will start the Locust web server on http://localhost:8089. For more details,
see the [Locust documentation](https://docs.locust.io/en/stable/).

## Review Apps

This repo is set up to deploy review apps on Heroku. Review apps pull from the staging database to match the experience of deploying as closely as possible. However, to prevent unapproved model changes from affecting the staging database, migrations do not run on review apps, so migrations still have to be reviewed locally.

## Updating the Documentation

To make changes to the documentation, [install Quarto](https://quarto.org/docs/get-started/).

Then, run the following in your terminal:

```bash
quarto preview docs
```

Make your changes to the `.qmd` files in the `docs/` directory. They will be automatically
reflected in your local version of the docs.

For more on authoring docs with Quarto, see [their Getting Started guide](https://quarto.org/docs/get-started/authoring/text-editor.html) and [documentation](https://quarto.org/docs/guide/).

The GitHub Pages site will rebuild automatically when your documentation changes are
merged into `main`.

## Errors / Bugs

If something is not behaving intuitively, it is a bug, and should be reported.
36 changes: 0 additions & 36 deletions councilmatic/minimal_settings.py

This file was deleted.

12 changes: 12 additions & 0 deletions councilmatic/settings.py
@@ -53,6 +53,18 @@
DEBUG = env.bool("DJANGO_DEBUG")
SHOW_TEST_EVENTS = env.bool("SHOW_TEST_EVENTS")
ALLOWED_HOSTS = env.list("DJANGO_ALLOWED_HOSTS")

# Derive allowed origins from configured hosts
CSRF_TRUSTED_ORIGINS = []

for host in ALLOWED_HOSTS:
    if host.startswith("."):
        origin = f"https://*{host}"
    else:
        origin = f"https://{host}"

    CSRF_TRUSTED_ORIGINS.append(origin)

COUNCILMATIC_SUPPRESS_LIVE_MEDIA = env.list("COUNCILMATIC_SUPPRESS_LIVE_MEDIA")

if env("LOCAL_DOCKER"):
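
The origin derivation added to `councilmatic/settings.py` in this hunk can be tried in isolation. The following is a minimal sketch, not part of the commit: the function name `derive_csrf_trusted_origins` and the host values are hypothetical, chosen only to show how a leading-dot host becomes a wildcard origin.

```python
def derive_csrf_trusted_origins(allowed_hosts):
    """Map ALLOWED_HOSTS-style entries to https:// origins.

    Hypothetical helper mirroring the loop in the settings hunk above.
    """
    origins = []
    for host in allowed_hosts:
        if host.startswith("."):
            # A leading dot means "this domain's subdomains", so the
            # origin gets a wildcard in front of the dot.
            origins.append(f"https://*{host}")
        else:
            origins.append(f"https://{host}")
    return origins


# Hypothetical host values, for illustration only.
example_hosts = [".example.com", "localhost"]
print(derive_csrf_trusted_origins(example_hosts))
# ['https://*.example.com', 'https://localhost']
```

Every derived origin uses the `https` scheme, so a host served over plain `http` would presumably need its origin added separately.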
2 changes: 2 additions & 0 deletions docs/.gitignore
@@ -0,0 +1,2 @@
/.quarto/
/_site/
22 changes: 22 additions & 0 deletions docs/_quarto.yml
@@ -0,0 +1,22 @@
project:
  type: website

website:
  title: "LA Metro Councilmatic Documentation"
  sidebar:
    style: "docked"
    search: true
    contents: auto
    tools:
      - icon: github
        menu:
          - text: Source Code
            href: https://github.com/Metro-Records/la-metro-councilmatic
          - text: Issue Tracker
            href: https://github.com/Metro-Records/la-metro-councilmatic/issues

format:
  html:
    theme: litera
    css: styles.css
    toc: true
58 changes: 58 additions & 0 deletions docs/commands.qmd
@@ -0,0 +1,58 @@
---
title: "Commands to Know"
order: 2
---

The LA Metro galaxy comes with several CLI commands, each with its own options.
This section identifies some of the most significant commands for Councilmatic, how to use them, and where to execute them.

Metro Councilmatic runs additional processes on the data after it is imported into the database.
If you do need to run a particular management command, read on for more information about the commands that comprise `hourly_processing` in the Metro dashboard.

### Refresh the Property Image Cache
Metro caches PDFs of board reports and event agendas. [This can raise issues.](https://github.com/datamade/la-metro-councilmatic/issues/347)
The [`refresh_pic` management command](https://github.com/datamade/django-councilmatic/blob/master/councilmatic_core/management/commands/refresh_pic.py) refreshes the document cache ([an S3 bucket connected to Metro Councilmatic via `property-image-cache`](https://github.com/datamade/property-image-cache)) by deleting potentially out-of-date versions of board reports and agendas.

```bash
# run the command and log the results (if on the server)
python manage.py refresh_pic >> /var/log/councilmatic/lametro-refreshpic.log 2>&1
```

### Create PDF packets
Metro Councilmatic has composite versions of event agendas (the event and all related board reports) and board reports (the report and its attachments). [A separate app assists in creating these PDF packets](https://github.com/datamade/metro-pdf-merger), and the [`compile_pdfs` command](https://github.com/datamade/la-metro-councilmatic/blob/master/lametro/management/commands/compile_pdfs.py) communicates with this app by telling it which packets to create.

```bash
# run the command and log the results (if on the server)
# documented in the `metro-pdf-merger` README: https://github.com/datamade/metro-pdf-merger#get-started
python manage.py compile_pdfs >> /var/log/councilmatic/lametro-compilepdfs.log 2>&1

python manage.py compile_pdfs --all_documents
```

### Convert report attachments into plain text
Metro Councilmatic allows users to query board reports via attachment text. The attachments must appear as plain text in the database: [`convert_attachment_text`](https://github.com/datamade/django-councilmatic/blob/master/councilmatic_core/management/commands/convert_attachment_text.py) helps accomplish this.

```bash
# run the command and log the results (if on the server)
python manage.py convert_attachment_text >> /var/log/councilmatic/lametro-convertattachments.log 2>&1

# update all documents
python manage.py convert_attachment_text --update_all
```

### Rebuild or update the search index
Haystack comes with a utility command for rebuilding and updating the search index. [Learn more in the Haystack docs.](https://django-haystack.readthedocs.io/en/master/management_commands.html)

```bash
# ideally, rebuild should be run with a small batch-size to avoid memory consumption issues
# https://github.com/datamade/devops/issues/42
# run the command and log the results (if on the server)
python manage.py rebuild_index --batch-size=200 >> /var/log/councilmatic/lametro-updateindex.log 2>&1

# update can be run with an age argument, which instructs Haystack to consider only bills updated within that many hours
python manage.py update_index --age=2

# update should be run in non-interactive mode when logging the results
# `noinput` tells Haystack to skip the prompts
python manage.py update_index --noinput
```
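
The commands on this page can also be driven from a small wrapper when several need to be run by hand. The sketch below is illustrative only: the `STEPS` list copies command lines from the examples above, the ordering is an assumption rather than a statement of what `hourly_processing` does, and `run_steps` is a hypothetical helper. It defaults to a dry run so nothing executes until you opt in.

```python
import subprocess

# Command lines taken from the examples on this page.
# The order here is an assumption made for illustration.
STEPS = [
    ["python", "manage.py", "refresh_pic"],
    ["python", "manage.py", "compile_pdfs"],
    ["python", "manage.py", "convert_attachment_text"],
    ["python", "manage.py", "update_index", "--age=2"],
]


def run_steps(steps, dry_run=True):
    """Run each command in order and return the command lines handled.

    With dry_run=True (the default) nothing is executed. With
    dry_run=False, a failing command raises CalledProcessError and
    stops the remaining steps.
    """
    handled = []
    for argv in steps:
        if not dry_run:
            subprocess.run(argv, check=True)
        handled.append(" ".join(argv))
    return handled


for line in run_steps(STEPS):
    print(line)
```

Call `run_steps(STEPS, dry_run=False)` from the project root, inside the application container or on the server, to actually execute the commands.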
62 changes: 62 additions & 0 deletions docs/debugging.qmd
@@ -0,0 +1,62 @@
---
title: "Debugging"
order: 1
---

## Don't Panic!

Many issues can arise in the Metro galaxy, from the shallowest part of the frontend to the deepest depths of the backend.
However, these issues are generally due to either a metadata error or a scraper error.

This documentation will focus on metadata errors. If you suspect you're experiencing a scraper issue, please refer to the
[debug documentation for the scrapers](https://metro-records.github.io/scrapers-lametro/debugging.html).

### Metadata Error
Metro performs a series of ETL tasks against its database. You can view the full pipeline
[here](https://github.com/datamade/la-metro-dashboard/blob/main/dags/hourly_processing.py).

Failures in the ETL pipeline might have a corresponding issue
[in the `la-metro-councilmatic` Sentry project](https://sentry.io/organizations/datamade/issues/?project=2131912).
However, steps sometimes run without failing yet don't produce the desired result.
Read on for more on each step of the pipeline, plus past failures and their resolutions.

#### `refresh_pic`

**Where it lives:** [Django Councilmatic](https://github.com/datamade/django-councilmatic/blob/2.5/councilmatic_core/management/commands/refresh_pic.py)<br />
**What it does:** Deletes [cached documents](https://github.com/datamade/property-image-cache) for recently updated bills and events

**Past issues:**

- [Cached event agenda was out of sync with Legistar](https://github.com/datamade/la-metro-councilmatic/issues/443).
We have since updated the logic for which documents to remove from the cache, so this error should be resolved,
but the linked issue contains instructions for resolving this error manually, in case we see a regression.

#### `compile_pdfs`

**Where it lives:** [LA Metro Councilmatic](https://github.com/datamade/la-metro-councilmatic/blob/main/lametro/management/commands/compile_pdfs.py)<br />
**What it does:** Notifies the [`metro-pdf-merger`](https://github.com/datamade/metro-pdf-merger) of new documents that need to be merged into a bill or event packet

**Past issues:**

- [Sometimes the worker fails to merge documents, resulting in missing packets](https://github.com/datamade/la-metro-councilmatic/issues/476).
There should be a corresponding error for this [in the `metro-pdf-merger` Sentry project](https://sentry.io/organizations/datamade/issues/?project=155211).
That project is pretty noisy, though, so you can also shell into the server (`metro-pdf-merger.datamade.us`) and tail or grep the worker logs to double-check.
- [The worker PDF merger has died mysteriously](https://github.com/datamade/metro-pdf-merger/issues/19).
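
When tailing or grepping the worker logs as suggested above, a few lines of Python can stand in for `grep | tail` if you want the most recent matches only. This is a minimal sketch under stated assumptions: `tail_matching` is a hypothetical helper, and the sample log lines are invented for illustration, since the real log location and format are not documented here.

```python
import re
from collections import deque


def tail_matching(lines, pattern, limit=20):
    """Return the last `limit` lines that match `pattern`, ignoring case."""
    regex = re.compile(pattern, re.IGNORECASE)
    # A bounded deque keeps only the most recent matches in memory.
    recent = deque(maxlen=limit)
    for line in lines:
        if regex.search(line):
            recent.append(line.rstrip("\n"))
    return list(recent)


# Invented log lines, for illustration only.
sample = [
    "INFO merged packet for event 1\n",
    "ERROR failed to merge packet for bill 2\n",
    "INFO merged packet for event 3\n",
    "error: worker timed out\n",
]
print(tail_matching(sample, r"error"))
# ['ERROR failed to merge packet for bill 2', 'error: worker timed out']
```

An open file handle can be passed as `lines`, so the same function works on a log file once you know its path on the server.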

#### `convert_attachment_text`

**Where it lives:** [Django Councilmatic](https://github.com/datamade/django-councilmatic/blob/2.5/councilmatic_core/management/commands/convert_attachment_text.py)<br />
**What it does:** Extracts text from bill attachments for indexing

**Past issues:**

- N/A

#### `update_index`

**Where it lives:** [Haystack](https://django-haystack.readthedocs.io/en/master/management_commands.html#update-index)<br />
**What it does:** Updates the search index

**Past issues:**

- N/A