Scrape daily posts from subreddits #2499

name: Scrape daily posts from subreddits
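on:
  schedule:
    # Runs every day at 03:00, 09:00, 15:00, and 21:00 UTC.
    - cron: '0 3,9,15,21 * * *'
# Alternative trigger (disabled): run on every push to main.
# on:
#   push:
#     branches:
#       - main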
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Set up Python 3.9
        uses: actions/setup-python@v4
        with:
          # Quoted so YAML does not parse the version as a float.
          python-version: "3.9"
      - name: Install Poetry
        uses: abatilo/actions-poetry@v2
        with:
          poetry-version: "1.4.2"
      - name: Install poetry dependencies
        run: poetry install
      - name: Run black
        run: |
          poetry run black .
      - name: Run ruff
        run: |
          poetry run ruff check . --fix
      - name: Prefect Cloud login
        run: |
          poetry run prefect config set "PREFECT_API_KEY=${{ secrets.PREFECT_API_KEY }}"
          poetry run prefect cloud workspace set --workspace "${{ secrets.PREFECT_WORKSPACE }}"
      - name: Run posts scraper
        run: |
          poetry run python flows/reddit_posts_scraper.py
      - name: Update posts table in BigQuery
        run: |
          poetry run python flows/update_posts_in_bq.py
      - name: Add & Commit
        # Commit even if an earlier step failed, so partial results are kept.
        if: always()
        uses: EndBug/add-and-commit@v9.1.1
        with:
          # The arguments for the `git add` command (see the action's README).
          # Default: '.'
          add: '.'
          # The name of the user that will be displayed as the author of the commit.
          # Default: depends on the default_author input
          author_name: Arturo Martínez
          # The email of the user that will be displayed as the author of the commit.
          # Default: depends on the default_author input
          author_email: ${{ secrets.EMAIL }}
          # The message for the commit.
          # Default: 'Commit from GitHub Actions (name of the workflow)'
          message: 'scrape daily posts from subreddits'
          # Whether to push the commit and, if any, its tags to the repo.
          # Can also be used to set the git push arguments (see the action's README).
          # Default: true
          push: true