This repository has been archived by the owner on Dec 20, 2024. It is now read-only.

Prefect #115

Merged 24 commits on Oct 17, 2024
Update prefect deployment
bram-vdberg committed Oct 15, 2024
commit 5fd12666a4e20ecd5e8dd7dd29dc067f3dd70b7e
3 changes: 3 additions & 0 deletions .gitignore
@@ -129,3 +129,6 @@ dmypy.json

# Pyre type checker
.pyre/

# Vim swap files
**/*.swp
Contributor

This is fine, I guess, but my personal preference would be a small .gitignore with only project-specific content. For local setup, I use .git/info/exclude.

Is this approach with a huge .gitignore better for, say, using dev containers?

Contributor

Or a global .gitignore

In ~/.gitconfig:

[core]
	excludesfile = /Users/bh2smith/.gitignore_global

which contains things like:

TODO.md
.venv/
.env.bkp
env/
venv/
.DS_Store
.idea/
.vscode/
out/

Collaborator Author

> Is this approach with a huge .gitignore better for, say, using dev containers?

Not that I know of. I didn't know you could add it to .git/info/exclude. Thanks!
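For anyone following the thread, the per-clone exclude file can also be maintained from a script. The sketch below is only an illustration: `add_local_excludes` is a hypothetical helper, not part of this PR, and it assumes `.git` is a regular directory (not a worktree or submodule, where `.git` is a file).

```python
from pathlib import Path
from typing import List


def add_local_excludes(repo_root: Path, patterns: List[str]) -> List[str]:
    """Append ignore patterns to <repo_root>/.git/info/exclude.

    Hypothetical helper for illustration. Patterns already listed in the
    file are skipped. Returns the patterns that were actually added.
    """
    git_dir = repo_root / ".git"
    if not git_dir.is_dir():
        # Assumption: a plain clone, where .git is a directory.
        raise FileNotFoundError(f"{git_dir} is not a directory")

    exclude_file = git_dir / "info" / "exclude"
    exclude_file.parent.mkdir(exist_ok=True)
    text = exclude_file.read_text() if exclude_file.exists() else ""
    known = set(text.splitlines())

    added: List[str] = []
    for pattern in patterns:
        if pattern not in known:
            known.add(pattern)
            added.append(pattern)

    if added:
        with exclude_file.open("a") as fh:
            # Make sure existing content ends with a newline before appending.
            if text and not text.endswith("\n"):
                fh.write("\n")
            fh.write("\n".join(added) + "\n")
    return added
```

Patterns written this way stay local to the clone and are never committed, which keeps editor-specific entries such as `**/*.swp` out of the shared .gitignore.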

35 changes: 34 additions & 1 deletion prefect/deployment.py
@@ -1,14 +1,18 @@
import os
import re
import sys
import logging
import pandas as pd
from io import StringIO
from dotenv import load_dotenv
from datetime import datetime, timedelta, timezone
from dune_client.client import DuneClient

"""
deployments_path = os.path.abspath("/deployments")
if deployments_path not in sys.path:
    sys.path.insert(0, deployments_path)
"""

from typing import Any
from prefect import flow, task, get_run_logger
@@ -19,6 +23,8 @@
from src.fetch.orderbook import OrderbookFetcher
from src.models.order_rewards_schema import OrderRewards

load_dotenv()

def get_last_monday_midnight_utc():
Contributor

I would have expected to see data uploaded aligned with accounting periods, so Tuesday 00:00:00.

In principle, that should not have an effect on the aggregated table. Uploading new values for an accounting week before payments should be easier with alignment.
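A minimal sketch of what the Tuesday alignment suggested here could look like. `get_last_tuesday_midnight_utc` is a hypothetical name, and the optional `now` parameter is an assumption added only to make the function testable. The existing `get_last_monday_midnight_utc` is not shown in full in this diff, so this is not a drop-in patch.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

TUESDAY = 1  # datetime.weekday(): Monday is 0, Tuesday is 1


def get_last_tuesday_midnight_utc(now: Optional[datetime] = None) -> datetime:
    """Return the most recent Tuesday 00:00:00 UTC at or before `now`.

    Hypothetical helper; `now` defaults to the current time in UTC.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    now = now.astimezone(timezone.utc)

    # Number of days to step back to reach Tuesday (0 if today is Tuesday).
    days_since_tuesday = (now.weekday() - TUESDAY) % 7
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight - timedelta(days=days_since_tuesday)
```

Called on a Tuesday, this returns midnight of the same day, so a run early on Tuesday would start a new period rather than repeat the previous one. Whether that boundary behavior matches the accounting process is something to confirm.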

    now = datetime.now(timezone.utc)
    current_weekday = now.weekday()
@@ -85,7 +91,34 @@ def upload_data_to_dune(data: str, block_start: int, block_end: int):

@task
def update_aggregate_query(table_name: str):
    """
    Query example:
    WITH aggregate AS (
        SELECT * FROM dune.cowswapbram.dataset_order_rewards_20921069_20921169
        UNION ALL
        SELECT * FROM dune.cowswapbram.dataset_testtable
    )

    SELECT DISTINCT * FROM aggregate;
    """

    logger = get_run_logger()
    dune = DuneClient.from_env()
    query_id = int(os.environ["AGGREGATE_QUERY_ID"])
    query = dune.get_query(query_id)
    sql_query = query.sql

    if table_name not in sql_query:
        logger.info(f"Table name not found, updating table with {table_name}")
        # Splice the new table in just before the last ")", which is assumed
        # to close the `aggregate` CTE.
        insertion_point = sql_query.rfind(")")
        updated_sql_query = (
            sql_query[:insertion_point].strip()
            + f"\n UNION ALL\n SELECT * FROM {table_name}\n"
            + sql_query[insertion_point:]
        )
        # update_query needs the query ID to know which query to overwrite.
        dune.update_query(query_id=query_id, query_sql=updated_sql_query)
    else:
        logger.info("Table already in query, not updating query")
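The string splice in `update_aggregate_query` can be exercised without a Dune API key by pulling it into a pure function. The sketch below is an illustration under assumptions, not the code merged in this PR. `add_table_to_aggregate` is a hypothetical helper, and it assumes the query has the shape of the "Query example" above, so the last `)` closes the `aggregate` CTE. It also matches the table name as a whole identifier, because a plain substring check would treat `..._1_100` as already present whenever `..._1_1000` is in the query.

```python
import re


def add_table_to_aggregate(sql_query: str, table_name: str) -> str:
    """Return sql_query with table_name appended to the UNION ALL chain.

    Hypothetical helper. Returns the query unchanged if the table is
    already referenced.
    """
    # Whole-identifier match: the name must not be preceded or followed
    # by a word character or a dot.
    pattern = r"(?<![\w.])" + re.escape(table_name) + r"(?![\w.])"
    if re.search(pattern, sql_query):
        return sql_query

    insertion_point = sql_query.rfind(")")
    if insertion_point == -1:
        raise ValueError("no closing parenthesis found in aggregate query")

    return (
        sql_query[:insertion_point].rstrip()
        + f"\n    UNION ALL\n    SELECT * FROM {table_name}\n"
        + sql_query[insertion_point:]
    )
```

Keeping the splice separate from the `DuneClient` calls means the task only has to fetch the SQL, call the helper, and upload the result when it differs from what was fetched.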


@flow(retries=3, retry_delay_seconds=60, log_prints=True)