-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add time range parameters to sync script #483
Changes from all commits
8e9c478
786f8ed
ba8f745
a608e4b
0cd668c
11c9414
0f45be5
08615b9
14901f1
2b5fb06
c967b64
819f8cf
58722bc
1f01c20
dbf4301
94a5216
22dfa74
8e05eac
a4bbf28
0c5a2bc
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,13 +1,78 @@ | ||
"""Shared methods between both sync scripts.""" | ||
|
||
from datetime import datetime, timezone | ||
from typing import List, Tuple | ||
|
||
from dateutil.relativedelta import relativedelta | ||
from web3 import Web3 | ||
|
||
from src.logger import set_log | ||
from src.models.block_range import BlockRange | ||
|
||
log = set_log(__name__) | ||
|
||
|
||
def partition_time_range( | ||
start_time: datetime, end_time: datetime | ||
) -> list[tuple[datetime, datetime]]: | ||
"""Computes (list of) time ranges from input parameters. | ||
If both times are from the same month, only [(start_time, end_time)] is returned. | ||
Otherwise, the range is split into n pieces of the form [(start_time, start_of_month_2), | ||
(start_of_month_2, start_of_month_3),..., (start_of_month_n, end_time)]. | ||
""" | ||
assert start_time < end_time, "start_time must be strictly smaller than end_time" | ||
|
||
# if there is just one month to consider | ||
if end_time <= datetime(start_time.year, start_time.month, 1).replace( | ||
tzinfo=timezone.utc | ||
) + relativedelta(months=1): | ||
return [(start_time, end_time)] | ||
|
||
# if there are multiple months to consider | ||
next_month_start_time = datetime(start_time.year, start_time.month, 1).replace( | ||
tzinfo=timezone.utc | ||
) + relativedelta(months=1) | ||
time_range_list = [(start_time, next_month_start_time)] | ||
while end_time > next_month_start_time + relativedelta(months=1): | ||
time_range_list.append( | ||
(next_month_start_time, next_month_start_time + relativedelta(months=1)) | ||
) | ||
next_month_start_time = next_month_start_time + relativedelta(months=1) | ||
time_range_list.append((next_month_start_time, end_time)) | ||
|
||
return time_range_list | ||
|
||
|
||
def compute_block_range( | ||
start_time: datetime, end_time: datetime, node: Web3 | ||
) -> BlockRange: | ||
"""Computes a block range from start and end time. | ||
The convention for block ranges is to be inclusive, while the end time is exclusive. | ||
Only finalized blocks are considered. | ||
""" | ||
latest_block = node.eth.get_block("finalized") | ||
latest_block_time = datetime.fromtimestamp( | ||
latest_block["timestamp"], tz=timezone.utc | ||
) | ||
|
||
assert ( | ||
start_time < latest_block_time | ||
), "start time must be smaller than latest block time" | ||
|
||
start_block = find_block_with_timestamp(node, start_time.timestamp()) | ||
if latest_block_time < end_time: | ||
log.info( | ||
f"Latest finalized block time {latest_block_time} is smaller than {end_time}." | ||
"Using latest finalized block." | ||
) | ||
end_block = int(latest_block["number"]) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Should we add a log here to let us know if this happens that the script is not actually using the end time because it's still in the future? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Do you have an opinion on where this processing for time ranges should take place? On the one hand, most of the logging and checking can also happen during initialization in On the other hand, a node would be required to do the checking for latest block. And it would be mixing abstractions. The accounting, on a high level, should be time based. That this is internally split into monthly ranges and blocks is a bit of an implementation detail. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm not sure to be honest. Since this requires actual implementation details and not checking whether or not the args are valid it feels cleaner to leave this here. For the sake of serializing the args any end time > start time would be valid, so it seems to me like this is the right place. But if moving this to the initialization makes it more maintainable then that could be a good argument to do this there instead. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. For now I just added a log message. |
||
else: | ||
end_block = find_block_with_timestamp(node, end_time.timestamp()) - 1 | ||
|
||
assert start_block < end_block, "start block must be smaller than end block" | ||
|
||
return BlockRange(block_from=start_block, block_to=end_block) | ||
harisang marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
|
||
def find_block_with_timestamp(node: Web3, time_stamp: float) -> int: | ||
""" | ||
This implements binary search and returns the smallest block number | ||
|
@@ -40,77 +105,3 @@ def find_block_with_timestamp(node: Web3, time_stamp: float) -> int: | |
# fallback in case correct block number hasn't been found | ||
# in that case, we will include some more blocks than necessary | ||
return mid_block_number + 200 | ||
|
||
|
||
def compute_block_and_month_range( # pylint: disable=too-many-locals | ||
node: Web3, recompute_previous_month: bool | ||
) -> Tuple[List[Tuple[int, int]], List[str]]: | ||
""" | ||
This determines the block range and the relevant months | ||
for which we will compute and upload data on Dune. | ||
""" | ||
# The function first a list of block ranges, followed by a list of | ||
# # months. Block ranges are stored as (start_block, end_block) pairs, | ||
# and are meant to be interpreted as closed intervals. | ||
# Moreover, we assume that the job runs at least once every 24h | ||
# Because of that, if it is the first day of month, we also | ||
# compute the previous month's table just to be on the safe side | ||
|
||
latest_finalized_block = node.eth.get_block("finalized") | ||
|
||
current_month_end_block = int(latest_finalized_block["number"]) | ||
current_month_end_timestamp = latest_finalized_block["timestamp"] | ||
|
||
current_month_end_datetime = datetime.fromtimestamp( | ||
current_month_end_timestamp, tz=timezone.utc | ||
) | ||
current_month_start_datetime = datetime( | ||
current_month_end_datetime.year, current_month_end_datetime.month, 1, 00, 00 | ||
) | ||
current_month_start_timestamp = current_month_start_datetime.replace( | ||
tzinfo=timezone.utc | ||
).timestamp() | ||
|
||
current_month_start_block = find_block_with_timestamp( | ||
node, current_month_start_timestamp | ||
) | ||
|
||
current_month = ( | ||
f"{current_month_end_datetime.year}_{current_month_end_datetime.month}" | ||
) | ||
## in case the month is 1-9, we add a "0" prefix, so that we have a fixed-length representation | ||
## e.g., 2024-12, 2024-01 | ||
if len(current_month) < 7: | ||
current_month = current_month[:5] + "0" + current_month[5] | ||
months_list = [current_month] | ||
block_range = [(current_month_start_block, current_month_end_block)] | ||
if current_month_end_datetime.day == 1 or recompute_previous_month: | ||
if current_month_end_datetime.month == 1: | ||
previous_month = f"{current_month_end_datetime.year - 1}_12" | ||
previous_month_start_datetime = datetime( | ||
current_month_end_datetime.year - 1, 12, 1, 00, 00 | ||
) | ||
else: | ||
previous_month = f"""{current_month_end_datetime.year}_ | ||
{current_month_end_datetime.month - 1} | ||
""" | ||
if len(previous_month) < 7: | ||
previous_month = previous_month[:5] + "0" + previous_month[5] | ||
previous_month_start_datetime = datetime( | ||
current_month_end_datetime.year, | ||
current_month_end_datetime.month - 1, | ||
1, | ||
00, | ||
00, | ||
) | ||
months_list.append(previous_month) | ||
previous_month_start_timestamp = previous_month_start_datetime.replace( | ||
tzinfo=timezone.utc | ||
).timestamp() | ||
previous_month_start_block = find_block_with_timestamp( | ||
node, previous_month_start_timestamp | ||
) | ||
previous_month_end_block = current_month_start_block - 1 | ||
block_range.append((previous_month_start_block, previous_month_end_block)) | ||
|
||
return block_range, months_list |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Didn't there used to be a type
AccountingPeriod
? If that class has been abolished, it might not hurt to add a basic declaration:that can be passed around the project for better readability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. I am still not sure how to reasonably merge the two code bases of solver-rewards and dune-sync (v1).
Accounting period seems to be tailored to rewards with defaulting to weeks and having pretty printing for dune tables. The accounting period also has some implicit constraints on having a time of 00:00:00 in some parts of the code.
Making the concept of accounting periods consistent throughout the code would definitely help.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Technically, I block number ranges would be the most appropriate thing. This was the benefit of the old
AccountingPeriod
class; it provided a bijection between timestamps and block numbers so computations could be performed on the more natural type.Anyway, I suppose this is just a side note. Feel free to ignore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is an ongoing internal debate on what the most natural concept for accounting periods is. Block ranges sound nice when considering how backend data is structured. But they become a bit unnatural if multiple chains are considered. For simplicity, we want to use time based periods everywhere, across chains and projects.