re:dash should support incremental scheduled jobs #35
Comments
In #28, Arik mentioned that an upcoming feature will allow queries based on existing query results. Let's wait until that lands before digging into this.
I misunderstood this - I thought outputs from previous queries were needed as inputs to new queries. Just appending new results to old ones should be simple -- the major question is whether we need to handle removing results or whether they should just accumulate indefinitely.
@washort we should certainly remove old results. Here are two options:
I don't see how a
@washort some other thoughts that came up regarding this feature:
Not all of these need to be answered right now; we can improve them incrementally.
A lot of our data in Presto is partitioned by submission_date. When we run queries over this data, the result is often also partitioned by submission_date (for example, MAU/DAU/WAU). That means the results for historical data don't change, but we recompute them anyway.
Incremental jobs would solve this. The results from yesterday would be joined with the results from today. The query would have to have $SUBMISSION_DATE in it somewhere, which re:dash would automatically fill in with $YESTERDAY. It would be up to the query writer to ensure correctness, i.e. that the query is idempotent.
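To make the idea concrete, here is a rough sketch of what one incremental run might do. It is not re:dash code: the function names `render_query` and `merge_results`, the `YYYYMMDD` date format, and the list-of-dicts row shape are all assumptions for illustration. Rows for a date that is already stored get replaced rather than duplicated, so rerunning the same day leaves the stored result unchanged.

```python
from datetime import date, timedelta
from string import Template


def render_query(query_template, today):
    """Fill $SUBMISSION_DATE with yesterday's date.

    The YYYYMMDD format is an assumption; any other $placeholders
    in the query are left untouched by safe_substitute.
    """
    yesterday = today - timedelta(days=1)
    return Template(query_template).safe_substitute(
        SUBMISSION_DATE=yesterday.strftime("%Y%m%d")
    )


def merge_results(stored_rows, new_rows, key="submission_date"):
    """Combine stored rows with new rows, keyed by partition date.

    Stored rows whose date also appears in new_rows are dropped first,
    so running the same day twice does not duplicate results.
    """
    new_dates = {row[key] for row in new_rows}
    kept = [row for row in stored_rows if row[key] not in new_dates]
    return sorted(kept + new_rows, key=lambda row: row[key])


# Hypothetical usage: a DAU query run on 2016-03-01 covers 2016-02-29.
sql = render_query(
    "SELECT submission_date, COUNT(*) AS dau FROM t "
    "WHERE submission_date = '$SUBMISSION_DATE' GROUP BY 1",
    date(2016, 3, 1),
)
```

Replacing by date rather than blindly appending only covers the storage side of idempotency; the query itself still has to return the same rows for a given $SUBMISSION_DATE.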