Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[batch] Show cost breakdown in the UI #13491

Merged
merged 10 commits into from
Sep 1, 2023
Merged

Conversation

jigold
Copy link
Contributor

@jigold jigold commented Aug 24, 2023

Fixes #13483

Screenshot 2023-08-24 at 2 46 17 PM Screenshot 2023-08-24 at 2 39 57 PM Screenshot 2023-08-24 at 2 39 48 PM

Copy link
Contributor

@iris-garden iris-garden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm! i don't have much context for the sql though, so as jackie and i discussed, passing just the sql part of the review along to dan :)

@iris-garden iris-garden assigned danking and unassigned iris-garden Aug 25, 2023
@danking
Copy link
Contributor

danking commented Aug 25, 2023

Comment on lines 76 to 83
if cost is None:
return None
if cost < 0.0001:
return '<$0.0001'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it is actually zero, we should report that. Otherwise it looks like an empty Batch or a job which is Ready costs some small amount of money, which they don't.

Suggested change
if cost is None:
return None
if cost < 0.0001:
return '<$0.0001'
if cost is None:
return None
if cost == 0.0:
return 0.0
if cost < 0.0001:
return '<$0.0001'

record['cost_breakdown'] = json.loads(record['cost_breakdown'])
record['cost_breakdown'] = [
{'resource': resource, 'cost': cost} for resource, cost in record['cost_breakdown'].items()
]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pull this into a cost_breakdown_to_dict function

) AS usage_t
LEFT JOIN resources ON usage_t.resource_id = resources.resource_id
GROUP BY batch_id
) AS cost_t ON TRUE;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like you're simplifying this a bit by removing the batch id and resource id. That's unrelated to the breakdown change, right? I just want to make sure I understand.

Now that you've dropped those fields, I think you can drop the batch_id from the groups. The plan looks slightly better. Same comment on the per-job breakdown except you can probably also drop job_id.

mysql> explain WITH base_t AS (
    -> SELECT batches.*,
    ->   batches_cancelled.id IS NOT NULL AS cancelled,
    ->   batches_n_jobs_in_complete_states.n_completed,
    ->   batches_n_jobs_in_complete_states.n_succeeded,
    ->   batches_n_jobs_in_complete_states.n_failed,
    ->   batches_n_jobs_in_complete_states.n_cancelled
    -> FROM batches
    -> LEFT JOIN batches_n_jobs_in_complete_states
    ->        ON batches.id = batches_n_jobs_in_complete_states.id
    -> LEFT JOIN batches_cancelled
    ->        ON batches.id = batches_cancelled.id
    -> WHERE batches.id = 10 AND NOT deleted
    -> )
    -> SELECT base_t.*, cost_t.cost, cost_t.cost_breakdown
    -> FROM base_t
    -> LEFT JOIN LATERAL (
    ->   SELECT COALESCE(SUM(`usage` * rate), 0) AS cost, JSON_OBJECTAGG(resources.resource, COALESCE(`usage` * rate, 0)) AS cost_breakdown
    ->   FROM (
    ->     SELECT batch_id, resource_id, CAST(COALESCE(SUM(`usage`), 0) AS SIGNED) AS `usage`
    ->     FROM aggregated_batch_resources_v2
    ->     WHERE base_t.id = aggregated_batch_resources_v2.batch_id
    ->     GROUP BY batch_id, resource_id
    ->   ) AS usage_t
    ->   LEFT JOIN resources ON usage_t.resource_id = resources.resource_id
    ->   GROUP BY batch_id
    -> ) AS cost_t ON TRUE;
+----+-------------------+-----------------------------------+------------+--------+-------------------------+-------------+---------+---------------------+------+----------+----------------------------+
| id | select_type       | table                             | partitions | type   | possible_keys           | key         | key_len | ref                 | rows | filtered | Extra                      |
+----+-------------------+-----------------------------------+------------+--------+-------------------------+-------------+---------+---------------------+------+----------+----------------------------+
|  1 | PRIMARY           | batches                           | NULL       | const  | PRIMARY,batches_deleted | PRIMARY     | 8       | const               |    1 |   100.00 | Rematerialize (<derived3>) |
|  1 | PRIMARY           | batches_n_jobs_in_complete_states | NULL       | const  | PRIMARY                 | PRIMARY     | 8       | const               |    1 |   100.00 | NULL                       |
|  1 | PRIMARY           | batches_cancelled                 | NULL       | const  | PRIMARY                 | PRIMARY     | 8       | const               |    0 |     0.00 | unique row not found       |
|  1 | PRIMARY           | <derived3>                        | NULL       | ALL    | NULL                    | NULL        | NULL    | NULL                |   36 |   100.00 | Using where                |
|  3 | DEPENDENT DERIVED | <derived4>                        | NULL       | ALL    | NULL                    | NULL        | NULL    | NULL                |   36 |   100.00 | Using filesort             |
|  3 | DEPENDENT DERIVED | resources                         | NULL       | eq_ref | resource_id             | resource_id | 4       | usage_t.resource_id |    1 |   100.00 | NULL                       |
|  4 | DEPENDENT DERIVED | aggregated_batch_resources_v2     | NULL       | ref    | PRIMARY,resource_id     | PRIMARY     | 8       | func                |   36 |   100.00 | Using where                |
+----+-------------------+-----------------------------------+------------+--------+-------------------------+-------------+---------+---------------------+------+----------+----------------------------+
7 rows in set, 2 warnings (0.00 sec)

mysql> 
mysql> explain WITH base_t AS (
    -> SELECT batches.*,
    ->   batches_cancelled.id IS NOT NULL AS cancelled,
    ->   batches_n_jobs_in_complete_states.n_completed,
    ->   batches_n_jobs_in_complete_states.n_succeeded,
    ->   batches_n_jobs_in_complete_states.n_failed,
    ->   batches_n_jobs_in_complete_states.n_cancelled
    -> FROM batches
    -> LEFT JOIN batches_n_jobs_in_complete_states
    ->        ON batches.id = batches_n_jobs_in_complete_states.id
    -> LEFT JOIN batches_cancelled
    ->        ON batches.id = batches_cancelled.id
    -> WHERE batches.id = 10 AND NOT deleted
    -> )
    -> SELECT base_t.*, cost_t.cost, cost_t.cost_breakdown
    -> FROM base_t
    -> LEFT JOIN LATERAL (
    ->   SELECT COALESCE(SUM(`usage` * rate), 0) AS cost, JSON_OBJECTAGG(resources.resource, COALESCE(`usage` * rate, 0)) AS cost_breakdown
    ->   FROM (
    ->     SELECT resource_id, CAST(COALESCE(SUM(`usage`), 0) AS SIGNED) AS `usage`
    ->     FROM aggregated_batch_resources_v2
    ->     WHERE base_t.id = aggregated_batch_resources_v2.batch_id
    ->     GROUP BY resource_id
    ->   ) AS usage_t
    ->   LEFT JOIN resources ON usage_t.resource_id = resources.resource_id
    -> ) AS cost_t ON TRUE;
+----+-------------------+-----------------------------------+------------+--------+-------------------------+-------------+---------+---------------------+------+----------+------------------------------+
| id | select_type       | table                             | partitions | type   | possible_keys           | key         | key_len | ref                 | rows | filtered | Extra                        |
+----+-------------------+-----------------------------------+------------+--------+-------------------------+-------------+---------+---------------------+------+----------+------------------------------+
|  1 | PRIMARY           | batches                           | NULL       | const  | PRIMARY,batches_deleted | PRIMARY     | 8       | const               |    1 |   100.00 | Rematerialize (<derived3>)   |
|  1 | PRIMARY           | batches_n_jobs_in_complete_states | NULL       | const  | PRIMARY                 | PRIMARY     | 8       | const               |    1 |   100.00 | NULL                         |
|  1 | PRIMARY           | batches_cancelled                 | NULL       | const  | PRIMARY                 | PRIMARY     | 8       | const               |    0 |     0.00 | unique row not found         |
|  1 | PRIMARY           | <derived3>                        | NULL       | ALL    | NULL                    | NULL        | NULL    | NULL                |    2 |   100.00 | Using where                  |
|  3 | DEPENDENT DERIVED | <derived4>                        | NULL       | ALL    | NULL                    | NULL        | NULL    | NULL                |   36 |   100.00 | NULL                         |
|  3 | DEPENDENT DERIVED | resources                         | NULL       | eq_ref | resource_id             | resource_id | 4       | usage_t.resource_id |    1 |   100.00 | NULL                         |
|  4 | DEPENDENT DERIVED | aggregated_batch_resources_v2     | NULL       | ref    | PRIMARY,resource_id     | PRIMARY     | 8       | func                |   36 |   100.00 | Using where; Using temporary |
+----+-------------------+-----------------------------------+------------+--------+-------------------------+-------------+---------+---------------------+------+----------+------------------------------+
7 rows in set, 2 warnings (0.01 sec)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comment below about whether it's worth just cleaning up these queries all together rather than just making this feature of the cost breakdown possible.

) AS usage_t
LEFT JOIN resources ON usage_t.resource_id = resources.resource_id
GROUP BY batch_id
) AS cost_t ON TRUE
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Food for thought for a possible clean up PR: The fact that we repeat this query in a few places makes me wonder if we're better off using two DB queries in a transaction. I suspect the DB round-trip-latency is far from our main bottleneck and it makes it easier to understand the code if we had something like:

def batch_cost_info(tx: Transaction, batch_id):
    return tx.select_and_fetchone(...)

That way we can abstract the cost calculation out from the query code or the UI code.

Copy link
Contributor Author

@jigold jigold Aug 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These queries are a mess anyways. They need to be rewritten to use lateral joins without the WITH base_t anyways. I was trying not to do that in this PR. The issue why the queries needed to be changed more drastically is because doing JSON_OBJECTAGG() the way we had it before was returning {null: ...} if there were no records which was causing lots of downstream problems.

Should I go ahead and simplify the queries to what is optimal here in this PR?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, I was just noodling on how we could make this more readable for the future. let's keep this PR simple and to the point. A separate PR to explore alternative organization of the SQL seems fine.

FROM batches
LEFT JOIN billing_projects ON batches.billing_project = billing_projects.name
LEFT JOIN batches_n_jobs_in_complete_states ON batches.id = batches_n_jobs_in_complete_states.id
LEFT JOIN batches_cancelled ON batches.id = batches_cancelled.id
STRAIGHT_JOIN billing_project_users ON batches.billing_project = billing_project_users.billing_project
LEFT JOIN LATERAL (
SELECT COALESCE(SUM(`usage` * rate), 0) AS cost
SELECT COALESCE(SUM(`usage` * rate), 0) AS cost, JSON_OBJECTAGG(resources.resource, COALESCE(`usage` * rate, 0)) AS cost_breakdown
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great.

@@ -15,6 +15,28 @@ <h2>Properties</h2>
<li>Cost: {% if 'cost' in job and job['cost'] is not none %}{{ job['cost'] }}{% endif %}</li>
</ul>

<h2>Cost Breakdown</h2>
{% if job['cost_breakdown'] %}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What causes this to be None? It seems like the DB queries always return it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's None if the job hasn't started running yet. Same with batches with no jobs running yet.

Copy link
Contributor

@danking danking left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@danking danking merged commit 07a483a into hail-is:main Sep 1, 2023
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[batch] Batch should display costs disaggregated by SKU
3 participants