Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf: improvements drive stats #4258

Merged

Conversation

swiffer
Copy link
Contributor

@swiffer swiffer commented Oct 12, 2024

This brings several improvements to the Drive Stats Dashboard

  • make use of Drives instead of Positions where possible (250ms vs 1ms here)
  • reuse queries accross mutliple panels to reduce stress on database (3 queries no longer executed at all)

In addition some oddities have been fixed

  • While Panels mentioned "Average" within the queries a "Median" has been calculated, change Panel titles
  • Fix visualizations in first dashboard row by gapfilling data for days without drives
  • Fix "Exclude Locations" filter (when empty locations containing spaces have been filtered out)

fixes #4199
fixes #4193

Copy link

netlify bot commented Oct 12, 2024

Deploy Preview for teslamate ready!

Name Link
🔨 Latest commit 9dc0f98
🔍 Latest deploy log https://app.netlify.com/sites/teslamate/deploys/670d6900b4a16c0008fca54f
😎 Deploy Preview https://deploy-preview-4258--teslamate.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

@swiffer swiffer force-pushed the perf-improvements-drive-stats branch 2 times, most recently from e71d913 to 976bda3 Compare October 13, 2024 05:15
@JakobLichterfeld JakobLichterfeld added enhancement New feature or request area:dashboard Related to a Grafana dashboard labels Oct 13, 2024
@JakobLichterfeld JakobLichterfeld changed the title perf-improvements-drive-stats perf: improvements drive stats Oct 13, 2024
@JakobLichterfeld
Copy link
Collaborator

JakobLichterfeld commented Oct 13, 2024

Thanks for your efforts!

On my Raspberry Pi 3B+ this dashboard is now unusable. Loading ~ 46 seconds. Median per day not showing up at all.

2024-10-13.10-12-11.mov

@JakobLichterfeld JakobLichterfeld added the WIP Work in progress label Oct 13, 2024
@swiffer
Copy link
Contributor Author

swiffer commented Oct 13, 2024

ok, that went totally in the wrong direction 🤔 will have a look...

@swiffer swiffer marked this pull request as draft October 13, 2024 08:19
@swiffer
Copy link
Contributor Author

swiffer commented Oct 13, 2024

@JakobLichterfeld - can you run an explain analyze on this query please

explain analyze WITH since as (
	SELECT date at time zone 'UTC' as date FROM positions
	WHERE car_id = 1
	ORDER BY date ASC
	LIMIT 1
),

actual AS (
	SELECT
		date_trunc('day', start_date at time zone 'UTC') AS date,
		count(*) AS distance
	FROM drives
	WHERE car_id = 1
	GROUP BY 1
),

base_line AS (
	SELECT date from generate_series(date_trunc('day', (select date from since)), NOW(), '1 day'::interval) date
)

SELECT
  base_line.date as time,
	convert_km(COALESCE(actual.distance, 0)::numeric, 'km') as "distance_km"
FROM base_line
LEFT JOIN actual ON actual.date = base_line.date
WHERE base_line.date BETWEEN '2023-10-13T08:20:26.94Z' AND '2024-10-13T08:20:26.94Z'
order by base_line.date;

@swiffer
Copy link
Contributor Author

swiffer commented Oct 13, 2024

never mind, found it!

@swiffer swiffer force-pushed the perf-improvements-drive-stats branch from 976bda3 to f2d13ea Compare October 13, 2024 08:34
@swiffer
Copy link
Contributor Author

swiffer commented Oct 13, 2024

ready for retest

@JakobLichterfeld
Copy link
Collaborator

@JakobLichterfeld - can you run an explain analyze on this query please

-------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=208989.57..208989.71 rows=56 width=40) (actual time=36763.214..36769.724 rows=366 loops=1)
   Sort Key: date.date
   Sort Method: quicksort  Memory: 31kB
   InitPlan 1 (returns $0)
     ->  Limit  (cost=208795.94..208795.94 rows=1 width=8) (actual time=35885.387..35885.393 rows=1 loops=1)
           ->  Sort  (cost=208795.94..224757.68 rows=6384697 width=8) (actual time=35874.792..35874.796 rows=1 loops=1)
                 Sort Key: ((positions.date AT TIME ZONE 'UTC'::text))
                 Sort Method: top-N heapsort  Memory: 17kB
                 ->  Seq Scan on positions  (cost=0.00..176872.46 rows=6384697 width=8) (actual time=0.284..31167.745 rows=6383679 loops=1)
                       Filter: (car_id = 1)
   ->  Hash Right Join  (cost=127.22..192.01 rows=56 width=40) (actual time=36760.720..36762.525 rows=366 loops=1)
         Hash Cond: ((date_trunc('day'::text, (drives.start_date AT TIME ZONE 'UTC'::text))) = date.date)
         ->  HashAggregate  (cost=112.15..145.59 rows=2229 width=16) (actual time=80.388..81.257 rows=700 loops=1)
               Group Key: date_trunc('day'::text, (drives.start_date AT TIME ZONE 'UTC'::text))
               Batches: 1  Memory Usage: 145kB
               ->  Seq Scan on drives  (cost=0.00..101.01 rows=2229 width=8) (actual time=28.246..61.703 rows=2107 loops=1)
                     Filter: (car_id = 1)
         ->  Hash  (cost=15.01..15.01 rows=5 width=8) (actual time=36680.199..36680.201 rows=366 loops=1)
               Buckets: 1024  Batches: 1  Memory Usage: 16kB
               ->  Function Scan on generate_series date  (cost=0.01..15.01 rows=5 width=8) (actual time=36674.824..36675.131 rows=366 loops=1)
                     Filter: ((date >= '2023-10-13 08:20:26.94+00'::timestamp with time zone) AND (date <= '2024-10-13 08:20:26.94+00'::timestamp with time zone))
                     Rows Removed by Filter: 1471
 Planning Time: 24.053 ms
 JIT:
   Functions: 26
   Options: Inlining false, Optimization false, Expressions true, Deforming true
   Timing: Generation 11.140 ms, Inlining 0.000 ms, Optimization 20.642 ms, Emission 553.109 ms, Total 584.891 ms
 Execution Time: 38848.518 ms
(28 rows)

@JakobLichterfeld
Copy link
Collaborator

ready for retest

you are too fast for my rpi to take up :-)

@swiffer swiffer marked this pull request as ready for review October 13, 2024 08:37
@JakobLichterfeld
Copy link
Collaborator

JakobLichterfeld commented Oct 13, 2024

never mind, found it!

I don't think so. It still takes ages. (45 seconds)

@swiffer
Copy link
Contributor Author

swiffer commented Oct 13, 2024

Aufzeichnung.2024-10-13.104702.mp4

@swiffer
Copy link
Contributor Author

swiffer commented Oct 13, 2024

you are using this one?

f2d13ea

@JakobLichterfeld
Copy link
Collaborator

JakobLichterfeld commented Oct 13, 2024

I don't think so. It still takes ages. (45 seconds)

you are using this one?

f2d13ea

Nope. I was faster than my Rpi was able to pull the new image, so I thought I would use the latest one.

Now it runs at instant speed (two refreshes in attached video). Even with last 10 years it is instant. But Median per day is still empty.

2024-10-13.11-15-01.mov

@JakobLichterfeld JakobLichterfeld removed the WIP Work in progress label Oct 13, 2024
@JakobLichterfeld
Copy link
Collaborator

JakobLichterfeld commented Oct 13, 2024

Do you have a preferred merger order for #4253?

@JakobLichterfeld
Copy link
Collaborator

But Median per day is still empty.

Just to confirm: In v1.30.1 The average fields are populated correctly on my setup.

@swiffer
Copy link
Contributor Author

swiffer commented Oct 13, 2024

But Median per day is still empty.

Just to confirm: In v1.30.1 The average fields are populated correctly on my setup.

using grafana 11.2.2 locally, will check in grafana 11.0.1

Do you have a preferred merger order for #4253?

i would prefer merging this first and add the additional panels afterwards.

@swiffer swiffer force-pushed the perf-improvements-drive-stats branch from f2d13ea to 0582bb6 Compare October 13, 2024 09:34
@swiffer
Copy link
Contributor Author

swiffer commented Oct 13, 2024

50% Percentile Calculations aren't available in Grafana 11.0.1 - reverted two panels to using it's own sql query instead of calculating with dataset from another panel. should work now.

@swiffer
Copy link
Contributor Author

swiffer commented Oct 13, 2024

Nope. I was faster than my Rpi was able to pull the new image, so I thought I would use the latest one.

Now it runs at instant speed (two refreshes in attached video). Even with last 10 years it is instant. But Median per day is still empty.

great to hear - the trick is to avoid positions whenever possible / ensure proper index usage / exclude streaming data as described here #4253 (comment) for now...

@JakobLichterfeld
Copy link
Collaborator

reverted two panels to using it's own sql query instead of calculating with dataset from another panel. should work now.

With your latest push it works, values seems reasonable for last 7 days but not for last 2.
2 Drives with 20.8 km is correct. The median is a bit off :-)
image

@JakobLichterfeld
Copy link
Collaborator

JakobLichterfeld commented Oct 13, 2024

For comparison with v1.30.1: It counts 4 drives which is in fact correct. There has been 2 ~800m drives as well.

image

image

22,3703 km driven in 4 drives means 5,592575 average innit? Ah it is average per day.
Happy to test more, but not today :-) Happy weekend.

@swiffer
Copy link
Contributor Author

swiffer commented Oct 13, 2024

22,3703 km driven in 4 drives means 5,592575 average innit? Ah it is average per day. Happy to test more, but not today :-) Happy weekend.

in v1.30.1 it currently is percentile_disc(0,5) per day.... i actually vote for percentile_cont(0,5) per day instead in case of 2 days timeframe...

@swiffer swiffer force-pushed the perf-improvements-drive-stats branch from 0582bb6 to 946c26c Compare October 13, 2024 11:05
@swiffer
Copy link
Contributor Author

swiffer commented Oct 13, 2024

Ok, ended up now basically beeing a rewrite of the dashboard... it could be that numbers are different from before but they should be better now, give it a try.

Some Notes:

  • when selecting "Last 2 days" this is a relative timeframe and basically means: last 48 hours. the dashboard now has been adapted to show data from last 48 hours only.
  • when selecting "Last 7 days" and you only had drives in day 1 and 2 of the week the estimated mileage most likely was way to high - this should have been corrected now.
  • switched to percentile_cont (this should be noticeable on short timeframes only).

would be interesting to see v1.30.1 vs this PR in your instance again

@swiffer swiffer force-pushed the perf-improvements-drive-stats branch from 946c26c to 4e93ded Compare October 14, 2024 07:40
@JakobLichterfeld
Copy link
Collaborator

JakobLichterfeld commented Oct 14, 2024

that numbers are different from before but they should be better now

Yes, looks better now.

A median of zero does not make sense imo.

11-15.10.

v1.30.1

image

pr-4258

image

11-12.10.

v1.30.1

image

pr-4258

image

Last 1 year

v1.30.1

image

pr-4258

image

@swiffer
Copy link
Contributor Author

swiffer commented Oct 14, 2024

depends - it looks like you are not driving enough 😆

median means 50 percentile - so if there are more than 50% of days where you are not driving at all then 0 km is correct.

should we switch to average calculations for the two panels ? (like the titles said before).

other approach would be still calculating a median but excluding all days without drives.

@JakobLichterfeld
Copy link
Collaborator

depends - it looks like you are not driving enough 😆

Yeah you are right, but don't miss the time on the road that much :-)

median means 50 percentile - so if there are more than 50% of days where you are not driving at all then 0 km is correct.

should we switch to average calculations for all 3 ? (like the titles said).

In this case I vote yes, as it is more intuitve, innit?

@swiffer swiffer force-pushed the perf-improvements-drive-stats branch from 4e93ded to 9dc0f98 Compare October 14, 2024 18:54
@swiffer
Copy link
Contributor Author

swiffer commented Oct 14, 2024

rebased and back to dashboard queries & average calculations. should be ready!

Copy link
Collaborator

@JakobLichterfeld JakobLichterfeld left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perfect, ty!

@JakobLichterfeld JakobLichterfeld merged commit 9adc7dd into teslamate-org:master Oct 19, 2024
18 of 22 checks passed
JakobLichterfeld pushed a commit that referenced this pull request Oct 19, 2024
@swiffer swiffer deleted the perf-improvements-drive-stats branch October 19, 2024 15:18
@sdwalker
Copy link
Contributor

Went from 1135 to 705 drives over the same 1 year timespan with average distance driven and average kWh used being wrong

Before:
image

After:
image

@swiffer
Copy link
Contributor Author

swiffer commented Oct 20, 2024

@sdwalker thanks for reporting - see #4285 for a fix of drives count.

the average values are different but should be "better" now as these are averages per day (incl. drives without days) instead of median (over days with drives).

so different output expected here.

@sdwalker
Copy link
Contributor

@sdwalker thanks for reporting - see #4285 for a fix of drives count.

Drive count is fixed

@sdwalker
Copy link
Contributor

the average values are different but should be "better" now as these are averages per day (incl. drives without days) instead of median (over days with drives).

so different output expected here.

Count me in the group that preferred the median values

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:dashboard Related to a Grafana dashboard enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

perf: drive stat - km driven slow km-driven in drive-stats only updates once a day
3 participants