Rework analytics handling to fix performance issues on large datasets #2080

Closed
jakkuh opened this issue Mar 29, 2024 · 4 comments

Contributor

jakkuh commented Mar 29, 2024

Summary

Long-time Shlink user here, with tens of millions of recorded visits. Due to the design of the app, the visits and orphan visits counts, along with the per-link visit counts, either never load or take forever to load. https://imgur.com/94hIXTx

After some research, it appears that every time the admin page is loaded, a direct database query is run to count these entries (regardless of whether RoadRunner or a web server is used). On our large dataset, on a reasonably cheap VPS (4 cores, 8 GB RAM), this takes over 2 minutes to complete:
localhost | shlink | Query | 147 | executing | SELECT COUNT(v0_.id) AS sclr_0 FROM visits v0_ WHERE v0_.short_url_id IS NOT NULL AND v0_.potential_bot = 0

A far more efficient approach would be to store the counts in the database and query those, for both the global count and individual short links.

In a perfect world, those counts would be incremented on each visit, and then checked for accuracy by running a full database count on a schedule. It could also just be the latter.

Use case

Visit counts never loading, and sorting short links by visit count taking forever.

Member

acelaya commented Mar 30, 2024

It's funny that you request this precisely now, because I happen to have just implemented it 😅

You can find more context in this other issue: #2036

And these PRs:

The overview counts are the only ones missing, but it shouldn't be super hard to implement now.

I did my testing with ~1M visits, 10k short URLs and 20k tags, and everything was around 20x faster on average.

@acelaya acelaya moved this to In Progress in Shlink Mar 30, 2024
@acelaya acelaya added this to the 4.1.0 milestone Mar 30, 2024
@acelaya acelaya moved this from In Progress to Todo in Shlink Mar 31, 2024
@acelaya acelaya moved this from Todo to In Progress in Shlink Mar 31, 2024
Member

acelaya commented Apr 1, 2024

This is now fully implemented.

@acelaya acelaya closed this as completed Apr 1, 2024
@github-project-automation github-project-automation bot moved this from In Progress to Done in Shlink Apr 1, 2024
Contributor Author

jakkuh commented Apr 3, 2024

Sweet. Release soon? :D @acelaya

Member

acelaya commented Apr 14, 2024

Shlink 4.1.0 has just been released.
