-
Notifications
You must be signed in to change notification settings - Fork 190
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Database Indexes to speed up the requests #510
Comments
Mhm I don't get why but the queries were getting slower and slower with the indexes. So maybe those are not so good - I don't know the first results were really good though. Do I have to update the indexes "manually"? |
Hi, Is this still an issue ? As a full-time DBA, I could give a hand. |
Yea nothing has changed regarding this. |
@marco44 yes the issue is still there clicking on the news app takes about 20 seconds to load (no super computer ;) but I think this could be way faster. |
Ok Best thing may be to put long_query_time to 0, do a run, then turn this off (you don't want gigabytes of log), then find out what took time in the log. Then we'll know what we need to optimize (mysqldumpslow may help here). |
By the way, first query on my site is this:
That's a query that many SQL engines won't be able to rewrite smartly. This way of writing it is much more efficient:
This comes from the fact that "max" is an aggregate function like any other (count, avg, etc…), but there are some special cases in SQL optimizers for trivial cases (where there is no join for instance). With the order by + limit, we are explicit about what we want. I don't know if MySQL can benefit from it too, but in my case, it's a very big gain. |
Other query that could be optimized (the 2nd in my environment):
Two things can improve it without too much change:
Then the query can be a bit more optimized: If you look at the plan, you see that "Filter: ((feeds.folder_id = 0) OR (folders.deleted_at = 0))" is done as a last step
The filter has moved in the scan of the folders table. Which means checking on 6 records instead of 1000. Going faster with this would need to not use "technical" keys in tables anymore, but natural keys… (having the user in all 3 tables, etc…). That's a problem of normalizing the database, and would require much more work (removing the id columns and using natural keys). Or we could duplicate this data in all tables, which would make maintaining integrity harder. But having a way to tell which records belong to a user in oc_news_items would make finding them much faster (like 1 or 2 orders of magnitude). BTW is giving you query optimizations like this helping ? |
I'd much rather have the optimizations of pull requests. I doubt putting them in the issue like this is going to create any meaningful change. |
I presume you mean the optimizations IN pull requests ? |
Yeah, suggest the changes as a pull request instead of/in addition to the comments here. Otherwise we'd need to wait for someone else to copy these over to a PR and implement them. |
Ok. We'll see about implementation when we're sure the rewrites and indexes solve anything (I'm much more proficient with PostgreSQL, so some rewrites may not be efficient with MySQL). I'll need @enaut 's feedback (I don't have a nexcloud mysql database with lots of data in in to test). Or do we have test dumps with lots of data in them somewhere ? |
Hey thanks for helping! There is the slowquery (with time 0): |
Ok, so your queries are more or less all of the same family (all are similar to the second I posted). Can you post a plan of (what this query returns):
And how many records are there in oc_news_items ? ( |
I had to omit the analyze for this to work...
|
Yeah, analyze is with mysql. mariadb doesn't even have this. First run the query as is (without explain) It should take 7s. Then replace WHERE And then create the index (create index test_marc on oc_news_items (unread,id) or something like that), and try once more. Post the results. And if it's still slow, post an explain of the last try. |
There might be a difference in data since my news entries are all from one user? Ok now so that I do not mix things up...
So the original version is way faster with the index. Oh and the news app load is not really faster (even with index). |
I've built a dummy database of about the same size as yours, and I'm a bit lost, I don't really have an explanation for what you see. The rewrite doesn't do any difference here with mysql (but it does with postgreql as it can take advantage of it), but the index helps a lot for this query with and without the rewrite. Without the index, I'm at 0.5s (but on a very fast machine). What kind of machine are we dealing with so I have a rough idea of what is the normal ratio between your tests and mine ? Anyway if the index has sped up this query, great, let's move on to the next, please re-do a slow-query (erase the previous file, so we start from scratch) |
Looking at the queries, the starred items count will be slow too.
This should make finding unread and starred items much faster. If it's still not good enough, please do the slow-query routine again. |
This makes the overview on the left load as fast as expected.
|
Ok, I think that for this one, the first index I proposed is better suited. I tried to get rid of an index, but we may need it still. Please add:
and tell me... |
There is still one taking it's time...
more details:
The machine this runs on is a VirtualServer with 4 cores and 14 GB of Ram. Due to the virtualization some things not in cache take a while... but the database is usually fast enough. |
Wait... it is fast now... but when checking the "show read items" button it is slow again. |
I'm starting to wonder if you wouldn't have caching effects. Maybe you should reserve some memory for the DB cache, instead of relying on the operating system... I think you should raise your innodb_buffer_pool_size (but I have no real experience on this with MariaDB), to make sure your block cache don't get pushed out by IO on the machine. Maybe also double check you're not overcommitting memory, databases really hate when they think data is in memory, but it's been swapped by the hypervisor. |
I incresed the innodb_buffer_pool_size from 128MB to 1GB. It did not change much 1GB is the recommended size for my setup according to: https://www.taste-of-it.de/mysql-mariadb-feintuning-von-innodb_buffer_pool_size/ The following query still takes 7.4 seconds:
8GB of RAM are free - so plenty... |
But all in all with those indexes it is already much faster! Loading the news panel now only takes one second where it was ~20sec before. Only if activating the "show read articles" it takes a little longer. But I think the above query can become a subsecond one too! |
The upgrade of the |
I don't get why this last query takes so long. It's the most trivial of them all. There's no filtering at all, it's just "give me the 40 items with the largest id". The primary key from items should be enough (and it is on my mariadb test db). What's the query plan ? |
I don't know what a query plan is. when removing the |
Oh. Yes, query plans are what explain display, sorry :) If you remove the order by, the query only has to take whatever records it wants, so yes, it would be faster. But That's not what we want :) So, yes, please give me the plan |
I think its the "same ones" as above:
|
Ok, so that's absurd… The correct way of doing this is to use the items primary key backwards, get records and see if they match in feeds and folders (which most will). Here's the plan on my environment with something similar…
It answers in like 5ms. In your case, it does the whole join of all records in the 3 tables, then sorts the result by descending id and gives the 40 top. Which seems plain wrong. So maybe the problem is that it thinks the table is too fragmented or it has bad statistics… |
So I did the two commands (not sure about the second one because I expected something more verbose):
Afterwards the Query is still at 6 sec:
|
Well then I don't know what to say. MariaDB's optimizer is dumb in your case, indexes are correct. And I'm not good enough with it to know what to do if it takes the wrong index. But the cause here is the news_items_feed_id_index index. I didn't have it, though it's in the configuration file (maybe I did a mistake when copying from my postgresql database, as it's present there). Without the index, the query is fast, with it I get your plan... Maybe try to drop the index to see how it goes... |
So, now if I change that query:
Everything is good! - could you do a pullrequest for this? |
Can you try to use the app in detail to be sure that the suppression of this index has no bad consequence ? |
should I delete the index? Because else I would have to change php wouldn't I - I'm not really into php... |
What do you mean ? You can keep the index as deleted, it shouldn't be a problem (except for performance, which is what you should test) |
@Grotax : should I just edit the database.xml ? There are 3 indexes to add and 1 to remove. Will this be sufficient ? Is there a way I can test what a migration this would produce ? |
Just edit the database.xml. I don't think there's a way to check the migration though. |
I did drop the index now. Together with the new indexes the news app is now a lot faster. Now it has the speed I'd expect! Also the android app benefits from the speedup when synchronizing with the server. |
Close nextcloud#510 Hopefully this will solve performance with MySQL/MariaDB. There is no way to prove this though. See nextcloud#510
Close nextcloud#510 Hopefully this will solve performance with MySQL/MariaDB. There is no way to prove this though. See nextcloud#510 Signed-off-by: Marc Cousin <marc.cousin@people-doc.com>
Ok, made the PR. When you get the app uprgrade, @enaut , don't forget to drop the indexes. Or rename them to the ones in the PR |
Close nextcloud#510 Hopefully this will solve performance with MySQL/MariaDB. There is no way to prove this though. See nextcloud#510 Signed-off-by: Marc Cousin <marc.cousin@people-doc.com>
Read and tick the following checkbox after you have created the issue or place an x inside the brackets ;)
Is your feature request related to a problem? Please describe.
The Problem is: When loading the main Nextcloud news page it takes about 15+ sekonds to be displayed. Enabling long query log on the server I found that each pageload triggers several 5+ queries on the database. By adding some indexes I was able to cut those in halve or on some even cut them from 5sec to 0,002sec.
It might be slow for me because I like to archive and have
88720
news items stored.Describe the solution you'd like
It would help a lot if the indexes would be added when installing/updating the app.
I'm not good with
php
and neithermysql
which is why I let the possible indexes generate by some webpage.I found that:
makes the query for counting the starred items a 0,002sec instead of 5sec one.
All the indexes I added are:
As said I'm not good with any of it which is why it is probably good if someone else actually does the job of optimizing but I was able to cut the page load speeds to half in web nextcloud news. And even (a lot) less in the nextcloud news app (before 30 sec, now 1sec).
The text was updated successfully, but these errors were encountered: