Investigate higher disk i/o utilization on 1.4.2. #9201
I have this issue too. I now see the same high number of reads as writes; previously there was only a high number of writes. This is very bad and I am getting 30% I/O wait right now. I can add more IOPS, but before 1.4.2, on 1.3.6, it was enough.
@szibis How many CPU cores do you have available, and how many IOPS can your disks support?
Using m4 instances on AWS with 16 cores, plus an additional 3 TB GP2 EBS volume. Traffic is about 60k+ write requests per second. It is currently using twice as much memory as on 1.3.6, about 50-58 GB.
@szibis Since you have 16 cores or more, can you try setting:
You may need to increase
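For reference, a minimal sketch of what such a change looks like, with placeholder values rather than the ones actually suggested here: both max-concurrent-compactions and cache-snapshot-memory-size live in the [data] section of influxdb.conf.

```toml
# Illustrative values only, not the ones suggested in this thread.
[data]
  # Cap the number of TSM compactions that run concurrently; the default of 0
  # limits compactions to roughly half of the available cores.
  max-concurrent-compactions = 4

  # Cache size at which the engine snapshots the cache to a TSM file; the
  # default is "25m", and raising it trades more RAM for fewer, larger flushes.
  cache-snapshot-memory-size = "256m"
```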
@szibis I have merged a fix to master and the 1.4 branch. If you are able to test it out, that would be great.
@jwilder Is this available as a nightly build, or do I need to build the package on my own?
@szibis I just realized that, if you set If you are running the change in 1.4, you shouldn't need the config changes though. I suggested those as a workaround for 1.4.2 in lieu of the fixes. They won't hurt if you want to keep them though.
Yes, I have had this bumped from the beginning.
I will keep them changed. Thanks for the fix and the help.
@jwilder I can also see a big improvement over the previous nightly builds. I had noticed this bump in disk I/O when going from version 1.3 to 1.4+.
@lpic10 Thanks for the update. What do you have set for
I don't have it set, I suppose it is the default value.
@jwilder Not sure if it's related, but I've noticed a big increase in memory consumption since the update (blue mark). Let me know if you want me to run a profile (or if I should comment here or open a different issue).
@lpic10 Can you try setting
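The setting asked about here is cut off; as a hypothetical sketch only, the cache-related bounds in the [data] section are what is usually tuned when memory consumption grows: cache-max-memory-size caps each shard's in-memory cache, and cache-snapshot-memory-size controls how early that cache is flushed to disk.

```toml
# Hypothetical example; the values shown are the InfluxDB 1.4 defaults, not a
# recommendation from this thread.
[data]
  # Hard limit on a shard's in-memory cache before new writes are rejected.
  cache-max-memory-size = "1g"

  # Threshold at which the cache is snapshotted to disk; lowering it reduces
  # memory use at the cost of more frequent snapshots.
  cache-snapshot-memory-size = "25m"
```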
@jwilder Everything has returned to the bad state. I/O waits are now high and the number of reads is the same as the number of writes with cache-snapshot-memory-size set to
(Attached graphs: I/O wait, IOPS.)
@jwilder I just changed it here; I'll give an update tomorrow.
@szibis Can you grab some profiles via the
We have a rather large number of measurements right now (1100+) because of a bad reporting issue, but that was also the case on 1.3.x and it worked much better then than it does now on 1.4.x. Removing them is now impossible, even one by one. The standard number of measurements is under 200. With such a big DB, every config change and restart takes 30+ minutes.
pprof.contentions.delay.001.pb.gz
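An aside, since the endpoint named in the request above is cut off: assuming it refers to InfluxDB's built-in Go pprof handler, those endpoints are only served on the HTTP port while profiling is enabled in the [http] section (the default), and the attached pprof.contentions.delay.001.pb.gz is the gzip-compressed protobuf format they emit, which can be inspected with go tool pprof.

```toml
# Assumed configuration, for illustration: with pprof enabled (the default),
# the standard Go pprof endpoints (e.g. /debug/pprof/heap and
# /debug/pprof/profile) are served on the HTTP bind address.
[http]
  bind-address = ":8086"
  pprof-enabled = true
```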
@szibis I've attached a build of the current 1.4 branch plus some changes I'm testing to resolve these issues. Would you be able to try this out and see if it improves your situation? I can push up a branch if you prefer to build the binary yourself.
Testing on one node. |
Trying the latest nightly with
Unsurprisingly, the changes that reduced disk I/O have increased the need for memory, and vice versa.
@jwilder should this be closed? |