Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make some modifications to how the benchmarking tool works in local m… #312

Closed

Conversation

jwlent55
Copy link

…ode as part of

an evaluation of the RocksDB strorage engine:

  1. Ensure that all the warm up records requested are created to eliminate some
    random errors.
  2. Modify the mixed operation in local mode to increment the VectorClock to prevent
    all iterations from failing due to ObsoleteVersion exceptions.
  3. Modify the mixed operation to perform the write even if the read returns nothing
    rather than silently passing without performing a write (I considered making this
    an error, but, preferred this approach).
  4. Add a new Warning count similar to the ReturnCode count to report how often a mixed
    operation reads nothing (could be used for other situations).
  5. Move the RecordCode report outside the summaryOnly check so that the error counts
    are always reported.

@jwlent55 jwlent55 force-pushed the improve-local-mode-benchmark-tool branch 2 times, most recently from ab6af8c to bd4cdc9 Compare September 28, 2015 14:52
@@ -92,6 +94,16 @@ public synchronized void recordReturnCode(int code) {
returnCodes.get(Icode)[0]++;
}

public synchronized void recordWarningCode(int code) {
Integer Icode = code;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nipick: can we follow Java conventions of beginning variable names with a lower case? We also don't typically prefix variables with type information, though I guess that's not a big deal if you want to leave it as is.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with you formatting suggestion and will update the code as you requested. Originally I had just cloned the recordReturnCode method above and not thought to much about the local variable names.

@FelixGV
Copy link
Collaborator

FelixGV commented Oct 1, 2015

Awesome! Thanks for digging into this @jwlent55 (:

I left a minor nitpick... I think the PR would be good to merge in after fixing the case style.

About the questions you brought up:

  1. Regarding RocksDB options, I'm not familiar enough with them to have a strong opinion on this, but I heard RocksDB had a LOT of knobs, a lot of which are useful in one scenario or another. So... I guess the end goal we might want to aim for would be to expose all or most config options? But of course it's also okay to integrate them organically as we require them.
  2. I don't think anyone tried varying the hard-coded options yet. This originated as a hackday project which @bhasudha pushed to completion on her own time, but we haven't had a need to run it in production yet so there's probably a lot of details missing before it can be considered rock-solid (pun intended).
  3. Regarding the benchmark tool, we don't really use this tool anymore. We have a more complex closed source tool which we use to deploy clusters, test workloads in fully distributed mode and publish comparative reports to some other internal tool. Nagini was an attempt to open-source that system, but it only contains the deployment functionality, it does not contain the performance testing part.

Thanks again for this change (:

-F

@jwlent55 jwlent55 force-pushed the improve-local-mode-benchmark-tool branch from bd4cdc9 to 534881b Compare October 1, 2015 19:17
@jwlent55
Copy link
Author

jwlent55 commented Oct 1, 2015

I have amended the code one more time:

  1. Address your earlier formatting comment.
  2. Make several more of the RocksDB options configurable.

I tried to leave the RocksDB options defaults unchanged (except in cases where they are already overridden by the current code) so as not to change current behavior.

Still testing, but, we are getting much better performance using:

  • "compactionStyle" of UNIVERSAL (much lower write magnification)
  • Much larger "writeBufferSize" (using the RocksDb default of 4M or greater rather than the current 8Kb override)

Note that Samza currently configures UNIVERSAL compaction style by default:
https://samza.apache.org/learn/documentation/0.9/jobs/configuration-table.html

@FelixGV
Copy link
Collaborator

FelixGV commented Oct 1, 2015

Oh okay the unconventional case style was a copy paste from elsewhere, I see. Anyway, thanks for fixing both (:

I think you should feel free to change the RocksDB default configs we've got in there if your testing shows that you get better performance otherwise. I doubt anyone is using any of that stuff in production yet since it's still pretty experimental. Once we greenlight RocksDB as being stable, then we may want to freeze the default configs provided...

-F

@jwlent55
Copy link
Author

jwlent55 commented Oct 1, 2015

Should I amend this commit with the changes I have so far?

@FelixGV
Copy link
Collaborator

FelixGV commented Oct 1, 2015

If it's not too much trouble, sure, go for it (: !

@jwlent55 jwlent55 force-pushed the improve-local-mode-benchmark-tool branch 2 times, most recently from b43db74 to 6513d86 Compare October 1, 2015 21:33
@jwlent55
Copy link
Author

jwlent55 commented Oct 1, 2015

No trouble at all. Had it all loaded up. Just did not want to mess you up by changing the commit as you were merging it. Thanks for the quick replies.

@jwlent55 jwlent55 force-pushed the improve-local-mode-benchmark-tool branch from 6513d86 to 1137bfe Compare October 1, 2015 21:45
this.lockStripes = props.getInt("rocksdb.lock.stripes", 50);

// TODO: Validate the default mandatory options
compactionStyle = CompactionStyle.valueOf(props.getString("rocksdb.options.compactionStyle", CompactionStyle.UNIVERSAL.toString()));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm...

I thought I had given a comment here earlier, but apparently it got lost in the mists of the interwebs. Anyway, it went something along the lines of:

Sorry for another last minute nitpick, but can we make this config extraction part of VoldemortConfig? It would be ideal if all configs, and their default values, were defined in there only.

Thanks for your work! It's very nice to see someone pushing on the RocksDB front (:

-F

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I saw the email notification for your earlier comment, but, also did not see it here. I did find it associated with an obsolete commit:

jwlent55@6513d86

I think I may have caused the problem by amending my update to fix a spelling mistake just as you were commenting on the code.

In any event, happy to make the change. I debated (with myself) where to place the option definitions: VoldemortConfig or RocksDbStorageConfiguration. I ended up following the Krati pattern. Figured I would guess wrong.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... it always brings a tear to my eyes to see inconsistencies in the code, but I guess that's a fight we'll take on another day (: ...

Going forward though, it'd be good to try to consolidate configs in VoldemortConfig as much as possible.

Thanks again for your work!

-F

@jwlent55 jwlent55 force-pushed the improve-local-mode-benchmark-tool branch from 1137bfe to 82cad02 Compare October 2, 2015 22:13
@jwlent55
Copy link
Author

jwlent55 commented Oct 2, 2015

One more version. The Javadoc comments are a bit brief. I did not want to replicate all the information that is documented elsewhere. I did provide a reference the related RocksDB JNI javadocs that will should work inside an IDE. Let me know if that is not acceptable.

Since I am now messing with a high churn file there are conflicts. They should be easy to resolve. I just added 3 chunks of code to VoldemortConfig.java. No deletes or changed lines. I need to read up this weekend on how to properly handle them on my end.

@jwlent55 jwlent55 force-pushed the improve-local-mode-benchmark-tool branch 3 times, most recently from 60c9ca3 to fba157f Compare October 3, 2015 06:18
@jwlent55
Copy link
Author

jwlent55 commented Oct 3, 2015

Commit is now rebased off master.

@FelixGV
Copy link
Collaborator

FelixGV commented Oct 6, 2015

Hi @jwlent55,

I'm sorry but I have some more comments xD ...

First, and most importantly, it would be preferable if there was no use of RocksDB-related classes in the VoldemortConfig's constructor, because that will cause a ClassNotFoundException if running Voldemort without RocksDB on the classpath. Since Voldemort is pluggable with many different storage engines, we try as much as possible to allow it to run on lean dependencies, when possible. So... there are a few ways to achieve this, for example: making sure the internal state of the class is limited to just first-class Java types, like String, and then doing the enum conversion as part of the getter/setter. This would still allow you to fail fast at start up time (since you would be calling the getter during start up), which is desirable, but you wouldn't touch RocksDB classes if the RocksDBStorageEngine isn't enabled.

Secondly, can you please follow the style/convention of the rest of the class. Use this.property = instead of the setter. I'm not sure why we do that, and there is a good argument to be made for doing it your way, but let's just keep it consistent for now.

Third, we try to keep the config's default as part of the Javadoc comment of each setter. Again, there is an argument to be made against this way of doing things. I don't like it entirely myself, but I prefer to keep it consistent. So until we change it everywhere, let's keep it the same way...

I feel bad for asking you to change the PR a few times in a row :/ . Hopefully you agree with my requested changes... Feel free to say if you don't have time, and I can clean up the changes for you if you prefer, but I would prefer giving you the opportunity to contribute code that won't get changed on the next commit!

Thanks!

-F

…ode as part of

an evaluation of the RocksDB strorage engine:

1) Ensure that all the warm up records requested are created to eliminate some
random errors.
2) Modify the mixed operation in local mode to increment the VectorClock to prevent
all iterations from failing due to ObsoleteVersion exceptions.
3) Modify the mixed operation to perform the write even if the read returns nothing
rather than silently passing without performing a write (I considered making this
an error, but, preferred this approach).
4) Add a new Warning count similar to the ReturnCode count to report how often a mixed
operation reads nothing (could be used for other situations).
5) Move the RecordCode report outside the summaryOnly check so that the error counts
are always reported.
6) Make it possible to configure the storage engine in local mode.
7) Make several of the RocksDB parameters configurable.
@jwlent55 jwlent55 force-pushed the improve-local-mode-benchmark-tool branch from fba157f to e1db72a Compare October 9, 2015 21:18
@jwlent55
Copy link
Author

jwlent55 commented Oct 9, 2015

Sorry for the delay responding to your comments. I agree that referencing RockDB types from VoldemortConfig is far from ideal. I only used the setter because I thought I might want to add some validation code there instead of in the validate method, but, there is a lot to be said for consistency.

The reason for the delay was that I was investigating a whole new approach to configuring RocksDB using an API not available in 3.6.2, Upgrading to RocksDB 3.13.1 turned out to take a lot more work than I expected (3.13.1 has some issues: facebook/rocksdb#606). I think those issues are now under control (perhaps a 3.13.2).

The new approach to configuring RocksDB is based on the realization that RocksDB has many, many parameters and they are are well documented in the RocksDB code. Cloning all those parameters and documentation over to Voldemort seemed to be tedious and hard to maintain. I am not yet sure which ones are the most important The new approach allows one to configure RockDB using the native parameter names with very little code in Voldemort and gives one access to almost all of them (many more than are available with the JNI API).

Let me know what you think.

@FelixGV
Copy link
Collaborator

FelixGV commented Oct 9, 2015

I agree.

I think having a pass-through mechanism where rocksdb.some.config.name passes some.config.name to the RocksDB engine makes a lot of sense, and would not require exposing every single one through Voldemort.

So are you thinking of waiting until the next stable release of RocksDB, upgrading the Voldemort dependency, and using this new configuration API?

No worries on the delay (:

-F

@jwlent55
Copy link
Author

That is plan and it looks like the latest 3.13.1 build has what we need. Still need to test it a bit more. Especially some OS X testing.

A private RocksDB build was our fallback position. We already have one, but, that approach would block a Voldemort pull request.

Sent from my iPhone

On Oct 9, 2015, at 5:41 PM, Felix GV notifications@github.com wrote:

I agree.

I think having a pass-through mechanism where rocksdb.some.config.name passes some.config.name to the RocksDB engine makes a lot of sense, and would not require exposing every single one through Voldemort.

So are you thinking of waiting until the next stable release of RocksDB, upgrading the Voldemort dependency, and using this new configuration API?

No worries on the delay (:

-F


Reply to this email directly or view it on GitHub.

@jwlent55
Copy link
Author

Quick "15 minute test" of OS X yesterday turned into an all day event - RocksDB failed the OS X test. We have a private build of RocksDB 3.13.1 that works for us. Built on one of our Macs after fixing a directory issue related to how the parts are pulled together to build the final jar. Hopefully a real build will be available on Maven Central soon.

@FelixGV
Copy link
Collaborator

FelixGV commented Oct 13, 2015

Ok, thanks for the update.

It would be ideal if we can keep Voldemort's RocksDB dependency limited to what's available in Maven Central.

If a stable RocksDB version is released and you're blocked on the Java artifact publication, just let me know and I might be able to put some pressure on the Samza guys sitting on the other side of the room so they publish the artifact (:

-F

@jwlent55
Copy link
Author

100% agree. The private build was just so we could continue testing internally with what we hoped the official build would look like. I assumed my Voldemort pull requests was blocked on that.

In any event a new version of 3.13.1 was just released that looks good. OS X worked for us as did all 4 compression algorithms on Linux-64.

Sent from my iPhone

On Oct 13, 2015, at 6:05 PM, Felix GV notifications@github.com wrote:

Ok, thanks for the update.

It would be ideal if we can keep Voldemort's RocksDB dependency limited to what's available in Maven Central.

If a stable RocksDB version is released and you're blocked on the Java artifact publication, just let me know and I might be able to put some pressure on the Samza guys sitting on the other side of the room so they publish the artifact (:

-F


Reply to this email directly or view it on GitHub.

@jwlent55
Copy link
Author

I have been building on this commit to add RocksDB support for the truncate operation. That allows the drop store admin function to work. I have something working that passes the unit tests (which I enhanced a bit to verify you can add an item to the store after the truncate) and works in my full deployment environment.

This commit already contains two different changes:

  1. Several small Benchmark tool changes.
  2. Ability to configure RocksDB options.
    I don't think I should add the truncate change to this one. If anything this one should be broken into two commits.

Any thoughts if you want to accept some or all of the changes in this commit? I suppose I could separate the truncate changes from the configuration changes, but, if you are close to accepting the configuration changes I would prefer to just build off of them.

@FelixGV
Copy link
Collaborator

FelixGV commented Oct 16, 2015

Sorry about that @jwlent55, I hadn't seen that you had did the refactoring to pass through any rocksdb config.

The change LGTM. I ran the unit tests locally and then merged it in.

-F

@FelixGV
Copy link
Collaborator

FelixGV commented Oct 16, 2015

Feel free to send a PR for the other change you're working on as well (: !

@FelixGV FelixGV closed this Oct 16, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants