[hbase] update HBase bindings for eom #1396

Merged: 4 commits, Feb 5, 2020

Collaborator

@busbey busbey commented Feb 3, 2020

* remove 0.98, 1.0, 1.2, and 2.0 bindings
* add 2.2 binding
* incorporate README from 0.98 binding into current bindings
* incorporate README on bigtable testing from 1.0 binding into 1.4 binding
* incorporate implementation from 1.0 client into current bindings
* update asynchbase binding to include the parts of the removed bindings it referenced
* update 1.4 and 2.2 to current releases
* use shaded client test for all hbase bindings
* make hbase bindings consistently use log4j
* fixes brianfrankcooper#1173
* fixes brianfrankcooper#1172
@busbey
Collaborator Author

busbey commented Feb 3, 2020

Ugh. The maprdb stuff also had a reference to the hbase 1.0 binding implementation? I'll chase that down.

Contributor

@joshelser joshelser left a comment

Just one question about hbase22 vs hbase21, as the docs say "HBase 2.1+".

hbase22/README.md (outdated, resolved)
@joshelser
Contributor

Gotcha. I think your phrasing of "really 1.x and 2.x artifacts" is the most meaningful for users. The numbers in the path can just be things we use :)

@busbey
Collaborator Author

busbey commented Feb 5, 2020

Okay, I switched both remaining HBase bindings to refer to "hbase 1.y" and "hbase 2.y" instead of specific minor releases. How's that?

@joshelser
Contributor

👍 sounds great!
Ship it.

For best results, use the pre-splitting strategy recommended in [HBASE-4163](https://issues.apache.org/jira/browse/HBASE-4163):

```
hbase(main):001:0> n_splits = 200 # HBase recommends (10 * number of regionservers)
```

This is gross. We have the split algorithm classes; assuming they're now available on all release branches, we should promote those instead.

Collaborator Author

I would love to have a simpler way to presplit. Can you make a PR with specifics?

The class I was thinking of is o.a.h.h.util.RegionSplitter$SplitAlgorithm. RegionSplitter is marked as IA.Private, but SplitAlgorithm is consumed from public interfaces such as TableSnapshotInputFormat. It does not seem to have an existing algorithm that is sympathetic to YCSB, but I suppose one could drop such an implementation into the classpath of the target cluster...
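
To make the pre-splitting idea in this thread concrete, here is a minimal shell sketch that prints evenly spaced split keys. It is not the README's snippet and not a `SplitAlgorithm` implementation. The function name `ycsb_split_keys`, the `user<N>` key shape, and the 1000..9999 range are assumptions chosen for illustration.

```shell
# Sketch: print evenly spaced split keys for YCSB-style row keys.
# Assumptions (not taken from the PR): row keys look like "user<N>" and
# their first four digits are spread roughly uniformly over 1000..9999.
ycsb_split_keys() {
  n_splits="$1"   # number of split points to emit
  lo=1000         # assumed smallest four-digit prefix
  hi=9999         # assumed largest four-digit prefix
  i=1
  while [ "$i" -le "$n_splits" ]; do
    # Integer arithmetic: the i-th boundary, scaled into [lo, hi].
    echo "user$(( lo + i * (hi - lo) / n_splits ))"
    i=$(( i + 1 ))
  done
}

# Example: four boundaries, one per line.
ycsb_split_keys 4
```

Redirecting the output to a file gives one split key per line, which could then be passed to the hbase shell `create` command through its `SPLITS_FILE` option, for example for the `usertable` table with the `family` column family.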

Please see the general instructions in the `doc` folder if you are not sure how it all works. You can apply additional properties (as seen in the next section) like this:

```
bin/ycsb run hbase1 -P workloads/workloada -cp /HBASE-HOME-DIR/conf -p table=usertable -p columnfamily=family -p clientbuffering=true
```

Is there a way for the user to specify a specific version of the hbase-1.y client? From the poms, it seems it can be specified at build time.

Collaborator Author

Yes, you can build against a different version if you like. After this PR, you do so by setting the `hbase1.version` or `hbase2.version` property, respectively.
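
As a rough illustration of the build-time override, the sketch below composes a Maven command line that sets one of the two properties. The helper name `ycsb_hbase_build_cmd` is hypothetical. It also assumes that the module directories are named `hbase1` and `hbase2` and that Maven is run from the YCSB source root. The version number is only a placeholder.

```shell
# Sketch: compose the Maven command for building one HBase binding
# against a chosen client version. Assumptions: the module directory is
# named after the binding (hbase1 or hbase2) and Maven is run from the
# YCSB source root. The helper only prints the command, it does not run it.
ycsb_hbase_build_cmd() {
  binding="$1"   # hbase1 or hbase2
  version="$2"   # HBase client version to build against
  echo "mvn -pl ${binding} -am clean package -D${binding}.version=${version}"
}

# Print the command for review; 1.4.10 is only an illustrative version.
ycsb_hbase_build_cmd hbase1 1.4.10
```

Printing rather than executing keeps the example safe to try anywhere. Piping the output to `sh` from the source root would start the actual build.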

# HBase (0.98.x) Driver for YCSB
This driver is a binding for the YCSB facilities to operate against a HBase 0.98.x Server cluster.
To run against an HBase >= 1.0 cluster, use the `hbase10` binding.
# HBase (2.y) Driver for YCSB

Is there a reason not to consolidate the 3 different readmes into a single file?

Collaborator Author

The way packaging works, we create a tarball of binaries for each binding, and that package gets the README associated with that binding. We could restructure the hbase1, hbase2, and asynchbase modules to try to change this, but that's going to be a bunch of work.

@busbey busbey merged commit 780aec9 into brianfrankcooper:master Feb 5, 2020
@busbey busbey deleted the cleanup-hbase branch February 5, 2020 19:13