-
Notifications
You must be signed in to change notification settings - Fork 9.1k
HADOOP-17464. Create hadoop-compression module #2611
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: trunk
Are you sure you want to change the base?
Conversation
💔 -1 overall
This message was automatically generated. |
a lot of those tests are failing because code which expects the codecs to be found aren't finding them...you'll need to modify the MR and hadoop-hdfs dependencies to suit. Some of the tests look unrelated. Can you rebase to trunk to make sure you are up to date with all changes, especially HttpServer2? I'm going to raise this on hadoop-common to discuss how best to move to this. We may end up having to leave some codecs everyone expects to be in hadoop-common to stay there, while putting all new codecs into hadoop-compression |
f258421
to
aed3837
Compare
Hmm, I cannot reproduce the failures including |
Currently I only move lz4 and snappy codes to hadoop-compression which are two codecs we have added the dependencies. |
that's great -that means all should be good. We just now need to get that build working |
About the latest test failure, I cannot reproduce them locally:
Latest commit should fix these tests:
I can reproduce this one, but looks unrelated to the change:
|
Okay, I think I fixed all test failures except this one:
It looks not related. I also ran the test against current trunk locally. The test is also failed. I'm going to fix it by https://issues.apache.org/jira/browse/HADOOP-17472. |
Hmm, I guess this is ready for review as |
OK, so where are we with this patch?
|
I think so. Previously people may include hadoop-common and expect snappy/lz4 codec there. Where should we notify the people? release-note?
You mean to let codec loader to report what codecs there are? |
Perhaps we should create a page for compression codecs, under "Tools" category of https://hadoop.apache.org/docs/current/index.html. This can be done separately though after this is merged. We also need release notes I assume.
I think maybe some informative error message that users should have @steveloughran do you still think this is a worthwhile thing to do? and I assume this targets Hadoop 3.4.x, not 3.3.x, is that correct? |
Another thing is as @iwasakims mentioned in the |
Hmm, seems we cannot do that. hadoop-compression depends on hadoop-common, so we cannot make hadoop-common depends on hadoop-compression. |
@viirya yeah you're right. Personally I feel it should be fine as long as we have good documentation and error messages like mentioned above, but would like to hear inputs from @steveloughran and @iwasakims . Just to clarify: in the long term we do still plan to keep the main classes such as |
Except that we plan to move all compression stuffs to Different than the extra codecs, for these abstractions, I think the ideal approach is all or not when considering moving into |
As this is staying open for a while, and no more new comments. @steveloughran could we have some guidance about this module? Thank you. |
💔 -1 overall
This message was automatically generated. |
See https://issues.apache.org/jira/browse/HADOOP-17464 for details.
We added lz4-java, snappy-java dependencies to replace native libs (HADOOP-17125, HADOOP-17292). As per the suggestion from the review comments, we better add a hadoop module to include these extra dependencies, to avoid messing up the dependencies of user application.
Moved Lz4, Snappy Codec to hadoop-compression module. Codec test code are moved to hadoop-compression module too.