Skip to content

Commit

Permalink
PARQUET-1529: Shade fastutil in all modules where used (#617)
Browse files Browse the repository at this point in the history
  • Loading branch information
gszadovszky authored Feb 13, 2019
1 parent 9461845 commit dcfd53a
Show file tree
Hide file tree
Showing 3 changed files with 8 additions and 22 deletions.
4 changes: 4 additions & 0 deletions parquet-avro/pom.xml
Original file line number Diff line number Diff line change
Expand Up @@ -161,6 +161,10 @@
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
</plugin>
</plugins>
</build>

Expand Down
22 changes: 0 additions & 22 deletions parquet-column/pom.xml
Original file line number Diff line number Diff line change
Expand Up @@ -109,28 +109,6 @@
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>shade</goal>
</goals>
<configuration>
<minimizeJar>true</minimizeJar>
<artifactSet>
<includes>
<include>it.unimi.dsi:fastutil</include>
</includes>
</artifactSet>
<relocations>
<relocation>
<pattern>it.unimi.dsi</pattern>
<shadedPattern>org.apache.parquet.it.unimi.dsi</shadedPattern>
</relocation>
</relocations>
</configuration>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.codehaus.mojo</groupId>
Expand Down
4 changes: 4 additions & 0 deletions parquet-hadoop/pom.xml
Original file line number Diff line number Diff line change
Expand Up @@ -121,6 +121,10 @@
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-jar-plugin</artifactId>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
</plugin>
</plugins>
</build>

Expand Down

4 comments on commit dcfd53a

@czulehner
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As you now have activated the shade plugin here, does this now also shade the jackson classes, right?
Was the previous non-shading of jackson on parquet.hadoop intentionally? (You already had the parquet-jackson dependency)

@gszadovszky
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The jackson classes are shaded by the separate module parquet-jackson. The adding of the shade plugin here did not change the original idea of handling the jackson dependency so, jackson was not shaded by parquet-hadoop directly and is still not shaded.
See https://github.com/apache/parquet-mr/blob/master/parquet-jackson/README.md for more details.

@czulehner
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I already read the parquet-jackson docu.
Just my two cents:

  • explicitly define the (compile time) dependency in parquet-hadoop to fastutil (yet transient dep via parquet-column) to make it more clear, where shading would be needed.
  • where is now the shading of fastutil defined? AFAIK the default configuration for the shade plugin (in root pom) will be used when the plugin gets activated. So now, the shade plugin gets activated in parquet-hadoop with config from root pom, which only shades jackson - no config for fastutil shading any more?

@gszadovszky
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Due to the new feature column-indexes (to be introduced in 1.11.0) parquet-hadoop has started using using some fastutil classes. It was working just fine in the unit test level and in some runtime environments but in one it was failed (it was spark as far as I remember). The problem was that fastutil supposed to be shaded by parquet, so no one should ever need this lib as a runtime dependency. I've fixed this problem by adding the default shade plugin (with the configuration in the root which includes both jackson and fastutil) to the modules where fastutil or jackson is used.
So, after this change all the modules which depend on fastutil is also shading it (not in the way we shade jackson but the common way by shading the required classes jar-by-jar).

Please sign in to comment.