
Discussions on the internet

Animesh Trivedi edited this page Jul 12, 2017 · 2 revisions

The Spark Endianness bug

http://apache-spark-developers-list.1001551.n3.nabble.com/Tungsten-in-a-mixed-endian-environment-td15975.html

https://issues.apache.org/jira/browse/SPARK-12778

And the comment: How big of a deal is this use case in a heterogeneous endianness environment? If we do want to fix it, we should do it right before Spark shuffles data to minimize the performance penalty, i.e. turn big-endian encoded data into little-endian encoded data before it goes on the wire. This is a pretty involved change, and given other things that might break across heterogeneous endianness environments, I am not sure it is high priority enough to even warrant review bandwidth right now.
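To make the suggestion in the comment concrete, here is a minimal sketch of the idea in Python. It is not Spark's code: the helper names (`encode_native`, `to_little_endian_wire`, `decode_little_endian`) are hypothetical, and it assumes the data is a flat buffer of 64-bit signed integers. It shows raw big-endian words being misread by a little-endian reader, and the byte swap that would be applied just before the data goes on the wire.

```python
import struct


def encode_native(values, byteorder):
    """Simulate a host dumping raw 64-bit longs in its own byte order.

    `byteorder` is "big" or "little" (hypothetical parameter for this sketch).
    """
    prefix = ">" if byteorder == "big" else "<"
    return struct.pack("%s%dq" % (prefix, len(values)), *values)


def to_little_endian_wire(buf, source_byteorder):
    """Normalize a buffer of 8-byte words to little-endian before sending."""
    if source_byteorder == "little":
        return buf  # already in wire order, no copy or swap needed
    # Reverse the bytes of every 8-byte word.
    return b"".join(buf[i:i + 8][::-1] for i in range(0, len(buf), 8))


def decode_little_endian(buf):
    """Read the buffer the way a little-endian receiver would."""
    return list(struct.unpack("<%dq" % (len(buf) // 8), buf))


values = [1, -2, 2 ** 40]
raw_from_big_endian_host = encode_native(values, "big")

# Without conversion, the little-endian reader sees different numbers.
misread = decode_little_endian(raw_from_big_endian_host)

# With the swap applied before the "shuffle", the values round-trip.
wire = to_little_endian_wire(raw_from_big_endian_host, "big")
recovered = decode_little_endian(wire)
```

The swap only costs anything on the big-endian side; little-endian senders pass the buffer through unchanged, which is the performance argument made in the comment.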
