Thanks for opening a pull request! Could you open an issue for this pull request on GitHub Issues? https://github.com/oap-project/native-sql-engine/issues Then could you also rename the commit message and pull request title in the following format?
See also: |
4304a6a to 054de03
Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>
@rui-mo the Scala unit tests are based on spark300. I will disable these tests first and fix them in follow-up patches.
<groupId>com.intel.oap</groupId>
<version>${project.version}</version>
<scope>test</scope>
</dependency>
we only need to package on arrow-data-source-common now
<groupId>javax.servlet</groupId>
<artifactId>javax.servlet-api</artifactId>
<version>3.1.0</version>
</dependency>
note: this is a requirement for thrift-server
arrow-data-source/pom.xml
Outdated
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-common</artifactId>
<version>2.7.3</version>
remove this and use hadoop.version
@@ -168,12 +175,22 @@ case class ColumnarShuffledHashJoinExec(
s"ColumnarShuffledHashJoinExec doesn't support doExecute")
}
override def supportsColumnar = true

// override def inputRDDs(): Seq[RDD[InternalRow]] = {
remove dead code
// protected override def prepareRelation(ctx: CodegenContext): HashedRelationInfo = {
// throw new UnsupportedOperationException(
// "prepareRelation is used by codegen which we don't support")
// }
ditto
@@ -52,8 +52,7 @@ import scala.util.Random
case class ColumnarWindowExec(windowExpression: Seq[NamedExpression],
partitionSpec: Seq[Expression],
orderSpec: Seq[SortOrder],
child: SparkPlan) extends WindowExecBase(windowExpression,
partitionSpec, orderSpec, child) {
child: SparkPlan) extends WindowExecBase {
WindowExecBase is a trait in spark311
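To illustrate why the `extends` clause changes shape, here is a minimal sketch that does not depend on Spark. `WindowBaseAsClass`, `WindowBaseAsTrait`, and the `String`-typed fields are hypothetical stand-ins, not Spark's actual definitions. With an abstract class that takes constructor parameters, the subclass has to forward its arguments in the `extends` clause; with a trait that declares abstract members, the case class fields implement those members directly and nothing is forwarded.

```scala
// Hypothetical stand-ins for illustration only; these are not Spark's types.
object WindowBaseSketch {

  // Older shape: the base is an abstract class with constructor parameters.
  abstract class WindowBaseAsClass(
      windowExpression: Seq[String],
      partitionSpec: Seq[String]) {
    def describe: String =
      s"window=${windowExpression.mkString(",")} partitionBy=${partitionSpec.mkString(",")}"
  }

  // The subclass must pass its own arguments up to the base constructor.
  case class OldStyleWindow(windowExpression: Seq[String], partitionSpec: Seq[String])
      extends WindowBaseAsClass(windowExpression, partitionSpec)

  // Newer shape: the base is a trait that only declares abstract members.
  trait WindowBaseAsTrait {
    def windowExpression: Seq[String]
    def partitionSpec: Seq[String]
    def describe: String =
      s"window=${windowExpression.mkString(",")} partitionBy=${partitionSpec.mkString(",")}"
  }

  // The case class fields satisfy the abstract members; no arguments are forwarded.
  case class NewStyleWindow(windowExpression: Seq[String], partitionSpec: Seq[String])
      extends WindowBaseAsTrait
}
```

Both variants expose the same behavior to callers; only the declaration site differs, which is why the diff above touches nothing but the `extends` clause.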
@@ -121,45 +121,15 @@ class ColumnarShuffleManager(conf: SparkConf) extends ShuffleManager with Loggin
* Called on executors by reduce tasks.
*/
override def getReader[K, C](
handle: ShuffleHandle,
note: API changes in Shuffle
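The thread does not spell out the API change. As far as I understand it (an assumption, not something stated here), Spark 3.1 folds the separate range-reading entry point of `ShuffleManager` into a single `getReader` that also takes map index bounds, so an implementation overrides one method instead of two. The sketch below shows that shape with simplified stand-in types; `ReadRange`, `TwoMethodManager`, `UnifiedManager`, and `ColumnarLikeManager` are hypothetical, and the real parameters (`ShuffleHandle`, `TaskContext`, the metrics reporter) are left out on purpose.

```scala
// Simplified stand-ins; Spark's real shuffle types are intentionally omitted.
object ShuffleReaderSketch {

  // Records which map outputs and reduce partitions a reader would cover.
  final case class ReadRange(
      startMapIndex: Int,
      endMapIndex: Int,
      startPartition: Int,
      endPartition: Int)

  // Shape before the change (assumed): two entry points to implement.
  trait TwoMethodManager {
    def getReader(startPartition: Int, endPartition: Int): ReadRange
    def getReaderForRange(
        startMapIndex: Int,
        endMapIndex: Int,
        startPartition: Int,
        endPartition: Int): ReadRange
  }

  // Shape after the change (assumed): one abstract method, plus a final
  // convenience overload that covers all map outputs.
  trait UnifiedManager {
    def getReader(
        startMapIndex: Int,
        endMapIndex: Int,
        startPartition: Int,
        endPartition: Int): ReadRange

    final def getReader(startPartition: Int, endPartition: Int): ReadRange =
      getReader(0, Int.MaxValue, startPartition, endPartition)
  }

  // A custom manager now overrides only the unified method.
  object ColumnarLikeManager extends UnifiedManager {
    override def getReader(
        startMapIndex: Int,
        endMapIndex: Int,
        startPartition: Int,
        endPartition: Int): ReadRange =
      ReadRange(startMapIndex, endMapIndex, startPartition, endPartition)
  }
}
```

Under this shape, a custom shuffle manager shrinks because the two near-identical method bodies collapse into one, which is consistent with the diff above removing more lines than it adds.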
* [NSE-262] fix remainer loss in decimal divide (#263)
* fix decimal divide int issue
* correct cpp uts
* use const reference
  Co-authored-by: Yuan <yuan.zhou@outlook.com>
  Co-authored-by: Yuan <yuan.zhou@outlook.com>
* [NSE-261] ArrowDataSource: Add S3 Support (#270)
  Closes #261
* [NSE-196] clean up configs in unit tests (#271)
* remove testing config
* remove unused configs
* [NSE-265] Reserve enough memory before UnsafeAppend in builder (#266)
* change the UnsafeAppend to Append
* fix buffer builder in shuffle
  shuffle builder use UnsafeAppend API for better performance. it tries to reserve enough space based on results of last recordbatch, this maybe not buggy if there's a dense recordbatch after a sparse one.
  this patch adds below fixes:
  - adds Reset() after Finish() in builder
  - reserve length for offset_builder in binary builder
  A further clean up on the reservation logic should be needed.
  Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>
  Co-authored-by: Yuan Zhou <yuan.zhou@intel.com>
* [NSE-274] Comment to trigger tpc-h RAM test (#275)
  Closes #274
* bump cmake to 3.16 (#281)
  Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>
* [NSE-276] Add option to switch Hadoop version (#277)
  Closes #276
* [NSE-119] clean up on comments (#288)
  Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>
* [NSE-206]Update installation guide and configuration guide. (#289)
* [NSE-206]Update installation guide and configuration guide.
* Fix numaBinding setting issue. & Update description for protobuf
* [NSE-206]Fix Prerequisite and Arrow Installation Steps. (#290)
* [NSE-245]Adding columnar RDD cache support (#246)
* Adding columnar RDD cache support
  Signed-off-by: Chendi Xue <chendi.xue@intel.com>
* Directly save reference, only convert to Array[Byte] when calling by BlockManager
  Signed-off-by: Chendi Xue <chendi.xue@intel.com>
* Add DeAllocator to construction to make sure this instance will be released once it be deleted by JVM
  Signed-off-by: Chendi Xue <chendi.xue@intel.com>
* Delete cache by adding a release in InMemoryRelation
  Since unpersist only delete RDD object, seems our deAllocator wasn't being called along
  Now we added a release function in InMemoryRelation clearCache() func, may need to think a new way for 3.1.0
  Signed-off-by: Chendi Xue <chendi.xue@intel.com>
* [NSE-207] fix issues found from aggregate unit tests (#233)
* fix incorrect input in Expand
* fix empty input for aggregate
* fix only result expressions
* fix empty aggregate expressions
* fix res attr not found issue
* refine
* fix count distinct with null
* fix groupby of NaN, -0.0 and 0.0
* fix count on mutiple cols with null in WSCG
* format code
* support normalize NaN and 0.0
* revert and update
* support normalize function in WSCG
* [NSE-206]Update documents and License for 1.1.0 (#292)
* [NSE-206]Update documents and remove duplicate parts
* Modify documents by comments
* [NSE-293] fix unsafemap with key = '0' (#294)
  Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>
* [NSE-257] fix multiple slf4j bindings (#291)
* [NSE-297] Disable incremental compiler in GHA CI (#298)
  Closes #297
* [NSE-285] ColumnarWindow: Support Date input in MAX/MIN (#286)
  Closes #285
* [NSE-304] Upgrade to Arrow 4.0.0: Change basic GHA TPC-H test target OAP Arrow branch (#306)
* [NSE-302] remove exception (#303)
* [NSE-273] support spark311 (#272)
* support spark 3.0.2
  Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>
* update to use spark 302 in unit tests
  Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>
* support spark 311
  Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>
* fix
  Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>
* fix missing dep
  Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>
* fix broadcastexchange metrics
  Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>
* fix arrow data source
  Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>
* fix sum with decimal
  Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>
* fix c++ code
  Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>
* adding partial sum decimal sum
  Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>
* fix hashagg in wscg
  Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>
* fix partial sum with number type
  Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>
* fix AQE shuffle copy
  Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>
* fix shuffle redudant reat
  Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>
* fix rebase
  Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>
* fix format
  Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>
* avoid unecessary fallbacks
  Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>
* on-demand scala unit tests
  Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>
* clean up
  Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>
* [NSE-311] Build reports errors (#312)
  Closes #311
* [NSE-257] fix the dependency issue on v2
  Co-authored-by: Rui Mo <rui.mo@intel.com>
  Co-authored-by: Hongze Zhang <hongze.zhang@intel.com>
  Co-authored-by: JiaKe <ke.a.jia@intel.com>
  Co-authored-by: Wei-Ting Chen <weiting.chen@intel.com>
  Co-authored-by: Chendi.Xue <chendi.xue@intel.com>
  Co-authored-by: Hong <hong2.wang@intel.com>
What changes were proposed in this pull request?
related #273
supports Spark 3.1.1
notable changes:
How was this patch tested?
locally verified