This repository has been archived by the owner on Apr 3, 2023. It is now read-only.

djo10/spark-alchemy-wrapper

DEPRECATED

spark-alchemy-wrapper

This is a helper for registering Spark Alchemy functions in PySpark.

To build the wrapper jar:

sbt clean assembly

To include the jar in PySpark:

pyspark --jars spark-alchemy-wrapper.jar

To register the Spark Alchemy HLL functions and use them in Spark SQL from Python:

sc._jvm.com.spark.alchemy.wrapper.PythonHelper.registerHllFunctions(spark._jsparkSession)
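The registration call above can be wrapped in a small helper so that it only needs a SparkSession. This is a minimal sketch, not part of the wrapper itself: the function name `register_hll_functions` is hypothetical, and it assumes the jar was passed via `--jars` so that `PythonHelper` is on the JVM classpath. It reaches the JVM through `spark.sparkContext._jvm`, which is the same gateway as `sc._jvm` in the pyspark shell.

```python
def register_hll_functions(spark):
    """Register the Spark Alchemy HLL functions on an existing SparkSession.

    Hypothetical convenience wrapper around the one-liner shown above.
    `_jvm` and `_jsparkSession` are private PySpark attributes, so this
    may need adjusting across PySpark versions.
    """
    # Py4J gateway to the driver JVM (equivalent to sc._jvm in the shell).
    jvm = spark.sparkContext._jvm
    # Scala helper object shipped in spark-alchemy-wrapper.jar.
    helper = jvm.com.spark.alchemy.wrapper.PythonHelper
    # Pass the underlying Java SparkSession, as in the original call.
    helper.registerHllFunctions(spark._jsparkSession)
```

In a pyspark shell started with the jar, `register_hll_functions(spark)` would then replace the longer call.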

Example

df = spark.range(100000)
df.createOrReplaceTempView("df")
spark.sql("select hll_cardinality(hll_init_agg(id)) as count from df").show()
+-----+
|count|
+-----+
|98093|
+-----+
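The count above is an estimate rather than an exact distinct count, since `spark.range(100000)` produces 100000 distinct ids. As a quick check on the numbers shown in this example only, the relative error can be worked out as follows (the variable names are illustrative):

```python
exact = 100_000    # spark.range(100000) yields ids 0..99999, all distinct
estimate = 98_093  # value shown in the example output above

# Relative error of the HLL estimate for this particular run.
relative_error = abs(estimate - exact) / exact
print(f"relative error: {relative_error:.2%}")
```

For this run the estimate is about 1.9% below the exact count.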
