This is a helper for registering the spark-alchemy HyperLogLog (HLL) functions in PySpark. Build the wrapper jar, launch PySpark with it on the classpath, and register the functions through the Py4J gateway:
sbt clean assembly
pyspark --jars spark-alchemy-wrapper.jar
sc._jvm.com.spark.alchemy.wrapper.PythonHelper.registerHllFunctions(spark._jsparkSession)
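In a standalone script, the same call can be wrapped in a small function so registration happens once per SparkSession. A minimal sketch; register_hll_functions is our name for it, and only the PythonHelper call above comes from the wrapper:

from pyspark.sql import SparkSession

def register_hll_functions(spark: SparkSession) -> None:
    # Reach into the JVM through the Py4J gateway and register the
    # spark-alchemy HLL functions on this session's function catalog.
    spark.sparkContext._jvm.com.spark.alchemy.wrapper.PythonHelper \
        .registerHllFunctions(spark._jsparkSession)

spark = SparkSession.builder.getOrCreate()
register_hll_functions(spark)

Once registered, the functions are available in Spark SQL: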
df = spark.range(100000)
df.createOrReplaceTempView("df")
spark.sql("select hll_cardinality(hll_init_agg(id) as count from df").show()
+-----+
|count|
+-----+
|98093|
+-----+
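The count is approximate: hll_init_agg builds a HyperLogLog sketch and hll_cardinality estimates its cardinality. The payoff over a plain approx_count_distinct is that sketches can be stored and merged later without rescanning the raw data. A sketch of that pattern, assuming the wrapper also registers spark-alchemy's hll_merge (the bucketing and view names here are illustrative):

spark.sql("""
    select mod(id, 10) as bucket,
           hll_init_agg(id) as id_sketch
    from df
    group by mod(id, 10)
""").createOrReplaceTempView("sketches")

spark.sql("""
    select hll_cardinality(hll_merge(id_sketch)) as approx_distinct
    from sketches
""").show()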