diff --git a/docs/source/api/raster-functions.rst b/docs/source/api/raster-functions.rst index 87af5c46e..bdad114e9 100644 --- a/docs/source/api/raster-functions.rst +++ b/docs/source/api/raster-functions.rst @@ -190,6 +190,7 @@ rst_combineavg The output raster will have the same pixel type as the input rasters. The output raster will have the same pixel size as the input rasters. The output raster will have the same coordinate reference system as the input rasters. + Also, see :doc:`rst_combineavg_agg ` function. :param tiles: A column containing an array of raster tiles. :type tiles: Column (ArrayType(RasterTileType)) @@ -229,58 +230,6 @@ rst_combineavg | {index_id: 593308294097928191, raster: [00 01 10 ... 00], parentPath: "dbfs:/path_to_file", driver: "NetCDF" } | +----------------------------------------------------------------------------------------------------------------+ -rst_combineavgagg -***************** - -.. function:: rst_combineavgagg(tile) - - Combines a group by statement over aggregated raster tiles by averaging the pixel values. - The rasters must have the same extent, number of bands, and pixel type. - The rasters must have the same pixel size and coordinate reference system. - The output raster will have the same extent as the input rasters. - The output raster will have the same number of bands as the input rasters. - The output raster will have the same pixel type as the input rasters. - The output raster will have the same pixel size as the input rasters. - The output raster will have the same coordinate reference system as the input rasters. - - :param tile: A grouped column containing raster tiles. - :type tile: Column (RasterTileType) - :rtype: Column: RasterTileType - - :example: - -.. tabs:: - .. 
code-tab:: py - - df.groupBy()\ - .agg(mos.rst_combineavgagg("tile").limit(1).display() - +----------------------------------------------------------------------------------------------------------------+ - | rst_combineavgagg(tile) | - +----------------------------------------------------------------------------------------------------------------+ - | {index_id: 593308294097928191, raster: [00 01 10 ... 00], parentPath: "dbfs:/path_to_file", driver: "NetCDF" } | - +----------------------------------------------------------------------------------------------------------------+ - - .. code-tab:: scala - - df.groupBy() - .agg(rst_combineavgagg(col("tile")).limit(1).show - +----------------------------------------------------------------------------------------------------------------+ - | rst_combineavgagg(tile) | - +----------------------------------------------------------------------------------------------------------------+ - | {index_id: 593308294097928191, raster: [00 01 10 ... 00], parentPath: "dbfs:/path_to_file", driver: "NetCDF" } | - +----------------------------------------------------------------------------------------------------------------+ - - .. code-tab:: sql - - SELECT rst_combineavgagg(tile) - FROM table - GROUP BY 1 - +----------------------------------------------------------------------------------------------------------------+ - | rst_combineavgagg(tile) | - +----------------------------------------------------------------------------------------------------------------+ - | {index_id: 593308294097928191, raster: [00 01 10 ... 00], parentPath: "dbfs:/path_to_file", driver: "NetCDF" } | - +----------------------------------------------------------------------------------------------------------------+ - rst_derivedband ************** @@ -295,6 +244,7 @@ rst_derivedband The output raster will have the same pixel type as the input rasters. The output raster will have the same pixel size as the input rasters. 
The output raster will have the same coordinate reference system as the input rasters. + Also, see :doc:`rst_derivedband_agg ` function. :param tiles: A column containing an array of raster tiles. :type tiles: Column (ArrayType(RasterTileType)) @@ -364,96 +314,6 @@ rst_derivedband +----------------------------------------------------------------------------------------------------------------+ -rst_derivedbandagg -***************** - -.. function:: rst_derivedbandagg(tile, python_func, func_name) - - Combines a group by statement over aggregated raster tiles by using the provided python function. - The rasters must have the same extent, number of bands, and pixel type. - The rasters must have the same pixel size and coordinate reference system. - The output raster will have the same extent as the input rasters. - The output raster will have the same number of bands as the input rasters. - The output raster will have the same pixel type as the input rasters. - The output raster will have the same pixel size as the input rasters. - The output raster will have the same coordinate reference system as the input rasters. - - :param tile: A grouped column containing raster tile(s). - :type tile: Column (RasterTileType) - :param python_func: A function to evaluate in python. - :type python_func: Column (StringType) - :param func_name: name of the function to evaluate in python. - :type func_name: Column (StringType) - :rtype: Column: RasterTileType - - :example: - -.. tabs:: - .. 
code-tab:: py - from textwrap import dedent - df\ - .select( - "date", "tile", - F.lit(dedent( - """ - import numpy as np - def average(in_ar, out_ar, xoff, yoff, xsize, ysize, raster_xsize, raster_ysize, buf_radius, gt, **kwargs): - out_ar[:] = np.sum(in_ar, axis=0) / len(in_ar) - """)).alias("py_func1"), - F.lit("average").alias("func1_name") - )\ - .groupBy("date", "py_func1", "func1_name")\ - .agg(mos.rst_derivedbandagg("tile","py_func1","func1_name")).limit(1).display() - +----------------------------------------------------------------------------------------------------------------+ - | rst_derivedbandagg(tile,py_func1,func1_name) | - +----------------------------------------------------------------------------------------------------------------+ - | {index_id: 593308294097928191, raster: [00 01 10 ... 00], parentPath: "dbfs:/path_to_file", driver: "NetCDF" } | - +----------------------------------------------------------------------------------------------------------------+ - - .. code-tab:: scala - - df - .select( - "date", "tile" - lit( - """ - |import numpy as np - |def average(in_ar, out_ar, xoff, yoff, xsize, ysize, raster_xsize, raster_ysize, buf_radius, gt, **kwargs): - | out_ar[:] = np.sum(in_ar, axis=0) / len(in_ar) - |""".stripMargin).as("py_func1"), - lit("average").as("func1_name") - ) - .groupBy("date", "py_func1", "func1_name") - .agg(mos.rst_derivedbandagg("tile","py_func1","func1_name")).limit(1).show - +----------------------------------------------------------------------------------------------------------------+ - | rst_derivedbandagg(tile,py_func1,func1_name) | - +----------------------------------------------------------------------------------------------------------------+ - | {index_id: 593308294097928191, raster: [00 01 10 ... 00], parentPath: "dbfs:/path_to_file", driver: "NetCDF" } | - +----------------------------------------------------------------------------------------------------------------+ - - .. 
code-tab:: sql - SELECT - date, py_func1, func1_name, - rst_derivedbandagg(tile, py_func1, func1_name) - FROM SELECT ( - date, tile, - """ - import numpy as np - def average(in_ar, out_ar, xoff, yoff, xsize, ysize, raster_xsize, raster_ysize, buf_radius, gt, **kwargs): - out_ar[:] = np.sum(in_ar, axis=0) / len(in_ar) - """ as py_func1, - "average" as func1_name - FROM table - ) - GROUP BY date, py_func1, func1_name - LIMIT 1 - +----------------------------------------------------------------------------------------------------------------+ - | rst_derivedbandagg(tile,py_func1,func1_name) | - +----------------------------------------------------------------------------------------------------------------+ - | {index_id: 593308294097928191, raster: [00 01 10 ... 00], parentPath: "dbfs:/path_to_file", driver: "NetCDF" } | - +----------------------------------------------------------------------------------------------------------------+ - - rst_frombands ************** @@ -527,6 +387,7 @@ rst_fromcontent .. tabs:: .. code-tab:: py + # binary is python bytearray data type df = spark.read.format("binaryFile")\ .load("dbfs:/FileStore/geospatial/mosaic/sample_raster_data/binary/netcdf-coral")\ @@ -538,6 +399,7 @@ rst_fromcontent +----------------------------------------------------------------------------------------------------------------+ .. code-tab:: scala + //binary is scala/java Array(Byte) data type val df = spark.read .format("binaryFile") @@ -910,9 +772,12 @@ rst_mapalgebra Here are examples of the json_spec': (1) shows default indexing, (2) shows reusing an index, and (3) shows band indexing. - (1) '{"calc": "A+B/C"}' - (2) '{"calc": "A+B/C", "A_index": 0, "B_index": 1, "C_index": 1}' - (3) '{"calc": "A+B/C", "A_index": 0, "B_index": 1, "C_index": 2, "A_band": 1, "B_band": 1, "C_band": 1}' + + .. 
code-block:: text + + (1) '{"calc": "A+B/C"}' + (2) '{"calc": "A+B/C", "A_index": 0, "B_index": 1, "C_index": 1}' + (3) '{"calc": "A+B/C", "A_index": 0, "B_index": 1, "C_index": 2, "A_band": 1, "B_band": 1, "C_band": 1}' :param tile: A column containing the raster tile. :type tile: Column (RasterTileType) @@ -1011,6 +876,7 @@ rst_merge The output raster will have the same pixel type as the input rasters. The output raster will have the same pixel size as the highest resolution input rasters. The output raster will have the same coordinate reference system as the input rasters. + Also, see :doc:`rst_merge_agg ` function. :param tiles: A column containing an array of raster tiles. :type tiles: Column (ArrayType(RasterTileType)) @@ -1048,63 +914,6 @@ rst_merge | {index_id: 593308294097928191, raster: [00 01 10 ... 00], parentPath: "dbfs:/path_to_file", driver: "NetCDF" } | +----------------------------------------------------------------------------------------------------------------+ -rst_mergeagg -************ - -.. function:: rst_mergeagg(tile) - - Combines a grouped aggregate of raster tiles into a single raster. - The rasters do not need to have the same extent. - The rasters must have the same coordinate reference system. - The rasters are combined using gdalwarp. - The noData value needs to be initialised; if not, the non valid pixels may introduce artifacts in the output raster. - The rasters are stacked in the order they are provided. - This order is randomized since this is an aggregation function. - If the order of rasters is important please first collect rasters and sort them by metadata information and then use - rst_merge function. - The output raster will have the extent covering all input rasters. - The output raster will have the same number of bands as the input rasters. - The output raster will have the same pixel type as the input rasters. - The output raster will have the same pixel size as the highest resolution input rasters. 
- The output raster will have the same coordinate reference system as the input rasters. - - :param tile: A column containing raster tiles. - :type tile: Column (RasterTileType) - :rtype: Column: RasterTileType - - :example: - -.. tabs:: - .. code-tab:: py - - df.groupBy("date")\ - .agg(mos.rst_mergeagg("tile")).limit(1).display() - +----------------------------------------------------------------------------------------------------------------+ - | rst_mergeagg(tile) | - +----------------------------------------------------------------------------------------------------------------+ - | {index_id: 593308294097928191, raster: [00 01 10 ... 00], parentPath: "dbfs:/path_to_file", driver: "NetCDF" } | - +----------------------------------------------------------------------------------------------------------------+ - - .. code-tab:: scala - - df.groupBy("date") - .agg(rst_mergeagg(col("tile"))).limit(1).show - +----------------------------------------------------------------------------------------------------------------+ - | rst_mergeagg(tile) | - +----------------------------------------------------------------------------------------------------------------+ - | {index_id: 593308294097928191, raster: [00 01 10 ... 00], parentPath: "dbfs:/path_to_file", driver: "NetCDF" } | - +----------------------------------------------------------------------------------------------------------------+ - - .. code-tab:: sql - - SELECT rst_mergeagg(tile) - FROM table - GROUP BY date - +----------------------------------------------------------------------------------------------------------------+ - | rst_mergeagg(tile) | - +----------------------------------------------------------------------------------------------------------------+ - | {index_id: 593308294097928191, raster: [00 01 10 ... 
00], parentPath: "dbfs:/path_to_file", driver: "NetCDF" } | - +----------------------------------------------------------------------------------------------------------------+ rst_metadata ************* diff --git a/docs/source/api/spatial-aggregations.rst b/docs/source/api/spatial-aggregations.rst index 4f8d3a0ca..9f806fec9 100644 --- a/docs/source/api/spatial-aggregations.rst +++ b/docs/source/api/spatial-aggregations.rst @@ -2,11 +2,214 @@ Spatial aggregation functions ============================= +rst_combineavg_agg +***************** + +.. function:: rst_combineavg_agg(tile) + + Combines a group by statement over aggregated raster tiles by averaging the pixel values. + The rasters must have the same extent, number of bands, and pixel type. + The rasters must have the same pixel size and coordinate reference system. + The output raster will have the same extent as the input rasters. + The output raster will have the same number of bands as the input rasters. + The output raster will have the same pixel type as the input rasters. + The output raster will have the same pixel size as the input rasters. + The output raster will have the same coordinate reference system as the input rasters. + + :param tile: A grouped column containing raster tiles. + :type tile: Column (RasterTileType) + :rtype: Column: RasterTileType + + :example: + +.. tabs:: + .. code-tab:: py + + df.groupBy()\ + .agg(mos.rst_combineavg_agg("tile").limit(1).display() + +----------------------------------------------------------------------------------------------------------------+ + | rst_combineavg_agg(tile) | + +----------------------------------------------------------------------------------------------------------------+ + | {index_id: 593308294097928191, raster: [00 01 10 ... 00], parentPath: "dbfs:/path_to_file", driver: "NetCDF" } | + +----------------------------------------------------------------------------------------------------------------+ + + .. 
code-tab:: scala + + df.groupBy() + .agg(rst_combineavg_agg(col("tile")).limit(1).show + +----------------------------------------------------------------------------------------------------------------+ + | rst_combineavg_agg(tile) | + +----------------------------------------------------------------------------------------------------------------+ + | {index_id: 593308294097928191, raster: [00 01 10 ... 00], parentPath: "dbfs:/path_to_file", driver: "NetCDF" } | + +----------------------------------------------------------------------------------------------------------------+ + + .. code-tab:: sql + + SELECT rst_combineavg_agg(tile) + FROM table + GROUP BY 1 + +----------------------------------------------------------------------------------------------------------------+ + | rst_combineavg_agg(tile) | + +----------------------------------------------------------------------------------------------------------------+ + | {index_id: 593308294097928191, raster: [00 01 10 ... 00], parentPath: "dbfs:/path_to_file", driver: "NetCDF" } | + +----------------------------------------------------------------------------------------------------------------+ + + +rst_derivedband_agg +***************** + +.. function:: rst_derivedband_agg(tile, python_func, func_name) + + Combines a group by statement over aggregated raster tiles by using the provided python function. + The rasters must have the same extent, number of bands, and pixel type. + The rasters must have the same pixel size and coordinate reference system. + The output raster will have the same extent as the input rasters. + The output raster will have the same number of bands as the input rasters. + The output raster will have the same pixel type as the input rasters. + The output raster will have the same pixel size as the input rasters. + The output raster will have the same coordinate reference system as the input rasters. + + :param tile: A grouped column containing raster tile(s). 
+ :type tile: Column (RasterTileType) + :param python_func: A function to evaluate in python. + :type python_func: Column (StringType) + :param func_name: name of the function to evaluate in python. + :type func_name: Column (StringType) + :rtype: Column: RasterTileType + + :example: + +.. tabs:: + .. code-tab:: py + + from textwrap import dedent + df\ + .select( + "date", "tile", + F.lit(dedent( + """ + import numpy as np + def average(in_ar, out_ar, xoff, yoff, xsize, ysize, raster_xsize, raster_ysize, buf_radius, gt, **kwargs): + out_ar[:] = np.sum(in_ar, axis=0) / len(in_ar) + """)).alias("py_func1"), + F.lit("average").alias("func1_name") + )\ + .groupBy("date", "py_func1", "func1_name")\ + .agg(mos.rst_derivedband_agg("tile","py_func1","func1_name")).limit(1).display() + +----------------------------------------------------------------------------------------------------------------+ + | rst_derivedband_agg(tile,py_func1,func1_name) | + +----------------------------------------------------------------------------------------------------------------+ + | {index_id: 593308294097928191, raster: [00 01 10 ... 00], parentPath: "dbfs:/path_to_file", driver: "NetCDF" } | + +----------------------------------------------------------------------------------------------------------------+ + + .. 
code-tab:: scala + + df + .select( + "date", "tile" + lit( + """ + |import numpy as np + |def average(in_ar, out_ar, xoff, yoff, xsize, ysize, raster_xsize, raster_ysize, buf_radius, gt, **kwargs): + | out_ar[:] = np.sum(in_ar, axis=0) / len(in_ar) + |""".stripMargin).as("py_func1"), + lit("average").as("func1_name") + ) + .groupBy("date", "py_func1", "func1_name") + .agg(mos.rst_derivedband_agg("tile","py_func1","func1_name")).limit(1).show + +----------------------------------------------------------------------------------------------------------------+ + | rst_derivedband_agg(tile,py_func1,func1_name) | + +----------------------------------------------------------------------------------------------------------------+ + | {index_id: 593308294097928191, raster: [00 01 10 ... 00], parentPath: "dbfs:/path_to_file", driver: "NetCDF" } | + +----------------------------------------------------------------------------------------------------------------+ + + .. code-tab:: sql + + SELECT + date, py_func1, func1_name, + rst_derivedband_agg(tile, py_func1, func1_name) + FROM SELECT ( + date, tile, + """ + import numpy as np + def average(in_ar, out_ar, xoff, yoff, xsize, ysize, raster_xsize, raster_ysize, buf_radius, gt, **kwargs): + out_ar[:] = np.sum(in_ar, axis=0) / len(in_ar) + """ as py_func1, + "average" as func1_name + FROM table + ) + GROUP BY date, py_func1, func1_name + LIMIT 1 + +----------------------------------------------------------------------------------------------------------------+ + | rst_derivedband_agg(tile,py_func1,func1_name) | + +----------------------------------------------------------------------------------------------------------------+ + | {index_id: 593308294097928191, raster: [00 01 10 ... 00], parentPath: "dbfs:/path_to_file", driver: "NetCDF" } | + +----------------------------------------------------------------------------------------------------------------+ + + +rst_merge_agg +************ + +.. 
function:: rst_merge_agg(tile) + + Combines a grouped aggregate of raster tiles into a single raster. + The rasters do not need to have the same extent. + The rasters must have the same coordinate reference system. + The rasters are combined using gdalwarp. + The noData value needs to be initialised; if not, the non valid pixels may introduce artifacts in the output raster. + The rasters are stacked in the order they are provided. + This order is randomized since this is an aggregation function. + If the order of rasters is important please first collect rasters and sort them by metadata information and then use + rst_merge function. + The output raster will have the extent covering all input rasters. + The output raster will have the same number of bands as the input rasters. + The output raster will have the same pixel type as the input rasters. + The output raster will have the same pixel size as the highest resolution input rasters. + The output raster will have the same coordinate reference system as the input rasters. + + :param tile: A column containing raster tiles. + :type tile: Column (RasterTileType) + :rtype: Column: RasterTileType + + :example: + +.. tabs:: + .. code-tab:: py + + df.groupBy("date")\ + .agg(mos.rst_merge_agg("tile")).limit(1).display() + +----------------------------------------------------------------------------------------------------------------+ + | rst_merge_agg(tile) | + +----------------------------------------------------------------------------------------------------------------+ + | {index_id: 593308294097928191, raster: [00 01 10 ... 00], parentPath: "dbfs:/path_to_file", driver: "NetCDF" } | + +----------------------------------------------------------------------------------------------------------------+ + + .. 
code-tab:: scala + + df.groupBy("date") + .agg(rst_merge_agg(col("tile"))).limit(1).show + +----------------------------------------------------------------------------------------------------------------+ + | rst_merge_agg(tile) | + +----------------------------------------------------------------------------------------------------------------+ + | {index_id: 593308294097928191, raster: [00 01 10 ... 00], parentPath: "dbfs:/path_to_file", driver: "NetCDF" } | + +----------------------------------------------------------------------------------------------------------------+ + + .. code-tab:: sql + + SELECT rst_merge_agg(tile) + FROM table + GROUP BY date + +----------------------------------------------------------------------------------------------------------------+ + | rst_merge_agg(tile) | + +----------------------------------------------------------------------------------------------------------------+ + | {index_id: 593308294097928191, raster: [00 01 10 ... 00], parentPath: "dbfs:/path_to_file", driver: "NetCDF" } | + +----------------------------------------------------------------------------------------------------------------+ + st_intersects_aggregate *********************** -.. function:: st_intersects_aggregate(leftIndex, rightIndex) +.. function:: st_intersects_agg(leftIndex, rightIndex) Returns `true` if any of the `leftIndex` and `rightIndex` pairs intersect. 
@@ -33,10 +236,10 @@ st_intersects_aggregate left_df .join(right_df, col("left_index.index_id") == col("right_index.index_id")) .groupBy() - .agg(st_intersects_aggregate(col("left_index"), col("right_index"))) + .agg(st_intersects_agg(col("left_index"), col("right_index"))) ).show(1, False) +------------------------------------------------+ - |st_intersects_aggregate(left_index, right_index)| + |st_intersects_agg(left_index, right_index)| +------------------------------------------------+ |true | +------------------------------------------------+ @@ -50,10 +253,10 @@ st_intersects_aggregate leftDf .join(rightDf, $"left_index.index_id" === $"right_index.index_id") .groupBy() - .agg(st_intersects_aggregate($"left_index", $"right_index")) + .agg(st_intersects_agg($"left_index", $"right_index")) .show(false) +------------------------------------------------+ - |st_intersects_aggregate(left_index, right_index)| + |st_intersects_agg(left_index, right_index)| +------------------------------------------------+ |true | +------------------------------------------------+ @@ -62,10 +265,10 @@ st_intersects_aggregate WITH l AS (SELECT grid_tessellateexplode("POLYGON ((0 0, 0 3, 3 3, 3 0))", 1) AS left_index), r AS (SELECT grid_tessellateexplode("POLYGON ((2 2, 2 4, 4 4, 4 2))", 1) AS right_index) - SELECT st_intersects_aggregate(l.left_index, r.right_index) + SELECT st_intersects_agg(l.left_index, r.right_index) FROM l INNER JOIN r on l.left_index.index_id = r.right_index.index_id +------------------------------------------------+ - |st_intersects_aggregate(left_index, right_index)| + |st_intersects_agg(left_index, right_index)| +------------------------------------------------+ |true | +------------------------------------------------+ @@ -83,20 +286,20 @@ st_intersects_aggregate showDF( select( join(df.l, df.r, df.l$left_index.index_id == df.r$right_index.index_id), - st_intersects_aggregate(column("left_index"), column("right_index")) + 
st_intersects_agg(column("left_index"), column("right_index")) ), truncate=F ) +------------------------------------------------+ - |st_intersects_aggregate(left_index, right_index)| + |st_intersects_agg(left_index, right_index)| +------------------------------------------------+ |true | +------------------------------------------------+ -st_intersection_aggregate +st_intersection_agg ************************* -.. function:: st_intersection_aggregate(leftIndex, rightIndex) +.. function:: st_intersection_agg(leftIndex, rightIndex) Computes the intersections of `leftIndex` and `rightIndex` and returns the union of these intersections. @@ -123,10 +326,10 @@ st_intersection_aggregate left_df .join(right_df, col("left_index.index_id") == col("right_index.index_id")) .groupBy() - .agg(st_astext(st_intersection_aggregate(col("left_index"), col("right_index")))) + .agg(st_astext(st_intersection_agg(col("left_index"), col("right_index")))) ).show(1, False) +--------------------------------------------------------------+ - |convert_to(st_intersection_aggregate(left_index, right_index))| + |convert_to(st_intersection_agg(left_index, right_index))| +--------------------------------------------------------------+ |POLYGON ((2 2, 3 2, 3 3, 2 3, 2 2)) | +--------------------------------------------------------------+ @@ -140,10 +343,10 @@ st_intersection_aggregate leftDf .join(rightDf, $"left_index.index_id" === $"right_index.index_id") .groupBy() - .agg(st_astext(st_intersection_aggregate($"left_index", $"right_index"))) + .agg(st_astext(st_intersection_agg($"left_index", $"right_index"))) .show(false) +--------------------------------------------------------------+ - |convert_to(st_intersection_aggregate(left_index, right_index))| + |convert_to(st_intersection_agg(left_index, right_index))| +--------------------------------------------------------------+ |POLYGON ((2 2, 3 2, 3 3, 2 3, 2 2)) | +--------------------------------------------------------------+ @@ -152,10 +355,10 @@ 
st_intersection_aggregate WITH l AS (SELECT grid_tessellateexplode("POLYGON ((0 0, 0 3, 3 3, 3 0))", 1) AS left_index), r AS (SELECT grid_tessellateexplode("POLYGON ((2 2, 2 4, 4 4, 4 2))", 1) AS right_index) - SELECT st_astext(st_intersection_aggregate(l.left_index, r.right_index)) + SELECT st_astext(st_intersection_agg(l.left_index, r.right_index)) FROM l INNER JOIN r on l.left_index.index_id = r.right_index.index_id +--------------------------------------------------------------+ - |convert_to(st_intersection_aggregate(left_index, right_index))| + |convert_to(st_intersection_agg(left_index, right_index))| +--------------------------------------------------------------+ |POLYGON ((2 2, 3 2, 3 3, 2 3, 2 2)) | +--------------------------------------------------------------+ @@ -173,11 +376,11 @@ st_intersection_aggregate showDF( select( join(df.l, df.r, df.l$left_index.index_id == df.r$right_index.index_id), - st_astext(st_intersection_aggregate(column("left_index"), column("right_index"))) + st_astext(st_intersection_agg(column("left_index"), column("right_index"))) ), truncate=F ) +--------------------------------------------------------------+ - |convert_to(st_intersection_aggregate(left_index, right_index))| + |convert_to(st_intersection_agg(left_index, right_index))| +--------------------------------------------------------------+ |POLYGON ((2 2, 3 2, 3 3, 2 3, 2 2)) | +--------------------------------------------------------------+ diff --git a/docs/source/api/spatial-functions.rst b/docs/source/api/spatial-functions.rst index 2e02fb3d6..350756e48 100644 --- a/docs/source/api/spatial-functions.rst +++ b/docs/source/api/spatial-functions.rst @@ -929,6 +929,7 @@ st_intersection .. function:: st_intersection(geom1, geom2) Returns a geometry representing the intersection of `left_geom` and `right_geom`. + Also, see :doc:`st_intersection_agg ` function. :param geom1: Geometry :type geom1: Column @@ -1665,6 +1666,7 @@ st_union .. 
function:: st_union(left_geom, right_geom) Returns the point set union of the input geometries. + Also, see :doc:`st_union_agg ` function. :param left_geom: Geometry :type left_geom: Column diff --git a/docs/source/api/spatial-indexing.rst b/docs/source/api/spatial-indexing.rst index 0ea059cb4..ba0a91132 100644 --- a/docs/source/api/spatial-indexing.rst +++ b/docs/source/api/spatial-indexing.rst @@ -857,7 +857,8 @@ grid_cell_intersection .. function:: grid_cell_intersection(left_chip, right_chip) - Returns the chip representing the intersection of two chips based on the same grid cell + Returns the chip representing the intersection of two chips based on the same grid cell. + Also, see :doc:`grid_cell_intersection_agg ` function. :param left_chip: Chip :type left_chip: Column: ChipType(LongType) @@ -912,7 +913,8 @@ grid_cell_union .. function:: grid_cell_union(left_chip, right_chip) - Returns the chip representing the union of two chips based on the same grid cell + Returns the chip representing the union of two chips based on the same grid cell. + Also, see :doc:`grid_cell_union_agg ` function. :param left_chip: Chip :type left_chip: Column: ChipType(LongType) diff --git a/docs/source/api/spatial-predicates.rst b/docs/source/api/spatial-predicates.rst index 09fc6fa31..c1c3c8288 100644 --- a/docs/source/api/spatial-predicates.rst +++ b/docs/source/api/spatial-predicates.rst @@ -67,6 +67,7 @@ st_intersects .. function:: st_intersects(geom1, geom2) Returns true if the geometry `geom1` intersects `geom2`. + Also, see :doc:`st_intersects_agg ` function. 
:param geom1: Geometry :type geom1: Column diff --git a/docs/source/images/init_script.png b/docs/source/images/init_script.png index 335f19904..d141cd6c2 100644 Binary files a/docs/source/images/init_script.png and b/docs/source/images/init_script.png differ diff --git a/docs/source/usage/install-gdal.rst b/docs/source/usage/install-gdal.rst index 7e1b0c19b..f18b7eae8 100644 --- a/docs/source/usage/install-gdal.rst +++ b/docs/source/usage/install-gdal.rst @@ -31,8 +31,8 @@ the mos.setup_gdal() function. .. note:: (a) This is close in behavior to Mosaic < 0.4 series (prior to DBR 13), with new options to pip install Mosaic for either ubuntugis gdal (3.4.3) or jammy default (3.4.1). - (b) `to_fuse_dir` can be one of `/Volumes/..`, `/Workspace/..`, `/dbfs/..`; - however, you should consider `setup_fuse_install()` for Volume based installs as that + (b) 'to_fuse_dir' can be one of '/Volumes/..', '/Workspace/..', '/dbfs/..'; + however, you should consider 'setup_fuse_install()' for Volume based installs as that exposes more options, to include copying JAR and JNI Shared Objects. .. function:: setup_gdal() @@ -100,5 +100,8 @@ code at the top of the notebook: mos.enable_mosaic(spark, dbutils) mos.enable_gdal(spark) + +.. code-block:: text + GDAL enabled. - GDAL 3.4.3, released 2022/04/22 \ No newline at end of file + GDAL 3.4.1, released 2021/12/27 \ No newline at end of file