Commit

Merge branch 'databrickslabs-main'
a0x8o committed Jul 26, 2023
2 parents a577b27 + 0e55386 commit 820dd23
Showing 46 changed files with 2,490 additions and 1,208 deletions.
1 change: 1 addition & 0 deletions .github/workflows/docs.yml
@@ -4,6 +4,7 @@ on:
branches:
- main
- feature/docs_fix
- feature/new_docs
jobs:
build:
runs-on: ubuntu-latest
5 changes: 4 additions & 1 deletion CHANGELOG.md
@@ -1,3 +1,6 @@
## v0.3.12
- Make JTS default Geometry Provider

## v0.3.11
- Update the CONTRIBUTING.md to follow the standard process.
- Fix for issue 383: grid_pointascellid fails with a Java type error when run on an already instantiated point.
@@ -172,4 +175,4 @@
- Add Geometry validity expressions
- Create WKT, WKB and Hex conversion expressions
- Setup the project
- Define GitHub templates
- Define GitHub templates
62 changes: 23 additions & 39 deletions LICENSE
@@ -1,41 +1,25 @@
DB license

Copyright (2022) Databricks, Inc.

This library (the "Software") may not be used except in connection with the Licensee's use of the Databricks Platform Services pursuant
to an Agreement (defined below) between Licensee (defined below) and Databricks, Inc. ("Databricks"). The Object Code version of the
Software shall be deemed part of the Downloadable Services under the Agreement, or if the Agreement does not define Downloadable Services,
Subscription Services, or if neither are defined then the term in such Agreement that refers to the applicable Databricks Platform
Services (as defined below) shall be substituted herein for “Downloadable Services.” Licensee's use of the Software must comply at
all times with any restrictions applicable to the Downlodable Services and Subscription Services, generally, and must be used in
accordance with any applicable documentation. For the avoidance of doubt, the Software constitutes Databricks Confidential Information
under the Agreement.

Additionally, and notwithstanding anything in the Agreement to the contrary:
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR
IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
* you may view, make limited copies of, and may compile the Source Code version of the Software into an Object Code version of the
Software. For the avoidance of doubt, you may not make derivative works of Software (or make any any changes to the Source Code
version of the unless you have agreed to separate terms with Databricks permitting such modifications (e.g., a contribution license
agreement)).

If you have not agreed to an Agreement or otherwise do not agree to these terms, you may not use the Software or view, copy or compile
the Source Code of the Software.

This license terminates automatically upon the termination of the Agreement or Licensee's breach of these terms. Additionally,
Databricks may terminate this license at any time on notice. Upon termination, you must permanently delete the Software and all
copies thereof (including the Source Code).

Agreement: the agreement between Databricks and Licensee governing the use of the Databricks Platform Services, which shall be, with
respect to Databricks, the Databricks Terms of Service located at www.databricks.com/termsofservice, and with respect to Databricks
Community Edition, the Community Edition Terms of Service located at www.databricks.com/ce-termsofuse, in each case unless Licensee
has entered into a separate written agreement with Databricks governing the use of the applicable Databricks Platform Services.

Databricks Platform Services: the Databricks services or the Databricks Community Edition services, according to where the Software is used.

Licensee: the user of the Software, or, if the Software is being used on behalf of a company, the company.

Object Code: is version of the Software produced when an interpreter or a compiler translates the Source Code into recognizable and
executable machine code.

Source Code: the human readable portion of the Software.
Definitions.

Agreement: The agreement between Databricks, Inc., and you governing the use of the Databricks Services, which shall be, with respect to Databricks, the Databricks Terms of Service located at www.databricks.com/termsofservice, and with respect to Databricks Community Edition, the Community Edition Terms of Service located at www.databricks.com/ce-termsofuse, in each case unless you have entered into a separate written agreement with Databricks governing the use of the applicable Databricks Services.

Software: The source code and object code to which this license applies.

Scope of Use. You may not use this Software except in connection with your use of the Databricks Services pursuant to the Agreement. Your use of the Software must comply at all times with any restrictions applicable to the Databricks Services, generally, and must be used in accordance with any applicable documentation. You may view, use, copy, modify, publish, and/or distribute the Software solely for the purposes of using the code within or connecting to the Databricks Services. If you do not agree to these terms, you may not view, use, copy, modify, publish, and/or distribute the Software.

Redistribution. You may redistribute and sublicense the Software so long as all use is in compliance with these terms. In addition:

You must give any other recipients a copy of this License;
You must cause any modified files to carry prominent notices stating that you changed the files;
You must retain, in the source code form of any derivative works that you distribute, all copyright, patent, trademark, and attribution notices from the source code form, excluding those notices that do not pertain to any part of the derivative works; and
If the source code form includes a "NOTICE" text file as part of its distribution, then any derivative works that you distribute must include a readable copy of the attribution notices contained within such NOTICE file, excluding those notices that do not pertain to any part of the derivative works.
You may add your own copyright statement to your modifications and may provide additional license terms and conditions for use, reproduction, or distribution of your modifications, or for any such derivative works as a whole, provided your use, reproduction, and distribution of the Software otherwise complies with the conditions stated in this License.

Termination. This license terminates automatically upon your breach of these terms or upon the termination of your Agreement. Additionally, Databricks may terminate this license at any time on notice. Upon termination, you must permanently delete the Software and all copies thereof.

DISCLAIMER; LIMITATION OF LIABILITY.

THE SOFTWARE IS PROVIDED “AS-IS” AND WITH ALL FAULTS. DATABRICKS, ON BEHALF OF ITSELF AND ITS LICENSORS, SPECIFICALLY DISCLAIMS ALL WARRANTIES RELATING TO THE SOURCE CODE, EXPRESS AND IMPLIED, INCLUDING, WITHOUT LIMITATION, IMPLIED WARRANTIES, CONDITIONS AND OTHER TERMS OF MERCHANTABILITY, SATISFACTORY QUALITY OR FITNESS FOR A PARTICULAR PURPOSE, AND NON-INFRINGEMENT. DATABRICKS AND ITS LICENSORS TOTAL AGGREGATE LIABILITY RELATING TO OR ARISING OUT OF YOUR USE OF OR DATABRICKS’ PROVISIONING OF THE SOURCE CODE SHALL BE LIMITED TO ONE THOUSAND ($1,000) DOLLARS. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
8 changes: 4 additions & 4 deletions R/sparkR-mosaic/enableMosaic.R
@@ -2,7 +2,7 @@
#'
#' @description enableMosaic activates the context dependent Databricks Mosaic functions, giving control over the geometry API and index system used.
#' See \url{https://databrickslabs.github.io/mosaic/} for full documentation
#' @param geometryAPI character, default="ESRI"
#' @param geometryAPI character, default="JTS"
#' @param indexSystem character, default="H3"
#' @param indexSystem boolean, default=F
#' @name enableMosaic
@@ -12,10 +12,10 @@
#' @examples
#' \dontrun{
#' enableMosaic()
#' enableMosaic("ESRI", "H3")
#' enableMosaic("ESRI", "BNG") }
#' enableMosaic("JTS", "H3")
#' enableMosaic("JTS", "BNG") }
enableMosaic <- function(
geometryAPI="ESRI"
geometryAPI="JTS"
,indexSystem="H3"
,rasterAPI="GDAL"
){
8 changes: 4 additions & 4 deletions R/sparklyr-mosaic/enableMosaic.R
@@ -3,7 +3,7 @@
#' @description enableMosaic activates the context dependent Databricks Mosaic functions, giving control over the geometry API and index system used.
#' See \url{https://databrickslabs.github.io/mosaic/} for full documentation
#' @param sc sparkContext
#' @param geometryAPI character, default="ESRI"
#' @param geometryAPI character, default="JTS"
#' @param indexSystem character, default="H3"
#' @name enableMosaic
#' @rdname enableMosaic
@@ -12,12 +12,12 @@
#' @examples
#' \dontrun{
#' enableMosaic()
#' enableMosaic("ESRI", "H3")
#' enableMosaic("ESRI", "BNG")}
#' enableMosaic("JTS", "H3")
#' enableMosaic("JTS", "BNG")}

enableMosaic <- function(
sc
,geometryAPI="ESRI"
,geometryAPI="JTS"
,indexSystem="H3"
,rasterAPI="GDAL"
){
24 changes: 15 additions & 9 deletions README.md
@@ -11,6 +11,7 @@ An extension to the [Apache Spark](https://spark.apache.org/) framework that all
[![Language grade: Python](https://img.shields.io/lgtm/grade/python/g/databrickslabs/mosaic.svg?logo=lgtm&logoWidth=18)](https://lgtm.com/projects/g/databrickslabs/mosaic/context:python)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)


## Why Mosaic?

Mosaic was created to simplify the implementation of scalable geospatial data pipelines by binding together common Open Source geospatial libraries via Apache Spark, with a set of [examples and best practices](#examples) for common geospatial use cases.
@@ -20,8 +21,8 @@ Mosaic was created to simplify the implementation of scalable geospatial data pi
Mosaic provides geospatial tools for
* Data ingestion (WKT, WKB, GeoJSON)
* Data processing
* Geometry and geography `ST_` operations (with [ESRI](https://github.com/Esri/geometry-api-java) or [JTS](https://github.com/locationtech/jts))
* Indexing (with [H3](https://github.com/uber/h3) or BNG)
* Geometry and geography `ST_` operations (with default [JTS](https://github.com/locationtech/jts) or [ESRI](https://github.com/Esri/geometry-api-java))
* Indexing (with default [H3](https://github.com/uber/h3) or BNG)
* Chipping of polygons and lines over an indexing grid [co-developed with Ordnance Survey and Microsoft](https://databricks.com/blog/2021/10/11/efficient-point-in-polygon-joins-via-pyspark-and-bng-geospatial-indexing.html)
* Data visualization ([Kepler](https://github.com/keplergl/kepler.gl))

@@ -41,11 +42,16 @@ Image1: Mosaic logical design.

## Getting started

Create a Databricks cluster running __Databricks Runtime 10.0__ (or later).
We recommend using Databricks Runtime versions 11.3 LTS or 12.2 LTS with Photon enabled; this will leverage the
Databricks H3 expressions when using H3 grid system.

:warning: **Mosaic 0.3 series does not support DBR 13** (coming soon with Mosaic 0.4 series); also, DBR 10 is no longer supported in Mosaic.

As of the 0.3.11 release, Mosaic issues the following warning when initialized on a cluster that is neither Photon Runtime nor Databricks Runtime ML [[ADB](https://learn.microsoft.com/en-us/azure/databricks/runtime/) | [AWS](https://docs.databricks.com/runtime/index.html) | [GCP](https://docs.gcp.databricks.com/runtime/index.html)]:

We recommend using Databricks Runtime versions 11.2 or higher with Photon enabled, this will leverage the
Databricks h3 expressions when using H3 grid system.
> DEPRECATION WARNING: Mosaic is not supported on the selected Databricks Runtime. Mosaic will stop working on this cluster from version v0.4.0+. Please use a Databricks Photon-enabled Runtime (for performance benefits) or Runtime ML (for spatial AI benefits).
If you are receiving this warning in v0.3.11, you will want to change to a supported runtime prior to updating Mosaic to run 0.4.0. The reason we are making this change is that we are streamlining Mosaic internals to be more aligned with future product APIs which are powered by Photon. Along this direction of change, Mosaic will be standardizing to JTS as its default and supported Vector Geometry Provider.
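
To illustrate the runtime requirement described above, here is a minimal sketch of how one might classify a runtime version string as Photon or ML before upgrading. The `RuntimeCheck` object, the `isSupportedRuntime` helper, and the assumed version-string layout (for example `11.3.x-photon-scala2.12`) are hypothetical and are not part of Mosaic's API; the check Mosaic performs internally may work differently.

```scala
// Hypothetical helper, not part of Mosaic: classifies a Databricks Runtime
// version string as supported (Photon or ML) or unsupported.
object RuntimeCheck {

  // Assumption: the version string is a dash-separated list of tokens,
  // e.g. "11.3.x-photon-scala2.12" or "12.2.x-cpu-ml-scala2.12".
  def isSupportedRuntime(runtimeVersion: String): Boolean = {
    val tokens = runtimeVersion.toLowerCase.split("-").toSet
    // Supported when the runtime is Photon-enabled or an ML runtime.
    tokens.contains("photon") || tokens.contains("ml")
  }
}
```

Matching whole dash-separated tokens, rather than searching for a substring, avoids false positives from unrelated tokens that merely contain the letters `ml`.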

### Documentation

@@ -75,9 +81,9 @@ Then enable it with
```scala
import com.databricks.labs.mosaic.functions.MosaicContext
import com.databricks.labs.mosaic.H3
import com.databricks.labs.mosaic.ESRI
import com.databricks.labs.mosaic.JTS

val mosaicContext = MosaicContext.build(H3, ESRI)
val mosaicContext = MosaicContext.build(H3, JTS)
import mosaicContext.functions._
```

@@ -103,9 +109,9 @@ Configure the [Automatic SQL Registration](https://databrickslabs.github.io/mosa
%scala
import com.databricks.labs.mosaic.functions.MosaicContext
import com.databricks.labs.mosaic.H3
import com.databricks.labs.mosaic.ESRI
import com.databricks.labs.mosaic.JTS

val mosaicContext = MosaicContext.build(H3, ESRI)
val mosaicContext = MosaicContext.build(H3, JTS)
mosaicContext.register(spark)
```

4 changes: 2 additions & 2 deletions docs/code-example-notebooks/setup/setup-scala.scala
@@ -1,10 +1,10 @@
// Databricks notebook source
import org.apache.spark.sql.functions._
import com.databricks.labs.mosaic.functions.MosaicContext
import com.databricks.labs.mosaic.ESRI
import com.databricks.labs.mosaic.JTS
import com.databricks.labs.mosaic.H3

val mosaicContext: MosaicContext = MosaicContext.build(H3, ESRI)
val mosaicContext: MosaicContext = MosaicContext.build(H3, JTS)

// COMMAND ----------

3 changes: 2 additions & 1 deletion docs/docs-requirements.txt
@@ -5,4 +5,5 @@ ipython>=8.10.1
sphinxcontrib-fulltoc==1.2.0
livereload==2.6.3
autodocsumm==0.2.7
sphinx-tabs==3.2.0
sphinx-tabs==3.2.0
renku-sphinx-theme==0.2.3
104 changes: 104 additions & 0 deletions docs/source/_static/css/custom.css
@@ -19,4 +19,108 @@
display: flex;
flex-wrap: wrap;
justify-content: center;
}

.mosaic-logo {
display: block;
margin-left: 10%;
margin-right: 10%;
margin-bottom: 2%;
width: 80%;
}

.wy-nav-content {
max-width: 70% !important;
}

.icon-home .logo {
filter: invert(66%) sepia(76%) saturate(752%) hue-rotate(317deg) brightness(95%) contrast(94%);
}

.package-health {
margin-bottom: 2%;
width: fit-content;
margin-left: auto;
margin-right: auto;
}

li {
list-style: disc;
}

.output_area img {
height: 20px;
}

.video-tile {
width: 440px;
float: left;
padding: 10px;
background: rgba(251, 181, 58, 0.46);
margin-right: 20px;
margin-top: 20px;
}

.video-tile p {
margin: 0 0 5px 0;
}

.video-title {
height: 82px;
}

.video-description {
padding: 10px;
height: 300px;
overflow: auto;
margin-bottom: 20px;
}

.video-tile ul {
width: fit-content;
margin: auto !important;
margin-top: 15px !important;
}

.linkedin-badge a {
display: inline-block;
margin: 0 5px;
}

.linkedin-badge a img {
width: 50px;
height: 50px;
border-radius: 50px;
box-shadow: 0 0 0 1px #fff;
transition: all 0.2s ease-in-out;
}

.linkedin-badge a:hover img {
box-shadow: 0 0 0 1px #0077b5;
}

.linkedin-badge p {
margin: 0;
margin-top: -5px !important;
font-size: 18px;
line-height: 20px;
color: #000000;
font-family: "Helvetica Neue", Helvetica, Arial, sans-serif;
font-weight: 600;
text-align: center;
text-decoration: none;
}

.linkedin-badge {
display: inline-block;
margin: 0 5px;
text-align: center;
}

.video-spacing {
height: 50px;
}

.speakers-title {
float: left;
}