[SEDONA-680] Remove rasterio from mandatory dependencies #1692

Merged
merged 3 commits into from
Nov 23, 2024
19 changes: 15 additions & 4 deletions .github/workflows/python.yml
@@ -143,14 +143,25 @@ jobs:
       - env:
           PYTHON_VERSION: ${{ matrix.python }}
         run: find spark-shaded/target -name sedona-*.jar -exec cp {} ${VENV_PATH}/lib/python${PYTHON_VERSION}/site-packages/pyspark/jars/ \;
-      - env:
+      - name: Run tests
+        env:
           PYTHON_VERSION: ${{ matrix.python }}
         run: |
           export SPARK_HOME=${VENV_PATH}/lib/python${PYTHON_VERSION}/site-packages/pyspark
           cd python
           source ${VENV_PATH}/bin/activate
-          pytest tests
-      - env:
+          pytest -v tests
+      - name: Run basic tests without rasterio
+        env:
+          PYTHON_VERSION: ${{ matrix.python }}
+        run: |
+          export SPARK_HOME=${VENV_PATH}/lib/python${PYTHON_VERSION}/site-packages/pyspark
+          cd python
+          source ${VENV_PATH}/bin/activate
+          pip uninstall -y rasterio
+          pytest -v tests/core/test_rdd.py tests/sql/test_dataframe_api.py
+      - name: Run Spark Connect tests
+        env:
           PYTHON_VERSION: ${{ matrix.python }}
         run: |
           if [ ! -f "${VENV_PATH}/lib/python${PYTHON_VERSION}/site-packages/pyspark/sbin/start-connect-server.sh" ]
@@ -165,4 +176,4 @@ jobs:
           cd python
           source ${VENV_PATH}/bin/activate
           pip install "pyspark[connect]==${SPARK_VERSION}"
-          pytest tests/sql/test_dataframe_api.py
+          pytest -v tests/sql/test_dataframe_api.py
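The workflow above runs a subset of tests after uninstalling rasterio. For tests that do depend on it, a skip guard is one common way to keep a suite runnable in both environments. The following is a minimal sketch using only the standard library; `has_module` and `RasterDependentTest` are hypothetical names for illustration and are not taken from the Sedona test suite.

```python
import importlib.util
import unittest


def has_module(name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(name) is not None


class RasterDependentTest(unittest.TestCase):
    # Hypothetical test case: skipped, not failed, when rasterio is absent.
    @unittest.skipUnless(has_module("rasterio"), "rasterio is not installed")
    def test_needs_rasterio(self):
        # Imported lazily so that test collection succeeds without rasterio.
        import rasterio

        self.assertTrue(hasattr(rasterio, "open"))
```

Checking with `find_spec` avoids importing the package at collection time, so a missing optional dependency shows up as a skipped test rather than a collection error.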
3 changes: 3 additions & 0 deletions docs/tutorial/raster.md
@@ -615,6 +615,9 @@ raster.as_numpy_masked() # numpy array with nodata values masked as nan
 If you want to work with the raster data using `rasterio`, you can retrieve a `rasterio.DatasetReader` object using the
 `as_rasterio` method.

+!!!note
+    You need to have the `rasterio` package installed (version >= 1.2.10) to use this method. You can install it using `pip install rasterio`.
+
 ```python
 ds = raster.as_rasterio() # rasterio.DatasetReader object
 # Work with the raster using rasterio
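The note states a minimum rasterio version. A caller could check that requirement up front before calling `as_rasterio`; the sketch below does so with `importlib.metadata`. The helper names (`parse_release`, `rasterio_is_usable`) are hypothetical and not part of Sedona, and the version parsing is deliberately simplistic: it reads only the leading numeric components of each dotted part.

```python
from importlib import metadata

MIN_RASTERIO = (1, 2, 10)  # minimum version stated in the documentation note


def parse_release(version: str) -> tuple:
    """Parse leading numeric components of a version string into a tuple."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first non-numeric component, e.g. "post1"
        parts.append(int(digits))
    return tuple(parts)


def rasterio_is_usable() -> bool:
    """Return True if rasterio is installed and meets the minimum version."""
    try:
        installed = metadata.version("rasterio")
    except metadata.PackageNotFoundError:
        return False
    return parse_release(installed) >= MIN_RASTERIO
```

A caller might then call `raster.as_rasterio()` only when `rasterio_is_usable()` returns `True`, and otherwise fall back to the NumPy accessors or report that rasterio needs to be installed.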
27 changes: 23 additions & 4 deletions python/sedona/sql/types.py
@@ -17,8 +17,21 @@

 from pyspark.sql.types import BinaryType, UserDefinedType

-from ..raster import raster_serde
-from ..raster.sedona_raster import SedonaRaster
+# Only support RasterType when rasterio is installed
+try:
+    import rasterio
+except ImportError:
+    rasterio = None
+
+if rasterio is not None:
+    from ..raster import raster_serde
+    from ..raster.sedona_raster import SedonaRaster
+else:
+    # We'll skip RasterType UDT registration and raise error when deserializing
+    # RasterUDT objects if rasterio is not installed
+    raster_serde = None
+    SedonaRaster = None
+
 from ..utils import geometry_serde


@@ -57,7 +70,12 @@ def serialize(self, obj):
         raise NotImplementedError("RasterType.serialize is not implemented yet")

     def deserialize(self, datum):
-        return raster_serde.deserialize(datum)
+        if raster_serde is not None:
+            return raster_serde.deserialize(datum)
+        else:
+            raise NotImplementedError(
+                "rasterio is not installed. Please install it to support RasterType deserialization"
+            )

     @classmethod
     def module(cls):
@@ -71,4 +89,5 @@ def scalaUDT(cls):
         return "org.apache.spark.sql.sedona_sql.UDT.RasterUDT"


-SedonaRaster.__UDT__ = RasterType()
+if SedonaRaster is not None:
+    SedonaRaster.__UDT__ = RasterType()
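The pattern in this file (import an optional dependency, fall back to `None`, and fail only when the feature is actually used) can be shown in isolation. The sketch below is not Sedona code: `optional_import` and `OptionalDeserializer` are hypothetical names, and the standard-library `json` module stands in for the optional backend so the example runs anywhere.

```python
import importlib


def optional_import(name):
    """Import a module by name, returning None if it is not installed."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


class OptionalDeserializer:
    """Hypothetical wrapper that defers the failure until deserialization."""

    def __init__(self, backend):
        self._backend = backend  # a module exposing loads(), or None

    def deserialize(self, datum):
        if self._backend is None:
            raise NotImplementedError(
                "backend is not installed; install it to enable deserialization"
            )
        return self._backend.loads(datum)


# json stands in for an optional dependency that happens to be available.
available = OptionalDeserializer(optional_import("json"))
print(available.deserialize("[1, 2, 3]"))

# A made-up module name simulates the dependency being absent.
missing = OptionalDeserializer(optional_import("no_such_backend_for_demo"))
try:
    missing.deserialize("[1, 2, 3]")
except NotImplementedError as exc:
    print("deserialization unavailable:", exc)
```

Raising at call time rather than at import time keeps the package importable for users who never touch the optional feature.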
10 changes: 8 additions & 2 deletions python/setup.py
@@ -58,12 +58,18 @@
     long_description=long_description,
     long_description_content_type="text/markdown",
     python_requires=">=3.6",
-    install_requires=["attrs", "shapely>=1.7.0", "rasterio>=1.2.10"],
+    install_requires=["attrs", "shapely>=1.7.0"],
     extras_require={
         "spark": ["pyspark>=2.3.0"],
         "pydeck-map": ["geopandas", "pydeck==0.8.0"],
         "kepler-map": ["geopandas", "keplergl==0.3.2"],
-        "all": ["pyspark>=2.3.0", "geopandas", "pydeck==0.8.0", "keplergl==0.3.2"],
+        "all": [
+            "pyspark>=2.3.0",
+            "geopandas",
+            "pydeck==0.8.0",
+            "keplergl==0.3.2",
+            "rasterio>=1.2.10",
+        ],
     },
     project_urls={
         "Documentation": "https://sedona.apache.org",
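The effect of moving `rasterio>=1.2.10` from `install_requires` to the `all` extra in `python/setup.py` can be modeled in a few lines of Python. This is only an illustration of which requirement strings end up selected; it is not how pip resolves dependencies, and `resolve_requirements` is a hypothetical helper. The lists are copied from the new side of the `setup.py` diff.

```python
# Requirement lists as they appear after this change.
INSTALL_REQUIRES = ["attrs", "shapely>=1.7.0"]
EXTRAS_REQUIRE = {
    "spark": ["pyspark>=2.3.0"],
    "pydeck-map": ["geopandas", "pydeck==0.8.0"],
    "kepler-map": ["geopandas", "keplergl==0.3.2"],
    "all": [
        "pyspark>=2.3.0",
        "geopandas",
        "pydeck==0.8.0",
        "keplergl==0.3.2",
        "rasterio>=1.2.10",
    ],
}


def resolve_requirements(extras=()):
    """Return base requirements plus those of the requested extras, in order."""
    selected = list(INSTALL_REQUIRES)
    for extra in extras:
        for requirement in EXTRAS_REQUIRE[extra]:
            if requirement not in selected:
                selected.append(requirement)
    return selected


print(resolve_requirements())          # base install: no rasterio
print(resolve_requirements(["all"]))   # "all" extra: includes rasterio
```

Under this model, a plain install no longer selects rasterio, while requesting the `all` extra does.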