Harmony Py Tutorial¶
This notebook shows a basic example of a Harmony job using a Harmony test Collection to perform a combination of both spatial and temporal subsetting.
+First, we import a few things that will help us create a request and display images. We then import the Harmony Py classes we need to make a request.
+import sys; sys.path.append('..')
+
+import sys; sys.path.append('..')
+!{sys.executable} -m pip install -q -r ../requirements/examples.txt
+
+# Install harmony-py requirements. Not necessary if you ran `pip install harmony-py` in your kernel
+!{sys.executable} -m pip install -q -r ../requirements/core.txt
+
import datetime as dt
from IPython.display import display, JSON
import rasterio
import rasterio.plot
+import netCDF4 as nc4
+from matplotlib import pyplot as plt
+import numpy as np
+import helper
-from harmony import BBox, Client, Collection, Request
+from harmony import BBox, WKT, Client, Collection, Request, Environment
+
+
+
Now we create a Harmony Client object, letting it pick up our credentials from a .netrc
file.
harmony_client = Client(env=Environment.UAT)
Next, let's create a Collection object with the CMR collection id for the CMR collection we'd like to look at.
+We then create a Request which specifies that collection, a spatial
BBox
describing the bounding box for the area we're interested in (we'll look at the BBox
in other tutorials). In this case we're interested in looking at Alaska (and who wouldn't be?). We also include a date/time range to narrow down the data.
Because this data includes a lot of different variables, we limit it by passing in a list of variable
s we're interested in; in this test collection we'll look at the blue variable. We include a max_results
parameter to limit the results to the first 10 images just to get a sample of what things look like.
Next, we include a coordinate reference system (CRS) indicating we'd like to reproject the data into the Arctic Polar Stereographic projection. We also specify that we'd like the output to be in the GeoTIFF format with a resolution of 512x512 pixels.
+Finally we check if the request we've created is valid.
+collection = Collection(id='C1234088182-EEDTEST')
+
+request = Request(
+ collection=collection,
+ spatial=BBox(-165, 52, -140, 77),
+ temporal={
+ 'start': dt.datetime(2010, 1, 1),
+ 'stop': dt.datetime(2020, 12, 30)
+ },
+ variables=['blue_var'],
+ max_results=10,
+ crs='EPSG:3995',
+ format='image/tiff',
+ height=512,
+ width=512
+)
-
True+
Now that we have a request, we can submit it to Harmony using the Harmony Client object we created earlier. We'll get back an id for our request which we can use to find the job's status and get the results.
+job_id = harmony_client.submit(request)
+
Let's see how it's going. This will show the percentage complete in the progress
field. (We use the JSON helper function to show the results in a nicer-to-look-at format).
JSON(harmony_client.status(job_id))
+
<IPython.core.display.JSON object>+
Let's download the results to our system temp directory, overwriting files if they already exist. This returns us a list of Future
objects. Each of these "stand in" for a file in our set of results. We can ask a Future
for its result and when it's available, it will return the filename to us.
results = harmony_client.download_all(job_id, directory='/tmp', overwrite=True)
+
Allright, now let's show some colorful Alaska images! Here we iterate over the results, asking each Future
for its result, and then using rasterio
to open the file and display the image.
for r in results:
+ rasterio.plot.show(rasterio.open(r.result()))
+
/tmp/4856803_2020_01_01_7f00ff_global_blue_var_regridded_subsetted.tif ++
/tmp/4856804_2020_01_01_7f00ff_global_blue_var_regridded_subsetted.tif ++
/tmp/4856805_2020_01_02_3200ff_global_blue_var_regridded_subsetted.tif ++
/tmp/4856806_2020_01_03_0019ff_global_blue_var_regridded_subsetted.tif ++
/tmp/4856807_2020_01_04_0065ff_global_blue_var_regridded_subsetted.tif ++
/tmp/4856808_2020_01_05_00b2ff_global_blue_var_regridded_subsetted.tif ++
/tmp/4856809_2020_01_06_00feff_global_blue_var_regridded_subsetted.tif ++
/tmp/4856810_2020_01_07_00ffb2_global_blue_var_regridded_subsetted.tif ++
/tmp/4856811_2020_01_08_00ff66_global_blue_var_regridded_subsetted.tif ++
/tmp/4856812_2020_01_09_00ff19_global_blue_var_regridded_subsetted.tif ++
We can also get a URL corresponding to our request that we can use in a browser. Note: This will not work if the request includes a shapefile.
Now we create a Harmony Client object, letting it pick up our credentials from a .netrc
file.
harmony_client = Client()
+
+url = harmony_client.request_as_url(request)
+print(url)
-
-
Next, let's create a Collection object with the CMR collection id for the CMR collection we'd like to look at.
-We then create a Request which specifies that collection, a spatial
BBox
describing the bounding box for the area we're interested in (we'll look at the BBox
in other tutorials). In this case we're interested in looking at Alaska (and who wouldn't be?). We also include a date/time range to narrow down the data.
Because this data includes a lot of different variables, we limit it by passing in a list of variable
s we're interested in; in this test collection we'll look at the blue variable. We include a max_results
parameter to limit the results to the first 10 images just to get a sample of what things look like.
Next, we include a coordinate reference system (CRS) indicating we'd like to reproject the data into the Arctic Polar Stereographic projection. We also specify that we'd like the output to be in the GeoTIFF format with a resolution of 512x512 pixels.
-Finally we check if the request we've created is valid.
- +https://harmony.uat.earthdata.nasa.gov/C1234088182-EEDTEST/ogc-api-coverages/1.0.0/collections/parameter_vars/coverage/rangeset?forceAsync=true&subset=lat%2852%3A77%29&subset=lon%28-165%3A-140%29&subset=time%28%222010-01-01T00%3A00%3A00%22%3A%222020-12-30T00%3A00%3A00%22%29&outputcrs=EPSG%3A3995&width=512&height=512&format=image%2Ftiff&maxResults=10&variable=blue_var ++
Let's build another request. This time, we'll request a specific granule and grid using the UMM grid name from the CMR. You can find grids in the CMR at https://cmr.uat.earthdata.nasa.gov/search/grids.umm_json. Include a query parameter ?grid=LambertExample
to list detailed information about a specific grid.
collection = Collection(id='C1234088182-EEDTEST')
+
+collection = Collection(id='C1233860183-EEDTEST')
request = Request(
collection=collection,
- spatial=BBox(-165, 52, -140, 77),
- temporal={
- 'start': dt.datetime(2010, 1, 1),
- 'stop': dt.datetime(2020, 12, 30)
- },
- variables=['blue_var'],
- max_results=10,
- crs='EPSG:3995',
- format='image/tiff',
- height=512,
- width=512
+ granule_id='G1233860486-EEDTEST',
+ interpolation='near',
+ grid='GEOS1x1test'
)
request.is_valid()
-
-
True
Now that we have a request, we can submit it to Harmony using the Harmony Client object we created earlier. We'll get back an id for our request which we can use to find the job's status and get the results.
- +Submit the job and check on the status.
+job_id = harmony_client.submit(request)
+
+job_id = harmony_client.submit(request)
+JSON(harmony_client.status(job_id))
-
-
Let's see how it's going. This will show the percentage complete in the progress
field. (We use the JSON helper function to show the results in a nicer-to-look-at format).
<IPython.core.display.JSON object>+
Download and plot the file using pyplot and numpy.
+JSON(harmony_client.status(job_id))
+
+results = harmony_client.download_all(job_id, directory='/tmp', overwrite=True)
+nc4_file=nc4.Dataset(list(results)[0].result())
+arrays = []
+for var in ['red_var', 'green_var', 'blue_var', 'alpha_var']:
+ ds = nc4_file.variables[var][0,:]
+ arrays.append(ds)
+plt.imshow(np.dstack(arrays))
-
-
<matplotlib.image.AxesImage at 0x1271cefd0>+
<IPython.core.display.JSON object>+ +
Let's download the results to our system temp directory, overwriting files if they already exist. This returns us a list of Future
objects. Each of these "stand in" for a file in our set of results. We can ask a Future
for its result and when it's available, it will return the filename to us.
For request that completes directly without creating a job, the submit call returns the harmony JSON response including the direct download links.
+results = harmony_client.download_all(job_id, directory='/tmp', overwrite=True)
-
collection = Collection(id='C1233800302-EEDTEST')
-
'3d9ff057-50af-430b-bc9c-8edd13430078'
Allright, now let's show some colorful Alaska images! Here we iterate over the results, asking each Future
for its result, and then using rasterio
to open the file and display the image.
Download the result and show the file can be opened with NetCDF library.
+for r in results:
- rasterio.plot.show(rasterio.open(r.result()))
+
+results = harmony_client.download_all(response, directory='/tmp', overwrite=True)
+file_names = [f.result() for f in results]
+for filename in file_names:
+ if filename.endswith("nc"):
+ print(nc4.Dataset(filename))
-
-
<class 'netCDF4._netCDF4.Dataset'> +root group (NETCDF4 data model, file format HDF5): + GDAL_AREA_OR_POINT: Area + GDAL_HARMONY_HEX_COLOR: 7f00ff + GDAL_HARMONY_HUE: 0.75 + Conventions: CF-1.5 + GDAL: GDAL 3.2.0, released 2020/10/26 + history: Fri May 27 16:34:47 2022: GDAL CreateCopy( out/nc/001_00_7f00ff_global.nc, ... ) + dimensions(sizes): lon(3600), lat(1800) + variables(dimensions): |S1 crs(), float64 lat(lat), float64 lon(lon), uint8 red_var(lat, lon), uint8 green_var(lat, lon), uint8 blue_var(lat, lon), uint8 alpha_var(lat, lon) + groups: +
Example of submitting a request with WKT spatial. The supported WKT geometry types are listed at: https://harmony-py.readthedocs.io/en/latest/api.html#harmony.harmony.WKT
collection = Collection(id='C1233800302-EEDTEST')
-
-
-
-
-
-
-
+request = Request(
+ collection=collection,
+ spatial=WKT('POLYGON((-140 20, -50 20, -50 60, -140 60, -140 20))'),
+ granule_id=['C1233800302-EEDTEST'],
+ max_results=1,
+ temporal={
+ 'start': dt.datetime(1980, 1, 1),
+ 'stop': dt.datetime(2020, 12, 30)
+ },
+ variables=['blue_var'],
+ crs='EPSG:31975',
+ format='image/png'
+)
-
-
+request.is_valid()
+
-
True+
response = harmony_client.submit(request)
+response
+
'b049d8f3-6fea-439b-aad6-60da59dcf1f7'+
Show the result:
+for filename in [f.result() for f in harmony_client.download_all(response, directory='/tmp', overwrite=True)]:
+ if filename.endswith("png"):
+ helper.show_result(filename)
+
+ /tmp/4856817_001_00_7f00ff_global_blue_var_regridded.nc.png +
/Users/yliu10/.pyenv/versions/3.11.9/envs/harmony-py/lib/python3.11/site-packages/rasterio/__init__.py:304: NotGeoreferencedWarning: Dataset has no geotransform, gcps, or rpcs. The identity matrix will be returned. + dataset = DatasetReader(path, driver=driver, sharing=sharing, **kwargs) +