how to add a new gridded dataset to ogh #32

Open
ChristinaB opened this issue Sep 21, 2018 · 6 comments

ChristinaB (Contributor) commented Sep 21, 2018

1. Fork Observatory.
2. Log in to the CUAHSI JupyterHub server to run from HydroShare.
3. Open a terminal and run

   git clone http://yourfork

   into a sensible folder outside of the HS folder structure (e.g. make a folder called Github).
4. Copy ogh.py and ogh_meta to your HS working directory.
5. Test the existing Notebook and functions for the case study location.
6. Rename the Notebook and change it to import and use the local versions of ogh and ogh_meta (see the sketch after this list).
7. Create functions for metadata, get, and compile (A, B, C below).
8. Test and debug.
9. Download your data!! Yeahhh. Explore your data with other OGH functions.
10. Open a pull request against https://github.com/Freshwater-Initiative/Observatory
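
A minimal sketch of step 6, assuming the local copies are named ogh.py and ogh_meta.json and sit next to the Notebook (adjust the filenames to whatever you actually copied):

# minimal sketch for step 6; the filenames and the JSON structure are assumptions
import os, sys, json

sys.path.insert(0, os.getcwd())   # prefer the local ogh.py over any installed package
import ogh                        # the local copy in the HS working directory

with open('ogh_meta.json') as f:  # assumed filename for the local metadata catalog
    meta = json.load(f)
print(sorted(meta.keys()))        # list the described datasets, assuming a dict keyed by dataset label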

Three main code additions:
A. Edit ogh_meta (click here to view code) for your new dataset
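
As a rough illustration (copy the real schema from an existing entry in ogh_meta; every field name and value below is just a placeholder), a new dataset entry might look something like:

# hypothetical ogh_meta entry; all keys and values are illustrative placeholders,
# copy the actual schema from an existing dataset entry in ogh_meta
new_entry = {
    'dailymet_mynewdataset': {
        'spatial_resolution': '1/16-degree',
        'temporal_resolution': 'daily',
        'start_date': '1915-01-01',
        'end_date': '2011-12-31',
        'variable_list': ['PRECIP', 'TMAX', 'TMIN', 'WINDSPD'],
        'delimiter': '\t'}}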
B. Create a new ogh get function for your new dataset. For example, create your own version of this:

def getDailyMET_livneh2013(homedir, mappingfile,
                           subdir='livneh2013/Daily_MET_1915_2011/raw',
                           catalog_label='dailymet_livneh2013'):
    """
    Get the Livneh et al., 2013 Daily Meteorology files of interest using the reference mapping file

    homedir: (dir) the home directory to be used for establishing subdirectories
    mappingfile: (dir) the file path to the mappingfile, which contains the LAT, LONG_, and ELEV coordinates of interest
    subdir: (dir) the subdirectory to be established under homedir
    catalog_label: (str) the preferred name for the series of catalogged filepaths
    """
    # check and generate DailyMET livneh 2013 data directory
    filedir = os.path.join(homedir, subdir)
    ensure_dir(filedir)

    # generate table of lats and long coordinates
    maptable = pd.read_csv(mappingfile)

    # compile the longitude and latitude points
    locations = compile_dailyMET_Livneh2013_locations(maptable)

    # Download the files
    ftp_download_p(locations)

    # update the mappingfile with the file catalog
    addCatalogToMap(outfilepath=mappingfile, maptable=maptable, folderpath=filedir, catalog_label=catalog_label)

    # return to the home directory
    os.chdir(homedir)
    return(filedir)
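
For reference, a call from a notebook looks roughly like this (the paths are placeholders; the function is normally reached through the ogh module):

# illustrative call only; homedir and mappingfile paths are placeholders
import ogh

filedir = ogh.getDailyMET_livneh2013(
    homedir='/home/jovyan/work/notebooks/data',
    mappingfile='/home/jovyan/work/notebooks/data/sauk_mappingfile.csv')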

C. Create a new compile function for your dataset. For example:

def compile_bc_Livneh2013_locations(maptable):
    """
    Compile a list of file URLs for bias corrected Livneh et al. 2013 (CIG)

    maptable: (dataframe) a dataframe that contains the FID, LAT, LONG_, and ELEV for each interpolated data file
    """
    locations = []
    for ind, row in maptable.iterrows():
        basename = '_'.join(['data', str(row['LAT']), str(row['LONG_'])])
        url = ['http://cses.washington.edu/rocinante/Livneh/bcLivneh_WWA_2013/forcings_ascii/', basename]
        locations.append(''.join(url))
    return(locations)
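
The compile function only builds the list of URLs from the mappingfile rows; the matching get function does the downloading. A quick sanity check for a new compile function (assuming it lives in ogh.py like the one above; the mappingfile path is a placeholder):

# sanity-check sketch; the mappingfile path is a placeholder
import pandas as pd
import ogh

maptable = pd.read_csv('sauk_mappingfile.csv')
urls = ogh.compile_bc_Livneh2013_locations(maptable)
print(urls[:3])   # inspect the first few URLs before downloading anything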

supp_table1.pdf

jphuong (Contributor) commented Sep 21, 2018

@ChristinaB

I like the moxie in the instructions. However, the key step is missing: Step 0, File and Metadata Management. You're adding a new dataset, and its providers may have their own ways of doing things, including the time period covered by each file, the organization of the files, the gridding schema, and the variables represented. Working backwards from the functions is the hard way forward.

ChristinaB (Contributor, Author) commented
Maybe I should have made a Gist. But the steps above do list the "Three main code additions," starting with "A. Edit ogh_meta (click here to view code) for your new dataset"; I didn't really get into the details in the GitHub issue because I told him verbally. Also, I noticed that the supp table does not include the functions embedded in the get functions. Would that make it too long? We may want to update it to include everything...

keckje commented Sep 24, 2018

Jim,

Do you have any time today or this evening so I could try modifying oxl (ogh_xarray_landlab.py) to include an oxl.get_x_hourlywrf_pnnl2018 function?

I cloned your fork of the Observatory and ran Observatory_usecase_7_xmapLandlab, and I am looking at the oxl set of functions. The usecase_7 notebook has a number of errors when I run it from HydroShare, like:

[screenshot of the error output]

Do I need to use the ogh module you updated at geohack to run the notebook?

The Pacific Northwest National Laboratory data is saved at:
http://cses.washington.edu/rocinante/WRF/PNNL_NARR_6km/

Thanks for your help

jphuong (Contributor) commented Sep 24, 2018

@keckje @ChristinaB

Sorry, I'm at a workshop today, and I won't be able to get around to this error until Wednesday.

In the usecase7 notebook, the intention is to use the OGH v0.1.11 conda library (the current most stable version) while using functionality from the oxl module, which will later become the OGH.xarray_landlab module. If you've placed an ogh.py in the same folder as the notebook, just rename it to something else, as I have with ogh_old.py.

To make sure you can run ogh v.0.1.11, run the following code in bash on HydroShare Jupyterhub to get through the necessary installations. In the long run, HydroShare-Jupyterhub needs to keep up with these versioning issues in their Docker image.

conda install -c conda-forge ogh fiona ncurses libgdal gdal pygraphviz --yes
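
After the install, a quick check from Python shows which version is active (this assumes the ogh package exposes __version__; adjust if it does not):

# quick post-install check; assumes ogh exposes __version__
import ogh
print(ogh.__version__)   # expecting 0.1.11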

jphuong (Contributor) commented Sep 24, 2018

@keckje @ChristinaB

I've just pushed my changes to my Fork. It should work better if you've run the conda install command.

keckje commented Sep 25, 2018

Hi Jim, I added two functions (copied and modified two of your functions) so that we can download the PNNL data. Should I push the changes to your fork of Observatory?

def compile_x_wrfpnnl2018_raw_locations(time_increments):
    """
    Compile a list of file URLs for PNNL 2018 raw WRF data

    time_increments: (list) a list of dates that identify each netcdf file
    """
    locations = []
    domain = 'http://cses.washington.edu'
    subdomain = '/rocinante/WRF/PNNL_NARR_6km'

    for ind, ymd in enumerate(time_increments):
        basename = '/' + ymd[0:4] + '/data.' + ymd[0:4] + '-' + ymd[4:6] + '-' + ymd[6:8] + '.nc'
        url = '{0}{1}{2}'.format(domain, subdomain, basename)
        locations.append(url)
    return(locations)


def get_x_hourlywrf_PNNL2018(homedir,
                             spatialbounds,
                             subdir='PNNL2018/Hourly_WRF_1981_2015/noBC',
                             nworkers=4,
                             start_date='1981-01-01',
                             end_date='2015-12-31',
                             rename_timelatlong_names={'LAT': 'LAT', 'LON': 'LON'},
                             file_prefix='sp_',
                             replace_file=True):
    """
    Get hourly WRF data from a 2018 PNNL WRF run using xarray on netcdf files
    """
    # check and generate data directory
    filedir = os.path.join(homedir, subdir)
    ogh.ensure_dir(filedir)

    # generate a daily datestamp for each file between start_date and end_date
    dates = [x.strftime('%Y%m%d') for x in pd.date_range(start=start_date, end=end_date, freq='D')]

    # initialize parallel workers
    da.set_options(pool=ThreadPool(nworkers))
    ProgressBar().register()

    # generate the list of files to download
    filelist = compile_x_wrfpnnl2018_raw_locations(dates)

    # download files of interest
    NetCDFs = []
    for url in filelist:
        NetCDFs.append(da.delayed(wget_x_download_spSubset)(fileurl=url,
                                                            spatialbounds=spatialbounds,
                                                            file_prefix=file_prefix,
                                                            rename_latlong_names=rename_timelatlong_names,
                                                            replace_file=replace_file))

    # run operations
    outputfiles = da.compute(NetCDFs)[0]

    # reset working directory
    os.chdir(homedir)
    return(outputfiles)
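
A hedged usage sketch (the spatialbounds structure, the coordinate values, and homedir below are my assumptions, not taken from the notebook):

# illustrative call only; the spatialbounds format and all paths/values are assumptions
spatialbounds = {'minx': -122.5, 'maxx': -121.0, 'miny': 47.0, 'maxy': 48.5}
outputfiles = get_x_hourlywrf_PNNL2018(homedir='/home/jovyan/work/notebooks/data',
                                       spatialbounds=spatialbounds,
                                       start_date='2005-01-01',
                                       end_date='2005-01-31')
print(len(outputfiles), 'spatially subsetted netcdf files downloaded')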
