-
-
Notifications
You must be signed in to change notification settings - Fork 1.1k
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Switch doc examples to use nbsphinx (#3105)
* switching out examples to use nbsphinx * added jupyter_client to doc env * added ipykernel to doc env
- Loading branch information
Showing
8 changed files
with
573 additions
and
1,166 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,325 @@ | ||
{ | ||
"cells": [ | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"Calculating Seasonal Averages from Timeseries of Monthly Means \n", | ||
"=====\n", | ||
"\n", | ||
"Author: [Joe Hamman](https://github.com/jhamman/)\n", | ||
"\n", | ||
"The data used for this example can be found in the [xarray-data](https://github.com/pydata/xarray-data) repository. You may need to change the path to `rasm.nc` below.\n", | ||
"\n", | ||
"Suppose we have a netCDF or `xarray.Dataset` of monthly mean data and we want to calculate the seasonal average. To do this properly, we need to calculate the weighted average considering that each month has a different number of days." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"ExecuteTime": { | ||
"end_time": "2018-11-28T20:51:35.958210Z", | ||
"start_time": "2018-11-28T20:51:35.936966Z" | ||
} | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"%matplotlib inline\n", | ||
"import numpy as np\n", | ||
"import pandas as pd\n", | ||
"import xarray as xr\n", | ||
"from netCDF4 import num2date\n", | ||
"import matplotlib.pyplot as plt " | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"#### Some calendar information so we can support any netCDF calendar. " | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"ExecuteTime": { | ||
"end_time": "2018-11-28T20:51:35.991620Z", | ||
"start_time": "2018-11-28T20:51:35.960336Z" | ||
} | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"dpm = {'noleap': [0, 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31],\n", | ||
" '365_day': [0, 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31],\n", | ||
" 'standard': [0, 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31],\n", | ||
" 'gregorian': [0, 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31],\n", | ||
" 'proleptic_gregorian': [0, 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31],\n", | ||
" 'all_leap': [0, 31, 29, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31],\n", | ||
" '366_day': [0, 31, 29, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31],\n", | ||
" '360_day': [0, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30]} " | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"#### A few calendar functions to determine the number of days in each month\n", | ||
"If you were just using the standard calendar, it would be easy to use the `calendar.month_range` function." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"ExecuteTime": { | ||
"end_time": "2018-11-28T20:51:36.015151Z", | ||
"start_time": "2018-11-28T20:51:35.994079Z" | ||
} | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"def leap_year(year, calendar='standard'):\n", | ||
" \"\"\"Determine if year is a leap year\"\"\"\n", | ||
" leap = False\n", | ||
" if ((calendar in ['standard', 'gregorian',\n", | ||
" 'proleptic_gregorian', 'julian']) and\n", | ||
" (year % 4 == 0)):\n", | ||
" leap = True\n", | ||
" if ((calendar == 'proleptic_gregorian') and\n", | ||
" (year % 100 == 0) and\n", | ||
" (year % 400 != 0)):\n", | ||
" leap = False\n", | ||
" elif ((calendar in ['standard', 'gregorian']) and\n", | ||
" (year % 100 == 0) and (year % 400 != 0) and\n", | ||
" (year < 1583)):\n", | ||
" leap = False\n", | ||
" return leap\n", | ||
"\n", | ||
"def get_dpm(time, calendar='standard'):\n", | ||
" \"\"\"\n", | ||
" return a array of days per month corresponding to the months provided in `months`\n", | ||
" \"\"\"\n", | ||
" month_length = np.zeros(len(time), dtype=np.int)\n", | ||
" \n", | ||
" cal_days = dpm[calendar]\n", | ||
" \n", | ||
" for i, (month, year) in enumerate(zip(time.month, time.year)):\n", | ||
" month_length[i] = cal_days[month]\n", | ||
" if leap_year(year, calendar=calendar):\n", | ||
" month_length[i] += 1\n", | ||
" return month_length" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"#### Open the `Dataset`" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"ExecuteTime": { | ||
"end_time": "2018-11-28T20:51:36.072316Z", | ||
"start_time": "2018-11-28T20:51:36.016594Z" | ||
} | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"ds = xr.tutorial.open_dataset('rasm').load()\n", | ||
"print(ds)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"#### Now for the heavy lifting:\n", | ||
"We first have to come up with the weights,\n", | ||
"- calculate the month lengths for each monthly data record\n", | ||
"- calculate weights using `groupby('time.season')`\n", | ||
"\n", | ||
"Finally, we just need to multiply our weights by the `Dataset` and sum allong the time dimension. " | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"ExecuteTime": { | ||
"end_time": "2018-11-28T20:51:36.132413Z", | ||
"start_time": "2018-11-28T20:51:36.073708Z" | ||
} | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"# Make a DataArray with the number of days in each month, size = len(time)\n", | ||
"month_length = xr.DataArray(get_dpm(ds.time.to_index(), calendar='noleap'),\n", | ||
" coords=[ds.time], name='month_length')\n", | ||
"\n", | ||
"# Calculate the weights by grouping by 'time.season'.\n", | ||
"# Conversion to float type ('astype(float)') only necessary for Python 2.x\n", | ||
"weights = month_length.groupby('time.season') / month_length.astype(float).groupby('time.season').sum()\n", | ||
"\n", | ||
"# Test that the sum of the weights for each season is 1.0\n", | ||
"np.testing.assert_allclose(weights.groupby('time.season').sum().values, np.ones(4))\n", | ||
"\n", | ||
"# Calculate the weighted average\n", | ||
"ds_weighted = (ds * weights).groupby('time.season').sum(dim='time')" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"ExecuteTime": { | ||
"end_time": "2018-11-28T20:51:36.152913Z", | ||
"start_time": "2018-11-28T20:51:36.133997Z" | ||
} | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"print(ds_weighted)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"ExecuteTime": { | ||
"end_time": "2018-11-28T20:51:36.190765Z", | ||
"start_time": "2018-11-28T20:51:36.154416Z" | ||
} | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"# only used for comparisons\n", | ||
"ds_unweighted = ds.groupby('time.season').mean('time')\n", | ||
"ds_diff = ds_weighted - ds_unweighted" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"ExecuteTime": { | ||
"end_time": "2018-11-28T20:51:40.264871Z", | ||
"start_time": "2018-11-28T20:51:36.192467Z" | ||
} | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"# Quick plot to show the results\n", | ||
"notnull = pd.notnull(ds_unweighted['Tair'][0])\n", | ||
"\n", | ||
"fig, axes = plt.subplots(nrows=4, ncols=3, figsize=(14,12))\n", | ||
"for i, season in enumerate(('DJF', 'MAM', 'JJA', 'SON')):\n", | ||
" ds_weighted['Tair'].sel(season=season).where(notnull).plot.pcolormesh(\n", | ||
" ax=axes[i, 0], vmin=-30, vmax=30, cmap='Spectral_r', \n", | ||
" add_colorbar=True, extend='both')\n", | ||
" \n", | ||
" ds_unweighted['Tair'].sel(season=season).where(notnull).plot.pcolormesh(\n", | ||
" ax=axes[i, 1], vmin=-30, vmax=30, cmap='Spectral_r', \n", | ||
" add_colorbar=True, extend='both')\n", | ||
"\n", | ||
" ds_diff['Tair'].sel(season=season).where(notnull).plot.pcolormesh(\n", | ||
" ax=axes[i, 2], vmin=-0.1, vmax=.1, cmap='RdBu_r',\n", | ||
" add_colorbar=True, extend='both')\n", | ||
"\n", | ||
" axes[i, 0].set_ylabel(season)\n", | ||
" axes[i, 1].set_ylabel('')\n", | ||
" axes[i, 2].set_ylabel('')\n", | ||
"\n", | ||
"for ax in axes.flat:\n", | ||
" ax.axes.get_xaxis().set_ticklabels([])\n", | ||
" ax.axes.get_yaxis().set_ticklabels([])\n", | ||
" ax.axes.axis('tight')\n", | ||
" ax.set_xlabel('')\n", | ||
" \n", | ||
"axes[0, 0].set_title('Weighted by DPM')\n", | ||
"axes[0, 1].set_title('Equal Weighting')\n", | ||
"axes[0, 2].set_title('Difference')\n", | ||
" \n", | ||
"plt.tight_layout()\n", | ||
"\n", | ||
"fig.suptitle('Seasonal Surface Air Temperature', fontsize=16, y=1.02)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"ExecuteTime": { | ||
"end_time": "2018-11-28T20:51:40.284898Z", | ||
"start_time": "2018-11-28T20:51:40.266406Z" | ||
} | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"# Wrap it into a simple function\n", | ||
"def season_mean(ds, calendar='standard'):\n", | ||
" # Make a DataArray of season/year groups\n", | ||
" year_season = xr.DataArray(ds.time.to_index().to_period(freq='Q-NOV').to_timestamp(how='E'),\n", | ||
" coords=[ds.time], name='year_season')\n", | ||
"\n", | ||
" # Make a DataArray with the number of days in each month, size = len(time)\n", | ||
" month_length = xr.DataArray(get_dpm(ds.time.to_index(), calendar=calendar),\n", | ||
" coords=[ds.time], name='month_length')\n", | ||
" # Calculate the weights by grouping by 'time.season'\n", | ||
" weights = month_length.groupby('time.season') / month_length.groupby('time.season').sum()\n", | ||
"\n", | ||
" # Test that the sum of the weights for each season is 1.0\n", | ||
" np.testing.assert_allclose(weights.groupby('time.season').sum().values, np.ones(4))\n", | ||
"\n", | ||
" # Calculate the weighted average\n", | ||
" return (ds * weights).groupby('time.season').sum(dim='time')" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [] | ||
} | ||
], | ||
"metadata": { | ||
"anaconda-cloud": {}, | ||
"kernelspec": { | ||
"display_name": "Python 3", | ||
"language": "python", | ||
"name": "python3" | ||
}, | ||
"language_info": { | ||
"codemirror_mode": { | ||
"name": "ipython", | ||
"version": 3 | ||
}, | ||
"file_extension": ".py", | ||
"mimetype": "text/x-python", | ||
"name": "python", | ||
"nbconvert_exporter": "python", | ||
"pygments_lexer": "ipython3", | ||
"version": "3.6.8" | ||
}, | ||
"toc": { | ||
"base_numbering": 1, | ||
"nav_menu": {}, | ||
"number_sections": true, | ||
"sideBar": true, | ||
"skip_h1_title": false, | ||
"title_cell": "Table of Contents", | ||
"title_sidebar": "Contents", | ||
"toc_cell": true, | ||
"toc_position": {}, | ||
"toc_section_display": true, | ||
"toc_window_display": true | ||
} | ||
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 2 | ||
} |
Oops, something went wrong.