Add join catalogs notebook #481

Merged
merged 3 commits into from
Nov 5, 2024
1 change: 1 addition & 0 deletions docs/tutorials.rst
@@ -14,6 +14,7 @@ An introduction to LSDB's core features and functionality

Getting data into LSDB <tutorials/getting_data>
Filtering large catalogs <tutorials/filtering_large_catalogs>
Joining catalogs <tutorials/join_catalogs>
Exporting results <tutorials/exporting_results>

Advanced Topics
137 changes: 137 additions & 0 deletions docs/tutorials/join_catalogs.ipynb
@@ -0,0 +1,137 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Joining catalogs\n",
"\n",
"In this tutorial we join a small cone region of Gaia with the Gaia Early Data Release 3 (EDR3) distances catalog, then compute the ratio between the distance derived from the `parallax` column of the former and the distance given by the `r_med_geo` column of the latter."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import lsdb\n",
"from lsdb.core.search import ConeSearch"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"First we load Gaia, selecting only the `source_id`, position (`ra`, `dec`) and `parallax` columns."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"gaia = lsdb.read_hats(\n",
"    \"https://data.lsdb.io/hats/gaia_dr3/gaia\",\n",
"    margin_cache=\"https://data.lsdb.io/hats/gaia_dr3/gaia_10arcs\",\n",
"    columns=[\"source_id\", \"ra\", \"dec\", \"parallax\"],\n",
"    search_filter=ConeSearch(ra=0, dec=0, radius_arcsec=10 * 3600),\n",
")\n",
"gaia"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We do the same for the Gaia EDR3 distances catalog, but here the distance column is `r_med_geo`, the median of the geometric distance estimate."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"gaia_edr3 = lsdb.read_hats(\n",
"    \"https://data.lsdb.io/hats/gaia_dr3/gaia_edr3_distances\",\n",
"    margin_cache=\"https://data.lsdb.io/hats/gaia_dr3/gaia_edr3_distances_10arcs\",\n",
"    columns=[\"source_id\", \"ra\", \"dec\", \"r_med_geo\"],\n",
"    search_filter=ConeSearch(ra=0, dec=0, radius_arcsec=10 * 3600),\n",
")\n",
"gaia_edr3"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can now join the two catalogs on the `source_id` column:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"joined = gaia.join(gaia_edr3, left_on=\"source_id\", right_on=\"source_id\")\n",
"joined"
]
},
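{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's compute the ratio between the two distance estimates and plot it as a histogram. The parallax is in milliarcseconds, so `1e3 / parallax` gives a distance in parsecs, the same unit as `r_med_geo`."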
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"results = (1e3 / joined[\"parallax_gaia\"]) / joined[\"r_med_geo_gaia_edr3_distances\"]\n",
"ratios = results.compute().to_numpy()\n",
"ratios"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"\n",
"plt.hist(ratios, bins=np.linspace(0.8, 1.2, 100))\n",
"plt.title(\"Histogram of Gaia distance / Gaia EDR3 distance\")\n",
"plt.show()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "demo",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.3"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
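
For readers of this diff, the join and ratio steps in the notebook can be sketched in plain Python without LSDB. This is only an illustration under stated assumptions: the rows are made-up toy values, `join_on_source_id` and `distance_ratio` are hypothetical helpers rather than LSDB functions, and the join is assumed to keep only `source_id` values present in both tables. The output column names mirror the suffixed names the notebook reads from `joined`.

```python
# Toy stand-ins for the two catalogs; every value below is made up.
gaia_rows = [
    {"source_id": 1, "parallax": 2.0},  # parallax in milliarcseconds
    {"source_id": 2, "parallax": 0.5},
    {"source_id": 3, "parallax": 4.0},  # no counterpart in distance_rows
]
distance_rows = [
    {"source_id": 1, "r_med_geo": 500.0},  # distance in parsecs
    {"source_id": 2, "r_med_geo": 1600.0},
    {"source_id": 4, "r_med_geo": 100.0},  # no counterpart in gaia_rows
]


def join_on_source_id(left, right):
    """Hypothetical helper: keep rows whose source_id is in both inputs."""
    right_by_id = {row["source_id"]: row for row in right}
    joined = []
    for row in left:
        match = right_by_id.get(row["source_id"])
        if match is None:
            continue  # unmatched rows are dropped (assumed inner join)
        joined.append(
            {
                "source_id": row["source_id"],
                "parallax_gaia": row["parallax"],
                "r_med_geo_gaia_edr3_distances": match["r_med_geo"],
            }
        )
    return joined


def distance_ratio(parallax_mas, r_med_geo_pc):
    """Ratio of the parallax-derived distance (pc) to r_med_geo (pc)."""
    return (1e3 / parallax_mas) / r_med_geo_pc


joined = join_on_source_id(gaia_rows, distance_rows)
ratios = [
    distance_ratio(row["parallax_gaia"], row["r_med_geo_gaia_edr3_distances"])
    for row in joined
]
print(ratios)  # [1.0, 1.25]
```

A ratio near 1 means the two distance estimates agree. The sketch does not handle zero or negative parallaxes, which a real catalog may contain.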