Correctly handle compressed fits files (#694)
* Fixed docstring typo

* Added open_fits to image.main.py

* Added open_fits to views.py

* Added open_fits to pipeline.utils.get_rms_noise_image_values

* Added open_fits to pipeline.new_sources.py

* Added open_fits to forced_extraction

* Fixed circular imports

* Added pathlib import

* Added memmap argument to open_fits

* Added separate preload function

* Fix typing imports

* Add missing import

* Move open_fits to image/utils

* Updated open_fits imports

* Added correct typing imports

* Added fits import

* Updated logging

* PEP8

* Updated open_fits to correctly handle new compression methods

* Better compimagehdu check

* Remove unused imports

* Correctly handle regular single-hdu fits files

* Temporarily remove cache poetry install

* Permanently remove poetry install caching - it only takes 20s

* Updated docs to note that compressed fits files are supported

* Updated changelog

* Updated changelog
ddobie authored Dec 12, 2023
1 parent d94243a commit 6209e0d
Showing 9 changed files with 128 additions and 55 deletions.
5 changes: 0 additions & 5 deletions .github/workflows/test-suite.yml
@@ -34,11 +34,6 @@ jobs:
uses: KyleMayes/install-llvm-action@v1
with:
version: "10.0"
- name: cache poetry install
uses: actions/cache@v2
with:
path: ~/.local
key: poetry-1.5.1-0

- uses: snok/install-poetry@v1
with:
5 changes: 4 additions & 1 deletion CHANGELOG.md
@@ -8,6 +8,7 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),

#### Added

- Added support for compressed FITS files [#694](https://github.com/askap-vast/vast-pipeline/pull/694)
- Added links to Data Central DAS and the Fink Broker to the source page [#697](https://github.com/askap-vast/vast-pipeline/pull/697/)
- Added `n_new_sources` column to run model to store the number of new sources in a pipeline run [#676](https://github.com/askap-vast/vast-pipeline/pull/676).
- Added `MAX_CUTOUT_IMAGES` to the pipeline settings to limit the number of postage stamps displayed on the source detail page [#658](https://github.com/askap-vast/vast-pipeline/pull/658).
@@ -41,6 +42,7 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),

#### Changed

- Updated all FITS loading to use a wrapper that can handle compressed FITS files [#694](https://github.com/askap-vast/vast-pipeline/pull/694)
- Downgrade ci-docs to python 3.8 [#702](https://github.com/askap-vast/vast-pipeline/pull/702)
- Update Gr1N poetry to v8, force python 3.8.10 [#701](https://github.com/askap-vast/vast-pipeline/pull/701)
- Updated path to test data in github actions and docs [#699](https://github.com/askap-vast/vast-pipeline/pull/699)
@@ -112,7 +114,8 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),

#### List of PRs

- [#702](https://github.com/askap-vast/vast-pipeline/pull/702): fix: Downgrade ci-docs to python 3.8
- [#694](https://github.com/askap-vast/vast-pipeline/pull/694): feat: Handle compressed fits files.
- [#702](https://github.com/askap-vast/vast-pipeline/pull/702): fix: Downgrade ci-docs to python 3.8.
- [#701](https://github.com/askap-vast/vast-pipeline/pull/701): fix: Update Gr1N poetry to v8, force python 3.8.10.
- [#699](https://github.com/askap-vast/vast-pipeline/pull/699): docs, feat: Add new regression data download URL and updates to Github Actions.
- [#697](https://github.com/askap-vast/vast-pipeline/pull/697/): feat: Added links to Data Central DAS and the Fink Broker to the source page.
3 changes: 1 addition & 2 deletions docs/using/runconfig.md
@@ -151,8 +151,7 @@ Boolean. Astropy warnings are suppressed in the logging output if set to `True`.

**`inputs.image`**
Line entries or epoch headed entries.
The full paths to the image FITS files to be processed.
Epoch mode is activated by including an extra key value with the epoch name, see the example below for a demonstration.
The full paths to the image FITS files to be processed - these can be regular FITS files, or FITS files that use a [`CompImageHDU`](https://docs.astropy.org/en/stable/io/fits/api/images.html#astropy.io.fits.CompImageHDU). In principle the pipeline also supports [`.fits.fz`](https://heasarc.gsfc.nasa.gov/fitsio/fpack/) files, although this is not officially supported. Epoch mode is activated by including an extra key value with the epoch name, see the example below for a demonstration.
Refer to [this section](../design/association.md#epoch-based-association) of the documentation for more information on epoch based association.

<!-- markdownlint-disable MD046 -->
29 changes: 18 additions & 11 deletions vast_pipeline/image/main.py
@@ -19,6 +19,7 @@

from vast_pipeline import models
from vast_pipeline.survey.translators import tr_selavy
from vast_pipeline.image.utils import open_fits


logger = logging.getLogger(__name__)
@@ -33,6 +34,7 @@ class Image(object):
path (str): The system path to the image.
"""

def __init__(self, path: str) -> None:
"""
Initialise an image object.
@@ -80,7 +82,7 @@ class FitsImage(Image):

entire_image = True

def __init__(self, path: str, hdu_index: int=0) -> None:
def __init__(self, path: str, hdu_index: int = 0) -> None:
"""
Initialise a FitsImage object.
@@ -107,7 +109,7 @@ def __init__(self, path: str, hdu_index: int=0) -> None:

def __get_header(self, hdu_index: int) -> fits.Header:
"""
Retrieves the header from teh FITS image.
Retrieves the header from the FITS image.
Args:
hdu_index:
@@ -116,13 +118,14 @@ def __get_header(self, hdu_index: int) -> fits.Header:
Returns:
The FITS header as an astropy.io.fits.Header object.
"""

try:
with fits.open(self.path) as hdulist:
with open_fits(self.path) as hdulist:
hdu = hdulist[hdu_index]
except Exception:
raise IOError((
'Could not read this FITS file: '
f'{os.path.basename(self.path)}'
'Could not read FITS file: '
f'{self.path}'
))

return hdu.header.copy()
@@ -223,7 +226,8 @@ def __get_radius_pixels(
The radius of the image in pixels.
"""
if self.entire_image:
# a large circle that *should* include the whole image (and then some)
# a large circle that *should* include the whole image
# (and then some)
diameter = np.hypot(header[fits_naxis1], header[fits_naxis2])
else:
# We simply place the largest circle we can in the centre.
@@ -244,10 +248,11 @@ def __get_frequency(self, header: fits.Header) -> None:
self.freq_eff = None
self.freq_bw = None
try:
if ('ctype3' in header) and (header['ctype3'] in ('FREQ', 'VOPT')):
freq_keys = ('FREQ', 'VOPT')
if ('ctype3' in header) and (header['ctype3'] in freq_keys):
self.freq_eff = header['crval3']
self.freq_bw = header['cdelt3'] if 'cdelt3' in header else 0.0
elif ('ctype4' in header) and (header['ctype4'] in ('FREQ', 'VOPT')):
elif ('ctype4' in header) and (header['ctype4'] in freq_keys):
self.freq_eff = header['crval4']
self.freq_bw = header['cdelt4'] if 'cdelt4' in header else 0.0
else:
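The axis lookup in the hunk above can be sketched in isolation, using a plain dict as a stand-in for the FITS header (keys and values here are illustrative, and a real astropy `Header` is case-insensitive):

```python
def get_frequency(header: dict):
    """Return (freq_eff, freq_bw) from axis 3 or 4, or (None, None)."""
    freq_keys = ('FREQ', 'VOPT')
    for axis in (3, 4):
        if header.get(f'ctype{axis}') in freq_keys:
            # crval holds the reference value, cdelt the increment
            # (defaulting to 0.0 when absent, as in the pipeline code).
            return header[f'crval{axis}'], header.get(f'cdelt{axis}', 0.0)
    return None, None

print(get_frequency({'ctype3': 'FREQ', 'crval3': 887.5e6}))  # (887500000.0, 0.0)
print(get_frequency({}))                                     # (None, None)
```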
@@ -271,6 +276,7 @@ class SelavyImage(FitsImage):
associated with the image.
config (Dict): The image configuration settings.
"""

def __init__(
self,
path: str,
@@ -313,7 +319,8 @@ def read_selavy(self, dj_image: models.Image) -> pd.DataFrame:
Dataframe containing the cleaned and processed Selavy components.
"""
# TODO: improve with loading only the cols we need and set datatype
if self.selavy_path.endswith(".xml") or self.selavy_path.endswith(".vot"):
if self.selavy_path.endswith(
".xml") or self.selavy_path.endswith(".vot"):
df = Table.read(
self.selavy_path, format="votable", use_names_over_ids=True
).to_pandas()
@@ -460,12 +467,12 @@ def read_selavy(self, dj_image: models.Image) -> pd.DataFrame:
.agg('sum')
)

df['flux_int_isl_ratio'] = (
df['flux_int_isl_ratio'] = (
df['flux_int'].values
/ island_flux_totals.loc[df['island_id']]['flux_int'].values
)

df['flux_peak_isl_ratio'] = (
df['flux_peak_isl_ratio'] = (
df['flux_peak'].values
/ island_flux_totals.loc[df['island_id']]['flux_peak'].values
)
45 changes: 38 additions & 7 deletions vast_pipeline/image/utils.py
@@ -7,7 +7,9 @@
import numpy as np
import pandas as pd

from typing import Tuple
from typing import Tuple, Union, Optional
from pathlib import Path
from astropy.io import fits


logger = logging.getLogger(__name__)
@@ -81,7 +83,7 @@ def calc_error_radius(ra, ra_err, dec, dec_err) -> float:
np.deg2rad(i),
dec_1,
np.deg2rad(j)
)) for i,j in zip(ra_offsets, dec_offsets)
)) for i, j in zip(ra_offsets, dec_offsets)
]

seps = np.column_stack(seps)
@@ -190,7 +192,7 @@ def calc_condon_flux_errors(
(1. + (theta_B / major)**2)**alpha_maj2 *
(1. + (theta_b / minor)**2)**alpha_min2 *
snr**2)
rho_sq3 = ((major * minor / (4.* theta_B * theta_b)) *
rho_sq3 = ((major * minor / (4. * theta_B * theta_b)) *
(1. + (theta_B / major)**2)**alpha_maj3 *
(1. + (theta_b / minor)**2)**alpha_min3 *
snr**2)
@@ -210,9 +212,9 @@

# ra and dec errors
errorra = np.sqrt((error_par_major * np.sin(theta))**2 +
(error_par_minor * np.cos(theta))**2)
(error_par_minor * np.cos(theta))**2)
errordec = np.sqrt((error_par_major * np.cos(theta))**2 +
(error_par_minor * np.sin(theta))**2)
(error_par_minor * np.sin(theta))**2)

errormajor = np.sqrt(2) * major / rho1
errorminor = np.sqrt(2) * minor / rho2
@@ -238,11 +240,40 @@ def calc_condon_flux_errors(
help1 = (errormajor / major)**2
help2 = (errorminor / minor)**2
help3 = theta_B * theta_b / (major * minor)
errorflux = np.abs(flux_int) * np.sqrt(errorpeaksq / flux_peak**2 + help3 * (help1 + help2))
help4 = np.sqrt(errorpeaksq / flux_peak**2 + help3 * (help1 + help2))
errorflux = np.abs(flux_int) * help4

# need to return flux_peak if used.
return errorpeak, errorflux, errormajor, errorminor, errortheta, errorra, errordec

except Exception as e:
logger.debug("Error in the calculation of Condon errors for a source", exc_info=True)
logger.debug(
"Error in the calculation of Condon errors for a source",
exc_info=True)
return 0., 0., 0., 0., 0., 0., 0.


def open_fits(fits_path: Union[str, Path], memmap: Optional[bool] = True):
"""
This function opens both compressed and uncompressed fits files.
Args:
fits_path: Path to the fits file
memmap: Open the fits file with mmap.
Returns:
HDUList loaded from the fits file
"""

if isinstance(fits_path, Path):
fits_path = str(fits_path)

hdul = fits.open(fits_path, memmap=memmap)

# This is a messy way to check, but I can't think of a better one
if len(hdul) == 1:
return hdul
elif type(hdul[1]) == fits.hdu.compressed.CompImageHDU:
return fits.HDUList(hdul[1:])
else:
return hdul
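The HDU-selection rule of the new `open_fits` can be sketched without astropy, using stand-in classes (the class names mirror the astropy ones but are defined here purely for illustration):

```python
class PrimaryHDU:      # stand-in for fits.PrimaryHDU
    pass


class CompImageHDU:    # stand-in for fits.CompImageHDU
    pass


def select_hdus(hdul: list) -> list:
    """Return the HDU list downstream code should see.

    Single-HDU files pass through unchanged; for compressed files the
    empty primary HDU is dropped so the image HDU lands at index 0.
    """
    if len(hdul) == 1:
        return hdul
    if isinstance(hdul[1], CompImageHDU):
        return hdul[1:]
    return hdul


print(len(select_hdus([PrimaryHDU()])))                  # 1
print(len(select_hdus([PrimaryHDU(), CompImageHDU()])))  # 1
```

Either way, callers can index the returned list at 0 and get the image data, which is what the rest of the pipeline assumes.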
