PADOCC Package

Padocc (Pipeline to Aggregate Data for Optimal Cloud Capabilities) is a data aggregation pipeline for creating Kerchunk (or alternative) files that represent datasets stored in a variety of original formats. The pipeline currently supports writing JSON or Parquet Kerchunk files for input NetCDF/HDF files. Further development will allow GeoTIFF, GRIB and possibly Met Office (.pp) files to be represented, and will use the Pangeo Rechunker tool to create Zarr stores for Kerchunk-incompatible datasets.
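
For readers unfamiliar with Kerchunk, the sketch below illustrates the general shape of a version 1 Kerchunk JSON reference set: Zarr metadata keys are stored inline, and each chunk key maps to a [url, offset, length] triple pointing into the original file. The file path, variable name, offsets and the byte_range helper are hypothetical and chosen only for illustration; this is not output produced by padocc.

```python
import json

# Hypothetical reference set for one variable "tas" held in a single NetCDF file.
# Metadata entries are stored inline as JSON strings; chunk entries point at
# byte ranges in the source file as [url, offset, length].
refs = {
    "version": 1,
    "refs": {
        ".zgroup": json.dumps({"zarr_format": 2}),
        "tas/.zarray": json.dumps({
            "shape": [2, 4],
            "chunks": [1, 4],      # each chunk holds 4 float32 values = 16 bytes
            "dtype": "<f4",
            "compressor": None,
            "fill_value": None,
            "filters": None,
            "order": "C",
            "zarr_format": 2,
        }),
        "tas/0.0": ["/path/to/example.nc", 8192, 16],
        "tas/1.0": ["/path/to/example.nc", 8208, 16],
    },
}


def byte_range(reference_set, chunk_key):
    """Return (url, start, stop) for the bytes backing one chunk key."""
    url, offset, length = reference_set["refs"][chunk_key]
    return url, offset, offset + length


print(byte_range(refs, "tas/1.0"))
```

A reader of such a file never copies the data: it only needs to fetch the listed byte range from the original file, which is why the source NetCDF/HDF files stay in place.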

Example Notebooks at this link

Documentation hosted at this link

Release 1.3.5

Release date: 17 April 2025

See the release notes for details.

This package acknowledges contributions by Matt Brown as a pre-release tester.

Installation

To install this package, clone the repository using git clone. If release v1.3 has not yet been released, switch to the MigrationOO branch with git checkout MigrationOO.

Then follow the steps below to install the package with the necessary dependencies.

python -m venv .venv
source .venv/bin/activate
pip install poetry
poetry install
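
After the steps above, a quick way to check that the package is visible to the active virtual environment is sketched below. It assumes the import name is padocc (matching the padocc folder in the repository); the is_installed helper is hypothetical and uses only the standard library.

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if a top-level package of this name can be located for import."""
    return importlib.util.find_spec(package_name) is not None


# Assumption: the package installs under the import name "padocc".
if is_installed("padocc"):
    print("padocc is available in this environment")
else:
    print("padocc was not found; check that the virtual environment is active")
```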

Usage

Please refer to the tests/ scripts for how to use the GroupOperation and ProjectOperation classes.
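
Alongside reading the tests, the constructor arguments of the two classes can be inspected directly. The sketch below assumes both classes are importable from the top-level padocc module, which this README does not confirm; describe_class is a hypothetical helper, and it returns None rather than failing if the module or class is missing.

```python
import importlib
import inspect


def describe_class(module_name, class_name):
    """Return 'ClassName(signature)' for a class, or None if it cannot be found."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    cls = getattr(module, class_name, None)
    if cls is None or not inspect.isclass(cls):
        return None
    try:
        signature = str(inspect.signature(cls))
    except (TypeError, ValueError):
        signature = "(signature unavailable)"
    return f"{class_name}{signature}"


# Assumption: the classes are exposed at the top level of the padocc package.
for name in ("GroupOperation", "ProjectOperation"):
    print(describe_class("padocc", name) or f"{name}: not found")
```

Calling help() on either class in an interactive session gives the same information together with any docstrings.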

