Write downloader task for airflow #3

Open
tdunning opened this issue Aug 27, 2021 · 0 comments
Labels
good first issue Good for newcomers

Comments

@tdunning
Member

We need to set up a periodic Airflow task that does the following roughly every hour:

  • access the U of Iowa archive of MRMS data
  • scan today's directory structure for the list of data files
  • compare that list against previously downloaded files
  • download any file that is new or has changed
  • signal Airflow about changes (if necessary)
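
The compare step in the list above could be sketched roughly as follows. This is only an illustration, not a settled design: it assumes each file can be described by a (size, last-modified) pair taken from the archive listing, and the `Listing` type and `files_to_download` name are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical record: file name -> (size in bytes, last-modified string).
# Which fields the archive listing actually exposes is an assumption.
Listing = Dict[str, Tuple[int, str]]


def files_to_download(remote: Listing, local: Listing) -> List[str]:
    """Return names that are new or whose metadata differs from the local record."""
    return sorted(
        name for name, meta in remote.items() if local.get(name) != meta
    )


if __name__ == "__main__":
    remote = {"a.grib2.gz": (100, "10:00"), "b.grib2.gz": (250, "10:02")}
    local = {"a.grib2.gz": (100, "10:00"), "b.grib2.gz": (200, "10:01")}
    print(files_to_download(remote, local))
```

A non-empty result could also serve as the "something changed" signal for downstream tasks.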

On backfill,

  • download with at least 20 seconds between files
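
A minimal sketch of the backfill throttle, assuming the 20 seconds is measured from the end of one download to the start of the next. The `throttled_download` name and the `fetch` callable are hypothetical; `fetch` stands in for whatever actually retrieves a file.

```python
import time
from typing import Callable, Iterable, List


def throttled_download(
    urls: Iterable[str],
    fetch: Callable[[str], object],
    min_gap_s: float = 20.0,
    sleep: Callable[[float], object] = time.sleep,
) -> List[str]:
    """Fetch each URL in order, pausing min_gap_s seconds between files."""
    done: List[str] = []
    for i, url in enumerate(urls):
        if i > 0:
            # Pause before every file except the first one.
            sleep(min_gap_s)
        fetch(url)
        done.append(url)
    return done
```

Passing `sleep` as a parameter keeps the function testable without real waiting.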

Questions:

  • what signals can we use to detect file changes (for example, when partially complete files are posted)?
  • how should we test data integrity? File size? A quick format check by reading the GRIB file?
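
One possible quick format check, offered as a sketch rather than an answer to the question: a GRIB message begins with the bytes `GRIB` and ends with `7777`, so a truncated file usually fails one of the two tests. The code assumes the files may be gzip-compressed, and `looks_like_grib` is a hypothetical name. It does not validate section lengths or decode any data.

```python
import gzip
import zlib


def looks_like_grib(data: bytes) -> bool:
    """Cheap sanity check: GRIB start marker and '7777' end marker present."""
    if data[:2] == b"\x1f\x8b":  # gzip magic number
        try:
            data = gzip.decompress(data)
        except (OSError, EOFError, zlib.error):
            # Corrupt or truncated gzip stream.
            return False
    return len(data) >= 8 and data[:4] == b"GRIB" and data[-4:] == b"7777"
```

A full read with a GRIB library would be a stronger test, at the cost of more time per file.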

Links:

See https://github.com/agstack/weather-server/blob/main/experiments/data/mrms.jl for data URLs
