Host minimal example dataset for use in tests and examples #11

Open
2 tasks
Robinlovelace opened this issue Jul 8, 2024 · 2 comments

Comments

@Robinlovelace
Collaborator

We can host:

  • A minimal gzipped CSV
  • A Parquet version of the same data

as a GitHub release, e.g.: https://github.com/Robinlovelace/spanishoddata/releases/tag/v0.0.1

If a release does not allow the Parquet files to be queried directly, which is what I expect, we can host them elsewhere, e.g. on GitHub Pages.
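
As a rough illustration of the first item, a minimal gzipped CSV could be produced along these lines. This is only a sketch using the Python standard library; the function names, the column names (`origin`, `destination`, `trips`) and the values are placeholders made up for the example, not the real dataset schema.

```python
import csv
import gzip
import io


def write_minimal_gzipped_csv(rows, header):
    """Return gzip-compressed CSV bytes for a header plus a few rows."""
    text = io.StringIO(newline="")
    writer = csv.writer(text, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return gzip.compress(text.getvalue().encode("utf-8"))


def read_gzipped_csv(data):
    """Decompress gzip bytes and parse them back into a list of rows."""
    text = gzip.decompress(data).decode("utf-8")
    return list(csv.reader(io.StringIO(text)))


# Placeholder columns and values, for illustration only.
HEADER = ["origin", "destination", "trips"]
ROWS = [["A", "B", "10"], ["B", "A", "7"]]

if __name__ == "__main__":
    payload = write_minimal_gzipped_csv(ROWS, HEADER)
    # The bytes could be written to a .csv.gz file and attached to a release.
    print(len(payload), "bytes")
    print(read_gzipped_csv(payload))
```

Keeping the example file to a handful of rows keeps the release asset tiny, which suits tests and examples.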

@e-kotov
Member

e-kotov commented Jul 13, 2024

Uploading to releases works (at least for data sets that are not too large), as documented in this brilliant example: https://docs.ropensci.org/piggyback/articles/cloud_native.html#duckdb. The files in a GitHub release can be queried efficiently using {duckdb}. However, this is of course inferior to simply having the Parquet files in a Hive-style layout on S3 storage. It is probably good enough for a proof of concept, though.
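
To make the idea concrete, here is a hedged sketch in Python (the linked article uses R). It builds the direct download URL of a release asset and the DuckDB SQL that would read it. The asset name `example.parquet` is hypothetical, as are the helper names; the repository and tag are taken from the release linked above. Running the query assumes the `duckdb` Python package, its `httpfs` extension and network access, so that part is kept in a function that is not called here.

```python
def release_asset_url(owner, repo, tag, filename):
    """Build the direct download URL of a GitHub release asset."""
    return f"https://github.com/{owner}/{repo}/releases/download/{tag}/{filename}"


def parquet_query(url, limit=5):
    """Build a DuckDB query that reads a remote Parquet file over HTTP(S)."""
    return f"SELECT * FROM read_parquet('{url}') LIMIT {int(limit)}"


def run_query(sql):
    """Run the query with DuckDB (assumes duckdb is installed and online)."""
    import duckdb  # imported lazily so the helpers above work without it

    con = duckdb.connect()
    con.execute("INSTALL httpfs")  # extension for reading over HTTP(S)
    con.execute("LOAD httpfs")
    return con.execute(sql).fetchall()


# "example.parquet" is a hypothetical asset name, for illustration only.
URL = release_asset_url("Robinlovelace", "spanishoddata", "v0.0.1", "example.parquet")
SQL = parquet_query(URL)

if __name__ == "__main__":
    print(SQL)
    # rows = run_query(SQL)  # uncomment once a real asset exists
```

A release exposes a flat list of assets, so each file has to be addressed by its own URL, whereas a Hive-style layout on S3 would let a single glob pattern cover all partitions.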

@Robinlovelace
Collaborator Author

That is really good to see!
