Where does data live? #204

Open
hadley opened this issue Jun 24, 2020 · 2 comments

Comments

@hadley

hadley commented Jun 24, 2020

If your app needs some static data, where should it live? Or if it needs dynamic data, what do you recommend?

@ColinFay
Member

Hey,

I've written a small part about this here: https://engineering-shiny.org/optim-caveat.html#reading-data

My advice would be (in the context of an application that will be deployed on a server):

  • Small datasets that are not subject to change can be shipped as package data. I say small because, a few months back, I benchmarked a Shiny app that bundled "just" 300 MB of data: once deployed and accessed by a few dozen users at the same time, it quickly filled the server's memory.

  • Those 300 MB should be fine if the application is not meant to run on a server but to be used as a package inside an R session: each user relies on their own RAM, so that should be OK. The only issue in that case is that bundling the data with the application makes the package check slower, and the launch of the app a little slower too. A good trick might be to keep the dataset as a csv, fst, or feather file and read the data progressively as the app needs it.

  • Otherwise, I would advise using an external database, even more so in the context of deployment: it will make the app faster and more efficient in terms of RAM and speed. It might require a bit of technical knowledge for the devs to set it up by themselves, but nothing undoable, or maybe they can ask IT for a small db. And when it comes to interacting with it, I'd advise {DBI} + {dbplyr} to prevent SQL injection.
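
To make the last point concrete, here is a minimal sketch of querying a database with {DBI} and {dbplyr} so that user input is never pasted into a SQL string. It is an illustration under assumptions, not something from the linked chapter: an in-memory SQLite database (via {RSQLite}) stands in for the external database, and the table `measurements`, its columns, and both helper functions are hypothetical names.

```r
library(DBI)
library(dplyr)  # dbplyr must be installed for tbl() to work on a DBI connection

# Assumption: in-memory SQLite stands in for the real external database.
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "measurements", data.frame(
  site  = c("a", "a", "b"),
  value = c(1.5, 2.5, 10)
))

# Hypothetical helper using dbplyr: `site_input` would typically come from a
# Shiny input. dbplyr translates the pipeline to SQL and escapes the value as
# a string literal, so it is never interpreted as SQL.
site_mean_dbplyr <- function(con, site_input) {
  tbl(con, "measurements") %>%
    filter(site == !!site_input) %>%
    summarise(mean_value = mean(value, na.rm = TRUE)) %>%
    collect()
}

# Hypothetical helper using plain DBI: the `?` placeholder is bound through
# `params`, so the input is passed as data rather than spliced into the query.
site_rows_dbi <- function(con, site_input) {
  dbGetQuery(
    con,
    "SELECT site, value FROM measurements WHERE site = ?",
    params = list(site_input)
  )
}

res_mean <- site_mean_dbplyr(con, "a")
res_rows <- site_rows_dbi(con, "a")

# An injection-style input is treated as an ordinary (non-matching) site name.
hostile <- site_rows_dbi(con, "a' OR '1'='1")

dbDisconnect(con)
```

In both helpers only the matching rows (or the aggregate) are pulled into the R session, which is the memory benefit of keeping the data in a database instead of loading it whole.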

@hadley
Author

hadley commented Jun 25, 2020

It sounds like the main issue with including data in the app is orthogonal to whether or not you use a package — if you have one R session per user, and want to support multiple simultaneous users, you need to be aware of memory usage in a way that you don't normally have to be.
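
As a rough back-of-the-envelope sketch of that point: assuming one R process per user and one full copy of the data in each (the worst case described above), memory for the data alone grows linearly with the number of simultaneous users. The function name and the user counts below are hypothetical; the 300 MB figure is the one from the benchmark mentioned earlier in the thread.

```r
# Rough estimate only: assumes one R process per user, each holding a full
# copy of the dataset, and ignores everything else the app keeps in memory.
estimate_data_memory_mb <- function(data_mb, n_users) {
  data_mb * n_users
}

# Hypothetical user counts; 300 MB is the dataset size quoted in the thread.
users <- c(1, 10, 36)
estimate <- estimate_data_memory_mb(300, users)

print(data.frame(users = users, data_memory_mb = estimate))
```

The estimate is deliberately crude, but it shows why a dataset that is harmless in a single local session becomes a constraint once many sessions run on one server.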
