Discussion: should we cover SQL, and if so, in which course? #70
Comments
I think it would be better suited to the intermediate course. The more the novice course stays focused on only a few things that are "new" (e.g. how to start thinking in the R language), the better. Plus, at least in my field, we don't really work with SQL databases, and if you did, the dplyr package can connect to the database so you can keep using the dplyr verbs to extract the data (dplyr converts them to SQL commands under the hood).
I feel like being at least somewhat aware of SQL is probably a prerequisite for an RSE. A good way to go might be to use pandas / dplyr for those operations on small local data in a couple of tables in the novice material, and then use that as context for SQL queries against bigger or external databases in the intermediate material.
I vote yes - intermediate. My general opinion is that once the concepts of querying datasets are understood in pandas, dplyr, or SQL, transitioning between them is mostly a matter of getting familiar with a new syntax, and it is more important to hammer home the fundamental concepts in one syntax than to teach many different ones. That said, the abundance of SQL (and, not to forget, of people talking about SQL) justifies including an introduction imo. And although I think learners would pick it up quickly once they encounter it in the wild, I see value in introducing it to show how the skills we have taught so far translate well, and to make SQL seem less intimidating. The pandas docs have a relevant comparison of commands to consider including: https://pandas.pydata.org/pandas-docs/stable/getting_started/comparison/comparison_with_sql.html
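To illustrate the "mostly a new syntax" point, here is a minimal sketch that runs the same filter-then-aggregate query once in pandas and once in SQL. The `surveys` table, its columns, and its values are invented for illustration, and the in-memory SQLite database is only an assumption standing in for whatever database a lesson might use.

```python
import sqlite3

import pandas as pd

# Hypothetical toy data, invented purely for illustration.
surveys = pd.DataFrame(
    {
        "species": ["DM", "DM", "PF", "PF", "DO"],
        "year": [2001, 2002, 2001, 2002, 2002],
        "weight": [40.0, 44.0, 8.0, 7.0, 50.0],
    }
)

# pandas: keep rows from 2002, then average weight per species.
pandas_result = (
    surveys[surveys["year"] == 2002]
    .groupby("species", as_index=False)["weight"]
    .mean()
    .sort_values("species")
    .reset_index(drop=True)
)

# SQL: the same query against an in-memory SQLite copy of the table.
conn = sqlite3.connect(":memory:")
surveys.to_sql("surveys", conn, index=False)
sql_result = pd.read_sql_query(
    """
    SELECT species, AVG(weight) AS weight
    FROM surveys
    WHERE year = 2002
    GROUP BY species
    ORDER BY species
    """,
    conn,
)
conn.close()

print(pandas_result)
print(sql_result)
```

The boolean mask plays the role of `WHERE`, and `groupby(...).mean()` plays the role of `GROUP BY` with `AVG`, so both results contain the same rows.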
A bit late to this, but I have always liked teaching SQL before R in Data Carpentry because you get exposed to the basic table operations (selecting columns and rows, grouping, and both row-by-row and summary transformations) in a way that's a little less programmy. All that to say, I don't think SQL is appropriate in the novice material, but framing the R and Python sections that do this as "these are universal table operations" is important.
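As a rough sketch of those table operations expressed in plain SQL, the snippet below uses Python's built-in `sqlite3` module only as a runner. The `surveys` table, its column names, and its values are hypothetical and exist just for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE surveys (species TEXT, year INTEGER, weight_g REAL)")
conn.executemany(
    "INSERT INTO surveys VALUES (?, ?, ?)",
    # Hypothetical rows, invented for illustration.
    [("DM", 2001, 40.0), ("DM", 2002, 44.0), ("PF", 2001, 8.0), ("PF", 2002, 7.0)],
)

# Select columns and filter rows.
selected = conn.execute(
    "SELECT species, weight_g FROM surveys WHERE year = 2002 ORDER BY species"
).fetchall()

# Row-by-row transformation: one output row per input row.
in_kg = conn.execute(
    "SELECT species, weight_g / 1000.0 AS weight_kg "
    "FROM surveys ORDER BY species, year"
).fetchall()

# Summary transformation: one output row per group.
per_species = conn.execute(
    "SELECT species, COUNT(*) AS n, MAX(weight_g) AS max_g "
    "FROM surveys GROUP BY species ORDER BY species"
).fetchall()

conn.close()

print(selected)
print(in_kg)
print(per_species)
```

Each query maps onto one of the operations named above, which is the sense in which they can be framed as universal rather than specific to R or Python.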
Return to this later (as discussed in 2019-06-11 meeting). Focus on current content. Closing for now.