Discussion: should we cover SQL, and if so, in which course? #70

gvwilson · 2019-05-21T21:45:39Z

Should we explain how to write SQL queries?
If so, should this go in the novice or intermediate course?

lwjohnst86 · 2019-05-23T06:26:52Z

I think it would be better suited to the intermediate course. The fewer "new" things the novice course has to focus on, the better (e.g. how to start thinking in the R language). Plus, at least in my field, we don't really work with SQL databases, and if you did, the dplyr package can connect to the database so you can keep using the dplyr verbs to extract the data (dplyr, via its dbplyr backend, translates them to SQL under the hood).

ljdursi · 2019-05-23T13:25:46Z

I feel like being at least somewhat aware of SQL is probably a prerequisite for an RSE. Using pandas / dplyr for those operations on small local data in a couple of tables in the novice material, and then using that as context for SQL queries against bigger or external databases in the intermediate material, might be a good way to go.

joelostblom · 2019-05-28T01:45:10Z

I vote yes - intermediate.

My general opinion is that once the concepts of querying datasets are understood in pandas, dplyr, or SQL, transitioning between them is mostly a matter of getting familiar with a new syntax, and it is more important to hammer home the fundamental concepts in one syntax than to teach many different ones. That said, the abundance of SQL (and, not to forget, of people talking about SQL) justifies including an introduction, imo. And although I think learners would pick it up quickly once they encounter it in the wild, I see value in introducing it to show how the skills we have taught so far translate well, and to make SQL seem less intimidating.

The pandas docs have a relevant comparison of commands that we could consider including: https://pandas.pydata.org/pandas-docs/stable/getting_started/comparison/comparison_with_sql.html

ChristinaLK · 2019-06-08T13:44:19Z

A bit late to this, but I have always liked teaching SQL before R in Data Carpentry because you get exposed to the basic table operations (selecting columns and rows, grouping, and both row-by-row and summary transformations) in a way that's a little less programmy.

All that to say, I don't think SQL is appropriate in the novice material, but framing the R and Python sections that do this as "these are universal table operations" is important.
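To make the "universal table operations" framing concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The `surveys` table and its values are made up purely for illustration, and the dplyr / pandas counterparts named in the comments are rough equivalents rather than exact translations.

```python
import sqlite3

# Hypothetical toy table, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE surveys (species TEXT, plot INTEGER, weight REAL)")
conn.executemany(
    "INSERT INTO surveys VALUES (?, ?, ?)",
    [("DM", 1, 40.0), ("DM", 2, 44.0), ("PF", 1, 8.0), ("PF", 2, 7.0), ("DO", 1, 50.0)],
)

# Select columns and filter rows.
# Roughly dplyr select() + filter(), or pandas df.loc[df.weight > 10, ["species", "weight"]].
heavy = conn.execute(
    "SELECT species, weight FROM surveys WHERE weight > 10 ORDER BY weight"
).fetchall()

# Row-by-row transformation.
# Roughly dplyr mutate(), or pandas df.assign(weight_kg=df.weight / 1000).
in_kg = conn.execute(
    "SELECT species, weight / 1000.0 AS weight_kg FROM surveys ORDER BY weight"
).fetchall()

# Grouped summary.
# Roughly dplyr group_by() + summarise(), or pandas df.groupby("species")["weight"].mean().
mean_weight = conn.execute(
    "SELECT species, AVG(weight) AS mean_weight FROM surveys "
    "GROUP BY species ORDER BY species"
).fetchall()

conn.close()
```

Each query lines up with one of the operations listed above, which is the kind of mapping the R and Python lessons could point to without teaching SQL itself in the novice material.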

lwjohnst86 · 2019-06-11T21:43:20Z

Return to this later (as discussed in 2019-06-11 meeting). Focus on current content. Closing for now.

gvwilson added the "discussion" label (discussion before a proposal) May 21, 2019

lwjohnst86 closed this as completed Jun 11, 2019

lwjohnst86 added the "on hold" label (issues to come back to later) Jun 11, 2019

This was referenced Jun 11, 2019

Discussion: should we cover regular expressions? #71

Closed

Discussion: should we cover JSON? #72

Closed

DamienIrving mentioned this issue Aug 20, 2019

Proposal: adopt outline for intermediate material in #64 #69

Closed
