Names in 2019: ftp://ftp.cdc.gov/pub/Health_Statistics/NCHS/Program_Code/NHIS/2019/
Names in 2018 (and before up to 2010): ftp://ftp.cdc.gov/pub/Health_Statistics/NCHS/Program_Code/NHIS/2018/
Update WVS Round 4 to 2020 version
Check results and encoding issue in variable label (update: still there in Stata 13, not Stata 14+).
Additional things to consider:
Dataset names
I like the initial "acronym + year" convention, but it produces strange names for multiple-year survey datasets:
ess1214 (not used) and ess0816
wvs9904 (unavoidable)
nhis1017 (unavoidable, unless we use a single year, but that removes any demo of keep if year)
gss7616 (unavoidable, unless we separate the years)
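As a rough illustration of the "acronym + year" convention, here is a minimal sketch in Python (used purely for illustration; the course itself runs on Stata). The function name dataset_name and its parameters are hypothetical and not part of the course files. It assumes that single-year datasets take the full four-digit year (as in ess2008 or qog2019), while multiple-year datasets take the last two digits of the first and last years (as in ess0816).

```python
from typing import Optional


def dataset_name(acronym: str, first_year: int, last_year: Optional[int] = None) -> str:
    """Build a dataset name from a survey acronym and its year(s).

    Hypothetical helper: the naming rule is inferred from the names
    listed in this issue, not taken from any course script.
    """
    if last_year is None or last_year == first_year:
        # Single-year dataset: acronym followed by the four-digit year.
        return f"{acronym}{first_year}"
    # Multiple-year dataset: acronym followed by the last two digits
    # of the first and last years, zero-padded.
    return f"{acronym}{first_year % 100:02d}{last_year % 100:02d}"


if __name__ == "__main__":
    print(dataset_name("ess", 2008, 2016))   # ess0816
    print(dataset_name("wvs", 1999, 2004))   # wvs9904
    print(dataset_name("nhis", 2010, 2017))  # nhis1017
    print(dataset_name("gss", 1976, 2016))   # gss7616
    print(dataset_name("ess", 2016))         # ess2016
```

The sketch also shows why the multiple-year names look odd: a name such as wvs9904 no longer reads as a year, and nothing in it signals whether the dataset covers every year in the range or only the two endpoints.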
Merged datasets
Is it still a good idea to merge datasets, e.g. for ESS? Probably not, especially if we need to limit datasets to 2,048 variables for Stata/IC.
Keep NHIS with multiple years. Use it to demo keep if year.
Keep WVS with multiple years (country-dependent).
Break down GSS.
Break down ESS.
Both WVS and ESS are used to demo keep if inlist(country, …), the other subset we want to show.
Additional datasets
It would make a lot of sense to have more datasets for the students to use than those used in the do-files.
Currently, the do-files are selective anyway: we provide ESS 2016 (Round 8) but do not use the data, even though the dependent variable also exists in that round.
GSS has a single codebook, so bundling many years would duplicate the codebook in the ZIP archives. Not ideal.
ESS could be broken down to Rounds 4 (2008), 8 (2016) and 9 (2018).
Closes #21, #22 and #23 (copied below), #27.
Update from 2023
Stop updating the data, really (except for ESS, perhaps).
data-raw/
srqm_data to use data-raw/
_readme documents
Detailed notes
-- since QOG 2023 is out
qog2023
qog2019
eu_* variables
-- since GSS has updated too
gss7221
gss7616 (but see below)
older years
one old year too
possibly break down single data into yearly ones?
restrict to 1976 and 2016
raises question as to how to zip it all (currently uses gss7616* to match files)
-- in order to continue using torture question?
ess2008
ess0816, or ess2008 and ess2016 (different codebooks, so it's fine)
_merge problem
ess2016 despite not in use anywhere in the course do-files
wvs9904 -- keep old version for sharia law question
ess2016)
nhis202* recent year
nhis1020?
Note on QOG -- offers only this as a replacement in 2023, which is not ideal:
The plan for 2021:
week12.do.
week6.do (which uses Round 4 only right now, despite trrtort also existing for Round 8).
ess0810. Note: in previous course versions, ess0810 contained Rounds 4 (2008) and 5 (2010).