@thomasnwilson suggested that we develop a function/package that can lint REDCap dictionaries and return a markdown/html report. These are the initial rules we thought of while waiting at the airport.
REDCapLintR: Tool for REDCap Dictionary Good Practices
Working title: "REDCrapR" or "CrapR" or "Moving from REDCrap to REDCap"
Rule: no variable should end in "_v\d" (eg, _v2 or _v3)
Opinion: each variable in a sequence should have a meaningful name that clearly communicates its position in the sequence.
Examples of bad behavior: age, age_v2, and age_v2_v2
Suggested fix: Rename variables to age_baseline, age_discharge, age_followup
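A minimal sketch of how this rule could be checked, written in Python only for illustration (the eventual package would presumably be R). The function name `find_versioned_names` is hypothetical, and the pattern uses `\d+` rather than `\d` so that suffixes like `_v10` are also caught, which is an assumption beyond the rule as stated.

```python
import re

# Trailing version suffix such as _v2 or _v10 (assumption: \d+ instead of \d).
VERSION_SUFFIX = re.compile(r"_v\d+$")

def find_versioned_names(field_names):
    """Return the field names that end in a version suffix like _v2."""
    return [name for name in field_names if VERSION_SUFFIX.search(name)]

print(find_versioned_names(["age", "age_v2", "age_v2_v2", "age_baseline"]))
# → ['age_v2', 'age_v2_v2']
```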
Smell: at least 10% of text variables should have validation
Smell: at least 20% of variables should be non-text, like dropdowns or sliders
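The two smells above could be computed roughly as follows. This is a sketch under stated assumptions: each field is a dict with hypothetical keys `field_type` and `validation`, only the `text` type counts as text, and the thresholds are the 10% and 20% figures from the smells.

```python
def check_smells(fields, min_validated=0.10, min_non_text=0.20):
    """Return warnings for the two dictionary-wide smells.

    Assumption: `fields` is a list of dicts with 'field_type' and
    'validation' keys; this layout is illustrative, not a REDCap API.
    """
    text = [f for f in fields if f["field_type"] == "text"]
    validated = [f for f in text if f.get("validation")]
    warnings = []
    if text:
        share = len(validated) / len(text)
        if share < min_validated:
            warnings.append(f"{share:.0%} of text fields have validation")
    if fields:
        share = (len(fields) - len(text)) / len(fields)
        if share < min_non_text:
            warnings.append(f"{share:.0%} of fields are non-text")
    return warnings
```

An empty list means neither smell was detected. Reporting these as warnings rather than errors seems to fit the "smell" framing.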
All piped values should originate from variables, events, or smart variables that are currently in the dictionary.
Check that all these still exist among a combined list of variables, events, & smart variables.
regex: \[[a-z][a-z0-9_-]*\]
All embedded variables should originate from variables, events, or smart variables that are currently in the dictionary.
Check that all these still exist among a combined list of variables, events, & smart variables.
regex: \{[a-z][a-z0-9_-]*\}
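Both reference checks (piped `[...]` and embedded `{...}`) could share one helper. A minimal sketch, assuming the caller has already built the combined list of variables, events, and smart variables; `unknown_references` and the sample names are hypothetical. The two patterns are the ones given above, with a capture group added around the name.

```python
import re

# The two patterns from the rules, with a capture group around the name.
PIPED = re.compile(r"\[([a-z][a-z0-9_-]*)\]")
EMBEDDED = re.compile(r"\{([a-z][a-z0-9_-]*)\}")

def unknown_references(text, known_names):
    """Return referenced names that are missing from the known list."""
    refs = PIPED.findall(text) + EMBEDDED.findall(text)
    return sorted(set(refs) - set(known_names))

# Hypothetical combined list of variables, events, and smart variables.
known = {"first_name", "record-name", "enrollment_arm_1"}
label = "Hello [first_name], see {dob} for [old_var]."
print(unknown_references(label, known))
# → ['dob', 'old_var']
```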
Rule: all date variables should have the same format within the project. Don't mix & match dmy and mdy.
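One way to sketch the date-format check is to group date fields by the order encoded in their validation type. This assumes validation names of the form `date_dmy`, `datetime_mdy`, or `datetime_seconds_ymd`; the function name and input layout are hypothetical. The sketch also reports `ymd`, which the rule does not mention, so whether mixing `ymd` with the others counts as a violation is left to the caller.

```python
import re

# Assumption: validation names look like date_dmy, datetime_mdy,
# or datetime_seconds_ymd.
DATE_VALIDATION = re.compile(r"^date(?:time)?(?:_seconds)?_(dmy|mdy|ymd)$")

def date_orders(validations_by_field):
    """Group date-like fields by their day/month/year order."""
    orders = {}
    for name, validation in validations_by_field.items():
        match = DATE_VALIDATION.match(validation or "")
        if match:
            orders.setdefault(match.group(1), []).append(name)
    return orders

orders = date_orders({"dob": "date_dmy", "visit": "datetime_mdy", "email": "email"})
if len(orders) > 1:
    print("mixed date formats:", orders)
```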
All forms/instruments should be mapped to at least one event
Exception: the instrument name ends with "_retired"
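The mapping rule and its exception could be sketched like this. Both inputs are assumptions: a list of all instrument names, and the instrument names that appear in the project's instrument-event mapping. `unmapped_instruments` is a hypothetical name.

```python
def unmapped_instruments(instruments, mapped_instruments):
    """Return instruments with no event mapping, skipping retired ones."""
    mapped = set(mapped_instruments)
    return [
        name
        for name in instruments
        # Exception from the rule: names ending in "_retired" are ignored.
        if name not in mapped and not name.endswith("_retired")
    ]

print(unmapped_instruments(
    ["demographics", "labs", "intake_retired"],
    ["demographics"],
))
# → ['labs']
```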
Rule: any variable with something like "phone" in the variable name, field label or field note should have a phone validation. Tokens include
phone
mobile
cell
contact number
Rule: any variable with something like "number" in the variable name, field label or field note should have an integer or numeric validation. Tokens include
number
age
count
Rule: any variable with something like "zip code" in the variable name, field label or field note should have a zip code validation. Tokens include
zip
zip_code
zipcode
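The three token rules above (phone, number, zip code) share one shape, so a single table-driven check might cover them. This is a sketch, not a settled design: the field keys (`name`, `label`, `note`, `validation`) and the accepted validation names (`phone`, `integer`, `number`, `zipcode`) are assumptions. Tokens are matched as whole words after underscores and punctuation are turned into spaces, so `age` matches `age_baseline` but not `page`.

```python
import re

# rule name -> (tokens from the rule, validation names assumed acceptable)
TOKEN_RULES = {
    "phone": (["phone", "mobile", "cell", "contact number"], {"phone"}),
    "number": (["number", "age", "count"], {"integer", "number"}),
    "zip": (["zip", "zip_code", "zipcode"], {"zipcode"}),
}

def _normalize(text):
    """Lowercase, turn non-alphanumeric runs (including "_") into one
    space, and pad with spaces so tokens match whole words only."""
    return " " + re.sub(r"[^a-z0-9]+", " ", text.lower()).strip() + " "

def missing_validation(field):
    """Return the names of token rules this field appears to violate."""
    haystack = _normalize(" ".join([field["name"], field["label"], field["note"]]))
    problems = []
    for rule, (tokens, accepted) in TOKEN_RULES.items():
        hit = any(_normalize(token) in haystack for token in tokens)
        if hit and field["validation"] not in accepted:
            problems.append(rule)
    return problems

field = {"name": "home_phone", "label": "Home phone", "note": "", "validation": ""}
print(missing_validation(field))
# → ['phone']
```

Note that the token lists overlap: a label such as "Contact number" triggers both the phone and the number rule, so a real implementation would probably need a priority order between rules.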
Rule: any variable with something like T/F, Y/N in the variable name, field label or field note should code true/yes/on as "1" and false/no/off as "0". Tokens include (case insensitive):
t/f
true/false
y/n
yes/no
on/off
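A sketch of the coding check itself, assuming choices are stored as a single string in the `code, label | code, label` form. Unlike the rule, which looks for tokens in the name, label, or note, this version inspects the choice labels directly, which is a simplification. The function names and the single-letter labels (`t`, `f`, `y`, `n`) are assumptions.

```python
TRUE_LABELS = {"true", "yes", "on", "t", "y"}
FALSE_LABELS = {"false", "no", "off", "f", "n"}

def parse_choices(raw):
    """Split 'code, label | code, label' into (code, label) pairs."""
    pairs = []
    for chunk in raw.split("|"):
        code, _, label = chunk.partition(",")
        pairs.append((code.strip(), label.strip()))
    return pairs

def boolean_coding_problems(raw_choices):
    """Flag true-like labels not coded '1' and false-like labels not coded '0'."""
    problems = []
    for code, label in parse_choices(raw_choices):
        key = label.lower()
        if key in TRUE_LABELS and code != "1":
            problems.append(f"{label!r} is coded {code!r}, expected '1'")
        elif key in FALSE_LABELS and code != "0":
            problems.append(f"{label!r} is coded {code!r}, expected '0'")
    return problems

print(boolean_coding_problems("1, Yes | 2, No"))
# → ["'No' is coded '2', expected '0'"]
```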
Rule: male & female should be coded consistently as 1/0, 1/2, or 8507/8532 (for OMOP). Tokens include
m/f
male/female
?? can we expand this to tri-state variables like yes/no/maybe or yes/no/null ??
Rule: multiple-choice response options are coded as integers (instead of letters)
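This last rule is straightforward to sketch, again assuming choices arrive as a `code, label | code, label` string. `non_integer_codes` is a hypothetical name, and allowing a leading minus sign (for codes like `-99`) is an assumption.

```python
import re

# Assumption: negative codes such as -99 are acceptable integers.
INTEGER_CODE = re.compile(r"^-?\d+$")

def non_integer_codes(raw_choices):
    """Return the choice codes that are not plain integers."""
    codes = [chunk.partition(",")[0].strip() for chunk in raw_choices.split("|")]
    return [code for code in codes if not INTEGER_CODE.match(code)]

print(non_integer_codes("a, Apple | b, Banana | 3, Cherry"))
# → ['a', 'b']
```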