Home
ipacheckscto is a Stata module that checks your SurveyCTO XLS form for common programming errors and departures from best practice. It lists issues and recommendations in the Stata Results window, with an option to export them to an Excel file. The SurveyCTO server and the desktop application already have built-in form validation tools that check XLS forms for syntax errors. ipacheckscto complements these tools by running additional tests for the following:
- common programming errors
- IPA recommended practices
ipacheckscto should therefore be used only after validating the XLS form through the server or the desktop application.
NB: Not all issues flagged by ipacheckscto will require correction. Review the output carefully and decide if you need to make any changes to your form.
ipacheckscto makes extensive use of Stata. Some parts of the program also depend heavily on Stata's Excel modules to create output files that are easy to use and share. The Excel modules are only available in Stata 14 or later, so Stata 14.0 or later must be installed before running ipacheckscto. IPA employees with older versions of Stata should contact IT for access to a newer version.
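As a minimal sketch of the version requirement above, a do-file could stop early when the running Stata is too old. The error text and the choice of exit code are illustrative assumptions and are not taken from ipacheckscto itself; `c(stata_version)` is Stata's built-in version value.

```stata
* Stop with an error if the running Stata is older than version 14
* (message wording and exit code are illustrative assumptions)
if c(stata_version) < 14 {
    display as error "ipacheckscto requires Stata 14.0 or later"
    exit 9
}
display "Stata version " c(stata_version) " is recent enough"
```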
ipacheckscto is distributed as a Stata package through GitHub. You can install ipacheckscto directly from GitHub by running the following line of code in the Stata command window or from a do-file:
net install ipacheckscto, all replace from("https://raw.githubusercontent.com/PovertyAction/ipacheckscto/master/ado")
The ipacheckscto package includes the following files:
- ipacheckscto.ado – Stata program file
- ipacheckscto.sthlp – Stata helpfile
- ipacheckscto.dlg – Stata dialog file
help ipacheckscto
outfile(filename)
Specifies the output path and filename for exporting results to Excel. The filename must include the extension .xls or .xlsx. If outfile() is not specified, results will be displayed in the Stata Results window only.
other(integer)
Specifies the integer value for the other specify option. If this option is specified, ipacheckscto will flag select_one/select_multiple fields which:
- use an other specify option in their choice list but are missing a child other specify field;
- have a child other specify field, but the parent field appears after the child field;
- use the or_other syntax, for instance select_one fruits or_other.
If other() is not specified, ipacheckscto will only check for the or_other syntax.
dontknow(integer)
Specifies the integer value for the don't know option. If specified, ipacheckscto will flag:
- select_one/select_multiple fields whose choice list does not include a don't know option;
- integer/decimal/text fields that do not accept don't know values.
If dontknow() is not specified, ipacheckscto will skip the don't know check entirely.
refuse(integer)
Specifies the integer value for the refuses to answer option. If specified, ipacheckscto will flag:
- select_one/select_multiple fields whose choice list does not include a refuses to answer option;
- integer/decimal/text fields that do not accept refuses to answer values.
If refuse() is not specified, ipacheckscto will skip the refuse check entirely.
Check a SurveyCTO XLS form and display results on Stata window.
ipacheckscto using "C:\Users\Documents\Bontanga Baseline.xlsx"
Check a SurveyCTO XLS form and display results on Stata window. Include checks for other specify with the value of -666
ipacheckscto using "C:\Users\Documents\Bontanga Baseline.xlsx", other(-666)
Check a SurveyCTO XLS form and display results on Stata window. Include checks for other specify (-666), dontknow(-999) and refuse to answer (-888).
ipacheckscto using "C:\Users\Documents\Bontanga Baseline.xlsx", other(-666) dontknow(-999) refuse(-888)
Check a SurveyCTO XLS form and display results on Stata window. Include checks for other specify (-666), dontknow(-999) and refuse to answer (-888). Export results to excel file "C:\Users\Documents\output\botanga_baseline_check.xlsx"
ipacheckscto using "C:\Users\Documents\Bontanga Baseline.xlsx", other(-666) dontknow(-999) refuse(-888) outfile("C:\Users\Documents\output\botanga_baseline_check.xlsx")
Get the dialog box by typing db ipacheckscto in the Stata command window:
Check a SurveyCTO XLS form and display results in Stata results window. Include checks for other specify (-666), dontknow(-999) and refuse to answer (-888).
Check a SurveyCTO XLS form and export results to "C:/.
Check a SurveyCTO XLS form and export results to "C:/. Include checks for other specify (-666), dontknow(-999) and refuse to answer (-888).
While using ipacheckscto, the programmer may need to run it multiple times on their instrument. The diagram below shows a suggested workflow for programming in SurveyCTO.
1. Get the XLS version of the SurveyCTO form: Most programmers already program their survey in Excel and already have the XLS form. If the XLS form is on Google Sheets, you will need to download it in .xlsx format. You may also download an XLS form from the server if you used the online form editor.
2. Upload your XLS form to a SurveyCTO server or run it through the validate form tool in the SurveyCTO Desktop application to verify and correct any syntax issues. You can skip this step if you already downloaded your form from the server.
3. Run ipacheckscto and review the results carefully. Make any necessary adjustments to your form. Repeat steps 2 and 3 until you are satisfied with the status of your form.
4. Download the form onto the Collect app for bench testing/piloting.
Check 0. summary
The summary results/sheet shows information about the XLS form as well as a summary of the results from the various checks. In the workbook named in `outfile()`, the sheet "summary" contains the result of this check. The form details section of the summary sheet shows basic information about the XLS form that was checked. This includes:
Filename: Actual filename of the XLS form that was checked
Form Title: This is the title of your form. The form title is under the column “form_title” in the “settings” sheet of the XLS form.
Form ID: This is the unique ID that will identify the form. The form ID is under the column “form_id” in the “settings” sheet of the XLS form.
Form Definition Version: This is the version number of the form, which you must increase each time you modify an existing form. If you started with a form template or with one of the sample forms, the version is set automatically by a formula, so it may not match the version number on the server.
Number of Languages: This is the number of languages in the XLS form. This number is determined by counting the number of label columns in the “survey” sheet of the XLS form.
Default Language: This is the name of the language associated with labels, images, and other content when no other language is specified.
Form Encrypted: This indicates “Yes” if the form data is encrypted using the SurveyCTO encryption keys and “No” otherwise. IPA requires that all SurveyCTO forms be encrypted.
Number of Publishable Fields: This indicates the number of fields marked as publishable in the XLS form if the XLS form is encrypted. This is left blank if the XLS form is not encrypted.
Submission URL: This is the submission URL to use when submitting encrypted forms. The form will not accept submissions if it is uploaded to a server other than the one indicated in the “submission_url” column of the “settings” sheet. The SurveyCTO server debug tool does not detect this discrepancy, but enumerators may be unable to submit data from this form to the server and will repeatedly be prompted to re-enter the password whenever they try to submit the form.
This issue can be fixed by unchecking the “Respect submission_url if configured in forms” option, which can be found in the admin settings.
The check summary section of the “summary” sheet indicates the results of each check. The results are color coded based on the following general rules:
- issues identified – This indicates that the specified check was run, and a certain number of issues were identified in the XLS form.
- no issues identified – This indicates that the specified check was run, and no issues were identified in the XLS form.
- check was skipped – This indicates that the specified check was skipped because it was not applicable to the XLS form. This only applies to checks 7, 8 and 9.
Check 1. recommended fields
Checks SurveyCTO XLS form for IPA recommended fields and shows the results in the “1. recommended fields” sheet. This sheet is divided into “required” and “recommended” sections.
The starttime, endtime & duration fields are required for the IPA DMS and should be included in each XLS form. These fields are automatically included in the SurveyCTO XLS templates.
Fields in the recommended section are not required but can be useful for data quality checks. Note that comments, text audits, audio audits, and sensor stream data may severely affect data download times.
Check 2. field names
Field names are expected to be short, unique, and without any spaces or punctuation. The SurveyCTO server debug tool already checks for spaces and punctuation in field names; ipacheckscto complements this by also checking for “.” and “-”, which are ignored by the server. These invalid field names may not cause any problems in the data workflow, but they will be changed in the imported Stata dataset since they are not valid variable names.
Long variable names in your SurveyCTO form may also cause problems in the data workflow due to Stata variable name restrictions. Stata will truncate any long variable name to 32 characters, and this could lead to an error if the first 32 characters of multiple field names are the same. For instance:
In this specific scenario, the field “e01_household_purchase_1_month_chicken” is imported as the variable “e01_household_purchase_1_month_c” while the field “e01_household_purchase_1_month_corned_beef” is imported as the variable “v30”. This is because the first 32 characters of “e01_household_purchase_1_month_corned_beef” are “e01_household_purchase_1_month_c”, a name that already exists in the dataset.
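The collision can be reproduced with a few lines of Stata. This is only an illustration of the 32-character truncation using the two field names from the scenario; the local macro names are hypothetical, and this is not the code ipacheckscto or the SurveyCTO import do-file runs.

```stata
* Field names from the scenario above (local macro names are illustrative)
local f1 "e01_household_purchase_1_month_chicken"
local f2 "e01_household_purchase_1_month_corned_beef"

* Keep only the first 32 characters, as Stata does for variable names
local v1 = substr("`f1'", 1, 32)
local v2 = substr("`f2'", 1, 32)

display "`v1'"
display "`v2'"

* Report whether the two truncated names collide
if "`v1'" == "`v2'" {
    display "both fields truncate to the same variable name"
}
```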
In the Stata import do-file generated by SurveyCTO, the variable names for both fields are exactly the same although the second variable was renamed. This will lead to the error message shown below.
ipacheckscto flags the field for review if the field name is longer than 22 characters.
NB. 22 characters is an arbitrary number.
While naming your fields, it is important to allow for expansion of field name during data quality checks, data cleaning and analysis. Capping the length of the field name at 22 allows for 10 characters to be added to the varname when necessary.
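A rough sketch of the length rule, assuming the field names sit in a string variable called name (as in the “survey” sheet). The toy data and the variable namelen are made up for illustration, and this is not ipacheckscto's own implementation.

```stata
* Toy list of field names; in practice these would come from the survey sheet
clear
input str60 name
"e01_household_purchase_1_month_chicken"
"hh_size"
"respondent_age"
end

* Measure each name and list those longer than the 22-character cap
generate namelen = strlen(name)
list name namelen if namelen > 22
```

Names that pass this rule leave 10 characters of headroom below Stata's 32-character limit for prefixes or suffixes added during cleaning.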
Check 3. disabled, read only
Although it is perfectly valid to mark a field as disabled or readonly, programmers sometimes do so by mistake. This check lists all disabled and readonly fields so the programmer can review them and confirm that they are disabled or readonly on purpose.
Check 4. field requirements
ipacheckscto flags issues with field requirements and exports them to the “4. requirements” sheet. Reviewing this output helps ensure that fields that need to be required are marked as required, and that fields that do not need to be required are either not required or are required on purpose. The following issues are flagged for review:
- integer, text, date, datetime, time, select_one or select_multiple field is not required
- note field is required
- field is readonly and required
- select_one or select_multiple field is required and has the appearance type “label”
- Visible geopoint field is required: Required geopoint fields can prevent the user from finalizing the form if they are unable to record a value for the field.
Check 5. constraint
ipacheckscto flags issues with constraints and exports them to the “5. constraints” sheet. This sheet will contain a list of fields with at least one of the following violations.
- integer or decimal field and missing constraint
- text fields with appearance type “numbers” or “numbers_phone” and missing constraint
- has constraint but is missing constraint message
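The first and third rules above can be sketched on a toy version of the survey sheet. The variable names (in particular constraint_message) and the sample rows are assumptions made for illustration; the actual column names after import and ipacheckscto's own logic may differ.

```stata
* Toy survey sheet; variable names and rows are illustrative assumptions
clear
input str10 type str10 name str20 constraint str30 constraint_message
"integer" "age"     ".>=0 and .<=120" "Age must be 0 to 120"
"integer" "hh_size" ""                ""
"decimal" "land"    ".>0"             ""
"text"    "fname"   ""                ""
end

* Numeric field with no constraint
generate byte no_constraint = inlist(type, "integer", "decimal") & missing(constraint)

* Constraint present but no constraint message
generate byte no_message = !missing(constraint) & missing(constraint_message)

list type name if no_constraint | no_message
```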
Check 6. other specify
The other specify check flags issues with other specify in the XLS form and exports results to the sheet “6. other specify”. This will contain a list of fields with at least one of the following violations.
- or_other syntax: SurveyCTO allows the use of or_other with select_one and select_multiple fields. For instance, select_one gender or_other will automatically create an “other specify” field if the other option is selected. However, the parent field will be stored as a string variable instead of a numeric one, which will cause the import do-file to crash.
- Child fields are specified: Checks that a child field is specified if the choice list for the parent field includes an “other (specify)” option.
- Child field is placed after parent field: Checks that the child field is ordered after the parent field. If the child is placed before the parent, the child field will be skipped until the user attempts to finalize the form.
If the option other() is not used, then only the or_other syntax will be checked.
Check 7. dont know, refuse
The don’t know & refuse check flags fields that do not allow for don’t know and refuses to answer responses. It is a recommended practice to include don’t know and refuse to answer responses in all fields that collect respondent responses. This allows enumerators to correctly record a response when the respondent does not know the answer or is not willing to answer.
Check 8. group names
ipacheckscto checks that the names in each begin & end group pair are the same. SurveyCTO does not require these names to be the same and does not even require a name to be defined for the end group/repeat. However, as a best practice, it is important to keep the names the same for the begin & end rows so that it is easy to identify where groups begin and where they end.
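One way to picture this check is to walk down the survey sheet, remember each begin name, and compare it with the name on the matching end row. The sketch below does this on made-up rows with a local macro used as a stack; it assumes every begin and end row has a non-empty name without spaces, and it is not ipacheckscto's actual implementation.

```stata
* Toy survey rows; the repeat is closed with a misspelled name
clear
input str20 type str20 name
"begin group"  "household"
"text"         "hh_head"
"begin repeat" "members"
"text"         "member_name"
"end repeat"   "member"
"end group"    "household"
end

local stack ""
local mismatches = 0
forvalues i = 1/`=_N' {
    local t = type[`i']
    local n = name[`i']
    if inlist("`t'", "begin group", "begin repeat") {
        * Push the begin name onto the front of the stack
        local stack "`n' `stack'"
    }
    else if inlist("`t'", "end group", "end repeat") {
        * Pop the most recently opened name and compare
        gettoken open stack : stack
        if "`n'" != "`open'" {
            display "row `i': end name `n' does not match begin name `open'"
            local ++mismatches
        }
    }
}
display "mismatched pairs: `mismatches'"
```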
Check 9. repeat fields
`ipacheckscto` checks whether repeat fields are used outside their repeat group. In SurveyCTO, each repeat field can be referenced just like other fields within the same repeat group. When this is done, SurveyCTO assumes that the reference is to the value in the current instance. However, when the field is used outside of its repeat group, SurveyCTO is unable to determine which instance is being referenced. SurveyCTO contains special functions which can be used with repeat fields outside of their repeat group to index the value from the correct instance. The SurveyCTO server and debug tools do not check for this issue; however, a repeat field that is illegally used outside its repeat group causes an error on the Collect app or web interface.
Check 10. choices
`ipacheckscto` checks for various errors in the choices sheet of the XLS form. These include:
- duplicates in the value column for the same choice group
- duplicates in the label column for the same choice group
- missing labels
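The three choice checks can be approximated with Stata's `duplicates tag` on a toy choices sheet. The variable names list_name, value and label follow the usual SurveyCTO column headings; the sample rows and the flag variables are illustrative, and this is a sketch, not the code ipacheckscto runs.

```stata
* Toy choices sheet with one duplicated value and one missing label
clear
input str10 list_name value str20 label
"yesno"  1 "Yes"
"yesno"  0 "No"
"yesno"  1 "Yes, always"
"fruits" 1 "Mango"
"fruits" 2 ""
end

* Count repeats of value and of label within each choice list
duplicates tag list_name value, generate(dup_value)
duplicates tag list_name label, generate(dup_label)

* Flag rows whose label is empty
generate byte missing_label = missing(label)

list if dup_value > 0 | dup_label > 0 | missing_label
```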