---
layout: reference
---

Glossary

File formats relevant to this lesson include tab-separated (tsv), comma-separated (csv), Excel (xls, xlsx), JSON, XML, RDF as XML, and Google Spreadsheets.

{:auto_ids}

csv : A file extension indicating that a text file has values separated by commas (comma-separated values).

Clustering : A method for finding groups of different values that may actually represent the same thing.

Faceting : A method for exploring the values in a variable. In this episode, it is used to identify data entry errors.

Filter : To select a subset of data from a dataframe.

JSON : A file extension indicating that the values in a text file are structured using JavaScript Object Notation (JSON).

RDF : A file extension indicating that the values in a file are structured using Resource Description Framework (RDF).

Regular expressions (regex) : Text strings that describe a search pattern. They usually incorporate wildcards to match letters, numbers, punctuation, spacing, or some combination of these.

tsv : A file extension indicating that a text file has values separated by tabs (tab-separated values).

xls : A file extension indicating that a file is a spreadsheet created by Microsoft Excel.

xlsx : A file extension indicating that a file is a spreadsheet created by Microsoft Excel using XML.

XML : A file extension indicating that the values in a file are structured using Extensible Markup Language (XML).
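
As a rough illustration of how faceting, filtering, clustering, and regular expressions (as defined above) relate to one another, here is a minimal Python sketch. The `values` list and the `normalize` helper are hypothetical and invented for this example. The clustering shown is a simple grouping on a trimmed, lowercased key, which is only one possible approach and not necessarily what any particular tool does.

```python
import re
from collections import Counter, defaultdict

# Hypothetical column of hand-entered values (invented for illustration).
values = ["Nairobi", "nairobi ", "Mombasa", "Nairobi", "MOMBASA", "Kisumu", "Kisumu 2"]

# Faceting: list each distinct value together with how often it occurs.
facet = Counter(values)

# Filter: keep only the entries that exactly match one facet value.
only_nairobi = [v for v in values if v == "Nairobi"]


def normalize(value):
    """Hypothetical clustering key: trim whitespace and lowercase."""
    return value.strip().lower()


# Clustering: group values that share the same normalized key.
groups = defaultdict(list)
for v in values:
    groups[normalize(v)].append(v)

# Keep only groups with more than one distinct spelling; these are
# candidates for values that may represent the same thing.
candidates = {k: sorted(set(vs)) for k, vs in groups.items() if len(set(vs)) > 1}

# Regular expression: find values that end in a digit.
ends_with_digit = [v for v in values if re.search(r"\d$", v)]

print(facet)
print(only_nairobi)
print(candidates)
print(ends_with_digit)
```

Faceting exposes the inconsistent spellings, clustering suggests which of them might be merged, and the filter and regular expression narrow the data to the rows worth inspecting.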