New data pipeline: field naming convention

Molly Graber edited this page Sep 18, 2020 · 6 revisions

When creating a new data pipeline, conform to the established field naming convention and data formats so that datasets can be joined. The convention is modeled after the PLUTO standard, which is outlined in PLUTO's data dictionary. This guide is not comprehensive and will continue to grow as new common fields emerge across RDP datasets.

Geographic attributes

Note: If a file is NYC-specific and contains information at the county level or lower, make sure that it has the fields borough and borocode.

| Field name | Data type | Output example | Description |
| --- | --- | --- | --- |
| latitude | double precision | 40.725038 | Latitude of point-level data |
| longitude | double precision | -73.956633 | Longitude of point-level data |
| address | text | 120 Broadway | House number and street name |
| cbg2010 | text | 360470284003 | Full US Census Bureau block group ID, based on 2010 geographies |
| zipcode | text | 10011 | The five-digit zip code |
| council | numeric | 34 | NYC only: The council district ID |
| borocd | numeric | 401 | NYC only: The community district number, where the first digit is the borocode |
| ntacode | varchar(4) | MN36 | NYC only: The neighborhood tabulation area ID. The first two characters are the borough. |
| county | text | Kings | The name of the county |
| fips_county | varchar(5) | 36001 | US Census Bureau county code |
| borough | varchar(2) | MN | NYC only: Two-letter abbreviation of borough name |
| borocode | numeric | 4 | NYC only: Code referring to the borough (1, 2, 3, 4, or 5) |
| state | text | New York | The name of the state |
| location | text | NYC, Region, or Nation | Describes whether data refers to an NYC location, the surrounding metro region, or counties outside of the NYC metro region |
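
Several of the fields above can be derived from one another, which helps keep them consistent within a file. The following is a minimal Python sketch, not part of the convention itself. The function names are hypothetical, and the full borough mapping (MN, BX, BK, QN, SI for codes 1 to 5) is an assumption based on the standard NYC borough codes; only MN appears in the table above.

```python
# Assumed mapping of borocode to the two-letter borough abbreviation.
BOROCODE_TO_BOROUGH = {1: "MN", 2: "BX", 3: "BK", 4: "QN", 5: "SI"}


def borocode_from_borocd(borocd: int) -> int:
    """Return the borocode, i.e. the first digit of a three-digit borocd."""
    borocode = int(borocd) // 100
    if borocode not in BOROCODE_TO_BOROUGH:
        raise ValueError(f"unexpected borocd: {borocd!r}")
    return borocode


def borough_from_ntacode(ntacode: str) -> str:
    """Return the borough abbreviation, i.e. the first two characters of ntacode."""
    return ntacode[:2]


def fips_county_from_cbg(cbg: str) -> str:
    """Return the 5-digit county code (state + county) from a 12-digit block group ID."""
    if len(cbg) != 12 or not cbg.isdigit():
        raise ValueError(f"expected a 12-digit block group ID, got {cbg!r}")
    return cbg[:5]


def normalize_zipcode(value) -> str:
    """Store zip codes as five-character text so leading zeros survive."""
    return str(value).strip().zfill(5)
```

For example, `borocode_from_borocd(401)` returns `4`, and `normalize_zipcode(7001)` returns `"07001"`, which illustrates why zipcode is stored as text rather than as a number.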

Temporal attributes

| Field name | Data type | Output example | Description |
| --- | --- | --- | --- |
| date | YYYY-MM-DD | 2020-09-15 | Date the data was collected |
| year_week | IYYY-IW | 2020-35 | Week the data was collected |
| year_month | YYYY-MM | 2020-09 | Month the data was collected |
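
As an illustration, the three temporal fields can be produced from a single date value. This sketch assumes that IYYY-IW refers to the ISO 8601 week-numbering year and week number (the meaning these patterns have in PostgreSQL's `to_char`); the function name `temporal_fields` is hypothetical.

```python
from datetime import date


def temporal_fields(d: date) -> dict:
    """Format a date into the date, year_week, and year_month fields."""
    # isocalendar() yields the ISO year, ISO week, and ISO weekday.
    iso_year, iso_week, _ = d.isocalendar()
    return {
        "date": d.isoformat(),                      # YYYY-MM-DD
        "year_week": f"{iso_year}-{iso_week:02d}",  # IYYY-IW
        "year_month": f"{d.year}-{d.month:02d}",    # YYYY-MM
    }
```

Note that the ISO year can differ from the calendar year around New Year: under this assumption, January 1, 2021 falls in `year_week` 2020-53 while its `year_month` is 2021-01.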

Other common attributes

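| Field name | Data type | Output example | Description |
| --- | --- | --- | --- |
| income | text | $25,001-$50,000 | Binned income levels |
| pct* | numeric | 30.22 | Percentage or rate of some attribute, expressed on a 0-100 scale (for example 30.22, not 0.3022) and rounded to two decimal places |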
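
A small Python sketch of how a pct field might be computed from raw counts, assuming the value is stored on a 0-100 scale rounded to two decimal places as in the example above. The helper name `to_pct` is hypothetical.

```python
def to_pct(part: float, whole: float) -> float:
    """Return part/whole as a percentage on a 0-100 scale, rounded to two decimals."""
    if whole == 0:
        raise ValueError("cannot compute a percentage of a zero total")
    return round(100 * part / whole, 2)
```

For example, `to_pct(3022, 10000)` gives `30.22`.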