New data pipeline: field naming convention
Molly Graber edited this page Sep 18, 2020 · 6 revisions
When creating a new data pipeline, conform to the established field naming convention and data formats so that datasets can be joined to one another. The convention is modeled after the PLUTO standard, which is outlined in PLUTO's data dictionary. This guide is not comprehensive and will continue to grow as new common fields emerge across RDP datasets.
Geographic attributes
Note: If a file is NYC-specific and contains information at the county level or lower, make sure that it has the fields borough and borocode.
Field name | Data type | Output example | Description |
---|---|---|---|
latitude | double precision | 40.725038 | Latitude of point-level data |
longitude | double precision | -73.956633 | Longitude of point-level data |
address | text | 120 Broadway | House number and street name |
cbg2010 | text | 360470284003 | Full US Census Bureau block group ID, based on 2010 geographies |
zipcode | text | 10011 | The five-digit zip code |
council | numeric | 34 | NYC only: The council district ID |
borocd | numeric | 401 | NYC only: The community district number, where the first digit is the borocode |
ntacode | varchar(4) | MN36 | NYC only: The neighborhood tabulation area ID. The first two characters are the borough. |
county | text | Kings | The name of the county |
fips_county | varchar(5) | 36001 | US Census Bureau county code |
borough | varchar(2) | MN | NYC only: Two-letter abbreviation of borough name |
borocode | numeric | 4 | NYC only: Code referring to the borough (1, 2, 3, 4, or 5) |
state | text | New York | The name of the state |
location | text | NYC, Region, or Nation | Describes whether data refers to an NYC location, the surrounding metro region, or counties outside of the NYC metro region |
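As an illustration of the borocd, borocode, and borough relationship described in the table, here is a minimal Python sketch. The function names and the dict-based row are hypothetical, and the abbreviations other than MN are assumed to follow PLUTO's borough codes, since this guide only shows MN. The zip code helper shows why identifier-like fields are stored as text: leading zeros (common in NJ and CT zip codes) would be lost in a numeric column.

```python
# Borough codes 1-5 mapped to two-letter abbreviations.
# Assumption: abbreviations follow PLUTO; only "MN" appears in this guide.
BOROUGH_BY_CODE = {1: "MN", 2: "BX", 3: "BK", 4: "QN", 5: "SI"}


def add_borough_fields(row: dict) -> dict:
    """Return a copy of row with borocode and borough derived from borocd."""
    borocd = int(row["borocd"])
    borocode = borocd // 100  # first digit of a three-digit borocd, e.g. 401 -> 4
    if borocode not in BOROUGH_BY_CODE:
        raise ValueError(f"unexpected borocd: {row['borocd']!r}")
    return {**row, "borocode": borocode, "borough": BOROUGH_BY_CODE[borocode]}


def normalize_zipcode(value) -> str:
    """Format an int or str zip code as five-character text, keeping leading zeros."""
    return str(value).strip().zfill(5)
```

A pipeline that already has borocd can call `add_borough_fields` on each record to satisfy the note above about always including borough and borocode in NYC-specific files.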
Temporal attributes
Field name | Data type | Output example | Description |
---|---|---|---|
date | YYYY-MM-DD | 2020-09-15 | Date the data was collected |
year_week | IYYY-IW | 2020-35 | Week the data was collected |
year_month | YYYY-MM | 2020-09 | Month the data was collected |
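The IYYY-IW pattern matches PostgreSQL's `to_char` codes for the ISO 8601 week-numbering year and week, so year_week can differ from the calendar year around January 1. The sketch below shows one way to derive all three fields from a single date in Python; the function name `temporal_fields` is hypothetical and not part of any RDP pipeline.

```python
from datetime import date


def temporal_fields(d: date) -> dict:
    """Build date, year_week, and year_month strings for one collection date."""
    # isocalendar() gives the ISO 8601 week-numbering year and week,
    # which is what the IYYY-IW pattern describes.
    iso_year, iso_week, _ = d.isocalendar()
    return {
        "date": d.isoformat(),                       # YYYY-MM-DD
        "year_week": f"{iso_year}-{iso_week:02d}",   # IYYY-IW
        "year_month": f"{d.year}-{d.month:02d}",     # YYYY-MM
    }
```

For example, `temporal_fields(date(2021, 1, 1))` yields a year_week of `2020-53` but a year_month of `2021-01`, because January 1, 2021 falls in the last ISO week of 2020.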
Other common attributes
Field name | Data type | Output example | Description |
---|---|---|---|
income | text | $25,001-$50,000 | Binned income levels |
pct* | numeric | 30.22 | Percentage or rate of some attribute, expressed in percent units (e.g. 30.22, not 0.3022) and rounded to two decimal places |
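A minimal sketch of computing a pct* field in Python follows. It assumes values are stored on a 0-100 scale, as the example 30.22 suggests. The helper name `pct` and the choice to return `None` for a zero denominator are illustrative assumptions, not an established RDP rule.

```python
from typing import Optional


def pct(numerator: float, denominator: float) -> Optional[float]:
    """Return numerator/denominator as a percentage rounded to two decimals."""
    if denominator == 0:
        # Assumption: an undefined rate is stored as NULL rather than 0.
        return None
    return round(100 * numerator / denominator, 2)
```

For instance, `pct(1360, 4500)` gives `30.22`.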