This repository was archived by the owner on Dec 20, 2024. It is now read-only.
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy path18-geocoding.Rmd
87 lines (59 loc) · 3.85 KB
/
18-geocoding.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
# Postcodes & addresses {#geocoding}
A common use case in our work is to do some spatial analysis on a dataset with addresses. For example, produce a map of certain incidents over the last year, or summarise the number of incidents by Ward.
## UPRN
*TODO*
## Postcodes
If there's no UPRN, getting a location via the postcode is usually the easiest option. It's not as accurate as a UPRN or a full address, but often it's accurate enough for what's required.
Below is an example of using our `add_pcd_vars()` function that builds on the [PostcodesioR](https://docs.ropensci.org/PostcodesioR/) package, which itself is a wrapper for the [postcodes.io](https://postcodes.io/) API. You pass it a data frame with a postcode column and it returns the data frame with new columns based on the postcode.
```{r add-pcd-vars, message=FALSE, warning=FALSE}
# Create a data frame with some example records
df <- tibble::tribble(
~name, ~postcode,
"SCC", "S1 2HH",
"Blades", "S2 4SU",
"Owls", "S6 1SW"
)
# # Load the function we need
# source("https://raw.githubusercontent.com/scc-pi/functions/main/Addresses.R")
#
# # Add postcode related columns to our data frame
# add_pcd_vars(df,
# pcd_name = "postcode",
# .admin_district = FALSE,
# .lat_long = TRUE,
# other_vars = c("admin_ward",
# "msoa_code")) %>%
# knitr::kable() #display them in a nice table
```
`admin_district` can be used to answer the questions: Is the postcode invalid? Is the postcode in Sheffield? You can anticipate values of `NA` and `Sheffield` respectively if the answers are yes. These are common questions so the function has an `.admin_district` argument.
Needing the coordinates of the postcode centroid is also common, so the function also has a `.lat_long` argument that if `TRUE` returns both a latitude column and a longitude column.
*TODO: link to sf example & area stats*
Other columns are added by including their name in a character vector passed with the `other_vars` function argument. For example, `other_vars=c("ccg_code","lsoa_code")`. The list of valid `other_vars` is defined at [docs.ropensci.org/PostcodesioR/\#examples](https://docs.ropensci.org/PostcodesioR/#examples).
The `pcd_name` function argument is used to specify the name of the data frame's postcode column when it isn't `postcode`. For example, `pcd_name="pupil_pcode"`.
Sometimes, the postcode in a dataset is not included as a separate column, but is at the end of a combined address column. Depending on the format and quality of the combined address column you could extract the postcode like this:
```{r, extract-pcd, message=FALSE, warning=FALSE}
# Create a data frame with some example records
df <- tibble::tribble(
~name, ~address,
"SCC", "Pinstone St, Sheffield City Centre, Sheffield S1 2HH",
"Blades", "Bramall Ln, Highfield, Sheffield S2 4SU",
"Owls", "76 Penistone Rd N, Sheffield S6 1SW"
)
# Extract the postcode from the address into a separate column
dplyr::mutate(df,
postcode = stringr::word(address, -2, -1, sep = "[:space:]")) |>
knitr::kable()
```
### PAS postcode lookup
PAS, the Performance & Analysis Service in the People Porfolio, maintains a postcode lookup table that performs the same role as the `add_pcd_vars()` function.
*TODO: Does it also include Sheffield specific variables e.g. which of the 100 neighbourhoods is the postcode in?*
### Postcode boundaries
*TODO:* [OPEN DATA GB POSTCODE UNIT BOUNDARIES](https://longair.net/blog/2021/08/23/open-data-gb-postcode-unit-boundaries/)
### Split postcodes
*TODO*
## Addresses
Anne Tetley has provided the following guidance (on Portal):
- [Geocoding Address Data in Portal](https://sheffieldcitycouncil.cloud.esriuk.com/portal/apps/storymaps/stories/65cd11fc65ef4c0cb89b761ebf367053)
*TODO*\
- Portal LLPG Cascade Locator\
- OS Places ESRI integration