Skip to content

Commit

Permalink
Merge pull request #218 from global-healthy-liveable-cities/enhancements
Browse files Browse the repository at this point in the history
Merge enhancements
  • Loading branch information
carlhiggs authored Mar 22, 2023
2 parents 04fc389 + 10e580f commit 410f35e
Show file tree
Hide file tree
Showing 48 changed files with 2,004 additions and 2,100 deletions.
97 changes: 74 additions & 23 deletions process/1_create_project_configuration_files.py
Original file line number Diff line number Diff line change
Expand Up @@ -4,11 +4,56 @@
Copies configuration templates to configuration folder for custom modification.
"""

import os.path
import os
import shutil
import sys

source_folder = './configuration/templates'
dest_folder = './configuration'
region_names = [
x.split('.yml')[0]
for x in os.listdir('./configuration/regions')
if x.endswith('.yml')
]

region_template = """
name: Full study region name, e.g. Las Palmas de Gran Canaria
year: Target year for analysis, e.g. 2023
country: Fully country name, e.g. España
country_code: Two character country code (ISO3166 Alpha-2 code), e.g. ES
continent: Continent name, e.g. Europe
crs:
name: name of the coordinate reference system (CRS), e.g. REGCAN95 / LAEA Europe
standard: acronym of the standard catalogue defining this CRS, eg. EPSG
srid: spatial reference identifier (SRID) integer that identifies this CRS according to the specified standard, e.g. 5635 (see https://spatialreference.org/, or search for what is commonly used in your city or country; e.g. a national CRS like those listed at https://en.wikipedia.org/wiki/List_of_national_coordinate_reference_systems )
utm: UTM grid if EPSG code is not known (used to manually derive EPSG code= 326** is for Northern Hemisphere, 327** is for Southern Hemisphere), e.g. 28N (see https://mangomap.com/robertyoung/maps/69585/what-utm-zone-am-i-in- )
study_region_boundary:
data: Region boundary data (path relative to project directory, or GHS:variable='value'), e.g. "region_boundaries/Example/Las Palmas de Gran Canaria - Centro Nacional de Información Geográfica - WGS84 - EPSG4326.geojson"
source: The name of the provider of this data, e.g. Centro Nacional de Información Geográfica
publication_date: Publication date for study region area data source, or date of currency, e.g. 2019-02-01
url: URL for the source dataset, or its provider, e.g. https://datos.gob.es/en/catalogo/e00125901-spaignllm
licence: Licence for the data, e.g. CC-BY-4.0
licence_url: URL for the data licence, e.g. https://www.ign.es/resources/licencia/Condiciones_licenciaUso_IGN.pdf
ghsl_urban_intersection: Whether the provided study region boundary will be further restricted to an urban area defined by its intersection with a linked urban region dataset (see urban_region), e.g. true
population: the population record in datasets.yml to be used for this region, e.g. example_las_palmas_pop_2022
OpenStreetMap: the OpenStreetMap record defined in datasets.yml to be used for this region, e.g. las_palmas_20230221
network:
osmnx_retain_all: If false, only retain main connected network when retrieving OSM roads, e.g. false
buffered_region: Recommended to set to 'true'. If false, use the study region boundary without buffering for network extraction (this may appropriate for true islands, but could be problematic for anywhere else where the network and associated amenities may be accessible beyond the edge of the study region boundary).
polygon_iteration: Iterate over and combine polygons (this may be appropriate for a series of islands), e.g. false
connection_threshold: Minimum distance to retain (a useful parameter for islands, like Hong Kong, but for most purposes you can leave this blank)
intersection_tolerance: Tolerance in metres for cleaning intersections. This is an important methodological choice, and the chosen parameter should be robust to a variety of network topologies in the city being studied. See https://github.com/gboeing/osmnx-examples/blob/main/notebooks/04-simplify-graph-consolidate-nodes.ipynb. For example: 12
urban_region: the urban region dataset defined in datasets.yml to be optionally used for this region (such as data from the Global Human Settlements Layer Urban Centres Database, GHSL-UCDB), e.g. GHS-Example-Las-Palmas
urban_query: GHS or other linkage of covariate data (GHS:variable='value', or path:variable='value' for a dataset with equivalently named fields defined in project parameters for air_pollution_covariates), e.g. GHS:UC_NM_MN=='Las Palmas de Gran Canaria' and CTR_MN_NM=='Spain'
country_gdp:
classification: Country GDP classification, e.g. lower-middle
reference: Citation for the GDP classification, e.g. The World Bank. 2020. World Bank country and lending groups. https://datahelpdesk.worldbank.org/knowledgebase/articles/906519-world-bank-country-and-lending-groups
covariate_data: additional covariates to be linked (currently this is designed to direct the process to retrieve additional city statistics from the GHSL UCDB entry for the city, where available), e.g. urban_query
custom_destinations: Details of custom destinations to use (e.g. as done for Maiduguri, Nigeria), in addition to those from OSM (optional, as required; else, leave blank) file name (located in study region folder), category plain name field, category full name field, Y coordinate, X coordinate, EPSG number, attribution
gtfs_feeds: Details for city specific parent folder (within GTFS data dir defined in config.yml) and feed-specific unzipped sub-folders containing GTFS feed data. For each feed, the data provider, year, start and end dates to use, and mapping of modes are specified (only required if feed does not conform to the GTFS specification https://developers.google.com/transit/gtfs/reference). For cities with no GTFS feeds identified, this may be left blank. Otherwise, copy the example in the example_ES_Las_Palmas_2023.yml file or other example files where multiple GTFS feeds for different modes have been defined to get started and customise this for your city.
policy_review: Optional path to results of policy indicator review for inclusion in generated reports
notes: Notes for this region
"""

print(
'Creating project configuration files, if not already existing in the configuration folder...',
Expand All @@ -22,39 +67,45 @@
else:
shutil.copyfile(path_file, path_file.replace('templates/', ''))
print(f'\t- created {file}')
except Exception as e:
raise Exception(f'An error occurred: {e}')

if len(sys.argv) >= 2:
codename = sys.argv[1]
if os.path.exists(f'./configuration/regions/{codename}.yml'):
print(
f"\nConfiguration file for the specified study region codename '{codename}' already exists: \nconfiguration/regions/{codename}.yml.\n\nPlease open and edit this file in a text editor following the provided example directions in order to complete configuration for your study region. A completed example study region configuration can be viewed in the file 'configuration/regions/example_ES_Las_Palmas_2023.yml'.\n\nTo view additional guidance on configuration, run this script again without a codename. Once configuration has been completed, to proceed to analysis for this city, enter:\npython 2_analyse_region.py {codename}\n",
)
else:
with open(f'./configuration/regions/{codename}.yml', 'w') as f:
f.write(region_template)
print(
f"\nNew region configuration file has been initialised using the codename, '{codename}', in the folder:\nconfiguration/regions/{codename}.yml\n\nPlease open and edit this file in a text editor following the provided example directions in order to complete configuration for your study region. A completed example study region configuration can be viewed in the file 'configuration/regions/example_ES_Las_Palmas_2023.yml'.\n\nTo view additional guidance on configuration, run this script again without a codename.\n",
)
else:
print(
"""
All required configuration files are present in the configuration folder, and may be customised as required:
f"""
Before commencing analysis, your study regions will need to be configured.
config.yml:
Configuration of overall project
Study regions are configured using .yml files located within the `configuration/regions` sub-folder. An example configuration for Las Palmas de Gran Canaria (for which data supporting analysis is included) has been provided in the file `process/configuration/regions/example_ES_Las_Palmas_2023.yml`, and further additional example regions have also been provided. The name of the file, for example `example_ES_Las_Palmas_2023`, acts a codename for the city when used in processing and avoids issues with ambiguity when analysing multiple cities across different regions and time points: 'example' clarifies that this is an example, 'ES' clarifies that this is a Spanish city, 'Las_Palmas' is a common short way of writing the city's name, and the analysis uses data chosen to target 2023. Using a naming convention like this is recommended when creating new region configuration files (e.g. ES_Barcelona_2023.yml, or AU_Melbourne_2023.yml).
regions.yml
Configuration of study regions of interest
The study region configuration files have a file extension .yml (YAML files), which means they are text files that can be edited in a text editor to define region specific details, including which datasets used - eg cities from a particular region could share common excerpts of population and OpenStreetMap, potentially).
datasets.yml
Configuration of datasets, which may be referenced by study regions of interest
Additional configuration may be required to define datasets referenced for regions, configuration of language for reporting using the following files:
indicators.yml
Configuration of indicators
datasets.yml: Configuration of datasets, which may be referenced by study regions of interest
osm_destination_definitions.csv
Configuration of key-value pairs used to identify features of interest using OpenStreetMap data
_report_configuration.xlsx: (required to generate reports) Use to configure generation of images and reports generated for processed cities; in particular, the translation of prose and executive summaries for different languages. This can be edited in a spreadsheet program like Microsoft Excel.
osm_open_space.yml
Configuration of queries used to derive a dataset of areas of open space using OpenStreetMap data
config.yml: (optional) Configuration of overall project, including your time zone for logging start and end times of analyses
resources.json
A file which may be used to log details of users processing environments (not currently implemented in code, but may be manually edited by users for their records).
Optional configuration of other parameters is also possible. Please visit our tool's website for further guidance on how to use this tool:
https://global-healthy-liveable-cities.github.io/
policies.yml
Configuration of policies for reporting in generated reports
Currently configured study regions : {', '.join(region_names)}
_report_configuration.xlsx
Use to configure generation of report PDFs for processed cities; in particular, the translation of prose and executive summaries for different languages
To initialise a new study region configuration file, you can run this script again providing your choice of codename, for example:
python 1_create_project_configuration_files.py example_ES_Las_Palmas_2023
""",
)
except Exception as e:
raise Exception(f'An error occurred: {e}')
21 changes: 17 additions & 4 deletions process/2_analyse_region.py
Original file line number Diff line number Diff line change
Expand Up @@ -12,15 +12,17 @@

# Load study region configuration
from subprocesses._project_setup import (
analysis_timezone,
authors,
codename,
config_path,
date_hhmm,
folder_path,
name,
region_config,
region_dir,
region_names,
regions,
time,
version,
)
from tqdm.auto import tqdm
Expand All @@ -43,7 +45,7 @@
current_parameters = {
'date': date_hhmm,
'project': project_configuration,
codename: regions[codename],
codename: region_config,
}

if os.path.isfile(f'{region_dir}/_parameters.yml'):
Expand Down Expand Up @@ -87,6 +89,11 @@
f'A dated copy of project and region parameters has been saved as {region_dir}/_parameters.yml.\n\n',
)

print(
f'Analysis time zone: {analysis_timezone} (to set time zone for where you are, edit config.yml)',
)
start_analysis = time.time()
print(f"Analysis start: {time.strftime('%Y-%m-%d_%H%M')}")
study_region_setup = {
'_00_create_database.py': 'Create database',
'_01_create_study_region.py': 'Create study region',
Expand All @@ -103,7 +110,12 @@
'_12_neighbourhood_analysis.py': 'Analyse neighbourhoods',
'_13_aggregation.py': 'Aggregate region summary analyses',
}
pbar = tqdm(study_region_setup, position=0, leave=True)
pbar = tqdm(
study_region_setup,
position=0,
leave=True,
bar_format='{desc:35} {percentage:3.0f}%|{bar:30}| ({n_fmt}/{total_fmt})',
)
append_to_log_file = open(
f'{region_dir}/__{name}__{codename}_processing_log.txt', 'a',
)
Expand All @@ -122,6 +134,7 @@
f'\n\nProcessing {step} failed: {e}\n\n Please review the processing log file for this study region for more information on what caused this error and how to resolve it. The file __{name}__{codename}_processing_log.txt is located in the output directory and may be opened for viewing in a text editor.',
)
finally:
duration = (time.time() - start_analysis) / 60
print(
'\n\nOnce the setup of study region resources has been successfully completed, we encourage you to inspect the region-specific resources located in the output directory (e.g. text log file, geopackage output, csv files, PDF report and image files).',
f'{time.strftime("%Y-%m-%d_%H%M")} (analysis duration of approximately {duration:.1f} minutes)\n\nOnce the setup of study region resources has been successfully completed, we encourage you to inspect the region-specific resources located in the output directory (e.g. text log file, geopackage output, csv files, PDF report and image files).',
)
Original file line number Diff line number Diff line change
Expand Up @@ -14,7 +14,8 @@
folder_path,
indicators,
policies,
regions,
region_config,
region_names,
)


Expand All @@ -29,13 +30,13 @@ class config:
configuration = './configuration/_report_configuration.xlsx'


if config.city not in regions:
if config.city not in region_names:
sys.exit(
f'Specified city ({config.city}) does not appear to be in the list of configured cities ({list(regions.keys())})',
f'Specified city ({config.city}) does not appear to be in the list of configured cities ({region_names})',
)

config.folder_path = folder_path
config.region = regions[config.city]
config.region = region_config
if not os.path.exists(config.region['region_dir']):
sys.exit(
f"\n\nProcessed resource folder for this city couldn't be located:"
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,43 @@
name: Graz
year: 2020
country: Austria
country_code: AT
continent: Europe
crs:
name:
standard: EPSG
srid: 32633
utm: 33N
study_region_boundary:
data: region_boundaries/boundaries.gpkg:Graz
source:
publication_date:
url:
licence:
licence_url:
ghsl_urban_intersection: true
population: ghsl_r2019a_2015
OpenStreetMap: global_indicators_25_cities
network:
osmnx_retain_all: false
buffered_region: true
polygon_iteration:
connection_threshold:
intersection_tolerance: 12
urban_region: GHS-URBAN
urban_query: GHS:UC_NM_MN=='Graz' and CTR_MN_NM=='Austria'
country_gdp:
classification: High-income
reference: The World Bank. 2020. World Bank country and lending groups. https://datahelpdesk.worldbank.org/knowledgebase/articles/906519-world-bank-country-and-lending-groups
covariate_data: urban_query
custom_destinations:
file:
dest_name:
dest_name_full:
lat:
lon:
epsg:
attribution:
gtfs_feeds:
policy_review: Graz_GHSCIC_policy_graded_2020.xlsx
notes:
Original file line number Diff line number Diff line change
@@ -0,0 +1,50 @@
name: Adelaide
year: 2020
country: Australia
country_code: AU
continent: Australasia
crs:
name:
standard: EPSG
srid: 7845
utm:
study_region_boundary:
data: region_boundaries/boundaries.gpkg:Adelaide
source:
publication_date:
url:
licence:
licence_url:
ghsl_urban_intersection: true
population: ghsl_r2019a_2015
OpenStreetMap: global_indicators_25_cities
network:
osmnx_retain_all: false
buffered_region: true
polygon_iteration:
connection_threshold:
intersection_tolerance: 12
urban_region: GHS-URBAN
urban_query: GHS:UC_NM_MN=='Adelaide' and CTR_MN_NM=='Australia'
country_gdp:
classification: High-income
reference: The World Bank. 2020. World Bank country and lending groups. https://datahelpdesk.worldbank.org/knowledgebase/articles/906519-world-bank-country-and-lending-groups
covariate_data: urban_query
custom_destinations:
file:
dest_name:
dest_name_full:
lat:
lon:
epsg:
attribution:
gtfs_feeds:
folder: gtfs_au_adelaide
gtfs_au_sa_adelaidemetro_20191004:
gtfs_provider: AdelaideMetro
gtfs_year: 2019
start_date_mmdd: 20191008
end_date_mmdd: 20191205
modes:
policy_review: Adelaide_GHSCIC_policy_graded_2020.xlsx
notes:
Original file line number Diff line number Diff line change
@@ -0,0 +1,50 @@
name: Melbourne
year: 2020
country: Australia
country_code: AU
continent: Australasia
crs:
name:
standard: EPSG
srid: 7845
utm:
study_region_boundary:
data: region_boundaries/boundaries.gpkg:Melbourne
source:
publication_date:
url:
licence:
licence_url:
ghsl_urban_intersection: true
population: ghsl_r2019a_2015
OpenStreetMap: global_indicators_25_cities
network:
osmnx_retain_all: false
buffered_region: true
polygon_iteration:
connection_threshold:
intersection_tolerance: 12
urban_region: GHS-URBAN
urban_query: GHS:UC_NM_MN=='Melbourne' and CTR_MN_NM=='Australia'
country_gdp:
classification: High-income
reference: The World Bank. 2020. World Bank country and lending groups. https://datahelpdesk.worldbank.org/knowledgebase/articles/906519-world-bank-country-and-lending-groups
covariate_data: urban_query
custom_destinations:
file:
dest_name:
dest_name_full:
lat:
lon:
epsg:
attribution:
gtfs_feeds:
folder: gtfs_au_melbourne
gtfs_au_vic_ptv_20191004:
gtfs_provider: PublicTransportVictoria
gtfs_year: 2019
start_date_mmdd: 20191008
end_date_mmdd: 20191205
modes:
policy_review: Melbourne_GHSCIC_policy_graded_2020.xlsx
notes:
Loading

0 comments on commit 410f35e

Please sign in to comment.