This is a short analysis to illustrate whether adding the distance to an art gallery, museum, or other cultural center helps improve a model that predicts the nightly price of an Airbnb listing. The Airbnb dataset used for this exercise was downloaded from the Inside Airbnb website, and the inventory of cultural sites was downloaded from Seattle's Open Data Portal. Using these publicly accessible datasets, we will show how to give some spatial context to a dataset while trying to answer the following questions:
- Are the neighborhoods with the most expensive Airbnb prices per night the ones with the shortest average distance to cultural sites?
- Does the variable 'distance' help create a more accurate predictive model?
- Which are the most important variables for predicting price per night?
This is the list of Python libraries used:
- NumPy
- pandas
- GeoPandas
- scikit-learn
Including a geospatial variable called 'distance', the distance in meters from an Airbnb unit to its closest cultural site (e.g. art gallery, concert hall), slightly improved the model's accuracy. These results can be used to raise awareness of the importance of geospatial variables in some modeling scenarios.
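To make the 'distance' variable concrete, here is a minimal sketch of how a closest-site distance in meters could be computed. It is not the code used in this analysis: it swaps in a plain great-circle (haversine) calculation from the standard library instead of GeoPandas, and the function names, listing names, and coordinates are all hypothetical placeholders.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters (spherical approximation)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def nearest_distance_m(listing, sites):
    """Distance in meters from one (lat, lon) listing to its closest site."""
    return min(haversine_m(listing[0], listing[1], s[0], s[1]) for s in sites)


# Hypothetical coordinates, for illustration only (not taken from the datasets).
listings = {
    "listing_a": (47.6097, -122.3331),
    "listing_b": (47.6615, -122.3130),
}
cultural_sites = [
    (47.6073, -122.3381),
    (47.6205, -122.3493),
]

# One 'distance' value per listing, ready to be added as a model feature.
distances = {name: nearest_distance_m(coords, cultural_sites)
             for name, coords in listings.items()}

for name, dist in distances.items():
    print(f"{name}: {dist:.0f} m to the closest cultural site")
```

With GeoPandas, a comparable result is usually obtained by reprojecting both layers to a projected coordinate system measured in meters and running a nearest-neighbor spatial join, which avoids the brute-force loop used here.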
Blog post: https://blog.julionovoa.com/how-to-add-geospatial-context-to-a-predictive-model