a-azad/Seattle.housing.market

INTRODUCTION

About

This is a personal study of Seattle's housing market and is not for commercial use. Unfortunately, the data is limited to 2014-2015, so extending the work to today's [crazy] market is not practical. Data exploration and ML modeling are captured in multiple Jupyter notebooks in this repository. The problem is framed as a regression modeling exercise. A simple linear model and tree-based algorithms were used. Although Random Forest (as usual) improved significantly on the simple linear model, an improved (Ridge) linear model was chosen for this study because it is better suited to statistical inference.
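To illustrate why a Ridge model is convenient for inference, here is a minimal sketch of closed-form ridge regression on a log-price target. It is not the notebooks' code: the data is synthetic, and the helper `fit_ridge`, the feature scaling, and the `alpha` value are assumptions chosen only for illustration.

```python
import numpy as np


def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression with an unpenalized intercept (illustrative helper)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Center features and target so the intercept is not shrunk by the penalty.
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean
    yc = y - y_mean
    # Solve (Xc'Xc + alpha * I) coef = Xc'yc.
    n_features = X.shape[1]
    coef = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(n_features), Xc.T @ yc)
    intercept = y_mean - x_mean @ coef
    return coef, intercept


# Synthetic stand-in data (NOT the King County dataset).
rng = np.random.default_rng(0)
n = 500
sqft_living = rng.uniform(0.5, 4.0, size=n)   # living area in thousands of sqft
waterfront = rng.integers(0, 2, size=n)       # 0/1 indicator
log_price = 12.0 + 0.4 * sqft_living + 0.5 * waterfront + rng.normal(0.0, 0.1, size=n)

X = np.column_stack([sqft_living, waterfront])
coef, intercept = fit_ridge(X, log_price, alpha=1.0)

# With a log target, a coefficient b corresponds to a 100 * (exp(b) - 1) percent
# change in price per unit change of the feature, holding the others fixed.
pct_effect = 100.0 * (np.exp(coef) - 1.0)
print(dict(zip(["sqft_living (per 1000 sqft)", "waterfront"], pct_effect.round(1))))
```

Each coefficient reads directly as a percentage effect on price, which a Random Forest does not offer in this form; that trade-off is the one described above.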

Acknowledgement: The original dataset comes from Kaggle: House Sales in King County, USA, provided for predicting house prices using regression.

Data

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

# Load the cleaned dataset and drop the leftover unnamed index column
data = pd.read_csv("clean_data.csv")
data.drop(['Unnamed: 0'], axis=1, inplace=True)
data.head()
   price      bedrooms  bathrooms  sqft_living  sqft_lot  floors  waterfront  view  condition  grade  ...  sqft_basement  yr_built  yr_renovated  zipcode  lat      long      sqft_living15  sqft_lot15  year  month
0  12.309982  3         1.00       1180         8.639411  1.0     0           0     3          7      ...  0              1955      0             98178    47.5112  -122.257  1340           8.639411    2014  10
1  13.195614  3         2.25       2570         8.887653  2.0     0           0     3          7      ...  400            1951      1             98125    47.7210  -122.319  1690           8.941022    2014  12
2  12.100712  2         1.00       770          9.210340  1.0     0           0     3          6      ...  0              1933      0             98028    47.7379  -122.233  2720           8.994917    2015  2
3  13.311329  4         3.00       1960         8.517193  1.0     0           0     5          7      ...  910            1965      0             98136    47.5208  -122.393  1360           8.517193    2014  12
4  13.142166  3         2.00       1680         8.997147  1.0     0           0     3          8      ...  0              1987      0             98074    47.6168  -122.045  1800           8.923058    2015  2

5 rows × 21 columns
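The `price`, `sqft_lot`, and `sqft_lot15` values in this preview are far too small to be raw dollars or square feet; they look like natural logarithms of the original values. That is an inference from the numbers, not something the notebook excerpt states. Under that assumption, a small sketch of converting the first row back to original units (the helper name `from_log` is hypothetical):

```python
import math


def from_log(value):
    """Invert a natural-log transform (assumes the column was created with log(x))."""
    return math.exp(value)


# Values copied from row 0 of the preview.
log_price_row0 = 12.309982
log_lot_row0 = 8.639411

price = from_log(log_price_row0)   # roughly 221,900 (dollars, if the assumption holds)
lot = from_log(log_lot_row0)       # roughly 5,650 (square feet, if the assumption holds)
print(round(price), round(lot))
```

If the transform was something else (for example `log1p`), the inverse would differ (`math.expm1`), so it is worth checking the cleaning notebook before relying on this.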

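import plotly
import plotly.graph_objs as go

# Plotly cloud credentials and Mapbox token (placeholders; supply your own).
# Note: in plotly 4 and later, the online module used here (plotly.plotly,
# plotly.tools.set_credentials_file) was moved to the chart_studio package.
plotly.tools.set_credentials_file(username='----', api_key='----')
mapbox_access_token = '-----'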

data_ = [
    go.Scattermapbox(
        lat=list(data.lat),
        lon=list(data.long),
        mode='markers',
        marker=dict(size=2),
        text=[''],
    )
]

layout = go.Layout(
    autosize=True,
    hovermode='closest',
    mapbox=dict(
        accesstoken=mapbox_access_token,
        bearing=0,
        center=dict(
            lat=47.5,
            lon=-122
        ),
        pitch=0,
        zoom=5
    ),
)

# Build the figure and render it inline in the notebook
fig = dict(data=data_, layout=layout)
plotly.plotly.iplot(fig)
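The map above is centered on a hard-coded point (`lat=47.5`, `lon=-122`). As an alternative, the center could be derived from the data itself. The sketch below is only an illustration: `map_center` is a hypothetical helper, and the coordinates are the five rows shown in the preview rather than the full dataset.

```python
def map_center(lats, lons):
    """Return the mean latitude/longitude as a dict usable for a map 'center' setting."""
    if not lats or len(lats) != len(lons):
        raise ValueError("lats and lons must be non-empty and of equal length")
    return {"lat": sum(lats) / len(lats), "lon": sum(lons) / len(lons)}


# Coordinates from the five preview rows (not the full dataset).
lats = [47.5112, 47.7210, 47.7379, 47.5208, 47.6168]
lons = [-122.257, -122.319, -122.233, -122.393, -122.045]

center = map_center(lats, lons)
print(center)
```

With the full DataFrame, the same idea would be `map_center(list(data.lat), list(data.long))`, and the result could be passed as the `center` entry of the `mapbox` layout.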

Fin!
