erinkhoo/House_pricing_dataset

House pricing dataset

The dataset contains information about houses for sale in the Dublin area. Each entry is one house, described by mixed-type data: numerical, categorical, visual and textual.

The goal of this competition is to combine numerical/categorical, visual and textual features to predict the house price.

The house price is determined by factors such as:

location (area),
surface (size),
the number of bedrooms,
the number of bathrooms,
property type,
house-features (size of the windows, construction material).

The physical attributes of the house (number of bedrooms, number of bathrooms, surface, property type and location) are directly accessible from the dataset. The house-features, by contrast, can only be inferred, sometimes indirectly, from the house-description, house-facility, house-features and house-image data.
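As a rough illustration of the distinction above, the sketch below builds one feature row from a single house record: the physical attributes are copied over directly, while a few house-features are inferred from the free-text description by keyword matching. The field names (`bedrooms`, `bathrooms`, `surface`, `property_type`, `description`) and the keyword list are assumptions made for this example; the actual column names in the csv files may differ, and the notebook may use a different approach.

```python
import re

# Hypothetical keywords; the real descriptions may phrase these differently.
KEYWORDS = ("garden", "garage", "double glazed", "brick")


def build_feature_row(record):
    """Combine directly available attributes with flags inferred from text.

    `record` is a dict with assumed keys: bedrooms, bathrooms, surface,
    property_type and description.
    """
    text = (record.get("description") or "").lower()
    row = {
        # Physical attributes: read directly from the record.
        "bedrooms": float(record["bedrooms"]),
        "bathrooms": float(record["bathrooms"]),
        "surface": float(record["surface"]),
        # Left as a category here; a model would usually need it encoded.
        "property_type": record["property_type"],
    }
    for keyword in KEYWORDS:
        # Whole-word match, so "brick" does not fire on "bricklayer".
        pattern = r"\b" + re.escape(keyword) + r"\b"
        flag_name = "has_" + keyword.replace(" ", "_")
        row[flag_name] = int(re.search(pattern, text) is not None)
    return row
```

Keyword flags are only a crude proxy for house-features; they miss synonyms and negations ("no garage"), so they are best treated as a baseline.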

The notebook shows how to open the data and use the features to predict the house price. The dataset consists of 14 zip files: datatset_csv.zip contains the train and test csv files with the numeric/text features; datase01.zip, ..., datase13.zip contain the frontal images of the houses (13 files of ~24MB each, about 300MB in total).
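A minimal sketch of unpacking the archives, assuming all 14 zip files have been downloaded into a single local directory. It matches `*.zip` rather than hard-coding archive names, and it assumes the images use common extensions (`.jpg`, `.jpeg`, `.png`), which the description above does not specify. The helper names `extract_archives` and `split_by_kind` are made up for this example and do not come from the notebook.

```python
import zipfile
from pathlib import Path

# Assumed image extensions; adjust to whatever the archives actually contain.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def extract_archives(archive_dir, out_dir):
    """Extract every .zip in archive_dir into out_dir and return the file paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    extracted = []
    for archive in sorted(Path(archive_dir).glob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(out)
            # Skip directory entries, which end with "/".
            extracted.extend(
                out / name for name in zf.namelist() if not name.endswith("/")
            )
    return extracted


def split_by_kind(paths):
    """Separate extracted paths into csv files and image files."""
    csv_files = [p for p in paths if p.suffix.lower() == ".csv"]
    image_files = [p for p in paths if p.suffix.lower() in IMAGE_SUFFIXES]
    return csv_files, image_files
```

A call such as `csv_files, image_files = split_by_kind(extract_archives("downloads", "data"))` then gives the csv files to load and the image files to pair with each house.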
