@author: Mingran Jia
- URL of data:
https://nethouseprices.com/house-prices/Lanarkshire/GLASGOW?page=1
https://nethouseprices.com/house-prices/Lanarkshire/GLASGOW?page=2
…
https://nethouseprices.com/house-prices/Lanarkshire/GLASGOW?page=10
- The context of the data:
> glimpse(house_price_glasgow)
Rows: 500
Columns: 3
$ address <chr> "1 Ettrick Place, Glasgow, G43 1UA", "2~
$ prices <dbl> 144500, 212750, 185000, 90000, 126894, ~
$ types <chr> "Flat", "Flat", "Flat", "Semi Detached"~
- The content of the data:
> str(house_price_glasgow)
tibble [500 x 3] (S3: tbl_df/tbl/data.frame)
 $ address: chr [1:500] "1 Ettrick Place, Glasgow, G43<U+00A0>1UA" "2/1 26 Tassie Street, Glasgow, G41<U+00A0>3QF" "Flat 1/3 18 Prospecthill Grove, Glasgow, G42<U+00A0>9LD" "41 Ochil Street, Glasgow, G32<U+00A0>7SD" ...
 $ prices : num [1:500] 144500 212750 185000 90000 126894 ...
 $ types : chr [1:500] "Flat" "Flat" "Flat" "Semi Detached" ...
Research Questions
We wish to evaluate the relationship among house types, locations, and prices to help real estate developers set more reasonable prices.
- The impact of different housing types on house prices
- The impact of different regions on house prices
- The interaction between areas and house types in their effect on house prices
Statistical Analysis Plan for House Price
Population:
- All recorded house sales in Glasgow (201191 on the source website)
Primary Objective:
- Estimate the influence of house types and locations on house prices
Secondary Objectives:
- Identify the best-selling house types and postcodes in the city
- Estimate the interaction between locations and house types
Data Collection methods:
- Scrape the 500 most recent house sale prices, out of 201191 in total for Glasgow, from the house sales website (see the URLs above) as a sample representing the population of Glasgow house sales.
- Each house is identified by a unique address.
- Each house is classified into one of a limited set of types (e.g. Flat, Semi Detached).
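The collection steps above can be sketched as follows. This is a minimal illustration in Python using only the standard library, not the script used to build the data set. It builds the listing URLs for pages 1 to 10 as given at the top of this document and checks that addresses are unique. The helper names `page_urls` and `duplicated_addresses` are hypothetical, the record layout (dicts keyed by the column names `address`, `prices`, `types`) is an assumption, and no network request is made.

```python
from collections import Counter

# Base listing URL taken from the "URL of data" list in this document.
BASE_URL = "https://nethouseprices.com/house-prices/Lanarkshire/GLASGOW"


def page_urls(n_pages=10):
    """Return the listing URLs for pages 1..n_pages (10 pages are listed in the source)."""
    return [f"{BASE_URL}?page={i}" for i in range(1, n_pages + 1)]


def duplicated_addresses(records):
    """Return addresses that appear more than once, in first-seen order.

    `records` is assumed to be a list of dicts with an "address" key.
    An empty result means every house is identified by a unique address.
    """
    counts = Counter(r["address"] for r in records)
    return [address for address, n in counts.items() if n > 1]
```

If `duplicated_addresses` returns a non-empty list for the scraped sample, the assumption that the address identifies a house uniquely would need to be revisited before analysis.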
Variables Under Consideration:
- House prices grouped by area; house prices grouped by type; differences in house price between types within the same area; differences in house price between areas within the same type - Primary outcome variable
- Area divisions derived from locations - Primary explanatory variable
- Top house type; top area; best- and worst-selling house type in each area; best- and worst-selling area for each house type - Explanatory outcome variable
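One possible way to derive an area from a location is to take the outward part of the postcode at the end of each address (for example `G43` from `G43 1UA`). The plan does not state how areas are defined, so this is an assumption, and the function name `postcode_district` is hypothetical. The sketch below, in Python, also converts the non-breaking space (`<U+00A0>` in the `str()` output above) to an ordinary space before matching.

```python
import re

# Outward code (e.g. "G43"), whitespace, then inward code (e.g. "1UA") at the end of the address.
# This is a simplified pattern for illustration, not a full UK postcode validator.
POSTCODE_RE = re.compile(r"([A-Z]{1,2}\d{1,2}[A-Z]?)\s+(\d[A-Z]{2})$")


def postcode_district(address):
    """Return the outward postcode (assumed here to define the area), or None if absent."""
    # Replace the non-breaking space seen in the scraped addresses with a normal space.
    cleaned = address.replace("\u00a0", " ").strip().upper()
    match = POSTCODE_RE.search(cleaned)
    return match.group(1) if match else None
```

For instance, `postcode_district("41 Ochil Street, Glasgow, G32\u00a07SD")` yields the district used to group that house; addresses without a recognisable postcode return `None` and could then be treated as having a missing location.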
Missing Data Procedures:
- If the house type or location is missing, that house is excluded from the analysis.
- If the price is missing, impute it with the average price of that area; if there are fewer than two house prices in that area, that area is excluded from the analysis.
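The two rules above can be sketched in Python as below. This is one reading of the procedure, not a definitive implementation: an area is dropped only when it contains a missing price and has fewer than two observed prices to average. The function name `apply_missing_data_rules` and the `area` key are hypothetical; `prices` and `types` follow the column names of the data, and missing values are assumed to be represented as `None`.

```python
from statistics import mean


def apply_missing_data_rules(records):
    """Apply the missing-data procedures to a list of dicts (assumed layout)."""
    # Rule 1: exclude houses whose type or area (location) is missing.
    kept = [r for r in records if r.get("types") and r.get("area")]

    # Collect the observed (non-missing) prices for each area.
    observed = {}
    for r in kept:
        observed.setdefault(r["area"], [])
        if r.get("prices") is not None:
            observed[r["area"]].append(r["prices"])

    # Rule 2b: an area that needs imputation but has fewer than two
    # observed prices is excluded entirely (one interpretation of the plan).
    excluded = {
        r["area"]
        for r in kept
        if r.get("prices") is None and len(observed[r["area"]]) < 2
    }

    cleaned = []
    for r in kept:
        if r["area"] in excluded:
            continue
        # Rule 2a: replace a missing price with the area's average price.
        price = r["prices"] if r.get("prices") is not None else mean(observed[r["area"]])
        cleaned.append({**r, "prices": price})
    return cleaned
```

If the intended rule is instead to drop every area with fewer than two prices regardless of missingness, the `excluded` set would be built from all areas rather than only those containing a missing price.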
Summaries to be presented:
- Basic descriptive statistics for house price, including mean, standard deviation, median, etc.
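As an illustration of the summaries listed above, the following Python sketch computes the count, mean, median, and sample standard deviation of price within each group. The function name `summarise_prices` is hypothetical, and the record layout (dicts keyed by `prices` and `types`) is an assumption; the example rows reuse the first four prices and types shown in the `glimpse()` output.

```python
from statistics import mean, median, stdev


def summarise_prices(records, key):
    """Group records by `key` (e.g. "types") and summarise the "prices" field."""
    groups = {}
    for r in records:
        groups.setdefault(r[key], []).append(r["prices"])

    summary = {}
    for name, prices in groups.items():
        summary[name] = {
            "n": len(prices),
            "mean": mean(prices),
            "median": median(prices),
            # The sample standard deviation needs at least two observations.
            "sd": stdev(prices) if len(prices) > 1 else None,
        }
    return summary


# First four rows of the data as shown in the glimpse() output.
sample = [
    {"prices": 144500, "types": "Flat"},
    {"prices": 212750, "types": "Flat"},
    {"prices": 185000, "types": "Flat"},
    {"prices": 90000, "types": "Semi Detached"},
]
by_type = summarise_prices(sample, "types")
```

The same function could be called with a different grouping key, such as an area column, to produce the per-area summaries.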
Models to be fitted
- Linear model to be