-
Notifications
You must be signed in to change notification settings - Fork 43
Expand file tree
/
Copy pathProject_Abstract.txt
More file actions
36 lines (32 loc) · 1.66 KB
/
Project_Abstract.txt
File metadata and controls
36 lines (32 loc) · 1.66 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
TO
The code mania insructor
PROJECT ABSTRACT:
PROBLEM STATEMENT:
The purpose of the project is to predict median house values in californian districts, given many features from these districts.
The project also aims at building a model of housing prices in california using the california census data.
Districts or block groups are the smallest geographical units for which the US Census Bureau publishes sample data
(a block group tyically has a population of 600 to 3,000 people).
There are 20,640 districts in the project dataset.
CONTENT:
The data pertains to the houses found in a given California district and some summary stats about them based on the 1990 census data.
Be warned the data aren't cleaned so there are some preprocessing steps required!
The columns are as follows, their names are pretty self explanitory:
1.longitude
2.latitude
3.housing_median_age
4.total_rooms
5.total_bedrooms
6.population
7.households
8.median_income
9.median_house_value
10.ocean_proximity
APPROACH FOR THE PROBLEM STATEMENT:
There are many approaches inorder to tackle the problem.
But as per our knowledge and ability i am following Linear regression approach.
LINEAR REGRESSION:
Linear regression is another technique borrowed by machine learning from the field of statistics.
In statistics, linear regression is a linear approach to modeling the relationship between a scalar response (or dependent variable)
and one or more explanatory variables (or independent variables).
Linear regression was the first type of regression analysis to be studied rigorously, and to be used extensively in practical applications.
Linear regression draws the Bestfit line.