Project Name | Stars | Most Recent Commit | Open Issues | License | Language | Description |
---|---|---|---|---|---|---|
100 Days Of Ml Code | 201 | 2 months ago | | | Jupyter Notebook | A day to day plan for this challenge. Covers both theoretical and practical aspects |
Ineuron Full Stack Data Science Assignments | 68 | 4 months ago | | | Jupyter Notebook | This Repository consists of Assignments and projects of the iNeuron Full Stack Data Science Course |
Automobile Dataset Analysis | 20 | 2 years ago | | | Jupyter Notebook | This project analyzes and visualizes the Used Car Prices from the Automobile dataset in order to predict the most probable car price |
Walmart Sales Prediction | 19 | 5 years ago | | | Jupyter Notebook | |
The Data Analysis Workshop | 18 | 2 years ago | | mit | Jupyter Notebook | A New Interactive Approach to Learning Data Analysis |
Machine Learning Bookcamp 2022 | 15 | 4 months ago | | cc0-1.0 | Jupyter Notebook | Solutions for the Machine Learning Zoomcamp 2022 by DataTalks.Club. |
Bangalore House Prediction App | 7 | 2 years ago | | | Jupyter Notebook | Predicts home prices of Bangalore. Used Flutter, Flask and Jupyter Notebook. |
Predicting Baseball Statistics | 7 | 2 years ago | | | Jupyter Notebook | Predicting Baseball Statistics: Classification and Regression Applications in Python Using scikit-learn |
Data Science | 7 | 10 months ago | | | Jupyter Notebook | EDA and Machine Learning Models in R and Python (Regression, Classification, Clustering, SVM, Decision Tree, Random Forest, Time-Series Analysis, Recommender System, XGBoost) |
Olist_ecom_analysis | 6 | a year ago | 4 | | Jupyter Notebook | Data analysis about Brazilian e-commerce business Olist |
In this case study, you will prepare the Ames Housing dataset, provided as a CSV file, so that it is suitable for an ML algorithm. You will do this by first exploring the data and then applying feature transformations to the provided house price dataset. You will then train ML models using linear regression, ridge regression and lasso regression to predict house prices.
The target 'SalePrice' variable is highly correlated with features such as OverallQual, GrLivArea, GarageCars, GarageArea and TotalBsmtSF among others.
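One common way to check which features correlate with the target is to compute pairwise correlations with pandas. The sketch below uses a small synthetic DataFrame instead of the real Ames data; the column values and the coefficients used to generate SalePrice are made up for illustration only.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the housing data (assumption: values are invented).
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "OverallQual": rng.integers(1, 11, size=n),
    "GrLivArea": rng.normal(1500, 400, size=n),
    "YrSold": rng.integers(2006, 2011, size=n),
})
# The synthetic price depends on quality and living area, not on the year sold.
df["SalePrice"] = (
    20000 * df["OverallQual"]
    + 80 * df["GrLivArea"]
    + rng.normal(0, 15000, size=n)
)

# Pearson correlation of every feature with the target, highest first.
corr_with_target = (
    df.corr()["SalePrice"]
    .drop("SalePrice")
    .sort_values(ascending=False)
)
print(corr_with_target)
```

On the real dataset, the same call on the loaded DataFrame would rank its numeric columns by their correlation with SalePrice.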
Steps:
Split the dataset into a training set (X_train, y_train) and a test set (X_test, y_test)
Linear regression: R^2 score on training set: 0.94609, MSE on training set: 0.00808
Linear regression: R^2 score on test set: 0.89136, MSE on test set: 0.01472
Ridge regression (alpha=0.05): R^2 score on training set: 0.94598, R^2 score on test set: 0.89410
Lasso regression (alpha=0.0001): R^2 score on training set: 0.94169, R^2 score on test set: 0.90843
6.1 In practice, ridge regression is usually the first choice between these two regularized models (ridge and lasso).
6.2 However, if you have a large number of features and expect only a few of them to be important, lasso might be a better choice.
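The point about lasso and feature selection can be illustrated by counting non-zero coefficients. This is a minimal sketch on synthetic data from scikit-learn's make_regression, not on the Ames data; the sample size, feature counts, noise level and alpha=1.0 are assumptions chosen only to make the effect visible.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, Ridge

# Synthetic data: 50 features, only 5 of which actually influence the target.
X, y = make_regression(
    n_samples=200, n_features=50, n_informative=5, noise=10.0, random_state=0
)

ridge = Ridge(alpha=1.0).fit(X, y)
lasso = Lasso(alpha=1.0).fit(X, y)

# The L1 penalty in lasso can set coefficients to exactly zero,
# while the L2 penalty in ridge only shrinks them toward zero.
n_ridge = int(np.sum(ridge.coef_ != 0))
n_lasso = int(np.sum(lasso.coef_ != 0))
print(f"non-zero coefficients: ridge={n_ridge}, lasso={n_lasso}")
```

Lasso typically keeps far fewer features than ridge in this kind of setting, which is why it can be attractive when only a few features are expected to matter.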
R^2 score | Linear Regression | Ridge Regression | Lasso Regression |
---|---|---|---|
training set | 0.94609 | 0.94598 | 0.94169 |
test set | 0.89136 | 0.89410 | 0.90843 |
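
A comparison like the one in this table could be produced roughly as follows. This is a sketch under stated assumptions: it uses synthetic data from make_regression in place of the prepared Ames features, and the test_size and random_state values are illustrative. Only the alpha values (0.05 for ridge, 0.0001 for lasso) come from the results above, so the printed scores will not match the table.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the prepared feature matrix and target (assumption).
X, y = make_regression(
    n_samples=500, n_features=30, n_informative=10, noise=20.0, random_state=42
)

# Hold out part of the data for testing (the 80/20 split is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

models = {
    "Linear Regression": LinearRegression(),
    "Ridge Regression": Ridge(alpha=0.05),
    "Lasso Regression": Lasso(alpha=0.0001),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred_train = model.predict(X_train)
    pred_test = model.predict(X_test)
    results[name] = {
        "r2_train": r2_score(y_train, pred_train),
        "r2_test": r2_score(y_test, pred_test),
        "mse_train": mean_squared_error(y_train, pred_train),
        "mse_test": mean_squared_error(y_test, pred_test),
    }
    print(
        f"{name}: R^2 train={results[name]['r2_train']:.5f}, "
        f"R^2 test={results[name]['r2_test']:.5f}"
    )
```

Keeping the three models in one dictionary makes it easy to evaluate them on the same split and to add other alpha values for comparison.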