NYC Taxi Fare Prediction v1.0
✦ Project Overview
A fare prediction system for NYC Yellow Taxi rides, built with regression techniques in R. Outliers are removed during data cleaning, Ridge regularization addresses multicollinearity, and the model is deployed as an interactive Shiny web app.
✦ Key Features
- Accurate fare prediction using Ridge Regression
- Handling of outliers and multicollinearity
- Interactive R Shiny web application for end users
- Visual data analysis of taxi ride patterns
✦ Methodology
A rigorous statistical modeling approach using R:
01. Data Preprocessing
Cleaned the NYC Taxi and Limousine Commission (TLC) dataset by removing outliers (e.g., negative fares, zero-distance trips) and imputing missing values.
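The cleaning step can be sketched in base R as follows. This is a minimal illustration, not the project's actual pipeline: the toy data frame and the column names (`fare_amount`, `trip_distance`, `passenger_count`) are assumptions, and median imputation is just one plausible choice, since the write-up does not say which imputation method was used.

```r
# Toy stand-in for a few TLC trip records (values are made up for illustration).
trips <- data.frame(
  fare_amount     = c(12.5, -3.0, 8.0, 25.0, 6.5),
  trip_distance   = c(2.1, 1.0, 0.0, 7.4, 0.9),
  passenger_count = c(1, 2, 1, NA, 3)
)

# Drop implausible rows: non-positive fares and zero-distance trips.
clean <- trips[trips$fare_amount > 0 & trips$trip_distance > 0, ]

# Impute missing passenger counts with the median of the observed values.
# (Assumption: the project does not state its imputation method.)
med <- median(clean$passenger_count, na.rm = TRUE)
clean$passenger_count[is.na(clean$passenger_count)] <- med

print(clean)
```

Filtering before imputing keeps the discarded rows from influencing the imputed value.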
02. Feature Engineering
Created new variables, such as a 'rush_hour' flag and a 'borough_crossing' indicator, to capture temporal and spatial pricing factors.
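As a rough sketch of how such features might be derived in base R: the column names `pickup_datetime`, `pickup_borough`, and `dropoff_borough` are hypothetical, and the rush-hour definition used here (weekdays, 07:00 to 09:59 and 16:00 to 18:59) is an assumption rather than the project's stated rule.

```r
# Toy trips with hypothetical column names (not taken from the project).
trips <- data.frame(
  pickup_datetime = as.POSIXct(
    c("2023-03-06 08:15:00", "2023-03-06 13:00:00",
      "2023-03-11 08:15:00", "2023-03-07 17:45:00"),
    tz = "America/New_York"
  ),
  pickup_borough  = c("Manhattan", "Queens", "Brooklyn", "Manhattan"),
  dropoff_borough = c("Manhattan", "Manhattan", "Brooklyn", "Brooklyn"),
  stringsAsFactors = FALSE
)

# Hour of day (0-23) and a weekday flag.
# %u is the ISO weekday number: 1 = Monday, ..., 7 = Sunday.
hour       <- as.integer(format(trips$pickup_datetime, "%H"))
is_weekday <- as.integer(format(trips$pickup_datetime, "%u")) <= 5

# Assumed rush-hour windows: weekdays, 07:00-09:59 and 16:00-18:59.
trips$rush_hour <- as.integer(is_weekday & hour %in% c(7:9, 16:18))

# 1 when pickup and dropoff fall in different boroughs, 0 otherwise.
trips$borough_crossing <- as.integer(trips$pickup_borough != trips$dropoff_borough)

print(trips[, c("rush_hour", "borough_crossing")])
```

Encoding both features as 0/1 integers lets them enter a linear model directly as indicator variables.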
03. Regularization
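Implemented Ridge Regression to mitigate multicollinearity between correlated features (e.g., trip duration and trip distance). The L2 penalty shrinks the coefficients of correlated predictors, which stabilizes the estimates and improves model generalization.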
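To illustrate the idea, the sketch below fits ridge regression with the closed-form estimate in base R on simulated data. The data, the variable names, and the penalty value `lambda = 50` are assumptions chosen for illustration. The write-up does not say which R package the project used, so no particular package is implied here.

```r
set.seed(42)
n <- 200

# Simulated trips: duration is strongly tied to distance, so the two
# predictors are highly correlated (values are synthetic).
trip_distance <- runif(n, 0.5, 15)                        # miles
trip_duration <- 3 * trip_distance + rnorm(n, sd = 1.5)   # minutes
fare_amount   <- 3 + 2.5 * trip_distance + 0.4 * trip_duration + rnorm(n, sd = 1)

# Standardize predictors so the penalty treats them on the same scale,
# and center the response so no intercept needs to be penalized.
X <- scale(cbind(trip_distance, trip_duration))
y <- fare_amount - mean(fare_amount)

# Closed-form ridge estimate: solve (X'X + lambda * I) beta = X'y.
ridge_coef <- function(X, y, lambda) {
  drop(solve(crossprod(X) + lambda * diag(ncol(X)), crossprod(X, y)))
}

beta_ols   <- ridge_coef(X, y, lambda = 0)    # lambda = 0 reduces to least squares
beta_ridge <- ridge_coef(X, y, lambda = 50)   # penalized fit

print(rbind(ols = beta_ols, ridge = beta_ridge))
```

With `lambda = 0` the estimate is ordinary least squares. Any positive `lambda` shrinks the coefficient vector toward zero, trading a little bias for lower variance when predictors are correlated. In practice the penalty strength is usually chosen by cross-validation rather than fixed by hand.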