Prediction of the demand of bike sharing

  • Writer: Philip Lai
  • May 28, 2023

Project Objective

Using the given data fields, including "season, holiday, working day, weather, temp, atemp, humidity, windspeed, registered, casual, and count (rental quantity)," predict future rental demand.


Data Ingestion

The data is provided by Kaggle.


Data Processing

1. Data Observation: Use train.info(), train.describe(), test.info(), and test.describe() to check for null values and outliers.

2. Feature Engineering

  1. Remove outliers.

  2. Merge the data.

  3. Split the datetime column into separate time fields such as 'date', 'hour', 'year', 'weekday', etc.

  4. Use distplot to observe the distributions of features such as 'temp', 'atemp', 'humidity', and 'windspeed'. This step reveals a problem with the distribution of windspeed.

  5. Use RandomForestRegressor with the features 'season', 'weather', 'humidity', 'month', 'year', 'temp', and 'atemp' to predict windspeed and replace the problematic values.




3. Split the merged data back into the 'train' and 'test' sets.

4. Apply a logarithm (log) transform to the count values so that their positively skewed (right-skewed) distribution becomes closer to a normal distribution.


5. Run the prediction and export the results.
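
The feature engineering steps above (splitting the datetime and predicting windspeed with a random forest) can be sketched roughly as follows. This is only an illustration under stated assumptions: the data frame is synthetic rather than the Kaggle data, and the windspeed problem is assumed to be a set of implausible zero readings, which the post itself does not specify. The variable names (data, wind_features, known, suspect) are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in for the merged train/test data (NOT the Kaggle data).
rng = np.random.default_rng(0)
n = 200
data = pd.DataFrame({
    "datetime": pd.Timestamp("2011-01-01") + pd.to_timedelta(np.arange(n), unit="h"),
    "season": rng.integers(1, 5, n),
    "weather": rng.integers(1, 5, n),
    "humidity": rng.integers(20, 100, n),
    "temp": rng.uniform(0, 40, n),
    "windspeed": rng.uniform(6, 30, n),
})
data["atemp"] = data["temp"] + rng.normal(0, 2, n)

# ASSUMPTION: the windspeed issue is a block of suspicious zero readings.
data.loc[rng.choice(n, 40, replace=False), "windspeed"] = 0.0

# Step 2.3: split the datetime column into separate time fields.
dt = data["datetime"].dt
data["date"] = dt.date
data["hour"] = dt.hour
data["year"] = dt.year
data["month"] = dt.month
data["weekday"] = dt.weekday

# Step 2.5: train on rows with a plausible windspeed, predict the rest.
wind_features = ["season", "weather", "humidity", "month", "year", "temp", "atemp"]
known = data[data["windspeed"] != 0]
suspect = data[data["windspeed"] == 0]

wind_model = RandomForestRegressor(n_estimators=50, random_state=0)
wind_model.fit(known[wind_features], known["windspeed"])
data.loc[suspect.index, "windspeed"] = wind_model.predict(suspect[wind_features])
```

Because a random forest averages the training targets in each leaf, the filled-in values stay within the range of the observed non-zero windspeeds.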


Analysis Methods

Because this is a supervised learning problem, in which a multi-feature dataset with known outcomes ('count') is used to predict unknown outcomes, random forest regression is used for the prediction.
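
A minimal sketch of random forest regression on a log-transformed target, combining the analysis method with the log transform from step 4, might look like the following. It runs on synthetic data, uses only two illustrative features (hour and temp), and uses log1p/expm1 instead of a plain log so that zero counts are handled; all of these are assumptions for the sake of the example, not the post's actual setup.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic, right-skewed rental counts (NOT the Kaggle data).
rng = np.random.default_rng(42)
n = 500
hour = rng.integers(0, 24, n)
temp = rng.uniform(0, 40, n)
X = np.column_stack([hour, temp])
count = np.round(np.exp(rng.normal(3 + 0.05 * temp, 0.5))).astype(int)

# Hold out the last 100 rows to play the role of the 'test' set.
X_train, X_test = X[:400], X[400:]

# Train on the log scale; log1p(x) = log(1 + x) is defined for x = 0.
y_train_log = np.log1p(count[:400])

model = RandomForestRegressor(n_estimators=100, random_state=42)
model.fit(X_train, y_train_log)

# Predict on the log scale, then invert the transform to get counts.
pred_log = model.predict(X_test)
pred_count = np.expm1(pred_log)
```

Training on the log scale reduces the influence of the few very large counts; the predictions must be converted back with the inverse transform before they are reported.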


Presentation

Obtain the expected rental quantity for each time point.
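
As a rough illustration of the final output, the sketch below builds a table with one predicted rental count per time point and serializes it as CSV. The timestamps, the log-scale predictions, and the column names "datetime" and "count" are hypothetical values chosen for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical test timestamps and model predictions on the log1p scale.
timestamps = pd.to_datetime([
    "2011-01-20 00:00:00",
    "2011-01-20 01:00:00",
    "2011-01-20 02:00:00",
])
pred_log = np.array([2.5, 1.8, 1.2])

# Invert the log1p transform and round to whole rentals.
submission = pd.DataFrame({
    "datetime": timestamps,
    "count": np.expm1(pred_log).round().astype(int),
})

# Serialize to CSV text; pass a file path to to_csv to write a file instead.
csv_text = submission.to_csv(index=False)
print(csv_text)
```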


 
 
 
