How do I do multi-level time series forecasting?

I just finished a 48-hour datathon, my first, and to be honest I did terribly.

We were given a list of datasets:

  • 5 months of taxi demand data (January to May)
  • Weather dataset
  • Zone neighbors
  • dt (date and time of prediction)

We were told to build a time series model to forecast taxi demand using a rolling (walk-forward) scheme: train on January and test on February, train on February and test on March, and so on. The final evaluation would run our model on June data, which we were not given. I just couldn't figure out how to do it.
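A minimal sketch of how the consecutive train/test months could be paired up, using only the standard library. The row layout (a timestamp plus a demand value), the year 2020, and the helper names `walk_forward_pairs` and `split_by_month` are assumptions for illustration; the real dataset format was not specified.

```python
from datetime import datetime


def walk_forward_pairs(months):
    """Return (train_month, test_month) pairs for consecutive months."""
    return list(zip(months[:-1], months[1:]))


def split_by_month(rows, train_month, test_month):
    """Split (timestamp, demand) rows into train and test lists by calendar month."""
    train = [r for r in rows if r[0].month == train_month]
    test = [r for r in rows if r[0].month == test_month]
    return train, test


# Hypothetical toy data: two demand readings per month, January to May.
rows = [(datetime(2020, m, d), 10 * m + d) for m in range(1, 6) for d in (1, 15)]

pairs = walk_forward_pairs([1, 2, 3, 4, 5])
for train_month, test_month in pairs:
    train, test = split_by_month(rows, train_month, test_month)
    print(f"train on month {train_month} ({len(train)} rows), "
          f"test on month {test_month} ({len(test)} rows)")
```

Five months give four train/test pairs. A model trained on all five months would then be the natural candidate for the unseen June data.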

This is how I wanted to do it (code snippet from https://www.kaggle.com/code/ryanholbrook/hybrid-models):

# 1. Train and predict with first model
model_1.fit(X_train_1, y_train)
y_pred_1 = model_1.predict(X_train_1)

# 2. Train and predict with second model on residuals
model_2.fit(X_train_2, y_train - y_pred_1)
y_pred_2 = model_2.predict(X_train_2)

# 3. Add to get overall predictions
y_pred = y_pred_1 + y_pred_2

Except I'd have to do it 4 times, because 5 months of data give 4 consecutive train/test pairs.
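The residual-fitting idea in the snippet above can be sketched end to end in plain Python. This is only an illustration under stated assumptions: the first model is a least-squares linear trend, the second is a per-hour mean of the residuals (a stand-in for whatever second learner would actually be used), and the hourly toy series is made up. Unlike the snippet, it predicts on future time steps rather than on the training rows.

```python
def fit_linear_trend(t, y):
    """Ordinary least squares for y = a + b * t (closed form)."""
    n = len(t)
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    b = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y)) / sxx
    a = y_mean - b * t_mean
    return a, b


def fit_hourly_residuals(hours, residuals):
    """Second model: mean residual for each hour of day."""
    sums, counts = {}, {}
    for h, r in zip(hours, residuals):
        sums[h] = sums.get(h, 0.0) + r
        counts[h] = counts.get(h, 0) + 1
    return {h: sums[h] / counts[h] for h in sums}


# Hypothetical toy series: 3 days of hourly demand with a trend and a morning peak.
t_train = list(range(72))
hours_train = [t % 24 for t in t_train]
y_train = [50 + 0.5 * t + (20 if 7 <= t % 24 <= 9 else 0) for t in t_train]

# 1. Fit the first model (trend) and predict on the training period.
a, b = fit_linear_trend(t_train, y_train)
y_pred_1 = [a + b * t for t in t_train]

# 2. Fit the second model on the residuals of the first.
residuals = [y - p for y, p in zip(y_train, y_pred_1)]
hourly = fit_hourly_residuals(hours_train, residuals)

# 3. Forecast the next 24 hours as trend plus hourly residual correction.
t_test = list(range(72, 96))
y_pred = [a + b * t + hourly.get(t % 24, 0.0) for t in t_test]
```

The same three steps would be repeated for each train/test month pair, scoring `y_pred` against the held-out month each time.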

And we were given this function to evaluate our model before submitting:

def predict(self, demand, weather, dt, neighbors):

But I just couldn't figure out how to implement this.
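One way the given `predict` signature might be filled in is with a seasonal-naive baseline. Everything about the argument formats here is an assumption, since the post does not describe them: `demand` is taken to be a list of `(timestamp, zone, count)` tuples, `neighbors` a dict mapping each zone to a list of adjacent zones, `dt` a `datetime`, and `weather` is accepted but ignored. The class name `SeasonalNaiveModel` is hypothetical.

```python
from collections import defaultdict
from datetime import datetime


class SeasonalNaiveModel:
    """Baseline: average past demand at the same weekday and hour as dt."""

    def predict(self, demand, weather, dt, neighbors):
        # Collect past counts per zone that match dt's weekday and hour.
        # `weather` is unused in this baseline (its schema is unknown).
        by_zone = defaultdict(list)
        for ts, zone, count in demand:
            if ts < dt and ts.hour == dt.hour and ts.weekday() == dt.weekday():
                by_zone[zone].append(count)
        own = {z: sum(v) / len(v) for z, v in by_zone.items()}

        forecast = {}
        for zone, adjacent in neighbors.items():
            if zone in own:
                forecast[zone] = own[zone]
            else:
                # No matching history: fall back to the neighbors' average.
                vals = [own[n] for n in adjacent if n in own]
                forecast[zone] = sum(vals) / len(vals) if vals else 0.0
        return forecast


# Hypothetical toy inputs: zone "A" has two Mondays of history, zone "B" has none.
demand = [
    (datetime(2020, 1, 6, 8), "A", 10),
    (datetime(2020, 1, 13, 8), "A", 14),
]
neighbors = {"A": ["B"], "B": ["A"]}
forecast = SeasonalNaiveModel().predict(demand, None, datetime(2020, 1, 20, 8), neighbors)
print(forecast)
```

A baseline like this gives a score to beat before adding weather features or a learned model behind the same method.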

My question: how should I have solved this problem, and with which algorithms?

If you could point me to some example code, that would really help. Thanks.

