When to split data into multiple regression models instead of one model?

I'm playing with regression models in scikit-learn. The goal is to predict how much inventory we should purchase for the next 90 days. My data set has hundreds of product categories, and each category has unique features that do not apply to the other categories.

For example, the Shirt category could have "size" and "color" features, whereas the Razors category could have a "number of blades" feature.

Should I split my data up by category and build a different model for each? Or is it sufficient to have one model in which I keep the product's category as one of the features?
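
To make the single-model option concrete, here is a rough sketch of what I have in mind (the data, column names, and the choice of RandomForestRegressor are made up for illustration; category-specific columns get a filler value for products they don't apply to):

    import pandas as pd
    from sklearn.compose import ColumnTransformer
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import OneHotEncoder

    # Toy data: "size" only applies to shirts, "num_blades" only to razors.
    df = pd.DataFrame({
        "category":   ["shirt", "shirt", "razor", "razor"],
        "size":       ["M", "L", "n/a", "n/a"],
        "num_blades": [0, 0, 3, 5],
        "demand_90d": [120, 80, 45, 60],   # target: units to buy for the next 90 days
    })

    preprocess = ColumnTransformer(
        [("onehot", OneHotEncoder(handle_unknown="ignore"), ["category", "size"])],
        remainder="passthrough",           # numeric columns pass through unchanged
    )
    single_model = Pipeline([("prep", preprocess),
                             ("reg", RandomForestRegressor(random_state=0))])
    single_model.fit(df[["category", "size", "num_blades"]], df["demand_90d"])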

Topic: regression, machine-learning

Category: Data Science


First question: what is the motivation?

If you got this question in real life, how would you tackle it for your business?

You should definitely split the data by category. This is similar to feature engineering: you want the most relevant data for each prediction.

Create a model for each of the labels you want to predict and make sure to choose the best predictive features.
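
A minimal sketch of that per-category setup, assuming a pandas DataFrame with a demand_90d target (the data, the feature choices, and LinearRegression are placeholders, not recommendations):

    import pandas as pd
    from sklearn.linear_model import LinearRegression

    # Toy data; in practice each category would have its own engineered features.
    df = pd.DataFrame({
        "category":   ["shirt", "shirt", "shirt", "razor", "razor", "razor"],
        "price":      [20.0, 25.0, 18.0, 8.0, 12.0, 15.0],
        "num_blades": [0, 0, 0, 3, 4, 5],
        "demand_90d": [120, 95, 140, 60, 55, 40],
    })

    # Hand-picked feature set per category (hypothetical choices).
    features_by_category = {
        "shirt": ["price"],
        "razor": ["price", "num_blades"],
    }

    models = {}
    for category, cols in features_by_category.items():
        subset = df[df["category"] == category]
        models[category] = LinearRegression().fit(subset[cols], subset["demand_90d"])

    # At prediction time, route each product to its category's model.
    new_razors = pd.DataFrame({"price": [10.0], "num_blades": [4]})
    print(models["razor"].predict(new_razors))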


You should split them by category, since the features of one category do not apply to the others.

If you can group some categories together based on business logic, you may be able to build fewer models, as in the sketch below.
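
A small sketch of that grouping idea, with a hypothetical mapping from categories to coarser groups (the data and model choice are again placeholders):

    import pandas as pd
    from sklearn.linear_model import LinearRegression

    df = pd.DataFrame({
        "category":   ["shirt", "trousers", "razor", "shaving_foam"],
        "price":      [20.0, 35.0, 12.0, 6.0],
        "demand_90d": [120, 70, 55, 200],
    })

    # Hypothetical business-logic mapping from category to a coarser group.
    category_to_group = {
        "shirt": "apparel", "trousers": "apparel",
        "razor": "grooming", "shaving_foam": "grooming",
    }
    df["group"] = df["category"].map(category_to_group)

    # One model per group instead of one per category.
    models = {
        group: LinearRegression().fit(subset[["price"]], subset["demand_90d"])
        for group, subset in df.groupby("group")
    }
    print(sorted(models))   # ['apparel', 'grooming']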
