How to retrain a model if a subset of predictions needs improvement

Question

Hing

April 27, 2022, 15:41

In my work, clients sometimes complain that a subset of predictions is not accurate. I know it is nearly impossible to change the model to fit that subgroup without affecting the other predictions, which are going well. But is that really the case? Other than building another model specifically for that subset, is there anything I can do to improve the predictions within that subgroup? What kinds of adjustments are possible?

Topic data-science-model

Category Data Science

Ralph Winters · Accepted Answer · April 27, 2022, 15:41

The answer is really to take a deeper dive into why that subgroup is not performing. Its members may have different characteristics and may need a different model. However, when first looking into this, treat it as a business problem more than a tuning problem. I would start by pulling out this group and looking at its cases one by one to see whether there is a business explanation for why the group is not performing the way you think it should. Certainly, if the business is complaining, they should be able to give you more insight into this.
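Before going through cases one by one, it can help to confirm how much worse the subgroup actually is. The snippet below is only a minimal sketch: the group labels, the numbers, and the helper name `error_by_group` are all made up for illustration, and mean absolute error is just one possible metric.

```python
from collections import defaultdict


def error_by_group(groups, y_true, y_pred):
    """Return {group: (row_count, mean_absolute_error)}."""
    abs_errors = defaultdict(list)
    for group, truth, pred in zip(groups, y_true, y_pred):
        abs_errors[group].append(abs(truth - pred))
    return {g: (len(errs), sum(errs) / len(errs)) for g, errs in abs_errors.items()}


# Hypothetical example data: one segment label per prediction.
groups = ["retail", "retail", "retail", "wholesale", "wholesale"]
y_true = [10.0, 12.0, 11.0, 50.0, 40.0]
y_pred = [11.0, 12.0, 10.0, 30.0, 55.0]

report = error_by_group(groups, y_true, y_pred)

# List the worst-performing groups first.
for group, (n_rows, mae) in sorted(report.items(), key=lambda kv: -kv[1][1]):
    print(f"{group}: n={n_rows}, MAE={mae:.2f}")
```

A breakdown like this also shows how many rows the subgroup has, which is useful context when you ask the business why those cases behave differently.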

Ashwiniku918 · Accepted Answer · April 27, 2022, 06:07

It is common for some subset or subgroup of the data to be predicted poorly by an ML model. This can happen for several reasons:

The subcategory may not have enough data to learn from
Bias in how the data was captured
The engineered features are not able to capture the subcategory

As you mentioned, retraining alone may not improve the model, but you could try the following:

Row-wise weighting: Increase the weights of these rows during training so that the model gives them a little more importance while learning. XGBoost supports this natively through per-row sample weights.
Feature engineering: Spend a little more time on feature engineering to capture variables that may help with this specific group.

It is often almost impossible, but see whether you can collect more data.
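To illustrate the row-wise weighting idea, here is a rough sketch that uses a hand-written weighted least-squares line fit rather than XGBoost, so that it runs without extra libraries. The data, the weight of 5, and the helper names are assumptions made purely for illustration. With XGBoost's scikit-learn interface, the same kind of per-row weights would be passed as `sample_weight` to `fit`.

```python
def fit_weighted_line(xs, ys, weights):
    """Weighted least-squares fit of y = slope * x + intercept."""
    total = sum(weights)
    x_mean = sum(w * x for w, x in zip(weights, xs)) / total
    y_mean = sum(w * y for w, y in zip(weights, ys)) / total
    cov = sum(w * (x - x_mean) * (y - y_mean) for w, x, y in zip(weights, xs, ys))
    var = sum(w * (x - x_mean) ** 2 for w, x in zip(weights, xs))
    slope = cov / var
    return slope, y_mean - slope * x_mean


def mse(xs, ys, slope, intercept):
    """Mean squared error of the fitted line on the given rows."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical data: 10 majority rows follow y = 2x, 3 subgroup rows follow y = 3x.
major_x = [float(i) for i in range(10)]
major_y = [2.0 * x for x in major_x]
sub_x = [10.0, 11.0, 12.0]
sub_y = [3.0 * x for x in sub_x]

xs, ys = major_x + sub_x, major_y + sub_y
is_sub = [False] * len(major_x) + [True] * len(sub_x)

results = {}
for sub_weight in (1.0, 5.0):  # 1.0 means no reweighting
    weights = [sub_weight if flag else 1.0 for flag in is_sub]
    slope, intercept = fit_weighted_line(xs, ys, weights)
    results[sub_weight] = (
        mse(sub_x, sub_y, slope, intercept),      # error on the subgroup
        mse(major_x, major_y, slope, intercept),  # error on everyone else
    )
    print(f"subgroup weight {sub_weight}: "
          f"subgroup MSE={results[sub_weight][0]:.3f}, "
          f"majority MSE={results[sub_weight][1]:.3f}")
```

In this toy setup, upweighting the subgroup lowers its error but raises the error on the remaining rows, which is the trade-off the question describes. In practice you would pick the weight by checking both groups on held-out data.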
