How to retrain a model if a subset of predictions needs improvement

In my work, my clients sometimes complain that a subset of predictions is not accurate. I have assumed it is nearly impossible to change the model just to fit that subgroup while the other predictions are doing well. But is that really the case? Other than building another model specifically for that subset, is there anything I can do to improve the predictions within that subgroup? What kinds of adjustments are possible?



The real answer is to take a deeper dive into why that subgroup is not performing. Its members may have different characteristics and may need a different model. However, when first looking into this, treat it as a business problem more than a tuning problem. I would start by pulling out this group, looking at its cases one by one, and checking whether there is a business explanation for why the group is not performing the way you think it should. Certainly, if the business is complaining, they should be able to give you more insight into this.
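As a starting point for that deeper dive, it can help to quantify how far the subgroup lags behind and then pull out its rows for manual review. The sketch below is only an illustration: the segment names, the numbers, and the helper `mae_by_group` are all hypothetical, and mean absolute error is just one possible metric.

```python
from collections import defaultdict


def mae_by_group(records):
    """Return the mean absolute error per subgroup.

    records: iterable of (group, y_true, y_pred) tuples.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for group, y_true, y_pred in records:
        totals[group] += abs(y_true - y_pred)
        counts[group] += 1
    return {g: totals[g] / counts[g] for g in totals}


# Hypothetical predictions: (segment, actual, predicted)
records = [
    ("retail", 10.0, 11.0),
    ("retail", 12.0, 12.5),
    ("wholesale", 50.0, 38.0),
    ("wholesale", 55.0, 47.0),
]

errors = mae_by_group(records)

# Subgroup with the largest average error
worst = max(errors, key=errors.get)

# Rows of that subgroup, to be inspected one by one with the business
to_review = [r for r in records if r[0] == worst]

print(errors)
print("Review these rows:", to_review)
```

The per-group numbers only show where the model is weak; the explanation for why usually comes from going through the `to_review` rows with someone who knows the business.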


It is common for some subset or subgroup of the data to be predicted poorly by an ML model. This can happen for several reasons:

  1. The subcategory may not have enough data to learn from
  2. Bias in how the data was captured
  3. The features created are not able to capture the subcategory

As you mentioned, retraining alone may not improve the model, but you could try the following:

  1. Row-Wise Weighting: Increase the weights of these rows during training so that the model gives them a little more importance while learning. XGBoost supports this natively through per-row sample weights

  2. Feature Engineering: Spend a little more time on feature engineering to capture variables that may help with this specific group

  3. More Data: Collecting more data for this subgroup is often almost impossible, but see whether you can
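To make the row-wise weighting idea concrete, here is a minimal sketch that uses a hand-written weighted least-squares line fit in place of a real model, so it runs with the standard library only. The toy data, the weight of 5, and the helper names are assumptions made for illustration. With XGBoost or scikit-learn estimators, the equivalent step would be passing per-row weights through the `sample_weight` argument of `fit`.

```python
def fit_weighted_line(xs, ys, ws):
    """Weighted least-squares fit of y = a + b * x; returns (a, b)."""
    w_sum = sum(ws)
    x_bar = sum(w * x for w, x in zip(ws, xs)) / w_sum
    y_bar = sum(w * y for w, y in zip(ws, ys)) / w_sum
    sxy = sum(w * (x - x_bar) * (y - y_bar) for w, x, y in zip(ws, xs, ys))
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(ws, xs))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


def mse(a, b, xs, ys):
    """Mean squared error of the line y = a + b * x on the given points."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Toy data (hypothetical): the first six rows follow y = 2x,
# the last two rows are the poorly predicted subgroup (y = 3x).
xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [0, 2, 4, 6, 8, 10, 18, 21]
in_subgroup = [False] * 6 + [True] * 2

uniform = [1.0] * len(xs)
# Give subgroup rows 5x the weight; the factor is a tuning choice
boosted = [5.0 if s else 1.0 for s in in_subgroup]

sub_x = [x for x, s in zip(xs, in_subgroup) if s]
sub_y = [y for y, s in zip(ys, in_subgroup) if s]
rest_x = [x for x, s in zip(xs, in_subgroup) if not s]
rest_y = [y for y, s in zip(ys, in_subgroup) if not s]

a_u, b_u = fit_weighted_line(xs, ys, uniform)
a_w, b_w = fit_weighted_line(xs, ys, boosted)

sub_mse_uniform = mse(a_u, b_u, sub_x, sub_y)
sub_mse_boosted = mse(a_w, b_w, sub_x, sub_y)
rest_mse_uniform = mse(a_u, b_u, rest_x, rest_y)
rest_mse_boosted = mse(a_w, b_w, rest_x, rest_y)

print("subgroup MSE:", sub_mse_uniform, "->", sub_mse_boosted)
print("other rows MSE:", rest_mse_uniform, "->", rest_mse_boosted)
```

Upweighting lowers the error on the subgroup but raises it on the remaining rows, so the weight should be chosen while watching both numbers on held-out data.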
