Is it possible to use a hard-coded decision tree on some variables and a random forest (or similar) on the remaining ones?
The situation is that for some variables it is possible to make strong empirical assumptions, while for others their relative importance seems more random.
For example: the researcher is certain that splitting X1 at 5 and X2 at 3 gives the most information, since these are empirically sound splits, e.g. based on stakeholder views. X1 and X2 are also more important than X3, X4 and X5, since X3, X4 and X5 add little on their own if X1 or X2 are absent.
Thus the model could essentially be based on X1 and X2 only, but X3, X4 and X5 should still add explanatory power. Their relative importances, however, are not known. Extending the hand-built decision tree to them might introduce inaccuracies, whereas a random forest or something similar could perhaps offer better protection against overfitting.
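One way to sketch what I have in mind (a hypothetical illustration, not a finished approach): hard-code the expert splits on X1 and X2 to partition the data, then fit a separate random forest on X3, X4, X5 within each partition. The thresholds 5 and 3 come from the scenario above; the data, target rule and hyperparameters below are made up for demonstration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic data: 5 features, target depends on X1, X2 and X3 (assumed rule).
rng = np.random.default_rng(0)
X = rng.normal(loc=4, scale=2, size=(400, 5))
y = ((X[:, 0] > 5) ^ (X[:, 1] > 3) ^ (X[:, 2] > 4)).astype(int)

def partition(X):
    """Leaf index (0..3) from the two fixed expert splits: X1 > 5, X2 > 3."""
    return (X[:, 0] > 5).astype(int) * 2 + (X[:, 1] > 3).astype(int)

# One forest per hard-coded leaf, trained only on the remaining features X3..X5.
forests = {}
for leaf in range(4):
    mask = partition(X) == leaf
    f = RandomForestClassifier(n_estimators=50, random_state=0)
    f.fit(X[mask, 2:], y[mask])
    forests[leaf] = f

def predict(X_new):
    """Route each row through the fixed splits, then to that leaf's forest."""
    leaves = partition(X_new)
    out = np.empty(len(X_new), dtype=int)
    for leaf, f in forests.items():
        m = leaves == leaf
        if m.any():
            out[m] = f.predict(X_new[m, 2:])
    return out

print("training accuracy:", (predict(X) == y).mean())
```

Essentially this is manual stratification: the expert splits define subpopulations, and the forests are free to learn whatever ordering of X3, X4, X5 works within each one. The obvious caveat is that each leaf must contain enough data to train a forest.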
Topic decision-trees
Category Data Science