Changing a neural network to not overfit

Question

tempx

May 2, 2022, 20:16

I am trying to classify around 400K samples with 13 attributes. I first tried the SVM package from Python's sklearn, but it didn't work, and I then learned that SVMs are not well suited to classifying large datasets. Next I used sklearn's neural network implementation, with the following MLPClassifier:

MLPClassifier(solver='adam', alpha=1e-5, random_state=1, activation='relu', max_iter=500)

I trained the model on 200K samples and tested it on the remaining ones. The classification worked well. However, my concern is that the model is overtrained, i.e. overfit. Can you please guide me on how to choose the number of hidden layers and the number of nodes per layer so that it does not overfit? (I have learned that the default implementation uses a single hidden layer of 100 neurons. Is it OK to use the default as is?)

Topic scikit-learn neural-network classification

Category Data Science

Wickkiey · Accepted Answer · January 20, 2021, 05:18

To avoid overfitting when building a model with MLPClassifier:

Use early_stopping=True. This holds out part of the training data as a validation set and stops training when the validation score stops improving.
The default hidden layer size is fine for most cases, unless the data has a large number of features.
Since you have plenty of data, split it into train, validation, and test sets and compare the scores on each. A large gap between the training score and the validation or test score is the sign of overfitting.
Check several metrics (f1_score, precision, recall, etc.), not just accuracy. This is especially useful with an imbalanced dataset.
If you are highly concerned about overfitting, you can explore cross-validation, for example with cross_val_predict or cross_val_score. The standard deviation of the error across folds shows how consistently the model will perform on unseen data.
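As a rough sketch of the first four points, the snippet below turns on early stopping, keeps a separate test set, and compares training and test F1. The synthetic data from make_classification, the split sizes, the StandardScaler step, and the values of validation_fraction and n_iter_no_change are all assumptions for illustration and are not taken from the question.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import classification_report, f1_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the real data (assumption): 13 features, binary labels.
X, y = make_classification(
    n_samples=4000, n_features=13, n_informative=8, random_state=1
)

# Hold out a test set that is never seen during training or early stopping.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=1
)

# early_stopping=True makes MLPClassifier set aside validation_fraction of
# the training data and stop once the validation score has not improved
# for n_iter_no_change consecutive epochs.
clf = make_pipeline(
    StandardScaler(),  # assumption: MLPs usually train better on scaled inputs
    MLPClassifier(
        solver="adam",
        alpha=1e-5,
        activation="relu",
        random_state=1,
        max_iter=500,
        early_stopping=True,
        validation_fraction=0.1,
        n_iter_no_change=10,
    ),
)
clf.fit(X_train, y_train)

# A training score far above the test score would suggest overfitting.
train_f1 = f1_score(y_train, clf.predict(X_train))
test_f1 = f1_score(y_test, clf.predict(X_test))
print(f"train F1: {train_f1:.3f}  test F1: {test_f1:.3f}")

# Precision, recall, and F1 per class on the held-out test set.
print(classification_report(y_test, clf.predict(X_test)))
```

If the two F1 values are close, the default single hidden layer of 100 neurons is probably not overfitting on data of this shape; if the training score is much higher, a smaller hidden_layer_sizes or a larger alpha are the usual knobs to try.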

Beyond these general points, further tuning depends on the data you have.
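For the cross-validation suggestion, note that cross_val_predict returns out-of-fold predictions rather than one score per fold, so the sketch below uses cross_val_score to get a per-fold F1 and its standard deviation. The synthetic data, the 5-fold StratifiedKFold, and the scaling step are assumptions for illustration only.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the real data (assumption): 13 features, binary labels.
X, y = make_classification(
    n_samples=3000, n_features=13, n_informative=8, random_state=1
)

# Scaling lives inside the pipeline so it is refit on each training fold.
clf = make_pipeline(
    StandardScaler(),
    MLPClassifier(early_stopping=True, max_iter=500, random_state=1),
)

# Five stratified folds; each fold is used exactly once as unseen data.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)
scores = cross_val_score(clf, X, y, cv=cv, scoring="f1")

# A small standard deviation means the score is stable across folds.
print("F1 per fold:", np.round(scores, 3))
print(f"mean = {scores.mean():.3f}, std = {scores.std():.3f}")
```

A low mean with a low standard deviation points to underfitting rather than overfitting, while a high standard deviation suggests the model's performance depends heavily on which samples it was trained on.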
