Keras DNN ending with sigmoid: model.predict only produces values < 0.5. What does this indicate?

I'm trying a simple Keras project with Dense layers for binary classification. There are about 300,000 rows of data, and the labels are:

training_set['TARGET'].value_counts()    
0    282686
1     24825

My model looks like this

from keras import models, layers, regularizers

def build_model():
    model = models.Sequential()
    model.add(layers.Dense(64, activation='relu', kernel_regularizer=regularizers.l2(0.001),
                           input_shape=(train_data.shape[1],)))
    model.add(layers.Dropout(0.5))
    model.add(layers.Dense(32, kernel_regularizer=regularizers.l2(0.001), activation='relu'))
    model.add(layers.Dropout(0.5))
    model.add(layers.Dense(1, activation='sigmoid'))
    model.compile(optimizer='rmsprop',
                  loss='binary_crossentropy',
                  metrics=['accuracy'])

    return model

So it's binary classification ending with a sigmoid. My understanding is that I should get values close to 0 or close to 1. I've tried different model architectures, hyperparameters, epochs, batch sizes, etc., but when I run model.predict on my validation set the values never get above 0.5. Here are some samples.

20 epochs, 16384 batch size
max 0.458850622177124,  min 0.1022530049085617
max 0.47131556272506714,  min 0.057787925004959106 

20 epochs, 8192 batch size
max 0.42957592010498047,  min 0.060324762016534805
max 0.3811708390712738,  min 0.022215187549591064

20 epochs, 4096 batch size
max 0.3163970410823822,  min 0.0657803937792778 

20 epochs, 2048 batch size
max 0.21799422800540924,  min 0.03832605481147766 

Is this an indication that I'm doing something wrong?

[Plot of training and validation loss]



Contrary to the previous answer, I would say that your output layer configuration is correct, although I agree with that answer that your dropout is too high. A dropout rate of 0.5 means that 50% of the neurons in the layer are dropped on every update, so you are effectively throwing away half the layer's units and the model will not be able to learn much.
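For example, lowering the rate keeps most units active while still regularizing. A minimal sketch of the change (0.3 here is only an illustrative value, not something tuned on your data):

model.add(layers.Dense(64, activation='relu', kernel_regularizer=regularizers.l2(0.001),
                       input_shape=(train_data.shape[1],)))
model.add(layers.Dropout(0.3))   # drop 30% of the units instead of 50%
model.add(layers.Dense(32, kernel_regularizer=regularizers.l2(0.001), activation='relu'))
model.add(layers.Dropout(0.3))
model.add(layers.Dense(1, activation='sigmoid'))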

Another point I would mention: try adam as your optimizer, as it tends to give better results most of the time.
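Only the optimizer string needs to change in your compile call (a sketch; rmsprop vs. adam is still something you would want to compare on your own data):

model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])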

In short, to improve your accuracy, do hyperparameter tuning over things like the number of layers, number of neurons, optimizer, learning rate, activation function, batch size, and number of epochs. RandomizedSearchCV is a good fit for this; see the sketch below.
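A minimal sketch of what that could look like, assuming the classic keras.wrappers.scikit_learn.KerasClassifier wrapper is available in your Keras version (newer setups use the scikeras package instead) and that train_data / train_labels are NumPy arrays; the parameter values are illustrative only:

from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import RandomizedSearchCV

# wrap the build_model() function defined in the question
clf = KerasClassifier(build_fn=build_model, verbose=0)

param_distributions = {
    'batch_size': [256, 512, 1024, 2048],
    'epochs': [10, 20, 40],
}

search = RandomizedSearchCV(clf, param_distributions, n_iter=5, cv=3)
search.fit(train_data, train_labels)
print(search.best_params_, search.best_score_)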


I think the dropout is a bit high, and if it's binary classification, why a single node at the end?

That said, it's absolutely okay to use a sigmoid for binary classification, so the answer provided below isn't strictly necessary.

If you do go with softmax, make sure your target variable has the proper shape (one-hot encoded, e.g. via to_categorical()).
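A minimal sketch of that conversion (assuming train_labels holds the integer 0/1 targets):

from keras.utils import to_categorical

num_classes = 2                                   # binary problem, so two columns
train_labels_onehot = to_categorical(train_labels, num_classes)
# shape goes from (n,) to (n, 2), matching a 2-unit softmax output layer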

def build_model():
    model = models.Sequential()
    model.add(layers.Dense(64, activation='relu', kernel_regularizer=regularizers.l2(0.001),
                           input_shape=(train_data.shape[1],)))
    model.add(layers.Dropout(0.3))
    model.add(layers.Dense(32, kernel_regularizer=regularizers.l2(0.001), activation='relu'))
    model.add(layers.Dropout(0.3))
    model.add(layers.Dense(num_classes, activation='softmax'))  # num_classes = 2 for this binary problem
    model.compile(optimizer='rmsprop',
                  loss='categorical_crossentropy',  # softmax with one-hot targets needs categorical cross-entropy
                  metrics=['accuracy'])

    return model

To improve it further, you can use techniques such as cross-validation and batch normalization, and perhaps increase the number of epochs; a sketch of adding batch normalization follows.
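As an illustration of the batch normalization idea (a sketch only; where exactly to place the layer, and whether before or after the activation, is itself something to experiment with):

model.add(layers.Dense(64, kernel_regularizer=regularizers.l2(0.001),
                       input_shape=(train_data.shape[1],)))
model.add(layers.BatchNormalization())   # normalize the pre-activations
model.add(layers.Activation('relu'))
model.add(layers.Dropout(0.3))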
