Keras in R: my neural network reports very low accuracy, but its predictions are good, and I don't know why

I used the classic MNIST dataset, which has 784 pixel columns and one label column (digits 0 to 9), and I want to map each image to its seven-segment display representation. My code is below.

library(keras)  # array_reshape(), layers, compile(), fit()
library(caret)  # confusionMatrix()

# Convert the labels (digits) in train_set and test_set to seven-segment display
train_set$a <- ifelse(train_set$V1 %in% c(0,2,3,5,6,7,8,9),1,0)
train_set$b <- ifelse(train_set$V1 %in% c(0,1,2,3,4,7,8,9),1,0)
train_set$c <- ifelse(train_set$V1 %in% c(0,1,3,4,5,6,7,8,9),1,0)
train_set$d <- ifelse(train_set$V1 %in% c(0,2,3,5,6,8),1,0)
train_set$e <- ifelse(train_set$V1 %in% c(0,2,6,8),1,0)
train_set$f <- ifelse(train_set$V1 %in% c(0,4,5,6,8,9),1,0)
train_set$g <- ifelse(train_set$V1 %in% c(2,3,4,5,6,8,9),1,0)

test_set$a <- ifelse(test_set$V1 %in% c(0,2,3,5,6,7,8,9),1,0)
test_set$b <- ifelse(test_set$V1 %in% c(0,1,2,3,4,7,8,9),1,0)
test_set$c <- ifelse(test_set$V1 %in% c(0,1,3,4,5,6,7,8,9),1,0)
test_set$d <- ifelse(test_set$V1 %in% c(0,2,3,5,6,8),1,0)
test_set$e <- ifelse(test_set$V1 %in% c(0,2,6,8),1,0)
test_set$f <- ifelse(test_set$V1 %in% c(0,4,5,6,8,9),1,0)
test_set$g <- ifelse(test_set$V1 %in% c(2,3,4,5,6,8,9),1,0)

# Split the given train data to train_x and train_y
# Reshaping the training pixels and labels data to arrays
train_x <- as.matrix(train_set[, 2:785])
train_x <- array_reshape(train_x, c(nrow(train_x), 784))
train_y <- as.matrix(train_set[, 786:792])
train_y <- array_reshape(train_y, c(nrow(train_y), 7))

# Split the given test data to test_x and test_y
# Reshaping the testing pixels and labels data to arrays
test_x <- as.matrix(test_set[, 2:785])
test_x <- array_reshape(test_x, c(nrow(test_x), 784))
test_y <- as.matrix(test_set[, 786:792])
test_y <- array_reshape(test_y, c(nrow(test_y), 7))

# Normalize inputs from 0-255 to 0-1
train_x <- train_x / 255
test_x <- test_x / 255

# Build the Model
image_size <- 784 # 28*28
num_classes <- 7 # 7 segment display of the digits
model <- keras_model_sequential()
model %>%
  # Hidden Layers
  layer_dense(units = 512, activation = 'relu', input_shape = c(image_size)) %>%
  layer_dropout(rate = 0.25) %>%
  layer_dense(units = 256, activation = 'relu') %>%
  layer_dropout(rate = 0.5) %>%
  # Output Layer
  layer_dense(units = num_classes, activation = 'sigmoid')

# Summary of the model
summary(model)

# Compile the neural network
model %>% compile(
  loss = 'binary_crossentropy',
  optimizer = 'adam',
  metrics = c('accuracy')
)

# Modeling on Training Dataset
model %>% fit(
  train_x, train_y,
  epochs = 5, batch_size = 128,
  validation_data = list(test_x, test_y)
)
# Prediction
pred <- predict_proba(model, test_x)
pred <- round(as.data.frame(pred))

# Decode each rounded seven-segment pattern (V1..V7 = a..g) back to a digit;
# rows that match no digit pattern stay NA
test_set$predict <- ifelse(pred$V1==1 & pred$V2==1 & pred$V3==1 & pred$V4==1 &
                             pred$V5==1 & pred$V6==1 & pred$V7==0,0,NA)

test_set$predict <- ifelse(pred$V1==0 & pred$V2==1 & pred$V3==1 & pred$V4==0 &
                             pred$V5==0 & pred$V6==0 & pred$V7==0,1,test_set$predict)

test_set$predict <- ifelse(pred$V1==1 & pred$V2==1 & pred$V3==0 & pred$V4==1 &
                             pred$V5==1 & pred$V6==0 & pred$V7==1,2,test_set$predict)

test_set$predict <- ifelse(pred$V1==1 & pred$V2==1 & pred$V3==1 & pred$V4==1 &
                             pred$V5==0 & pred$V6==0 & pred$V7==1,3,test_set$predict)

test_set$predict <- ifelse(pred$V1==0 & pred$V2==1 & pred$V3==1 & pred$V4==0 &
                             pred$V5==0 & pred$V6==1 & pred$V7==1,4,test_set$predict)

test_set$predict <- ifelse(pred$V1==1 & pred$V2==0 & pred$V3==1 & pred$V4==1 &
                             pred$V5==0 & pred$V6==1 & pred$V7==1,5,test_set$predict)

test_set$predict <- ifelse(pred$V1==1 & pred$V2==0 & pred$V3==1 & pred$V4==1 &
                             pred$V5==1 & pred$V6==1 & pred$V7==1,6,test_set$predict)

test_set$predict <- ifelse(pred$V1==1 & pred$V2==1 & pred$V3==1 & pred$V4==0 &
                             pred$V5==0 & pred$V6==0 & pred$V7==0,7,test_set$predict)

test_set$predict <- ifelse(pred$V1==1 & pred$V2==1 & pred$V3==1 & pred$V4==1 &
                             pred$V5==1 & pred$V6==1 & pred$V7==1,8,test_set$predict)

test_set$predict <- ifelse(pred$V1==1 & pred$V2==1 & pred$V3==1 & pred$V4==0 &
                             pred$V5==0 & pred$V6==1 & pred$V7==1,9,test_set$predict)

confusionMatrix(factor(test_set$predict), factor(test_set$V1))

It turned out that my model always showed only around 20% to 30% accuracy. However, when I used the model to predict and converted the outputs back into digit labels, the accuracy was quite good, roughly 85% every time. I don't know which part of my model is wrong. Can someone help me out? Much appreciated!

The dataset can be downloaded here: https://www.kaggle.com/zalando-research/fashionmnist

[Images in the original post: the accuracy of my model, the accuracy of my prediction, and the seven-segment display chart.]


It looks like you are mixing two different notions of accuracy, hence the difference in values:

  1. Your network is currently set up to predict a value between 0 and 1 for each of the seven outputs (activation = 'sigmoid'), so an output might look like [0.9 0.4 0.3 ... 0.2]. With this set-up and the metric 'accuracy', Keras will infer that you want to calculate binary accuracy. This is not the same as categorical accuracy.

  2. When you run the prediction part, you are using categorical accuracy: you compare the decoded digit with the true digit.
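
To make the distinction concrete, here is a small base-R sketch that scores the same toy predictions three ways. The probabilities are invented for illustration, and the three formulas are my reading of what "binary accuracy", "categorical accuracy" (an argmax comparison) and a per-digit exact match compute; which of these Keras actually reports for metrics = 'accuracy' may depend on the Keras version and set-up, so treat this as an illustration rather than a description of your run.

```r
# Toy targets: seven-segment codes (columns a..g) for the digits 1, 7 and 8.
y_true <- rbind(
  c(0, 1, 1, 0, 0, 0, 0),  # digit 1
  c(1, 1, 1, 0, 0, 0, 0),  # digit 7
  c(1, 1, 1, 1, 1, 1, 1)   # digit 8
)

# Hypothetical sigmoid outputs for the same three images (made-up numbers).
y_prob <- rbind(
  c(0.10, 0.80, 0.90, 0.20, 0.10, 0.10, 0.20),  # every segment on the right side of 0.5
  c(0.70, 0.95, 0.90, 0.10, 0.20, 0.10, 0.30),  # every segment on the right side of 0.5
  c(0.90, 0.80, 0.90, 0.70, 0.40, 0.90, 0.95)   # segment e is wrong
)
y_hat <- (y_prob > 0.5) * 1  # threshold at 0.5

# Binary accuracy: share of individual segments predicted correctly.
binary_acc <- mean(y_hat == y_true)

# Categorical accuracy: position of the largest prediction versus position of
# the largest target (ties go to the first column here, an assumption).
categorical_acc <- mean(max.col(y_prob, ties.method = "first") ==
                        max.col(y_true, ties.method = "first"))

# Exact-match accuracy: all seven segments right, i.e. the decoded digit is right.
exact_acc <- mean(rowSums(y_hat == y_true) == ncol(y_true))

print(c(binary = binary_acc, categorical = categorical_acc, exact = exact_acc))
```

On multi-hot targets like these, an argmax comparison says little, because several segments are "on" at once; for one-hot labels it coincides with comparing the decoded classes. The exact-match number corresponds to the confusion-matrix check in the question.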

To fix your problem, change the following:

  1. Set the final layer's activation function to 'softmax'.
  2. Use the 'categorical_crossentropy' loss function.

In this case, Keras should recognize that you want to measure categorical accuracy.
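
One caveat with this change: a softmax output picks exactly one class per image, so it fits a one-hot encoding of the ten digits (for example via keras's to_categorical() and a 10-unit output layer) rather than the seven-segment columns. If the seven-segment pattern is still what you want at the end, it could be recovered from the predicted digit with a lookup table. The sketch below assumes that set-up; the `prob` matrix is a made-up stand-in for the model's softmax output, and the `segments` table is copied from the membership lists in the question's ifelse() calls.

```r
# Seven-segment patterns (columns a..g) for the digits 0..9.
segments <- rbind(
  c(1, 1, 1, 1, 1, 1, 0),  # 0
  c(0, 1, 1, 0, 0, 0, 0),  # 1
  c(1, 1, 0, 1, 1, 0, 1),  # 2
  c(1, 1, 1, 1, 0, 0, 1),  # 3
  c(0, 1, 1, 0, 0, 1, 1),  # 4
  c(1, 0, 1, 1, 0, 1, 1),  # 5
  c(1, 0, 1, 1, 1, 1, 1),  # 6
  c(1, 1, 1, 0, 0, 0, 0),  # 7
  c(1, 1, 1, 1, 1, 1, 1),  # 8
  c(1, 1, 1, 0, 0, 1, 1)   # 9
)
dimnames(segments) <- list(0:9, letters[1:7])

# Hypothetical softmax output for three images: one column per digit 0..9,
# each row sums to 1. In practice this would come from predict(model, test_x).
prob <- matrix(0.02, nrow = 3, ncol = 10)
prob[1, 8] <- 0.82  # image 1: most mass on digit 7
prob[2, 1] <- 0.82  # image 2: most mass on digit 0
prob[3, 5] <- 0.82  # image 3: most mass on digit 4

# Predicted digit = column with the highest probability (columns are 1-based).
digit_hat <- max.col(prob, ties.method = "first") - 1

# Look up the seven-segment pattern of each predicted digit.
seg_hat <- segments[digit_hat + 1, , drop = FALSE]
print(seg_hat)
```

With this arrangement the per-digit accuracy can be read directly from digit_hat, and every prediction decodes to a valid digit, whereas independent sigmoid outputs can produce segment patterns that match no digit.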
