Poor neural net regression fit to data that exhibit clear structure
I've been trying to use a simple NN to model data I've generated. The data lack a closed-form expression but exhibit clear structure. The MWE below emulates similar data. Any NN I build, including the one in the MWE below, seems to simultaneously over- and under-predict the values (in my actual data, the fishtail in the predicted-vs-actual plot straddles the y=x line).
I've tried tuning the model's hyperparameters, adding inputs, adding training data, selectively pruning training data, and adding/removing layers and neurons. In every case, some form of this behavior persists.
I am new to machine learning, so it's quite possible there's something basic that I'm missing. I guess I'm wondering:
- Is there something obvious I'm doing wrong, either in my approach or in my understanding of where NNs are applicable?
- If not, is this caused by the structure of the data itself? I know local minima in the loss function can cause issues, and I'm worried that the model is overfitting the middle (yellow) section of the data at the expense of the sides, which end up poorly estimated (see the residual check after the fit code below). I don't know if there's an easy way to fix that.
I'd appreciate any suggestions/pointers. Thanks!
Sample training data
#!/usr/bin/env python
import numpy
import matplotlib
from matplotlib import pyplot as plot
import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras.callbacks import EarlyStopping
x = y = numpy.linspace(0,1,100)
net = numpy.array(numpy.meshgrid(x,y))
net = net.T.reshape(-1, 2)
z = []
# Generate some data with a bit of noise
for point in net:
    z.append(-(point[1] - point[0])**2 + 0.1*numpy.random.uniform(low=-0.1, high=0.1))
z = numpy.array(z)
plot.scatter(net[:,0], net[:,1], c = z)
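For reference, the same data can be generated in one vectorized line (z_alt is just a stand-in name; the noise term works out to roughly ±0.01):

# One-line equivalent of the loop above; the noise is 0.1 * U(-0.1, 0.1), i.e. about ±0.01
z_alt = -(net[:, 1] - net[:, 0])**2 + 0.1*numpy.random.uniform(low=-0.1, high=0.1, size=len(net))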
Simple neural net fit
# Rescale data
zmin = min(z)
zmax = max(z)
z = (z - zmin)/(zmax - zmin)
params = net
input_size = 2
output_size = 1
# Define a simple NN with 1 hidden layer
norm = layers.Normalization(axis=-1)
norm.adapt(numpy.array(params))
models = tf.keras.Sequential([
    norm,
    layers.Dense(10000, input_dim=input_size, activation='relu'),
    layers.Dense(output_size, activation='linear')])
# Add condition to quit training, compile, and fit model
callback = EarlyStopping(monitor='val_loss', min_delta=0, patience=5)
custom_optimizer = tf.keras.optimizers.SGD(learning_rate=0.001, momentum=0.9)
models.compile(optimizer=custom_optimizer, loss='mean_squared_error')
models.fit(params, z, batch_size=100, epochs=10000, verbose=1, validation_split=0.2, callbacks=[callback])
prediction = models.predict(params)
plot.scatter(z, prediction)
ts = numpy.linspace(0,1,1000)
plot.plot(ts, ts, color = 'k')
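To make the second bullet above concrete, here is a minimal residual check that continues from the MWE (it reuses net, z, and prediction; residuals is a name I introduce here):

# Map the prediction error back onto the input grid to see whether the middle
# of the surface is fit at the expense of the edges (continues from the code above)
residuals = prediction.flatten() - z
plot.figure()
plot.scatter(net[:, 0], net[:, 1], c=residuals, cmap='coolwarm')
plot.colorbar(label='prediction - target (rescaled)')
plot.xlabel('x')
plot.ylabel('y')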
Note
This MWE tends to overfit slightly. My actual data set does not run into that problem (yet).
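If the slight overfitting does become a problem, one tweak I'm considering (a sketch only; restore_best_weights is a standard Keras EarlyStopping option) is to keep the best validation weights when training stops:

# Sketch: restore the weights from the best validation-loss epoch instead of the last one
callback = EarlyStopping(monitor='val_loss', min_delta=0, patience=5,
                         restore_best_weights=True)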