Poor neural net regression fit to data that exhibit clear structure
I've been trying to use a simple NN to model data I've generated. The data lack a closed-form expression but exhibit clear structure. The MWE below emulates similar data. Any NN I build, including the one in the MWE below, seems to simultaneously over- and under-predict the values (in my actual data, the fishtail in the predicted-vs-actual plot straddles the y=x line).
I've tried tuning the model's hyperparameters, adding inputs, adding training data, selectively pruning training data, and adding/removing layers and neurons. In every case, some form of this behavior persists.
I am new to machine learning, so it's quite possible there's something basic that I'm missing. I guess I'm wondering:
- Is there something obvious I'm doing wrong, either in my approach or in my understanding of where NNs are applicable?
- If not, is this caused by the structure of the data itself? I know local minima in the loss function can cause issues, and I'm worried that the model is overfitting the middle (yellow) section of the data at the expense of the sides, which end up poorly estimated (see the residual check after the fit code below). I don't know if there's an easy way to fix that.
I'd appreciate any suggestions/pointers. Thanks!
Sample training data
#!/usr/bin/env python
import numpy
import matplotlib
from matplotlib import pyplot as plot
import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras.callbacks import EarlyStopping
x = y = numpy.linspace(0,1,100)
net = numpy.array(numpy.meshgrid(x,y))
net = net.T.reshape(-1, 2)
z = []
# Generate some data with a bit of noise
for point in net:
    z.append(-(point[1] - point[0])**2 + 0.1*numpy.random.uniform(low=-0.1, high=0.1))
z = numpy.array(z)
plot.scatter(net[:,0], net[:,1], c = z)
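For reference, the same data can be generated in one vectorized line (z_alt is just a stand-in name; the noise term works out to roughly ±0.01):

# One-line equivalent of the loop above; the noise is 0.1 * U(-0.1, 0.1), i.e. about ±0.01
z_alt = -(net[:, 1] - net[:, 0])**2 + 0.1*numpy.random.uniform(low=-0.1, high=0.1, size=len(net))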
Simple neural net fit
# Rescale data
zmin = min(z)
zmax = max(z)
z = (z - zmin)/(zmax - zmin)
params = net
input_size = 2
output_size = 1
# Define a simple NN with 1 hidden layer
norm = layers.Normalization(axis=-1)
norm.adapt(numpy.array(params))
models = tf.keras.Sequential([
    norm,
    layers.Dense(10000, input_dim=input_size, activation='relu'),
    layers.Dense(output_size, activation='linear')])
# Add condition to quit training, compile, and fit model
callback = EarlyStopping(monitor='val_loss', min_delta=0, patience=5)
custom_optimizer = tf.keras.optimizers.SGD(learning_rate=0.001, momentum=0.9)
models.compile(optimizer=custom_optimizer, loss='mean_squared_error')
models.fit(params, z, batch_size=100, epochs=10000, verbose=1, validation_split=0.2, callbacks=[callback])
prediction = models.predict(params)
plot.scatter(z, prediction)
ts = numpy.linspace(0,1,1000)
plot.plot(ts, ts, color = 'k')
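To make the second bullet above concrete, here is a minimal residual check that continues from the MWE (it reuses net, z, and prediction; residuals is a name I introduce here):

# Map the prediction error back onto the input grid to see whether the middle
# of the surface is fit at the expense of the edges (continues from the code above)
residuals = prediction.flatten() - z
plot.figure()
plot.scatter(net[:, 0], net[:, 1], c=residuals, cmap='coolwarm')
plot.colorbar(label='prediction - target (rescaled)')
plot.xlabel('x')
plot.ylabel('y')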
Note
This MWE tends to overfit slightly. My actual data set does not run into that problem (yet).
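If the slight overfitting does become a problem, one tweak I'm considering (a sketch only; restore_best_weights is a standard Keras EarlyStopping option) is to keep the best validation weights when training stops:

# Sketch: restore the weights from the best validation-loss epoch instead of the last one
callback = EarlyStopping(monitor='val_loss', min_delta=0, patience=5,
                         restore_best_weights=True)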