Stride in time series classification/regression using neural networks
When feeding time series to a neural network, the input is usually cut into sliding windows, each defined by a window size and a stride. Is it advantageous to train such a network with a smaller stride than the one used during inference, e.g. stride 1 for training but stride 25 for inference? Since the network then sees many more (overlapping) windows during training, I would expect it to be more robust than if I trained it with stride 25 and therefore far fewer windows. Is that correct? In addition, inference would need less computation than training, because a larger stride means fewer windows to predict on.
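To make the setup concrete, here is a minimal sketch of the windowing I have in mind. The helper `make_windows`, the series length of 1000, and the window size of 100 are illustrative assumptions, not part of any particular library or of my actual data.

```python
import numpy as np

def make_windows(series, window_size, stride):
    """Cut a 1-D series into windows of length `window_size`,
    moving the start index forward by `stride` each time."""
    n_windows = (len(series) - window_size) // stride + 1
    if n_windows <= 0:
        # Series is shorter than one window: nothing to return.
        return np.empty((0, window_size), dtype=series.dtype)
    starts = np.arange(n_windows) * stride
    return np.stack([series[s:s + window_size] for s in starts])

# Hypothetical toy series standing in for real sensor data.
series = np.arange(1000, dtype=np.float32)

# Dense, heavily overlapping windows for training.
train_windows = make_windows(series, window_size=100, stride=1)

# Sparse windows for inference.
infer_windows = make_windows(series, window_size=100, stride=25)

print(train_windows.shape)  # (901, 100)
print(infer_windows.shape)  # (37, 100)
```

With these numbers, stride 1 yields 901 training windows and stride 25 yields 37 inference windows. Every stride-25 window is also one of the stride-1 windows (every 25th one), so under this scheme the network has already seen each inference alignment during training.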
Topic deep-learning neural-network time-series
Category Data Science