Training a stateful LSTM with differing numbers of sequences

I'm using a stateful LSTM for stock market analysis, and I have varying amounts of data for each stock, ranging from 20 years down to just a few weeks (e.g. for newly listed stocks).

I use 3 years of data as a minimum for training, as I want the network to build up some state. I set one year as my sequence length, so if I have 12 years of data I submit 4 batches of 3 sequences each. Only after I've submitted all of a stock's batches do I reset the network state, ready for the next stock.
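For reference, here is a minimal sketch of that loop, assuming the tf.keras (TF 2.x) API and daily closing prices. The unit count, SEQ_LEN = 252, and the random toy data are illustrative assumptions, not part of my actual setup:

```python
import numpy as np
from tensorflow import keras

SEQ_LEN = 252        # roughly one trading year of daily steps (assumed)
SEQS_PER_BATCH = 3   # sequences submitted per batch, as described above
N_FEATURES = 1       # e.g. closing price only

model = keras.Sequential([
    # stateful=True carries the cell state across successive batches,
    # which is why the batch size must be fixed up front.
    keras.layers.LSTM(32, stateful=True,
                      batch_input_shape=(SEQS_PER_BATCH, SEQ_LEN, N_FEATURES)),
    keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")

def train_on_stock(prices, targets):
    """Feed one stock's history through the network a year at a time,
    resetting state only after all of its batches are done."""
    n_seqs = len(prices) // SEQ_LEN              # whole years available
    n_batches = n_seqs // SEQS_PER_BATCH         # e.g. 12 years -> 4 batches of 3
    used = n_batches * SEQS_PER_BATCH
    # In a stateful model, row i of batch b+1 continues row i of batch b,
    # so split the history into SEQS_PER_BATCH contiguous streams and
    # advance them in parallel, one year per batch.
    x = prices[:used * SEQ_LEN].reshape(SEQS_PER_BATCH, n_batches, SEQ_LEN, N_FEATURES)
    y = targets[:used].reshape(SEQS_PER_BATCH, n_batches, 1)
    for b in range(n_batches):
        model.train_on_batch(x[:, b], y[:, b])   # state persists between batches
    model.reset_states()                         # clean slate for the next stock

# Toy usage: random data standing in for 12 years of one stock's history.
prices = np.random.rand(12 * SEQ_LEN).astype("float32")
targets = np.random.rand(12).astype("float32")
train_on_stock(prices, targets)
```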

But is there any issue with training on differing numbers of sequences? E.g. if I train on a company with 20 years of data, the network will build up much more state than for a company where I only have 3 years of data.

Topic: lstm, rnn, neural-network, machine-learning

Category: Data Science


In the context of stock market prediction, I think that once the sequence length reaches a certain point, the network will learn to forget the opening price from 6 months ago or the volume from 3 years back. That data is no longer relevant to the cell state, as recent events are more indicative of stock price changes.
Because the LSTM stores both long- and short-term memory, it has the machinery to "forget" old and irrelevant data from earlier in the sequence, maintaining a stable and efficient cell state regardless of sequence length. (Of course, the sequence should be long enough for the LSTM to build up any cell state at all.)
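Concretely, that "forgetting" is the forget gate multiplicatively scaling down the previous cell state. The standard LSTM cell update is:

$$
\begin{aligned}
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) \\
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t
\end{aligned}
$$

Whenever a unit's forget gate $f_t$ is near 0, whatever that unit was carrying (say, information from years back) gets multiplied away, so the cell state does not simply accumulate with sequence length. That's why training on 20 years versus 3 years should not, by itself, leave the network with pathologically "more" state.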
Hope this helps.
