Start & End Tokens in LSTM when making predictions

I see examples of LSTM sequence to sequence generation models which use start and end tokens for each sequence.

I would like to understand: when making predictions with such a model on an arbitrary sequence, is it required to include start and end tokens in the input?

Topic: lstm, tensorflow, rnn, nlp

Category: Data Science


It depends on what you use the LSTM for.

For sequence labeling or sequence classification, the special tokens are not necessary. However, there might be a slight benefit to informing the network of where a sentence begins and ends, especially if the initial LSTM state is fixed and learned.

For autoregressive sequence-to-sequence models, the special tokens are crucial. The beginning-of-sentence token tells the decoder to start decoding: it provides the very first input from which the first real token is predicted. The end-of-sentence token tells the decoding algorithm to stop generating tokens.
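The role of the two tokens in decoding can be sketched with a minimal greedy loop. The `decoder_step` interface and the token ids below are assumptions for illustration; real seq2seq code (e.g. in TensorFlow) follows the same pattern with an LSTM cell and an argmax over the vocabulary.

```python
BOS, EOS = 0, 1  # assumed ids for the beginning/end-of-sentence tokens

def greedy_decode(decoder_step, initial_state, max_len=50):
    """Generate token ids until EOS is produced or max_len is reached.

    decoder_step(token_id, state) -> (next_token_id, next_state) is a
    hypothetical stand-in for one LSTM decoder step followed by an
    argmax over the output vocabulary.
    """
    token, state = BOS, initial_state  # decoding always starts from BOS
    output = []
    for _ in range(max_len):
        token, state = decoder_step(token, state)
        if token == EOS:  # EOS is the signal to stop generating
            break
        output.append(token)
    return output

# Toy decoder: emits 5, 6, 7, then EOS, ignoring its input token.
def toy_step(token, state):
    sequence = [5, 6, 7, EOS]
    return sequence[state], state + 1

print(greedy_decode(toy_step, initial_state=0))  # [5, 6, 7]
```

Note that EOS never appears in the returned output; it only controls when the loop terminates, which is why a generated sequence can have any length up to `max_len`.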
