Should I pretrain my BERT model on a specific dataset if it has only one class of labels?

I want to use a BERT model for a sentence-similarity task. I know that BERT models for this task are typically trained with a natural language inference (NLI) setup on datasets with the labels neutral, entailment, and contradiction.
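To make the target task concrete, here is a minimal sketch of how sentence similarity is usually scored with a Sentence-BERT-style model via the sentence-transformers library. The checkpoint name and the example sentences are placeholders, not from my actual data.

```python
# Minimal sketch: scoring sentence similarity with a Sentence-BERT-style model.
# "all-MiniLM-L6-v2" is just an example checkpoint, not my domain model.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

sentences = [
    "The valve regulates upstream pressure.",       # hypothetical domain sentence
    "Upstream pressure is controlled by the valve.",
]

# Encode both sentences and compare them with cosine similarity.
embeddings = model.encode(sentences, convert_to_tensor=True)
similarity = util.cos_sim(embeddings[0], embeddings[1])
print(float(similarity))  # cosine similarity, roughly in [-1, 1]
```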

The data I want to apply BERT to for the sentence-similarity task contains very specific terms and jargon, so I would like to pretrain the model on it first. However, that data contains only entailment labels (about 20k rows). Is it a good idea to pretrain the model on such data? What would be the best way to handle this problem?
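For reference, the kind of domain pretraining I have in mind would not need the entailment labels at all: it would just continue masked language modeling on the raw in-domain sentences before any similarity fine-tuning. Below is a rough sketch using the Hugging Face transformers and datasets libraries; the file name "domain_sentences.txt", the base checkpoint, and the hyperparameters are placeholders I made up for illustration.

```python
# Minimal sketch: continued (domain-adaptive) pretraining of BERT with masked
# language modeling on raw in-domain text. Labels are not used at this stage.
from datasets import load_dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

# One in-domain sentence per line; "domain_sentences.txt" is a placeholder path.
dataset = load_dataset("text", data_files={"train": "domain_sentences.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])

# The collator randomly masks tokens on the fly for the MLM objective.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

args = TrainingArguments(
    output_dir="bert-domain-adapted",
    num_train_epochs=3,
    per_device_train_batch_size=16,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    data_collator=collator,
)
trainer.train()
trainer.save_model("bert-domain-adapted")
```

The adapted checkpoint could then be fine-tuned for similarity, which is where the question about having only entailment labels comes in.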

Thanks in advance

Topic: bert, transformer, deep-learning, nlp, machine-learning

Category: Data Science
