ALBERT: A Lite BERT for Self-supervised Learning of Language Representations

Zhenzhong Lan, Mingda Chen, Sebastian Goodman, Kevin Gimpel, Piyush Sharma, Radu Soricut

Abstract

Increasing model size when pretraining natural language representations often results in improved performance on downstream tasks. However, at some point further model increases become harder due to GPU/TPU memory limitations, longer training times, and unexpected model degradation. To address these problems, we present two parameter-reduction techniques to lower memory consumption and increase the training speed of BERT. Comprehensive empirical evidence shows that our proposed methods lead to models that scale much better compared to the original BERT. We also use a self-supervised loss that focuses on modeling inter-sentence coherence, and show it consistently helps downstream tasks with multi-sentence inputs. As a result, our best model establishes new state-of-the-art results on the GLUE, RACE, and SQuAD benchmarks while having fewer parameters compared to BERT-large.
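The first of the two parameter-reduction techniques described in the paper is factorized embedding parameterization: the large vocabulary embedding matrix is decomposed into two smaller matrices, so the embedding size is untied from the hidden size. Below is a minimal PyTorch sketch of that idea; the class name and the sizes V, E, and H are hypothetical choices for illustration, not taken from the paper's released code.

```python
import torch
import torch.nn as nn

# Hypothetical sizes for illustration: vocab V, small embedding dim E,
# hidden dim H. Factorizing the V x H embedding into V x E and E x H
# cuts parameters sharply when E << H.
V, E, H = 30000, 128, 768

class FactorizedEmbedding(nn.Module):
    """Sketch of ALBERT-style factorized embedding parameterization."""

    def __init__(self, vocab_size: int, embed_dim: int, hidden_dim: int):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, embed_dim)  # V x E table
        self.proj = nn.Linear(embed_dim, hidden_dim)         # E x H projection

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        # Look up E-dim embeddings, then project up to the H-dim hidden space.
        return self.proj(self.word_emb(token_ids))

# V*E + E*H = 30000*128 + 128*768 ≈ 3.9M parameters, versus
# V*H = 30000*768 ≈ 23M for an unfactorized V x H embedding.
emb = FactorizedEmbedding(V, E, H)
print(sum(p.numel() for p in emb.parameters()))
```

The second technique, cross-layer parameter sharing, reuses the same transformer layer weights at every depth, so model depth no longer multiplies the parameter count.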
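The inter-sentence coherence loss mentioned in the abstract is sentence-order prediction (SOP): positive examples are two consecutive segments from the same document, and negatives are the same two segments with their order swapped. A small sketch of how such training pairs could be built, assuming a hypothetical `segments` list of pre-split document text:

```python
import random

def make_sop_example(segments: list[str], i: int) -> tuple[str, str, int]:
    """Build one SOP training pair from consecutive segments i and i+1.

    Label 1 = segments in original order (positive),
    label 0 = same segments with order swapped (negative).
    `segments` is a hypothetical list of text segments from one document.
    """
    a, b = segments[i], segments[i + 1]
    if random.random() < 0.5:
        return a, b, 1  # consecutive segments, original order
    return b, a, 0      # same segments, order reversed

doc = ["The cat sat.", "It purred.", "Then it slept."]
print(make_sop_example(doc, 0))
```

Because both halves of every pair come from the same document, the model cannot solve the task by topic cues alone and must model discourse order, which is what makes SOP a harder signal than BERT's next-sentence prediction.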
