Dynamic Evaluation of Transformer Language Models

Ben Krause, Emmanuel Kahembwe, Iain Murray, Steve Renals


This research note combines two methods that have recently improved the state of the art in language modeling: Transformers and dynamic evaluation. Transformers use stacked layers of self-attention that allow them to capture long-range dependencies in sequential data...
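To make the self-attention idea concrete, here is a minimal sketch of a single causal attention layer in plain Python. It is an illustration only, not the authors' implementation: the function names `softmax` and `self_attention` are hypothetical, and the learned query, key and value projections of a real Transformer are replaced by the identity to keep the example short.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Single-head scaled dot-product self-attention (illustrative sketch).

    x is a list of token vectors of equal length. Assumption: queries,
    keys and values are the inputs themselves (identity projections);
    a real Transformer learns separate projection matrices.
    Returns (outputs, attention_weights).
    """
    d = len(x[0])
    outputs, weights = [], []
    for i, query in enumerate(x):
        # Causal mask: position i may only attend to positions 0..i,
        # as required for left-to-right language modeling.
        scores = [
            sum(a * b for a, b in zip(query, x[j])) / math.sqrt(d)
            for j in range(i + 1)
        ]
        w = softmax(scores)
        weights.append(w)
        # Each output is a weighted average of the visible token vectors.
        outputs.append(
            [sum(w[j] * x[j][k] for j in range(i + 1)) for k in range(d)]
        )
    return outputs, weights
```

Because every position attends directly to all earlier positions, the path between two distant tokens is a single weighted sum, which is one way to read the claim about long-range dependencies. Stacking several such layers, each with its own learned projections, gives the layered structure the abstract refers to.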

Benchmarked Models

Rank | Model | Repo | Code result | Paper result | ε-reproduced | Build
1 | Transformer-XL (RMS dynamic eval) | | 16.44 | 16.40 | |