Universal Transformers

Mostafa Dehghani, Stephan Gouws, Oriol Vinyals, Jakob Uszkoreit, Łukasz Kaiser


Recurrent neural networks (RNNs) sequentially process data by updating their state with each new data point, and have long been the de facto choice for sequence modeling tasks. However, their inherently sequential computation makes them slow to train...
