Pay Less Attention with Lightweight and Dynamic Convolutions

Felix Wu, Angela Fan, Alexei Baevski, Yann N. Dauphin, Michael Auli

Abstract

Self-attention is a useful mechanism to build generative models for language and images. It determines the importance of context elements by comparing each element to the current time step.
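The paper's alternative to self-attention is a lightweight convolution: a depthwise convolution whose kernel weights are softmax-normalized over the kernel width and shared across groups of channels ("heads"). Below is a minimal NumPy sketch of that idea; the function name, shapes, and causal left-padding are illustrative assumptions, not the fairseq implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def lightweight_conv(x, w, num_heads):
    """Sketch of a lightweight convolution.

    x: (T, C) input sequence of length T with C channels.
    w: (num_heads, K) raw kernel weights, one K-tap kernel per head,
       shared by the C // num_heads channels in that head.
    Returns a (T, C) output using causal (left) zero padding.
    """
    T, C = x.shape
    H, K = w.shape
    assert C % H == 0, "channels must divide evenly into heads"
    w = softmax(w, axis=-1)                      # normalize each kernel over its K taps
    pad = np.vstack([np.zeros((K - 1, C)), x])   # causal left padding
    out = np.zeros_like(x)
    ch_per_head = C // H
    for t in range(T):
        window = pad[t:t + K]                    # (K, C) context ending at time t
        for h in range(H):
            lo, hi = h * ch_per_head, (h + 1) * ch_per_head
            # one shared kernel mixes the K time steps for this head's channels
            out[t, lo:hi] = w[h] @ window[:, lo:hi]
    return out
```

Unlike self-attention, the weight placed on each context element here depends only on its position within the kernel window, not on a content comparison; a dynamic convolution (also proposed in the paper) would instead predict `w` from the input at each time step.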
