Fast and Simple Mixture of Softmaxes with BPE and Hybrid-LightRNN for Language Generation

Xiang Kong, Qizhe Xie, Zihang Dai, Eduard Hovy


Mixture of Softmaxes (MoS) has been shown to be effective at addressing the expressiveness limitation of Softmax-based models. Despite this known advantage, the practical use of MoS is limited by its large memory and computational cost, which stems from the need to compute multiple Softmaxes...
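To illustrate the cost the abstract refers to, here is a minimal NumPy sketch of a Mixture of Softmaxes output layer: K mixture components each compute a full Softmax over the vocabulary, and the final distribution is their weighted sum. All names and shapes below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def softmax(z, axis=-1):
    # Numerically stable softmax.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def mixture_of_softmaxes(h, W_pi, W_ks, W_out):
    """p = sum_k pi_k * softmax(W_out @ tanh(W_k @ h)).

    h     : (d,)   context vector
    W_pi  : (K, d) mixture-weight projection
    W_ks  : list of K (d, d) per-component projections (hypothetical shapes)
    W_out : (V, d) shared output embedding
    """
    pi = softmax(W_pi @ h)  # (K,) mixture weights
    # The expensive part: K separate Softmaxes over the full vocabulary V.
    probs = np.stack([softmax(W_out @ np.tanh(W_k @ h)) for W_k in W_ks])  # (K, V)
    return pi @ probs  # (V,) weighted average is still a valid distribution

rng = np.random.default_rng(0)
d, V, K = 8, 20, 3
h = rng.standard_normal(d)
W_pi = rng.standard_normal((K, d))
W_ks = [rng.standard_normal((d, d)) for _ in range(K)]
W_out = rng.standard_normal((V, d))
p = mixture_of_softmaxes(h, W_pi, W_ks, W_out)
```

Note that both memory and time scale with K times the vocabulary size V, which is exactly the bottleneck the paper targets.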
