Scaling Neural Machine Translation

Myle Ott, Sergey Edunov, David Grangier, Michael Auli

Sequence-to-sequence learning models still require several days to reach state-of-the-art performance on large benchmark datasets using a single machine. This paper shows that reduced precision and large batch training can speed up training by nearly 5x on a single 8-GPU machine with careful tuning and implementation...
