Question Answering on SQuAD1.1 dev

| # | MODEL | REPOSITORY | EM | F1 | SPEED | PAPER | ε-REPRODUCES PAPER |
|---|-------|------------|----|----|-------|-------|--------------------|
| 1 | | | 88.1% | 94.0% | 28.8 | -- | |
| 2 | | | 86.9% | 93.2% | | -- | |
| 3 | | | 86.3% | 92.8% | | -- | |
| 4 | | | 79.1% | 86.9% | | | |
| 5 | | | 68.4% | 77.8% | 205.6 | | |
Models on Papers with Code for which code has not been tried out yet.

| MODEL | PAPER | EM | F1 |
|-------|-------|----|----|

This benchmark evaluates models on the dev set of the SQuAD1.1 dataset.

Step 1: Evaluate models locally

First, use one of the public benchmark libraries to evaluate your model.

Once you can run the benchmark locally, you are ready to connect it to our automatic service.
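As a sanity check before running a full benchmark library, you can compute the two leaderboard metrics yourself. The sketch below is a minimal reimplementation of the standard SQuAD metrics (Exact Match and token-level F1 over normalized answers); the prediction/gold pairs are hypothetical example data, not drawn from the leaderboard above.

```python
# Minimal sketch of the standard SQuAD metrics: Exact Match (EM) and
# token-level F1, computed over normalized answer strings.
import re
import string
from collections import Counter

def normalize_answer(s):
    """Lowercase, strip punctuation and articles, collapse whitespace."""
    s = s.lower()
    s = "".join(ch for ch in s if ch not in set(string.punctuation))
    s = re.sub(r"\b(a|an|the)\b", " ", s)
    return " ".join(s.split())

def exact_match(prediction, ground_truth):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(ground_truth))

def f1_score(prediction, ground_truth):
    """Harmonic mean of token precision and recall on normalized answers."""
    pred_tokens = normalize_answer(prediction).split()
    gold_tokens = normalize_answer(ground_truth).split()
    common = Counter(pred_tokens) & Counter(gold_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)

# Hypothetical predictions vs. gold answers:
pairs = [("Denver Broncos", "Denver Broncos"), ("the Broncos", "Denver Broncos")]
em = 100.0 * sum(exact_match(p, g) for p, g in pairs) / len(pairs)
f1 = 100.0 * sum(f1_score(p, g) for p, g in pairs) / len(pairs)
print(f"EM: {em:.1f}%  F1: {f1:.1f}%")  # → EM: 50.0%  F1: 83.3%
```

If your locally computed EM/F1 on the full dev set roughly match the numbers your benchmark library reports, the evaluation plumbing is working.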

Step 2: Login and connect your GitHub Repository

Connect your GitHub repository to start benchmarking it automatically. Once connected, we'll re-benchmark your master branch on every commit, giving users confidence in the models in your repository and helping you spot bugs.