Question Answering on SQuAD2.0 dev

#   MODEL   REPOSITORY   F1      EM      PAPER   ε-REPRODUCES PAPER
1                        87.5%   84.3%

This benchmark evaluates models on the dev set of the SQuAD2.0 dataset, reporting F1 and exact match (EM) scores.
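The two reported metrics can be sketched as follows, assuming the standard SQuAD-style answer normalization (lowercasing, stripping punctuation and articles); the helper names here are illustrative, not the official evaluation script:

```python
import re
import string
from collections import Counter

def normalize(text):
    """SQuAD-style normalization: lowercase, drop punctuation and
    articles (a/an/the), and collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in set(string.punctuation))
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(prediction, gold):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(gold))

def f1_score(prediction, gold):
    """Token-level F1 between normalized prediction and gold answer."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    if not pred_tokens or not gold_tokens:
        # Unanswerable questions: both sides must be empty to score.
        return float(pred_tokens == gold_tokens)
    common = Counter(pred_tokens) & Counter(gold_tokens)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

On the full dev set, each question's score is the maximum over its gold answers, and the leaderboard numbers above are the averages of those per-question scores.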

Step 1: Evaluate models locally

First, use one of the public benchmark libraries to evaluate your model.

Once you can run the benchmark locally, you are ready to connect it to our automatic service.
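Conceptually, a local run loops over the dev set, produces one prediction per question, and averages per-example scores, taking the best score over the multiple gold answers each SQuAD question provides. A minimal sketch of that loop; the `predict` function and the two-example dev-set slice are placeholders, not real data or a real model:

```python
def evaluate_locally(predict, dev_set):
    """Average exact-match (in %) over a SQuAD-style dev set.

    Each example carries a question, a context, and a list of acceptable
    gold answers; an empty list marks an unanswerable question. Per SQuAD
    convention, an example scores the max over its gold answers.
    """
    total = 0.0
    for ex in dev_set:
        pred = predict(ex["question"], ex["context"])
        golds = ex["answers"] or [""]  # unanswerable: correct answer is ""
        total += max(float(pred.strip().lower() == g.strip().lower())
                     for g in golds)
    return 100.0 * total / len(dev_set)

# Placeholder "model": always answers with the first word of the context.
def predict(question, context):
    return context.split()[0]

dev_set = [
    {"question": "Q1", "context": "Paris is the capital.", "answers": ["Paris"]},
    {"question": "Q2", "context": "Unknown text here.", "answers": []},
]
print(evaluate_locally(predict, dev_set))  # 50.0 with this placeholder model
```

Swap in your real model for `predict` and the actual SQuAD2.0 dev file for `dev_set`; once the numbers look sane locally, you are ready for the automatic service.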

Step 2: Login and connect your GitHub Repository

Connect your GitHub repository to start benchmarking it automatically. Once connected, we'll re-benchmark your master branch on every commit, giving your users confidence in the models in your repository and helping you spot bugs early.
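The "ε-REPRODUCES PAPER" column in the table above presumably flags runs whose scores match the paper-reported numbers within some tolerance ε. A check of that kind, run against each commit's fresh scores, could look like this; the tolerance value and the reported numbers are illustrative assumptions, not figures from this leaderboard:

```python
def reproduces_paper(measured, reported, epsilon=0.3):
    """True if every measured metric is within epsilon points of the
    paper-reported value for that metric."""
    return all(abs(measured[m] - reported[m]) <= epsilon for m in reported)

measured = {"F1": 87.5, "EM": 84.3}   # scores from the latest commit's run
reported = {"F1": 87.6, "EM": 84.1}   # illustrative paper-reported numbers
print(reproduces_paper(measured, reported))  # True with these numbers
```

A commit whose scores drift outside ε would lose the badge, which is exactly the kind of regression the per-commit re-benchmarking is meant to catch.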