Step 1: Evaluate Your Model Locally

The first step is to evaluate your implementation locally. To do this, create a sotabench.py file in the root of your repository. This file can contain whatever logic you need to load and process the dataset and to produce model predictions for it. The server will run it each time you commit to the master branch.
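
For example, a minimal sotabench.py might be structured like the sketch below. This is illustrative only and assumes a PyTorch/torchvision setup; the pretrained torchvision model simply stands in for your own implementation.

```python
# sotabench.py -- illustrative skeleton only; the file can contain any logic you need.
import torchvision.models as models

# 1. Load (or build) the model you want to benchmark.
#    A pretrained torchvision model is used here purely as a placeholder.
model = models.resnet50(pretrained=True).eval()

# 2. Load and preprocess the benchmark dataset (your own code goes here).

# 3. Produce predictions and record them with an evaluation library that
#    integrates with sotabench -- see the libraries listed below.
```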

You will also need to record your results for the server, and for this you'll need an evaluation library that integrates with sotabench. Some example libraries (with documentation) are:


sotabencheval: a general, framework-agnostic library for evaluating model predictions on selected benchmarks. It aims for maximum flexibility and is useful for complicated task pipelines, such as object detection. Docs here.

torchbench: a PyTorch evaluation library for selected deep learning benchmarks. It is useful for PyTorch users evaluating on well-standardised tasks, such as image classification. Docs here.

Or you can Add Your Own Benchmarking Library. A complete example using one of these libraries is sketched below.
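
As an illustration, a complete sotabench.py built on torchbench might look roughly like this for ImageNet image classification. The class and argument names follow the torchbench documentation, but treat them as assumptions and check the docs for your installed version; the model and arXiv ID are placeholders.

```python
# sotabench.py -- hedged sketch using torchbench's ImageNet benchmark.
# Names and arguments follow the torchbench docs; verify against your version.
from torchbench.image_classification import ImageNet
import torchvision.models as models
import torchvision.transforms as transforms

# Standard ImageNet preprocessing.
input_transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# Evaluate the model and record the results so the sotabench server can pick them up.
ImageNet.benchmark(
    model=models.resnet50(pretrained=True),
    paper_model_name='ResNet-50',   # placeholder model name
    paper_arxiv_id='1512.03385',    # placeholder paper ID
    input_transform=input_transform,
    batch_size=256,
    num_gpu=1,
)
```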


Step 2: Connect Your GitHub Repository and We'll Run the Benchmark

Once you've successfully benchmarked your model locally, connect your GitHub repository. We'll run the benchmark for free on GPU machines and officially record the results.

The build environment is set up based on your requirements.txt file, so please add this file to your repository to capture your dependencies.
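
For example, a requirements.txt might look like the following (the package list is illustrative; include and pin whatever your sotabench.py actually imports):

```
torch
torchvision
torchbench
```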

We'll continue re-running the benchmarks on every commit to the master branch to ensure the results stay up to date and that the model still works.

To connect your repository, click the button below to log in with GitHub and select the repository.