Build Results

                                                    EM                        F1
MODEL                                       CODE   PAPER  ε-REPR      CODE   PAPER  ε-REPR   PAPER
BERT large (whole word masking, cased)      86.3%  --                 92.8%  --
BERT large (whole word masking, uncased)    86.9%  --                 93.2%  --
DistilBERT                                  79.1%  77.7%              86.9%  85.8%

                         TEST PERPLEXITY
MODEL                    CODE    PAPER   ε-REPR   SPEED     PAPER
DistilGPT2               49.91   --               38425.0
GPT1                     90.61   --               29444.9
GPT-2 Large              22.61   22.05            4482.9
GPT-2 Medium             26.51   26.37            9316.7
GPT-2 Small              36.53   37.50            25200.1
Transformer-XL Large     18.19   18.30            2970.8
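
The CODE and PAPER perplexity values above can be compared directly. As a minimal sketch, the Python below computes the signed gap (code minus paper) for each row that lists both numbers. The names `results`, `gap`, `within_tolerance`, and `eps` are hypothetical, and the tolerance check is only an assumption about what an ε-style reproduction check might look like; it is not the definition used by this page.

```python
# Test perplexity copied from the table above as (code, paper).
# None marks rows where the PAPER column shows "--".
results = {
    "DistilGPT2": (49.91, None),
    "GPT1": (90.61, None),
    "GPT-2 Large": (22.61, 22.05),
    "GPT-2 Medium": (26.51, 26.37),
    "GPT-2 Small": (36.53, 37.50),
    "Transformer-XL Large": (18.19, 18.30),
}


def gap(code, paper):
    """Signed difference code - paper, or None when no paper value is listed."""
    if paper is None:
        return None
    return round(code - paper, 2)


def within_tolerance(code, paper, eps):
    """Hypothetical check: True if |code - paper| <= eps, None if not comparable.

    eps is an assumed parameter; the page does not state a tolerance.
    """
    g = gap(code, paper)
    if g is None:
        return None
    return abs(g) <= eps


gaps = {name: gap(code, paper) for name, (code, paper) in results.items()}

for name, g in gaps.items():
    label = "n/a" if g is None else f"{g:+.2f}"
    print(f"{name:<22} {label}")
```

Because lower perplexity is better, a negative gap means the reproduced run scored slightly better than the paper figure, and a positive gap means it scored slightly worse.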