Small Language Models Are Also Few-Shot Learners

This video explains the latest work in Pattern-Exploiting Training. This paper shows that this distillation scheme from knowledge captured in pre-t...