XLNet: Generalized Autoregressive Pretraining for Language Understanding

Abstract: With the capability of modeling bidirectional contexts, denoising autoencoding based pretraining like BERT achieves better performance than pretraining approaches based on autoregressive language modeling. ...
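
As a rough sketch of the objectives being contrasted (notation loosely follows the paper; the masking indicator m_t and the permutation set Z_T are written here for illustration), an autoregressive language model maximizes a fixed left-to-right factorization, a denoising autoencoder such as BERT reconstructs masked tokens from a corrupted input, and the generalized autoregressive objective suggested by the title averages the autoregressive likelihood over factorization orders:

\max_\theta \; \sum_{t=1}^{T} \log p_\theta(x_t \mid x_{<t}) \quad \text{(autoregressive LM)}

\max_\theta \; \sum_{t=1}^{T} m_t \, \log p_\theta(x_t \mid \hat{x}) \quad \text{(denoising autoencoding, e.g. BERT; } \hat{x} \text{ is the masked input, } m_t = 1 \text{ iff } x_t \text{ is masked)}

\max_\theta \; \mathbb{E}_{z \sim \mathcal{Z}_T} \Big[ \sum_{t=1}^{T} \log p_\theta(x_{z_t} \mid x_{z_{<t}}) \Big] \quad \text{(expected likelihood over factorization-order permutations } \mathcal{Z}_T \text{)}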