This video explains the new Replaced Token Detection (RTD) pre-training objective introduced in ELECTRA. ELECTRA is much more compute-efficient because its loss is defined over every token in the input sequence rather than only the masked subset, and because it avoids introducing the artificial [MASK] token into the self-supervised task. ELECTRA-Small, trained on 1 GPU for 4 days, outperforms GPT, which was trained with 30x more compute. ELECTRA-Large is on par with RoBERTa and XLNet using less than 1/4 of their compute and surpasses those models when given the same amount of compute! Thanks for watching! Please Subscribe!
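To make the objective concrete, here is a minimal sketch of how an RTD training step could look in PyTorch. The tiny encoder, the masking rate, and all names here are illustrative stand-ins, not the paper's actual models or configuration; the point is simply that the generator is trained with a masked-LM loss on the masked positions, while the discriminator's loss is computed over every position in the sequence.

```python
# Minimal sketch of the Replaced Token Detection (RTD) objective.
# Toy Transformer encoders stand in for ELECTRA's generator and
# discriminator; sizes and hyperparameters are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB_SIZE, HIDDEN, MASK_ID, MASK_PROB = 1000, 64, 0, 0.15

class TinyEncoder(nn.Module):
    """Stand-in encoder: embedding + one Transformer layer + output head."""
    def __init__(self, out_dim):
        super().__init__()
        self.emb = nn.Embedding(VOCAB_SIZE, HIDDEN)
        layer = nn.TransformerEncoderLayer(HIDDEN, nhead=4, batch_first=True)
        self.enc = nn.TransformerEncoder(layer, num_layers=1)
        self.head = nn.Linear(HIDDEN, out_dim)

    def forward(self, ids):
        return self.head(self.enc(self.emb(ids)))

generator = TinyEncoder(VOCAB_SIZE)   # small MLM that proposes replacement tokens
discriminator = TinyEncoder(1)        # predicts original vs. replaced for each token

def rtd_loss(ids):
    # 1) Mask a random subset of positions; the generator fills them in (MLM loss).
    masked = torch.rand(ids.shape) < MASK_PROB
    corrupted = ids.masked_fill(masked, MASK_ID)
    gen_logits = generator(corrupted)
    mlm_loss = F.cross_entropy(gen_logits[masked], ids[masked])

    # 2) Sample the generator's predictions (no gradient flows through sampling).
    with torch.no_grad():
        samples = torch.distributions.Categorical(logits=gen_logits).sample()
    replaced_input = torch.where(masked, samples, ids)

    # 3) Discriminator labels: 1 where the token differs from the original.
    labels = (replaced_input != ids).float()
    disc_logits = discriminator(replaced_input).squeeze(-1)
    # Key point: this loss covers *every* position, not just the ~15% that were
    # masked, which is a large part of ELECTRA's compute efficiency.
    disc_loss = F.binary_cross_entropy_with_logits(disc_logits, labels)
    return mlm_loss + 50.0 * disc_loss  # 50 is the discriminator weight used in the paper

loss = rtd_loss(torch.randint(1, VOCAB_SIZE, (2, 16)))
loss.backward()
```

Note that only the discriminator is kept for fine-tuning on downstream tasks; the generator exists solely to produce plausible replacements during pre-training.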