File:Overall pre-training and fine-tuning procedures for BERT. Apart from output layers, the same architectures are used in both pre-training and fine-tuning.png
Original file (1,198 × 494 pixels, file size: 124 KB, MIME type: image/png)
Summary
Description (English): Overall pre-training and fine-tuning procedures for BERT. Apart from output layers, the same architectures are used in both pre-training and fine-tuning. The same pre-trained model parameters are used to initialize models for different downstream tasks. During fine-tuning, all parameters are fine-tuned. [CLS] is a special symbol added in front of every input example, and [SEP] is a special separator token (e.g. separating questions/answers).
Date: 14 March 2023
Source: BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Author: Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova
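The input format named in the description can be sketched in a few lines of Python. This is a minimal illustration, not part of the file page or the paper's code: the helper name `build_bert_input` is hypothetical, and real BERT applies WordPiece subword tokenization, which is replaced here by whitespace splitting for brevity.

```python
def build_bert_input(text_a, text_b=None):
    """Assemble a BERT-style token sequence.

    [CLS] is prepended to every example; [SEP] closes each segment,
    so a sentence pair becomes: [CLS] A [SEP] B [SEP].
    NOTE: real BERT uses WordPiece tokenization; whitespace splitting
    here is a simplification for illustration only.
    """
    tokens = ["[CLS]"] + text_a.split() + ["[SEP]"]
    # Segment IDs tell the model which tokens belong to which sentence.
    segment_ids = [0] * len(tokens)
    if text_b is not None:
        b_tokens = text_b.split() + ["[SEP]"]
        tokens += b_tokens
        segment_ids += [1] * len(b_tokens)
    return tokens, segment_ids

# Example: a question/answer pair, as in the caption.
tokens, segments = build_bert_input("who wrote BERT ?", "Devlin et al .")
print(tokens)    # ['[CLS]', 'who', 'wrote', 'BERT', '?', '[SEP]', 'Devlin', 'et', 'al', '.', '[SEP]']
print(segments)  # [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
```

In practice, libraries such as Hugging Face's `transformers` perform this assembly (including the subword splitting) automatically when a tokenizer is called on a text pair.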
Licensing
File history
Date/Time | Dimensions | User | Comment
---|---|---|---
18:41, 17 March 2023 (current) | 1,198 × 494 pixels (124 KB) | MEHARBHATIA | Uploaded a work by Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova from BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding with UploadWizard