File:Overall pre-training and fine-tuning procedures for BERT. Apart from output layers, the same architectures are used in both pre-training and fine-tuning.png

From UBC Wiki

Original file (1,198 × 494 pixels, file size: 124 KB, MIME type: image/png)

Summary

Description
English: Overall pre-training and fine-tuning procedures for BERT. Apart from output layers, the same architectures are used in both pre-training and fine-tuning. The same pre-trained model parameters are used to initialize models for different downstream tasks. During fine-tuning, all parameters are fine-tuned. [CLS] is a special symbol added in front of every input example, and [SEP] is a special separator token (e.g. separating questions/answers).
Date: 14 March 2023
File source: BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Author: Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova
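The procedure in the figure can be illustrated with a short, hypothetical Python sketch (not part of this file or the paper). It assumes the Hugging Face transformers library and the bert-base-uncased checkpoint, and shows the [CLS]/[SEP] input format for a sentence pair and how the same pre-trained parameters initialize a downstream model whose output layer is new.

    # Illustrative sketch only; library and checkpoint names are assumptions.
    from transformers import BertTokenizer, BertForSequenceClassification

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

    # A sentence pair is rendered as: [CLS] sentence A [SEP] sentence B [SEP]
    encoded = tokenizer("Who proposed BERT?", "Devlin et al., 2019.",
                        return_tensors="pt")
    print(tokenizer.convert_ids_to_tokens(encoded["input_ids"][0]))
    # e.g. ['[CLS]', 'who', ..., '[SEP]', ..., '[SEP]']

    # Fine-tuning: the pre-trained parameters initialize the encoder; only the
    # task-specific output layer (here a 2-way classifier) is newly added,
    # and all parameters are updated during fine-tuning.
    model = BertForSequenceClassification.from_pretrained("bert-base-uncased",
                                                          num_labels=2)
    outputs = model(**encoded)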

Licensing

Some rights reserved
Permission is granted to copy, distribute and/or modify this document under the terms of the Creative Commons Attribution-ShareAlike 4.0 license. The full text of this license may be found here: CC BY-SA 4.0

File history


Date/Time: 18:41, 17 March 2023 (current)
Dimensions: 1,198 × 494 (124 KB)
User: MEHARBHATIA (talk | contribs)
Comment: Uploaded a work by Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova from BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding with UploadWizard

The following page uses this file: