File:Transformer-XL.png


Transformer-XL.png (555 × 431 pixels, file size: 64 KB, MIME type: image/png)

Summary

Description
English: The Transformer-XL architecture is an approach to handling long-term dependencies in sequence data. It extends the standard Transformer model with a segment-level recurrence mechanism, which reuses hidden states cached from previous segments, and a relative positional encoding scheme. Together these let the model use context beyond a single fixed-length segment. This is particularly useful for tasks such as language modeling, where context from earlier parts of the text can be crucial for accurate predictions.
Date 2022
File source https://arxiv.org/pdf/2207.06881.pdf
Author Aydar Bulatov, Yuri Kuratov, Mikhail S. Burtsev
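
The recurrence mechanism mentioned in the description can be illustrated with a minimal sketch. This is a toy under stated assumptions, not the implementation from the source paper: it uses a single attention layer with no learned weights, omits the relative positional encoding entirely, and the names `attend_causal`, `process_with_memory`, `seg_len`, and `mem_len` are hypothetical. It only shows the core idea that each segment attends over cached states from earlier segments in addition to its own positions.

```python
import math


def attend_causal(segment, memory):
    """Toy single-head dot-product attention with no learned weights.

    Each position sees the cached memory plus the current segment up to
    and including itself (causal masking inside the segment).
    """
    dim = len(segment[0])
    outputs = []
    for i, query in enumerate(segment):
        visible = memory + segment[: i + 1]
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in visible
        ]
        top = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        outputs.append([
            sum(w * vec[d] for w, vec in zip(weights, visible)) / total
            for d in range(dim)
        ])
    return outputs


def process_with_memory(tokens, seg_len, mem_len):
    """Process `tokens` segment by segment, carrying a bounded memory."""
    memory = []          # cached states from earlier segments
    outputs = []
    context_sizes = []   # how many states each segment could attend to
    for start in range(0, len(tokens), seg_len):
        segment = tokens[start:start + seg_len]
        context_sizes.append(len(memory) + len(segment))
        outputs.extend(attend_causal(segment, memory))
        # Assumption: in this one-layer toy the cached states are the
        # layer's inputs; only the most recent `mem_len` are kept.
        memory = (memory + segment)[-mem_len:] if mem_len > 0 else []
    return outputs, context_sizes


tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5], [0.0, 2.0], [2.0, 0.0]]
outputs, context_sizes = process_with_memory(tokens, seg_len=2, mem_len=2)
print(context_sizes)  # [2, 4, 4]
```

With `mem_len=0` every segment is processed in isolation, which corresponds to a model without recurrence; a larger `mem_len` widens the context available to each segment without lengthening the segment itself.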

Licensing

Some rights reserved
Permission is granted to copy, distribute and/or modify this document under the terms of the Creative Commons Attribution-ShareAlike 4.0 license (CC BY-SA 4.0).

File history


Date/Time: 03:32, 11 October 2023 (current)
Dimensions: 555 × 431 (64 KB)
User: AmirhosseinAbaskohi
Comment: Uploaded a work by Aydar Bulatov, Yuri Kuratov, Mikhail S. Burtsev from https://arxiv.org/pdf/2207.06881.pdf with UploadWizard

The following page uses this file: