File:RMT Peformance.png
Original file (1,351 × 381 pixels, file size: 101 KB, MIME type: image/png)
Summary
Description | English: This figure compares the performance of three models (Baseline, Transformer-XL, and RMT) on copy, reverse, and associative retrieval tasks. In the single-segment setting, all models perform well, since the entire sequence is accessible without recurrence. As the number of segments increases, however, the non-recurrent Baseline model fails to solve the tasks, while both memory models, Transformer-XL and RMT, are able to retain crucial information from previous segments in memory. Notably, RMT surpasses Transformer-XL as the number of segments grows, as shown in the panels reporting per-character accuracy on each task. The tasks use different source/target sequence lengths, and the memory/cache size equals the segment length for both models. In this experiment, RMT does not pass gradients between segments, which distinguishes its results from the Baseline model's. |
Date | 2022 |
File source | https://arxiv.org/pdf/2207.06881.pdf |
Author | Aydar Bulatov, Yuri Kuratov, Mikhail S. Burtsev |
Licensing
File history
Click on a date/time to view the file as it appeared at that time.
| Date/Time | Thumbnail | Dimensions | User | Comment |
|---|---|---|---|---|
| current: 04:16, 11 October 2023 | | 1,351 × 381 (101 KB) | AmirhosseinAbaskohi (talk \| contribs) | Uploaded a work by Aydar Bulatov, Yuri Kuratov, Mikhail S. Burtsev from https://arxiv.org/pdf/2207.06881.pdf with UploadWizard |
File usage
The following page uses this file: