File:RMT Peformance.png

From UBC Wiki

Original file(1,351 × 381 pixels, file size: 101 KB, MIME type: image/png)

Summary

Description
English: This figure compares the performance of three models, Baseline, Transformer-XL, and RMT, on copy and reverse tasks. In the single-segment setting all models perform well, since the entire sequence is accessible without recurrence. As the number of segments increases, however, the non-recurrent Baseline model struggles to solve the tasks, while both memory models, Transformer-XL and RMT, can retain crucial information from previous segments in memory. Notably, RMT surpasses Transformer-XL as the number of segments grows, as shown in the panels reporting per-character accuracy on the copy, reverse, and associative retrieval tasks, each with its own source/target sequence lengths; memory/cache size equals the segment length for both models. Note that RMT does not pass gradients between segments in this experiment, yet it still achieves results distinct from those of the Baseline model.
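To make the evaluation concrete, here is a minimal sketch of the copy and reverse tasks and the per-character accuracy metric reported in the figure's panels. The function and variable names are illustrative only; they are not taken from the paper's code.

```python
# Sketch of the copy/reverse tasks and per-character accuracy.
# All names here are hypothetical, not from the paper's implementation.

def per_char_accuracy(pred: str, target: str) -> float:
    """Fraction of positions where the prediction matches the target."""
    assert len(pred) == len(target), "sequences must be the same length"
    matches = sum(p == t for p, t in zip(pred, target))
    return matches / len(target)

src = "abcdef"

# Copy task: the target is the source sequence itself.
copy_target = src

# Reverse task: the target is the source sequence reversed.
reverse_target = src[::-1]

# A hypothetical model output with one wrong character.
pred = "abcdeX"
print(per_char_accuracy(pred, copy_target))  # 5 of 6 characters correct
```

In the figure, each panel plots this metric as the source sequence is split into more segments, which is where the memory models' ability to carry information across segment boundaries becomes decisive.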
Date 2022
File source https://arxiv.org/pdf/2207.06881.pdf
Author Aydar Bulatov, Yuri Kuratov, Mikhail S. Burtsev

Licensing

Some rights reserved
Permission is granted to copy, distribute and/or modify this document under the terms of the Creative Commons Attribution-ShareAlike 4.0 License. The full text of this license may be found here: CC BY-SA 4.0
Attribution-ShareAlike

File history

Click on a date/time to view the file as it appeared at that time.

Date/Time: 04:16, 11 October 2023 (current version)
Dimensions: 1,351 × 381 (101 KB)
User: AmirhosseinAbaskohi (talk | contribs)
Comment: Uploaded a work by Aydar Bulatov, Yuri Kuratov, Mikhail S. Burtsev from https://arxiv.org/pdf/2207.06881.pdf with UploadWizard

The following page uses this file: