File:RWKV architecture for language modelling.png

From UBC Wiki

RWKV_architecture_for_language_modelling.png (553 × 574 pixels, file size: 86 KB, MIME type: image/png)

Summary

Description
English: The Receptance Weighted Key Value (RWKV) architecture combines the efficient, parallelizable training of Transformers with the inference efficiency typically associated with Recurrent Neural Networks (RNNs). It relies on a linear attention mechanism, which allows the same model to be formulated as either a Transformer or an RNN. The Transformer formulation lets computations be parallelized during training, while the RNN formulation is used for efficient inference.
Date 2023
File source https://arxiv.org/abs/2305.13048
Author Bo Peng, Eric Alcaide, Quentin Anthony, Alon Albalak, Samuel Arcadinho, Huanqi Cao, Xin Cheng, Michael Chung, Matteo Grella, Kranthi Kiran GV, Xuzheng He, Haowen Hou, Przemyslaw Kazienko, Jan Kocon, Jiaming Kong, Bartlomiej Koptyra, Hayden Lau, Krishna Sri Ipsit Mantri, Ferdinand Mom, Atsushi Saito, Xiangru Tang, Bolun Wang, Johan S. Wind, Stanislaw Wozniak, Ruichong Zhang, Zhenyuan Zhang, Qihang Zhao, Peng Zhou, Jian Zhu, Rui-Jie Zhu
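
The dual formulation mentioned in the description can be illustrated with a minimal sketch. It assumes a simplified, per-channel exponentially decayed weighted average of values and is not the exact operator from the paper (for instance, it omits the separate weighting of the current token). The names `wkv_parallel`, `wkv_recurrent`, `k`, `v` and `w` are illustrative choices, not identifiers taken from the source.

```python
import numpy as np

def wkv_parallel(k, v, w):
    """Transformer-style form: all time steps computed at once.

    k, v: arrays of shape (T, C); w: non-negative decay of shape (C,).
    Simplified assumption: out[t] = sum_{i<=t} exp(k[i] - (t-i)*w) * v[i]
                                    / sum_{i<=t} exp(k[i] - (t-i)*w)
    """
    T = k.shape[0]
    t = np.arange(T)
    lag = t[:, None] - t[None, :]                 # lag[t, i] = t - i
    causal = lag >= 0                             # only past and current tokens
    logits = k[None, :, :] - np.maximum(lag, 0)[:, :, None] * w[None, None, :]
    weights = np.where(causal[:, :, None], np.exp(logits), 0.0)  # (T, T, C)
    num = np.einsum("tic,ic->tc", weights, v)
    den = weights.sum(axis=1)
    return num / den

def wkv_recurrent(k, v, w):
    """RNN-style form: one step at a time with a fixed-size state."""
    T, C = k.shape
    num = np.zeros(C)                             # running weighted sum of values
    den = np.zeros(C)                             # running sum of weights
    out = np.empty((T, C))
    decay = np.exp(-w)
    for step in range(T):
        num = decay * num + np.exp(k[step]) * v[step]
        den = decay * den + np.exp(k[step])
        out[step] = num / den
    return out

rng = np.random.default_rng(0)
k = rng.normal(size=(6, 4))
v = rng.normal(size=(6, 4))
w = rng.uniform(0.1, 1.0, size=4)
same = np.allclose(wkv_parallel(k, v, w), wkv_recurrent(k, v, w))
print("parallel and recurrent forms agree:", same)
```

In this sketch the parallel form materializes a T × T weight tensor and so suits batched training, while the recurrent form carries only two vectors of size C between steps, which is what makes step-by-step inference cheap.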

Licensing

Some rights reserved
Permission is granted to copy, distribute and/or modify this document under the terms of the Creative Commons Attribution-ShareAlike 4.0 License. The full text of this license may be found here: CC BY-SA 4.0

File history

Date/Time | Thumbnail | Dimensions | User | Comment
current, 06:08, 11 October 2023 | Thumbnail for version as of 06:08, 11 October 2023 | 553 × 574 (86 KB) | AmirhosseinAbaskohi | Uploaded a work by Bo Peng, Eric Alcaide, Quentin Anthony, Alon Albalak, Samuel Arcadinho, Huanqi Cao, Xin Cheng, Michael Chung, Matteo Grella, Kranthi Kiran GV, Xuzheng He, Haowen Hou, Przemyslaw Kazienko, Jan Kocon, Jiaming Kong, Bartlomiej Koptyra, Hayden Lau, Krishna Sri Ipsit Mantri, Ferdinand Mom, Atsushi Saito, Xiangru Tang, Bolun Wang, Johan S. Wind, Stanislaw Wozniak, Ruichong Zhang, Zhenyuan Zhang, Qihang Zhao, Peng Zhou, Jian Zhu, Rui-Jie Zhu from https://arxiv.org/abs/2305.13048 w...

The following page uses this file: