File:ModelArch.png
ModelArch.png (493 × 592 pixels, file size: 120 KB, MIME type: image/png)
Summary
Description | English: Here the model takes the word "absurdity" as the current input and combines it with the history (as represented by the hidden state) to predict the next word, "is". The first layer performs a lookup of character embeddings (of dimension four) and stacks them to form a matrix. Convolution operations are then applied between this matrix and multiple filter matrices. In this example there are twelve filters: three filters of width two (blue), four filters of width three (yellow), and five filters of width four (red). A max-over-time pooling operation is applied to obtain a fixed-dimensional representation of the word, which is given to the highway network. The highway network's output is used as the input to a multi-layer LSTM. Finally, an affine transformation followed by a softmax is applied over the hidden representation of the LSTM to obtain the distribution over the next word. The cross-entropy loss between the predicted distribution over the next word and the actual next word is minimized. Element-wise addition, multiplication, and sigmoid operators are depicted in circles, and affine transformations (plus nonlinearities where appropriate) are represented by solid arrows. |
Date | 3 March 2018 |
File source | "Character-Aware Neural Language Models." AAAI. 2016. |
Author | Kim, Yoon, et al. |
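The character-level part of the description (embedding lookup, convolution with filters of several widths, max-over-time pooling, highway layer) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the weights are random, the names `embed`, `conv_max_over_time`, `highway`, and `word_representation` are hypothetical, the tanh and ReLU nonlinearities and the gate bias of -2 are assumptions based on the commonly used formulation, and the LSTM and softmax stages are omitted. Only the embedding dimension (four) and the filter counts per width (three, four, five) are taken from the figure.

```python
import math
import random

random.seed(0)

EMBED_DIM = 4                  # character embedding dimension, as in the figure
FILTERS = {2: 3, 3: 4, 4: 5}   # filter width -> number of filters (twelve in total)


def rand_matrix(rows, cols):
    """Random weights; a trained model would learn these (illustrative only)."""
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def embed(word, table):
    """Look up one embedding per character: a len(word) x EMBED_DIM matrix."""
    return [table[ch] for ch in word]


def conv_max_over_time(char_matrix, kernel):
    """Slide one filter (width x EMBED_DIM) along the word and keep the maximum response."""
    width = len(kernel)
    responses = []
    # Assumes the word is at least as long as the filter is wide (no padding here).
    for start in range(len(char_matrix) - width + 1):
        total = sum(kernel[i][j] * char_matrix[start + i][j]
                    for i in range(width) for j in range(EMBED_DIM))
        responses.append(math.tanh(total))  # tanh nonlinearity is an assumption
    return max(responses)


def affine(W, b, x):
    """Compute W x + b for a list-of-rows matrix W."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def highway(y, W_h, b_h, W_t, b_t):
    """One highway layer: t * relu(W_h y + b_h) + (1 - t) * y, with sigmoid gate t."""
    h = [max(0.0, v) for v in affine(W_h, b_h, y)]
    t = [1.0 / (1.0 + math.exp(-v)) for v in affine(W_t, b_t, y)]
    return [ti * hi + (1.0 - ti) * yi for ti, hi, yi in zip(t, h, y)]


def word_representation(word, table, kernels, hw):
    """Characters -> pooled filter responses -> highway output (fixed length)."""
    chars = embed(word, table)
    pooled = [conv_max_over_time(chars, k) for k in kernels]
    return highway(pooled, *hw)


table = {ch: [random.uniform(-0.5, 0.5) for _ in range(EMBED_DIM)]
         for ch in "abcdefghijklmnopqrstuvwxyz"}
kernels = [rand_matrix(width, EMBED_DIM)
           for width, count in FILTERS.items() for _ in range(count)]
n = len(kernels)
hw = (rand_matrix(n, n), [0.0] * n, rand_matrix(n, n), [-2.0] * n)

vec = word_representation("absurdity", table, kernels, hw)
print(len(vec))  # one value per filter, regardless of word length
```

Because max-over-time pooling keeps a single value per filter, words of different lengths map to vectors of the same size, which is what allows the result to be fed to the highway network and then to the LSTM.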
Licensing
{{subst:uwl}}
File history
 | Date/Time | Dimensions | User | Comment |
---|---|---|---|---|
current | 07:34, 4 March 2018 | 493 × 592 (120 KB) | KevinDsouza | User created page with UploadWizard |