File:ModelArch.png

ModelArch.png (493 × 592 pixels, file size: 120 KB, MIME type: image/png)

Summary

Description
English: Here the model takes "absurdity" as the current input and combines it with the history (as represented by the hidden state) to predict the next word, "is". The first layer performs a lookup of character embeddings (of dimension four) and stacks them to form the matrix C^k. Then convolution operations are applied between C^k and multiple filter matrices. Note that in the above example we have twelve filters: three filters of width two (blue), four filters of width three (yellow), and five filters of width four (red). A max-over-time pooling operation is applied to obtain a fixed-dimensional representation of the word, which is given to the highway network. The highway network's output is used as the input to a multi-layer LSTM. Finally, an affine transformation followed by a softmax is applied over the hidden representation of the LSTM to obtain the distribution over the next word. The cross-entropy loss between the predicted distribution over the next word and the actual next word is minimized. Element-wise addition, multiplication, and sigmoid operators are depicted in circles, and affine transformations (plus nonlinearities where appropriate) are represented by solid arrows.
Date 3 March 2018 (2018-03-03)
File source "Character-Aware Neural Language Models." AAAI. 2016.
Author Kim, Yoon, et al.
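
The word-level part of the pipeline in the description (character embedding lookup, convolution with twelve filters, max-over-time pooling, highway layer) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the weights are random stand-ins for trained parameters, the names `char_cnn`, `highway`, and `rand_matrix` are hypothetical, the tanh and ReLU nonlinearities and the negative transform-gate bias are assumptions not stated in the description, convolution biases are omitted, and the LSTM and softmax stages are left out. The word must be at least as long as the widest filter.

```python
import math
import random

random.seed(0)

EMBED_DIM = 4                       # character embedding size, as in the figure
FILTER_COUNTS = {2: 3, 3: 4, 4: 5}  # filter width -> number of filters (12 total)


def rand_matrix(rows, cols):
    """Random weights; a stand-in for trained parameters."""
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def char_cnn(word, char_table, filters):
    """Embed characters, convolve, then max-over-time pool each filter."""
    C = [char_table[ch] for ch in word]          # one row per character
    features = []
    for width, kernels in filters.items():
        for kernel in kernels:                   # kernel shape: (width, EMBED_DIM)
            responses = []
            for start in range(len(C) - width + 1):
                s = sum(kernel[i][j] * C[start + i][j]
                        for i in range(width) for j in range(EMBED_DIM))
                responses.append(math.tanh(s))   # nonlinearity is an assumption
            features.append(max(responses))      # max over time
    return features


def affine(W, b, x):
    """Compute W x + b for a list-of-lists matrix W."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def highway(y, W_H, b_H, W_T, b_T):
    """One highway layer: z = t * relu(W_H y + b_H) + (1 - t) * y."""
    t = [sigmoid(v) for v in affine(W_T, b_T, y)]    # transform gate
    h = [max(0.0, v) for v in affine(W_H, b_H, y)]   # ReLU is an assumption
    return [ti * hi + (1.0 - ti) * yi for ti, hi, yi in zip(t, h, y)]


# Toy parameters (all hypothetical).
char_table = {ch: [random.uniform(-0.5, 0.5) for _ in range(EMBED_DIM)]
              for ch in "abcdefghijklmnopqrstuvwxyz"}
filters = {w: [rand_matrix(w, EMBED_DIM) for _ in range(n)]
           for w, n in FILTER_COUNTS.items()}

y = char_cnn("absurdity", char_table, filters)   # fixed-size word representation
n = len(y)
z = highway(y, rand_matrix(n, n), [0.0] * n, rand_matrix(n, n), [-2.0] * n)
print(len(y), len(z))  # 12 12
```

Because each filter contributes exactly one pooled value, the word vector has twelve entries regardless of word length, which is what lets the highway network and the LSTM consume words of any length.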

Licensing

{{subst:uwl}}

File history

Date/Time | Dimensions | User | Comment
current: 07:34, 4 March 2018 | 493 × 592 (120 KB) | KevinDsouza | User created page with UploadWizard