Feedback
Hi Harshinee,
Overall, I thought you did a great job on this page. The quality of the writing is excellent, and I learned a lot about the varieties of LSTMs while reading it. I have a few items of feedback that I think would improve the page even more.
I think the main issue I have with this page is the lack of introduction of notation, which comes up in a couple of spots.
In the introduction, you talk about how LSTMs are used to model sequential data. It would be great if you could be specific about what that means. I assume the goal is to create some embedding h_t based on the sequence x_1:t, but as it stands, the reader has to infer what the inputs and outputs of the LSTM are. Moreover, which context vectors / hidden states get passed to the next LSTM step?
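As a concrete suggestion (using the symbols I'm assuming the page intends; adjust to your actual notation), even a single line stating the interface up front would go a long way:

```latex
% Each step consumes the current input and the previous
% hidden and cell states, and emits updated states:
(h_t, c_t) = \mathrm{LSTM}(x_t,\, h_{t-1},\, c_{t-1})
```

Something like this immediately tells the reader what flows in, what flows out, and what gets carried forward to the next timestep.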
Before diving into the math of the LSTM gates, it would be great if you could overview what's what. What is the context vector? What are the parameters of the LSTM?
In the "Variants of LSTMs" section, the above problems get worse. I think the text descriptions of each method are great, but I had a really hard time parsing what each of the equations did without understanding the symbols being used. I honestly don't think you need the math here. I would simply introduce each method as you do now and refer readers to the specific papers if they would like the details of how to implement these variants.
I think the applications section could benefit from a plot or table showcasing the applications of LSTMs. Maybe grab a plot or result from one of the papers you cite.
Nits:
- In the LSTM gates section you define f_t and i_t, but as far as I can tell you never use them. I'm guessing there's another equation where they appear, similar to how o_t is used?
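  For reference, in the standard LSTM formulation, f_t and i_t appear in the cell-state update, so I suspect this is the missing equation (assuming your page follows the usual notation, with \tilde{c}_t as the candidate cell state):

  ```latex
  % Candidate cell state from the current input and previous hidden state:
  \tilde{c}_t = \tanh(W_c x_t + U_c h_{t-1} + b_c)
  % Forget gate scales the old cell state; input gate scales the candidate:
  c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t
  % Output gate controls what the hidden state exposes:
  h_t = o_t \odot \tanh(c_t)
  ```

  Adding these two or three lines would close the loop on why f_t and i_t are defined at all.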
- Your Bibliography is in the "To Add" section instead of the "Bibliography" section.