Learn LSTM in RNN

Long Short-Term Memory (LSTM) is claimed to be capable of predicting time series when there are long time lags of unknown size between important events. However, as of June 2015, I have not found many clear tutorials on the Internet. Below is a collection of materials I came across. I will probably write a tutorial myself soon.

Wikipedia: https://en.wikipedia.org/wiki/Long_short_term_memory

Hochreiter and Schmidhuber, 1997. Long Short-Term Memory. http://deeplearning.cs.cmu.edu/pdfs/Hochreiter97_lstm.pdf. This is the paper that introduced LSTM. I can’t understand it well, however.

Felix Gers’s PhD thesis. http://www.felixgers.de/papers/phd.pdf. I did not find it very clear, though.

The clearest entry-level tutorial for me: http://www.willamette.edu/~gorr/classes/cs449/lstm.html. It illustrates why LSTM is called LSTM.

Alex Graves. 2014. Generating Sequences With Recurrent Neural Networks. http://arxiv.org/pdf/1308.0850v5.pdf. This paper shows how RNNs with LSTM cells can be used to generate sequences such as text and handwriting.

A paper from Microsoft: Spoken Language Understanding Using Long Short-Term Memory Neural Networks. http://research.microsoft.com/pubs/228844/20140915012634_789031_1017.pdf. I haven’t read it yet.

Theano tutorials:
http://christianherta.de/lehre/dataScience/machineLearning/neuralNetworks/LSTM.php
http://deeplearning.net/tutorial/lstm.html#lstm

My Quora question: http://www.quora.com/How-does-LSTM-help-prevent-the-vanishing-and-exploding-gradient-problem-in-Recurrent-Neural-Network
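
To make the materials above a little more concrete, here is a minimal sketch of one LSTM step in plain Python. It is not taken from any of the linked sources. It assumes the commonly used variant with a forget gate (which was added after the 1997 paper), uses scalar inputs and states for readability, and the names `lstm_step`, `params` and `run`, as well as all weight values, are hypothetical and hand-picked for illustration. The point it tries to show is the additive cell-state update, which is usually given as the reason a value can survive a long time lag.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, p):
    """One step of a scalar LSTM cell (forget-gate variant, no peepholes)."""
    i = sigmoid(p["wi"] * x + p["ui"] * h_prev + p["bi"])    # input gate
    f = sigmoid(p["wf"] * x + p["uf"] * h_prev + p["bf"])    # forget gate
    o = sigmoid(p["wo"] * x + p["uo"] * h_prev + p["bo"])    # output gate
    g = math.tanh(p["wg"] * x + p["ug"] * h_prev + p["bg"])  # candidate value
    c = f * c_prev + i * g   # additive update of the cell state
    h = o * math.tanh(c)     # hidden state exposed to the rest of the network
    return h, c


# Hypothetical, hand-picked weights (a trained network would learn these).
params = {
    "wi": 10.0, "ui": 0.0, "bi": -5.0,  # input gate opens only when x is near 1
    "wf": 0.0,  "uf": 0.0, "bf": 10.0,  # forget gate stays close to 1
    "wo": 0.0,  "uo": 0.0, "bo": 10.0,  # output gate stays close to 1
    "wg": 5.0,  "ug": 0.0, "bg": 0.0,   # candidate follows the input
}


def run(inputs, p=params):
    """Feed a sequence through the cell, starting from zero state."""
    h, c = 0.0, 0.0
    for x in inputs:
        h, c = lstm_step(x, h, c, p)
    return h, c


# One "event" at t=0, followed by 50 steps of silence.
h_event, c_event = run([1.0] + [0.0] * 50)
# No event at all, same sequence length.
h_quiet, c_quiet = run([0.0] * 51)

print("with event:   ", h_event, c_event)
print("without event:", h_quiet, c_quiet)
```

With these weights the cell state after the event sequence stays close to its value right after the event, while the quiet sequence leaves it at zero, so the two cases remain distinguishable 50 steps later. A plain RNN that squashes its state through a nonlinearity at every step would typically need carefully tuned weights to do the same.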
