7. Simple RNN

This chapter describes simple RNNs.

Simple RNNs have difficulty learning long-term dependencies, but they are a useful stepping stone toward understanding more advanced RNN architectures such as LSTMs and GRUs.

Chapter Contents

7.1. Formulation of Simple RNN
7.2. Computing the gradients for Back Propagation Through Time (BPTT)
7.3. Implementation
7.4. TensorFlow and Keras versions
7.5. BPTT in Many-to-Many type
7.6. Exploding and Vanishing Gradients Problems

Note

In this document, we will use diagrams like Fig.7-1 to illustrate the operation of the RNN.

For example, Fig.7-1 shows an input sequence $\{x^{(0)}, x^{(1)}, \ldots, x^{(T)}\}$ being fed into the RNN one element at a time, starting with $x^{(0)}$.

Fig.7-1: Many-to-One Simple-RNN with Dense Layer

Note that Fig.7-1 is an unrolled view: it depicts several RNN units, but there is only one unit, and it receives the elements of the input sequence one after another. In other words, the same unit, with the same weights, processes every element of the sequence in turn. See Fig.7-2 for further clarification.

Fig.7-2: RNN Operation at Each Time Step
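
The point made by Fig.7-1 and Fig.7-2 can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the formulation used in this chapter (Section 7.1 gives that, and it may differ): the tanh activation, the function names rnn_step and many_to_one, and the parameter names W, U, b, V, c are all assumptions introduced here. What it shows is that a single step function, with a single set of weights, is called once per sequence element, and that the dense layer is applied only to the final hidden state.

```python
import math


def rnn_step(x_t, h_prev, W, U, b):
    """One time step of a hypothetical simple RNN unit.

    Assumed form: h_t = tanh(W x_t + U h_prev + b).
    W is (hidden x input), U is (hidden x hidden), b has length hidden.
    """
    h_t = []
    for i in range(len(b)):
        s = b[i]
        s += sum(W[i][j] * x_t[j] for j in range(len(x_t)))
        s += sum(U[i][k] * h_prev[k] for k in range(len(h_prev)))
        h_t.append(math.tanh(s))
    return h_t


def many_to_one(xs, W, U, b, V, c):
    """Feed the sequence xs through ONE unit, then apply a dense layer.

    The loop reuses the same rnn_step and the same weights at every
    time step; only the hidden state h changes between steps.
    """
    h = [0.0] * len(b)  # initial hidden state (assumed to be zeros)
    for x_t in xs:
        h = rnn_step(x_t, h, W, U, b)
    # Dense layer (assumed linear, no activation) on the last hidden state.
    y = [c[i] + sum(V[i][k] * h[k] for k in range(len(h)))
         for i in range(len(c))]
    return y, h


if __name__ == "__main__":
    # Toy sequence of three 1-dimensional inputs and a 2-dimensional state.
    xs = [[0.5], [-1.0], [0.25]]
    W = [[0.3], [-0.2]]
    U = [[0.1, 0.0], [0.0, 0.1]]
    b = [0.0, 0.0]
    V = [[1.0, -1.0]]
    c = [0.0]
    y, h = many_to_one(xs, W, U, b, V, c)
    print("final hidden state:", h)
    print("output:", y)
```

The unrolled diagram in Fig.7-1 corresponds to the iterations of the for loop in many_to_one, while Fig.7-2 corresponds to a single call to rnn_step.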
