

Recurrent neural networks

by Yee Wei Law - Friday, 31 January 2025, 3:17 PM
 

A recurrent neural network (RNN) is a neural network which maps an input space of sequences to an output space of sequences in a stateful way[RHW86, Mur22].

While convolutional neural networks excel at two-dimensional (2D) data, recurrent neural networks (RNNs) are better suited for one-dimensional (1D), sequential data[GBC16, §9.11].

Unlike early artificial neural networks (ANNs), which have a feedforward structure, RNNs have a cyclic structure, inspired by the cyclical connectivity of biological neurons; see Fig. 1.

The forward pass of an RNN is the same as that of a multilayer perceptron, except that activations arrive at a hidden layer from both the current external input and the hidden-layer activations from the previous timestep.

Fig. 1 visualises the operation of an RNN by “unfolding” or “unrolling” the network across timesteps, with the same network parameters applied at each timestep.

Note: The term “timestep” should be understood more generally as an index for sequential data.
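To make the forward pass and the weight sharing across timesteps concrete, below is a minimal sketch of a vanilla RNN unrolled over a short sequence, using NumPy. The dimensions, the tanh activation and the weight names (W_xh, W_hh, W_hy) are illustrative assumptions of this sketch, not a prescription from the references above.

    import numpy as np

    rng = np.random.default_rng(0)

    # Illustrative dimensions: 4-dimensional inputs, 8 hidden units, 3 outputs, 5 timesteps.
    n_in, n_hidden, n_out, T = 4, 8, 3, 5

    # The SAME parameters are applied at every timestep (cf. Fig. 1).
    W_xh = rng.standard_normal((n_hidden, n_in)) * 0.1      # input -> hidden
    W_hh = rng.standard_normal((n_hidden, n_hidden)) * 0.1  # hidden -> hidden (recurrent)
    W_hy = rng.standard_normal((n_out, n_hidden)) * 0.1     # hidden -> output
    b_h = np.zeros(n_hidden)
    b_y = np.zeros(n_out)

    xs = rng.standard_normal((T, n_in))   # an input sequence of length T
    h = np.zeros(n_hidden)                # initial state

    ys = []
    for x_t in xs:                        # "unrolling" the network across timesteps
        # Activations arrive from the current input AND the previous hidden state.
        h = np.tanh(W_xh @ x_t + W_hh @ h + b_h)
        ys.append(W_hy @ h + b_y)

    print(np.stack(ys).shape)             # (T, n_out): one output per timestep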

For the backward pass, two well-known algorithms are applicable: 1️⃣ real-time recurrent learning, and 2️⃣ the simpler and computationally more efficient backpropagation through time[Wer90].
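As a rough illustration of backpropagation through time, the sketch below differentiates a squared-error loss on the final hidden state of a vanilla tanh RNN by walking the unrolled graph backwards. Placing the loss only at the final timestep, and the dimensions and variable names, are assumptions made to keep the sketch short.

    import numpy as np

    rng = np.random.default_rng(1)
    n_in, n_hidden, T = 3, 5, 6

    W_xh = rng.standard_normal((n_hidden, n_in)) * 0.1
    W_hh = rng.standard_normal((n_hidden, n_hidden)) * 0.1
    b_h = np.zeros(n_hidden)

    xs = rng.standard_normal((T, n_in))
    target = rng.standard_normal(n_hidden)    # illustrative target for the final state

    # Forward pass: cache the hidden states needed by the backward pass.
    hs = [np.zeros(n_hidden)]
    for x_t in xs:
        hs.append(np.tanh(W_xh @ x_t + W_hh @ hs[-1] + b_h))

    # Squared-error loss on the final hidden state only.
    loss = 0.5 * np.sum((hs[-1] - target) ** 2)

    # Backward pass through time: from the last timestep back to the first.
    dW_xh = np.zeros_like(W_xh)
    dW_hh = np.zeros_like(W_hh)
    db_h = np.zeros_like(b_h)
    dh = hs[-1] - target                       # dL/dh_T
    for t in reversed(range(T)):
        da = dh * (1.0 - hs[t + 1] ** 2)       # back through tanh
        dW_xh += np.outer(da, xs[t])
        dW_hh += np.outer(da, hs[t])
        db_h += da
        dh = W_hh.T @ da                       # pass the gradient to the previous timestep

    print(loss, np.linalg.norm(dW_hh))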

Fig. 1: On the left, an RNN is often visualised as a neural network with recurrent connections. The recurrent connections should be understood, through unfolding or unrolling the network across timesteps, as applying the same network parameters to the current input and the previous state at each timestep. On the right, while the recurrent connections (blue arrows) propagate the network state over timesteps, the standard network connections (black arrows) propagate activations from one layer to the next within the same timestep. Diagram adapted from [ZLLS23, Figure 9.1].

Fig. 1 implies information flows in one direction, the direction associated with causality.

However, for many sequence labelling tasks, the correct output depends on the entire input sequence, or at least a sufficiently long segment of it. Examples of such tasks include speech recognition and language translation. Addressing the needs of these tasks gave rise to bidirectional RNNs[SP97].
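A bidirectional RNN can be sketched as two vanilla RNNs run over the same sequence in opposite directions, with their hidden states combined at each timestep. The details below (tanh activations, combination by concatenation, weight names) are assumptions of this sketch rather than the canonical formulation in [SP97].

    import numpy as np

    def rnn_states(xs, W_xh, W_hh, b_h):
        # Return the hidden state at every timestep of a vanilla tanh RNN.
        h, states = np.zeros(W_hh.shape[0]), []
        for x_t in xs:
            h = np.tanh(W_xh @ x_t + W_hh @ h + b_h)
            states.append(h)
        return np.stack(states)

    rng = np.random.default_rng(2)
    n_in, n_hidden, T = 4, 6, 7
    xs = rng.standard_normal((T, n_in))

    # Separate parameters for the forward-in-time and backward-in-time passes.
    fwd = [rng.standard_normal((n_hidden, n_in)) * 0.1,
           rng.standard_normal((n_hidden, n_hidden)) * 0.1,
           np.zeros(n_hidden)]
    bwd = [rng.standard_normal((n_hidden, n_in)) * 0.1,
           rng.standard_normal((n_hidden, n_hidden)) * 0.1,
           np.zeros(n_hidden)]

    h_fwd = rnn_states(xs, *fwd)               # processes x_1, ..., x_T
    h_bwd = rnn_states(xs[::-1], *bwd)[::-1]   # processes x_T, ..., x_1, then re-aligned

    # Each timestep now sees context from both the past and the future.
    h_bidir = np.concatenate([h_fwd, h_bwd], axis=1)
    print(h_bidir.shape)                        # (T, 2 * n_hidden)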

Standard/traditional RNNs suffer from the following deficiencies[Gra12, YSHZ19, MSO24]:

  • They are susceptible to the problems of vanishing gradients and exploding gradients (see the numerical sketch after this list).
  • They cannot store information for long periods of time.
  • Except for bidirectional RNNs, they access context information in only one direction (i.e., typically past information in the time domain).
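The first of these deficiencies can be seen numerically: in backpropagation through time, the gradient that reaches a timestep far in the past contains a high power of the recurrent weight matrix (multiplied by derivatives of the activation function, which for tanh can only shrink it further), and the norm of that product tends towards zero or blows up as the gap between timesteps grows. The sketch below uses arbitrary random recurrent weight matrices purely for illustration.

    import numpy as np

    rng = np.random.default_rng(3)
    n_hidden = 8

    for scale in (0.8, 1.2):                   # spectral radius below vs above 1
        W = rng.standard_normal((n_hidden, n_hidden))
        W *= scale / np.max(np.abs(np.linalg.eigvals(W)))   # set the spectral radius
        for k in (1, 10, 50):
            # Roughly, the gradient propagated across k timesteps contains W^k.
            print(scale, k, np.linalg.norm(np.linalg.matrix_power(W, k)))
    # With spectral radius 0.8 the norm collapses towards 0 (vanishing gradients);
    # with spectral radius 1.2 it grows rapidly (exploding gradients).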

Due to the drawbacks above, RNNs are typically augmented with gating mechanisms which, like "leaky" units, enable the networks to accumulate information over long durations[GBC16, §10.10]. The resultant RNNs are called gated RNNs. The most successful gated RNNs are those using long short-term memory (LSTM)[VHMN20] or gated recurrent units (GRUs).
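As an illustration of gating, below is a minimal single-step GRU cell in NumPy. It follows a common formulation with a reset gate and an update gate; the weight names and the update convention h_t = z ⊙ h_{t-1} + (1 - z) ⊙ h̃_t (as in [ZLLS23]) are assumptions of this sketch, and conventions differ slightly across references.

    import numpy as np

    def sigmoid(a):
        return 1.0 / (1.0 + np.exp(-a))

    def gru_step(x_t, h_prev, params):
        # One GRU timestep: the gates decide how much of the old state to keep.
        W_xz, W_hz, b_z, W_xr, W_hr, b_r, W_xh, W_hh, b_h = params
        z = sigmoid(W_xz @ x_t + W_hz @ h_prev + b_z)             # update gate
        r = sigmoid(W_xr @ x_t + W_hr @ h_prev + b_r)             # reset gate
        h_cand = np.tanh(W_xh @ x_t + W_hh @ (r * h_prev) + b_h)  # candidate state
        return z * h_prev + (1.0 - z) * h_cand

    rng = np.random.default_rng(4)
    n_in, n_hidden = 3, 5
    params = [rng.standard_normal((n_hidden, n_in)) * 0.1,      # W_xz
              rng.standard_normal((n_hidden, n_hidden)) * 0.1,  # W_hz
              np.zeros(n_hidden),                               # b_z
              rng.standard_normal((n_hidden, n_in)) * 0.1,      # W_xr
              rng.standard_normal((n_hidden, n_hidden)) * 0.1,  # W_hr
              np.zeros(n_hidden),                               # b_r
              rng.standard_normal((n_hidden, n_in)) * 0.1,      # W_xh
              rng.standard_normal((n_hidden, n_hidden)) * 0.1,  # W_hh
              np.zeros(n_hidden)]                               # b_h

    h = np.zeros(n_hidden)
    for x_t in rng.standard_normal((6, n_in)):   # a length-6 input sequence
        h = gru_step(x_t, h, params)
    print(h)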

References

[GBC16] I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning, MIT Press, 2016. Available at https://www.deeplearningbook.org.
[Gra12] A. Graves, Supervised Sequence Labelling with Recurrent Neural Networks, Springer Berlin, Heidelberg, 2012. https://doi.org/10.1007/978-3-642-24797-2.
[MSO24] I. D. Mienye, T. G. Swart, and G. Obaido, Recurrent neural networks: A comprehensive review of architectures, variants, and applications, Information 15 no. 9 (2024). https://doi.org/10.3390/info15090517.
[Mur22] K. P. Murphy, Probabilistic Machine Learning: An Introduction, MIT Press, 2022. Available at http://probml.ai.
[RHW86] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, Learning representations by back-propagating errors, Nature 323 (1986), 533–536. https://doi.org/10.1038/323533a0.
[SP97] M. Schuster and K. Paliwal, Bidirectional recurrent neural networks, IEEE Transactions on Signal Processing 45 no. 11 (1997), 2673–2681. https://doi.org/10.1109/78.650093.
[VHMN20] G. Van Houdt, C. Mosquera, and G. Nápoles, A review on the long short-term memory model, Artificial Intelligence Review 53 no. 8 (2020), 5929–5955. https://doi.org/10.1007/s10462-020-09838-1.
[Wer90] P. Werbos, Backpropagation through time: what it does and how to do it, Proceedings of the IEEE 78 no. 10 (1990), 1550–1560. https://doi.org/10.1109/5.58337.
[YSHZ19] Y. Yu, X. Si, C. Hu, and J. Zhang, A review of recurrent neural networks: LSTM cells and network architectures, Neural Computation 31 no. 7 (2019), 1235–1270. https://doi.org/10.1162/neco_a_01199.
[ZLLS23] A. Zhang, Z. C. Lipton, M. Li, and A. J. Smola, Dive into Deep Learning, Cambridge University Press, 2023. Available at https://d2l.ai/.


Reinforcement learning

by Yee Wei Law - Tuesday, 18 March 2025, 9:56 AM
 

Work in progress

Reinforcement learning (RL) is a family of algorithms that learn an optimal policy, whose goal is to maximize the expected return when interacting with an environment[Goo25].

RL has existed since the 1950s[BD10], but it was the introduction of high-capacity function approximators, namely deep neural networks, that rejuvenated RL in recent years[LKTF20].

There are three main types of RL[LKTF20, FPMC24]:

  1. Online or on-policy RL: In this classic setting, an agent interacts freely with the environment and updates its policy using the experience collected by that same policy.
  2. Off-policy RL: In this classic setting, an agent still interacts with the environment, but it may learn from experience generated by a different behaviour policy, for example experience replayed from a buffer; Q-learning is a well-known example (see the sketch after this list).
  3. Offline RL: The agent learns entirely from a fixed, previously collected dataset, without further interaction with the environment during training.
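Pending the completion of this entry, the sketch below illustrates the basic loop of an agent interacting with an environment to maximise the expected return: tabular Q-learning (an off-policy algorithm in the taxonomy above; see [SB18]) on a hypothetical one-dimensional corridor environment. The environment, reward structure and hyperparameters are arbitrary assumptions for illustration only.

    import numpy as np

    # Hypothetical toy environment: a corridor of 6 cells; the agent starts at
    # cell 0, moves left (action 0) or right (action 1), and receives a reward
    # of +1 only upon reaching the rightmost cell, which ends the episode.
    N_STATES, LEFT, RIGHT = 6, 0, 1

    def step(state, action):
        next_state = max(state - 1, 0) if action == LEFT else state + 1
        done = next_state == N_STATES - 1
        return next_state, (1.0 if done else 0.0), done

    rng = np.random.default_rng(5)
    Q = np.zeros((N_STATES, 2))                  # action-value table
    alpha, gamma, epsilon = 0.1, 0.9, 0.1        # arbitrary hyperparameters

    for _ in range(500):                         # episodes of online interaction
        state, done = 0, False
        while not done:
            # Epsilon-greedy behaviour policy collects the experience.
            action = rng.integers(2) if rng.random() < epsilon else int(np.argmax(Q[state]))
            next_state, reward, done = step(state, action)
            # Q-learning update: bootstrap from the GREEDY action, hence off-policy.
            target = reward + (0.0 if done else gamma * np.max(Q[next_state]))
            Q[state, action] += alpha * (target - Q[state, action])
            state = next_state

    print(np.argmax(Q, axis=1))                  # learned greedy policy: move right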

References

[BD10] R. Bellman and S. Dreyfus, Dynamic Programming, 33, Princeton University Press, 2010. https://doi.org/10.2307/j.ctv1nxcw0f.
[BK19] S. L. Brunton and J. N. Kutz, Data-Driven Science and Engineering: Machine Learning, Dynamical Systems, and Control, Cambridge University Press, 2019. https://doi.org/10.1017/9781108380690.
[FPMC24] R. Figueiredo Prudencio, M. R. O. A. Maximo, and E. L. Colombini, A survey on offline reinforcement learning: Taxonomy, review, and open problems, IEEE Transactions on Neural Networks and Learning Systems 35 no. 8 (2024), 10237–10257. https://doi.org/10.1109/TNNLS.2023.3250269.
[GBC16] I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning, MIT Press, 2016. Available at https://www.deeplearningbook.org.
[Goo25] Google, reinforcement learning (RL), Machine Learning Glossary, 2025, accessed 3 Jan 2025. Available at https://developers.google.com/machine-learning/glossary#reinforcement-learning-rl.
[LKTF20] S. Levine, A. Kumar, G. Tucker, and J. Fu, Offline reinforcement learning: Tutorial, review, and perspectives on open problems, arXiv preprint arXiv:2005.01643, 2020. https://doi.org/10.48550/arXiv.2005.01643.
[SB18] R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction, 2nd ed., MIT Press, 2018.
[ZLLS23] A. Zhang, Z. C. Lipton, M. Li, and A. J. Smola, Dive into Deep Learning, Cambridge University Press, 2023. Available at https://d2l.ai/.