Description
This module introduces the foundational concepts and algorithms of reinforcement learning. It covers general concepts such as Markov decision processes, tabular reinforcement learning, Q-learning, and policy gradient methods. The module consists of lectures and laboratory work in the form of coding-based tutorials. Algorithms will be connected to robotics-inspired reinforcement learning problems. The module prepares learners to complete research-like tasks, drawing on a range of sources, with a significant level of autonomy. Learners will gain an understanding of the main concepts and theories taught, develop skills in analysis and synthesis, and become familiar with research-informed literature. They will be able to identify the key aspects of a problem and choose appropriate methods for its resolution.
Aims:
The aims of this module are to:
- Provide the foundations of reinforcement learning that underpin follow-up modules, such as robot learning.
- Support students in developing a breadth of knowledge and understanding of fundamental reinforcement learning concepts and solution approaches, with the goal of applying these to robotic problems in a subsequent module.
- Provide students with the tools for critical analysis, so that they can justify their choice of techniques when creating practical solutions to reinforcement learning problems, based on a critical assessment of the techniques' effectiveness, efficiency, and limits of applicability.
Intended learning outcomes:
On successful completion of the module, a student will be able to:
- Understand basic concepts of reinforcement learning.
- Take a systematic approach to developing and analyzing reinforcement learning algorithms.
- Evaluate the quality and suitability of different reinforcement learning methods for different scenarios.
- Examine and interpret properties of reinforcement learning algorithms.
Indicative content:
The following are indicative of the topics the module will typically cover:
- Markov decision processes.
- Dynamic programming (model-based, known transitions).
- Model-free prediction.
- Model-free control.
- Value function approximation.
- Policy gradient methods.
- Integrating learning and planning.
- Exploration and exploitation.
- Case Study.
- Partially observable Markov decision processes (POMDPs).
Requisite conditions:
To be eligible to select this module as optional or elective, a student must be registered on a programme and year of study for which it is formally available.
Module deliveries for 2024/25 academic year
Last updated
This module description was last updated on 19th August 2024.