Master
2025/2026




Mathematical Foundations of Reinforcement Learning
Type:
Elective course (Math of Machine Learning)
Delivered by:
Big Data and Information Retrieval School
Where:
Faculty of Computer Science
When:
2nd year, 2nd module
Open to:
students of one campus
Language:
English
ECTS credits:
3
Contact hours:
28
Course Syllabus
Abstract
Reinforcement learning is a type of machine learning. Unlike classical machine learning, its key feature is the interaction of an agent (the algorithm) with an environment, from which it receives feedback in the form of rewards. The agent's goal is to maximize the sum of rewards that the environment gives it for the "right" interactions. During the course, we will get acquainted with the basic concepts of reinforcement learning theory and discuss exploration of the environment and the paradigm of optimism. We will also study modern reinforcement learning algorithms such as TRPO, PPO, and entropy-regularized reinforcement learning, which are widely used, for example, in modern methods of training large language models.
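As an illustration of the agent-environment loop and the paradigm of optimism mentioned in the abstract, here is a minimal sketch of the UCB1 rule on a Bernoulli multi-armed bandit. UCB1 is chosen here only as a familiar example of optimism; the syllabus does not say which algorithms the lectures use. The names `ucb1`, `arm_means`, and `horizon` are hypothetical and introduced for this sketch.

```python
import math
import random


def ucb1(arm_means, horizon, seed=0):
    """Run UCB1 on a Bernoulli bandit; return (total_reward, pull_counts).

    arm_means: assumed true success probabilities (unknown to the agent).
    horizon:   number of interaction rounds.
    """
    rng = random.Random(seed)
    n_arms = len(arm_means)
    counts = [0] * n_arms   # number of pulls per arm
    sums = [0.0] * n_arms   # cumulative reward per arm
    total_reward = 0.0

    for t in range(1, horizon + 1):
        if t <= n_arms:
            # Pull each arm once so every empirical mean is defined.
            arm = t - 1
        else:
            # Optimism: pick the arm with the largest upper confidence
            # bound (empirical mean plus an exploration bonus).
            arm = max(
                range(n_arms),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        # Environment feedback: a Bernoulli reward with the arm's mean.
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        total_reward += reward

    return total_reward, counts
```

Calling `ucb1([0.1, 0.5, 0.9], horizon=2000)` returns the collected reward and the number of pulls per arm. Rarely pulled arms receive a large bonus, so the agent keeps exploring them until the data rule them out, and the pulls concentrate on the arm with the highest mean.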
Learning Objectives
- Understanding the mathematical fundamentals of modern reinforcement learning methods.
Expected Learning Outcomes
- The ability to apply mathematical methods to the analysis of reinforcement learning algorithms
- The ability to construct effective algorithms
- Developing problem-solving skills and an understanding of algorithms
Course Contents
- Introduction to stochastic multi-armed bandits
- Policy evaluation
- Learning in MDPs
- Exploration in MDPs
- General state-space MDPs
- Policy optimization
Assessment Elements
- Homework: tasks on the topics covered in class.
- Exam: the oral exam covers the main topics of the lectures and seminars.
Bibliography
Recommended Core Bibliography
- Sutton, R. S., & Barto, A. G. (1998). Reinforcement Learning: An Introduction. A Bradford Book. ISBN 9780262257053. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&db=nlebk&AN=1094
- Li, Y. (2017). Deep Reinforcement Learning: An Overview. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsbas&AN=edsbas.281A6E8D
Recommended Additional Bibliography
- Markov Decision Processes in Practice. (2017).