NLP

2025/2026

Status: Mago-Lego

Delivered by: Department of Big Data and Information Retrieval

Where: Faculty of Computer Science

When: Module 3

Open to: students of the home campus

Instructors: Sataev Emil Robertovich, Tikhonova Maria Ivanovna

Language: English

Credits: 3

Contact hours: 24


Abstract

The course covers the fundamentals of natural language processing (NLP), a key area of artificial intelligence concerned with teaching computers to understand, analyze, and process text data. You will learn how rapidly NLP has developed in recent years, how statistical methods have been reborn in neural network approaches, and which tasks and concepts underlie modern text processing technologies. The course examines in detail architectures such as the Transformer and its attention mechanism, which underpin most modern NLP models. You will also study current tasks and approaches to solving them, and gain practical skills applicable to real projects.

Learning Objectives

• understand the principles of the Transformer architecture and its key components;
• be able to solve a wide range of language modeling and NLP problems using transformer models;
• know the modern encoder, decoder, and encoder-decoder transformer architectures;
• understand the structure and features of modern LLMs.

Expected Learning Outcomes

• Develops basic AI agents using LLMs.
• Designs and implements RAG pipelines for Q&A and generation tasks.
• Understands the principles of modern LLMs.
• Implements and applies transformer models for classification, language modeling, and generation tasks.
• Understands various tokenization strategies, applies encoder transformers to classification tasks, and adapts pre-trained models to applied NLP tasks.
• Understands the principles of decoder transformers and GPT models, explains the role of RLHF, and uses LLMs to generate text.
• Selects and applies LLM evaluation metrics, interprets benchmarking results, and critically analyzes the quality of generative models.
• Designs and implements RAG pipelines, uses external knowledge bases, and understands the limitations and advantages of retrieval-based approaches.
• Understands the principles of building LLM agents, implements simple agents, and analyzes their behavior and limitations.

Course Contents

Transformer architecture and machine translation
Tokenization and encoder transformers. Classification using transformers
Decoder transformers, GPT and RLHF
LLM quality assessment and benchmarking
Retrieval-Augmented Generation (RAG)
Introduction to LLM-based AI Agents

Assessment Elements

Exam
Homework 1
Homework 2
Homework 3

Interim Assessment

2025/2026 3rd module
The final grade combines the practical part (homework, HW) and the exam grade (EXAM). Final = Round(0.2 * EXAM + 0.2 * (HW1 + BONUS/3) + 0.3 * (HW2 + BONUS/3) + 0.3 * (HW3 + BONUS/3)), where HW1, HW2, and HW3 are the homework grades, EXAM is the exam grade, and BONUS is the bonus points earned in the course, split evenly across the three homework assignments. Exam: the exam is held online and consists of two parts, a test (TEST) and answers to oral questions (ORAL). The exam grade is computed as EXAM = 0.4 * TEST + 0.6 * ORAL.
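As an illustration only, the grading arithmetic above can be sketched in Python. The function and argument names (exam_grade, final_grade, test, oral, hw1, hw2, hw3, bonus) are hypothetical English renderings of the quantities in the formulas. The text does not say which rounding rule is used, so the sketch assumes rounding half up to the nearest integer. Decimal arithmetic is used so that values such as 6.5 are not distorted by binary floating point.

```python
from decimal import Decimal, ROUND_HALF_UP


def _d(value) -> Decimal:
    """Convert an int, float, or Decimal to an exact Decimal via its string form."""
    return Decimal(str(value))


def exam_grade(test, oral) -> Decimal:
    """Exam grade: 0.4 * test part + 0.6 * oral part."""
    return Decimal("0.4") * _d(test) + Decimal("0.6") * _d(oral)


def final_grade(exam, hw1, hw2, hw3, bonus=0) -> int:
    """Final grade: weights 0.2 (exam), 0.2, 0.3, 0.3 (homework 1-3).

    Bonus points are split evenly across the three homework assignments.
    """
    share = _d(bonus) / 3
    raw = (
        Decimal("0.2") * _d(exam)
        + Decimal("0.2") * (_d(hw1) + share)
        + Decimal("0.3") * (_d(hw2) + share)
        + Decimal("0.3") * (_d(hw3) + share)
    )
    # Assumption: half-up rounding; the syllabus only says "Round".
    return int(raw.quantize(Decimal("1"), rounding=ROUND_HALF_UP))


if __name__ == "__main__":
    exam = exam_grade(test=8, oral=7)  # 0.4 * 8 + 0.6 * 7 = 7.4
    print(exam)
    print(final_grade(exam, hw1=9, hw2=6, hw3=8, bonus=1.5))  # raw total is 7.88
```

Because the four weights sum to 1 and each homework term receives BONUS/3, the bonus contributes 0.8 * BONUS/3 to the unrounded total.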

Authors

Akhmedova Giunai Intigam kyzy
