Security of LLM-Based Systems

Bachelor's programme 2025/2026

Best by the criterion "Usefulness of the course for your future career"

Best by the criterion "Usefulness of the course for broadening your horizons and all-round development"

Best by the criterion "Novelty of the knowledge gained"

Status: Elective course (Applied Mathematics and Informatics)

Taught by: Department of Big Data and Information Retrieval

Where taught: Faculty of Computer Science

When taught: 4th year, module 3

Audience: own campus only

Language: English

Credits: 4

Contact hours: 20


Abstract

LLMs are becoming more powerful, more reliable, and cheaper, and are therefore being used in a growing range of applications. At the same time, the way LLMs work introduces new classes of vulnerabilities that require dedicated protection. In this short hands-on course, we will look at how (and why) jailbreaks and prompt injections work, how to detect and prevent them, and how to use standard frameworks to assess the security of LLM systems.

Learning Objectives

To understand standard methods of protecting LLM systems
To apply software packages to protect LLM applications
To understand information security frameworks aimed at securing LLM-based systems

Expected Learning Outcomes

To know the main vulnerabilities and security issues of LLM-based systems
To understand attack techniques and methods, such as jailbreaks and prompt injections
To understand the conceptual causes of LLM-specific security issues
To apply software tools to find vulnerabilities in deployed LLMs

Course Contents

Attacks on LLMs: prompt injections, jailbreaks, and others.
Attacks on LLM systems: MCP, agents, IDEs.
Assessing the security of an LLM system: threat modeling.
Defending LLMs and LLM systems: alignment, guardrails, and so on.
Defending LLMs and LLM systems: mechanistic interpretability, sparse autoencoders, and technical AI safety.
Classical adversarial attacks and attacks on computer vision systems.
Synthetic media (deepfakes, voice fakes) and related threats. Breakthrough scale, case studies. Detection.
Generative models in cybersecurity. Conclusion.

Assessment Elements

Test
Test on topics 1-5
Homework 1
LLM: attacks
Homework 2
LLM: defenses
Homework 3
LLM: mechinterp
Test 2
Test on topics 6-8
Bonus
Complete Lakera Gandalf and write a writeup

Interim Assessment

2025/2026 3rd module
0.2 * Homework 2 + 0 * Bonus + 0.3 * Homework 1 + 0.15 * Test + 0.2 * Homework 3 + 0.15 * Test 2
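
As an illustration of how the formula above combines the assessment elements, here is a minimal sketch in Python. The weights are copied from the formula; the function name `interim_grade` and the sample scores are hypothetical, and the sketch assumes all elements are graded on the same scale. Note that Bonus has weight 0 in the formula, so it does not change the result.

```python
# Weights copied from the Interim Assessment formula.
WEIGHTS = {
    "Homework 1": 0.3,
    "Homework 2": 0.2,
    "Homework 3": 0.2,
    "Test": 0.15,
    "Test 2": 0.15,
    "Bonus": 0.0,
}


def interim_grade(scores):
    """Return the weighted sum of scores; missing elements count as 0."""
    return sum(weight * scores.get(name, 0.0) for name, weight in WEIGHTS.items())


# Hypothetical scores, all on the same (assumed) scale.
example_scores = {
    "Homework 1": 8,
    "Homework 2": 7,
    "Homework 3": 9,
    "Test": 6,
    "Test 2": 10,
    "Bonus": 10,
}

print(round(interim_grade(example_scores), 2))  # prints 8.0
```

The three homework assignments together carry 0.7 of the grade and the two tests carry the remaining 0.3.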

Bibliography

Recommended Core Bibliography

Shoeybi, M., Patwary, M., Puri, R., LeGresley, P., Casper, J., & Catanzaro, B. (2019). Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism.

Recommended Additional Bibliography

Gift, N. (2019). Прагматичный ИИ: машинное обучение и облачные технологии [Pragmatic AI: Machine Learning and Cloud Technologies].
