• A
  • A
  • A
  • АБB
  • АБB
  • АБB
  • А
  • А
  • А
  • А
  • А
Обычная версия сайта
Магистратура 2025/2026

Упорядоченные множества в анализе данных

Статус: Курс обязательный (Науки о данных (Data Science))
Когда читается: 1-й курс, 1, 2 модуль
Охват аудитории: для своего кампуса
Язык: английский

Course Syllabus

Abstract

“Ordered sets for data analysis” is a first-semester master course within “Data Science” master program. Order relations are ubiquitous, we meet them when we consider numbers, Boolean algebras, partitions, multisets, graphs, logical formulas, and many other mathematical entities. On the one hand, order relations are used for representing data and knowledge, on the other hand they serve as important tools for describing models and methods of data analysis, such as decision trees, random forests, version spaces, association rules, etc. We start from the basic notions of binary relations, their representations with matrices and graphs, consider properties of relations and operations on them, giving examples on problems of data analysis. We come then to the notions of order, semilattice, and lattice. Very important data models like numbers, vectors of numbers, set algebras, partitions make lattices. We consider an important type of lattices, called concept lattices, which constitute the main subject of Formal Concept Analysis (FCA), a modern domain of applied lattice theory. Concept lattices provide very useful tools for data analysis and pattern mining. On the one hand, formal concepts, the elements of a concept lattice, describe sets of objects sharing sets of common attributes, thus presenting a nice model of (bi)clustering. Ordered sets of concepts give a hierarchy of classes of a subject domain, which makes a backbone of domain ontology. On the other hand, concept lattices represent dependencies on attributes, both precise (called implications), and imprecise, known in data mining as association rules. We consider most important algorithms and algorithmic problems related to applications of FCA. Concept lattices allow us to describe some fundamental models of data analysis and pattern mining such as decision trees, random forests, version spaces, and concept-based hypotheses. Since a serious limitation of many methods of pattern mining is computational complexity, complexity analysis makes an important leitmotif of the course. Throughout the course we consider applications of the introduced notions in various applied domains, such as text analysis, medicine, pharmacology, etc. Homework consists of both theoretical problems and practical problems in designing particular.