Bachelor's Programme 2025/2026

Methods of Working with Data for Economists

Delivered: 3rd year, modules 3 and 4
Audience: own campus only
Language: English
Credits: 4
Contact hours: 64

Course Syllabus

Abstract

The course consists of three parts: (1) an introduction to programming; (2) an overview of the most commonly used machine learning algorithms; and (3) time permitting, an introduction to text analysis. In the first part, students learn basic programming in the computing language R. These skills allow them to implement all the methods taught later in the course and to explore and analyse structured and unstructured data sets; they will also be useful in subsequent courses in econometrics and economics. To build intuition, we first solve each problem in a brute-force manner and then explore R packages and built-in functions that solve it most efficiently. In the second part, we focus on the most commonly used machine learning algorithms: regression techniques (parametric, nonparametric and high-dimensional), classification methods, resampling methods, model selection and unsupervised learning. In the last part, time permitting, we cover techniques for processing and analysing text data, including preprocessing, dictionary-based methods and topic modeling.
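The "brute force first, built-in second" approach described in the abstract can be illustrated with a minimal sketch. It is written in Python, the language named in the learning objectives, and the data values and the function name `sample_variance_loop` are hypothetical choices made for illustration, not course material.

```python
import statistics


def sample_variance_loop(values):
    """Compute the sample variance by hand, using explicit loops."""
    n = len(values)
    # First pass: accumulate the sum to obtain the mean.
    total = 0.0
    for v in values:
        total += v
    mean = total / n
    # Second pass: accumulate squared deviations from the mean.
    squared_dev = 0.0
    for v in values:
        squared_dev += (v - mean) ** 2
    # Divide by n - 1 for the unbiased sample variance.
    return squared_dev / (n - 1)


# Hypothetical data, chosen only for illustration.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

by_hand = sample_variance_loop(data)
built_in = statistics.variance(data)  # standard-library equivalent

print(by_hand, built_in)
```

Both routes give the same number; the loop version shows what is being computed, while the built-in call is shorter and less error-prone.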
Learning Objectives

  • The objective of this course is to provide students with a hands-on introduction to data science in economics (or, more broadly, to data science in the social sciences).
  • By the end of the course, students should be able to:
    • write simple computer programs in the computing language Python;
    • implement basic machine learning algorithms;
    • understand the assumptions and statistical properties of machine learning algorithms;
    • use machine learning algorithms to solve real-world business problems.
Expected Learning Outcomes

  • analyse the unbiasedness and consistency of estimators and obtain their asymptotic distributions
  • apply basic ideas of statistical learning
  • be able to use nonparametric techniques
  • derive the OLS estimator in both the univariate and the multivariate case
  • explain how data science is used in industry and in academia
  • solve data science problems by implementing control structures and functions
  • be able to use Python, NumPy, Pandas and Matplotlib
  • implement the OLS estimator in Python, analyse its properties using Monte Carlo simulations, and apply OLS estimation techniques to real data
  • implement logit and linear (quadratic) discriminant analysis in Python
  • implement linear model selection estimators in Python
  • implement tree-based algorithms (regression trees, classification trees, bagging, random forest, boosting) in Python
  • analyse text data in Python
  • be able to derive high-quality information from text (text analysis)
  • formulate research questions that machine learning and/or text data can help answer
  • be able to manipulate the basic data structures used in Python
  • write basic regular expressions
  • implement both model selection techniques and bootstrap methods in Python
  • implement PCA, K-means clustering and hierarchical clustering in Python
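As an illustration of the outcome on implementing OLS and studying its properties by Monte Carlo simulation, the following is a minimal sketch using NumPy. The data-generating process (standard normal regressor and error, intercept 1, slope 2), the sample sizes and the function name `simulate_ols_slopes` are all assumptions made for this example, not part of the course materials.

```python
import numpy as np


def simulate_ols_slopes(n_obs=200, n_sims=1000, beta0=1.0, beta1=2.0, seed=0):
    """Draw n_sims samples from y = beta0 + beta1 * x + u and
    return the OLS slope estimate from each sample."""
    rng = np.random.default_rng(seed)
    slopes = np.empty(n_sims)
    for s in range(n_sims):
        x = rng.normal(size=n_obs)          # regressor (assumed standard normal)
        u = rng.normal(size=n_obs)          # error term, independent of x
        y = beta0 + beta1 * x + u
        X = np.column_stack([np.ones(n_obs), x])  # add an intercept column
        # Least-squares solution of X @ beta = y, i.e. the OLS estimate.
        beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
        slopes[s] = beta_hat[1]
    return slopes


slopes = simulate_ols_slopes()
print(f"mean of slope estimates: {slopes.mean():.3f}")
print(f"std. dev. of slope estimates: {slopes.std(ddof=1):.3f}")
```

Under these assumptions the average of the simulated slopes should sit close to the true value of 2, and rerunning with a larger `n_obs` should shrink their spread, which is one informal way to see unbiasedness and consistency in a simulation.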
Course Contents

  • Introduction to Data Science In Economics
  • Control Structures and Functions
  • Vectorized Computation, Data Aggregation and Data Visualization (Part 1)
  • Vectorized Computation, Data Aggregation and Data Visualization (Part 2)
  • Introduction to Statistical Learning
  • Large Sample Properties of OLS
  • Classification
  • Resampling Methods
  • Linear Model Selection and Regularization
  • Nonparametric Estimation
  • Tree Based Methods
  • Unsupervised Learning
  • Introduction to Text Analysis: From Text to Data (Part 1)
  • Introduction to Text Analysis: From Text to Data (Part 2)
Assessment Elements

  • blocking Final Exam
    To receive a passing grade for the course, the student must sit all parts of the examination.
  • non-blocking assignment 1
  • non-blocking assignment 2
  • non-blocking assignment 3
  • non-blocking assignment 4
Interim Assessment

  • 2025/2026 4th module
    0.6 * Final Exam + 0.1 * assignment 1 + 0.1 * assignment 2 + 0.1 * assignment 3 + 0.1 * assignment 4
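The weighting formula can be checked with a short Python sketch. The weights come from the formula above; the scores are hypothetical, the names `WEIGHTS` and `interim_grade` are illustrative, and the blocking rule for the final exam is not modelled here.

```python
# Weights taken from the interim assessment formula.
WEIGHTS = {
    "final_exam": 0.6,
    "assignment_1": 0.1,
    "assignment_2": 0.1,
    "assignment_3": 0.1,
    "assignment_4": 0.1,
}


def interim_grade(scores):
    """Return the weighted sum of the assessment scores."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise KeyError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores, for illustration only.
example_scores = {
    "final_exam": 7,
    "assignment_1": 8,
    "assignment_2": 9,
    "assignment_3": 6,
    "assignment_4": 10,
}

print(f"{interim_grade(example_scores):.2f}")  # prints 7.50
```

Because the weights sum to 1, the result stays on the same scale as the individual scores.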
Bibliography

Recommended Core Bibliography

  • Lutz, M. (2008). Learning Python (3rd ed.). Beijing: O’Reilly Media. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=415392

Recommended Additional Bibliography

  • Bengfort, B., Bilbro, R., & Ojeda, T. (2018). Applied Text Analysis with Python: Enabling Language-Aware Data Products with Machine Learning. O’Reilly Media. ISBN 9781491962992. Retrieved from https://search.ebscohost.com/login.aspx?direct=true&db=nlebk&AN=1827695

Authors

  • Lagios Nikolas