2024/2025

Fundamentals of Working with Data

Status: Mago-Lego
Taught in: Module 1
Audience: own campus only
Language: Russian
Credits: 3

Course Syllabus

Abstract

This course introduces students to the fundamental tools and techniques used in data analysis, providing a solid foundation for understanding and interpreting data. Through hands-on activities and practical exercises, participants will learn how to collect, clean, analyze, and visualize data using popular software tools such as Excel, R, and Python.

Learning Objectives

  • The class introduces students to the foundations of exploratory data analysis, visualization, data wrangling, and the basics of statistical inference.

Expected Learning Outcomes

  • Students can create a PowerPoint presentation that follows current design trends.
  • Students can navigate Excel and filter and sort data. They know function syntax, absolute and relative references, and the basic functions SUM, AVERAGE, COUNT, MAX, MIN, SUBTOTAL, IF, and SUMIF.
  • Students can use the following Excel features: the AND, OR, NOT, and IFS functions and nested IF functions; pivot tables and slicers; the VLOOKUP function; data visualization; and merging tables.
  • Students are familiar with R syntax and ways of getting help, and with the notions of objects, vectors, and data types in R. They can build basic contingency tables. Students can use the main functions from dplyr.
  • Students can use the main dplyr functions. Students can create the main plot types using ggplot2. Students can calculate the main descriptive statistics.
  • Students are familiar with open-source data such as survey projects (WVS, EVS, ESS, Barometers) and government statistics (Rosstat). Students can use functions from the R package rvest to scrape tables and textual data from the web. They can use join functions to combine data from different tables.
  • Students are familiar with applications of ggplot2 to textual data. Students can pre-process and explore a text corpus using basic text statistics and the tidytext package.
  • Students can identify appropriate data for fitting an OLS model. Students can check the statistical assumptions for variables to be included in the analysis. Students can fit and interpret a basic OLS model in R. Students can present the results of an OLS regression using the stargazer and jtools packages.

Course Contents

  • Introduction to Excel 1
  • Introduction to Excel 2
  • Introduction to R and Exploratory Data Analysis in R 1
  • Data manipulation and visualization
  • Open source data and data scraping
  • Introduction to text mining in R
  • OLS Regression
  • PowerPoint

Assessment Elements

  • non-blocking Data exploration and manipulation tests
    Students take four at-home tests covering the four main topics of the course: 1) Excel, 2) exploratory data analysis in R, 3) text mining and data scraping, 4) OLS regression.
  • non-blocking Exam
    A take-home project in which students work on a problem using Excel and R. Students submit their Excel spreadsheets and R scripts for evaluation. The results of the data analysis in Excel and R should be organized in a PowerPoint presentation.

Interim Assessment

  • 2024/2025 1st module
    0.6 * Data exploration and manipulation tests + 0.4 * Exam
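
The weighting above can be sketched in a few lines of Python. This is only an illustration: the function name final_grade and the sample scores are hypothetical, and the sketch assumes both components are already expressed on the same scale. The syllabus does not say how the four tests are combined into one score or how the result is rounded, so neither step is modeled here.

```python
# Weights taken from the syllabus formula: 0.6 * tests + 0.4 * exam.
TESTS_WEIGHT = 0.6
EXAM_WEIGHT = 0.4


def final_grade(tests_score: float, exam_score: float) -> float:
    """Return the weighted course grade (hypothetical helper).

    Assumption: both inputs are on the same scale. The syllabus states
    neither the scale nor a rounding rule, so the result is unrounded.
    """
    return TESTS_WEIGHT * tests_score + EXAM_WEIGHT * exam_score


if __name__ == "__main__":
    # Hypothetical scores, for illustration only.
    print(final_grade(tests_score=8.0, exam_score=6.0))
```

With hypothetical scores of 8 for the tests and 6 for the exam, the formula works out to 0.6 * 8 + 0.4 * 6 = 7.2.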

Reading List

Recommended Core Reading

  • Discovering statistics using R, Field, A., 2012
  • R in action: Data analysis and graphics with R, Kabacoff, R.I., 2015

Recommended Additional Reading

  • Quantitative finance: a simulation-based introduction using excel, Davison, M., 2014

Authors

  • Зубарев Никита Сергеевич