Master
2024/2025



Basic Tools for Data Analysis
Type:
Elective course (Comparative Politics of Eurasia)
Area of studies:
Political Science
When:
1 year, 1 module
Mode of studies:
offline
Open to:
students of one campus
Instructors:
Nikita Zubarev
Master’s programme:
Comparative Politics of Eurasia
Language:
English
ECTS credits:
3
Course Syllabus
Abstract
This course introduces students to the fundamental tools and techniques used in data analysis, providing a solid foundation for understanding and interpreting data. Through hands-on activities and practical exercises, participants will learn how to collect, clean, analyze, and visualize data using popular software tools such as Excel, R, and Python.
Learning Objectives
- The class aims to introduce students to the foundations of exploratory data analysis, visualization, data wrangling, and the basics of statistical inference.
Expected Learning Outcomes
- Students are able to create a PowerPoint presentation that follows current design trends.
- Students are familiar with navigating Excel, filtering and sorting data, and function syntax. They can use basic functions (SUM, AVERAGE, COUNT, MAX, MIN, SUBTOTAL, IF, SUMIF) and absolute and relative references.
- Students are able to use the following Excel features: the AND, OR, NOT, IFS, and nested IF functions; pivot tables and slicers; the VLOOKUP function; data visualization; and table merging.
- Students are familiar with R syntax and ways of getting help, as well as the notions of objects, vectors, and data types in R. They can code basic contingency tables. Students can use the main functions from dplyr.
- Students are able to use the main dplyr functions. Students are capable of creating the main plot types using ggplot2. Students can calculate the main descriptive statistics.
- Students are familiar with open source data such as survey projects (WVS, EVS, ESS, Barometers) and government statistics (Rosstat). Students can use functions from the R package rvest to scrape tables and textual data from the web. They can use join functions to combine data from different tables.
- Students are familiar with applications of ggplot2 in the context of working with textual data. Students can pre-process and explore a text corpus using basic text statistics and the tidytext package.
- Students can identify appropriate data for fitting an OLS model. Students can check statistical assumptions for variables to be included in statistical analysis. Students can fit and interpret a basic OLS model in R. Students can present the results of an OLS regression using the stargazer and jtools packages.
Course Contents
- Introduction to Excel 1
- Introduction to Excel 2
- Introduction to R and Exploratory Data Analysis in R 1
- Data manipulation and visualization
- Open source data and data scraping
- Introduction to text mining in R
- OLS Regression
- PowerPoint
Assessment Elements
- Data exploration and manipulation tests. Students will take 4 at-home tests covering the 4 main topics of this course: 1) Excel, 2) exploratory data analysis in R, 3) text mining and data scraping, 4) OLS regression.
- Exam. Take-home project in which students work on a problem using Excel and R. Students will provide Excel spreadsheets and R scripts for evaluation. The results of the data analysis from Excel and R should be organized in a PowerPoint presentation.