Master's programme
2025/2026





Sequencing Data Analysis 2
Status:
Compulsory course (Data Analysis in Biology and Medicine)
Where taught:
Faculty of Computer Science
When taught:
2nd year, modules 1 and 2
Audience:
all HSE University campuses
Language:
English
Contact hours:
56
Course Syllabus
Abstract
The course "Bioinformatics for Next Generation Sequencing" gives students a comprehensive understanding of modern methods for analyzing massively parallel (next-generation) sequencing (NGS) data. It covers the key areas of NGS application: transcriptomics (RNA-seq, scRNA-seq), genomics, metagenomics, epigenomics (ChIP-seq, ATAC-seq, Hi-C), and the study of RNA-chromatin interactions. Students will acquire practical skills in working with raw data (FASTQ), quality control, alignment, variant interpretation (VCF), gene expression analysis, metagenomic data processing, and handling standard NGS data formats (BAM, SAM, VCF). Special attention is paid to working with modern bioinformatics pipelines and tools (GATK, CellRanger, Kallisto, QIIME2), statistical analysis methods, and the basics of machine learning for bioinformatics tasks.
Learning Objectives
- Students develop systematic knowledge and practical skills in the bioinformatic analysis of next-generation sequencing data, including raw data processing, statistical analysis, interpretation of results, and solving applied research problems in genomics and transcriptomics.
Expected Learning Outcomes
- Master the standard pipeline for analyzing genomic variants.
- Master the skills of working with basic bioinformatic pipelines (GATK, RNA-seq analysis, ChIP-seq analysis).
- Be able to interpret the results of variant analysis (VCF), differential expression, and ChIP-seq/ATAC-seq peaks.
- Know how the main NGS platforms (Illumina, PacBio, Oxford Nanopore) work and how the main data formats are organized.
- Get hands-on experience working with metagenomic data.
- Be able to assess sequencing quality and preprocess the data (trimming, filtering).
- Understand and apply machine learning techniques to reduce dimensionality and analyze high-dimensional NGS data.
- Develop epigenomics data analysis skills.
- Know the methods of single-cell RNA-seq data analysis and metagenomics (16S, WGS).
- Master the basic principles of scRNA-seq data analysis.
- Develop skills in primary RNA-seq data analysis.
Course Contents
- Experimental NGS methods and primary data analysis
- Transcriptomics (RNA-seq)
- Single-cell RNA-seq (scRNA-seq)
- Machine learning for NGS data
- Epigenomics: Hi-C
- Metagenomics
- Genomics and Medical applications
- Epigenomics: ChIP-seq and ATAC-seq
- The RNA-chromatin interactome
Assessment Elements
- Homework 1 (HW1). RNA-seq data analysis: quality control of FASTQ files, alignment to the reference genome, gene-level quantification, and basic differential expression analysis.
- Homework 2 (HW2). Single-cell RNA-seq data analysis: filtering, normalization, dimensionality reduction, clustering, and cluster annotation.
- Homework 3 (HW3). Analysis of metagenomic (16S) data in QIIME2: denoising, construction of a phylogenetic tree, and calculation of alpha and beta diversity.
- Homework 4 (HW4). ChIP-seq/ATAC-seq data analysis: quality control, alignment, peak calling, peak annotation, and motif analysis.
- Homework 5 (HW5). Calling genomic variants using the GATK pipeline: BQSR, HaplotypeCaller, and filtering and annotation of the VCF file.
- Exam. A theoretical exam covering all course topics.
Interim Assessment
- 2025/2026, 2nd module: 0.12 * Homework 1 (HW1) + 0.12 * Homework 2 (HW2) + 0.18 * Homework 3 (HW3) + 0.09 * Homework 4 (HW4) + 0.09 * Homework 5 (HW5) + 0.4 * Exam
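As an illustration, the weighting above can be checked with a short Python sketch. The weights are taken from the formula. The dictionary keys (`HW1` to `HW5`, `Exam`) are shorthand for the assessment elements, and the function name `interim_grade` and the example scores are hypothetical. The syllabus does not state a rounding rule, so none is applied here.

```python
# Weights copied from the interim assessment formula above.
WEIGHTS = {
    "HW1": 0.12,
    "HW2": 0.12,
    "HW3": 0.18,
    "HW4": 0.09,
    "HW5": 0.09,
    "Exam": 0.40,
}


def interim_grade(scores):
    """Return the weighted sum of the element scores (no rounding applied)."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# The weights sum to 1, so the result stays on the same scale as the inputs.
assert abs(sum(WEIGHTS.values()) - 1.0) < 1e-9

# Hypothetical scores, for illustration only.
example = {"HW1": 8, "HW2": 7, "HW3": 9, "HW4": 6, "HW5": 10, "Exam": 7}
print(round(interim_grade(example), 2))
```

With these weights the five homework assignments together account for 60% of the grade and the exam for the remaining 40%.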
Bibliography
Recommended Core Bibliography
- Systematic evaluation of spliced alignment programs for RNA-seq data. (2018). Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsbas&AN=edsbas.EB285ADD
Recommended Additional Bibliography
- The new technologies of high-throughput single-cell RNA sequencing. (2019). https://doi.org/10.1016/j.cell.2016.11.048