VVZ API is not affiliated with ETH Zurich. Data might be outdated or incorrect. Please view the official ETHZ Vorlesungsverzeichnis for binding information.

376-1983-00L 8 Credits BSC D-HEST, D-BIOL
Foundations of Data Science

Last Updated: 2026-02-05 16:38:12

Abstract

This course teaches the basic techniques, methodologies, and practical skills required to draw meaningful insights from real-world data, with a focus on biomedical and health data. The goals are to learn how to use acclaimed software tools (pandas, scikit-learn) to acquire, clean, analyze, explore, and visualize data; to make data-driven inferences and decisions; and to communicate results.

Objective

At the end of the course, a student should be able to:
1. Construct a coherent understanding of the techniques and software tools required to perform the fundamental steps of the data science pipeline;
2. Acquire data from different sources (data formats, APIs, open data, big data platforms);
3. Prepare data for subsequent analysis (handling missing and incorrect data, performing data quality assessments, identifying and dealing with outliers);
4. Solve real-world scenarios, including tackling imbalanced data and selecting suitable models;
5. Perform data interpretation (statistics, knowledge extraction, critical thinking, ad-hoc visualizations);
6. Evaluate outcomes and make decisions based on data;
7. Effectively communicate results (reporting, visualizations, publishing reproducible results, ethical concerns).
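
Objective 3 (handling missing data and dealing with outliers) can be illustrated with a small sketch. It is not taken from the course material: the function name `clean_column`, the median imputation, the 1.5 × IQR outlier rule, and the sample values are all assumptions chosen for illustration. It uses only the Python standard library, whereas the course itself names pandas for this kind of work.

```python
from statistics import median, quantiles


def clean_column(values):
    """Impute missing entries with the median, then flag outliers (IQR rule).

    Hypothetical helper for illustration only; `None` marks a missing value.
    """
    observed = [v for v in values if v is not None]
    fill = median(observed)
    imputed = [fill if v is None else v for v in values]

    # Quartiles of the observed values; the 1.5 * IQR fence is an assumed rule.
    q1, _, q3 = quantiles(observed, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr

    outliers = [v for v in imputed if v < low or v > high]
    return imputed, outliers


# Made-up resting heart rates with two missing entries and one implausible value.
heart_rates = [62, 71, None, 68, 75, 240, 66, None, 70]
imputed, outliers = clean_column(heart_rates)
print(imputed)
print(outliers)
```

Whether a flagged value is removed, corrected, or kept is a judgment call that depends on the data source; the sketch only flags it.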

Content

1. Introduction to data science
2. Data wrangling (data acquisition, cleaning, handling missing data, outlier detection)
3. Data visualization and reporting results (graphic vocabulary, graph types, methods of data visualization)
4. Statistics (repetition of basics)
5. Machine learning (definition, supervised and unsupervised ML, training vs test set, cross-validation)
6. Regression
7. Classification
8. Clustering
9. Feature selection
10. Ethics in data science
11. Data science applications
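
The "training vs test set, cross-validation" item in topic 5 can be sketched in a few lines. This is an illustrative, standard-library-only version of k-fold splitting, not the course's own code: the function name `k_fold_indices`, the seed, and the sample sizes are assumptions. In practice, scikit-learn (which the course names) provides this through `sklearn.model_selection.KFold`.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation.

    Hypothetical helper: shuffles the indices once, deals them into k folds,
    and uses each fold exactly once as the test set.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


splits = list(k_fold_indices(n_samples=10, k=5))
for train, test in splits:
    # A model would be fitted on `train` and scored on `test` here.
    print(len(train), len(test))
```

Each sample appears in exactly one test fold, so every observation is used for evaluation once and for training k - 1 times.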

General Information

Language
English
Levels
BSC
Frequency
Yearly recurring

Examination

Type
session examination
Mode
written 90 minutes
Aids
“Individual cheat sheet, A4, handwritten, one-side. Additionally, two Python cheat sheets will be handed out to the students prior to examination start on the examination day.”
Digital
The exam takes place on devices provided by ETH Zurich.
The final grade consists of: 80 % written examination / 20 % project work. The project must be re-done in case of repetition. A bonus of 0.25 points can be acquired by handing in the 6 learning tasks (homework).
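
The 80 % / 20 % weighting can be worked through with a short sketch. Only the weights and the 0.25 bonus come from the text above. The sketch assumes that the bonus is added to the weighted grade, that the result is capped at 6.0, and that no rounding is applied; the function name `final_grade` and the example grades are hypothetical. The official regulations are binding.

```python
def final_grade(exam, project, bonus_earned=False):
    """Combine exam and project grades using the stated 80/20 weighting.

    Assumptions (not confirmed by the source): the 0.25 bonus is added to the
    weighted grade, the result is capped at 6.0, and no rounding is applied.
    """
    weighted = 0.8 * exam + 0.2 * project
    if bonus_earned:
        weighted += 0.25
    return min(weighted, 6.0)


# Hypothetical grades: 4.5 in the written exam, 5.5 in the project.
without_bonus = final_grade(4.5, 5.5)
with_bonus = final_grade(4.5, 5.5, bonus_earned=True)
print(without_bonus, with_bonus)
```

With these example grades the weighted result is about 4.7 without the bonus and about 4.95 with it.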

Course Components

Type | Title | Time & Place | Hours
lecture | Foundations of Data Science | Tue 09:45-11:30 (HPV G 4) | 2 h weekly
exercise | Foundations of Data Science | Thu 14:15-16:00 (ETZ K 91, HG D 3.1, HG D 5.1, IFW B 42, or IFW C 31) | 2 h weekly
Exercise groups are selected in myStudies.

Offered In