VVZ API is not affiliated with ETH Zurich. Data might be outdated or incorrect. Please view the official ETHZ Vorlesungsverzeichnis for binding information.
Foundations of Data Science
Last Updated: 2026-02-05 16:38:12
Abstract
This course teaches the basic techniques, methodologies, and practical skills required to draw meaningful insights from real-world data, with a focus on biomedical and health data. The goals are to learn how to use acclaimed software tools (pandas, scikit-learn) to acquire, clean, analyze, explore, and visualize data; to make data-driven inferences and decisions; and to communicate results.
Objective
At the end of the course, a student should be able to:
1. Construct a coherent understanding of the techniques and software tools required to perform the fundamental steps of the data science pipeline;
2. Acquire data from different sources (data formats, APIs, open data, big data platforms);
3. Prepare data for subsequent analysis (handle missing and incorrect data; perform data quality assessments; identify and deal with outliers);
4. Solve real-world scenarios, including tackling imbalanced data and selecting suitable models;
5. Perform data interpretation (statistics, knowledge extraction, critical thinking, ad-hoc visualizations);
6. Evaluate outcomes and make decisions based on data;
7. Effectively communicate results (reporting, visualizations, publishing reproducible results, ethical concerns).
Content
1. Introduction to data science
2. Data wrangling (data acquisition, cleaning, handling missing data, outlier detection)
3. Data visualization and reporting results (graphic vocabulary, graph types, methods of data visualization)
4. Statistics (repetition of basics)
5. Machine learning (definition, supervised and unsupervised ML, training vs test set, cross-validation)
6. Regression
7. Classification
8. Clustering
9. Feature selection
10. Ethics in data science
11. Data science applications
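As an illustration of the data wrangling topics listed above (handling missing data, outlier detection), a minimal pandas sketch might look like the following. The toy data, the column names, the median imputation, and the 1.5 × IQR rule are assumptions chosen for illustration; they are not taken from the course material.

```python
import pandas as pd

# Hypothetical toy data: heart-rate readings with one missing value
# and one implausible value. Not real patient data.
df = pd.DataFrame({
    "patient_id": [1, 2, 3, 4, 5, 6],
    "heart_rate": [72.0, 68.0, None, 75.0, 300.0, 70.0],
})

# Handle missing data: impute with the column median
# (one of several possible strategies; the median is robust to the extreme value).
median_hr = df["heart_rate"].median()
df["heart_rate_filled"] = df["heart_rate"].fillna(median_hr)

# Outlier detection: flag values more than 1.5 * IQR below the first
# quartile or above the third quartile.
q1 = df["heart_rate_filled"].quantile(0.25)
q3 = df["heart_rate_filled"].quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
df["is_outlier"] = ~df["heart_rate_filled"].between(lower, upper)

print(df)
```

In this toy example only the reading of 300 is flagged. Whether a flagged value is then removed, corrected, or kept is a separate decision that depends on the data and the question being asked.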
General Information
- Language
- English
- Levels
- BSC
- Frequency
- Yearly recurring
Examination
- Type
- session examination
- Mode
- written 90 minutes
- Aids
- “Individual cheat sheet, A4, handwritten, one-side. Additionally, two Python cheat sheets will be handed out to the students prior to examination start on the examination day.”
- Digital
- The exam takes place on devices provided by ETH Zurich.
Course Components
| Type | Title | Time & Place | Hours |
|---|---|---|---|
| lecture | Foundations of Data Science | | 2 h weekly |
| exercise | Foundations of Data Science. Groups are selected in myStudies. | | 2 h weekly |