376-1983-00L 6 Credits BSC D-BIOL, D-HEST

Foundations of Data Science

Registration is only possible for BSc HST students in the 5th semester or later. Prerequisite: "Introduction to Python Programming" (376-1725-00L)

Last Updated: 2026-06-03 00:07:47

Abstract

This course introduces data science and machine learning through the full workflow from raw data to insights. Students gain competency in core data operations, essential statistical tools, supervised and unsupervised methods, and ethical practice. Tutorials provide hands‑on experience using Python and IDEs.

Objective

At the end of the course, a student should be able to:
1. Describe and apply core data operations, including collection, cleaning, preprocessing, transformation, and exploratory data analysis.
2. Explain and use simple statistical models (central tendency, variance, correlations, basic probability) to analyze and interpret data.
3. Differentiate major machine learning approaches and analyze their use cases, including regression, classification, and clustering.
4. Describe and apply key supervised learning concepts, including data splits, feature engineering, model selection, and evaluation.
5. Explain and apply foundational unsupervised learning methods, including clustering and dimensionality reduction.
6. Demonstrate good data science practices, including reproducibility, documentation, version control, model validation, and ethical considerations.
7. Use essential data science tools, Python and Integrated Development Environments (IDEs), to implement basic data and machine‑learning workflows.

Content

This course offers an applied introduction to data science and machine learning, guiding students through the complete workflow from raw data to meaningful insights. Students learn to describe and apply core data operations, including data collection, cleaning, preprocessing, transformation, and exploratory data analysis, while building a foundation in simple statistical models and key machine learning methods. Throughout the course, students learn to explain the purpose of essential statistical tools and use them to analyze data, differentiate between core machine learning approaches, and apply core concepts of supervised and unsupervised learning. Emphasis is placed on good data science practices, including reproducibility, documentation, version control, model validation, and the responsible and ethical use of machine learning. Students also gain hands‑on experience using Python and integrated development environments to implement data processing and modeling workflows.

A central course project leads students through an end-to-end data science process: formulating a research question, selecting a dataset, preprocessing and analyzing the data, applying appropriate models, and presenting the results. Tutorials support this project by following the same workflow steps and providing dedicated time for guided progress. By the end of the course, students develop the analytical, technical, and methodological skills needed to conduct rigorous data-driven research.

General Information

Language: English
Levels: BSC
Frequency: Yearly recurring

Examination

Type: session examination
Mode: written, 90 minutes
Aids: None
Digital: The exam takes place on devices provided by ETH Zurich.
The final grade consists of: 50 % written examination / 50 % project work. The project must be re-done in case of repetition.

Course Components

Type      Title                        Time & Place    Hours
lecture   Foundations of Data Science  No time listed  2 h weekly
exercise  Foundations of Data Science  No time listed  2 h weekly

Offered In