VVZ API is not affiliated with ETH Zurich. Data might be outdated or incorrect. Please view the official ETHZ Vorlesungsverzeichnis for binding information.
Foundations of Data Science
Last Updated: 2026-06-03 00:07:47
Abstract
This course introduces data science and machine learning through the full workflow from raw data to insights. Students gain competency in core data operations, essential statistical tools, supervised and unsupervised methods, and ethical practice. Tutorials provide hands‑on experience using Python and IDEs.
Objective
At the end of the course, a student should be able to:
1. Describe and apply core data operations, including collection, cleaning, preprocessing, transformation, and exploratory data analysis.
2. Explain and use simple statistical models (central tendency, variance, correlations, basic probability) to analyze and interpret data.
3. Differentiate major machine learning approaches and analyze their use cases, including regression, classification, and clustering.
4. Describe and apply key supervised learning concepts, including data splits, feature engineering, model selection, and evaluation.
5. Explain and apply foundational unsupervised learning methods, including clustering and dimensionality reduction.
6. Demonstrate good data science practices, including reproducibility, documentation, version control, model validation, and ethical considerations.
7. Use essential data science tools, Python and Integrated Development Environments (IDEs), to implement basic data and machine‑learning workflows.
Content
This course offers an applied introduction to data science and machine learning, guiding students through the complete workflow from raw data to meaningful insights. Students learn to describe and apply core data operations, including data collection, cleaning, preprocessing, transformation, and exploratory data analysis, while building a foundation in simple statistical models and key machine learning methods. Students also learn to explain the purpose of essential statistical tools and use them to analyze data, to differentiate between core machine learning approaches, and to apply core concepts of supervised and unsupervised learning.
Emphasis is placed on good data science practices, including reproducibility, documentation, version control, model validation, and the responsible and ethical use of machine learning. Students also gain hands‑on experience using Python and integrated development environments to implement data processing and modeling workflows.
A central course project leads students through an end-to-end data science process: formulating a research question, selecting a dataset, preprocessing and analyzing the data, applying appropriate models, and presenting the results. Tutorials support this project by following the same workflow steps and providing dedicated time for guided progress. By the end of the course, students develop the analytical, technical, and methodological skills needed to conduct rigorous data-driven research.
General Information
- Language: English
- Levels: BSC
- Frequency: Yearly recurring
Examination
- Type: session examination
- Mode: written, 90 minutes
- Aids: None
- Digital: The exam takes place on devices provided by ETH Zurich.
Course Components
| Type | Title | Time & Place | Hours |
|---|---|---|---|
| lecture | Foundations of Data Science | No time listed | 2 h weekly |
| exercise | Foundations of Data Science | No time listed | 2 h weekly |