VVZ API is not affiliated with ETH Zurich. Data might be outdated or incorrect. Please view the official ETHZ Vorlesungsverzeichnis for binding information.
Hacking for Sciences - An Applied Guide to Programming with Data
Last Updated: 2026-06-01 11:30:56
Abstract
The vast majority of data has been created within the last decade. As a result, more and more fields of research are starting to consider and embrace programming to process and analyse data. This course teaches applied programming with data and aims to leverage the open source tech stack to deal with this new wealth and complexity of data.
Objective
The idea behind Hacking for Science is to build a solid understanding of core technologies and concepts that helps researchers develop a data processing strategy and widens their options when working with data. The course singles out those concepts from software development that are easy to adopt and useful to non-computer scientists. The course has three major learning objectives:
- Understand the role of focal components in a data science tech toolbox. Learn how technologies like R, Python, Git version control, Docker or cloud computing could play together in your research project.
- Learn how to manage and version control source code. Hacking for Science teaches how to use Git version control to collaborate professionally, make your research reproducible and your code base persistent.
- Applied data sourcing and data transformation. Learn how to communicate with SQL databases. Learn how to consume data from different sources using machine-to-machine communication interfaces (APIs) such as the OpenStreetMap geocoding API / Routing Engine or the KOF data API for macroeconomic time series.

Non-Goals: Hacking for Science is not a Statistics, Econometrics or Machine Learning course. Experience in these fields helps, inasmuch as students will find it easier to motivate investing in programming and to come up with their own application examples, but profound methodological knowledge is not a prerequisite.
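As a minimal sketch of what "communicating with a SQL database" can look like from Python, the following uses the standard-library `sqlite3` module with a throwaway in-memory database. The table name `indicator`, its columns, the values and the helper `mean_above` are hypothetical and chosen only for illustration; they are not taken from the course material.

```python
import sqlite3

# Hypothetical example data: table, column names and values are illustrative only.
rows = [
    ("2024-01", 101.2),
    ("2024-02", 99.8),
    ("2024-03", 102.5),
]


def mean_above(threshold):
    """Return the average of all values strictly above `threshold` (None if no row matches)."""
    con = sqlite3.connect(":memory:")  # throwaway in-memory database
    try:
        con.execute("CREATE TABLE indicator (period TEXT, value REAL)")
        con.executemany("INSERT INTO indicator VALUES (?, ?)", rows)
        # Parameterised query: the driver binds the ? placeholder safely.
        cur = con.execute(
            "SELECT AVG(value) FROM indicator WHERE value > ?", (threshold,)
        )
        return cur.fetchone()[0]
    finally:
        con.close()


print(mean_above(100.0))
```

The same pattern (connect, run a parameterised query, fetch the result) carries over to server-based databases, although the connection call and driver would differ.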
Content
Hacking for Science is a guide to programming with data. It is tailored to the needs of a field in which scholars’ typical curricula do not contain a strong programming component. Yet this course argues that what the open source community calls a ‘software carpentry’ level is well within reach for a quantitative social scientist and well worth the investment: being able to code leverages field-specific expertise and fosters interdisciplinary collaboration, as source code continues to become an important communication channel.

The course contains three blocks that largely follow the three learning objectives presented above. Hacking for Science deliberately spreads its three blocks over 1-2 months so that students can work on applied examples between sessions and get the most out of the subsequent session.

The first block demonstrates the components of a modern data science tech stack, classifies technologies and gives a big-picture overview: from languages such as R and Python to container technology such as Docker.

The second block focuses on Git version control, the de facto industry standard for managing source code. Version control is not only crucial to knowledge management and reproducible research, it is also the backbone of collaboration in distributed teams.

The third and final block focuses on the data themselves and teaches how to obtain data through machine-to-machine communication. It also discusses data management in a research project.
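To illustrate the kind of machine-to-machine communication the third block refers to, here is a small sketch that composes a geocoding request and parses a JSON reply. It assumes the public OpenStreetMap Nominatim search endpoint with its `q`, `format` and `limit` parameters. The helper names are hypothetical, and the reply is a hand-written sample rather than real service output, so that the example runs without network access.

```python
import json
from urllib.parse import urlencode

# Assumption: the public Nominatim search endpoint and its q/format/limit parameters.
BASE_URL = "https://nominatim.openstreetmap.org/search"


def build_query_url(place, limit=1):
    """Compose the URL a client would request to geocode `place`."""
    return BASE_URL + "?" + urlencode({"q": place, "format": "json", "limit": limit})


def parse_coordinates(body):
    """Extract (lat, lon) as floats from the first hit of a JSON reply, or None if empty."""
    hits = json.loads(body)
    if not hits:
        return None
    return float(hits[0]["lat"]), float(hits[0]["lon"])


# Hand-written sample reply (illustrative values, not fetched from the service).
sample_body = '[{"display_name": "Zurich", "lat": "47.37", "lon": "8.54"}]'

print(build_query_url("ETH Zurich"))
print(parse_coordinates(sample_body))
```

In real use the URL would be sent with an HTTP client such as `urllib.request`, and the service's usage policy should be checked before issuing requests from scripts.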
Resources
Lecture Notes
A free and open online book (made with Quarto) is available from https://rse-book.github.io/. The book/script will be continuously updated during the course to account for participants' questions. All course materials, including slides, resources and source code, will be made available through the course website: https://rseed.ch/h4sci.html
Literature
Bannert, Matthias (2024): Research Software Engineering -- A Guide to the Open Source Ecosystem, CRC/Chapman & Hall Data Science series. ISBN-13: 978-1032261270
Learning Materials (Links)
- Main link: Course Website
- Learning environment: Course GitHub Repository
- Literature: Free, Open Online Book: Research Software Engineering (CC-BY-NC-SA licensed)
General Information
- Language: English
- Levels: DR
- Frequency: Yearly recurring
Examination
- Type: ungraded semester performance
Registration & Places
- Max Places: 40
- Signup End: 24.09.2025
Course Components
| Type | Title | Time & Place | Hours |
|---|---|---|---|
| lecture | Hacking for Sciences - An Applied Guide to Programming with Data. Irregular lecture. | | 28 h semesterly |
Offered In
- Doctorate in Management, Technology and Economics (Doktorat Management, Technologie und Ökonomie)