
CS 410/510: Topological Methods in Data Analysis and Machine Learning
Course Information:
- Term: Spring 2025
- CRN: 35831/35842
- Time: 1600-1720 Tu/Th
- Room: Friendly 106
- Prereq (for 410): CS315 (Intermediate Algorithms) OR MATH341 (Elementary Linear Algebra) OR instructor approval
Instructor:
- Name: Tao Hou
- Office: 333 Deschutes Hall
- Email: taohou at uoregon dot edu
- Office Hours: Tu/Th 2:45PM-3:45PM
Course Description:
This course covers the emerging field of topological data analysis (TDA), with an emphasis on practical applications to data analysis. A core technique covered is persistent homology, a major tool driving the advancement of TDA. If time permits, further techniques such as Mapper, merge trees, or Reeb graphs will also be covered. Since machine learning is a key application targeted by the course, part of the teaching is devoted to the fundamental concepts of machine learning; students are not assumed to know machine learning already. The course also teaches how to use a mainstream TDA library (in our case, Gudhi).
TDA is about capturing the “global shape of data” that is meaningful in practice (after all, that’s what topology is good at). To make the course more accessible, key notions in topology (such as homology) are taught in an “intuitive” way, with excessive mathematical formality intentionally suppressed. The goal is to motivate robust applications while maintaining a sound and solid understanding of the key topological concepts that drive them. To that end, certain important concepts at the core of applications, which are sometimes implicitly assumed to be true, are explicitly laid out in this course.
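For a first intuition of what persistent homology measures, here is a minimal sketch (not taken from the course materials) of 0-dimensional persistence, i.e. tracking when connected components of a point cloud merge as the distance threshold grows. It uses only the Python standard library; the function name `zero_dim_persistence` and the sample points are hypothetical, and it assumes the convention that an edge enters the filtration at its length. In the course itself, such computations are done with Gudhi.

```python
import math
from itertools import combinations


def zero_dim_persistence(points):
    """Return (birth, death) pairs for connected components of a point cloud.

    Every point is born as its own component at threshold 0. When the
    threshold reaches the length of an edge joining two different
    components, one of them dies. One component never dies.
    """
    n = len(points)
    parent = list(range(n))  # union-find forest over the points

    def find(i):
        # Follow parent links to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise edges, sorted by Euclidean length (the filtration order).
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )

    pairs = []
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj            # merge the two components
            pairs.append((0.0, length))  # one component dies at this length
    pairs.append((0.0, math.inf))      # the last component lives forever
    return pairs


# Hypothetical sample: a tight cluster of three points and a far pair.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
diagram = zero_dim_persistence(points)
print(diagram)
```

In this sample, three components die early (at the short within-cluster distances), while one survives much longer, until the two clusters finally connect. That long-lived component is the kind of "global shape" feature that persistence is meant to separate from small-scale noise.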
Textbook:
Baris Coskunuzer and Cüneyt Gürcan Akçora. Topological Methods in Machine Learning: A Tutorial for Practitioners
Grading:
Assignments (35%), Final Project (60%), Participation (5%)
Course Announcements:
Besides making announcements in class, I will announce important course matters by email (sent automatically to the whole class through Canvas), so keep an eye on your UO inbox.
Topics and Slides:
- 1. Topology and Data: A Tour
- 2. Topological Invariant
- 3. Persistent Homology
- 4. Persistent Homology: Case Study
- 5. Gudhi Library
- 6. Intro to Machine Learning
- 6.1. Overview
- 6.2. Predictors, Empirical Risk Minimization, Validation
- 6.3. Classifiers, ERM, Loss functions
- 6.4. Neural Networks (additional illustrations of gradient descent)
- 7. Vectorizing PD for Machine Learning
- 8. Persistent Homology in Machine Learning: Case Study
- 8.1. Topological Regularizer
- I made most of the slides myself (all except those on machine learning), but I also referred to or used many images from online resources and papers. I tried to credit the sources as best I could.
- Slides on machine learning are taken from the following courses available online:
- EE104 at Stanford by Prof. Sanjay Lall and Prof. Stephen Boyd.
- EN 601.468/668 at Johns Hopkins by Prof. Philipp Koehn.
More Resources:
* You can find much more by searching Google yourself
- Books on TDA
* Most of the books listed are more theoretical than the slides and the chosen textbook
- Computational Topology: An Introduction by Herbert Edelsbrunner and John Harer
* A free copy slightly different from the officially published version can be found, e.g., here
- Computational Topology for Data Analysis by Tamal Dey and Yusu Wang
- Topological Data Analysis with Applications by Gunnar Carlsson and Mikael Vejdemo-Johansson
* Accessible through UO login
- Topological Data Analysis for Genomics and Evolution by Raul Rabadan and Andrew J. Blumberg
* Accessible through UO login
- Computational Topology: An Introduction by Herbert Edelsbrunner and John Harer
- Introduction to Topological Data Analysis lecture notes by Patrick Schnider
- The course DSD291 by Yusu Wang at UCSD has a very comprehensive list of software for TDA (see the “Software / Tools” resources section) -- it also has a good batch of slides on TDA
- Topological Data Analysis for Machine Learning by Bastian Rieck
- Topological Methods in Machine Learning and Artificial Intelligence by Gunnar Carlsson
- Topological Data Analysis by Frederic Chazal and Julien Tierny, which has a lot to do with visualizations
- Topological Data Analysis: Mapper, Persistence and Applications Tutorial by Paweł Dłotko, which has more illustrations on Mapper
- Topological Data Analysis by Peter Bubenik