Algorithmic Techniques for Taming Big Data (DS-563/CS-543, Spring 2025)

Course Description: Growing amounts of available data make efficient processing a significant challenge. In many cases, it is no longer feasible to design algorithms that can freely access the entire data set. Instead, we often have to resort to techniques that reduce the amount of data, such as sampling, sketching, dimensionality reduction, and coresets. The course will also explore scenarios in which large data sets are distributed across several machines, or even geographical locations, and the goal is to design efficient communication protocols or MapReduce algorithms.

The course will include a final project and programming assignments in which we will evaluate the performance of our techniques on publicly available data sets. Throughout the course, we will examine strategies for implementing, in practice, techniques that have theoretical guarantees.

Syllabus: [pdf]

Instructor: Krzysztof Onak (konak@bu.edu)
Office Hours: See post @6 on Piazza

Lecture: Tuesday/Thursday 3:30–4:45pm, CDS 264
Meetings:
    • Tuesdays 3:30–4:45pm, CDS 701
    • Wednesdays 1:25–2:15pm, SOC B61
    • Wednesdays 2:30–3:20pm, CGS 423
    • Thursdays 3:30–4:45pm, CDS 701

Piazza (announcements and discussions): https://piazza.com/bu/spring2025/ds563cs543

Lectures

Useful Auxiliary Materials