Modern Data Analytics 2023

Organization: Prof. Dr. Dan Olteanu, Prof. Dr. Michael Böhlen

This seminar surveys recent research developments at the intersection of databases and machine learning. In particular, it considers two distinct lines of work:

The application of machine learning to databases: Use models to predict query performance or to replace traditional components of a database management system, such as indices.
The application of databases to machine learning: Use database techniques to improve the runtime performance of training machine learning models.

Learning outcome: The goal of the seminar is to expose the students to the recent trends in academia and industry on rethinking database management systems and on how to effectively unify knowledge on both machine learning and databases to scale data science workloads.

Target audience: MSc in Data Science students (the maximum number of students is restricted to 18)

Semester: This seminar will be offered in Fall 2023.

Teaching format: Each participant prepares a presentation based on a research paper, answers follow-up technical questions, reads the other papers in the seminar session, and actively participates in the technical discussions in the seminar. Each participant has a buddy, who helps improve the presentation by making suggestions and attending dry runs. The first complete draft of the presentation is due one month after the kickoff meeting. The participants will select the best presentation of the seminar, which will receive a prize.

Registration: Please register as required by the department. In addition, please browse the papers mentioned below. In the kickoff meeting, the papers will be assigned to students, so make sure you get assigned to a paper you want.

Meetings: The first meeting will be on Tuesday, September 19, 2023 from 10:15 to 12:00 in room BIN 1.D.29. In this meeting, the organizers will present an overview of the topics to be investigated in the seminar and answer questions from the participants. In this session, students will be assigned to papers.

The slides used in the first meeting are here.

The student presentations will take place on Saturday, November 11, and Saturday, December 2, 2023 in BIN 2.A.01.

Participation at all three meetings is compulsory. The assessment depends on the quality of the presentation, active participation during the seminar, and input as a buddy.

How to read papers and give talks

How to read papers:

Focus questions to help identify the main contributions of a paper
Survival kit: includes tips on how to read technical sections and the "three-pass approach" to tie it all together
Reading Research Papers by Andrew Ng

How to give talks:

These two articles have a number of good suggestions.
This video is pretty good as well.
How To Speak by Patrick Winston - a newer version of Patrick's talk

Papers to be read by all students

The following are individual paper assignments organized by topic. When an entry lists two papers, the presenter may either present both together (as they use similar ideas) or present only one of them.

Topic 1: Learned Data Structures used in Database Systems

1.1 The Case for Learned Index Structures
1.2 ALEX: An Updatable Adaptive Learned Index

Topic 2: Learned Query Optimization and Evaluation

Topic 3: In-database Machine Learning and Linear Algebra

Paper Assignments, Buddies, and Supervisors

Group 1: Saturday, November 11, 2023

Paper	Presenter	Buddy	Supervisors
1.1	Wanke Tong	Glenn Bucagu	Johannes Marti
1.2	Yizhi Zhang	Christian Berger	Haozhe Zhang
2.1	Christian Berger	Yizhi Zhang	Michael Böhlen, Xinyu Zhu
2.2	Noah Mamie	Solveig Helland	Johannes Marti
2.3	Yingying Liu	YuinKwan Chan	Johannes Marti
2.4	Glenn Bucagu	Haozhe Luo	Haozhe Zhang
2.5	Solveig Helland	Noah Mamie	Haozhe Zhang
3.10	Haozhe Luo	Wanke Tong	Christoph Mayer

Group 2: Saturday, December 2, 2023

Paper	Presenter	Buddy	Supervisors
3.1	Alen Frey	Giuseppe Doda	Michael Böhlen
3.2	Carol Ernst	Prakhar Bhandari	Ahmet Kara
3.3	YuinKwan Chan	Yingying Liu	Ahmet Kara
3.4	Prakhar Bhandari	Chi Zhang	Dan Olteanu
3.5	Chi Zhang	Carol Ernst	Dan Olteanu
3.6	Renzhi Hang	Raphael Imfeld	Christoph Mayer
3.9	Raphael Imfeld	Renzhi Hang	Michael Böhlen, Xinyu Zhu
2.6	Giuseppe Doda	Alen Frey	Christoph Mayer

Department of Informatics Data Systems and Theory
