CS 249 Query Optimization

Project proposal due Jan 20 (soft deadline, email to Remy)

This is a graduate-level research-oriented course offered in Winter 2025. The course aims to introduce foundation knowledge and discuss recent advances in query optimization. Another goal of the course is to teach the students how to effectively read research papers. The course contains lecture time by the instructor covering basics of query optimization, and guided discussion on foundational and cutting-edge research. The students are expected to conduct a research project in related topics.

Prerequisites

The students are expected to be familiar with the inner workings of relational databases, including the relational algebra, basics of query optimization, cost estimation, and query execution.

Grading

The research project takes up 100% of the grade. Please send your GitHub user name to the instructor to access a list of project ideas, and feel free to schedule a meeting to discuss your project further. The students are encouraged to send weekly progress updates on their project to the instructor to ensure they are on track to completion. There is no fixed format, you can just informally describe what you have done over the week.

Below are a list of papers we will read, in order.

[1]
Leis, V. et al. 2015. How good are query optimizers, really? Proc. VLDB Endow. 9, 3 (2015), 204–215. DOI:https://doi.org/10.14778/2850583.2850594.
[2]
Hilprecht, B. et al. 2020. DeepDB: Learn from data, not from queries! Proc. VLDB Endow. 13, 7 (2020), 992–1005. DOI:https://doi.org/10.14778/3384345.3384349.
[3]
Moerkotte, G. and Neumann, T. 2006. Analysis of two existing and one new dynamic programming algorithm for the generation of optimal bushy join trees without cross products. Proceedings of the 32nd international conference on very large data bases, seoul, korea, september 12-15, 2006 (2006), 930–941.
[4]
Khamis, M.A. et al. 2024. Pessimistic cardinality estimation.
[5]
Mancini, R. et al. 2022. Efficient massively parallel join optimization for large queries. SIGMOD ’22: International conference on management of data, philadelphia, PA, USA, june 12 - 17, 2022 (2022), 122–135.
[6]
Li, F. et al. 2016. Wander join: Online aggregation via random walks. Proceedings of the 2016 international conference on management of data (New York, NY, USA, 2016), 615–629.
[7]
Bonifati, A. et al. 2023. Threshold queries. SIGMOD Rec. 52, 1 (2023), 64–73. DOI:https://doi.org/10.1145/3604437.3604452.
[8]
Trummer, I. et al. 2021. SkinnerDB: Regret-bounded query evaluation via reinforcement learning. ACM Trans. Database Syst. 46, 3 (2021), 9:1–9:45. DOI:https://doi.org/10.1145/3464389.
[9]
Tarjan, R.E. and Yannakakis, M. 1984. Simple linear-time algorithms to test chordality of graphs, test acyclicity of hypergraphs, and selectively reduce acyclic hypergraphs. SIAM J. Comput. 13, 3 (1984), 566–579. DOI:https://doi.org/10.1137/0213035.