CPSC V 404 101 2025W
Instructor(s)
Extended Description
Whereas CPSC 304 focuses on using a relational database management system (RDBMS), CPSC 404 focuses on the internals and performance issues of an RDBMS. Topics include: disks and the storage hierarchy, the I/O cost model, buffer pool management, disk scheduling, record layouts, metadata, system catalogs, tree-structured indexes (especially B+ trees), hash indexes, query evaluation, query optimization, transaction processing, concurrency, and crash recovery.
Course learning outcomes for a typical term may include:
- Explain and justify the I/O cost-based model for query evaluation in relational database management systems (RDBMSs).
- Recommend the most useful indexes for a set of tables, given some information about the expected query mix (e.g., the importance and frequency of expected queries).
- Show how to use an index (e.g., B+ tree, extendible hash structure, linear hash structure) to look up search keys in an index and the corresponding rows in a table. Demonstrate insertions and deletions of keys in the index, by performing splitting or merging of nodes. Analyze the
- complexity of operations against these data structures.
- Explain the query evaluation and optimization decisions regarding joins, indexes, pipelining, selection, projection, etc. made by an RDBMS optimizer, given appropriate metadata.
- Estimate the I/O cost of evaluating and optimizing a given SQL query, given a set of indexes and appropriate metadata about the tables and indexes.
- Estimate the I/O cost of sorting a large file using external mergesort.
- Explain how transactions and concurrency are managed to permit greater throughput in an RDBMS. Explain how scheduling and deadlock are managed in a system that uses locks, and identify the types of anomalies and performance issues that can result.
- Define the following properties of database transactions, and give practical examples of their
- characteristics or desirable features: ACID properties, serializability, and isolation levels.
- Explain how logging works in a DBMS that uses the ARIES crash recovery algorithm. Provide
- guidelines for setting appropriate checkpoint intervals. Explain in detail, using examples, how crash recovery is performed.