UBC Computer Science publishes 4 papers at Very Large Data Bases conference

Dr. Laks Lakshmanan and former undergraduate student Sraavan Sridhar represented UBC Computer Science in London, UK, with papers in graph partitioning, information diffusion, subgraph algorithms and large language models

From healthcare information to social networks to cloud storage, databases are everywhere. With more information pouring onto the World Wide Web and new scientific data accumulating every day, the relationships among data in large databases can become very complex. Computer science researchers study database management to understand how best to organize, analyze and query large sets of data.

From September 1 to 5 in London, UK, UBC Computer Science researchers presented their projects at the International Conference on Very Large Data Bases (VLDB), where scientists from around the world gathered to discuss the latest advances in topics such as data management, scalable data science, data mining and analytics.

Dr. Laks Lakshmanan and collaborators proposed an efficient algorithm for finding a set of key influential users in a network, a task that arises in scenarios such as viral marketing. The paper, “Efficient and Effective Algorithms for A Family of Influence Maximization Problems with A Matroid Constraint,” detailed a new method of finding these critical individuals and showed that it outperformed all competing methods in terms of quality, time and memory usage. A key highlight is that the researchers deployed the algorithm on an online gaming platform, where it vastly improved the platform's user engagement.

Another paper from Dr. Lakshmanan and collaborators presented an in-depth study of algorithms for Densest Subgraph Discovery, a problem that seeks the densest, or most well-connected, subgraph of a graph. Finding the densest subgraph can provide insight into various scientific problems, such as analyzing social networks or finding patterns in biological networks. In the paper, “In-depth Analysis of Densest Subgraph Discovery in a Unified Framework,” the researchers identified new variants of these algorithms and offered new avenues of research in subgraph algorithms.

Lastly, Dr. Lakshmanan and collaborators investigated sets of large language models (LLMs) of varying capabilities and costs. While LLMs have revolutionized problem solving, they often come with prodigious costs. A critical question in addressing those costs is whether a group of smaller LLMs can solve tasks as well as a single large, powerful LLM. In the paper, “ThriftLLM: On Cost-Effective Selection of Large Language Models for Classification Queries,” led by UBC Computer Science postdoctoral fellow Keke Huang, former Ph.D. student Dujian Ding and current Master’s student Yifei Li, the researchers formalized the problem and solved it by designing a new aggregation scheme for combining individual LLM responses and measuring aggregation quality. They showed that their algorithm, ThriftLLM, achieves high performance at a fraction of the cost of a single large, powerful LLM.

Dr. Margo Seltzer’s group presented a new way of partitioning graphs, or complex networks of data. The paper, “CUTTANA: Scalable Graph Partitioning for Faster Distributed Graph Databases and Analytics,” led by former UBC Computer Science graduate student Milad Rezaei Hajidehi and former undergraduate student Sraavan Sridhar, demonstrates that CUTTANA, a high-quality and scalable graph partitioner, improves both partitioning quality and the runtime performance of graph analytics compared with existing graph partitioners.