Research by faculty member
More recently, I have become interested in integrating the paradigms of database-style querying, IR-style search, and RecSys-style recommendation, while taking the user's context into account. Context here means both the user's social neighborhood and her current information needs or task. The opinions and "intelligence" of the crowd are natural resources to harness in this setting. Stay tuned for more information on what drives my research these days.
At the PROOF Centre of Excellence for the prevention of organ failure, I have been leading a team of computational scientists, statisticians and systems biologists since 2008, conducting genomics studies on heart, lung and kidney failure. The team oversees every aspect of "Big Data", from storage and quality control to data mining, model building, and the discovery and validation of biomarker panels. The team has developed state-of-the-art computational pipelines for every step of biomarker discovery and validation, and those pipelines have been applied successfully in numerous studies. The flagship biomarker project of the PROOF Centre is the development of biomarker panels for diagnosing acute rejection in heart and kidney transplant patients. Since 2004, with total funding in excess of 20 million Canadian dollars, we have worked diligently on every step of the process, from discovery to validation and clinical implementation. The work included an international trial involving hundreds of patients in Canada, the US, Australia and India. The panel for heart transplants, in particular, has been made into a new laboratory test, to be given to patients at St. Paul’s Hospital starting this year.
A totally different direction of my research is the body of studies on summarizing and extracting information from written conversations, such as emails, blogs and tweets. Over the past 15 years, the group led by Carenini, a UBC colleague, and myself has published extensively in all the premier international forums. See here for more details. Our projects were partially funded by Google, IBM and SAP. This line of work has culminated in our book on summarizing text conversations. Since its publication in 2011, the book has become the third most downloaded book in the Morgan & Claypool series on data management.
Lastly, I also lead a research program with four focal areas: (A) aggregate query processing for wireless sensor networks; (B) topic modeling and sentiment extraction for text streams; (C) outlier detection and explanation; and (D) prefix-based forecasting.
- Making sense of data that is stored in relational databases or XML is difficult. For example, if civil engineers are trying to extract information about where two pieces of a building intersect, they may need to find 10 different elements in a schema that contains thousands of options. This project seeks to allow users to understand their schemas well enough to query them. This is joint work with Zainab Zolaktaf.
- In many analysis tasks, a user has an aggregation query for which she already knows the correct answer in at least one case. Determining why the answer she gets differs from the one provided by the "Oracle" is a frustrating and error-prone process. This project seeks to give users feedback on why their aggregation queries are not returning the answers they expect. This is joint work with Omar AlOmeir.