Collaborative Filtering and the Missing at Random Assumption
By Benjamin Marlin
In this talk I will present a broad
overview of my research on the problem of non-random missing data in
collaborative filtering. I will introduce the concept of a missing data
mechanism following Little and Rubin, show how the missing at random
assumption can easily be violated in a recommender system, and discuss the
implications for modeling, learning, inference, prediction, and error
estimation. I will describe work done at Yahoo! Research and Yahoo! Music to
collect a novel data set that allows us to study these questions in the context
of a real recommender system. Finally, I will describe some of the models we
have studied that incorporate simple non-random missing data mechanisms, and
discuss empirical results on both the collaborative prediction and collaborative
ranking tasks.