Exploration, Exploitation, and a Little Bit of State - DLS Talk by Robert Kleinberg (Cornell University)


MacLeod Building (2356 Main Mall), Room 228


Learning and decision-making problems often boil down to a balancing act between exploring new possibilities and exploiting the best known one. For more than fifty years, the multi-armed bandit problem has been the predominant theoretical model for investigating these issues. By boiling the exploration-exploitation dilemma down to its essence --- in which past choices only influence future ones via the information the learner obtains --- the basic multi-armed bandit model fails to capture the process of learning in stateful environments where past choices may limit the learner's future options or influence their expected values, as often happens in applications. In full generality these stateful learning problems are computationally hard, but we have taken some initial steps towards identifying practical and efficiently solvable (or approximable) special cases. I will report on this progress and speculate on what challenges lie ahead.


Bobby Kleinberg is a Professor of Computer Science at Cornell University. His research pertains to the design and analysis of algorithms and their applications to machine learning, economics, networking, and other areas. Prior to receiving his doctorate from MIT in 2005, Kleinberg spent three years at Akamai Technologies; he and his co-workers received the 2018 SIGCOMM Networking Systems Award for pioneering the world's first and largest Internet content distribution network. He is also the recipient of a Microsoft Research New Faculty Fellowship, an Alfred P. Sloan Foundation Fellowship, and an NSF CAREER Award.

Host: Hu Fu, UBC Computer Science

