Workshop on Online Social Systems

Keynote Talks

Three keynote talks will be given at WOSS 2012. Detailed information (talk titles, abstracts, and speaker bios) follows.

Extracting Relevant and Trustworthy Information from Microblogs

Speaker: Krishna Gummadi (Max Planck Institute for Software Systems)

Abstract: Microblogging sites like Twitter have emerged as a popular platform for exchanging real-time information on the Web. Twitter is used by hundreds of millions of users, ranging from popular news organizations and celebrities to domain experts in fields like computer science and astrophysics, as well as spammers. As a result, the quality of information posted on Twitter is highly variable, and finding users who are trusted and authoritative sources of information on specific topics is a key challenge. I will attempt to address this challenge in this two-part talk. In the first part of the talk, I will focus on understanding and combating link farming activity on Twitter. Users, especially spammers, resort to link farming to acquire large numbers of follower links in the social network. Acquiring followers not only increases the size of a user's direct audience, but also contributes to the perceived influence of the user, which in turn affects the ranking of the user's tweets by search engines. I will first discuss results from our recent studies investigating link farming activity in the Twitter network and then propose mechanisms to discourage the activity. In the second part of the talk, I will focus on the problem of finding topic experts on Twitter. I will propose a new methodology that relies on the wisdom of the Twitter crowds. Specifically, we leverage Twitter Lists, which are often carefully created by individual users to include experts on topics that interest them, and whose metadata (List names and descriptions) provides valuable semantic cues to the experts' domains of expertise. I will first describe how we mined List information to build Cognos, a scalable expert search system for Twitter, and then present results from a real-world deployment.

Krishna Gummadi leads the Networked Systems research group at the Max Planck Institute for Software Systems (MPI-SWS) in Germany. He received his Ph.D. (2005) and M.S. (2002) degrees in Computer Science and Engineering from the University of Washington, Seattle under the guidance of Professors Steven D. Gribble and Henry M. Levy. He also holds a B.Tech (2000) degree in Computer Science and Engineering from the Indian Institute of Technology, Madras. Krishna's research interests are in the measurement, analysis, design, and evaluation of complex Internet-scale systems. His current projects focus on (a) making Internet access infrastructures more transparent, (b) enabling efficient and cost-effective bulk content delivery in the Internet, (c) understanding the evolution of online social network structures and the dynamics of information flows over them, (d) leveraging social networks to design better information sharing systems, and (e) building more trustworthy cloud computing infrastructures. Krishna's work on Internet access networks, online social networks, and peer-to-peer systems has led to a number of widely cited papers. He also received best paper awards at OSDI, SIGCOMM IMW, and MMCN for his work on Internet measurements and peer-to-peer systems.

Four degrees of separation in 69 billion friendships: social sciences meet very large social networks

Speaker: Sebastiano Vigna (Università degli Studi di Milano)

Abstract: Facebook is currently the largest online social network made of people and friendship links. Since the 1960s, sociologists have tried to find how many friendship links you must traverse on average to get from any person in the world to any other, using experiments involving a few hundred people and concluding that there were six "degrees of separation". The idea was actually introduced in a short story by the Hungarian writer Frigyes Karinthy, and made popular by a play by John Guare and by a film directed by Fred Schepisi. More generally, sociologists were interested in the distance distribution of friendship: how many pairs of people are separated by k degrees? We will discuss some new, high-performance diffusion-based approximate algorithms that made it possible to conclude that on Facebook there are 3.74 degrees of separation. Part of the process that made the computation possible on inexpensive, off-the-shelf hardware was to compress the entire Facebook graph (at the time 721 million users, 69 billion friendship links) into just 211 GB, while still retaining a very high access speed.

Sebastiano Vigna obtained his PhD in Computer Science from the Università degli Studi di Milano, where he is currently Associate Professor. His interests lie in the interaction between theory and practice. He has worked on highly theoretical topics such as computability on the reals, distributed computability, self-stabilization, succinct data structures, query recommendation, and theoretical/experimental analysis of PageRank, but he is also (co)author of several widely used software tools ranging from high-performance libraries to a model-driven software generator, a search engine, a crawler, and a graph compression framework. In 2011 he collaborated on the computation of the distance distribution of the whole Facebook graph, from which it was possible to conclude that on Facebook there are just 3.74 degrees of separation.

Enabling Fast Pages and Furious Development While Supporting A Billion Users

Speaker: Subbu N. Subramanian (Facebook Inc.)

Abstract: Facebook is one of the top sites on the internet and supports more than 900 million users. Every day, it handles billions of messages and hundreds of millions of photos, and generates hundreds of terabytes of data. This data is also becoming more complex and interconnected over time. Every page the site serves requires processing large amounts of data and needs to be rendered in milliseconds. Business and practical constraints dictate that more users be served with fewer resources. In addition, product changes occur frequently and rapidly. These constraints dictate that the site requires an infrastructure that is scalable, fast, efficient, and flexible beyond anything that has been built before. In this talk, we will share key lessons from our experience in building an infrastructure that addresses the above challenges. In particular, we will discuss key components of the Facebook software architecture, instrumentation and data collection mechanisms that allow us to monitor the health of the site, and innovative tools that analyze vast amounts of data to help us pre-empt site issues and identify root causes when things go wrong. We describe how this infrastructure and these tools allow engineers to move fast and rapidly launch products as Facebook builds for a billion users and beyond.

Subbu Subramanian is currently an engineering manager on the Infrastructure team at Facebook, where he focuses on keeping the site reliable, efficient, and scalable. During his tenure at Facebook, Subbu has worked on a variety of product and infrastructure projects, including user growth, advertiser campaign management, and asynchronous job processing systems. Before joining Facebook, Subbu was a founding member of two startups in Silicon Valley. Prior to that, he was a Researcher at IBM's Almaden Research Center, where he did research on relational and semi-structured data query processing. Subbu holds a PhD in Computer Science and is an author of 20 publications and 10 patents.

The First International Workshop on Online Social Systems (WOSS 2012)
