Talk Moshe Gabel

Date
Location

ICCSX836

Abstract: Recent years have seen an explosion in the number of connected devices, which means not only growth in velocity and volume of data, but also that data sources are increasingly spatially distributed,incurring higher communication costs. Data mining algorithms often assumethat data is centralized or that communication is inexpensive: the settin g is implicitly assumed to be a data center. In settings like wireless sen sor networks, however, communication uses limited battery power. Moreove r, most work only considers one-shot computation: computing a result oncefrom a fixed data set. Yet data is increasingly dynamic, and many applic ations require current results over a recent time window. This talk w ill focus on computing approximations over aggregated distributed data str eams with reduced communication using geometric monitoring. Geometric moni toring is a recent general framework developed in our group for monitoringdistributed streams. The key observation for geometric monitoring is thatlow-communication monitoring of even highly-complex functions is often po ssible by deriving constraints in the input domain, rather than the outpu t. These global constraints are then decomposed to local constraints on in put data of each stream (or node), that each node can check independently . We'll review three novel distributed approximations for important n on-linear functions: variance, Shannon's entropy, and least-squares regr ession. Our algorithms provide deterministic user-defined approximation bo unds, while avoiding messages unless they are needed to maintain those bo unds. Compared to the centralized solution, our algorithms reduce communi cation by up to two orders of magnitude on several real data sets that rep resent real applications, including machine failure detection, network m

Find more undergrad events on our internal portal at https://my.cs.ubc.ca.

This event's address: https://my.cs.ubc.ca/event/2017/09/talk-moshe-gabel