CS Tech Talk: Introduction to Analytics and Big Data


DMP 110

'Big Data' is the term for a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications. The challenges include capture, curation, storage, search, sharing, transfer, analysis, and visualization. The trend to larger data sets is due to the additional information derivable from analysis of a single large set of related data, as compared to separate smaller sets with the same total amount of data. This allows correlations to be found that help spot business trends, determine quality of research, prevent diseases, combat crime, and determine real-time roadway traffic conditions.[1]

In this introductory discussion, the data challenges from a business perspective will be presented, real-world business problems will be explored, and Hadoop and its associated technologies will be explained. HDFS, MapReduce, and the related ecosystem projects will be introduced. This discussion is targeted at engineering professionals who are interested in these technologies. They may have little or no background experience, but are looking to leverage these technologies in their organizations. Ample time will be given for discussion of Hadoop use cases, and the audience is invited to bring problems for group comment or to submit them in advance.


Geoff Fawkes is a technology executive with a background in software development and business operational management spanning the past 20+ years. He is currently a technology advisor with the Federal Government of Canada’s National Research Council program (NRC-IRAP), providing business and technology advisory services along with financial support to growth-oriented Canadian small and medium-sized enterprises. Previously, he was Director, Engineering at Teradata, leading the Toronto and Beijing architecture, engineering, and testing teams. In addition to his engineering responsibilities, he served as the site IT infrastructure lead for the Teradata analytical database hardware and IT support systems for the Canada-wide sales and R&D groups. He has 10+ years of expertise in offshore software development in India, China, and Brazil, having structured and mentored teams to grow their expertise globally.

