Machine learning (ML) underpins many applications that profoundly transform our private and working lives in almost every domain. The ML systems that execute these workloads are, however, still in their infancy and rapidly evolving. In this talk, I will make a case for a data-centric view of ML, substantiated from application, workload, and system perspectives. To understand the evolution of data systems, it is useful to review the history of database (DB) systems, modern means of data management, and repeated efforts to integrate ML primitives. Subsequently, I will give an overview of Apache SystemML as a representative ML system based on declarative specification and automatic optimization, two key pillars that have underpinned the success of DB systems for half a century. SystemML provides an R-like syntax and automatically compiles these high-level linear algebra programs into hybrid runtime plans of single-node, in-memory operations and distributed operations on Spark. Interestingly, DB and ML systems share many similarities regarding internal optimization, data access, and execution techniques. Based on seven years of experience doing research and development in SystemML, I will then share major lessons learned and draw conclusions on supporting the entire end-to-end data science lifecycle, from data integration and cleaning, through ML model training, to model deployment and serving.
Greeting and Opening
Horst BISCHOF Univ.-Prof. Dipl.-Ing. Dr.techn.
Vice Rector for Research at Graz University of Technology
Roderick BLOEM Univ.-Prof. Ph.D.
Dean of the Faculty of Computer Science and Biomedical Engineering at Graz University of Technology
Matthias BÖHM Univ.-Prof. Dipl.-Wirt.-Inf. Dr.-Ing.
BMVIT endowed chair for Data Management at the Institute of Interactive Systems and Data Science at Graz University of Technology
followed by a sparkling wine reception and a buffet
Matthias Böhm is a professor for data management in data science at Graz University of Technology, Austria, where he holds a BMVIT-endowed chair for data management. Prior to joining TU Graz in 2018, he was a research staff member at IBM Research - Almaden, USA, with a major focus on optimization and runtime techniques for declarative, large-scale machine learning. Since 2015, Matthias has also served as a PMC member for Apache SystemML. He received his Ph.D. from Dresden University of Technology, Germany, in 2011 with a dissertation on cost-based optimization of integration flows. His previous research also includes systems support for time series forecasting as well as in-memory indexing and query processing. Matthias is a recipient of the 2016 VLDB Best Paper Award and a 2016 SIGMOD Research Highlight Award.
TU Graz | Institute of Interactive Systems and Data Science, alumniTUGraz 1887
21 November 2018, 14:00 - 16:30
Ceremony Hall at Graz University of Technology, Rechbauerstrasse 12, 1st floor, 8010 Graz