People of ACM - Martin Wicke
January 16, 2018
How did your earlier work in geometric modeling and simulations prepare you to be part of the TensorFlow team, developing software that conducts deep learning and neural network research?
Problems in modeling and simulation for computer graphics and computational geometry are often phrased as nonlinear optimization problems. Machine learning problems are also (mostly nonlinear) optimization problems. Though the methods differ, intuition about the complexity and behavior of the underlying linear algebra operations is useful. GPUs were used extensively in graphics long before their utility was recognized more broadly, and experience with that computing platform also helps.
What was the Google AI Team’s biggest challenge in developing TensorFlow 1.0 and how did you overcome it?
Google manages datacenters slightly differently from the rest of the world. In particular, the experience and capabilities offered to users are very different from, say, running computations on a public cloud computing service. TensorFlow was initially developed with these constraints and features in mind, which led to some mismatches with the publicly available supporting infrastructure. Many of the initial problems with slow execution can be attributed to these differences.
This was a visibility problem more than a technical one. Once TensorFlow was open-sourced, these issues surfaced and we could fix them. The lesson here is that software stacks are quite far from being meaningfully standardized, and that abstractions, where they exist, are very leaky, especially once performance is considered and especially at the intersection of high performance and distributed computing.
More than 480 people around the world contributed to TensorFlow in the first year after Google made it open source software. What are the main ways in which these contributions have improved the software?
TensorFlow would not be where it is now without the contributions of the community. To date, over 1,100 people have contributed to TensorFlow code directly. There are two main ways that our contributors improve TensorFlow: they identify problems and fix them, and they contribute new features. The TensorFlow team alone would not have had the time or knowledge to implement many of these. I cannot possibly credit every contribution here, but some of the more significant ones were Windows support as well as many different language bindings.
More generally, TensorFlow is portable and proven to work in many contexts and systems because we have a large and diverse community using it.
What will scalable machine learning software look like in five years (that is, how and where will it be used)?
I expect that we will see more specialized hardware, and I also expect that users will have to worry less and less about the details of the hardware. This is similar to what happened with computer graphics and GPUs with the emergence of OpenGL and DirectX (Khronos' and Microsoft’s graphics libraries, respectively), and again for general purpose computing on GPUs with the emergence of CUDA (a library for parallel computing created by NVIDIA).
The abstraction level in machine learning frameworks will inevitably rise, and automated model construction will make designing the details of neural network architectures a thing of the past. Many more people will use machine learning, but fewer people will work on tweaking implementation details.
As a consequence, I believe machine learning will become pretty ubiquitous, driven by the gains that we see wherever it is applied today. I expect to see many more productivity tools in the wider sense, with assistants providing significant help with cognitive tasks.
Martin Wicke is a software engineer working on Google AI. His core research interests include machine intelligence, distributed systems and computer vision. He is a key member of the team that developed TensorFlow, a scalable machine learning software package that Google released in 2015 as an open source project.
Wicke completed a PhD at ETH Zurich on geometric modeling and simulation using particle methods. He later worked in a variety of areas including learning system behavior to improve routing in sensor networks, as well as video processing and analysis. Wicke was the presenter of the ACM Learning Webinar, TensorFlow: A Framework for Scalable Machine Learning.