People of ACM - Yuan Xie

November 14, 2017

What research area(s) are receiving most of your attention right now?

I am looking at application-driven and technology-driven novel circuits/architectures and design methodologies. My current research projects include novel architectures with emerging 3D integrated circuits (ICs) and nonvolatile memory, interconnect architecture, and heterogeneous system architecture. In particular, my students and I have put a lot of effort into novel architectures for emerging workloads, with an emphasis on artificial intelligence (AI). These include computer architectures for deep learning neural networks, neuromorphic computing, and bio-inspired computing.

In your recent book Die-Stacking Architecture, co-authored with Jishen Zhao, you predict that 3D memory stacking is a computer architecture design that will become prevalent in the coming years. Will you tell us a little about 3D memory stacking?

Die-stacking technology is also known as three-dimensional integrated circuit (3D IC) technology. The concept is to stack multiple layers of integrated circuits vertically and connect them together with vertical interconnections called through-silicon vias (TSVs). My research group has been working on die-stacking architecture for more than a decade. We’ve been looking at different ways to innovate processor architecture designs with this revolutionary technology. Recently, memory vendors have developed multi-layer 3D stacked DRAM products, such as Samsung’s High Bandwidth Memory (HBM) and Micron’s Hybrid Memory Cube (HMC). Using interposer technologies, processors can be integrated with 3D stacked memory in the same package, increasing the in-package memory capacity dramatically. The first commercial die-stacking architecture was the AMD Fury X graphics processing unit (GPU) with 4GB of HBM die-stacked memory, which was officially released in 2015. Since then, we have seen many other products that integrate 3D memory, such as Nvidia’s Volta GPU, Google’s TPU2, and, most recently, Intel and AMD’s partnership on Intel’s Kaby Lake G series, which integrates AMD’s Radeon GPU and 4GB of HBM2.
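To make the bandwidth advantage concrete, here is a rough back-of-envelope comparison (not from the book) between one first-generation HBM stack and one DDR4 channel; the figures used are nominal interface numbers (a 1024-bit HBM interface at ~1 Gb/s per pin versus a 64-bit DDR4-2400 channel), and actual products and sustained rates vary:

```c
#include <stdio.h>

/* Rough, illustrative peak-bandwidth comparison between one
 * first-generation HBM stack and one DDR4-2400 channel.
 * These are nominal interface figures, not sustained rates. */
int main(void) {
    /* HBM1: 1024-bit interface, ~1 Gb/s per pin */
    double hbm_pins = 1024.0;
    double hbm_gbps_per_pin = 1.0;                      /* Gb/s */
    double hbm_gb_s = hbm_pins * hbm_gbps_per_pin / 8.0;

    /* DDR4-2400: 64-bit channel, 2400 million transfers/s */
    double ddr_pins = 64.0;
    double ddr_mts = 2400.0;
    double ddr_gb_s = (ddr_pins / 8.0) * ddr_mts / 1000.0;

    printf("HBM stack   : ~%.0f GB/s\n", hbm_gb_s);     /* ~128 GB/s  */
    printf("DDR4 channel: ~%.1f GB/s\n", ddr_gb_s);     /* ~19.2 GB/s */
    return 0;
}
```

The roughly 6-7x gap per stack, multiplied across several in-package stacks, is what makes die-stacked memory attractive for bandwidth-hungry workloads.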

How might the introduction of radically new hardware impact the existing ecosystem of software?

It could have a significant impact on the ecosystem. For example, integrating GBs of 3D stacked memory into a processor package creates interesting research questions about how to use such a large-capacity DRAM, either as a large last-level cache or as part of the main memory. If it is used as part of the main memory, the OS should be aware of the resulting NUMA (non-uniform memory access) characteristics, and the compiler may also need to optimize code generation for the new memory hierarchy. Another example is in the area of quantum computing. Once the low-level quantum hardware is ready, it will be important to devise high-level programming languages and compilers to describe and optimize quantum algorithms.
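As a minimal sketch of the NUMA-awareness point: one way an OS can expose fast in-package memory is as a separate NUMA node, letting software explicitly place hot data there. The sketch below assumes Linux with libnuma (link with -lnuma), and the choice of node 1 as the fast in-package node is a hypothetical topology, not a standard:

```c
/* Minimal sketch: treating in-package stacked DRAM as a separate
 * NUMA node and placing a hot buffer there with libnuma.
 * Assumes Linux with libnuma (compile with -lnuma). Node 1 standing
 * in for the fast on-package memory is a hypothetical topology. */
#include <numa.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define FAST_NODE 1   /* hypothetical: in-package memory exposed as node 1 */

int main(void) {
    if (numa_available() < 0) {
        fprintf(stderr, "NUMA is not supported on this system\n");
        return EXIT_FAILURE;
    }

    size_t bytes = 64UL * 1024 * 1024;   /* 64 MB hot working set */

    /* Bind the allocation's physical pages to the fast node. */
    double *hot = numa_alloc_onnode(bytes, FAST_NODE);
    if (hot == NULL) {
        perror("numa_alloc_onnode");
        return EXIT_FAILURE;
    }

    memset(hot, 0, bytes);   /* touch pages so they are actually placed */
    /* ... a bandwidth-bound kernel operating on `hot` would go here ... */

    numa_free(hot, bytes);
    return 0;
}
```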

What are the possible architectural innovations in the AI era?

Machine learning is changing the way we implement applications, and it has made significant progress over the last decade thanks to abundant data coupled with dramatic improvements in computing power. The emerging AI workloads have motivated many architectural innovations, ranging from memory architectures to specialized hardware such as GPUs and TPUs.

I think one of the key challenges is to address the “memory wall” in hardware accelerator designs for machine learning applications, given the increasing size of datasets and models. Through joint design of algorithms and hardware architecture, techniques such as reduced bitwidth precision, increased sparsity, and compression are used to minimize the data movement overhead between computation and data storage. In addition, memory-centric designs are also possible solutions: (1) memory-rich accelerators, such as GPUs or TPUs integrated with GBs of die-stacked HBM memory; and (2) intelligent memory and storage architectures, where efficient computing logic is implemented within memory or storage so that data movement can be dramatically reduced.
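To illustrate the reduced-bitwidth idea, here is a minimal sketch of symmetric int8 quantization of float32 weights, which cuts the bytes moved between memory and compute by 4x; the scale choice and values are illustrative, not from the interview:

```c
/* Minimal sketch of reduced-bitwidth precision: symmetric linear
 * quantization of float32 weights to int8, reducing the data moved
 * between memory and compute by 4x. Values are illustrative. */
#include <math.h>
#include <stdint.h>
#include <stdio.h>

/* Quantize: q = round(w / scale), clamped to the int8 range. */
static void quantize(const float *w, int8_t *q, int n, float scale) {
    for (int i = 0; i < n; i++) {
        float v = roundf(w[i] / scale);
        if (v > 127.0f)  v = 127.0f;
        if (v < -128.0f) v = -128.0f;
        q[i] = (int8_t)v;
    }
}

int main(void) {
    float w[4] = {0.51f, -1.23f, 0.07f, 0.99f};
    int8_t q[4];

    /* Scale maps the largest-magnitude weight onto the int8 range. */
    float max_abs = 1.23f;
    float scale = max_abs / 127.0f;

    quantize(w, q, 4, scale);
    for (int i = 0; i < 4; i++)
        printf("w=%+.2f -> q=%4d -> w'=%+.2f\n",
               w[i], q[i], q[i] * scale);

    printf("bytes per weight: float32=4, int8=1 (4x less data moved)\n");
    return 0;
}
```

The same bandwidth argument drives sparsity and compression: fewer or smaller values crossing the memory interface means the accelerator spends less time waiting on data.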

Yuan Xie is a Professor of Computer Engineering and Director of the Scalable and Energy-Efficient Architecture Lab (SEAL) at the University of California, Santa Barbara (UCSB). Before joining UCSB in 2014, he worked for IBM Microelectronics and AMD Research Lab, and served on the faculty at Pennsylvania State University. His research interests include computer architecture, electronic design automation, very large scale integration (VLSI) design, and embedded systems design. He has published three books, more than 70 journal papers, and more than 200 refereed conference papers, and holds six patents in these research areas.

Xie is the Editor-in-Chief of ACM Journal on Emerging Technologies in Computing Systems (JETC) and a Senior Associate Editor of ACM Transactions on Design Automation of Electronic Systems (TODAES). His honors include an NSF CAREER Award and induction into the Halls of Fame of three prestigious computer architecture conferences (ISCA, MICRO, and HPCA). He was an ACM Distinguished Speaker from 2010 to 2016.