People of ACM - Nam Sung Kim

April 20, 2021

How did you initially become interested in computer architecture?

When I was in elementary school, my parents bought me an Apple II computer, hoping that I would develop an interest in the computer field. Back then, computers were touted and advertised as a very promising future technology in Korea and there was a boom, with many parents buying personal computers for their children. Initially, I just played games with my computer like any other kid, but I began to develop an interest in programming so that I could make my own games, and in building custom electronics hardware to enhance my computer.

Later, when I went to college and studied electrical engineering, I did very well in every computer-related subject, because everything felt so easy and comfortable to me. As my interest grew, I started working as an undergraduate student at a research lab in the college where graduate students designed x86-compatible microprocessors. I learned a lot from that experience and really enjoyed everything that I did at the lab. This experience eventually led me to come to the US and study computer architecture as a graduate student.

One of the projects you have been working on recently is developing memory technology that will not only store information like conventional memory, but also perform computing like a microprocessor at the same time. Why is this an important goal for the field?

Emerging applications such as machine learning (ML) and artificial intelligence (AI) have demanded more memory bandwidth every generation. However, to increase memory bandwidth further, we also have to consume a lot more power and energy than before, partly because memory technology scaling has been slowing down. Recently, the power and energy consumed in moving data from memory to ML/AI processors and accelerators have begun to dominate.

In other words, the current trends of demanding more memory bandwidth and increased memory power/energy consumption for ML/AI systems are not sustainable. To tackle this challenge, processing in memory (PIM) architecture has received renewed attention. PIM architecture, which processes data inside memory instead of moving the data to the processor/accelerator for processing, can significantly reduce the amount of data moved between processors/accelerators and memory and thus reduce the associated power and energy consumption.
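As a rough illustration of the data-movement argument above, the following minimal sketch compares how many bytes would cross the memory bus when summing an array on the processor versus reducing it inside memory banks first. It is a toy model, not a description of any real PIM device: the function names, the bank count, and the element size are all assumptions made for this example.

```python
# Illustrative model only: the bank count and element size are assumed
# values, not figures from the interview or from any real PIM product.


def bytes_moved_conventional(num_elements: int, elem_bytes: int) -> int:
    """Every element crosses the memory bus to reach the processor."""
    return num_elements * elem_bytes


def bytes_moved_pim(num_banks: int, elem_bytes: int) -> int:
    """Each bank reduces its own slice in place and sends one partial sum."""
    return num_banks * elem_bytes


def pim_sum(data: list[int], num_banks: int) -> int:
    """Sum `data` by splitting it across banks and combining partial sums."""
    # Striped split: bank b holds elements b, b + num_banks, b + 2*num_banks, ...
    partials = [sum(data[b::num_banks]) for b in range(num_banks)]
    # Only these per-bank partial sums would need to cross the bus.
    return sum(partials)


if __name__ == "__main__":
    data = list(range(1_000_000))
    banks, elem = 16, 4  # hypothetical: 16 banks, 4-byte elements
    assert pim_sum(data, banks) == sum(data)
    print(bytes_moved_conventional(len(data), elem))  # 4000000
    print(bytes_moved_pim(banks, elem))               # 64
```

Under these assumed numbers, the in-memory reduction moves 64 bytes instead of 4,000,000. Real savings depend on the workload and on how much of the computation can actually run inside memory.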

What are the unique aspects of the PIM architecture you developed at Samsung compared with past PIM architectures?

There were a few challenges that prevented past PIM architectures from becoming commercially successful. One challenge was that they demanded changes in the memory subsystem of processors/accelerators and/or in application code. Such changes cannot be easily driven by the memory industry, since they need to be made by other industry players such as processor design companies (e.g., Intel and AMD) and software developers.

To overcome this hurdle, I architected a PIM that does not require any change in the memory subsystem. In other words, my PIM chip can be a drop-in replacement for existing memory, as it was built on the industry-standard memory interface. Furthermore, I also led the development of the software stack that allows users to run their application code without any change. This will considerably lower the barrier for the memory industry in getting the computing industry to adopt this new memory technology.

How will computer architecture need to adapt in the coming years to allow for the growing use of artificial intelligence applications?

As the end of technology scaling is near, we will need to explore more specialized computer architectures. However, too much specialization is neither sustainable nor desirable because AI algorithms are rapidly evolving; specialized hardware design cannot keep up with the rate of AI algorithm evolution, as hardware design is expensive and time-consuming.

Therefore, in my view, the key challenge is how we balance specialization and programmability. That said, I believe recent chiplet technology will play a very important role, in the sense that we can make a custom, specialized processor or accelerator quickly and inexpensively by integrating (pre-designed) building blocks on a substrate, in the same way kids build something with LEGO blocks. In such a situation, a flexible but efficient interface will allow designers to put together basic building blocks with little effort. This will considerably reduce both the time and cost of designing new chips that are specialized for future AI applications.

Nam Sung Kim is a Professor of Electrical and Computer Engineering at the University of Illinois at Urbana-Champaign, where he leads the Future Architecture and System Technology for Scalable Computing (FAST) Lab. He is also a consultant with Samsung Electronics, where he served as Senior Vice President of Samsung’s Memory Division from 2018 to 2020. His research interests include high performance computing and energy-efficient processors, as well as memory, storage, network and system architectures. Kim has also done work in unconventional/emerging areas of computing architecture such as analog/digital hybrid computing.

He has published more than 200 peer-reviewed articles in highly selective conferences and journals, and his articles have received more than 11,000 citations. He is serving as a Program Committee Member (Industry Track) for the IEEE/ACM International Symposium on Computer Architecture (ISCA 2021).

Among his many honors, Kim is a recipient of the ACM-SIGARCH and IEEE-CS TCCA ISCA Influential Paper Award and was inducted into the Halls of Fame of the three most prestigious computer architecture conferences: the IEEE International Symposium on High-Performance Computer Architecture (HPCA), the IEEE/ACM International Symposium on Microarchitecture (MICRO), and ISCA. He was recently named an ACM Fellow for his contributions to the design and modeling of power-efficient computer architectures.