People of ACM - Blair MacIntyre

March 21, 2017

You have been working in the augmented reality field since 1991, and you have said you are interested in “understanding the fundamental (and often subtle) problems that have made augmented reality (AR) systems difficult to design, deploy and use.” What key technological advances have come about in the past five years that have resulted in AR and virtual reality (VR) technologies being more widely introduced to the market?

The fundamental advance that has brought VR and AR out of the labs is the rapid improvement of the components in modern smartphones. At their core, the affordable, high-quality modern VR displays are built around display panels created for these phones; other technologies are certainly necessary as well, but the availability of cheap, fast, high-resolution and high-quality displays drove this current round of consumer-oriented VR.

The phones themselves have provided a platform for AR experiences, which are inherently mobile, and while a phone is not the ideal platform for AR, modern smartphones have enabled AR experiences to be created and delivered to millions of people using technology they already have. The miniaturization that leads to faster, smaller, cheaper and more powerful phones also allowed displays like Google Glass to be created. Both of these platforms—the smartphone and Glass’s “heads-up display”—allow simple AR experiences, but still fall short of the vision people have for AR: being immersed in a mixed reality that blends 3D graphics with the world around you.

The next wave of AR technology, exemplified by displays such as Microsoft’s Hololens, will finally be able to deliver these kinds of experiences. For these displays, the driving technology is advanced spatial mapping and sensor fusion, allowing them to precisely locate the display within a space. With Hololens, we can place graphics in a room with the user so that they are stable and feel more like a part of the world; with more common smartphone-based AR or 2D heads-up displays, the media is either fixed to the 2D screen or appears to be floating near the user, detached from the world.
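
To make the distinction concrete, here is a minimal sketch of world-anchored versus screen-fixed content, written against three.js (an illustrative choice; none of the platforms mentioned here are tied to it). Once a tracker supplies the camera pose, content parented to the scene stays put in the room, while content parented to the camera behaves like a 2D heads-up display.

```typescript
import * as THREE from 'three';

const scene = new THREE.Scene();
const camera = new THREE.PerspectiveCamera(60, 16 / 9, 0.1, 100);
scene.add(camera);

// World-anchored content: parented to the scene at a fixed pose,
// so it stays put as the tracked camera moves through the room.
const anchored = new THREE.Mesh(
  new THREE.BoxGeometry(0.2, 0.2, 0.2),
  new THREE.MeshBasicMaterial({ color: 0x44aa88 })
);
anchored.position.set(0, 1.2, -2); // a spot in the room, in meters
scene.add(anchored);

// Heads-up content: parented to the camera, so it is glued to the
// screen and follows the user's view, detached from the world.
const hud = new THREE.Mesh(
  new THREE.PlaneGeometry(0.1, 0.05),
  new THREE.MeshBasicMaterial({ color: 0xffffff })
);
hud.position.set(0, -0.15, -0.5);
camera.add(hud);

// Each frame, a tracker (SLAM / sensor fusion) supplies the camera pose.
function onPoseUpdate(position: THREE.Vector3, orientation: THREE.Quaternion) {
  camera.position.copy(position);
  camera.quaternion.copy(orientation);
  // The anchored box keeps its world pose; the HUD moves with the camera.
}
```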

The Argon web browser, which you developed at Georgia Tech, exposes AR technology on the web, allowing people to overlay 3D web content on the video from their phone’s camera, combining geospatial data and computer vision tracking technologies. For example, a publisher or bookstore owner could enhance their website with AR capabilities. Someone looking at a book on their phone might be able to use AR to help find it in the store, or they might look at a book they were interested in and have interesting information from that publisher’s website appear around the book. Why do you think Argon will be a model for the use of AR on mobile devices in the future?

The vision most people have about AR is that “relevant information” will appear in the world around them, but that vision ignores the practical issue of content creation: where does this information come from, how does the AR system know what to display, and how do developers create rich, interactive content that provides real value? Content developers must be able to create content that provides value—of the sort we see in applications or websites—beyond just “3D content elements in the world.” And users must have ways of choosing what sources to interact with. When we look at the web, we see a model that can naturally be applied to this problem: the browser’s notion of search can be extended from text and location-based queries to include sensors, images and objects the user is looking at (such as the book above), with the user controlling what search engines are used and when and where they search. The content units are not “3D objects” but rather coordinated content (akin to web applications) that leverage what they know about the user and their context to select and display relevant content elements (such as the information about the book). Because this content is displayed by an application running in a general-purpose engine (the browser), developers can create whatever interactions they desire across all the content displayed.
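
A minimal sketch of what that extended notion of search could look like appears below. Every name in it (ARContext, ARSearchProvider, and so on) is hypothetical, invented for illustration rather than taken from Argon’s actual API: the point is that the user’s context, not the developer’s choices, drives which providers are queried.

```typescript
// A hypothetical data model for context-driven AR search; the names
// are illustrative, not Argon's actual API.

interface ARContext {
  location?: { latitude: number; longitude: number };
  cameraFrame?: ImageBitmap;     // what the user is looking at
  recognizedObjects?: string[];  // e.g. a book identified by its ISBN
}

interface ARContentEntry {
  sourceUrl: string;   // the web application that will render it
  anchor: 'geospatial' | 'object' | 'screen';
  payload: unknown;    // app-defined content, e.g. reviews of the book
}

interface ARSearchProvider {
  name: string;
  query(context: ARContext): Promise<ARContentEntry[]>;
}

// The browser queries only the providers the user has opted into,
// so the user controls what is searched, and when and where.
async function search(
  providers: ARSearchProvider[],
  context: ARContext
): Promise<ARContentEntry[]> {
  const results = await Promise.all(providers.map((p) => p.query(context)));
  return results.flat();
}
```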

The missing piece, which has driven our work on Argon, is giving the user the ability to display multiple AR web applications at once, and to exercise control over the representation of reality the applications are rendered over. If a user wants to simultaneously display two or more applications that provide repair instructions for their car, or tourism information for a museum, that should be up to them. If they want to view the tourist information for a city from the comfort of their home, by overlaying it on a 3D model of the city or Google Street View images, they should be able to. This ability to control what content to display, and over which representation of reality, will only be possible through a general infrastructure such as the web. The application sandboxes being created by AR and VR companies currently give complete control to the developers and are structured to coerce users into selecting a specific set of hardware and software components.
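
One way to sketch that separation (again with hypothetical interfaces, not Argon’s or any vendor’s actual API): each application renders its augmentations against an abstract “reality” layer, and the user, not the developer, chooses both the reality (live camera, 3D city model, Street View imagery) and the set of applications composited over it.

```typescript
// Hypothetical sketch of user-controlled composition; the names and
// interfaces are illustrative only.

interface Reality {
  name: string;                             // "live-camera", "city-model", ...
  renderBackground(frame: number): void;
}

interface ARWebApp {
  url: string;
  renderAugmentations(frame: number): void; // draws over the reality
}

class MixedRealityBrowser {
  constructor(
    private reality: Reality,      // chosen by the user...
    private apps: ARWebApp[] = []  // ...as is the set of running apps
  ) {}

  setReality(reality: Reality) {   // e.g. switch from the live camera at the
    this.reality = reality;        // museum to a 3D model viewed from home
  }

  addApp(app: ARWebApp) {
    this.apps.push(app);
  }

  renderFrame(frame: number) {
    this.reality.renderBackground(frame);
    for (const app of this.apps) {
      app.renderAugmentations(frame); // several apps share one view
    }
  }
}
```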

My longer-term goal, however, is not to position the web against these native ecosystems. Rather, I hope to demonstrate the power of decoupling content from platform, and encourage companies like Microsoft, Google and Apple to adopt similar abstractions that give users the ability to display many AR and VR applications at once, choose what displays and interaction approaches they want, and control what information these applications receive about them. In that way, the web also serves as a platform to demonstrate a user-centered approach to creating and delivering AR and VR content and applications.

Do you think semi-transparent glasses using AR, such as Google Glass or Microsoft Hololens, will eventually become more widely adopted? What technical challenges need to be overcome for this kind of wearable technology to become popular with consumers?

Yes, they will become widely adopted, although it will take some time before people use AR displays on a regular basis. Some people talk about finding the killer application for AR, but the biggest challenge with wearable AR displays is not finding compelling applications, but getting the technology to the point where it no longer gets in the way. On one hand, the displays need to get small and powerful enough that they are no more obtrusive than a pair of glasses, have a wide field of view, can be interacted with through voice and gesture, and can run all day without needing to be recharged. These are the obvious technical and engineering challenges that many companies are trying to tackle.

On the other hand, the content that is displayed must not interfere with what a user is doing in their life. When you talk to someone, 3D graphics probably shouldn’t occlude your view of them (even though they might occlude other things in the world). If you focus on something near to you, augmentations that are closer or further away should fade out of focus, just as the physical world does. And when you walk down the street, content must not be distracting; it needs to blend as seamlessly with the world as the superimposed graphics we see on sports broadcasts on TV. These kinds of problems require sophisticated sensing, modeling, display and semantic understanding of the world, and are, together, quite difficult.
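
The focus behavior described here can be approximated with a simple depth-based fade. The sketch below illustrates the idea only; the constants, the linear falloff and the eye-tracker-estimated focus depth are all assumptions, not a published algorithm.

```typescript
// Illustrative depth-of-field fade for augmentations: content far from
// the user's estimated focal depth is faded out, mimicking how the
// physical world blurs outside the focal plane.

function augmentationOpacity(
  contentDepth: number,  // meters from the eye to the augmentation
  focusDepth: number,    // meters to what the user is focusing on
  tolerance = 0.5,       // fully visible within this depth band
  falloff = 2.0          // fully faded this far outside the band
): number {
  const deviation = Math.abs(contentDepth - focusDepth);
  if (deviation <= tolerance) return 1.0;
  const t = (deviation - tolerance) / falloff;
  return Math.max(0, 1 - t); // linear fade; a real system might blur instead
}

// Example: the user focuses on something 2 m away.
console.log(augmentationOpacity(2.1, 2.0)); // 1.0 (near the focal plane)
console.log(augmentationOpacity(5.0, 2.0)); // 0.0 (well behind it, faded out)
```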

The good thing, however, is that we don’t need to get to the point of having displays people can wear all the time to have real impact in the world. Current displays, like Glass and Hololens, can solve real problems in enterprise and vertical markets, like logistics, medicine and education, without satisfying the needs of consumers. These sorts of applications will arrive quite soon, and hopefully they will be successful enough to support the continued development of these technologies until they are ready for consumer use.

Looking into the future, what might be some exciting examples of virtual and augmented reality technologies that will enhance our lives 20 years from now?

Like many people, I am most excited about a future where AR and VR technologies are part of one larger MR ecosystem that is used to bring people together. There will be situations where AR or VR will be the “right” technology, such as for games that take place in imaginary virtual worlds, or for augmenting a specific physical thing near you, such as repairing a broken piece of equipment. But many imagined uses of AR and VR have to do with bringing people together or helping them understand the world.

For collaborative applications, it’s easy to imagine both AR and VR interfaces, where groups of distributed people come together in different ways. One person might be in VR and see the shared space and other participants virtually; another might bring their collaborators into their local space as avatars and show remote physical items virtually; other participants might be together in a physical space and see the rest virtually. This blend of technologies, catering to the situation, preferences and needs of each user, will create new kinds of collaboration and sharing and synchronous social experiences.

Distributed meetings are an obvious example, but the more casual opportunities for sharing might be more significant. Imagine all those times you are working on something, physical or digital, for your job or for fun, and you’d like to talk to someone about it. Now imagine that you can virtually get together as easily as you can send a text message today: they can share, comment on and discuss things with you, with all the rich interaction of gestures and facial expressions, tone of voice and movement, just as if you were together. This is what most excites me about the future of these technologies.

Blair MacIntyre is a Professor in the School of Interactive Computing and founder of the Augmented Environments Lab at the Georgia Institute of Technology. His research focuses on the design and implementation of interactive mixed-reality (MR) and augmented-reality (AR) technologies and experiences. He has collaborated on a variety of AR gaming and entertainment projects, including serving as co-director of the Georgia Tech Game Studio and the Qualcomm Augmented Reality Game Studio. Recently, he has been working to add AR technologies to web-based mobile applications through the Argon web-browser projects, and is on leave from Georgia Tech at Mozilla to continue working to add AR technologies to the web.

MacIntyre’s honors include receiving a National Science Foundation (NSF) Career Award. At ACM’s 50 Years of the Turing Award Celebration in June, he will be moderating a panel titled “Augmented Reality: From Gaming to Cognitive Aids and Beyond.”