People of ACM - Martha Larson

October 8, 2019

What is multimedia computing, and why is it an important area of computer science?

Multimedia is an area of data science that studies the combination of different signals, including images, text, video, audio and sensor signals. Multimedia research applies machine learning and artificial intelligence techniques to analyze multimedia content and make the information it contains available for use. Many researchers in the field focus on content created by people for the purpose of communication, for example, videos that we upload online or music playlists. Multimedia also includes the data that arise when people interact with digital content, including clicks, views, likes, comments and transactions, commonly used by recommender systems.

Multimedia researchers take the position that a single modality allows only narrow understanding of a communicated message or partial analysis of a sensed event. Multimedia computing is important because it simultaneously integrates information from multiple modalities, supporting analysis of the broader picture. A key challenge in multimedia research is keeping the user central: multimedia information must be used in the service of the people who produced it, which includes protecting them from harm, such as privacy violations.

Early in your career, you carried out linguistics research in syntax and semantics. How does this background inform your work in search engines and recommendation systems?

Linguistics and computer science are known to be sister disciplines, and we often think of them as linked via their common concern with formal language theory and logic. Less often, we stop to reflect that both disciplines study systems. My linguistics background means that I understand human language as a constantly evolving system that is created and perpetuated by the people who use it. Underlying language is a complex, but consistent, architecture. However, when we use language, we have enormous freedom to defy expectations and to bend language structures to meet the communicative needs of the moment.

Linguists understand general principles but can’t predict exactly how specific individuals will put their thoughts into words, or exactly how natural language as a whole will evolve. Search engines and recommender systems share some of the same properties. Without people using them, they cannot exist. Researchers can develop new algorithms but can’t predict how users will experience the system. Instead, it is necessary to anticipate that user interactions can carry the system off in unexpected directions.

Why is benchmarking important in the multimedia field? Will you tell us a little about MediaEval, which you co-founded and coordinate?

Benchmarks define a standard problem formulation, dataset, and evaluation procedure. Benchmarking is important for multimedia because we need to be able to fairly compare algorithms in order to determine if new algorithms are actually improving the state of the art. These days, benchmarks often take the form of shared tasks, where a group of researchers decide to develop and release a dataset together, a productive pooling of resources.

The shortcoming of benchmarks is that they tend to focus on a narrow set of tasks, and reward research that achieves incremental improvement as measured on a single dataset with respect to a single metric. MediaEval attempts to address these shortcomings. We fight the oversimplification of research goals in the field of multimedia access and retrieval. We emphasize qualitative insight: excelling in MediaEval is more about understanding the problem than about achieving the highest score. Certainly, we have leaderboards, like any data science competition. However, when we come together to discuss our results at our yearly workshop, we highlight the work of participants who make innovative contributions, even though they might not be at the top of the leaderboard. Gareth Jones, at Dublin City University, Ireland, and I launched MediaEval in 2010. That means this year is our 10th anniversary. As we celebrate this landmark, we are also looking toward the future.

This will be the first year that the ACM Multimedia Conference will include an Artifact Review and Badging initiative as part of a reproducibility track. What was the genesis of this new track?

As General Chairs this year, Benoit Huet of EURECOM, France, Laurent Amsaleg of CNRS-IRISA, France, and I were specifically looking to introduce innovations that would keep the conference on the leading edge of scientific achievement. We felt that the conference should explicitly support authors in releasing resources to the community that contribute to the reproduction of their research.

The Multimedia Systems Conference (MMSys) was the first ACM SIGMM conference to introduce Artifact Review and Badging, and the push to expand badging to ACM Multimedia was led by Laurent together with Björn Þór Jónsson, at the IT University of Copenhagen, Denmark. Laurent and Björn Þór, who are serving as ACM Multimedia’s first Reproducibility Chairs, recognized the enormous amount of time and dedication that it takes to achieve true reproducibility. When they designed the new Reproducibility track, they focused on providing authors and reviewers a generous amount of time, so that they can actually collaborate in a substantial way on the reproduction. The result is a reproducibility paper that is presented at the conference, this year in a special reproducibility poster session.

Data science, artificial intelligence, and multimedia present many opportunities, some of which we haven’t yet imagined. What advice do you offer your students on what to study to prepare for future careers?

When I started my Bachelor’s degree, the World Wide Web did not yet exist, nor was it conceivable. I was well into my graduate school years before the Web really came into its own and the first search engines started to appear. I had no way of directly shaping what I studied to prepare myself for what I do today. It’s likely to be the same for today’s students. They are faced with the challenge of preparing themselves for a future that they cannot anticipate.

My advice is to enrich your education by diving into subjects that enjoy invariance (persist) over generations. Find joy in math. Learn how to express yourself in writing. Learn to solve problems. Become familiar with the full range of disciplines that study human systems, including law, sociology, and anthropology. Understand why literature and art captivate us, and how they give us insights into ourselves. Dive into philosophy and develop your own moral compass, which will guide you in making challenging ethical decisions that you encounter during your career. Study history: the present is the history of the future. Critical thinking skills are, well, critical. But there is no single right way to develop them. What is important is that you “learn how to learn.”

If you have the chance, also learn how to teach. You will develop the habit of sharing knowledge and the ability to see through the eyes of people with many different perspectives. Remember you are preparing yourself not only for a career, but also for life, which we hope includes a long and happy retirement. Develop your capacity for curiosity and wonder, and an appreciation of what is truly good in this world.

Martha Larson is Professor of Multimedia Information Technology at Radboud University in Nijmegen, the Netherlands. She is also a member of the Multimedia Computing Group at Delft University of Technology, the Netherlands. Her research centers on search engines and systems for retrieval and recommendation that provide users with intelligent access to multimedia content. Her current focus includes modeling user intent and protecting user privacy.

Larson has made a particular contribution in the area of search technologies that can find information in collections of speech documents, and is a co-author (with Gareth J. F. Jones) of Spoken Content Retrieval: A Survey of Techniques and Technologies. She serves on the board of the Centre for Language and Speech Technology, and is also coordinator of the Benchmarking Initiative for Multimedia Evaluation (MediaEval).

Larson has been active with the ACM International Conference on Multimedia (ACMMM) for a decade. She has served as Area Chair both for crowdsourcing and for privacy and, in 2016, she was Co-chair of the Brave New Ideas track. Over the years, she has co-organized workshops at the conference on spoken audio content and on crowdsourcing for multimedia. This year she is serving as a General Co-chair of the conference.