People of ACM - Trilce Estrada
October 19, 2021
What have been some of the most exciting developments in computational science in recent years?
As in most fields nowadays, computational science has started to leverage the predictive capabilities of machine learning for several purposes: to come up with faster solutions, to come up with models for problems that are very hard to solve analytically, or even to identify new scientific questions. A clear example of this type of synergy between machine learning and computational science is AlphaFold, from Google’s DeepMind, which competed in the Critical Assessment of Techniques for Protein Structure Prediction (CASP) competition. AlphaFold determined the 3D shape of proteins from their amino acid sequence, and was able to achieve high accuracy compared to experimental structures. As we keep using machine learning for science, we are able to achieve progress at a pace never seen before. It is a really exciting time to work in this field.
However, not everything is unicorns and rainbows: computational science had been dealing with a reproducibility crisis even before machine learning was used so ubiquitously. Now that we have added another layer of complexity to the mix, we have to deal with new challenges with respect to interpretability, reproducibility, and trust in results. My focus in computational science these days is not only on accuracy and scalability, but also on making sure that, if we inject machine learning anywhere in the workflow, we are able to provide our scientists with compelling explanations of what the models are learning and how that information is being used for specific decisions. This year, in collaboration with Michela Taufer (University of Tennessee, Knoxville), Ewa Deelman (University of Southern California), Mary Hall (University of Utah), and Rafael Ferreira da Silva (University of Southern California), we held virtual world cafes to understand the perspectives and challenges that computational scientists, system administrators, and educators face with respect to these issues, and then to formulate strategies to move the field forward in a direction that can be trusted.
One of your interests is introducing automatic decision-making processes to distributed computing environments. Will you give us an example of this? What are the key challenges in this area?
There are two components to this problem: how applications perform on large systems, and to what degree we can modify an application's behavior to make it more efficient. At the system level, we’d like to understand how scaling the number of resources affects performance. For example, would specific communication patterns result in a bottleneck if we increase the number of compute nodes by one or two orders of magnitude? How does this bottleneck manifest at small scales versus larger scales? Work that we have been doing with Patrick Bridges (University of New Mexico) and Patrick Widener (Sandia National Laboratories) looks into quantifying the uncertainty and predicting the scalability of different types of applications on HPC systems.
On the other hand, at the application level, we would like to infer application behavior in order to steer its parameters and achieve faster times to solution while using fewer computational resources. To do that, we need to look inside the application itself, at its intermediate results, in a way that is efficient and does not interfere with the application. That is why it is very important that the analysis techniques that we use are lightweight and interpretable. Work that we have been doing with Michela Taufer, Ewa Deelman, and Harel Weinstein (Cornell University) focuses on interpretable in situ analysis of molecular dynamics simulations, where interpretability goes all the way from the data encodings to the analytics.
You have been involved in mentorship programs to attract more women and Latinx members to participate in computing. What have you learned from these experiences?
As counterintuitive as it sounds, my focus has not been specifically directed at attracting minorities, but at making the spaces welcoming and open for everybody and, more importantly, at making sure that students feel supported and safe. Attracting more women and minorities is a byproduct of these efforts. What I have learned is that everybody, even the most self-assured student, longs for acceptance and belonging. We all, at some point in our careers, have doubts about whether we belong in this field, especially when there is not enough representation, as is the case in high performance computing. Many of us feel inadequate or lack the assertiveness to make sure that our voices are heard. Some push hard against the barriers and rise to the top with little help. But the majority need safe spaces and allies who can be with us along the way, to help us see our own strengths and make us feel part of a community.
I have helped organize professional development and mentorship programs like the IPDPS PhD Forum (an international gathering for scientists and engineers working in parallel computing) for five years, and Broader Engagement and Mentor-Protégé at the SC conference with amazing people like Jay Lofstead (Sandia National Laboratories) and Luc Bougé (École normale supérieure de Rennes). It has been my privilege to see how both students and mentors grow and flourish from these experiences.
What have been some of the most consequential decisions of your career? What lessons about this can you share with younger colleagues who are just starting out in the field?
Choosing my advisors and mentors has been the most significant decision throughout my career. My PhD advisor, Michela Taufer, has been an amazing force, always helping me to be better. I owe her a huge debt of gratitude. Before that, my Master’s advisor, Olac Fuentes (University of Texas at El Paso), was my first mentor, and he gave me the confidence to come to the United States and start my career. I am very fortunate to have a large network of trusted mentors and peers who are always willing to offer advice, a kind word, or a little push. My advice to younger colleagues is to start building and cultivating this network. Most people really want to help others; we just need to give them an excuse to do so.
Trilce Estrada is an Associate Professor and Director of the Data Science Laboratory at the University of New Mexico. Her research interests span the intersection of machine learning, high performance computing, and big data, and their applications to interdisciplinary problems. Her overarching research goal is to solve computationally intensive and data-intensive problems in science, health, and education, especially in scenarios where resources and trained professionals are scarce.
Among her honors, Estrada received a National Science Foundation CAREER Award. In 2019 she received the ACM SIGHPC Emerging Woman Leader in Technical Computing award “for her innovative and transformative deployment of machine learning for knowledge discovery in molecular dynamic simulations and in situ analytics.”