People of ACM - Matthew Roughan

February 26, 2019

You’ve written extensively about internet traffic matrices. What recent development in our understanding of traffic matrices will impact network measurement research in the coming years?

Traffic matrices are an important component for network management, both for efficiency and for reliability, and I’ve always liked working on topics that can have a practical impact. But most of the problems that originally got me into the area (problems related to measuring traffic matrices and using those measurements in practice) are now solved. The topic that has been interesting me recently is how to synthesize artificial traffic matrices.

We don’t have a lot of good (public) datasets for internet traffic matrices, and even if we did have a few more, they’re high-dimensional data, so we’ll never really have enough of them. Synthesis is one way to work with the matrices without needing vast numbers of measurements. Of course, the synthetic matrices have to live in a rich and interesting space, and generating that is a challenge. Most recently I have been excited about using maximum entropy. This is a great technique because it allows you to impose conditions or constraints on a model based on what you know (for instance, traffic can’t be negative). However, the really nice part is that it generalizes Laplace’s principle of indifference (sometimes attributed to Jacob Bernoulli), which is the foundation of discrete probability and (you could argue) which underlies Bayesian statistics. It’s the principle that says that unless you know better, assume that the probability a tossed coin will come up heads is one-half, and that a standard die will roll a six is one-sixth.

Maximum entropy takes the principle of indifference to the next level, and in doing so it allows you to use data you have (for instance, about the bias of a coin) in a model that doesn’t rely on any other (hidden) assumptions. That’s a great property when you are modeling. But it also has appeal because it connects up to information theory (which we’ve used for estimation of traffic matrices) and ideas about complexity that I really want to explore further.
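
To make that concrete, here is a minimal sketch of how maximum-entropy synthesis can work in practice. It is illustrative only: the node count, traffic total and gravity-model means below are invented, and it is not a reproduction of any specific method from Roughan’s papers. The only constraints imposed are non-negativity and a target mean for each entry, and under exactly those constraints the maximum-entropy distribution for each entry is exponential.

```python
import numpy as np

# Illustrative maximum-entropy traffic-matrix synthesis.
# Constraints: entries are non-negative and have a prescribed mean
# (taken here from a simple gravity model). Under those constraints the
# maximum-entropy distribution for each entry is exponential.

rng = np.random.default_rng(seed=1)

n = 5            # number of ingress/egress nodes (hypothetical)
total = 1000.0   # total traffic volume, arbitrary units (hypothetical)

# Gravity model for the expected matrix: traffic from node i to node j
# is proportional to (out-mass of i) * (in-mass of j).
mass_out = rng.dirichlet(np.ones(n))
mass_in = rng.dirichlet(np.ones(n))
mean_tm = total * np.outer(mass_out, mass_in)

# Sample each entry independently from an exponential with that mean.
synthetic_tm = rng.exponential(scale=mean_tm)

print(np.round(synthetic_tm, 1))
```

The appeal of the approach is the modeling discipline: every assumption enters through an explicit constraint, and nothing else is smuggled in.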

In a recent talk, you said that good mathematical abstractions will be an important tool in cybersecurity efforts in the coming years. Will you explain this?

This is a critically important topic. Too much cybersecurity is currently being done through ad hoc and reactive fixes. Often the fix for one problem inadvertently creates a suite of new vulnerabilities. We see the evidence for this if we look at, for instance, the CVE (Common Vulnerabilities and Exposures) database, where the number of reported problems has increased dramatically in recent years. The reactive approach to cybersecurity is failing.

There are an increasing number of attempts to address this problem. They range from a DARPA grand challenge a couple of years ago, to using AI, to process-based approaches. But to me, the fundamental question before we even get into the “how-tos” of security is “what is my security policy?” More specifically, “what am I guarding, and against what?” If I don’t even know these things, how can I ever be sure I am secure?

These questions shouldn’t depend on nitty-gritty details. When I tell you my security policy, I shouldn’t be telling you about the type of devices I have, or the operating systems. I should be able to describe it in “abstract” terms, so that I can separate the specific details from the aspects I really care about. That way the underlying technical implementation can change, and my network can grow and adapt flexibly, without changing my high-level security policy. More important, with good abstractions, I can then reason about the policy itself without being caught up in minutiae.

Abstraction is a very well-known tool for writing good software, but coming up with the right abstraction is almost never easy. We have lots of experience with abstraction for software systems, but not so much in network security (my focus).

Why “mathematical” abstractions in particular? The quote from Edsger Dijkstra says it all: “The purpose of abstraction is not to be vague, but to create a new semantic level in which one can be absolutely precise.” When security policies even exist, they are usually expressed in words. I can’t work with words. They can be vague or inconsistent. I need something more precise, and there isn’t any language more precise than mathematics.
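
As a purely hypothetical illustration (the zone names and the reachability relation below are mine, not taken from the interview), a policy stated at that more precise semantic level might look like this:

```latex
% Hypothetical abstract security policy: no host in the Guest zone may
% reach any host in the Finance zone, whatever devices or OSs are in use.
\forall g \in \mathit{Guest},\; \forall f \in \mathit{Finance} :\; \neg\,\mathrm{reachable}(g, f)
```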

What’s more, if we work with mathematical definitions, we inherit hundreds of years of beautiful theory to help us reason about the results. Why wouldn’t you want mathematical abstractions whenever they are possible?

In the recent paper The "Robust yet Fragile" Nature of the Internet, you and your co-authors explored the idea of finding unifying properties in complex networks. As the Internet of Things (IoT) seems to increase the complexity of computer networks, how will identifying unifying properties improve our ability to efficiently manage these networks?

I don’t think the IoT increases the internet’s fundamental complexity, or at least it doesn’t increase complexity as much as it changes the scale, and the reasons why are explained in that very paper.

One of the core ideas of the paper was that models of the internet structure at the time were simply wrong. Academics had noticed that power-law degree distributions were common in many networks (including the internet, but also in social and biological networks). Power laws are a simple model to describe distributions with high variation (where events that are very much larger than typical events can happen), and they occur in nature with remarkable regularity. So it wouldn’t be surprising to see them in something as complex as the internet.
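
For concreteness, a power-law degree distribution is conventionally written in the following form (a standard textbook statement, not notation taken from the paper itself):

```latex
% Power-law degree distribution in complementary-CDF form: the
% probability that a node's degree is at least k falls off polynomially.
\Pr(\text{degree} \ge k) \;\propto\; k^{-\alpha}, \qquad k \ge k_{\min},\quad \alpha > 0
```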

However, in the context of the internet, the models being used for power-law networks were “inverted” in the sense that they placed the nodes with extremely high degree towards the center of the network, in the core or backbone. In reality, this can’t happen. Technological constraints on routers and switches mean that the core is built of low-degree but high-capacity devices, and that the high-degree nodes occur towards the edge of the network where lots of users are brought in. Informally we can think of the network growing by spreading out into lots of thin branches at the edges, and getting thicker in the core or trunk, the same way you might imagine a plant growing.
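
A back-of-envelope version of that technological constraint (the capacity figures are illustrative, not taken from the paper): a router’s total switching capacity caps the product of its degree and its per-port speed, so a device can have many ports or fast ports, but not both.

```latex
% Degree-bandwidth trade-off for a router with switching capacity C,
% degree d and per-port bandwidth b:
d \cdot b \;\le\; C
% e.g. C = 1~\mathrm{Tb/s} supports 10 ports at 100~\mathrm{Gb/s} (core)
% or 1000 ports at 1~\mathrm{Gb/s} (edge aggregation), but not both at once.
```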

The IoT just continues this trend. It creates massive growth at the edges, and the core will need to grow in capacity to support that growth, but the core doesn’t really need to change much in nature. And the edges grow by spreading outwards.

So the message of the paper, with regard to network management for large transit providers, is to look forward to a lot more of the same.

It’s different for internet service providers (ISPs) out at the edge of the network. They can expect many more devices to be connected to their networks, and that will create new stresses. For one, it should push the adoption of Internet Protocol (IP) version 6. It will also create cybersecurity problems at a whole new level, and that is a problem for everyone.

That’s the “robust yet fragile” story in a nutshell. The network is hugely robust against certain types of growth and change (hence the success of the internet, despite many challenges), but fragile against other types of perturbations, security being the most obvious example.

You’ve indicated that your interest in networks extends to biological and social networks. What have you learned about computer networks that you have been able to apply to your work in these different kinds of networks?

The internet has grown over the last 20 or so years through a process that is more akin to the evolution of an ecosystem than any engineered system we have built before. Its success lies in the way it has facilitated innovation in protocols, hardware and applications. But that same rapid innovation environment has allowed anything and everything to happen. The net effect is a system that looks (to me) more like a coral reef than, for instance, an aircraft or skyscraper or some other complex engineered system.

As a result, I think there are many analogies between social and biological networks and the internet. Maybe you can see this already from my previous answer, but I think there’s an even better example: cybersecurity. Cybersecurity is a brilliant example because the analogy teaches us useful lessons. We already use biological analogies in this domain: we talk of “viruses” and “worms” for very good reason, since these types of attacks can be modeled in exactly the same way as their biological analogues. But biology has more to teach us. Simple examples include:

  • the vulnerability of monocultures to parasites and other forms of hijacking, and its extension to understanding that monolithic architectures and monopoly providers foster attacks; if everyone uses a different OS, the scale and hence economics of hacking change dramatically;
  • the proliferation of middleboxes to fix or improve certain aspects of the network (e.g., network address translation) represents a kind of arteriosclerosis, creating inflexibility and leading to new vulnerabilities;
  • parasites and their ilk can evolve more quickly than the systems they prey on;
  • cures and preventatives lose their effectiveness quickly if used carelessly, e.g., our current problems with antibiotic resistance; and
  • complex (e.g., multicellular) systems/entities have many vulnerabilities, so if we need really tight security for a system (e.g., a power station) it should be simple.

When we are building more advanced, more automated cyber-defenses, we should be looking to nature as well. Think about our immune system, and what can go wrong with it. What happens when a virus subverts the immune system itself? Or when the immune system otherwise malfunctions? Sometimes these failures are worse than the disease. So the lesson is that we have to take care that our defenses don’t just increase the vulnerable attack surface, and give hackers a point of attack that can cause even more damage.

A simple example, for me, lies in password management systems. These are a really good idea. If you don’t have one, get one now, but don’t join the most popular one. The last thing we need is to make such a critical component of our security into a single, monolithic system. It would be much too tempting a target, and that would lead to vast effort expended towards cracking it, and massive damage occurring when that happens. We’re much better off with a diverse and vibrant ecosystem with many such tools.

Of course, analogies are a high-level tool, and can be facile. The devil is in the detail, and taking analogies too far can be dangerous. And so we come back around to the need for precise abstractions, and tools to reason about these.

Matthew Roughan is a Professor in the School of Mathematical Sciences at the University of Adelaide and a Chief Investigator at the Australian Research Council Centre of Excellence for Mathematical and Statistical Frontiers. His research interests include internet measurement, as well as the performance, efficiency, optimization and measurement of networks. He has also done work in cybersecurity and cyber-privacy.

Roughan was named an ACM Fellow (2018) for contributions to internet measurement and analysis, with applications to network engineering.