A Modified Depensation Model of Peer to Peer Networks: Systemic Catastrophes and other Potential Weaknesses
by Andrew H. Chen and Andrew M. Schroeder
Introduction
Peer to peer networks have generated significant attention in the recent past, especially file trading networks such as Gnutella and Morpheus. These networks allow users to easily access and exchange a wide array of digital information, potentially including pirated music and other digital media. As these networks are decentralized, there is no central entity that can be held responsible for illegal actions on the network, leading some to believe that they can never be shut down.
However, in this paper we argue that these networks are potentially more vulnerable than previously believed by utilizing a modified version of the "depensation model," commonly found in ecological models of fish and birds. If the model is accurate, one interesting implication is that P2P networks may be susceptible to "catastrophes," situations in which the P2P network's user base has negative natural growth. This implies that it is not necessary to take action against all the users on the network to shut it down, but rather just a subset.
In particular, our analysis of the model leads to four potential strategies, which can be used in conjunction:
- Randomly selecting and litigating against users engaged in piracy.
- Creating fictious users that carry incorrectly named or damaged files. [4]
- Broadcasting fictious queries in order to degrade network performance.
- Selectively targeting litigation against the small percentage of users that carry the majority of the files.
Just as catastrophes can be caused in animal populations by overharvesting, it may be easier for the recording industry to solve the P2P problem than previously thought, by purposely pushing the network over this tipping point.
Problem Description
The promise of the Internet lies in its ability to eventually make information available to everyone, anytime, anywhere. Eventually, any type of media, whether audio, video, literature, or images, will be able to be accessed, shared, and distributed to the whole world. In many respects this promise has been fulfilled in the United States.
The ability to distribute information so effectively poses problems for companies that rely on intellectual property, such as creators of software, movies, music, books, etc. As the Internet supercedes traditional methods of distribution, there is an increasing danger that profits will fall as people find the companies' products for free. For example, the music industry estimates that it loses $4.1 billion [3] in revenue from music piracy.
This problem has gained tremendous attention in the last year, as the Recording Industry Association of America (RIAA) successfully challenged, and eventually shut down Napster, the most popular music trading service at that time. Since then, Napster has restarted their services, but the ability to trade pirated music was severely handicapped, which has resulted in a 90% decline in their user base [2].
However, these users have not simply stopped trading files; they have just moved onto more advanced file-trading platforms based on peer-to-peer technology, such as Gnutella and Morpheus. Many have stated that these P2P networks cannot be stopped by the RIAA [6], but this issue has not been thoroughly studied with a mathematical model.
To do so, a model must be created which describes both the P2P network as well as the RIAA interaction with the network's users. This paper describes a conceptual model of P2P networks, based on the depensation model found in studies of population dynamics. This may be an accurate model because there are many similarities between P2P networks and certain animal populations such as bird flocks or fish schools. For these animals there are advantages to grouping together, such as protection against predators. This grouping creates a "critical mass" which must be satisfied for the animal populations to have positive growth. Similarly, there is a critical mass for users in P2P networks, because if there are too few connected to the network, files will be difficult to find. Thus, if the number of users is below this minimum number, the user base will not grow, and the service will eventually collapse. Another similarity is that there is an overall limit to the size of the animal population, imposed by the resources of the environment, such as water and space. Similarly, in P2P networks, there is a maximum number of users limited by bandwidth [5].
By modifying the depensation model found in animal populations, it may be possible to accurately describe P2P networks, and most importantly, the results of "harvesting" the population. This paper aims to address the following questions:
- How must the depensation model be modified in order to account for conceptual differences between P2P networks and animal populations?
- What are the conditions necessary to cause catastrophes in P2P networks?
- What does the model imply about other ways to limit or stop P2P networks?
- What is the most effective method to stop P2P networks?
Simplifications
To obtain an appropriate model for P2P networks, it is necessary to simplify the problem by making a few logical assumptions.
Infrastructure Assumptions
In particular, because P2P networks can be quite large and are built on a complex and diverse infrastructure, it is not viable to create a model which captures all of the interaction of agents across the entire Internet. Thus, we make the following assumptions:
- The P2P network is balanced and homogeneous (equal edges, bandwidth, etc.).
- Messages sent across the network have common characteristics (size, TTL, etc.), and are sent at similar rates.
User Assumptions
Additionally, it is necessary to make a set of assumptions on the behavior of the users on the network, based on intuitive reasoning. Thus, we assume:
- The growth of new users depends on the number of users on the network.
- Not everyone both shares and downloads files; there are a significant number of "freeloaders" that only download without sharing. [1]
- Users will stop using the service if their searches for files come up negative.
- As the network performance starts to degrade, the usage of the service will begin to decline.
However, beyond these behavioral assumptions, it is necessary to make assumptions about the potential actions of the RIAA or other organizations aiming to stop P2P file-trading. This paper will not discuss the specifics of how such an organization would actually go about removing users, other than suggesting what has been done before: litigation. Thus, we also assume:
- There is a way to stop users from using the service.
- If there is an effort to stop individual users, then growth of new users will decline.
The last assumption argues that methods used to stop users on the P2P network will additionally intimidate or otherwise discourage new users from joining. Even if the effort is more benign, such as introducing a service that competes with the functions of P2P networks, it would most likely cause adoption rates to be lowered.
With these simplifications, it is now possible to create a model to describe P2P networks and the results of removing users from the service.
Mathematical Model
Unmodified Depensation Model
Peer to peer networks, as discussed earlier, have much in common with animal population. Because of this, the depensation model will be modified to describe P2P networks. However, to begin with, an unmodified version of the model will be discussed:
[8]
|
Symbol |
Definition |
|
r |
Growth rate |
|
N |
Population |
|
Nc |
Critical mass for population |
|
K |
Carrying capacity for environment |
|
q |
Catchability coefficient |
|
E |
Harvesting effort |
Table 1: Definitions of variables in the depensation model.

Figure 1: Graph of depensation model.
In the graph shown in Figure 1, the two non-zero equilibrium points are marked. N1 is unstable, and N2 is stable. This means that over time, the population should eventually grow to the N2 equilibrium point, as it is exactly the point at which the harvesting and the population growth rates cancel each other out.
Modified Depensation Model
From assumption 6, however, there is a modification that needs to be made to this model. The growth rate needs to decrease when effort is increased, thus, we divide r by αE+1, where α is a parameter indicating how removal effort affects user growth rate. A high number would indicate that removal effort is highly effective in discouraging new users from entering the network, and a low one would indicate a low discouragement.
The new equation, followed by descriptions specifically for P2P networks follow:
![]()
|
Symbol |
Definition |
|
r |
Growth rate |
|
α |
Relation of removal effort to new growth rate |
|
N |
Number of users |
|
Nc |
Critical mass of users |
|
K |
Maximum number of users supported by the network (carrying capacity) |
|
q |
Removal coefficient |
|
E |
Effort to remove users from network |
Table 2: P2P Symbol Definitions.

Figure 2: Graph of modified depensation model.
As shown in Figure 2, modifying the model does not change the curve's overall shape; it looks the same as the unmodified model. One should see the differences in the two models only when the values for effort are factored in, changing the height of the curve.
Many of these constants must simply be observed empirically. For example, it is difficult to predict what would happen to the growth rate if users began to be removed from the network. However, some variables lend themselves to speculation. For example, although it would vary for different methods of user removal, the removal coefficient q is probably small, because the most potent method, litigation, is slow-acting and generally targets small numbers of users. At the very best, it may be possible to achieve a high removal coefficient by pursuing larger groups of users through their ISPs, schools, or companies. It is through this type of intuitive reasoning that information can be gathered on these variables without resorting to extensive research.
Critical Mass and Carrying Capacity
Two important numbers, carrying capacity and critical mass, can be decomposed further. Both depend on the idea of the probability of a search hit, which will be referred to as p. This variable p depends on several factors, such as the number of files each user shares, the diversity of these files, and the musical preferences of users on this network. Although these parameters seem difficult to quantify, especially the last one, one may be able to make a reasonable estimate by sitting on the P2P network and watching searches and responses go by, to see how often users find the files they are looking for.
Before discussing these different costs, here are the variables that will be used:
|
Symbol |
Definition |
|
R |
Maximum number of reachable users within T hops, assuming D edges per node |
|
D |
Number of edges per node |
|
T |
Time-to-live, or number of hops message will traverse |
|
p |
Probability of finding a specific file on the network |
|
m |
Probability of finding a specific file on the specific node |
|
Sq |
Size of query packet |
|
Sr |
Size of response packet |
|
Q |
Average number of queries/responses per second |
|
K |
Average number of responses per query |
|
F |
Average size of files transferred |
Table 3: Variables for Critical Mass and Carrying Capacity Equations.
First, one can note that the critical mass is tied closely with the parameter p, because as the number of users increases, the probability of a successful search increases as well. A thorough treatment of this relation is difficult, because the critical mass is simply the number of users necessary to push p over some percentage, completely determined by the psychology of the users. This adds in a lot of complexity because it means the critical mass may shift depending on external factors, such as the availability of substitutes trading files over this particular network. Users may be able to move to other networks or distribution methods, which would affect the critical mass.
Carrying capacity relates to more concrete factors, fortunately. Because of the assumption that the P2P network is fully balanced, the number of users within T hops of a specific node, assuming each edge has D edges, is given by:
![]()
From this, the cost of querying and passing on queries is given below, where Q is the number of queries from that node per second, m is the probability the specific node has the file, Sq is the size of the query packet, and R is the number of nodes the query is sent out to:
![]()
The first term in the equation quantifies the fact that everyone within T hops of the node will be issuing queries to the node. The next term details how much capacity is used to pass the response. The final term captures the capacity needed in the node's own searches.
The cost of responding and passing on responses to queries is given below, where p is the probability of users finding the files they want, S is the size of the response packet, and R and Q are the same as before:

where T/R is the probability that of a response being passed upstream through an individual node, from a downstream neighbor.
Finally, after the response, the node connects directly with the other node, and there is a cost for the download and upload. It is given below, where Cd and Cu are the costs of downloading and uploading files, respectively. F is the average size of files transferred across the network, and m (where m >> p) is the probability that the node has the file the other nodes are looking for.
![]()
Combining each one of these costs, we can then find the total bandwidth, which is simply each of the costs added together. This is given by:
![]()
The carrying capacity of the network can now be defined. Because the users on the network are limited by the bound of the bandwidth capacity of each node, when CT is equal to the bandwidth capacity, the carrying capacity is reached. The carrying capacity may increase or decrease depending on several factors, including how often the users are searching, or how quickly they are able to find the files they want. It is also obvious that decreasing the carrying capacity decreases the total number of trades that can be happening across the network, which limits the potential for piracy.
Catastrophes
Also, it is important to note that with both models, there is the potential for "catastrophes," the situation in which harvesting leads to a collapse in the population or user base. For this to happen, the effort must be increased so that the two equilibrium points on the curve (indicated by the intersection of the curve and the effort line) merge into one (Figure 3).

Figure 3: Increasing harvesting effort.
As the harvesting effort is increased, the two equilibrium points eventually converge to one point, at which there is a catastrophe. The resulting equilibrium point is unstable, and so the only stable point becomes N=0. At this point, the natural growth rate of the population is negative, and the system collapses. By solving for the equilibrium points, it is then possible to find the conditions necessary to cause this collapse.
Because of the modification to the model, as the effort is increased, the curve itself flattens out (Figure 4).

Figure 4: Effects of harvesting on the growth rate in the
modified depensation model.
Thus, as the effort increases, and the curve flattens, it is easier to achieve a catastrophe than in the unmodified animal population model.
Solution of the Mathematical Problem
Next we must find the equilibrium points. This is done by equating the differential equation to zero and solving for N:

The collapse occurs when N2 = N3, and we try to solve for E to find the appropriate amount of effort to cause the catastrophe.

One of the solutions to E is going to be negative, and because effort cannot be negative, it is discarded.
Results and Conclusions
From the solution, which describes the effort needed to achieve a catastrophe, there are several important conclusions that can be drawn. To minimize E, the denominator should be as large as possible. The model indicates that by far, the most effective method of causing a catastrophe would be to raise α. A high α would reflect a method of user removal that would be effective in scaring or otherwise discouraging new users from joining the network. It might work to randomly sue users, and even if the lawsuits were not successful, it might create enough trouble to keep people from wanting to use the network.
Another potential weakness comes from the critical mass and carrying capacity. The larger the critical mass is, and conversely, the smaller the carrying capacity is, the easier the system can be made to collapse. The critical mass is strongly dependent on p, the probability of users' searches returning hits, and it may be possible to lower this factor by selectively threatening users which carry a large number of files, or by posing as legitimate users while carrying false files. This would frustrate users of the network, causing the critical mass to be smaller. If this were to happen, a smaller effort would be needed to achieve a catastrophe (Figure 5).

Figure 5: Smaller harvesting effort needed if the critical
mass is larger.
Also, the carrying capacity may be interfered with, also by making p smaller (misses produce further queries), or by broadcasting fake requests across the network. This would consume bandwidth and lower the carrying capacity so that the network would support a lower number of workers. This also results in a smaller amount of effort needed to cause a catastrophe (Figure 6).

Figure 6: Smaller harvesting effort needed if the carrying
capacity is smaller.
Thus, through a combination of removal methods, targeting random users as well as selectively pursuing large file-carriers, P2P networks may be pushed to the tipping point. Empirically this was observed in the Xerox study, in which "nearly 50% of all responses are returned by the top 1% of sharing" [1] and obviously shutting down this top 1% would do great damage to the network. This approach could be combined with adding stress to the network by posing as legitimate users while flooding the other users with requests, which would consume bandwidth, which may frustrate users, and degrade performance. Because the nature of P2P is very democratic, imposters may be difficult to stop, as they infiltrate and break the system from within.
Improvement
Whatever the implications of the model may be, it is still only conceptual. It is not clear if the assumptions, such as balanced networks, are actually correct. For example, last year the Gnutella network actually collapsed [7], but it happened because the connection speeds were not homogenous across the network, and low bandwidth clients walled off the communication between high bandwidth clients. The model presented in this paper would not account for such phenomenon, because of the homogenous network assumption.
One of the most damaging critiques of this model also relates to the balanced network assumption. Although the Internet is widely interconnected, it is not clear that examining a subset of the network, as this paper does, is a fair analysis of the entire network. If one subset was wiped out, the consequences for the rest of the network may be very damaging, or not at all, depending on the level of interconnectedness. For example, if the entire Internet was shut down, but a large group that was "clumped" together away from other users continued, they might still be able to trade files. Thus, students in dorms may never be stopped from trading music with each other, although it may be the goal of the companies to keep the phenomenon from exploding to a global level.
Furthermore, there are clear differences between the growth of users and the growth of animals. If one were able to push a fish population below the critical mass in a single day, it would suffer a catastrophe and never recover. However, if one were able to shut down a P2P network for a day, people might still keep the programs on their computers and try again later, and thus the user base could recover and jump over the critical mass.
Thus, although the effort needed to cause a catastrophe has been solved in this paper, it is also not clear whether or not reaching this point is ever feasible. It may be that potential punishment is not serious enough, or cannot happen widely enough, to deter new users. Or perhaps the carrying capacity of a well-designed P2P network is huge, and amount of flooding can overwhelm the network.
To find the answers to these questions, one must attempt to measure each of the values included in the model empirically, would require extensive research. Thus, many of these issues are potential areas of improvement and research in the future.
References
- 1
- Adar, E. and Huberman, B. "Free Riding on Gnutella," 27 September 2000. <http://www.firstmonday.dk/issues/issue5_10/adar/> (17 December 2001).
- 2
- Costello, S. "Napster Usage Nosedives 90 Percent," 6 June 2001. <http://www.pcworld.com/news/article/0,aid,51903,00.asp> (17 December 2001).
- 3
- Dickinson, T. "Dickinson on Internet Piracy of Music, Software," 20 July 2000. <http://usinfo.state.gov/topical/global/ecom/00072004.htm> (17 December 2001).
- 4
- Levine, D. "Not the Real Slim Shady," 10 June 2002. <http://www.salon.com/tech/feature/2002/06/10/eminem_mp3/?x> (13 June 2002).
- 5
- McCannell, S. "Why the RIAA is Fighting a Losing Battle," 12 May 2000. <http://www.oreillynet.com/pub/a/network/2000/05/12/magazine/riaa.html (17 December 2001).
- 6
- Ritter, J. "Why Gnutella Can't Scale. No Really." February 2001. <http://www.darkridge.com/~jpr5/doc/gnutella.html> (17 December 2001).
- 7
- Truelove, K. "Gnutella: Alive, Well, and Changing Fast," 25 January 2001. <http://www.openp2p.com/pub/a/p2p/2001/01/25/truelove0101.html> (17 December 2001).
- 8
- Tung, K. K. "Lecture 3, AMATH 383," 17 September 2001. <http://amath.washington.edu/courses/383-autumn-2001/lecture3.pdf> (17 December 2001).
Biographies
Andrew Schroeder (andypants_@hotmail.com) is a senior in the department of Applied Computational Mathematical Sciences at the University of Washington. When he is not studying mathematics, he is tutoring it at South Seattle Community College. Andrew is passionate about improvisational theater. He is the director of the University of Washington's improv group and teaches improvisational acting programs in the Seattle Public Schools. For Andrew, whether it is math or improvisational teaching is a chance to do something positive and connect with people in the process.
Andrew Chen (achen@u.washington.edu) graduated from the University of Washington at the age of 19 with a degree in Applied Computational Mathematical Sciences. He has held positions in software engineering and architecture at several startups in the Pacific Northwest, and is currently with the Seattle office of Mohr, Davidow Ventures, a venture capital firm headquartered in Silicon Valley. Andrew has a broad set of interests, including computational finance, evolutionary psychology, philosophy, and bioinformatics.
Additional Notes
A previous version of this paper was covered in Business 2.0, a nationally distributed business magazine owned by AOL Time Warner: http://www.business2.com/articles/mag/0,1640,42874,00.html
We would like to thank Professor K. K. Tung of the University of Washington Applied Mathematics Department for inspiring us to write this paper.
