A Framework for Performance Management of Component Based Distributed Applications

Adrian Mos

Performance Engineering Laboratory
Dublin City University, Glasnevin, Dublin 9, Ireland
+353 1 700-7644

mosa@eeng.dcu.ie

Advisor: Dr. John Murphy

ACM membership number: 4207858

1. Problem and Motivation

Component-based middleware platforms [1] such as Enterprise JavaBeans (EJB), Microsoft .NET or the CORBA Component Model (CCM) address the needs of large enterprise projects by providing reusable standardized services and reliable runtime environments that developers can readily integrate and use in their applications. To reduce development costs, developers often use Commercial-Off-The-Shelf (COTS) components or outsource parts of the development effort to third parties. The caveat is that when the resulting application is large, it is difficult for architects and developers to fully understand the implications of different design options for the overall performance of the running system. They often make functional assumptions (total system workload or workload distribution) and technological assumptions (best practices for a particular platform, operating characteristics of an application server), which may lead to design decisions that cause severe performance problems after the system has been deployed. This situation is aggravated by the fact that although application servers vary greatly in performance and capabilities [2], many advertise a similar set of features, making it difficult for system integrators to choose the most appropriate one for the task.
There is a pressing need for tools that can help in building large-scale enterprise applications efficiently. Since performance is a critical issue in these applications, tools that can automate the discovery of performance problems in the appropriate context, without tying the developers to a particular middleware product, can be particularly useful.
The presented work is targeted at large enterprise systems and aims to pinpoint performance issues in the architectural context in which they occur. Because poor performance is usually caused by poor design rather than poor code [3][4], this approach can help developers understand the reasons behind a performance issue.

2. Background and Related Work

There is a significant amount of research and work in monitoring CORBA systems [5]. However, there are no generic EJB monitoring frameworks that can provide design level performance information (i.e. method and component level parameters). A number of application servers provide a certain degree of monitoring, but most of them do so at a network/protocol level, giving little help to system developers who want to understand which component or method has a scalability problem. There are also a few third-party plug-ins [6] for application servers that provide such monitoring information; however, they are targeted at specific application servers on specific platforms, offering little flexibility in choosing the development environment.
In a different category are EJB testing tools [7] that perform stress testing on EJB components and provide information on their behaviour. Such tools automatically generate clients for each EJB and run scripts with different numbers of simultaneous test clients to see how the EJBs perform. The main disadvantage of such a solution is that it does not gather information from a real-life system but from isolated components. Without monitoring the actual deployed system, it is difficult to obtain an accurate performance model for the entire system.
The use of on-demand agent-based monitoring in frameworks such as JAMM [8] resembles the instrumentation infrastructure used by COMPAS. The proxy layer injected by COMPAS in target applications has the similar purpose of creating a set of semi-autonomous manageable entities that can communicate with a central authority. JAMM is however concerned with gathering low-level performance monitoring data such as CPU and network usage from nodes in a distributed environment. There is no concept of software components or objects in JAMM, therefore no monitoring at method level or component level. COMPAS gathers component level application performance data, providing insight into application design; in contrast, approaches such as JAMM provide insight into the lower-level architecture of a hardware-software solution.

One of the most significant methods for performance modelling and prediction is presented in [9], which reports important results in improving the software development process, specifically through the use of Software Performance Engineering methods aided by related tools such as SPE-ED. These tools assume that developers can map application entities such as objects or methods to run-time entities such as I/O utilization, CPU cycles or network characteristics. Such techniques and tools have been shown to help in achieving performance goals and reducing performance-related risks for general object-oriented systems and even for distributed systems. However, we argue that middleware such as EJB and other component-oriented platforms exhibits an inherent complexity that developers find hard, if not impossible, to quantify even in simple models. Automated services (such as caching, pooling, replication, clustering, persistence or Java Virtual Machine optimisations) provided by EJB application servers contribute to an improved and at the same time highly unpredictable run-time environment. In contrast, COMPAS extracts simplified performance information such as method execution times from running versions of the target application, and creates UML performance models automatically. This approach eliminates the need for assumptions and can offer more accurate models and predictions.
The Form framework [10] automatically generates execution profiles from Java applications and is partially similar in intent to the COMPAS Modelling module. Form uses JVM instrumentation to intercept object level events, which are used to build UML sequence diagrams showing the captured interactions; it is not particularly performance oriented. COMPAS, in contrast, uses a non-intrusive approach, augmenting the original application; it is strongly focused on the performance of component-based systems, taking into consideration specific factors such as component types and lifecycle events in representing traversable UML models. Such platform related information is necessary to understand the implications of a design decision. Choosing one type of component over another (e.g. a stateless session EJB as opposed to a stateful session EJB) can prove critical in the overall performance of a system, as demonstrated by experimental results in [4].

3. Uniqueness of the Approach

COMPAS provides a reusable solution for analysing large-scale enterprise applications during and after development. It adopts a portable approach, which allows any J2EE application running on any J2EE-compliant application server to be analysed. In addition, no changes are necessary to the target application, and no plug-ins in the application server need to be installed.
COMPAS provides insight into an application’s functionality and design, without requiring source code. This can be useful especially in large enterprise projects where understanding the exact performance implications of the adopted design solutions is a challenging task.
To gather the required data, COMPAS employs a novel approach to monitoring, with the following main characteristics:

COMPAS aligns with the IBM autonomic computing initiative [11], which represents a major direction of research aimed at managing complex systems. In COMPAS, the proxies are small, independent, lightweight entities; together they could act as a performance-tuning force for the target system, in a scalable manner. We envisage that this approach could be used as the foundation for future self-optimising systems, providing localised performance information. In addition, it could enable application adaptation, in cases where such adaptation is possible and can make use of the provided performance information.
Apart from gathering performance information from a system running under normal conditions, proxies can launch the execution of load-testing sequences for a particular component. These sequences would then determine limit-values such as the maximum throughput for particular components.

3.1 COMPAS Overview

The Component Performance Assurance Solutions (COMPAS) Framework comprises three interrelated functional modules: monitoring, modelling, and performance prediction.

There is a logical feedback loop connecting the monitoring and modelling modules. It refines the monitoring process by focusing the instrumentation on those parts of the system where the performance problems originate. This drastically reduces the total overhead incurred by COMPAS on the Target Application (TA).
The modules can be used in conjunction or separately, depending on the amount of information available to developers. Developers can use COMPAS to instrument, model and predict the performance of a TA, which can be a full, completed enterprise application, or just a functional running subsystem of the enterprise application. This approach integrates well in development environments that adhere to iterative development processes such as Rational Unified Process or Extreme Programming. Such processes demand that a running version of the application exists at the end of every iteration, making monitoring possible.

3.2 COMPAS Monitoring

3.2.1 Portability and Non-Intrusiveness

The monitoring module extracts run-time information from the TA without the need for developers to change the TA’s source code or alter the run-time infrastructure (e.g. by installing a plug-in in the component container). Instead, developers follow an automated procedure that changes the deployment structure of the TA, enabling it for performance monitoring.

The deployment structure of the TA (i.e. the EAR file in the case of J2EE applications) is analysed and each original component (ORC) found is extracted. For each ORC (i.e. EJB component in a J2EE scenario) further analysis is performed using reflective techniques. After all the necessary information is extracted, the ORC is augmented with performance instrumentation and remote management capabilities. There are neither source code changes nor byte-code changes to the ORC. Instead, a small, automatically compiled class is added to the component deployment package, and the deployment descriptor is changed accordingly.
The resulting component (PXC) has an embedded proxy layer (Figure 1) that can capture application logic events (e.g. method invocations), as well as lifecycle events (e.g. creation, destruction, passivation, activation). This is a change from previous work [12] where the PXC was a separate component that had to communicate (using the container middleware) with the ORC.
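The interception idea behind the proxy layer can be illustrated with a minimal Java sketch. It uses a JDK dynamic proxy (java.lang.reflect.Proxy) as a stand-in, which is an assumption made for brevity: COMPAS itself adds a generated class to the deployment package, and that mechanism is not reproduced here. The names TimingProxy, Cart and SimpleCart are hypothetical, and lifecycle events are not covered.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical business interface and component, used only for the demo.
interface Cart {
    int addItem(String sku);
}

final class SimpleCart implements Cart {
    private int items;

    @Override
    public int addItem(String sku) {
        return ++items;
    }
}

// Illustrative stand-in for a proxy layer; not the COMPAS implementation.
final class TimingProxy implements InvocationHandler {
    private final Object target; // plays the role of the original component (ORC)
    private final Map<String, LongAdder> calls = new ConcurrentHashMap<>();
    private final Map<String, LongAdder> nanos = new ConcurrentHashMap<>();

    TimingProxy(Object target) {
        this.target = target;
    }

    // Returns a view of the target in which every call made through iface is timed.
    <T> T proxyFor(Class<T> iface) {
        Object proxy = Proxy.newProxyInstance(
                target.getClass().getClassLoader(), new Class<?>[] {iface}, this);
        return iface.cast(proxy);
    }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        long start = System.nanoTime();
        try {
            return method.invoke(target, args);
        } catch (InvocationTargetException e) {
            throw e.getCause(); // surface the component's own exception
        } finally {
            long elapsed = System.nanoTime() - start;
            // Record the time first so that a non-zero call count implies a time entry.
            nanos.computeIfAbsent(method.getName(), k -> new LongAdder()).add(elapsed);
            calls.computeIfAbsent(method.getName(), k -> new LongAdder()).increment();
        }
    }

    long callCount(String methodName) {
        LongAdder c = calls.get(methodName);
        return c == null ? 0 : c.sum();
    }

    double meanMillis(String methodName) {
        long n = callCount(methodName);
        return n == 0 ? 0.0 : nanos.get(methodName).sum() / 1e6 / n;
    }

    public static void main(String[] args) {
        TimingProxy monitor = new TimingProxy(new SimpleCart());
        Cart cart = monitor.proxyFor(Cart.class);
        cart.addItem("sku-1");
        cart.addItem("sku-2");
        System.out.println(monitor.callCount("addItem") + " calls, mean "
                + monitor.meanMillis("addItem") + " ms");
    }
}
```

The client code only sees the Cart interface, so the component itself is left untouched; this mirrors the non-intrusive intent described above, although the packaging details differ.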
A history of the performance parameters associated with an ORC is maintained, which allows the detection of performance degradation (e.g. a method exhibits a tenfold increase in its execution time). When performance degradation is detected, an alert is attached to the corresponding PXC, and the monitoring infrastructure is notified.
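One way such a history could be used to detect degradation is sketched below. The sliding window, its size and the comparison against the mean of earlier samples are assumptions made for illustration; the class name ExecutionTimeHistory is hypothetical and the actual COMPAS detection logic may differ.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative sketch; window size and degradation factor are assumed parameters.
final class ExecutionTimeHistory {
    private final Deque<Double> window = new ArrayDeque<>();
    private final int capacity;
    private final double factor;
    private double sum;

    ExecutionTimeHistory(int capacity, double factor) {
        if (capacity < 1 || factor <= 1.0) {
            throw new IllegalArgumentException("capacity >= 1 and factor > 1 required");
        }
        this.capacity = capacity;
        this.factor = factor;
    }

    // Mean of the samples currently in the window (0 when the window is empty).
    double mean() {
        return window.isEmpty() ? 0.0 : sum / window.size();
    }

    // Records a sample and reports whether it is at least `factor` times
    // the mean of the samples recorded before it.
    boolean record(double millis) {
        double previousMean = mean();
        boolean degraded = previousMean > 0.0 && millis >= factor * previousMean;
        window.addLast(millis);
        sum += millis;
        if (window.size() > capacity) {
            sum -= window.removeFirst(); // forget the oldest sample
        }
        return degraded;
    }

    public static void main(String[] args) {
        ExecutionTimeHistory history = new ExecutionTimeHistory(50, 10.0);
        double[] samples = {10, 12, 11, 9, 130};
        for (double s : samples) {
            if (history.record(s)) {
                System.out.println("alert: " + s + " ms");
            }
        }
    }
}
```

In this sketch the degraded sample is still added to the window, so a sustained slowdown gradually becomes the new baseline; whether that is desirable depends on the alerting policy.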

3.2.2 Adaptability and Control

Based on user-defined rules, performance alerts are issued by the modelling environment when certain conditions such as “too much growth in execution time” or “scenario throughput is greater than user defined value” are met. If the user defines values such as expected mean and maximum values for a particular scenario response time, the models will show alerts in those areas exceeding these values. If the user does not specify such values, the framework can still suggest possible performance problems when certain conditions like “the response time increases dramatically when small numbers of simultaneous scenario instances are executed” are encountered. If a particular step in the affected scenario is mainly responsible for the degradation of scenario performance parameters, that step is identified and the alert narrowed down to it.
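The rule checks described above can be sketched as follows. This is only an illustration under stated assumptions: the class ScenarioRule, the Alert values and the "share of total slowdown" criterion for identifying the responsible step are hypothetical and are not taken from the COMPAS implementation.

```java
import java.util.Arrays;
import java.util.EnumSet;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Illustrative sketch of user-defined scenario rules.
final class ScenarioRule {
    enum Alert { MEAN_EXCEEDED, MAX_EXCEEDED }

    private final double expectedMeanMs;
    private final double maxMs;

    ScenarioRule(double expectedMeanMs, double maxMs) {
        this.expectedMeanMs = expectedMeanMs;
        this.maxMs = maxMs;
    }

    // Compares observed scenario response times with the user-defined values.
    Set<Alert> check(List<Double> responseTimesMs) {
        Set<Alert> alerts = EnumSet.noneOf(Alert.class);
        if (responseTimesMs.isEmpty()) {
            return alerts;
        }
        double sum = 0.0;
        double peak = Double.NEGATIVE_INFINITY;
        for (double t : responseTimesMs) {
            sum += t;
            peak = Math.max(peak, t);
        }
        if (sum / responseTimesMs.size() > expectedMeanMs) {
            alerts.add(Alert.MEAN_EXCEEDED);
        }
        if (peak > maxMs) {
            alerts.add(Alert.MAX_EXCEEDED);
        }
        return alerts;
    }

    // Returns the step that accounts for more than `share` of the total slowdown,
    // if there is one. Steps missing from the baseline count with a baseline of 0.
    static Optional<String> dominantStep(Map<String, Double> baselineMs,
                                         Map<String, Double> currentMs,
                                         double share) {
        double total = 0.0;
        double worstDelta = 0.0;
        String worst = null;
        for (Map.Entry<String, Double> e : currentMs.entrySet()) {
            double delta = Math.max(0.0,
                    e.getValue() - baselineMs.getOrDefault(e.getKey(), 0.0));
            total += delta;
            if (delta > worstDelta) {
                worstDelta = delta;
                worst = e.getKey();
            }
        }
        return worst != null && worstDelta > share * total
                ? Optional.of(worst)
                : Optional.empty();
    }

    public static void main(String[] args) {
        ScenarioRule rule = new ScenarioRule(100.0, 250.0);
        System.out.println(rule.check(Arrays.asList(80.0, 90.0, 400.0)));
    }
}
```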


The animation in Figure 2 illustrates the adaptability of the monitoring infrastructure. After a performance alert is fired by one of the top-level components, lower-level proxies are gradually switched on until the root cause is identified. The gradual activation of subsequent proxies is based on the model knowledge and can be influenced by historical data. In the example, PXC 1.1 activates PXC 1.2 based on the previous history of successful problem discovery on that path. The proxy adaptation and learning capabilities can greatly reduce the overhead incurred by monitoring a production system as only the minimum required amount of instrumentation is performed at any moment in time. After the execution scenarios (logical paths consisting of chains of components) have been identified during the modelling phase, the only active PXCs will be those corresponding to top-level ORCs (the first components in each scenario). If a performance alert is issued by any of the top-level PXCs, COMPAS will activate the remaining PXCs in that particular scenario, and the alert can be narrowed down to the ORC, in the logical path, that is responsible for the performance loss perceived in the top-level PXC.
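The gradual activation along a scenario can be sketched as below. The sketch assumes that each proxy measures inclusive time, so that a slowdown deep in the chain is also visible in every caller above it; it also ignores the influence of historical data on the activation order. AdaptiveScenarioMonitor and the component names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

// Illustrative sketch of adaptive proxy activation for one execution scenario.
final class AdaptiveScenarioMonitor {
    private final List<String> chain; // components in call order, top-level first
    private final Set<String> active = new LinkedHashSet<>();

    AdaptiveScenarioMonitor(List<String> chain) {
        if (chain.isEmpty()) {
            throw new IllegalArgumentException("empty scenario");
        }
        this.chain = new ArrayList<>(chain);
        reset();
    }

    // Back to the low-overhead state: only the top-level proxy is active.
    void reset() {
        active.clear();
        active.add(chain.get(0));
    }

    Set<String> activeProxies() {
        return Collections.unmodifiableSet(active);
    }

    // Called after the top-level proxy has raised an alert. Proxies are switched
    // on one at a time down the chain; the search stops at the first component
    // that is not degraded, and the last degraded component is reported.
    String locateRootCause(Predicate<String> isDegraded) {
        String culprit = chain.get(0);
        for (int i = 1; i < chain.size(); i++) {
            String next = chain.get(i);
            active.add(next);
            if (!isDegraded.test(next)) {
                break;
            }
            culprit = next;
        }
        return culprit;
    }

    public static void main(String[] args) {
        AdaptiveScenarioMonitor monitor = new AdaptiveScenarioMonitor(
                Arrays.asList("web", "cart", "inventory", "db"));
        Set<String> slow = new HashSet<>(Arrays.asList("web", "cart", "inventory"));
        String culprit = monitor.locateRootCause(slow::contains);
        System.out.println(culprit + " " + monitor.activeProxies());
    }
}
```

Components below the first non-degraded one are never activated, which is where the overhead saving in this simplified model comes from.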

In order to accurately characterise a component with regard to performance, metrics such as maximum throughput are needed. They could be used in the performance prediction module and their values can be obtained by monitoring the component; however, there are cases in which, under the real or simulated testing scenarios, the component does not reach its capacity. In such cases, the proxy can launch a load testing sequence to stress the component to its limit. These sequences can be performed during idle time in production environments, or indeed at any appropriate time during production testing. An issue under consideration is preserving application integrity while load testing a component (e.g. ensuring that stress testing does not delete items from the product database). A possible solution is to roll back test transactions just before they reach completion or, if altering application state is unavoidable, to skip load testing completely for that component.
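A load-testing sequence that searches for the maximum throughput could follow a simple ramp-up, sketched below. The doubling strategy, the minGain stopping threshold and the ThroughputProbe name are assumptions; the measurement itself is abstracted behind a function so that the sketch stays deterministic, and transaction rollback is not modelled.

```java
import java.util.function.IntToDoubleFunction;

// Illustrative ramp-up search for the saturation throughput of a component.
final class ThroughputProbe {
    // Doubles the number of simulated clients until the observed throughput
    // grows by no more than minGain (e.g. 0.05 for 5%) or maxClients is reached.
    // `measure` maps a client count to an observed throughput (requests/second).
    static double findMaxThroughput(IntToDoubleFunction measure,
                                    int maxClients,
                                    double minGain) {
        double best = 0.0;
        // A long counter avoids integer overflow when doubling.
        for (long clients = 1; clients <= maxClients; clients *= 2) {
            double observed = measure.applyAsDouble((int) clients);
            if (observed <= best * (1.0 + minGain)) {
                return Math.max(best, observed); // throughput has levelled off
            }
            best = observed;
        }
        return best;
    }

    public static void main(String[] args) {
        // Stand-in model of a component that saturates at 75 requests/second.
        IntToDoubleFunction model = clients -> Math.min(clients * 10.0, 75.0);
        System.out.println(findMaxThroughput(model, 64, 0.05));
    }
}
```

In a real sequence, `measure` would drive the component through its proxy for a fixed interval and count completed invocations.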

3.3 COMPAS Modelling and Prediction

The information extracted by the monitoring infrastructure is used in the modelling module to create UML execution models of the TA. The model extraction process can be based on statistical methods and uses dynamic information such as time-stamps and method execution times together with static information from component deployment descriptors such as inter-component dependencies (depending on platform specifications). Another simpler option is to run the system for a “training” period in which only a single interaction is active at a time. The information gathered by proxies during the training period can be easily composed into execution models. These models are augmented with performance indicators as specified by the UML Profile for Schedulability, Performance, and Time [13]. To facilitate the understanding of the system, the generated models are traversable both horizontally between scenarios at the same realisation level, and vertically between different layers of realisation using the concepts defined by the Model Driven Architecture [14] (MDA). The MDA approach is useful for managing the complexity of the generated models, and allows a faster identification of performance problems in the TA design.
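For the simpler "training" option, composing proxy data into an execution model can be sketched as nesting time-stamped invocations by interval containment. This is a simplified illustration, assuming a single interaction at a time and a common clock; it produces a call tree rather than a UML model, and the names CallTreeBuilder and Call are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Deque;
import java.util.List;

// Illustrative sketch: builds a call tree from time-stamped invocation records.
final class CallTreeBuilder {
    static final class Call {
        final String name;
        final long start;
        final long end;
        final List<Call> children = new ArrayList<>();

        Call(String name, long start, long end) {
            this.name = name;
            this.start = start;
            this.end = end;
        }

        long duration() {
            return end - start;
        }

        // Time spent in this call excluding its direct children.
        long selfTime() {
            long t = duration();
            for (Call c : children) {
                t -= c.duration();
            }
            return t;
        }
    }

    // Nests calls by interval containment and returns the top-level calls.
    // Children are attached to the given Call objects, so call this once per data set.
    static List<Call> build(List<Call> calls) {
        List<Call> sorted = new ArrayList<>(calls);
        sorted.sort(Comparator.comparingLong((Call c) -> c.start)
                .thenComparing(Comparator.comparingLong((Call c) -> c.end).reversed()));
        Deque<Call> stack = new ArrayDeque<>();
        List<Call> roots = new ArrayList<>();
        for (Call c : sorted) {
            // Drop calls that had already finished when c started.
            while (!stack.isEmpty() && stack.peek().end <= c.start) {
                stack.pop();
            }
            if (stack.isEmpty()) {
                roots.add(c);
            } else {
                stack.peek().children.add(c);
            }
            stack.push(c);
        }
        return roots;
    }

    static void print(Call call, String indent) {
        System.out.println(indent + call.name + " total=" + call.duration()
                + " self=" + call.selfTime());
        for (Call c : call.children) {
            print(c, indent + "  ");
        }
    }

    public static void main(String[] args) {
        List<Call> roots = build(Arrays.asList(
                new Call("checkout", 0, 100),
                new Call("cart.total", 10, 40),
                new Call("payment.charge", 50, 90),
                new Call("db.write", 60, 80)));
        for (Call root : roots) {
            print(root, "");
        }
    }
}
```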

Figure 3 and Figure 4 illustrate how a performance problem is narrowed down using the MDA approach. Both diagrams are Platform Independent Models [14]; however, developers can proceed to lower levels such as EJB Platform Specific Models [14] to identify technology specific events, such as lifecycle events, that can cause performance degradation.

The performance prediction module simulates the generated models and developers can specify different workload characteristics such as the number of simultaneous users and their inter-arrival rate. In addition, developers can specify expected performance attributes for particular ORCs, which are transformed into conditions for generating alerts during the simulation. We envisage that in the simulation process, developers will be able to change the generated models and observe the effects such changes can have on the overall performance of the application.
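A very small simulation of this kind is sketched below to show how workload parameters and expected performance attributes interact. It is deliberately simplistic and entirely an assumption: a single FIFO server with a fixed service time and a fixed inter-arrival time (rather than a rate), which is far less detailed than simulating generated UML models. ScenarioSimulation and its methods are hypothetical names.

```java
// Illustrative single-server simulation with deterministic arrivals and service.
final class ScenarioSimulation {
    // Response time (ms) of each request when requests arrive every
    // interArrivalMs and each takes serviceMs on one FIFO server.
    static double[] responseTimes(int requests, double interArrivalMs, double serviceMs) {
        double[] result = new double[requests];
        double serverFreeAt = 0.0;
        for (int i = 0; i < requests; i++) {
            double arrival = i * interArrivalMs;
            double start = Math.max(arrival, serverFreeAt); // wait if the server is busy
            serverFreeAt = start + serviceMs;
            result[i] = serverFreeAt - arrival;
        }
        return result;
    }

    // Number of simulated requests that violate the expected maximum response time.
    static int countAlerts(double[] responseTimesMs, double expectedMaxMs) {
        int alerts = 0;
        for (double t : responseTimesMs) {
            if (t > expectedMaxMs) {
                alerts++;
            }
        }
        return alerts;
    }

    public static void main(String[] args) {
        double[] times = responseTimes(5, 10.0, 15.0);
        System.out.println(countAlerts(times, 18.0) + " of " + times.length
                + " requests exceed 18 ms");
    }
}
```

Because requests arrive faster than they are served in the demo, the queue grows and later requests wait longer, which is the kind of effect a developer would look for when varying the workload.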
COMPAS will use different technological profiles corresponding to particular platforms such as EJB or .NET. Such profiles will contain known performance issues and patterns such as [15] for the platforms they represent and will facilitate the detection of wrong technological decisions or anti-patterns in that context. For example, an EJB PSM can show a performance alert when an entity bean finder-method returns a large result set. In such a situation, COMPAS may suggest a pattern such as Value List Handler [15] to alleviate the performance problem.

4. Results and Contributions

A proof-of-concept monitoring framework has been implemented for the EJB technology. Proxy components can be generated for any J2EE application, using a completely automated installation process. At the end of the process, a “proxy-ed” application (EAR) is available for redeployment. The proxy layer can capture performance metrics such as method invocation time, pool size, and lifecycle events for all session and entity beans. Work is under way to instrument message-driven beans as well.
Graphical consoles have been implemented that can show the deployment structure of the target application and real-time performance graphs for any EJB business method. A history of events can be displayed in the console or logged on physical storage for later analysis. Snapshots of the graphical consoles can be seen at the following URL: http://www.eeng.dcu.ie/~mosa/src/
Work is under way to implement the adaptive and learning capabilities for the proxy layer. Model extraction techniques such as injecting single scenarios into the system and monitoring it are being evaluated; it is expected that in the near future a simple feedback loop will be implemented.
Several small test J2EE applications were used to test the COMPAS proxy layer installation and functionality. In addition, COMPAS was tested on Sun’s Petstore sample application as well as on IBM’s Trade 3 J2EE performance benchmark. The performance overhead incurred by monitoring Trade 3 with COMPAS was found to be relatively small even when the system was under heavy load. We envisage that with the introduction of the adaptive monitoring techniques, the overall overhead will be further reduced to a negligible value, appropriate for long-term monitoring.

COMPAS facilitates the understanding of the structure and performance issues in large enterprise applications. It is completely portable and can be used with current and future generations of J2EE middleware from J2EE compliant vendors, without being locked into any particular middleware product. The installation procedure is fully automated and non-intrusive. It injects a lightweight proxy layer in the target application, enabling performance instrumentation for application-level events as well as for component lifecycle events. By taking an adaptive approach to monitoring (based on model data and learning), we support a shift from static performance management to dynamic performance management, which is critical in large-scale applications that can grow and evolve unpredictably. COMPAS reduces the need for assumptions in performance prediction by closely integrating adaptive monitoring with modelling and performance prediction. In COMPAS, real data drives the creation of the call paths; this follows the dynamic nature of binding in component-based systems.
We envisage that COMPAS could be integrated in any J2EE container and provide a reflective property that could enable applications to reflect upon themselves in performance management terms. Since the middleware would be providing COMPAS services, there would be no need for an installation procedure anymore. In such an environment, if an application were enabled for adaptation, it could use the performance information to optimise its behaviour; this approach comes to support the autonomic computing initiative for self-optimising systems.
Integrating well in iterative development processes (such as RUP and XP) as well as in post-production environments, COMPAS could prove useful as a non-intrusive and automated approach to performance tuning of large component-based systems.

5. References

  1. C. Szyperski, D. Gruntz, S. Murer, Component Software: Beyond Object-Oriented Programming, Second Edition, Addison-Wesley, November 2002
  2. I. Gorton, A. Liu, P. Brebner. Rigorous Evaluation of COTS Middleware Technology. In IEEE Computer, March 2003.
  3. P.C. Clements. Coming Attractions in Software Architecture. No.CMU/SEI-96-TR-003, Software Engineering Institute, Carnegie Mellon University, February 1996
  4. E. Cecchet, J. Marguerite, W. Zwaenepoel. Performance and scalability of EJB applications. In Proceedings of the 17th ACM Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA) November 2002, Seattle, WA
  5. F. Lange, R. Kroeger, M. Gergeleit. JEWEL: Design and Measurement of a Distributed Measurement System. In IEEE Transactions on Parallel and Distributed Systems, November 1992
  6. Sitraka. PerformaSure. http://www.sitraka.com/software/performasure/
  7. Empirix. Bean Test. http://www.empirix.com/empirix/web+test+monitoring/products/
  8. B. Tierney et al. A Monitoring Sensor Management System for Grid Environments. In Proceedings of Ninth IEEE International Symposium on High Performance Distributed Computing (HPDC'00), August 2000, Pittsburgh, PA.
  9. C.U. Smith, L.G. Williams. Performance and Scalability of Distributed Software Architectures: An SPE Approach, Parallel and Distributed Computing Practices, 2002
  10. T. Souder, S. Mancoridis, M. Salah. Form: A Framework for Creating Views of Program Executions. In Proceedings of IEEE International Conference on Software Maintenance ICSM'01, Florence, Italy, November 2001
  11. J. O. Kephart, D. M. Chess. The Vision of Autonomic Computing. In IEEE Computer, January 2003
  12. A. Mos, J. Murphy. Performance Management in Component-Oriented Systems Using a Model Driven Architecture™ Approach. In Proceedings of The 6th IEEE International Enterprise Distributed Object Computing Conference (EDOC), Lausanne, Switzerland, September 2002
  13. Object Management Group. UML Profile for Schedulability, Performance, and Time Specification. OMG document number ptc/02-03-02, OMG, 2002
  14. Object Management Group, Model Driven Architecture, OMG document number ormsc/2001-07-01, OMG, 2001
  15. J. Crupi, D. Alur, D. Malks. Core J2EE Patterns, Prentice Hall, 30 September 2001