ACM Home Page
Please provide us with feedback. Feedback
Program optimization for instruction caches
Full text PdfPdf (954 KB)
Source ACM SIGARCH Computer Architecture News archive
Volume 17 ,  Issue 2  (April 1989) table of contents
Special issue: Proceedings of ASPLOS-III: the third international conference on architecture support for programming languages and operating systems
Pages: 183 - 191  
Year of Publication: 1989
ISSN:0163-5964
Also published in ...
Author
S. McFarling  Computer Systems Laboratory, Stanford University
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 7,   Downloads (12 Months): 118,   Citation Count: 87
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues   peer to peer  

Tools and Actions: Review this Article  
Save this Article to a Binder    Display Formats: BibTex  EndNote ACM Ref   
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/68182.68200
What is a DOI?

ABSTRACT

This paper presents an optimization algorithm for reducing instruction cache misses. The algorithm uses profile information to reposition programs in memory so that a direct-mapped cache behaves much like an optimal cache with full associativity and full knowledge of the future. For best results, the cache should have a mechanism for excluding certain instructions designated by the compiler. This paper first presents a reduced form of the algorithm. This form is shown to produce an optimal miss rate for programs without conditionals and with a tree call graph, assuming basic blocks can be reordered at will. If conditionals are allowed, but there are no loops within conditionals, the algorithm does as well as an optimal cache for the worst case execution of the program consistent with the profile information. Next, the algorithm is extended with heuristics for general programs. The effectiveness of these heuristics are demonstrated with empirical results for a set of 10 programs for various cache sizes. The improvement depends on cache size. For a 512 word cache, miss rates for a direct-mapped instruction cache are halved. For an 8K word cache, miss rates fall by over 75%. Over a wide range of cache sizes the algorithm is as effective as increasing the cache size by a factor of 3 times. For 512 words, the algorithm generates only 32% more misses than an optimal cache. Optimized programs on a direct-mapped cache have lower miss rates than unoptimized programs on set-associative caches of the same size.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

1
 
2
Horowitz, M., Chow, P., et. al., "MIPS-X: A 20 MIPS Peak, 32-Bit Microprocessor with On- Chip Cache", IEEE Journal of Solid-State Circuits, Vol. SC-22, No. 5, October, 1987, pp. 790-799.
 
3
Agarwal, A., Chow, P., Horowitz, M., Acken, J., Salz, A., and Hennessy, J., "On-Chip Instruction Caches for High Performance Processors", Proc. of the Conference on Advanced Research in VLS{, Losleben, P.,ed., Stanford University, Stanford, Ca, March 1987.
 
4
5
6
 
7
Hatfield, D.J., and Gerald, j., "Program Restructuring for Virtual Memory", IBM Systems J., Vol. 10, No. 3, 1971, pp. 168-192.
8
 
9
Ferrari, D., "The Improvement of Program Behavior", Computer, Vol. 9, No. 11, Nov., 1976, pp. 39-47.
 
10
Baer, J.L., and Caughey, R., "Segmentation and Optimization of programs from Cyclic Structure Analysis ", Proc. AFIPS, 1972, pp. 23-36.
 
11
 
12
 
13
Abu-Sufrah, W., "Identifying Program Localities at the Source Level", Technical Report UIUCDCS -R-82-1108, Dept. of Computer Science, Univ. of Illinois, October 1982.
 
14
 
15
Chow, F., "Private Communication".
 
16
 
17
Belady, L.A., "A Study of Replacement Algorithms for a Virtual-Storage Computer ", IBM Systems J., Vol. 5, No. 2, 1966, pp. 78-101.

CITED BY  87
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 


Peer to Peer - Readers of this Article have also read: