EP 0 806 725 B1
5
10
15
20
25
30
35
40
45
50
55
2
Description
BACKGROUND OF THE INVENTION
[0001] This invention relates to code optimization in data processing systems during program compile time. More
particularly, this invention relates to a technique for improving the optimization of small assembly code routines during
program compile time.
[0002] The use of compilers in data processing systems to generate assembly code from high level source code is
well known. Current compilers employ a variety of optimization techniques in order to generate assembly code capable
of more efficiently executing program routines. Examples of such optimization techniques are spill code optimization,
register allocation using interference graph coloring techniques (such as the techniques disclosed in U.S. Patent Nos.
4,571,678 and 5,249,295), and stack storage management.
[0003] In a typical implementation of a compiler having optimization routines, the source code is initially translated
into an intermediate form, the optimization is performed on the intermediate code, and the result is assembled into the
final assembly code as the last stage in the process.
[0004] While the various optimization techniques have been found to be generally useful in enabling the generation
of efficient assembly code, certain routines have traditionally been incorporated into the final assembly code by by-
passing the optimization procedures and inserting hand-written assembly code in the final stage of the compiler. These
assembly routines are typically small routines such as a population count (e.g. calculate the number of bits in a particular
operand), setting a status register, and the like.
[0005] One known technique for accomplishing this purpose is to include in the code generator a mechanism for
assembly language insertions during a final step of the compiler, which is known as inlining. In general, the inlining
capability permits users to call routines which are implemented with assembly language templates organized into
special files with a distinct extension. As noted above, these assembly language templates are typically inlined late in
the compilation process after all the high level code has been subjected to the optimization procedures and reduced
to an assembly language form. Since many optimization procedures critical to processor performance have already
occurred before the inlining of the assembly language templates, these inserted assembly instructions are often poorly
optimized within the context of the insertion.
[0006] An article by R.E. Kole, "Optimizing Compiler Exploits DSP Features" High Performance Systems, vol. 11,
no. 2, February 1990, Manhasset, NY, US, pages 40-45 (hereinafter "KOLE") addresses the use of high-level languages
("HLL"), such as C, to write software for systems where digital-signal processors ("DSP") serve as hosts. KOLE states
the problem as "The use of high level languages can make it a lot easier to write software for systems where digital
signal processors serve as hosts - but the C programming language, preferred by most embedded-system designers,
is often considered unacceptable. C compilers have generally lacked optimizations for such DSP architectural features
as parallel instruction execution, multiple data buses, multiple memories, and pipelining. That's why Intermetrics and
Motorola are working together to supply a C programming environment, including an optimizing C compiler, for
DSP96002 processor." (KOLE , page 1) "While the reasons to consider C for DSP applications are strong, there remains
the fundamental issue of the performance of systems written in C. Key factors that affect the performance of C include
the general quality of compiler optimization, specific optimizations to exploit DSP hardware, and general applicability
of the C language to DSP hardware features. The biggest challenge is that C is a general-purpose language with a
standard definition, while DSPs have unique architectural features to solve specific engineering problems." (KOLE,
page 2, column 1, second full paragraph) "As a result, a specific DSP function may not be directly expressible in C."
(KOLE, page 2, column 1, third full paragraph) "Still, a C compiler can offer effective implementation for many DSP
algorithms. To do so, the compiler must provide maximum optimization ... [setting forth two requirements of optimization
with no bearing on the case at issue]. Finally, the compiler must permit easy access to hand-coded assembler routines
when a C implementation is inappropriate." (KOLE, page 2, column 1, fourth full paragraph) "Not all features of the
DSP 96002 are accessible today from the C language. One example is the processor's use of two independent 32-bit
memory spaces (X and Y)." (KOLE, page 4, column 3, fourth full paragraph) "Unfortunately, the C language does not
offer any facility for describing or manipulating data in multiple memories." (KOLE, page 2, column 1,last full paragraph)
"Use of X-memory objects, however, will require use of an in-line assembler." (KOLE, page 5, column 1, first full par-
agraph) "Another feature not addressable from C is the ability to perform modulo-address arithmetic automatically."
(KOLE, page 5, column 1, second full paragraph) "If these features cannot be directly manipulated for C, then they
must be easily accessed from within an overall C structure. The InterTools compiler offers two methods to directly
embed assembler code in a C program. The first, denoted in a source program by the -CASM key word, makes it easy
to expand out-of-line assembler subroutines in-line. In the second, denoted by the - ASM key word, the programmer
can control the in-line expression based on the form of the arguments given in each invocation." (KOLE, page 5, column
1, third full paragraph) "[The -CASM] approach has the advantage of offering the same semantics as a C subroutine
call, so that the entry and exit from the inline code are well behaved." (KOLE, page 5, column 1, fourth full paragraph)