Method and apparatus for early insertion of assembler code for

publicité
Europäisches Patentamt
(19)
European Patent Office
*EP000806725B1*
Office européen des brevets
(11)
EP 0 806 725 B1
EUROPEAN PATENT SPECIFICATION
(12)
(45) Date of publication and mention
(51) Int Cl.7:
of the grant of the patent:
27.08.2003 Bulletin 2003/35
G06F 9/45
(21) Application number: 97106998.4
(22) Date of filing: 28.04.1997
(54) Method and apparatus for early insertion of assembler code for optimization
Verfahren und Anordnung zum frühzeitigen Einfügen von Assemblercode zwecks Optimierung
Procédé et dispositif d’introduction prématuré de code assembleur à des fins d’optimisation
(84) Designated Contracting States:
(56) References cited:
DE FR GB NL SE
(30) Priority: 07.05.1996 US 643895
(43) Date of publication of application:
12.11.1997 Bulletin 1997/46
(73) Proprietor: SUN MICROSYSTEMS, INC.
Mountain View, California 94043-1100 (US)
(72) Inventor: Goebel, Kurt J.
Mountain View, California 94043 (US)
(74) Representative: Käck, Jürgen et al
EP 0 806 725 B1
Kahler Käck Mollekopf
Patentanwälte
Vorderer Anger 239
86899 Landsberg (DE)
• R. E. KOLE: "OPTIMIZING COMPILER EXPLOITS
DSP FEATURES" HIGH PERFORMANCE
SYSTEMS, vol. 11, no. 2, February 1990,
MANHASSET, NY, US, pages 40-45,
XP000096460
• D. SHEAR: "HLL Compilers and DSP run-time
libraries make DSP-system programming easy"
EDN - ELECTRONIC DESIGN NEWS, vol. 33, no.
13, 23 June 1988, NEWTON, MA, US, pages 69-76,
XP000105779
• R. B. GROVE ET AL: "GEM Optimizing Compilers
for Alpha AXP Systems" COMCON SPRING ’93 DIGEST OF PAPERS, 22 - 26 February 1993, SAN
FRANCISCO, CA, US, pages 465-473,
XP000379081
Note: Within nine months from the publication of the mention of the grant of the European patent, any person may give
notice to the European Patent Office of opposition to the European patent granted. Notice of opposition shall be filed in
a written reasoned statement. It shall not be deemed to have been filed until the opposition fee has been paid. (Art.
99(1) European Patent Convention).
Printed by Jouve, 75001 PARIS (FR)
EP 0 806 725 B1
Description
BACKGROUND OF THE INVENTION
5
10
15
20
25
30
35
40
45
50
55
[0001] This invention relates to code optimization in data processing systems during program compile time. More
particularly, this invention relates to a technique for improving the optimization of small assembly code routines during
program compile time.
[0002] The use of compilers in data processing systems to generate assembly code from high level source code is
well known. Current compilers employ a variety of optimization techniques in order to generate assembly code capable
of more efficiently executing program routines. Examples of such optimization techniques are spill code optimization,
register allocation using interference graph coloring techniques (such as the techniques disclosed in U.S. Patent Nos.
4,571,678 and 5,249,295), and stack storage management.
[0003] In a typical implementation of a compiler having optimization routines, the source code is initially translated
into an intermediate form, the optimization is performed on the intermediate code, and the result is assembled into the
final assembly code as the last stage in the process.
[0004] While the various optimization techniques have been found to be generally useful in enabling the generation
of efficient assembly code, certain routines have traditionally been incorporated into the final assembly code by bypassing the optimization procedures and inserting hand-written assembly code in the final stage of the compiler. These
assembly routines are typically small routines such as a population count (e.g. calculate the number of bits in a particular
operand), setting a status register, and the like.
[0005] One known technique for accomplishing this purpose is to include in the code generator a mechanism for
assembly language insertions during a final step of the compiler, which is known as inlining. In general, the inlining
capability permits users to call routines which are implemented with assembly language templates organized into
special files with a distinct extension. As noted above, these assembly language templates are typically inlined late in
the compilation process after all the high level code has been subjected to the optimization procedures and reduced
to an assembly language form. Since many optimization procedures critical to processor performance have already
occurred before the inlining of the assembly language templates, these inserted assembly instructions are often poorly
optimized within the context of the insertion.
[0006] An article by R.E. Kole, "Optimizing Compiler Exploits DSP Features" High Performance Systems, vol. 11,
no. 2, February 1990, Manhasset, NY, US, pages 40-45 (hereinafter "KOLE") addresses the use of high-level languages
("HLL"), such as C, to write software for systems where digital-signal processors ("DSP") serve as hosts. KOLE states
the problem as "The use of high level languages can make it a lot easier to write software for systems where digital
signal processors serve as hosts - but the C programming language, preferred by most embedded-system designers,
is often considered unacceptable. C compilers have generally lacked optimizations for such DSP architectural features
as parallel instruction execution, multiple data buses, multiple memories, and pipelining. That's why Intermetrics and
Motorola are working together to supply a C programming environment, including an optimizing C compiler, for
DSP96002 processor." (KOLE , page 1) "While the reasons to consider C for DSP applications are strong, there remains
the fundamental issue of the performance of systems written in C. Key factors that affect the performance of C include
the general quality of compiler optimization, specific optimizations to exploit DSP hardware, and general applicability
of the C language to DSP hardware features. The biggest challenge is that C is a general-purpose language with a
standard definition, while DSPs have unique architectural features to solve specific engineering problems." (KOLE,
page 2, column 1, second full paragraph) "As a result, a specific DSP function may not be directly expressible in C."
(KOLE, page 2, column 1, third full paragraph) "Still, a C compiler can offer effective implementation for many DSP
algorithms. To do so, the compiler must provide maximum optimization ... [setting forth two requirements of optimization
with no bearing on the case at issue]. Finally, the compiler must permit easy access to hand-coded assembler routines
when a C implementation is inappropriate." (KOLE, page 2, column 1, fourth full paragraph) "Not all features of the
DSP 96002 are accessible today from the C language. One example is the processor's use of two independent 32-bit
memory spaces (X and Y)." (KOLE, page 4, column 3, fourth full paragraph) "Unfortunately, the C language does not
offer any facility for describing or manipulating data in multiple memories." (KOLE, page 2, column 1,last full paragraph)
"Use of X-memory objects, however, will require use of an in-line assembler." (KOLE, page 5, column 1, first full paragraph) "Another feature not addressable from C is the ability to perform modulo-address arithmetic automatically."
(KOLE, page 5, column 1, second full paragraph) "If these features cannot be directly manipulated for C, then they
must be easily accessed from within an overall C structure. The InterTools compiler offers two methods to directly
embed assembler code in a C program. The first, denoted in a source program by the -CASM key word, makes it easy
to expand out-of-line assembler subroutines in-line. In the second, denoted by the - ASM key word, the programmer
can control the in-line expression based on the form of the arguments given in each invocation." (KOLE, page 5, column
1, third full paragraph) "[The -CASM] approach has the advantage of offering the same semantics as a C subroutine
call, so that the entry and exit from the inline code are well behaved." (KOLE, page 5, column 1, fourth full paragraph)
2
EP 0 806 725 B1
5
10
15
20
25
"|T]he -ASM form of inline is more like a substitution macro than a subroutine coded in-line." (KOLE, page 5, column
2, first paragraph)
[0007] An article by D. Shear, "HLL Compilers and DSP run-time libraries make DSP system programming easy"
EDN-Electronic Design News, vol. 33, no. 13, 23 June 1988, Newton, MA, US, pages 69-76 (hereinafter "SHEAR") is
directed at the use of HLL compilers and DSP run-time libraries for making DSP programming easier. "You no longer
have to be an algorithm expert to accomplish a DSP-based design: High-level languages (HLLs) have arrived en masse
for digital-signal-processing (DSP) applications. As recently as a year ago, no viable HLLs were even available for
DSPs. Now, however, almost all of the general-purpose floating point DSP chips offer HLL compilers, and many of the
fixed-point chips do so as well. You can create applications software for your DSP system in C, Pascal, or Forth, though
C is by far the dominant language. DSP run-time libraries that contain useful, prewritten routines for many popular
DSP algorithms are also available." (SHEAR, page 1, column 1, first paragraph) "High-level languages can incur speed
penalties. When you use a high-level language, the compiler takes care of all the details of the program and creates
the assembly-language code. For the most part, compilers are stupid; therefore, the code they generate isn't very
efficient." (SHEAR, page 3, column 1, last paragraph) "Fortunately, although you need to be aware of the speed penalty
HLLs can incur [ ] you could solve the problem by writing part of your code in assembly language. Another solution is
to use an optimizing compiler, which a number of manufacturers offer. An optimizing compiler goes through the code
and reorganizes it, using a variety of tricks to enhance the program's efficiency." (SHEAR, page 3, column 2, last
paragraph) "Although optimizing compilers are useful, there'll be times when the code resulting from your optimizing
compiler just doesn't run fast enough for your purposes. You can speedup those sections of code by rewriting them
as in-line assembly-language code or by using an assembly-language routine called from the high-level language."
(SHEAR, page 4, column 3, first full paragraph) "Optimizing compilers usually don't operate on the in-line assembly
language. The vendor assumes instead, that you've already optimized the code yourself. AT&Ts C compiler allows you
to switch the optimization back on if you want to use it on the in-line assembly-language code. The company recommends that you always send your entire program, including the in-line assembly-language code, through the optimizer
even if it doesn't require optimization, so that your code will always by compatible with future releases of the compiler."
(SHEAR, page 4, column 3, third full paragraph).
SUMMARY OF THE INVENTION
30
35
40
45
50
55
[0008] The invention provides a technique for permitting early inlining of assembly code templates in appropriate
cases so that the assembly code can be subjected to all the optimization procedures incorporated into a compiler, with
the resulting more efficient generation of assembly code.
[0009] From a process standpoint, the invention comprises a method as defined in claim 1.
[0010] The step of virtualizing the template preferably includes the steps of identifying physical register assignments
within the template, and transforming the physical register assignments into virtual register assignments. The steps of
identifying and transforming are preferably sequentially performed on the input operands and the output operands in
the assembly code template.
[0011] The combined intermediate form assembly code and source code are subjected to the optimization procedures
performed by the compiler. After the optimization procedures are completed, the compiler generates assembly code.
For those assembly code templates deemed ineligible for early inlining, these ineligible assembly code routines are
later inlined into the result of the optimization procedure during the generation of the assembly code by the compiler.
[0012] From a system standpoint, the invention comprises a computer system as defined in claim 7.
[0013] The compiler inlining procedure includes a procedure for virtualizing a template, a procedure for transforming
the assembly code in the template into an intermediate form, a procedure for transforming the source code into an
intermediate form, and a procedure for combining the intermediate form assembly code and the intermediate form
source code. The procedure for virtualizing the template includes a procedure for identifying physical register assignments within the template and transforming the physical register assignments into virtual register assignments. The
identifying and transforming procedures are preferably sequentially performed on the input operands and the output
operands.
[0014] The compiler further preferably includes a procedure for generating assembly code from the result of the at
least one optimization procedure, as well as a procedure for inlining into the assembly code the assembly code from
any assembly code template having an instruction or an operand not included in the recognized set.
[0015] Furthermore, the invention comprises a computer program product as defined in claim 14.
[0016] The invention enables small assembly code routines to be inlined early with the surrounding source code
routines prior to performing optimization processing within the compiler. As a result, more effective optimization is
performed, which results in the generation of assembly code capable of more effective execution of a given program.
[0017] For a fuller understanding of the nature and advantages of the invention, reference should be had to the
ensuing detailed description, taken in conjunction with the accompanying drawings.
3
EP 0 806 725 B1
BRIEF DESCRIPTION OF THE DRAWINGS
[0018]
5
10
15
Fig. 1 is a block diagram of a data processing system incorporating the invention.
Fig. 2 is a functional diagram illustrating various portions of the compiler of Fig. 1.
Fig. 3(a) is a graph of an internal representation of code for a main procedure generated by the compiler before
inlining.
Fig. 3(b) is a graph of an internal representation of code for an assembly template generated by the compiler
before inlining.
Fig. 4 is a graph of an internal representation of code for the main procedure after early inlining and virtualization
of the assembly language template in accordance with the present invention.
Fig. 5 is a flow chart showing steps performed by the compiler during early inlining.
Figs. 6(a) and 6(b) show examples of code where physical register numbers have been converted to virtual register
numbers.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
20
25
[0019] Turning now to the drawings, Fig. 1 is a block diagram of a data processing system 100 incorporating the
invention. Data processing system 100 includes a CPU 102, a memory 104, and I/O lines 106. Memory 104 includes
source code 10, compiler software 12, assembly code 14, an intermediate representation 16 of the source code 10,
an assembly language template 18, and an intermediate representation 20 of the source code 10 after inlining. It will
be understood by persons of ordinary skill in the art that data processing system 100 can also include numerous
elements not shown in the Figure for the sake of clarity, such as disk drives, keyboards, display devices, network
connections, additional memory, additional CPUs, LANs, etc.
[0020] Fig. 2 illustrates the major portions of compiler 12. As seen in this figure, source code from source 10 is initially
supplied to an internal representation (IR) translator 202 in which the source code is converted to intermediate representation 16. An example of a C language procedure "main" that calls an assembly language template is:
30
35
40
This procedure calls an assembly language template "getzero" having no parameters. An example of an intermediate
representation of the procedure main is shown in Fig. 3(a) and discussed in further detail below. An example of the
SPARC assembly language template getzero called by the main procedure is as follows:
45
50
55
The assembly language template "getzero" has no parameters and consists of a load instruction and a store instruction.
SPARC is a trademark of SPARC International, Inc. An example of an intermediate representation of the template
getzero is shown in Fig. 3(b) and discussed in further detail below. The process to effect a transformation to intermediate
form is well-known to persons of ordinary skill in the art and will vary depending on the processor and type of languages
used in a given implementation.
[0021] An Assembly Template Virtualization process 204 combines the internal representation 16 of the source code
with the internal representation of eligible externally supplied assembly templates 18 in the manner described below.
This process is termed "early inlining". Thereafter, the combined intermediate representation of the source code and
the intermediate representation of the assembly code are subjected to the full array of optimization procedures 206
4
EP 0 806 725 B1
5
10
15
20
25
30
35
40
45
50
modeled into the compiler. The result of the optimization procedures is input to assembler 208, which generates the
assembly code module 14. Small assembly templates/routines deemed ineligible for inlining prior to optimization are
inlined with the assembly code 14 generated by assembler module 208.
[0022] Fig. 5 is a flow chart of steps performed to do early inlining of assembly language routines by compiler 12.
Specifically these steps, are performed by CPU 102 executing instructions of assembly template virtualization process
204. The following example uses procedure "main" and template "getzero" provided above.
[0023] As shown in Fig. 5, the early inlining process proceeds as follows. In steps 502, the IR of source code 10 of
main is reviewed for calls to an assembly language template. Whenever a call to an assembly language template is
encountered in step 502 the called template is reviewed to determine if it can be inlined early. To make this determination, in step 504 each instruction and operand in the template is compared with a prepared list of recognized instructions and operands stored in memory 104. If not all instructions and operands in the template are recognized by the
optimization process, the assembly code represented by the template is scheduled for insertion (late inlining) at the
assembler stage of process 208 in step 506.
[0024] Otherwise, if all instructions and operands are recognized, i.e., can be handled by optimizer 206, the assembly
template code 18 is converted into an intermediate representation form in step 508 using techniques known to persons
of ordinary skill in the artFig. 3(b) discussed below provides an example of intermediate representation of the template
getzero.
[0025] The intermediate form of the template is then inlined into the intermediate form of the source code and virtualized. (In some implementations, virtualization occurs before the intermediate representations are combined). The
physical registers and stack pointer references in the template are virtualized by replacing actual physical registers
with virtual registers and virtual stack pointers in steps 512-518, as described below.
[0026] As a result, the assembly code represented by the assembly language template has been transformed into
the intermediate form that is utilized by the compiler 12 just as the surrounding high level language constructs from
the source code have been transformed. Thereafter, the inserted instructions are fully optimized along with the rest of
the code in process 206 so that normal optimization (including such known processes as global scheduling, pipelining,
global register optimization, etc.) can be performed.
[0027] The list of instructions and operands of step 504 can contain any instructions and/or operands, the only criteria
for inclusion in the list being that the optimization process 206 be able to optimize the instructions or operands when
it encounters them during optimization. For example, if the optimizer 206 of an preferred embodiment does not recognize (or optimize) references to the SPARC floating point %fsr register, templates referencing that register cannot be
early inlined. As a second example, if the optimizer 206 of a preferred embodiment does not recognize the "popc"
instruction, templates containing that instruction cannot be early inlined.
[0028] If all instructions and operands are supported and all operands are modeled, the template is considered
eligible for early inlining. In step 510 the call (see elements 302 and 304 of Fig. 3(a)) is removed and the determined
template (see Fig. 3(b)) is inserted in the place of the call.
[0029] Once inserted, the register operands of the inline template are then virtualized in the following manner. In
step 512, any input arguments to the template are first identified. The argument setup code for the arguments is then
detected and analyzed. Physical register assignments are then transformed into virtual registers. For example, Fig. 6
(a) shows an example where register %o0 contains an input argument for the template and where a value is loaded
into register %o0 prior to the location where the template will be inserted. At both locations, %o0 is changed to next
available, virtual register (in this case %r101). Next, in step 514 any output arguments of the template are identified
as well as places in the calling routine that use the output arguments, and physical registers are changed to virtual
registers. Fig. 6(b) shows an example of such a reassignment, in which physical register %o0 is changed to virtual
register %r102. Next, in step 516, the body of the inline template is then reviewed for further physical register usage.
Any detected use of a physical register is transformed into a next available virtual register. (In a preferred embodiment,
register %g0 is not virtualized because it is a read-only register having a hardwired value of "0"). In step 518, uses of
physical stack memory locations uses are detected and reassigned to local stack locations (compare the stack pointer
reference in Figs. 3(b) and 4).
[0030] After completing virtualization, all traces of the inline call have been removed. The result appears to the
compiler 12 as if the instructions forming the template had been generated from the high level language from source
10. As a consequence, subsequent code optimization phases in compiler 12 are not constrained in any way.
[0031] A fragment of the internal representation 16 (also called "IR") of the procedure main before early inlining is:
55
5
EP 0 806 725 B1
5
10
Fig. 3(a) shows a graphical representation of the IR for main stored in memory 104. Note that the IR initially contains
a call 302 to the template "getzero."
[0032] A fragment of the internal representation of the template "getzero" main before early inlining is:
15
20
Fig. 3(b) shows a graphical representation of the IR for template getzero stored in memory 104.
[0033] The IR fragment for main after early inlining of template getzero and virtualization is:
25
30
35
40
Fig. 4 shows the IR representation of procedure main after early inlining and virtualization. Note that the template
getzero has replaced the call 302,304 to getzero. Note also that the reference to physical register %o0 has been
replaced with a reference to the virtual register %r100. The reference to physical stack location %sp+0x44 has been
replaced with a reference to local stack location %fp+ -16. (In the example, both %r100 and %fp+-16 are completely
arbitrary, and merely represent the next available virtual register and the next available local stack location.)
[0034] The template's virtualized instructions are now ready for the code generator's optimization phases. In the
example, the template's output, register %r100 is not used by the C code of the main routine. Similarly, the stack
location fp+-16 is not needed. As a result, the entire template is optimized away by optimization routine 206:
45
50
55
6
EP 0 806 725 B1
5
10
15
20
25
Thus, in this simple example, early inlining allows the optimization routine 206 to determine that the inlined routine is
unnecessary and to remove it.
[0035] As will now be apparent, the invention enables some small assembly code routines to be combined with the
source code prior to the optimization procedures in compiler 12. Consequently, those eligible routines can be optimized
along with surrounding portions of the code originating with source 10 to enhance the overall performance of the
assembly code. In addition, for those assembly code routines which are ineligible for the early inlining with source
code, these ineligible routines are processed in the standard way: viz. by later inlining in the assembler module.
[0036] While the above provides a full and complete disclosure of the preferred embodiment of the invention, various
modifications, alternate constructions and equivalents will occur to those skilled in the art. Therefore, the above should
not be construed as limiting the invention, which is defined by the appended claims.
Claims
30
1.
A method of inserting an assembly code routine into a source code routine (10), said method comprising the steps
of:
providing an assembly code template (18) having the instructions and operands of the assembly code routine;
providing a set of recognized instructions recognizable by the compiler (12) for code optimization;
35
characterized by:
scanning (202) the instructions and operands of the template (18) to determine whether all instructions and
operands of the template are included in the set of recognized instructions; if so,
virtualizing (204) the template (18);
transforming the assembly code (18) into an intermediate form usable by the compiler (12);
transforming a source code (10) into an intermediate form (16) usable by the compiler; and
combining the intermediate form assembly code and the intermediate form source code (16),
40
45
wherein the assembly code routine is inserted into the source code routine (10) prior to optimization (206)
in a data processing system (100) compiler (12).
2.
The method of claim 1 wherein said step of virtualizing (204) the template (18) includes the steps of identifying
physical register assignments within the template(18), and transforming the physical register assignments into
virtual register assignments.
3.
The method of claim 2 wherein said assembly code template (18) has input operands and output operands; and
wherein said steps of identifying and transforming are sequentially performed on input operands and output operands.
4.
The method of claim 1 further including the step of subjecting the combined (20) intermediate form assembly code
and intermediate form source code (16) to at least one optimization procedure (206).
50
55
7
EP 0 806 725 B1
5.
The method of Claim 4 further Including the step of generating (208) assembly code (14) from the result of the at
least one optimization procedure (206).
6.
The method of claim 5, further including the step of inlining (208) into the result of the at least one optimization
procedure (206) any assembly code routine having any instruction or operand not included in said recognized set.
7.
A computer system, comprising:
5
a central processing unit (CPU) (102) having a memory unit (104) for storing assembly code (14);
a source code (10) supply specifying program routine for use by the CPU (102);
a source of assembly code routines in the form of templates (18) having instructions and operands;
a compiler (12) for generating assembly code (14) from the source code (10) and the assembly code routines
(18), said assembly compiler (12) including at least one optimization procedure (206); and
a set of recognized instructions recognizable by the compiler (12) for code optimization (206);
10
15
characterized by:
said compiler (12) further including a first procedure (204) for scanning the instructions and operands in a
given assembly code template (18), to determine whether all instructions and operands in the template (18)
are included in the recognized set, and a second procedure for inlining the assembly code template (18) into
the source code (10) prior to the optimization procedure (206) whenever all instructions and operands in a
given assembly code template (18) are included in the recognized set.
20
8.
The computer system according to claim 7 wherein the inlining procedure includes a procedure for virtualizing
(204) the template (18), a procedure for transforming the assembly code in the template (18) into an intermediate
form, a procedure for transforming the source code (10) into an intermediate form (16), and a procedure for combining the intermediate form assembly code and the intermediate form source code (16).
9.
The computer system of claim 8 wherein the procedure for virtualizing the templates (18) includes a procedure for
identifying physical register assignments within the template and transforming the physical register assignments
into virtual register assignments.
25
30
35
10. The computer system according to claim 9 wherein the assembly code template includes input operands and
output operands; and wherein the identifying and transforming procedures are sequentially performed on the input
operands and output operands.
11. The computer system of claim 7 wherein the compiler (12) further includes a procedure for subjecting the combined
(20) intermediate form assembly code and source code to at least one optimization process (206).
40
45
12. The computer system of claim 11 wherein the compiler (12) includes a procedure for generating (206) assembly
code (14) from the result of the at least one optimization procedure (206).
13. The computer system according to claim 12 wherein the compiler (12) includes a procedure for inlining (208) into
the assembly code (20) the assembly code from any assembly code template (18) having an instruction or an
operand not included in the said recognized set.
14. A computer program product, comprising:
50
55
a computer usable medium having computer readable code embodied therein for causing early inlining of an
assembly language template (18) prior to a compiler optimization process (206), the computer program product
comprising:
a source code supply (10) specifying program routine for providing the source code to the compiler (12);
an assembly code template (18) supply routine, supplying to the compiler templates having instructions
and operands;
a compiler (12) for generating assembly code (14) from the source code (10) and the assembly code
routines (18), said assembly compiler (12) including at least one optimization procedure (206); and
a set of recognized instructions recognizable by the compiler (12) for code optimization (206);
8
EP 0 806 725 B1
characterized by:
said compiler (12) further including a first procedure (202) for scanning the instructions and operands in a
given assembly code template (18) to determine whether all instructions and operands in the template (18)
are included in the recognized set, and a second procedure for inlining the assembly code template (18) into
the source code (10) prior to the optimization procedure (206) whenever all instructions and operands in a
given assembly code template (18) are Included in the recognized set.
5
10
15
20
15. The computer program product according to claim 14 wherein the inlining procedure includes a procedure for
virtualizing (204) the template (18), a procedure for transforming the assembly code (10) In the template (18) into
an intermediate form, a procedure for transforming the source code into an intermediate form (16), and a procedure
for combining the intermediate form assembly code and the intermediate form source code (16).
16. The computer program product o f claim 15 wherein the procedure for virtualizing (204) the template (18) includes
a procedure for identifying physical register assignments within the template and transforming the physical register
assignments into virtual register assignments.
17. The computer program product according to claim 16 wherein the assembly code template (18) includes input
operands and output operands; and wherein the identifying and transforming procedures are sequentially performed on the input operands and the output operands.
18. The computer program product of claim 14 wherein said compiler (12) further includes a procedure for subjecting
the combined intermediate form (20) assembly code and source code to at least one optimization process (206).
25
30
19. The computer program product of claim 18 wherein the compiler (12) includes a procedure for generating assembly
code (14) from the result of at least one optimization procedure (206).
20. The computer program product according to claim 19 wherein the compiler (12) includes a procedure (208) for
inlining into the assembly code (14) the assembly code from any assembly code template having an instruction
or an operand not included in said recognized set.
Patentansprüche
35
1.
Verfahren zum Einfügen einer Assemblercoderoutine in eine Quellencoderoutine (10), wobei das Verfahren die
Schritte umfasst:
Vorsehen einer Assemblercodeschablone (18) mit den Befehlen und Operanden der Assemblercoderoutine;
Vorsehen eines Satzes von erkannten Befehlen, die vom Kompilierer(12) erkennbar sind, zur Codeoptimierung;
40
gekennzeichnet durch:
Abtasten (202) der Befehle und Operanden der Schablone (18), um festzustellen, ob alle Befehle und Operanden der Schablone in dem Satz von erkannten Befehlen enthalten sind; wenn ja
Virtualisieren (204) der Schablone (18);
Transformieren des Assemblercodes (18) in eine Zwischenform, die vom Kompilierer (12) verwendbar ist;
Transformieren eines Quellencodes (10) in eine Zwischenform (16), die vom Kompilierer verwendbar ist; und
Kombinieren des Zwischenform-Assemblercodes und des Zwischenform-Quellencodes (16),
45
50
wobei die Assemblercoderoutine in die Quellencoderoutine (10) vor der Optimierung (206) in einen Kompilierer
(12) eines Datenverarbeitungssystems (100) eingefügt wird.
2.
Verfahren nach Anspruch 1, wobei der Schritt des Virtualisierens (204) der Schablone (18) die Schritte des Identifizierens von Belegungen von physikalischen Registern innerhalb der Schablone (18) und des Transformierens
der Belegungen von physikalischen Registern in Belegungen von virtuellen Registern umfasst.
3.
Verfahren nach Anspruch 2, wobei die Assemblercodeschablone (18) Eingangsoperanden und Ausgangsoperan-
55
9
EP 0 806 725 B1
den aufweist; und wobei die Schritte des Identifizierens und Transformierens nacheinander an den Eingangsoperanden und Ausgangsoperanden durchgeführt werden.
4.
Verfahren nach Anspruch 1, welches ferner den Schritt des Unterziehens des kombinierten (20) ZwischenformAssemblercodes und Zwischenform-Quellencodes (16) mindestens einer Optimierungsprozedur (206) umfasst.
5.
Verfahren nach Anspruch 4, welches ferner den Schritt des Erzeugens (208) eines Assemblercodes (14) aus dem
Ergebnis der mindestens einen Optimierungsprozedur (206) umfasst.
6.
Verfahren nach Anspruch 5, welches ferner den Schritt des Einreihens (208) irgendeiner Assemblercoderoutine
mit irgendeinem Befehl oder Operanden, der nicht im erkannten Satz enthalten ist, in das Ergebnis der mindestens
einen Optimierungsprozedur (206) umfasst.
7.
Computersystem mit:
5
10
15
einer Zentralverarbeitungseinheit (CPU) (102) mit einer Speichereinheit (104) zum Speichern eines Assemblercodes (14);
einer Zuführungsfestlegungs-Programmroutine für den Quellencode (10) zur Verwendung durch die CPU
(102);
einer Quelle für Assemblercoderoutinen in Form von Schablonen (18) mit Befehlen und Operanden;
einem Kompilierer (12) zum Erzeugen des Assemblercodes (14) aus dem Quellencode (10) und den Assemblercoderoutinen (18), wobei der Assemblerkompilierer (12) mindestens eine Optimierungsprozedur (206)
umfasst; und
einem Satz von erkannten Befehlen, die vom Kompilierer (12) erkennbar sind, zur Codeoptimierung (206);
20
25
dadurch gekennzeichnet, dass:
der Kompilierer (12) ferner eine erste Prozedur (204) zum Abtasten der Befehle und Operanden in einer gegebenen Assemblercodeschablone (18), um festzustellen, ob alle Befehle und Operanden in der Schablone
(18) im erkannten Satz enthalten sind, und eine zweite Prozedur zum Einreihen der Assemblercodeschablone
(18) in den Quellencode (10) vor der Optimierungsprozedur (206), sobald alle Befehle und Operanden in einer
gegebenen Assemblercodeschablone (18) im erkannten Satz enthalten sind, umfasst.
30
8.
Computersystem nach Anspruch 7, wobei die Einreihungsprozedur eine Prozedur zum Virtualisieren (204) der
Schablone (18), eine Prozedur zum Transformieren des Assemblercodes in der Schablone (18) in eine Zwischenform, eine Prozedur zum Transformieren des Quellencodes (10) in eine Zwischenform (16) und eine Prozedur
zum Kombinieren des Zwischenform-Assemblercodes und des Zwischenform-Quellencodes (16) umfasst.
9.
Computersystem nach Anspruch 8, wobei die Prozedur zum Virtualisieren der Schablonen (18) eine Prozedur
zum Identifizieren von Belegungen von physikalischen Registern innerhalb der Schablone und zum Transformieren
der Belegungen von physikalischen Registern in Belegungen von virtuellen Registern umfasst.
35
40
45
10. Computersystem nach Anspruch 9, wobei die Assemblercodeschablone Eingangsoperanden und Ausgangsoperanden umfasst; und wobei die Identifikationsund Transformationsprozeduren nacheinander an den Eingangsoperanden und den Ausgangsoperanden durchgeführt werden.
11. Computersystem nach Anspruch 7, wobei der Kompilierer (12) ferner eine Prozedur zum Unterziehen des kombinierten (20) Zwischenform-Assemblercodes und Quellencodes mindestens einem Optimierungsprozess (206)
umfasst.
50
12. Computersystem nach Anspruch 11, wobei der Kompilierer (12) eine Prozedur zum Erzeugen (206) eines Assemblercodes (14) aus dem Ergebnis der mindestens einen Optimierungsprozedur (206) umfasst.
55
13. Computersystem nach Anspruch 12, wobei der Kompilierer (12) eine Prozedur zum Einreihen (208) des Assemblercodes von irgendeiner Assemblercodeschablone (18) mit einem Befehl oder Operanden, der nicht im erkannten Satz enthalten ist, in den Assemblercode (20) umfasst.
14. Computerprogrammprodukt mit:
10
EP 0 806 725 B1
einem vom Computer verwendbaren Medium mit einem darin verkörperten maschinenlesbaren Code zum
Bewirken einer frühen Einreihung einer Assemblersprachschablone (18) vor einem Kompilierer-Optimierungsprozess (206), wobei das Computerprogrammprodukt umfasst:
eine Zuführungsfestlegungs-Programmroutine für einen Quellencode (10) zum Liefern des Quellencodes
zum Kompilierer (12);
eine Zuführungsroutine für eine Assemblercodeschablone (18), die Schablonen mit Befehlen und Operanden zum Kompilierer liefert;
einen Kompilierer (12) zum Erzeugen eines Assemblercodes (14) aus dem Quellencode (10) und den
Assemblercoderoutinen (18), wobei der Assemblerkompilierer (12) mindestens eine Optimierungsprozedur (206) umfasst; und
einen Satz von erkannten Befehlen, die vom Kompilierer (12) erkennbar sind, zur Codeoptimierung (206);
5
10
dadurch gekennzeichnet, dass:
15
der Kompilierer (12) ferner eine erste Prozedur (202) zum Abtasten der Befehle und Operanden in einer gegebenen Assemblercodeschablone (18), um festzustellen, ob alle Befehle und Operanden in der Schablone
(18) im erkannten Satz enthalten sind, und eine zweite Prozedur zum Einreihen der Assemblercodeschablone
(18) in den Quellencode (10) vor der Optimierungsprozedur (206), sobald alle Befehle und Operanden in einer
gegebenen Assemblercodeschablone (18) im erkannten Satz enthalten sind, umfasst.
20
25
30
15. Computerprogrammprodukt nach Anspruch 14, wobei die Einreihungsprozedur eine Prozedur zum Virtualisieren
(204) der Schablone (18), eine Prozedur zum Transformieren des Assemblercodes (10) in der Schablone (18) in
eine Zwischenform, eine Prozedur zum Transformieren des Quellencodes in eine Zwischenform (16) und eine
Prozedur zum Kombinieren des Zwischenform-Assemblercodes und des Zwischenform-Quellencodes (16) umfasst.
16. Computerprogrammprodukt nach Anspruch 15, wobei die Prozedur zum Virtualisieren (204) der Schablone (18)
eine Prozedur zum Identifizieren von Belegungen von physikalischen Registern innerhalb der Schablone und zum
Transformieren der Belegungen von physikalischen Registern in Belegungen von virtuellen Registern umfasst.
17. Computerprogrammprodukt nach Anspruch 16, wobei die Assemblercodeschablone (18) Eingangsoperanden und
Ausgangsoperanden umfasst; und wobei die Identifikationsund Transformationsprozeduren nacheinander an den
Eingangsoperanden und den Ausgangsoperanden durchgeführt werden.
35
18. Computerprogrammprodukt nach Anspruch 14, wobei der Kompilierer (12) ferner eine Prozedur zum Unterziehen
des kombinierten Zwischenform- (20) Assemblercodes und Quellencodes mindestens einem Optimierungsprozess (206) umfasst.
40
45
19. Computerprogrammprodukt nach Anspruch 18, wobei der Kompilierer (12) eine Prozedur zum Erzeugen eines
Assemblercodes (14) aus dem Ergebnis der mindestens einen Optimierungsprozedur (206) umfasst.
20. Computerprogrammprodukt nach Anspruch 19, wobei der Kompilierer (12) eine Prozedur (208) zum Einreihen
des Assemblercodes von irgendeiner Assemblercodeschablone mit einem Befehl oder Operanden, der nicht im
erkannten Satz enthalten ist, in den Assemblercode (14) umfasst.
Revendications
50
55
1.
Procédé d'insertion d'un sous-programme en code assembleur dans un sous-programme en code source (10),
ledit procédé comprenant les étapes consistant à :
fournir un modèle de code en assembleur (18) comportant les instructions et les opérandes du sous-programme de code en assembleur,
fournir un ensemble d'instructions reconnues reconnaissables par le compilateur (12) en vue d'une optimisation de code,
caractérisé par le fait de :
11
EP 0 806 725 B1
balayer (202) les instructions et les opérandes du modèle (18) pour définir si toutes les instructions et opérandes du modèle sont inclus dans l'ensemble d'instructions reconnues ; si c'est le cas,
rendre virtuel (204) le modèle (18),
transformer le code assembleur (18) sous une forme intermédiaire utilisable par le compilateur (12),
transformer un code source (10) sous une forme intermédiaire (16) utilisable par le compilateur, et
combiner le code assembleur sous forme intermédiaire et le code source sous forme intermédiaire (16),
5
dans lequel le sous-programme en code assembleur est inséré dans le sous-programme en code source
(10) avant une optimisation (206) dans un compilateur (12) d'un système de traitement de données (100).
10
15
2.
Procédé selon la revendication 1, dans lequel ladite étape consistant à rendre virtuel (204) le modèle (18) comprend
les étapes consistant à identifier des affectations physiques de registre à l'intérieur du modèle (18), et à transformer
les affectations physiques de registre en affectations virtuelles de registre.
3.
Procédé selon la revendication 2, dans lequel ledit modèle en code assembleur (18) présente des opérandes
d'entrée et des opérandes de sortie, et dans lequel lesdites étapes consistant à identifier et à transformer sont
séquentiellement exécutées sur les opérandes d'entrée et sur les opérandes de sortie.
4.
Procédé selon la revendication 1, comprenant en outre l'étape consistant à soumettre le code assembleur sous
forme intermédiaire et le code source sous forme intermédiaire (16) combinés (20) à au moins une procédure
d'optimisation (206).
5.
Procédé selon la revendication 4, comprenant en outre l'étape consistant à générer (208) un code assembleur
(14) à partir du résultat de la au moins une procédure d'optimisation (206).
6.
Procédé selon la revendication 5, comprenant en outre l'étape consistant à incorporer (208) dans le résultat de la
au moins une procédure d'optimisation (206) un sous-programme en code assembleur quelconque ayant une
instruction ou un opérande quelconques non inclus dans ledit ensemble reconnu.
7.
Système informatique, comprenant :
20
25
30
une unité centrale de traitement (CPU) (102) ayant une unité de mémoire (104) destinée à mémoriser un code
assembleur (14),
une fourniture de code source (10) spécifiant une séquence de programme pour une utilisation par le processeur CPU (102),
une source de sous-programmes en code assembleur sous la forme de modèles (18) comportant des instructions et des opérandes,
un compilateur (12) destiné à générer un code assembleur (14) à partir du code source (10) et des sousprogrammes en code assembleur (18), ledit compilateur d'assembleur (12) comprenant au moins une procédure d'optimisation (206), et
un ensemble d'instructions reconnues reconnaissables par le compilateur (12) en vue d'une optimisation de
code (206),
35
40
caractérisé par :
45
ledit compilateur (12) comprenant en outre une première procédure (204) destinée à balayer les instructions
et les opérandes dans un modèle en code assembleur donné (18), afin de déterminer si toutes les instructions
et les opérandes dans le modèle (18) sont inclus dans l'ensemble reconnu, et une seconde procédure destinée
à incorporer le modèle en code assembleur (18) dans le code source (10) avant la procédure d'optimisation
(206), à chaque fois que toutes les instructions et opérandes dans un modèle en code assembleur donné (18)
sont inclus dans l'ensemble reconnu.
50
8.
55
Système informatique selon la revendication 7, dans lequel la procédure d'incorporation comprend une procédure
pour rendre virtuel (204) le modèle (18), une procédure pour transformer le code assembleur dans le modèle (18)
sous une forme intermédiaire, une procédure pour transformer le code source (10) sous une forme intermédiaire
(16), et une procédure pour combiner le code assembleur sous forme intermédiaire et le code source sous forme
intermédiaire (16).
12
EP 0 806 725 B1
9.
5
10
Système informatique selon la revendication 8, dans lequel la procédure pour rendre virtuel les modèles (18)
comprend une procédure pour identifier les affectations physiques de registre à l'intérieur du modèle, et pour
transformer les affectations physiques de registre en des affectations virtuelles de registre.
10. Système informatique selon la revendication 9, dans lequel le modèle en code assembleur comprend des opérandes d'entrée et des opérandes de sortie, et dans lequel les procédures d'identification et de transformation sont
exécutées en séquence sur les opérandes d'entrée et les opérandes de sortie.
11. Système informatique selon la revendication 7, dans lequel le compilateur (12) comprend en outre une procédure
pour soumettre le code assembleur sous forme intermédiaire et le code source combinés (20) à au moins un
traitement d'optimisation (206).
12. Système informatique selon la revendication 11, dans lequel le compilateur (12) comprend une procédure pour
générer (206) un code assembleur (14) à partir du résultat de la au moins une procédure d'optimisation (206).
15
13. Système informatique selon la revendication 12, dans lequel le compilateur (12) comprend une procédure pour
incorporer (208) dans le code assembleur (20) le code assembleur d'un modèle quelconque en code assembleur
(18) comportant une instruction ou un opérande non inclus dans ledit ensemble reconnu.
20
14. Produit de sous-programme informatique, comprenant :
un support utilisable par un ordinateur comportant un code pouvant être lu par un ordinateur inclus dans celuici pour susciter une première incorporation d'un modèle en langage assembleur (18) avant un traitement
d'optimisation par compilateur (206), le produit de sous-programme informatique comprenant :
25
30
une fourniture de code source (10) spécifiant une séquence de programme pour fournir le code source
au compilateur (12),
un sous-programme de fourniture de modèle en code assembleur (18) fournissant au compilateur des
modèles comportant des instructions et des opérandes,
un compilateur (12) pour générer un code assembleur (14) à partir du code source (10) et des sousprogrammes en code assembleur (18), ledit compilateur en assembleur (12) comprenant au moins une
procédure d'optimisation (206), et
un ensemble d'instructions reconnues reconnaissables par le compilateur (12) en vue d'une optimisation
de code (206),
35
caractérisé par :
40
45
ledit compilateur (12) comprenant en outre une première procédure (202) destinée à balayer les instructions
et les opérandes dans un modèle en code assembleur donné (18) pour déterminer si toutes les instructions
et opérandes dans le modèle (18) sont inclus dans l'ensemble reconnu, et une seconde procédure destinée
à incorporer le modèle en code assembleur (18) dans le code source (10) avant la procédure d'optimisation
(206) chaque fois que toutes les instructions et opérandes dans un modèle en code assembleur donné (18)
sont inclus dans l'ensemble reconnu.
15. Produit de sous-programme informatique selon la revendication 14, dans lequel la procédure d'incorporation comprend une procédure pour rendre virtuel (204) le modèle (18), une procédure pour transformer le code assembleur
(10) dans le modèle (18) sous une forme intermédiaire, une procédure pour transformer le code source sous une
forme intermédiaire (16), et une procédure pour combiner le code assembleur sous forme intermédiaire et le code
source sous forme intermédiaire (16).
50
16. Produit de sous-programme informatique selon la revendication 15, dans lequel la procédure pour rendre virtuel
(204) le modèle (18) comprend une procédure pour identifier les affectations physiques de registre à l'intérieur du
modèle, et pour transformer les affectations physiques de registre en affectations virtuelles de registre.
55
17. Produit de sous-programme informatique selon la revendication 16, dans lequel le modèle en code assembleur
(18) comprend des opérandes d'entrée et des opérandes de sortie, et dans lequel les procédures d'identification
et de transformation sont exécutées en séquence sur les opérandes d'entrée et les opérandes de sortie.
13
EP 0 806 725 B1
18. Produit de sous-programme informatique selon la revendication 14, dans lequel ledit compilateur (12) comprenant
en outre une procédure pour soumettre le code assembleur sous forme intermédiaire (20) et le code source combinés à au moins un traitement d'optimisation (206).
5
10
19. Produit de sous-programme informatique selon la revendication 18, dans lequel le compilateur (12) comprend une
procédure destinée à générer un code assembleur (14) à partir du résultat de la au moins une procédure d'optimisation (206).
20. Produit de sous-programme informatique selon la revendication 19, dans lequel le compilateur (12) comprend une
procédure (208) destinée à incorporer dans le code assembleur (14) le code assembleur d'un modèle quelconque
en code assembleur ayant une instruction ou un opérande non inclus dans ledit ensemble reconnu.
15
20
25
30
35
40
45
50
55
14
EP 0 806 725 B1
15
EP 0 806 725 B1
16
EP 0 806 725 B1
17
EP 0 806 725 B1
18
EP 0 806 725 B1
19
EP 0 806 725 B1
20
Téléchargement