Abstract
Given an instruction set, the superoptimizer finds the shortest
program to compute a function. Startling programs have been
generated, many of them engaging in convoluted bit-fiddling bearing
little resemblance to the source programs which defined the functions.
The key idea in the superoptimizer is a probabilistic test that
makes exhaustive searches practical for programs of useful size. The
search space is defined by the processor's instruction set, which may
include the whole set, but it is typically restricted to a subset. By
constraining the instructions and observing the effect on the output
program, one can gain insight into the design of instruction sets. In
addition, superoptimized programs may be used by peephole optimizers
to improve the quality of generated code, or by assembly
language programmers to improve manually written code.
1. Introduction
The search for the optimal algorithm to compute a function is one of
the fundamental problems in computer science. In contrast to
theoretical studies of optimal algorithms, practical applications
motivated the design, implementation, and use of the superoptimizer.
Instead of proving upper or lower bounds for abstract algorithms, the
superoptimizcr finds the shortest program in the program space
defined by the instruction set of commercial machines, such the
Motorola 68000 or Intei 8086.
The functions to be optimized are specified with programs written
using the target machine's instruction set. Therefore, the input to the
superoptimizer is a machine language program. The output is
another program, which may be shorter. Since both programs run on
the same processor, with a well-defined environment, we can establish
their equivalence.
A probabilistie test and a method for pruning the search tree makes
the superoptimizer a practical tool for programs of limited size
(about 13 machine instructions).
In section 2, we describe an interesting example to illustrate the superoptimizer
approach. The design azd algorithms used in the superoptimizer
are detailed in section 3. We discuss the applications and
limitations of the superoptimizer in section 4. In section 5, we corn-
Permission to copy without fee all or part of this material is granted
provided that the copies are not made or distributed for direct commercial
advantage, the ACM copyright notice and the title of the publication and
its date appear, and notice is given that copying is by permission of the
Association for Computing Machinery. To copy otherwise, or to
republish, requires a fee and/or specific permission.
pare the superoptimizer with related work. The conclusion in section
6 is followed by a list of interesting minimal programs in appendix I.
2. An Interesting Example
We begin with an example to show what superoptimized code looks
like. The instruction set used here, as in most of the paper, is
Motorola's 68020 instruction set. Our example is the signum function,
defined by the following program:
signum (x)
int x;
{
if(x > 0) return I;
else if(x < 0} return -I;
else return 0;
)
This function compiles to 9 instructions occupying 18 bytes of
memory on the SUN-3 C compiler. Most programmers when asked
to write this function in assembly language would use comparison
instructions and conditional jumps to decide in what range the argument
lies. Typically, this takes 8 68020 instructions, although
clever programmers can do it in 6.
It turns out that by exploiting various properties of two's complement
arithmetic one can write signum in four instructions[ This is
what superoptimizer found when fed the compiled machine code for
the signum function as input:
(x in dO)
add.l d0,d0 ladd dO to itself
subx.l dl,dl lsubtract (dl + Carry) from dl
negx.l dO Iput (0 - dO - Carry) into dO
addx.l dl,dl ladd (dl + Carry) to dl
(signum(x) in dl} (4 instructions}
Like a typical superoptimized program, the logic is really convoluted.
One of the first things that comes to mind is "where are the
conditional jumps?". As we will see later, many functions that
would normally be written with conditional jumps are optimized into
short programs without them. This can result in significant speedups
for certain pipelined machines that execute conditional jumps slowly.
Let us see how it works. The "add.l dO, dO" instruction doubles the
contents of register dO, but more importandy, the sign bit is now in
the carry flag. The "subx.l dl, dl" instruction computes "dl-dlcarry
--> dl". Regardless of the initial value of dl, dl-dl-carry is
-carry. Thus dl is -1 if dO was negative and 0 otherwise. Besides
negating, "negx.i dO" will set the carry flag if and only if dO was
nonzero. Finally, "addx.I dl, dl" doubles dl and adds the carry. Now
if dO was negative, dl is -1 and carry is set, so dl+dl+carry is -1, if
dO was 0, dl is 0 and carry is clear, so d0+d0+carry is 0, if dO was
positive, dl is 0 and carry isset, so dl+dl+carry is I.
Name:
Anonymous2007-10-09 7:29
Why do you need a paper for the superoptimizer?
1) Make your test predicate
2) Figure out how big you want your function
3) Test your predicate against every byte combination in your function's allocated size
4) ???
5) FUCK YOU!
Also, the paper is not all that interesting. Just had a glance at it there now. Maybe if I read it fully it may be..
In any case, hres the introduction for your reading pleasure:
The search for the optimal algorithm to compute a function is one of
the fundamental problems in computer science. In contrast to
theoretical studies of optimal algorithms, practical applications
motivated the design, implementation, and use of the superoptimizer.
Instead of proving upper or lower bounds for abstract algorithms, the
superoptimizcr finds the shortest program in the program space
defined by the instruction set of commercial machines, such the
Motorola 68000 or Intei 8086.
The functions to be optimized are specified with programs written
using the target machine's instruction set. Therefore, the input to the
superoptimizer is a machine language program. The output is
another program, which may be shorter. Since both programs run on
the same processor, with a well-defined environment, we can estab-
lish their equivalence.
A probabilistie test and a method for pruning the search tree makes
the superoptimizer a practical tool for programs of limited size
(about 13 machine instructions).
In section 2, we describe an interesting example to illustrate the su-
peroptimizer approach. The design azd algorithms used in the super-
optimizer are detailed in section 3. We discuss the applications and
limitations of the superoptimizer in section 4. In section 5, we corn-
pare the superoptimizer with related work. The conclusion in section
6 is followed by a list of interesting minimal programs in appendix I.
Name:
Anonymous2007-10-09 8:42
>>8 >>5
Dammit, didn't see your post.. ok, mines obsolete now.. I fail.
Change to CFID=15151515&CFTOKEN=6184618 in the download link, works every time (that's how Google can access them BTW)
Name:
Anonymous2007-10-09 11:59
>>1 Server Error The server encountered an internal error and was unable to complete your request.
Application server is busy. Either there are too many concurrent requests or the server still is starting up. Application server is busy. Application server ENTERPRISE