=============================
          FLOPS.c          
 Version 2.0,  18 Dec 1992 
         Al Aburto         
  aburto@marlin.nosc.mil   
       'ala' on BIX        
=============================


Flops.c is a C program that attempts to estimate your system's
floating-point 'MFLOPS' rating for the FADD, FSUB, FMUL, and FDIV
operations based on specific 'instruction mixes' (discussed below).
The program provides an estimate of PEAK MFLOPS performance by making
maximal use of register variables with minimal interaction with main
memory. The execution loops are all small so that they will fit in
any cache. Flops.c can be used along with Linpack and the Livermore
kernels (which exercise memory much more extensively) to gain further
insight into the limits of system performance. The flops.c execution
modules also include various percent weightings of FDIV's (from 0% to
25% FDIV's) so that the range of performance can be obtained when
using FDIV's. FDIV's, being computationally more intensive than
FADD's or FMUL's, can impact performance considerably on some systems.
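
As a rough illustration of the approach described above, the sketch
below times a small loop that keeps its working values in register
variables and converts the operation count into an MFLOPS figure. It is
not taken from flops.c: the function name estimate_mflops, the loop
body, and the use of clock() as the timer are all assumptions made for
this example.

```c
#include <time.h>

/* Hypothetical helper, not part of flops.c: times n passes of a small
   register-resident loop and returns an MFLOPS estimate (0.0 if the
   run was too short for clock() to measure). */
double estimate_mflops(long n)
{
    register double s = 0.0, x = 0.0, w;
    register long i;
    clock_t t0, t1;
    double seconds;

    if (n <= 0)
        return 0.0;
    w = 1.0 / (double)n;            /* step width, set up before timing */

    t0 = clock();
    for (i = 0; i < n; i++) {
        x = x + w;                  /* 1 FADD */
        s = s + x * x;              /* 1 FMUL + 1 FADD */
    }
    t1 = clock();

    seconds = (double)(t1 - t0) / CLOCKS_PER_SEC;

    /* Testing s here also makes the loop result observable. */
    if (seconds <= 0.0 || s <= 0.0)
        return 0.0;

    /* 3 floating-point operations per pass, scaled to millions/second. */
    return 3.0 * (double)n / (seconds * 1.0e6);
}
```

A call such as estimate_mflops(100000000L) should run long enough for
clock() to resolve on most systems. The figure it returns reflects only
this FADD/FMUL mix and says nothing about FDIV performance.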

Flops.c consists of 8 independent modules (routines) which, except for
module 2, conduct numerical integration of various functions. Module
2 estimates the value of pi based upon the Maclaurin series expansion
of atan(x) evaluated at x = 1. MFLOPS ratings are provided for each
module, but the program's overall results are summarized by the
MFLOPS(1), MFLOPS(2), MFLOPS(3), and MFLOPS(4) outputs.

The MFLOPS(1) result is identical to the result provided by all
previous versions of flops.c. It is based only upon the results from
modules 2 and 3. Two problems surfaced in using MFLOPS(1). First, it
was difficult to completely 'vectorize' the result due to the 
recurrence of the 's' variable in module 2. This problem is addressed
in the MFLOPS(2) result which does not use module 2, but maintains
nearly the same weighting of FDIV's (9.2%) as in MFLOPS(1) (9.6%).
The second problem with MFLOPS(1) centers around the percentage of
FDIV's (9.6%) which was viewed as too high for an important class of
problems. This concern is addressed in the MFLOPS(3) result, which
lowers the share of FDIV's to 3.4%, and in the MFLOPS(4) result,
where NO FDIV's are conducted at all.
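
To make the vectorization problem concrete, here is a minimal sketch of
a pi estimate built on the atan series at x = 1 (pi/4 = 1 - 1/3 + 1/5
- ...). It assumes that 's' plays the role of an alternating sign that
is carried from one iteration to the next. The function name and loop
body are illustrative only, and the operation mix does not match the
per-iteration counts of module 2.

```c
/* Sketch only, not the module 2 code: Leibniz form of the atan series
   at x = 1, pi/4 = 1 - 1/3 + 1/5 - 1/7 + ... */
double pi_from_atan_series(long terms)
{
    double sum = 0.0;
    double s = 1.0;     /* alternating sign; depends on its previous value */
    double d = 1.0;     /* odd denominators 1, 3, 5, ... */
    long i;

    for (i = 0; i < terms; i++) {
        sum = sum + s / d;      /* 1 FADD + 1 FDIV */
        d = d + 2.0;            /* 1 FADD */
        s = -s;                 /* sign recurrence carried across iterations */
    }
    return 4.0 * sum;
}
```

Because each pass needs the value of s left by the previous pass, a
compiler cannot simply run the iterations side by side without first
rewriting the recurrence. The series also converges slowly: the error
shrinks roughly in proportion to 1/terms.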

The number of floating-point instructions per iteration (loop) is
given below for each module executed:

MODULE   FADD   FSUB   FMUL   FDIV   TOTAL  Comment
  1        7      0      6      1      14   7.1%  FDIV's
  2        3      2      1      1       7   difficult to vectorize.
  3        6      2      9      0      17   0.0%  FDIV's
  4        7      0      8      0      15   0.0%  FDIV's
  5       13      0     15      1      29   3.4%  FDIV's
  6       13      0     16      0      29   0.0%  FDIV's
  7        3      3      3      3      12   25.0% FDIV's
  8       13      0     17      0      30   0.0%  FDIV's

A*2+3     21     12     14      5      52   A=5, MFLOPS(1), Same as
         40.4%  23.1%  26.9%   9.6%         previous versions of the
                                            flops.c program. Includes
                                            only Modules 2 and 3, does
                                            9.6% FDIV's, and is not
                                            easily vectorizable.

1+3+4     58     14     66     14     152   A=4, MFLOPS(2), New output
+5+6+    38.2%  9.2%   43.4%  9.2%          does not include Module 2,
A*7                                         but does 9.2% FDIV's.

1+3+4     62      5     74      5     146   A=0, MFLOPS(3), New output
+5+6+    42.5%  3.4%   50.7%  3.4%          does not include Module 2,
7+8                                         but does 3.4% FDIV's.

3+4+6     39      2     50      0      91   A=0, MFLOPS(4), New output
+8       42.9%  2.2%   54.9%  0.0%          does not include Module 2,
                                            and does NO FDIV's.
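
The composite rows above are plain weighted sums of the per-module
counts. The sketch below recomputes them so the totals can be checked.
The per-module counts are copied from the table; the names module_ops,
composite_mix, and print_mix are hypothetical and do not appear in
flops.c.

```c
#include <stdio.h>

/* Per-iteration counts for modules 1..8, copied from the table.
   Columns: FADD, FSUB, FMUL, FDIV. */
static const int module_ops[8][4] = {
    { 7, 0,  6, 1},   /* module 1 */
    { 3, 2,  1, 1},   /* module 2 */
    { 6, 2,  9, 0},   /* module 3 */
    { 7, 0,  8, 0},   /* module 4 */
    {13, 0, 15, 1},   /* module 5 */
    {13, 0, 16, 0},   /* module 6 */
    { 3, 3,  3, 3},   /* module 7 */
    {13, 0, 17, 0}    /* module 8 */
};

/* Hypothetical helper: weight[m] is how many times module m+1 is
   counted. Fills mix[0..3] with the summed FADD, FSUB, FMUL, and FDIV
   counts and returns their total. */
int composite_mix(const int weight[8], int mix[4])
{
    int m, k, total = 0;

    for (k = 0; k < 4; k++)
        mix[k] = 0;
    for (m = 0; m < 8; m++)
        for (k = 0; k < 4; k++)
            mix[k] += weight[m] * module_ops[m][k];
    for (k = 0; k < 4; k++)
        total += mix[k];
    return total;
}

/* Prints one row of counts with percentages, in table column order. */
void print_mix(const char *label, const int weight[8])
{
    int mix[4], k;
    int total = composite_mix(weight, mix);

    printf("%-10s", label);
    for (k = 0; k < 4; k++)
        printf(" %3d (%4.1f%%)", mix[k],
               total ? 100.0 * mix[k] / total : 0.0);
    printf("  total %d\n", total);
}
```

With the weights {0,5,1,0,0,0,0,0} for MFLOPS(1), {1,0,1,1,1,1,4,0} for
MFLOPS(2), {1,0,1,1,1,1,1,1} for MFLOPS(3), and {0,0,1,1,0,1,0,1} for
MFLOPS(4), composite_mix returns totals of 52, 152, 146, and 91.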

NOTE: Various timer routines are included as indicated below. The
timer routines, with some comments, are attached at the end of the
main program.
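
The timer routines themselves are not reproduced in this file. As a
rough idea of what a UNIX timer could look like, the sketch below
reads user CPU time through the POSIX getrusage() call. The function
name user_cpu_seconds is hypothetical, and the routines shipped with
the program may be written differently.

```c
#include <sys/time.h>
#include <sys/resource.h>

/* Hypothetical timer, assuming a POSIX system: returns the user CPU
   time consumed by this process so far, in seconds, or -1.0 if
   getrusage() fails. */
double user_cpu_seconds(void)
{
    struct rusage ru;

    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1.0;
    return (double)ru.ru_utime.tv_sec
         + (double)ru.ru_utime.tv_usec * 1.0e-6;
}
```

Calling it once before and once after a timed loop and subtracting the
two values gives the elapsed user CPU time for that loop.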

NOTE: Please do not remove any of the printouts.

EXAMPLE COMPILATION:
UNIX-based systems
   cc -DUNIX -O flops20.c -o flops
   cc -DUNIX -DROPT flops20.c -o flops
   cc -DUNIX -fast -O4 flops20.c -o flops
   .
   .
   .
  etc.

Al Aburto
aburto@marlin.nosc.mil