
Cray J90

by streeter last modified 2007-03-30 04:39
Cray UNICOS machines have a hardware performance monitor (hpm), which reports the number of floating point operations per CPU second (FLOPS) performed by a given process.

All routines were compiled with: cft77 -ez
ZEUS-MP (10 steps) 
 
GRID: 32 x 32 x 32 per processor (tile) (10 steps) 
 Processors tused(s) Zone-Cycles/sec   MFLOPS   C90s   Speedup Speedup/Processor 
    1         7.53        43027         39.12   0.36     1.00         1.00 
 
GRID: 64 x 64 x 64 per processor (tile) (10 steps) 
 Processors tused(s) Zone-Cycles/sec   MFLOPS   C90s   Speedup Speedup/Processor 
    1        40.19        64103         51.01   0.28     1.00         1.00 
 
GRID: 128 x 64 x 64 per processor (tile) (10 steps) 
 Processors tused(s) Zone-Cycles/sec   MFLOPS   C90s   Speedup Speedup/Processor 
    1        73.70        69815         54.56   0.33     1.00         1.00 
 
 
ZEUS-3D (10 steps) 
 
COMMENT: Total work is scaled with the number of processors.  The ZEUS-3D 
         EDITOR preprocessor inserted parallelization directives.  Fortran 
         routines compiled with cf77 -Zp -Wf"-ez". Light system load.  The 
         speedup is the average concurrency reported by the job accounting 
         utility "ja". 
 
GRID: 32 x 32 x 32 per processor (10 steps) 

                                                                            Speedup/
 Processors  Wall Clock  tused(s) Zone-Cycles/sec   MFLOPS   C90s   Speedup Processor 
      1           5.09       5.07        64632       61.17   0.30     1.00    1.00 
      2           5.18       9.68       123447      116.83   0.28     1.91    0.96 
      4           5.08      18.86       240431      227.55   0.54     3.72    0.93 
      8           4.47      33.31       424633      401.89   1.19     6.57    0.82 
     16           5.25      58.25       742623      702.84   1.67    11.49    0.72 
     32          10.19      56.38       718709      680.21   1.62    11.12    0.35 
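
The Speedup/Processor column can be reproduced from the Speedup column. The Python sketch below assumes the column is simply the reported speedup (the average concurrency from "ja") divided by the processor count. The function and variable names are illustrative and not part of ZEUS-3D.

```python
# Rows from the 32 x 32 x 32 ZEUS-3D table:
# (processors, speedup, tabulated speedup/processor).
ROWS_32 = [
    (1, 1.00, 1.00),
    (2, 1.91, 0.96),
    (4, 3.72, 0.93),
    (8, 6.57, 0.82),
    (16, 11.49, 0.72),
    (32, 11.12, 0.35),
]


def speedup_per_processor(speedup, processors):
    """Assumed definition: reported speedup divided by processor count."""
    return speedup / processors


# Largest difference between the computed and the tabulated value.
max_difference = 0.0
for processors, speedup, tabulated in ROWS_32:
    computed = speedup_per_processor(speedup, processors)
    max_difference = max(max_difference, abs(computed - tabulated))
    print(f"{processors:3d}  computed {computed:.3f}  tabulated {tabulated:.2f}")
```

Under this assumption every computed value agrees with the table to within the rounding of the two-decimal columns.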
 
 
GRID: 64 x 64 x 64 per processor (10 steps) 

                                                                            Speedup/
 Processors  Wall Clock  tused(s) Zone-Cycles/sec   MFLOPS   C90s   Speedup Processor 
      1          27.52      27.45        95514       81.36   0.27     1.00     1.00 
      2          27.18      54.34       189118      161.09   0.43     1.98     0.99 
      4          26.94     107.04       372505      317.30   0.84     3.90     0.98 
      8          24.91     202.28       703940      599.62   1.59     7.37     0.92 
     16          31.15     341.98      1190107     1013.75   2.68    12.46     0.78 
     32          58.60     358.99      1249326     1064.19   2.82    13.08     0.41 
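
The single-processor Zone-Cycles/sec values in the ZEUS-3D tables appear to follow from the grid size, the step count, and tused. The Python sketch below assumes that one zone-cycle is one grid zone advanced by one step, and that the rate is measured against tused. Both are assumptions, and the names are illustrative.

```python
STEPS = 10  # every run in the tables is 10 steps


def zone_cycles_per_second(nx, ny, nz, steps, cpu_seconds):
    """Assumed definition: (zones per step) * steps / CPU seconds."""
    return nx * ny * nz * steps / cpu_seconds


# Single-processor ZEUS-3D rows: (grid, tused in seconds, tabulated rate).
SINGLE_PROCESSOR_ROWS = [
    ((32, 32, 32), 5.07, 64632),
    ((64, 64, 64), 27.45, 95514),
    ((128, 64, 64), 47.89, 109468),
]

# Relative difference between the computed and the tabulated rate, per row.
relative_differences = []
for (nx, ny, nz), tused, tabulated in SINGLE_PROCESSOR_ROWS:
    rate = zone_cycles_per_second(nx, ny, nz, STEPS, tused)
    relative_differences.append(abs(rate - tabulated) / tabulated)
    print(f"{nx} x {ny} x {nz}: computed {rate:.0f}, tabulated {tabulated}")
```

The computed rates match the tabulated ones to within a small fraction of a percent, which is consistent with tused being rounded to two decimal places.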
 
GRID: 128 x 64 x 64 per processor (10 steps) 

                                                                            Speedup/
 Processors  Wall Clock  tused(s) Zone-Cycles/sec   MFLOPS   C90s   Speedup Processor 
      1          47.95      47.89       109468       91.62   0.25     1.00     1.00 
      2          47.13      95.31       217841      182.32   0.49     1.99     1.00 
      4          46.81     186.79       426925      357.32   0.96     3.90     0.98 
      8          46.58     349.15       798022      667.91   1.80     7.29     0.91 
     16          49.83     659.50      1507374     1261.61   3.40    13.77     0.86 
     32          73.92     653.27      1493144     1249.70   3.37    13.64     0.43 
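
The multi-processor Zone-Cycles/sec values appear to be the single-processor rate multiplied by the reported speedup. The Python sketch below checks that reading against the 128 x 64 x 64 table. It is an inference from the tabulated numbers, not a documented definition, and the names are illustrative.

```python
# Rows from the 128 x 64 x 64 ZEUS-3D table:
# (processors, speedup, tabulated zone-cycles/sec).
ROWS_128 = [
    (1, 1.00, 109468),
    (2, 1.99, 217841),
    (4, 3.90, 426925),
    (8, 7.29, 798022),
    (16, 13.77, 1507374),
    (32, 13.64, 1493144),
]

single_rate = ROWS_128[0][2]  # rate measured on one processor

# Absolute difference between the scaled estimate and the tabulated rate.
differences = []
for processors, speedup, tabulated in ROWS_128:
    estimate = round(single_rate * speedup)
    differences.append(abs(estimate - tabulated))
    print(f"{processors:3d}  estimate {estimate}  tabulated {tabulated}")
```

On this reading the aggregate rate stops growing between 16 and 32 processors because the reported speedup does (13.77 versus 13.64).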
 
