Cray C90
Cray UNICOS machines have a hardware performance monitor (hpm), which reports the number of floating-point operations per CPU second (FLOPS) performed by a given process. The FLOPS for other machines are determined from the C90 FLOPS and the ratio of the Zone-Cycles/sec.

Note that although ZEUS-MP uses algorithms conceptually the same as those in ZEUS-3D, and practically every loop vectorizes, the C90 runs ZEUS-3D twice as fast as it runs the current version of ZEUS-MP. Remarkably, a loop with stride greater than 1 takes more than twice the CPU time of a similar loop with stride 1. (The stride depends on the direction of the sweep in the advection substeps.)

All routines were compiled with: cft77 -ez
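The scaling of FLOPS from the C90 to other machines can be sketched as below. This is a minimal illustration, assuming the same code performs the same number of floating-point operations per zone-cycle on every machine, so that MFLOPS scales linearly with Zone-Cycles/sec. The function name `scaled_mflops` and the "other machine" rate are hypothetical; only the C90 reference values come from the 64 x 64 x 64 results on this page.

```python
def scaled_mflops(c90_mflops, c90_zone_cycles_per_sec, other_zone_cycles_per_sec):
    """Estimate MFLOPS on another machine from the C90 hpm measurement.

    Assumption: the operation count per zone-cycle is machine independent,
    so MFLOPS is proportional to Zone-Cycles/sec.
    """
    ratio = other_zone_cycles_per_sec / c90_zone_cycles_per_sec
    return c90_mflops * ratio


# C90 reference values for the 64 x 64 x 64 tile (from the results on this page).
C90_MFLOPS = 150.96
C90_ZONE_CYCLES_PER_SEC = 190408.0

# Hypothetical machine that sustains half the C90 zone-cycle rate.
other_rate = C90_ZONE_CYCLES_PER_SEC / 2.0
estimate = scaled_mflops(C90_MFLOPS, C90_ZONE_CYCLES_PER_SEC, other_rate)
print(f"Estimated MFLOPS: {estimate:.2f}")
```

Under the same assumption, the ratio computed inside the function is also the machine's speed expressed in units of one C90 processor.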
GRID: 32 x 32 x 32 per processor (tile) (10 steps)
Processors  tused(s)  Zone-Cycles/sec  MFLOPS  C90s  Speedup  Speedup/Processor
         1      2.69           120144  109.54  1.00     1.00               1.00
GRID: 64 x 64 x 64 per processor (tile) (10 steps)
Processors  tused(s)  Zone-Cycles/sec  MFLOPS  C90s  Speedup  Speedup/Processor
         1     13.50           190408  150.96  1.00     1.00               1.00
GRID: 128 x 64 x 64 per processor (tile) (10 steps)
Processors  tused(s)  Zone-Cycles/sec  MFLOPS  C90s  Speedup  Speedup/Processor
         1     24.41           210263  163.65  1.00     1.00               1.00