
Problem Blast w/CT Scheme

by streeter last modified 2007-03-30 04:46
CODE: ZEUS-3D version 3.4

PROBLEM: Blast -- the expansion of a hot sphere of plasma into an initially uniform magnetic medium.

GEOMETRY: Cartesian XYZ

GRID: 32 to 128 zones in each direction centered on the origin. The ratio of neighboring zone dimensions is 1.02. The smallest zones are near the origin.
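A ratioed grid of this kind can be sketched as follows. This is only an illustration: the function name `ratioed_edges`, the even split of zones on either side of the origin, and the domain half-width of 1.0 are assumptions, not details taken from the ZEUS-3D setup.

```python
def ratioed_edges(n_zones, ratio=1.02, half_extent=1.0):
    """Return zone edges for a grid symmetric about the origin.

    Zone widths grow by `ratio` from one zone to the next, moving away
    from the origin. `half_extent` is a hypothetical domain half-width.
    Assumes ratio != 1 and an even number of zones.
    """
    if n_zones % 2:
        raise ValueError("this sketch assumes an even number of zones")
    n_half = n_zones // 2
    # Geometric series: w0 * (ratio**n_half - 1) / (ratio - 1) == half_extent
    w0 = half_extent * (ratio - 1.0) / (ratio**n_half - 1.0)
    positive = [0.0]
    for i in range(n_half):
        positive.append(positive[-1] + w0 * ratio**i)
    # Mirror the positive edges to cover the negative half of the axis.
    negative = [-x for x in reversed(positive[1:])]
    return negative + positive


edges = ratioed_edges(32)
widths = [b - a for a, b in zip(edges, edges[1:])]
print(f"smallest zone: {min(widths):.5f}, largest zone: {max(widths):.5f}")
```

The same one-dimensional edges would be reused for each of the three Cartesian directions, with `n_zones` set per direction.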

ALGORITHM: van Leer advection, original CT scheme to evolve magnetic fields

PRECISION: Single precision on Crays (64 bits), double precision on others.

DATA: In the tables below, "tused" is the number of CPU seconds used by the master thread in computing the evolution (some system and ZEUS-3D overhead is excluded). Zone-Cycles/sec is the number of mesh zones times the number of time steps, divided by tused.
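As a minimal sketch, the Zone-Cycles/sec definition can be written as a small function. The number of time steps is not given in this report, so the value of 4 used in the example is an assumption chosen for illustration; with it, the result lands within about one unit of the 32 x 32 x 32 Cray C90 figure.

```python
def zone_cycles_per_sec(nx, ny, nz, n_steps, tused):
    """Mesh zones times time steps, divided by CPU seconds (tused)."""
    if tused <= 0:
        raise ValueError("tused must be positive")
    return nx * ny * nz * n_steps / tused


# Hypothetical example: a 32 x 32 x 32 grid advanced 4 steps (assumed)
# with tused = 1.2420 s.
rate = zone_cycles_per_sec(32, 32, 32, n_steps=4, tused=1.2420)
print(f"{rate:.0f} zone-cycles/sec")
```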

Cray C90

Cray UNICOS machines have a hardware performance monitor (hpm), which reports the number of floating-point operations per CPU second (FLOPS) performed by a given process. The FLOPS for the other machines are estimated from the C90 FLOPS and the ratios of the Zone-Cycles/sec.
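The scaling described above can be sketched as follows, using the 32 x 32 x 32 Cray C90 row as the reference. The function names are illustrative, and reading the "C90s" column as the plain ratio of Zone-Cycles/sec is an inference from the tabulated numbers rather than something the report states.

```python
# Reference values from the 32 x 32 x 32 Cray C90 run.
C90_32 = {"zone_cycles_per_sec": 105532, "mflops": 113.01}


def estimate_mflops(zcs, ref=C90_32):
    """Scale the C90 MFLOPS by the ratio of Zone-Cycles/sec."""
    return ref["mflops"] * zcs / ref["zone_cycles_per_sec"]


def c90_fraction(zcs, ref=C90_32):
    """Performance as a fraction of one C90 processor (assumed meaning of 'C90s')."""
    return zcs / ref["zone_cycles_per_sec"]


# One-processor SGI Power Challenge figure for the same grid.
zcs = 38829
print(f"{estimate_mflops(zcs):.2f} MFLOPS, {c90_fraction(zcs):.2f} C90s")
```

For the other grid sizes the tabulated MFLOPS agree with this scaling only approximately, so the reference values actually used may differ slightly from those shown in the C90 table.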

All routines were compiled with: cft77 -ez
 
 Processors tused(s) Zone-Cycles/sec   MFLOPS   C90s   Speedup Speedup/Processor 
GRID: 32 x 32 x 32 
    1        1.2420      105532        113.01   1.00     1.00         1.00 
GRID: 64 x 64 x 64 
    1        8.4265      217767        218.64   1.00     1.00         1.00 
GRID:128 x 64 x 64 
    1        19.774      291647        298.86   1.00     1.00         1.00 
GRID: 64 x 64 x128 
    1        24.031      239992        240.77   1.00     1.00         1.00 


SGI Power Challenge

  • The EDITOR preprocessor was used to automatically insert parallelization directives above each loop nest in ZEUS-3D.

  • These runs were performed in dedicated mode.

  • Wall clock time resolution is 1 second.

  • Routines DISPLAYR and DISPLAYI were compiled with f77 -c -w1 -g3

  • Routine NMLSTS and the NAMELIST library were compiled with f77 -O2 -w1 -g3

  • Others: f77 -c -O3 -w1 -g3 -pfa list -WK,-ro=3,-so=3,-o=5,-as=l,-chs=16


 
GRID: 32 x 32 x 32 
                                                                            Speedup/ 
 Processors  Wall Clock  tused(s) Zone-Cycles/sec   MFLOPS   C90s   Speedup Processor 
      1           4.00       3.38        38829       41.58   0.37     1.00    1.00 
      2           2.00       1.95        67085       71.83   0.64     1.73    0.86 
      3           2.00       1.94        67503       72.28   0.64     1.74    0.58 
      4           3.00       1.49        87812       94.03   0.83     2.26    0.57 
      5           3.00       1.53        85914       91.99   0.81     2.21    0.44 
      6           3.00       1.37        95947      102.74   0.91     2.47    0.41 
      7           3.00       1.32        99675      106.73   0.94     2.57    0.37 
      8           3.00       1.22       107493      115.10   1.02     2.77    0.35 
      9           3.00       1.22       107429      115.03   1.02     2.77    0.31 
     10           3.00       1.13       115485      123.66   1.09     2.97    0.30 
     11           3.00       1.10       118821      127.23   1.13     3.06    0.28 
     12           3.00       1.10       119618      128.08   1.13     3.08    0.26 
     13           3.00       1.21       108669      116.36   1.03     2.80    0.22 
     14           3.00       1.09       120045      128.54   1.14     3.09    0.22 
     15           2.00       1.13       115480      123.65   1.09     2.97    0.20 
     16           3.00       1.13       115846      124.04   1.10     2.98    0.19 
 
GRID: 64 x 64 x 64 

                                                                            Speedup/
 Processors  Wall Clock  tused(s) Zone-Cycles/sec   MFLOPS   C90s   Speedup Processor 
      1          71.00      57.96        31661       31.84   0.15     1.00    1.00 
      2          42.00      27.49        66746       67.12   0.31     2.11    1.05 
      3          29.00      19.32        94991       95.53   0.44     3.00    1.00 
      4          27.00      13.36       137369      138.15   0.63     4.34    1.08 
      5          24.00      11.63       157751      158.64   0.72     4.98    1.00 
      6          23.00      10.02       183091      184.13   0.84     5.78    0.96 
      7          23.00       9.19       199587      200.72   0.92     6.30    0.90 
      8          22.00       8.29       221354      222.61   1.02     6.99    0.87 
      9          17.00       7.73       237332      238.68   1.09     7.50    0.83 
     10          18.00       7.66       239483      240.84   1.10     7.56    0.76 
     11          17.00       7.11       258007      259.47   1.18     8.15    0.74 
     12          17.00       7.44       246784      248.18   1.13     7.79    0.65 
     13          18.00       6.77       271215      272.75   1.25     8.57    0.66 
     14          21.00       7.05       260143      261.62   1.19     8.22    0.59 
     15          21.00       7.08       259359      260.83   1.19     8.19    0.55 
     16          21.00       7.10       258386      259.85   1.19     8.16    0.51 
 
GRID: 128 x 64 x 64  

                                                                            Speedup/
 Processors  Wall Clock  tused(s) Zone-Cycles/sec   MFLOPS   C90s   Speedup Processor 
      1         197.00     176.55        32666       33.49   0.11     1.00    1.00 
      2         106.00      88.53        65143       66.79   0.22     1.99    1.00 
      3          80.00      61.81        93307       95.66   0.32     2.86    0.95 
      4          60.00      43.81       131636      134.95   0.45     4.03    1.01 
      5          54.00      35.89       160706      164.76   0.55     4.92    0.98 
      6          50.00      30.03       192061      196.90   0.66     5.88    0.98 
      7          44.00      27.38       210662      215.97   0.72     6.45    0.92 
      8          43.00      24.15       238783      244.80   0.82     7.31    0.91 
      9          41.00      22.89       251995      258.35   0.86     7.71    0.86 
     10          38.00      21.35       270101      276.91   0.93     8.27    0.83 
     11          37.00      19.51       295619      303.07   1.01     9.05    0.82 
     12          37.00      19.55       295020      302.46   1.01     9.03    0.75 
     13          42.00      18.16       317507      325.51   1.09     9.72    0.75 
     14          40.00      18.72       308041      315.81   1.06     9.43    0.67 
     15          36.00      18.30       315084      323.03   1.08     9.65    0.64 
     16          37.00      18.45       312567      320.45   1.07     9.57    0.60 
 
GRID: 64 x 64 x 128  

                                                                            Speedup/
 Processors  Wall Clock  tused(s) Zone-Cycles/sec   MFLOPS   C90s   Speedup Processor 
      1         205.00     183.83        31372       31.50   0.13     1.00    1.00 
      2         108.00      92.06        62643       62.91   0.26     2.00    1.00 
      3          82.00      63.28        91134       91.52   0.38     2.90    0.97 
      4          64.00      45.18       127656      128.19   0.53     4.07    1.02 
      5          56.00      37.68       153056      153.70   0.64     4.88    0.98 
      6          54.00      31.47       183259      184.03   0.76     5.84    0.97 
      7          46.00      28.45       202709      203.56   0.84     6.46    0.92 
      8          43.00      25.20       228881      229.84   0.95     7.30    0.91 
      9          41.00      23.05       250219      251.27   1.04     7.98    0.89 
     10          40.00      21.90       263352      264.46   1.10     8.39    0.84 
     11          41.00      20.27       284466      285.66   1.19     9.07    0.82 
     12          36.00      19.34       298157      299.41   1.24     9.50    0.79 
     13          37.00      17.78       324367      325.73   1.35    10.34    0.80 
     14          37.00      17.59       327853      329.23   1.37    10.45    0.75 
     15          39.00      18.08       318952      320.29   1.33    10.17    0.68 
     16          39.00      18.38       313845      315.16   1.31    10.00    0.63 
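
The Speedup and Speedup/Processor columns are not defined in the text. The tabulated values look consistent with the ratio of Zone-Cycles/sec to the one-processor run, divided by the processor count for the last column. The sketch below assumes that definition; the function names are illustrative.

```python
def speedup(zcs_n, zcs_1):
    """Ratio of Zone-Cycles/sec on n processors to the one-processor rate."""
    return zcs_n / zcs_1


def speedup_per_processor(zcs_n, zcs_1, n_procs):
    """Parallel efficiency: speedup divided by the number of processors."""
    return speedup(zcs_n, zcs_1) / n_procs


# 32 x 32 x 32 grid on the SGI Power Challenge, 2 processors vs 1.
s = speedup(67085, 38829)
e = speedup_per_processor(67085, 38829, 2)
print(f"speedup {s:.2f}, per processor {e:.2f}")
```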


SGI Challenge
  • The data below were obtained with ZEUS-3D version 3.4.

  • These runs were performed in dedicated mode.

  • Wall clock time resolution is 1 second.

  • Compile: f77 -c -O2 -w1 -g3 -Nq9999 -pfa list -WK,-ro=3,-so=3,-as=l,-chs=16


 

GRID: 32 x 32 x 32 
                                                                           Speedup/
 Processors  Wall Clock  tused(s) Zone-Cycles/sec   MFLOPS   C90s   Speedup Processor 
      1          17.00      15.65         8376        8.97   0.08     1.00    1.00 
      2           9.00       8.11        16162       17.31   0.15     1.93    0.96 
      3           9.00       7.43        17643       18.89   0.17     2.11    0.70 
      4           7.00       5.30        24730       26.48   0.23     2.95    0.74 
      5           7.00       5.51        23789       25.47   0.23     2.84    0.57 
      6           6.00       4.71        27802       29.77   0.26     3.32    0.55 
      7           5.00       4.38        29950       32.07   0.28     3.58    0.51 
      8           5.00       3.82        34305       36.73   0.33     4.10    0.51 
      9           5.00       3.77        34771       37.23   0.33     4.15    0.46 
     10           5.00       3.47        37768       40.44   0.36     4.51    0.45 
     11           5.00       3.15        41646       44.59   0.39     4.97    0.45 
     12           5.00       3.34        39185       41.96   0.37     4.68    0.39 
 
GRID: 64 x 64 x 64 

                                                                            Speedup/
 Processors  Wall Clock  tused(s) Zone-Cycles/sec   MFLOPS   C90s   Speedup Processor 
      1         205.00     194.79         9420        9.47   0.04     1.00    1.00 
      2         107.00      97.90        18744       18.85   0.09     1.99    0.99 
      3          85.00      75.23        24391       24.53   0.11     2.59    0.86 
      4          64.00      53.55        34265       34.46   0.16     3.64    0.91 
      5          58.00      48.49        37842       38.06   0.17     4.02    0.80 
      6          51.00      41.05        44707       44.96   0.21     4.75    0.79 
      7          46.00      36.42        50385       50.67   0.23     5.35    0.76 
      8          42.00      32.27        56872       57.19   0.26     6.04    0.75 
      9          40.00      30.12        60932       61.28   0.28     6.47    0.72 
     10          37.00      26.91        68202       68.59   0.31     7.24    0.72 
     11          35.00      25.10        73107       73.52   0.34     7.76    0.71 
     12          34.00      23.89        76825       77.26   0.35     8.15    0.68 
 
GRID: 128 x 64 x 64  

                                                                            Speedup/
 Processors  Wall Clock  tused(s) Zone-Cycles/sec   MFLOPS   C90s   Speedup Processor 
      1         585.00     571.28        10095       10.35   0.03     1.00    1.00 
      2         309.00     294.61        19576       20.07   0.07     1.94    0.97 
      3         235.00     218.89        26347       27.01   0.09     2.61    0.87 
      4         179.00     158.84        36307       37.22   0.12     3.60    0.90 
      5         158.00     138.63        41601       42.65   0.14     4.12    0.82 
      6         136.00     115.67        49859       51.12   0.17     4.94    0.82 
      7         124.00     103.34        55809       57.22   0.19     5.53    0.79 
      8         110.00      90.58        63671       65.28   0.22     6.31    0.79 
      9         102.00      82.81        69646       71.40   0.24     6.90    0.77 
     10          96.00      76.78        75113       77.01   0.26     7.44    0.74 
     11          90.00      70.62        81666       83.72   0.28     8.09    0.74 
     12          88.00      68.67        83983       86.10   0.29     8.32    0.69 
 
GRID: 64 x 64 x 128 

                                                                            Speedup/
 Processors  Wall Clock  tused(s) Zone-Cycles/sec   MFLOPS   C90s   Speedup Processor 
      1         588.00     574.83        10033       10.08   0.04     1.00    1.00 
      2         314.00     295.75        19500       19.58   0.08     1.94    0.97 
      3         236.00     221.74        26009       26.12   0.11     2.59    0.86 
      4         178.00     158.23        36447       36.60   0.15     3.63    0.91 
      5         157.00     137.74        41870       42.05   0.17     4.17    0.83 
      6         134.00     115.11        50101       50.31   0.21     4.99    0.83 
      7         121.00     101.26        56954       57.19   0.24     5.68    0.81 
      8         108.00      88.90        64872       65.14   0.27     6.47    0.81 
      9         100.00      81.25        70980       71.28   0.30     7.07    0.79 
     10          89.00      73.17        78815       79.15   0.33     7.86    0.79 
     11          87.00      68.15        84629       84.98   0.35     8.44    0.77 
     12          82.00      63.29        91116       91.50   0.38     9.08    0.76 


SGI Indigo 2 Extreme

  • This workstation has 64MB of memory and can handle up to about a 64 * 64 * 64 grid without heavy paging to disk.

  • All routines were compiled with: f77 -c -O2 -w1 -g3 -Nq9999
 
 Processors tused(s) Zone-Cycles/sec   MFLOPS   C90s   Speedup Speedup/Processor 
GRID: 32 x 32 x 32 
    1        17.090        7669          7.91   0.07     1.00         1.00 
GRID: 64 x 64 x 64 
    1        201.83        9091          8.76   0.04     1.00         1.00 


Convex Exemplar SPP-1000
  • The data below were obtained with version 3.4.1 of ZEUS-3D, which calls system routines that time a single thread. Isom Crawford's Fortran-callable interface to the thread timing routines is also available.

  • This 4-HYPERnode system was configured with one HYPERnode devoted to processing one batch job at a time.

  • All routines were compiled with: fc -c -nw -O3 -or none

  • Parallelization directives were placed in the code just before many loop nests to ensure that they run concurrently.


 
GRID: 32 x 32 x 32 

                                                                            Speedup/
 Processors  Wall Clock  tused(s) Zone-Cycles/sec   MFLOPS   C90s   Speedup Processor 
      1          18.48      13.66         9594       10.27    .09     1.00    1.00 
      2          10.60       6.54        20047       21.47    .19     2.09    1.04 
      3           6.99       4.82        27168       29.09    .26     2.83     .94 
      4           5.91       3.57        36762       39.36    .35     3.83     .96 
      5           6.12       3.73        35101       37.59    .33     3.66     .73 
      6           5.17       2.88        45557       48.78    .43     4.75     .79 
      7           5.40       3.21        40777       43.66    .39     4.25     .61 
      8           5.25       3.00        43717       46.81    .41     4.56     .57 
GRID: 64 x 64 x 64 

                                                                            Speedup/
 Processors  Wall Clock  tused(s) Zone-Cycles/sec   MFLOPS   C90s   Speedup Processor 
      1         213.90     200.13         9169        9.22    .04     1.00    1.00 
      2         124.01     101.36        18103       18.21    .08     1.97     .99 
      3          91.58      71.80        25557       25.70    .12     2.79     .93 
      4          73.26      53.28        34441       34.64    .16     3.76     .94 
      5          66.30      46.17        39748       39.97    .18     4.34     .87 
      6          58.31      37.81        48529       48.80    .22     5.29     .88 
      7          55.63      34.10        53805       54.11    .25     5.87     .84 
      8          51.17      30.32        60530       60.87    .28     6.60     .83 
GRID: 128 x 64 x 64 

                                                                            Speedup/
 Processors  Wall Clock  tused(s) Zone-Cycles/sec   MFLOPS   C90s   Speedup Processor 
      1         636.14     601.41         9589        9.83    .03     1.00    1.00 
      2         345.62     305.53        18876       19.35    .06     1.97     .98 
      3         254.59     215.05        26818       27.49    .09     2.80     .93 
      4         202.70     159.71        36111       37.02    .12     3.77     .94 
      5         173.86     136.37        42290       43.36    .15     4.41     .88 
      6         152.31     115.18        50070       51.33    .17     5.22     .87 
      7         142.38     104.95        54950       56.34    .19     5.73     .82 
      8         130.32      89.23        64631       66.26    .22     6.74     .84 
GRID: 64 x 64 x 128 
                                                                            Speedup/
 Processors  Wall Clock  tused(s) Zone-Cycles/sec   MFLOPS   C90s   Speedup Processor 
      1         673.70     630.38         9149        9.19    .04     1.00    1.00 
      2         360.55     319.57        18047       18.12    .08     1.97     .99 
      3         263.08     222.71        25895       26.00    .11     2.83     .94 
      4         207.31     168.46        34236       34.38    .14     3.74     .94 
      5         179.45     140.11        41162       41.33    .17     4.50     .90 
      6         159.48     121.19        47586       47.79    .20     5.20     .87 
      7         146.25     106.74        54032       54.26    .23     5.91     .84 
      8         134.77      93.27        61831       62.09    .26     6.76     .84 


HP 715/80
  • The data below were obtained with ZEUS-3D version 3.4. This workstation has 32MB of memory and can handle up to about 64 * 32 * 32 zones without heavy paging to disk.

  • All routines were compiled with: f77 -c +O3
 
GRID: 32 x 32 x 32 
 Processors tused(s) Zone-Cycles/sec   MFLOPS   C90s   Speedup Speedup/Processor 
    1        12.030       10895         11.30   0.10     1.00         1.00 


Convex C3880
  • The data below were obtained with ZEUS-3D version 3.4.

  • All routines were compiled with: fc -c -fi -O2 -nw -or none -cxdb
 
 Processors tused(s) Zone-Cycles/sec   MFLOPS   C90s   Speedup Speedup/Processor 
GRID: 32 x 32 x 32 
    1        5.4688       23967         24.86   0.22     1.00         1.00 
GRID: 64 x 64 x 64 
    1        66.547       27574         26.28   0.12     1.00         1.00 
GRID:128 x 64 x 64 
    1        265.11       21754         20.93   0.07     1.00         1.00 
GRID: 64 x 64 x128 
    1        218.98       26336         24.10   0.10     1.00         1.00 


IBM RS/6000 Model 530 (128MB)
  • The data below were obtained with ZEUS-3D version 3.4.1.

  • All routines were compiled with: xlf -c -O3
 
GRID: 32 x 32 x 32 
 Processors tused(s) Zone-Cycles/sec   MFLOPS   C90s   Speedup Speedup/Processor 
    1         55.17        2376          2.54   0.02     1.00         1.00 

