DataStar Benchmark

Summary

These tests were run on SDSC's DataStar, using the high-memory p655 (8-way) nodes. The problem sizes are based on the suggested problems for the 1024^3 dataset. However, only the 512-task run replicates the original science run, since the number of passes was limited to 16 for all processor counts. This limit was chosen simply to avoid wasting CPU resources, and it means that the numbers here represent a weak scaling study (the problem size increases linearly with the number of tasks).
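The linear growth in problem size can be checked against the "Total Problems Size" values in the logs further down. The sketch below assumes the total is simply the product of the four Problem Setup parameters; that formula is inferred from the reported numbers rather than taken from the sfgen source, and the function name is hypothetical.

```python
def total_problem_size(passes, bins, pairs, tasks):
    """Assumed formula: product of the four Problem Setup parameters."""
    return passes * bins * pairs * tasks

TASK_COUNTS = (64, 128, 256, 512)

# Small bins setup from the logs: 16 passes, 8 bins, 1048576 pairs.
small = {n: total_problem_size(16, 8, 1048576, n) for n in TASK_COUNTS}
# Large bins setup from the logs: 16 passes, 5 bins, 524288 pairs.
large = {n: total_problem_size(16, 5, 524288, n) for n in TASK_COUNTS}

for n in TASK_COUNTS:
    # Doubling the task count doubles the total, as expected for weak scaling.
    print(n, small[n], large[n])
```

With the pass count fixed at 16, each doubling of the task count doubles the total problem size, which is what makes this a weak scaling study.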

An estimate of the strong scaling (keeping the problem size fixed, independent of the number of tasks) can be obtained by multiplying the total pass time by the appropriate ratio. For example, the 64-task small bins calculation took 1217 seconds to complete 16 passes. To cover the same number of points per bin as the 512-task run, 64 tasks would have required 128 passes (8 times as many), which would take an estimated 9736 seconds. This is a fair comparison, mainly because of the problem decomposition used in sfgen.
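The arithmetic behind this estimate can be written out as a short sketch. The function name and its arguments are hypothetical, and it assumes the time per pass stays constant as the number of passes grows.

```python
def estimate_strong_scaling_time(measured_total, measured_passes, tasks, reference_tasks):
    """Scale a measured run up to the problem size of the reference run.

    Assumes the time per pass does not change with the number of passes.
    Returns (passes_needed, estimated_total_seconds).
    """
    passes_needed = measured_passes * reference_tasks // tasks
    seconds_per_pass = measured_total / measured_passes
    return passes_needed, seconds_per_pass * passes_needed

# 64-task small bins run: 1217 seconds for 16 passes, scaled to the 512-task problem.
passes, seconds = estimate_strong_scaling_time(1217.0, 16, tasks=64, reference_tasks=512)
print(passes, seconds)
```

For the 64-task small bins run this gives 128 passes and 9736 seconds, matching the figures quoted above.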

These results collect the different steps taken during each pass into several categories:

  • Pass: PassTotalTime
  • Math: GetPointPairs, CalculateValues, CalculatePDF
  • Data: CountGridCells, GetCellIndexes, ReadInGrids, ReadCellValues
  • Communications: CommunicateCellCounts, CommunicateCellIndexes, CommunicateCellValues

See the page on Performance for an explanation of the individual steps that are measured.
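As an illustration of the grouping, the following sketch sums per-step averages into the categories listed above. The step values are the averages from the 64-task small bins log; the dictionary layout and the `summarize` helper are hypothetical and are not part of sfgen.

```python
# Category membership as listed above.
CATEGORIES = {
    "Math": ["GetPointPairs", "CalculateValues", "CalculatePDF"],
    "Data": ["CountGridCells", "GetCellIndexes", "ReadInGrids", "ReadCellValues"],
    "Communications": ["CommunicateCellCounts", "CommunicateCellIndexes",
                       "CommunicateCellValues"],
}

# Average seconds per pass, copied from the 64-task small bins log.
step_averages = {
    "GetPointPairs": 8.03, "CountGridCells": 0.36, "CommunicateCellCounts": 0.49,
    "GetCellIndexes": 1.00, "CommunicateCellIndexes": 0.34, "ReadInGrids": 12.71,
    "CommunicateCellValues": 2.58, "ReadCellValues": 4.93, "CalculateValues": 29.44,
    "CalculatePDF": 16.18,
}

def summarize(steps):
    """Sum the per-step times into one value per category."""
    return {name: sum(steps[s] for s in members)
            for name, members in CATEGORIES.items()}

totals = summarize(step_averages)
for name, value in totals.items():
    print(f"{name:15s} {value:6.2f}")
```

Because the inputs here are already rounded to two decimals, the sums can differ from the logged category averages by 0.01.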

Small Bins (Short Distances)

Average times (seconds):

Ntasks    Pass    Math    Data  Communications
    64   76.06   53.65   19.01            3.40
   128   77.50   53.62   18.38            5.51
   256   78.62   53.67   19.08            5.86
   512   80.35   53.58   18.05            8.72

Large Bins (Long Distances)

Average times (seconds):

Ntasks    Pass    Math    Data  Communications
    64   23.99   17.11    5.56            1.33
   128   24.32   17.04    5.75            1.52
   256   24.77   17.06    5.74            1.98
   512   25.01   17.10    5.62            2.29
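One common way to read these tables is as a weak scaling efficiency, taken here as the 64-task average pass time divided by the pass time at each larger task count. The sketch below applies that definition to the Pass columns of the two summary tables; the choice of the 64-task run as the baseline is an assumption made for illustration.

```python
# Average pass times (seconds) from the two summary tables.
pass_times = {
    "small": {64: 76.06, 128: 77.50, 256: 78.62, 512: 80.35},
    "large": {64: 23.99, 128: 24.32, 256: 24.77, 512: 25.01},
}

def weak_scaling_efficiency(times, baseline=64):
    """Efficiency relative to the baseline run: T(baseline) / T(n)."""
    return {n: times[baseline] / t for n, t in times.items()}

efficiency = {case: weak_scaling_efficiency(times)
              for case, times in pass_times.items()}

for case, values in efficiency.items():
    for n, e in values.items():
        print(f"{case:5s} {n:4d} {e:6.1%}")
```

An efficiency of 1.0 would mean the pass time did not grow at all as tasks were added; in both tables the Math and Data columns stay nearly flat, so most of the growth in pass time appears in the Communications column.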

Small Bins (Short Distances)

64 Tasks Small

=====================================================================
Problem Setup
=====================================================================

NumberOfPasses      = 16
NumberOfPDFBins     = 5000
NumberOfBins        = 8
NumberOfPairs       = 1048576
NumberOfProcessors  = 64
---------------------------------------------------------------------
Total Problems Size = 8589934592

=====================================================================
Pass Statistics
=====================================================================

Step                     Average  Minimum  Maximum    Total  Fraction
---------------------------------------------------------------------
GetPointPairs               8.03     8.01     8.07   128.55     10.6%
CountGridCells              0.36     0.36     0.37     5.81      0.5%
CommunicateCellCounts       0.49     0.18     0.94     7.79      0.6%
GetCellIndexes              1.00     0.97     1.21    16.06      1.3%
CommunicateCellIndexes      0.34     0.29     0.42     5.43      0.4%
ReadInGrids                12.71    12.61    13.68   203.40     16.7%
CommunicateCellValues       2.58     2.40     2.70    41.22      3.4%
ReadCellValues              4.93     4.57     5.22    78.92      6.5%
CalculateValues            29.44    29.38    29.49   471.01     38.7%
CalculatePDF               16.18    16.16    16.20   258.81     21.3%
---------------------------------------------------------------------
PassTotalTime              76.06    75.28    76.68  1217.01    100.0%

Category                 Average    Total  Fraction
---------------------------------------------------
Math                       53.65   858.37     70.5%
Data                       19.01   304.20     25.0%
Communications              3.40    54.45      4.5%

=====================================================================
Other Statistics
=====================================================================

Initialization and Reduction
---------------------------------------------------
ReadAllData Time  =      29.47
Reduce Time =   0.929694
Writing Time =   0.434644

Memory
---------------------------------------------------
Memory high water            =    3313.176 MB
Memory low water             =    3311.772 MB
Total memory                 =     212.025 GB

128 Tasks Small

=====================================================================
Problem Setup
=====================================================================

NumberOfPasses      = 16
NumberOfPDFBins     = 5000
NumberOfBins        = 8
NumberOfPairs       = 1048576
NumberOfProcessors  = 128
---------------------------------------------------------------------
Total Problems Size = 17179869184

=====================================================================
Pass Statistics
=====================================================================

Step                     Average  Minimum  Maximum    Total  Fraction
---------------------------------------------------------------------
GetPointPairs               7.90     7.86     8.21   126.39     10.2%
CountGridCells              0.36     0.36     0.36     5.78      0.5%
CommunicateCellCounts       0.39     0.10     1.23     6.24      0.5%
GetCellIndexes              1.01     0.97     1.19    16.13      1.3%
CommunicateCellIndexes      0.47     0.39     0.55     7.57      0.6%
ReadInGrids                12.08    11.86    13.11   193.33     15.6%
CommunicateCellValues       4.64     4.01     4.85    74.30      6.0%
ReadCellValues              4.93     4.69     5.09    78.83      6.4%
CalculateValues            29.54    29.42    30.20   472.57     38.1%
CalculatePDF               16.18    16.16    16.22   258.92     20.9%
---------------------------------------------------------------------
PassTotalTime              77.50    77.09    78.17  1240.08    100.0%

Category                 Average    Total  Fraction
---------------------------------------------------
Math                       53.62   857.89     69.2%
Data                       18.38   294.07     23.7%
Communications              5.51    88.11      7.1%

=====================================================================
Other Statistics
=====================================================================

Initialization and Reduction
---------------------------------------------------
ReadAllData Time  =    20.2078
Reduce Time =   0.531876
Writing Time =   0.413945

Memory
---------------------------------------------------
Memory high water            =    2938.896 MB
Memory low water             =    2937.292 MB
Total memory                 =     376.142 GB

256 Tasks Small

=====================================================================
Problem Setup
=====================================================================

NumberOfPasses      = 16
NumberOfPDFBins     = 5000
NumberOfBins        = 8
NumberOfPairs       = 1048576
NumberOfProcessors  = 256
---------------------------------------------------------------------
Total Problems Size = 34359738368

=====================================================================
Pass Statistics
=====================================================================

Step                     Average  Minimum  Maximum    Total  Fraction
---------------------------------------------------------------------
GetPointPairs               7.91     7.87     7.94   126.48     10.1%
CountGridCells              0.36     0.36     0.36     5.77      0.5%
CommunicateCellCounts       0.89     0.24     3.44    14.29      1.1%
GetCellIndexes              0.98     0.96     1.14    15.73      1.3%
CommunicateCellIndexes      0.52     0.44     0.77     8.35      0.7%
ReadInGrids                12.92    12.82    13.81   206.69     16.4%
CommunicateCellValues       4.45     3.94     4.92    71.17      5.7%
ReadCellValues              4.82     4.43     5.16    77.07      6.1%
CalculateValues            29.54    29.38    29.99   472.72     37.6%
CalculatePDF               16.22    16.16    16.96   259.58     20.6%
---------------------------------------------------------------------
PassTotalTime              78.62    77.46    81.91  1257.85    100.0%

Category                 Average    Total  Fraction
---------------------------------------------------
Math                       53.67   858.78     68.3%
Data                       19.08   305.26     24.3%
Communications              5.86    93.80      7.5%

=====================================================================
Other Statistics
=====================================================================

Initialization and Reduction
---------------------------------------------------
ReadAllData Time  =    25.2924
Reduce Time =   0.538649
Writing Time =   0.801747

Memory
---------------------------------------------------
Memory high water            =    2754.296 MB
Memory low water             =    2753.264 MB
Total memory                 =     705.028 GB

512 Tasks Small

=====================================================================
Problem Setup
=====================================================================

NumberOfPasses      = 16
NumberOfPDFBins     = 5000
NumberOfBins        = 8
NumberOfPairs       = 1048576
NumberOfProcessors  = 512
---------------------------------------------------------------------
Total Problems Size = 68719476736

=====================================================================
Pass Statistics
=====================================================================

Step                     Average  Minimum  Maximum    Total  Fraction
---------------------------------------------------------------------
GetPointPairs               7.93     7.90     7.98   126.81      9.9%
CountGridCells              0.37     0.36     0.37     5.85      0.5%
CommunicateCellCounts       0.79     0.40     1.28    12.60      1.0%
GetCellIndexes              0.98     0.95     1.15    15.64      1.2%
CommunicateCellIndexes      0.62     0.54     0.73     9.98      0.8%
ReadInGrids                11.92    11.66    12.73   190.68     14.8%
CommunicateCellValues       7.31     5.51     7.77   116.95      9.1%
ReadCellValues              4.79     4.55     5.25    76.62      6.0%
CalculateValues            29.48    29.28    29.56   471.61     36.7%
CalculatePDF               16.18    16.15    16.21   258.80     20.1%
---------------------------------------------------------------------
PassTotalTime              80.35    79.47    81.12  1285.55    100.0%

Category                 Average    Total  Fraction
---------------------------------------------------
Math                       53.58   857.22     66.7%
Data                       18.05   288.79     22.5%
Communications              8.72   139.54     10.9%

=====================================================================
Other Statistics
=====================================================================

Initialization and Reduction
---------------------------------------------------
ReadAllData Time  =    19.2613
Reduce Time =    1.58642
Writing Time =   0.454068

Memory
---------------------------------------------------
Memory high water            =    2667.132 MB
Memory low water             =    2663.316 MB
Total memory                 =    1365.239 GB

Large Bins (Long Distances)

64 Tasks Large

=====================================================================
Problem Setup
=====================================================================

NumberOfPasses      = 16
NumberOfPDFBins     = 5000
NumberOfBins        = 5
NumberOfPairs       = 524288
NumberOfProcessors  = 64
---------------------------------------------------------------------
Total Problems Size = 2684354560

=====================================================================
Pass Statistics
=====================================================================

Step                     Average  Minimum  Maximum    Total  Fraction
---------------------------------------------------------------------
GetPointPairs               2.60     2.58     2.68    41.65     10.9%
CountGridCells              0.13     0.13     0.14     2.14      0.6%
CommunicateCellCounts       0.17     0.02     0.47     2.68      0.7%
GetCellIndexes              0.34     0.33     0.39     5.43      1.4%
CommunicateCellIndexes      0.09     0.09     0.13     1.51      0.4%
ReadInGrids                 3.30     3.24     3.63    52.82     13.8%
CommunicateCellValues       1.06     0.97     1.11    17.02      4.4%
ReadCellValues              1.78     1.67     1.89    28.51      7.4%
CalculateValues             9.44     9.41     9.46   150.97     39.3%
CalculatePDF                5.07     5.06     5.08    81.12     21.1%
---------------------------------------------------------------------
PassTotalTime              23.99    23.69    24.77   383.85    100.0%

Category                 Average    Total  Fraction
---------------------------------------------------
Math                       17.11   273.74     71.3%
Data                        5.56    88.90     23.2%
Communications              1.33    21.21      5.5%

=====================================================================
Other Statistics
=====================================================================

Initialization and Reduction
---------------------------------------------------
ReadAllData Time  =    30.2304
Reduce Time =    0.15192
Writing Time =   0.309057

Memory
---------------------------------------------------
Memory high water            =    1740.360 MB
Memory low water             =    1739.928 MB
Total memory                 =     111.373 GB

128 Tasks Large

=====================================================================
Problem Setup
=====================================================================

NumberOfPasses      = 16
NumberOfPDFBins     = 5000
NumberOfBins        = 5
NumberOfPairs       = 524288
NumberOfProcessors  = 128
---------------------------------------------------------------------
Total Problems Size = 5368709120

=====================================================================
Pass Statistics
=====================================================================

Step                     Average  Minimum  Maximum    Total  Fraction
---------------------------------------------------------------------
GetPointPairs               2.54     2.53     2.56    40.66     10.5%
CountGridCells              0.13     0.13     0.14     2.13      0.5%
CommunicateCellCounts       0.22     0.06     1.09     3.48      0.9%
GetCellIndexes              0.33     0.33     0.38     5.31      1.4%
CommunicateCellIndexes      0.14     0.12     0.19     2.24      0.6%
ReadInGrids                 3.52     3.48     3.76    56.39     14.5%
CommunicateCellValues       1.16     1.10     1.21    18.58      4.8%
ReadCellValues              1.76     1.64     1.87    28.24      7.3%
CalculateValues             9.44     9.42     9.47   151.07     38.8%
CalculatePDF                5.06     5.06     5.06    80.96     20.8%
---------------------------------------------------------------------
PassTotalTime              24.32    24.04    25.49   389.07    100.0%

Category                 Average    Total  Fraction
---------------------------------------------------
Math                       17.04   272.68     70.1%
Data                        5.75    92.08     23.7%
Communications              1.52    24.30      6.2%

=====================================================================
Other Statistics
=====================================================================

Initialization and Reduction
---------------------------------------------------
ReadAllData Time  =    24.5204
Reduce Time =   0.366714
Writing Time =   0.335625

Memory
---------------------------------------------------
Memory high water            =    1366.032 MB
Memory low water             =    1365.300 MB
Total memory                 =     174.835 GB

256 Tasks Large

=====================================================================
Problem Setup
=====================================================================

NumberOfPasses      = 16
NumberOfPDFBins     = 5000
NumberOfBins        = 5
NumberOfPairs       = 524288
NumberOfProcessors  = 256
---------------------------------------------------------------------
Total Problems Size = 10737418240

=====================================================================
Pass Statistics
=====================================================================

Step                     Average  Minimum  Maximum    Total  Fraction
---------------------------------------------------------------------
GetPointPairs               2.55     2.54     2.56    40.78     10.3%
CountGridCells              0.13     0.13     0.14     2.14      0.5%
CommunicateCellCounts       0.47     0.01     4.47     7.53      1.9%
GetCellIndexes              0.34     0.33     0.39     5.38      1.4%
CommunicateCellIndexes      0.16     0.13     0.23     2.53      0.6%
ReadInGrids                 3.49     3.44     3.80    55.91     14.1%
CommunicateCellValues       1.35     1.24     1.46    21.57      5.4%
ReadCellValues              1.77     1.65     1.89    28.38      7.2%
CalculateValues             9.45     9.42     9.56   151.21     38.2%
CalculatePDF                5.06     5.05     5.07    80.93     20.4%
---------------------------------------------------------------------
PassTotalTime              24.77    24.15    29.27   396.35    100.0%

Category                 Average    Total  Fraction
---------------------------------------------------
Math                       17.06   272.92     68.9%
Data                        5.74    91.80     23.2%
Communications              1.98    31.63      8.0%

=====================================================================
Other Statistics
=====================================================================

Initialization and Reduction
---------------------------------------------------
ReadAllData Time  =    21.6039
Reduce Time =   0.349251
Writing Time =   0.334087

Memory
---------------------------------------------------
Memory high water            =    1181.468 MB
Memory low water             =    1180.768 MB
Total memory                 =     302.426 GB

512 Tasks Large

=====================================================================
Problem Setup
=====================================================================

NumberOfPasses      = 16
NumberOfPDFBins     = 5000
NumberOfBins        = 5
NumberOfPairs       = 524288
NumberOfProcessors  = 512
---------------------------------------------------------------------
Total Problems Size = 21474836480

=====================================================================
Pass Statistics
=====================================================================

Step                     Average  Minimum  Maximum    Total  Fraction
---------------------------------------------------------------------
GetPointPairs               2.56     2.55     2.58    41.03     10.3%
CountGridCells              0.13     0.13     0.14     2.13      0.5%
CommunicateCellCounts       0.31     0.07     2.56     4.96      1.2%
GetCellIndexes              0.34     0.33     0.39     5.41      1.4%
CommunicateCellIndexes      0.18     0.16     0.22     2.94      0.7%
ReadInGrids                 3.40     3.32     3.70    54.36     13.6%
CommunicateCellValues       1.80     1.71     1.92    28.81      7.2%
ReadCellValues              1.75     1.62     1.86    28.04      7.0%
CalculateValues             9.46     9.42     9.55   151.42     37.8%
CalculatePDF                5.07     5.05     5.10    81.08     20.3%
---------------------------------------------------------------------
PassTotalTime              25.01    24.58    27.66   400.19    100.0%

Category                 Average    Total  Fraction
---------------------------------------------------
Math                       17.10   273.53     68.3%
Data                        5.62    89.95     22.5%
Communications              2.29    36.71      9.2%

=====================================================================
Other Statistics
=====================================================================

Initialization and Reduction
---------------------------------------------------
ReadAllData Time  =    14.6487
Reduce Time =    0.45023
Writing Time =   0.329942

Memory
---------------------------------------------------
Memory high water            =    1094.284 MB
Memory low water             =    1093.036 MB
Total memory                 =     560.158 GB