DataStar Benchmark
Summary
These tests were run on SDSC's DataStar, using the high-memory p655 (8-way) nodes. The problem sizes are based on the suggested problems for the 1024^3 dataset. However, only the 512-task run replicates the original science run, because the number of passes was limited to 16 for all processor counts. This limit was chosen simply to avoid wasting CPU resources. It means that the numbers here represent a weak scaling study, in which the problem size increases linearly with the number of tasks.
An estimate of the strong scaling (a fixed problem size, independent of the number of tasks) can be obtained by multiplying the total pass time by the appropriate ratio. For example, the small bins calculation on 64 tasks took 1217 seconds to complete 16 passes. To process the same number of points per bin as the 512-task run, 64 tasks would have required 512/64 = 8 times as many passes, or 128 passes, which would take an estimated 8 × 1217 = 9736 seconds. This is a fair comparison, mainly because of the problem decomposition used in sfgen.
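The strong-scaling estimate described above is simple arithmetic, sketched here in Python. The function name `estimate_strong_scaling_time` and its parameters are illustrative only and are not part of sfgen; the sketch assumes that the time per pass stays constant as the number of passes grows.

```python
def estimate_strong_scaling_time(measured_total, measured_passes, ntasks,
                                 reference_ntasks=512):
    """Estimate the run time needed to match the reference run's points per bin.

    Assumes the time per pass is constant, so the total time scales
    linearly with the number of passes.
    """
    # Fewer tasks need proportionally more passes to cover the same points.
    required_passes = measured_passes * reference_ntasks // ntasks
    estimated_total = measured_total * required_passes / measured_passes
    return required_passes, estimated_total


# 64-task small bins run: 16 passes took 1217.01 seconds in total.
passes, seconds = estimate_strong_scaling_time(1217.01, 16, 64)
print(passes, round(seconds))  # 128 9736
```

Applying the same function to the 512-task run returns the measured time unchanged, since that run already matches the reference.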
These results collect the different steps taken during each pass into several categories:
- Pass: PassTotalTime
- Math: GetPointPairs, CalculateValues, CalculatePDF
- Data: CountGridCells, GetCellIndexes, ReadInGrids, ReadCellValues
- Communication: CommunicateCellCounts, CommunicateCellIndexes, CommunicateCellValues
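The grouping listed above can be expressed as a small lookup, sketched here in Python. The dictionary and function names are hypothetical, and the step averages are copied from the 64-task small bins run reported later on this page. Because the inputs are already rounded to two decimals, the sums can differ from the reported category averages by about 0.01 seconds.

```python
# Mapping from category to the measured steps it collects (from the list above).
CATEGORIES = {
    "Math": ["GetPointPairs", "CalculateValues", "CalculatePDF"],
    "Data": ["CountGridCells", "GetCellIndexes", "ReadInGrids", "ReadCellValues"],
    "Communication": ["CommunicateCellCounts", "CommunicateCellIndexes",
                      "CommunicateCellValues"],
}


def categorize(step_times):
    """Sum per-step times (seconds) into the three categories."""
    return {name: sum(step_times[step] for step in steps)
            for name, steps in CATEGORIES.items()}


# Average step times (seconds) for the 64-task small bins run.
steps_64_small = {
    "GetPointPairs": 8.03, "CountGridCells": 0.36,
    "CommunicateCellCounts": 0.49, "GetCellIndexes": 1.00,
    "CommunicateCellIndexes": 0.34, "ReadInGrids": 12.71,
    "CommunicateCellValues": 2.58, "ReadCellValues": 4.93,
    "CalculateValues": 29.44, "CalculatePDF": 16.18,
}

totals = categorize(steps_64_small)
for name, seconds in totals.items():
    print(f"{name:14s} {seconds:6.2f}")
```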
See the page on Performance for an explanation of the individual steps that are measured.
Small Bins (Short Distances)
Average times (seconds):
| Ntasks | Pass | Math | Data | Communications |
| --- | --- | --- | --- | --- |
| 64 | 76.06 | 53.65 | 19.01 | 3.40 |
| 128 | 77.50 | 53.62 | 18.38 | 5.51 |
| 256 | 78.62 | 53.67 | 19.08 | 5.86 |
| 512 | 80.35 | 53.58 | 18.05 | 8.72 |
Large Bins (Long Distances)
Average times (seconds):
| Ntasks | Pass | Math | Data | Communications |
| --- | --- | --- | --- | --- |
| 64 | 23.99 | 17.11 | 5.56 | 1.33 |
| 128 | 24.32 | 17.04 | 5.75 | 1.52 |
| 256 | 24.77 | 17.06 | 5.74 | 1.98 |
| 512 | 25.01 | 17.10 | 5.62 | 2.29 |
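Since these runs form a weak scaling study, one common way to summarize the two tables is the weak-scaling efficiency, taken here as the average pass time at 64 tasks divided by the average pass time at N tasks. The sketch below is in Python; the function name and the choice of 64 tasks as the baseline are assumptions made for illustration.

```python
def weak_scaling_efficiency(pass_times, baseline_ntasks=64):
    """Return baseline_time / time for each task count (1.0 is ideal)."""
    base = pass_times[baseline_ntasks]
    return {ntasks: base / t for ntasks, t in pass_times.items()}


# Average pass times (seconds) from the two tables above.
small = {64: 76.06, 128: 77.50, 256: 78.62, 512: 80.35}
large = {64: 23.99, 128: 24.32, 256: 24.77, 512: 25.01}

for label, times in (("small", small), ("large", large)):
    for ntasks, eff in weak_scaling_efficiency(times).items():
        print(f"{label:5s} {ntasks:4d} tasks: {eff:.1%}")
```

With these numbers the efficiency at 512 tasks is roughly 95% for both bin sizes, and the loss tracks the growth of the Communications column while Math and Data stay nearly flat.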
Small Bins (Short Distances)
64 Tasks Small
=====================================================================
Problem Setup
=====================================================================
NumberOfPasses     = 16
NumberOfPDFBins    = 5000
NumberOfBins       = 8
NumberOfPairs      = 1048576
NumberOfProcessors = 64
---------------------------------------------------------------------
Total Problems Size = 8589934592
=====================================================================
Pass Statistics
=====================================================================
Step                      Average  Minimum  Maximum    Total Fraction
---------------------------------------------------------------------
GetPointPairs                8.03     8.01     8.07   128.55    10.6%
CountGridCells               0.36     0.36     0.37     5.81     0.5%
CommunicateCellCounts        0.49     0.18     0.94     7.79     0.6%
GetCellIndexes               1.00     0.97     1.21    16.06     1.3%
CommunicateCellIndexes       0.34     0.29     0.42     5.43     0.4%
ReadInGrids                 12.71    12.61    13.68   203.40    16.7%
CommunicateCellValues        2.58     2.40     2.70    41.22     3.4%
ReadCellValues               4.93     4.57     5.22    78.92     6.5%
CalculateValues             29.44    29.38    29.49   471.01    38.7%
CalculatePDF                16.18    16.16    16.20   258.81    21.3%
---------------------------------------------------------------------
PassTotalTime               76.06    75.28    76.68  1217.01   100.0%
Category                  Average    Total Fraction
---------------------------------------------------
Math                        53.65   858.37    70.5%
Data                        19.01   304.20    25.0%
Communications               3.40    54.45     4.5%
=====================================================================
Other Statistics
=====================================================================
Initialization and Reduction
---------------------------------------------------
ReadAllData Time = 29.47
Reduce Time      = 0.929694
Writing Time     = 0.434644
Memory
---------------------------------------------------
Memory high water = 3313.176 MB
Memory low water  = 3311.772 MB
Total memory      = 212.025 GB
128 Tasks Small
=====================================================================
Problem Setup
=====================================================================
NumberOfPasses     = 16
NumberOfPDFBins    = 5000
NumberOfBins       = 8
NumberOfPairs      = 1048576
NumberOfProcessors = 128
---------------------------------------------------------------------
Total Problems Size = 17179869184
=====================================================================
Pass Statistics
=====================================================================
Step                      Average  Minimum  Maximum    Total Fraction
---------------------------------------------------------------------
GetPointPairs                7.90     7.86     8.21   126.39    10.2%
CountGridCells               0.36     0.36     0.36     5.78     0.5%
CommunicateCellCounts        0.39     0.10     1.23     6.24     0.5%
GetCellIndexes               1.01     0.97     1.19    16.13     1.3%
CommunicateCellIndexes       0.47     0.39     0.55     7.57     0.6%
ReadInGrids                 12.08    11.86    13.11   193.33    15.6%
CommunicateCellValues        4.64     4.01     4.85    74.30     6.0%
ReadCellValues               4.93     4.69     5.09    78.83     6.4%
CalculateValues             29.54    29.42    30.20   472.57    38.1%
CalculatePDF                16.18    16.16    16.22   258.92    20.9%
---------------------------------------------------------------------
PassTotalTime               77.50    77.09    78.17  1240.08   100.0%
Category                  Average    Total Fraction
---------------------------------------------------
Math                        53.62   857.89    69.2%
Data                        18.38   294.07    23.7%
Communications               5.51    88.11     7.1%
=====================================================================
Other Statistics
=====================================================================
Initialization and Reduction
---------------------------------------------------
ReadAllData Time = 20.2078
Reduce Time      = 0.531876
Writing Time     = 0.413945
Memory
---------------------------------------------------
Memory high water = 2938.896 MB
Memory low water  = 2937.292 MB
Total memory      = 376.142 GB
256 Tasks Small
=====================================================================
Problem Setup
=====================================================================
NumberOfPasses     = 16
NumberOfPDFBins    = 5000
NumberOfBins       = 8
NumberOfPairs      = 1048576
NumberOfProcessors = 256
---------------------------------------------------------------------
Total Problems Size = 34359738368
=====================================================================
Pass Statistics
=====================================================================
Step                      Average  Minimum  Maximum    Total Fraction
---------------------------------------------------------------------
GetPointPairs                7.91     7.87     7.94   126.48    10.1%
CountGridCells               0.36     0.36     0.36     5.77     0.5%
CommunicateCellCounts        0.89     0.24     3.44    14.29     1.1%
GetCellIndexes               0.98     0.96     1.14    15.73     1.3%
CommunicateCellIndexes       0.52     0.44     0.77     8.35     0.7%
ReadInGrids                 12.92    12.82    13.81   206.69    16.4%
CommunicateCellValues        4.45     3.94     4.92    71.17     5.7%
ReadCellValues               4.82     4.43     5.16    77.07     6.1%
CalculateValues             29.54    29.38    29.99   472.72    37.6%
CalculatePDF                16.22    16.16    16.96   259.58    20.6%
---------------------------------------------------------------------
PassTotalTime               78.62    77.46    81.91  1257.85   100.0%
Category                  Average    Total Fraction
---------------------------------------------------
Math                        53.67   858.78    68.3%
Data                        19.08   305.26    24.3%
Communications               5.86    93.80     7.5%
=====================================================================
Other Statistics
=====================================================================
Initialization and Reduction
---------------------------------------------------
ReadAllData Time = 25.2924
Reduce Time      = 0.538649
Writing Time     = 0.801747
Memory
---------------------------------------------------
Memory high water = 2754.296 MB
Memory low water  = 2753.264 MB
Total memory      = 705.028 GB
512 Tasks Small
=====================================================================
Problem Setup
=====================================================================
NumberOfPasses     = 16
NumberOfPDFBins    = 5000
NumberOfBins       = 8
NumberOfPairs      = 1048576
NumberOfProcessors = 512
---------------------------------------------------------------------
Total Problems Size = 68719476736
=====================================================================
Pass Statistics
=====================================================================
Step                      Average  Minimum  Maximum    Total Fraction
---------------------------------------------------------------------
GetPointPairs                7.93     7.90     7.98   126.81     9.9%
CountGridCells               0.37     0.36     0.37     5.85     0.5%
CommunicateCellCounts        0.79     0.40     1.28    12.60     1.0%
GetCellIndexes               0.98     0.95     1.15    15.64     1.2%
CommunicateCellIndexes       0.62     0.54     0.73     9.98     0.8%
ReadInGrids                 11.92    11.66    12.73   190.68    14.8%
CommunicateCellValues        7.31     5.51     7.77   116.95     9.1%
ReadCellValues               4.79     4.55     5.25    76.62     6.0%
CalculateValues             29.48    29.28    29.56   471.61    36.7%
CalculatePDF                16.18    16.15    16.21   258.80    20.1%
---------------------------------------------------------------------
PassTotalTime               80.35    79.47    81.12  1285.55   100.0%
Category                  Average    Total Fraction
---------------------------------------------------
Math                        53.58   857.22    66.7%
Data                        18.05   288.79    22.5%
Communications               8.72   139.54    10.9%
=====================================================================
Other Statistics
=====================================================================
Initialization and Reduction
---------------------------------------------------
ReadAllData Time = 19.2613
Reduce Time      = 1.58642
Writing Time     = 0.454068
Memory
---------------------------------------------------
Memory high water = 2667.132 MB
Memory low water  = 2663.316 MB
Total memory      = 1365.239 GB
Large Bins (Long Distances)
64 Tasks Large
=====================================================================
Problem Setup
=====================================================================
NumberOfPasses     = 16
NumberOfPDFBins    = 5000
NumberOfBins       = 5
NumberOfPairs      = 524288
NumberOfProcessors = 64
---------------------------------------------------------------------
Total Problems Size = 2684354560
=====================================================================
Pass Statistics
=====================================================================
Step                      Average  Minimum  Maximum    Total Fraction
---------------------------------------------------------------------
GetPointPairs                2.60     2.58     2.68    41.65    10.9%
CountGridCells               0.13     0.13     0.14     2.14     0.6%
CommunicateCellCounts        0.17     0.02     0.47     2.68     0.7%
GetCellIndexes               0.34     0.33     0.39     5.43     1.4%
CommunicateCellIndexes       0.09     0.09     0.13     1.51     0.4%
ReadInGrids                  3.30     3.24     3.63    52.82    13.8%
CommunicateCellValues        1.06     0.97     1.11    17.02     4.4%
ReadCellValues               1.78     1.67     1.89    28.51     7.4%
CalculateValues              9.44     9.41     9.46   150.97    39.3%
CalculatePDF                 5.07     5.06     5.08    81.12    21.1%
---------------------------------------------------------------------
PassTotalTime               23.99    23.69    24.77   383.85   100.0%
Category                  Average    Total Fraction
---------------------------------------------------
Math                        17.11   273.74    71.3%
Data                         5.56    88.90    23.2%
Communications               1.33    21.21     5.5%
=====================================================================
Other Statistics
=====================================================================
Initialization and Reduction
---------------------------------------------------
ReadAllData Time = 30.2304
Reduce Time      = 0.15192
Writing Time     = 0.309057
Memory
---------------------------------------------------
Memory high water = 1740.360 MB
Memory low water  = 1739.928 MB
Total memory      = 111.373 GB
128 Tasks Large
=====================================================================
Problem Setup
=====================================================================
NumberOfPasses     = 16
NumberOfPDFBins    = 5000
NumberOfBins       = 5
NumberOfPairs      = 524288
NumberOfProcessors = 128
---------------------------------------------------------------------
Total Problems Size = 5368709120
=====================================================================
Pass Statistics
=====================================================================
Step                      Average  Minimum  Maximum    Total Fraction
---------------------------------------------------------------------
GetPointPairs                2.54     2.53     2.56    40.66    10.5%
CountGridCells               0.13     0.13     0.14     2.13     0.5%
CommunicateCellCounts        0.22     0.06     1.09     3.48     0.9%
GetCellIndexes               0.33     0.33     0.38     5.31     1.4%
CommunicateCellIndexes       0.14     0.12     0.19     2.24     0.6%
ReadInGrids                  3.52     3.48     3.76    56.39    14.5%
CommunicateCellValues        1.16     1.10     1.21    18.58     4.8%
ReadCellValues               1.76     1.64     1.87    28.24     7.3%
CalculateValues              9.44     9.42     9.47   151.07    38.8%
CalculatePDF                 5.06     5.06     5.06    80.96    20.8%
---------------------------------------------------------------------
PassTotalTime               24.32    24.04    25.49   389.07   100.0%
Category                  Average    Total Fraction
---------------------------------------------------
Math                        17.04   272.68    70.1%
Data                         5.75    92.08    23.7%
Communications               1.52    24.30     6.2%
=====================================================================
Other Statistics
=====================================================================
Initialization and Reduction
---------------------------------------------------
ReadAllData Time = 24.5204
Reduce Time      = 0.366714
Writing Time     = 0.335625
Memory
---------------------------------------------------
Memory high water = 1366.032 MB
Memory low water  = 1365.300 MB
Total memory      = 174.835 GB
256 Tasks Large
=====================================================================
Problem Setup
=====================================================================
NumberOfPasses     = 16
NumberOfPDFBins    = 5000
NumberOfBins       = 5
NumberOfPairs      = 524288
NumberOfProcessors = 256
---------------------------------------------------------------------
Total Problems Size = 10737418240
=====================================================================
Pass Statistics
=====================================================================
Step                      Average  Minimum  Maximum    Total Fraction
---------------------------------------------------------------------
GetPointPairs                2.55     2.54     2.56    40.78    10.3%
CountGridCells               0.13     0.13     0.14     2.14     0.5%
CommunicateCellCounts        0.47     0.01     4.47     7.53     1.9%
GetCellIndexes               0.34     0.33     0.39     5.38     1.4%
CommunicateCellIndexes       0.16     0.13     0.23     2.53     0.6%
ReadInGrids                  3.49     3.44     3.80    55.91    14.1%
CommunicateCellValues        1.35     1.24     1.46    21.57     5.4%
ReadCellValues               1.77     1.65     1.89    28.38     7.2%
CalculateValues              9.45     9.42     9.56   151.21    38.2%
CalculatePDF                 5.06     5.05     5.07    80.93    20.4%
---------------------------------------------------------------------
PassTotalTime               24.77    24.15    29.27   396.35   100.0%
Category                  Average    Total Fraction
---------------------------------------------------
Math                        17.06   272.92    68.9%
Data                         5.74    91.80    23.2%
Communications               1.98    31.63     8.0%
=====================================================================
Other Statistics
=====================================================================
Initialization and Reduction
---------------------------------------------------
ReadAllData Time = 21.6039
Reduce Time      = 0.349251
Writing Time     = 0.334087
Memory
---------------------------------------------------
Memory high water = 1181.468 MB
Memory low water  = 1180.768 MB
Total memory      = 302.426 GB
512 Tasks Large
=====================================================================
Problem Setup
=====================================================================
NumberOfPasses     = 16
NumberOfPDFBins    = 5000
NumberOfBins       = 5
NumberOfPairs      = 524288
NumberOfProcessors = 512
---------------------------------------------------------------------
Total Problems Size = 21474836480
=====================================================================
Pass Statistics
=====================================================================
Step                      Average  Minimum  Maximum    Total Fraction
---------------------------------------------------------------------
GetPointPairs                2.56     2.55     2.58    41.03    10.3%
CountGridCells               0.13     0.13     0.14     2.13     0.5%
CommunicateCellCounts        0.31     0.07     2.56     4.96     1.2%
GetCellIndexes               0.34     0.33     0.39     5.41     1.4%
CommunicateCellIndexes       0.18     0.16     0.22     2.94     0.7%
ReadInGrids                  3.40     3.32     3.70    54.36    13.6%
CommunicateCellValues        1.80     1.71     1.92    28.81     7.2%
ReadCellValues               1.75     1.62     1.86    28.04     7.0%
CalculateValues              9.46     9.42     9.55   151.42    37.8%
CalculatePDF                 5.07     5.05     5.10    81.08    20.3%
---------------------------------------------------------------------
PassTotalTime               25.01    24.58    27.66   400.19   100.0%
Category                  Average    Total Fraction
---------------------------------------------------
Math                        17.10   273.53    68.3%
Data                         5.62    89.95    22.5%
Communications               2.29    36.71     9.2%
=====================================================================
Other Statistics
=====================================================================
Initialization and Reduction
---------------------------------------------------
ReadAllData Time = 14.6487
Reduce Time      = 0.45023
Writing Time     = 0.329942
Memory
---------------------------------------------------
Memory high water = 1094.284 MB
Memory low water  = 1093.036 MB
Total memory      = 560.158 GB
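The "Total Problems Size" values reported in the eight logs are consistent with the product of the four Problem Setup parameters. The Python sketch below checks that relationship against the logged values. Whether sfgen computes the figure in exactly this way is an assumption, and the function name is hypothetical.

```python
def total_problem_size(passes, bins, pairs, processors):
    """Product of the Problem Setup parameters (assumed definition)."""
    return passes * bins * pairs * processors


# (NumberOfBins, NumberOfPairs, NumberOfProcessors, reported total size)
runs = [
    (8, 1048576, 64, 8589934592),
    (8, 1048576, 128, 17179869184),
    (8, 1048576, 256, 34359738368),
    (8, 1048576, 512, 68719476736),
    (5, 524288, 64, 2684354560),
    (5, 524288, 128, 5368709120),
    (5, 524288, 256, 10737418240),
    (5, 524288, 512, 21474836480),
]

# All runs used NumberOfPasses = 16.
mismatches = [run for run in runs
              if total_problem_size(16, run[0], run[1], run[2]) != run[3]]
print(mismatches)  # []
```

The total doubles with each doubling of the processor count, which is the weak scaling behavior described in the summary.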
