State-of-the-Art Analysis and Perspectives of China HPC Development: A View from 2011 HPC TOP100
Yunquan Zhang
GTC Asia 2011, December 14, 2011
HPC TOP100 200210, 2004 863 2007 2005 2006 2007 2004 7 2007 130HPC 2004SCIDACTOPS PI David Keyes TOP100 HPC Supercomputing in China TOP500 Hans Meuer Jack Dongarra. TOP500 TOP100 TOP500 TOP100. 2007 2010 Supercomputing Workshop TOP100
2011 TOP100
863 Program (No. 2006AA01A105)
Website: http://www.samss.org.cn
Contact: zyq@mail.rdcps.ac.cn, samss@mail.rdcps.ac.cn
2011 TOP100 1 Linpack, Q Linpack T LinpackTOP500( http://www.top500.org) C Linpack U Linpack / / S LinpackTOP500( http://www.top500.org) Linpack
2011 TOP100 2 / / 10 11 samss@mail.rdcps.ac.cn zyq@mail.rdcps.ac.cn
2011 CHINA HPC TOP10

| Rank | System | Year | Cores | Linpack (Gflops) | Source | Peak (Gflops) | Efficiency |
|---|---|---|---|---|---|---|---|
| 1 | Tianhe-1A/7168x2 Intel Hexa Core Xeon X5670 2.93GHz + 7168 Nvidia Tesla M2050@1.15GHz + 2048 Hex Core FT-1000@1GHz/80Gbps | 2010 | 202752 | 2566000.00 | Q | 4701000.00 | 0.546 |
| 2 | Sunway BlueLight/8575x16 Core SW1600@975MHz/QDR Infiniband | 2011 | 137200 | 795900.00 | Q | 1070160.00 | 0.744 |
| 3 | Tianhe-1A-HN/2048x2 Intel Hexa Core Xeon X5670 2.93GHz + 2048 Nvidia Tesla M2050@1.15GHz/80Gbps | 2011 | 53248 | 771700.00 | Q | 1343200.00 | 0.575 |
| 4 | Dawning TC3600 Blade/2560x (2 Intel Hexa Core X5650 + Nvidia Tesla C2050 GPU)/QDR Infiniband | 2011 | 52416 | 749200.00 | C | 1296320.26 | 0.578 |
| 5 | xseries x3650m3/Intel Xeon X56xx 2.53 GHz/Giga-E | 2011 | 113040 | 636985.00 | T12 | 1143965.00 | 0.557 |
| 6 | Mole-8.5 Cluster/320x2 Intel QC Xeon E5520 2.26 GHz + 320x6 Nvidia Tesla C2050/QDR Infiniband | 2010 | 33120 | 496500.00 | U | 1138440.00 | 0.436 |
| 7 | Dawning TC3600 Blade/3040 x 2 Intel Hexa Core X5650/QDR Infiniband | 2011 | 36480 | 342300.00 | C | 389168.64 | 0.880 |
| 8 | xseries x3650m3/Intel Xeon X56xx 2.93 GHz/Giga-E | 2011 | 36336 | 204754.40 | T4 | 425856.00 | 0.481 |
| 9 | xseries x3650m2 Cluster/Intel Xeon QC E55xx 2.53 GHz/Giga-E | 2011 | 34688 | 196228.00 | T4 | 351044.00 | 0.559 |
| 10 | 5000A/1920x4 AMD QC Barcelona 1.9GHz/DDR Infiniband/WCCS + Linux | 2008 | 30720 | 180600.00 | C | 233472.00 | 0.774 |
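The efficiency figures listed for the TOP10 (0.546, 0.744 and 0.575 for ranks 1 to 3) are consistent with Linpack performance divided by peak performance. The following is a minimal Python sketch of that check; the function name `linpack_efficiency` and the dictionary layout are illustrative choices, not something taken from the slides.

```python
def linpack_efficiency(linpack_gflops: float, peak_gflops: float) -> float:
    """Return sustained Linpack performance as a fraction of theoretical peak."""
    if peak_gflops <= 0:
        raise ValueError("peak_gflops must be positive")
    return linpack_gflops / peak_gflops


# (Linpack Gflops, peak Gflops) for ranks 1-3, copied from the TOP10 list.
top3 = {
    1: (2566000.00, 4701000.00),
    2: (795900.00, 1070160.00),
    3: (771700.00, 1343200.00),
}

# Round to three decimals, the precision used in the list.
efficiencies = {
    rank: round(linpack_efficiency(linpack, peak), 3)
    for rank, (linpack, peak) in top3.items()
}

for rank, eff in efficiencies.items():
    print(f"rank {rank}: efficiency {eff:.3f}")
```

The same check can be run on any other row of the list by adding its Linpack and peak values to the dictionary.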
2011 CHINA HPC TOP100 NO.11-NO.20

| Rank | System | Year | Cores | Linpack (Gflops) | Source | Peak (Gflops) | Efficiency |
|---|---|---|---|---|---|---|---|
| 11 | xseries x3650m3/Intel Xeon X56xx 2.53 GHz/Giga-E | 2011 | 29800 | 168375.00 | T3 | 302385.00 | 0.557 |
| 12 | xseries x3630m3/Intel Xeon X5620 2.4GHz/Giga-E | 2011 | 41436 | 159114.24 | C | 397785.60 | 0.400 |
| 13 | xseries x3650m3/Intel Xeon X56xx 2.93 GHz/Giga-E | 2011 | 27576 | 155391.60 | T3 | 323190.00 | 0.481 |
| 14 | 4000H/1190x2 Six Core Intel Xeon X5675/QDR Infiniband | 2011 | 14280 | 145600.00 | C | 167362.00 | 0.870 |
| 15 | BladeCenter HS22 Cluster/Intel Xeon QC E5xxx 2.53GHz/Giga-E | 2011 | 24864 | 140655.60 | T3 | 251623.80 | 0.559 |
| 16 | BladeCenter HS22 Cluster/Intel Xeon QC GT 2.53 GHz/Giga-E | 2009 | 21504 | 124120.00 | T4 | 217640.00 | 0.570 |
| 17 | xseries x3650m2 Cluster/Intel Xeon QC E55xx 2.53 GHz/Giga-E | 2010 | 21888 | 123820.40 | T3 | 217620.60 | 0.569 |
| 18 | BladeCenter HS22 Cluster/Intel Xeon QC GT 2.53 GHz/Giga-E | 2009 | 21504 | 123818.10 | T3 | 217620.00 | 0.569 |
| 19 | xseries x3650m2 Cluster/Intel Xeon QC E55xx 2.53 GHz/Giga-E | 2010 | 21456 | 121376.40 | T3 | 217134.60 | 0.559 |
| 20 | 7000/1240x2 Intel Xeon QC E5450 3.0GHz/140x4 Intel Xeon QC X7350 2.93GHz Infiniband 4xDDR | 2008 | 12160 | 102800.00 | C | 145293.00 | 0.708 |
2011 CHINA HPC TOP100 NO.21-NO.30

| Rank | System | Year | Cores | Linpack (Gflops) | Source | Peak (Gflops) | Efficiency |
|---|---|---|---|---|---|---|---|
| 21 | BladeCenter HS22 Cluster/Intel Xeon QC GT 2.66GHz/Giga-E | 2011 | 18048 | 102097.40 | T2 | 192463.80 | 0.530 |
| 22 | 5000/2640*Intel Xeon 5650 6-core 2.66GHz/Giga-E | 2011 | 15840 | 95140.00 | C2 | 168981.12 | 0.563 |
| 23 | xseries x3550m3/Intel Xeon X56xx 2.53GHz/Giga-E | 2011 | 16416 | 92504.00 | T2 | 166130.00 | 0.557 |
| 24 | -10000/768x2 Six Core Intel Xeon X5670 2.93GHz/QDR Infiniband | 2011 | 9216 | 92420.00 | C | 107300.00 | 0.861 |
| 25 | 4000A/700x2 Six Core Intel Xeon X5675/QDR Infiniband | 2011 | 8400 | 90850.00 | C | 102816.00 | 0.884 |
| 26 | 5000/4096*Xeon 5620 4-core 2.4GHz/Giga-E | 2011 | 16384 | 89300.00 | C2 | 157286.00 | 0.568 |
| 27 | BladeCenter HS22 Cluster/Intel Xeon QC GT 2.53 GHz/Giga-E | 2010 | 15504 | 87706.00 | T2 | 156900.40 | 0.559 |
| 28 | 5000/4096*AMD Opteron 2379 4-core 2.4GHz/Giga-E | 2011 | 16384 | 87380.00 | C2 | 157286.40 | 0.556 |
| 29 | 5000/2000*AMD Opteron 6136 8-core 2.4GHz/Giga-E | 2011 | 16000 | 87100.00 | C2 | 153600.00 | 0.567 |
| 30 | 5000/4400*Xeon 5520 4-core 2.26GHz/Giga-E | 2011 | 17600 | 86620.00 | C2 | 159104.00 | 0.544 |
2011 CHINA HPC TOP100 NO.31-NO.40

| Rank | System | Year | Cores | Linpack (Gflops) | Source | Peak (Gflops) | Efficiency |
|---|---|---|---|---|---|---|---|
| 31 | 5000/4400*Xeon 5606 4-core 2.13GHz/Giga-E | 2011 | 17600 | 84980.00 | C2 | 149952.00 | 0.567 |
| 32 | BladeCenter HS22 Cluster/Intel Xeon QC GT 2.66GHz/Giga-E | 2011 | 14208 | 80374.60 | T2 | 151514.20 | 0.530 |
| 33 | Dawning TC3600 Blade/220x (2 Intel Hexa Core X5650 + 1 Nvidia Tesla C2050)/QDR Infiniband | 2010 | 5720 | 76350.38 | C | 141389.60 | 0.540 |
| 34 | xseries x3650m2 Cluster/Intel Xeon QC E55xx 2.53 GHz/Giga-E | 2010 | 12800 | 73880.00 | T2 | 129540.00 | 0.570 |
| 35 | BladeCenter HS22 Cluster/Intel Xeon QC E5XXX 2.53 GHz/Giga-E | 2011 | 12324 | 69720.00 | T | 124720.00 | 0.559 |
| 36 | xseries x3650m3/Intel Xeon X56xx 2.53 GHz/Giga-E | 2011 | 11604 | 65390.00 | T | 117430.00 | 0.557 |
| 37 | 5000/TC3600 Blade/1024*Intel Xeon 6Core X5650 2.66GHz/QDR Infiniband | 2011 | 6144 | 57840.00 | C | 65544.19 | 0.882 |
| 38 | HP Cluster Platform 4000 BL685c G7/AMD Opteron 12 Core 2.1GHz/Giga-E | 2011 | 11292 | 56410.00 | T | 108400.00 | 0.520 |
| 39 | Dawning TC3600 Blade/Intel Hexa Core X5650 + Nvidia Tesla C2050 GPU/QDR Infiniband | 2010 | 4160 | 55527.25 | C | 102828.80 | 0.540 |
| 40 | TC3600 Blade/320*Intel Xeon X5650 + 160*Nvidia Tesla C2050 GPU/QDR Infiniband | 2011 | 4160 | 55527.25 | C | 102828.80 | 0.540 |
2011 CHINA HPC TOP100 NO.41-NO.50

| Rank | System | Year | Cores | Linpack (Gflops) | Source | Peak (Gflops) | Efficiency |
|---|---|---|---|---|---|---|---|
| 41 | 5000/TC3600 Blade/960*AMD Opteron 6132 8-core 2.2GHz/Infiniband | 2011 | 7680 | 53090.00 | C | 67584.00 | 0.786 |
| 42 | 5000/1250*AMD Opteron 6136 8-core 2.4GHz/Giga-E | 2011 | 10000 | 52030.00 | C | 96000.00 | 0.542 |
| 43 | HP Cluster Platform 3000 BL460c G7/Intel Xeon E5620 2.4GHz/Giga-E | 2011 | 11292 | 51420.00 | T | 98990.00 | 0.519 |
| 44 | TS10000/850x2 Intel Xeon Hexa Core X5650/Giga-E | 2011 | 10200 | 51340.00 | C | 108528.00 | 0.473 |
| 45 | xseries x3650m2 Cluster/Intel Xeon QC E55xx 2.53 GHz/Giga-E | 2010 | 8960 | 51203.30 | T | 90675.20 | 0.565 |
| 46 | 5000/1200*Xeon 5675 6-core 3.06GHz/Giga-E | 2011 | 7200 | 49510.00 | C | 88329.60 | 0.561 |
| 47 | HP Cluster Platform 3000 BL460c G6/Intel Xeon E5530 2.4GHz/Giga-E | 2010 | 9520 | 47890.00 | T | 91390.00 | 0.524 |
| 48 | HP Cluster Platform 3000 BL460c G7/Intel Xeon X5650 2.66GHz/Giga-E | 2011 | 8556 | 47870.00 | T | 91040.00 | 0.526 |
| 49 | HP Cluster Platform 4000 BL685c G7/AMD Opteron 12 Core 2.1GHz/Giga-E | 2011 | 10800 | 47300.00 | T | 90720.00 | 0.521 |
| 50 | TC3600 Blade/260*Intel Xeon X5650 + 130*Nvidia Tesla C2050 GPU/QDR Infiniband | 2011 | 3380 | 46960.00 | C | 83592.08 | 0.562 |
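TOP100 (1)
- Total Linpack performance of the TOP100 reaches 12 PFlops (2010: 6.3 PFlops), a growth factor of 1.90 (2010: 2.86).
- June 2011 TOP500: K Computer takes first place.
- November 2011: K Computer reaches a Linpack performance of 10.51 Petaflops at 93% efficiency (peak 11.28 Petaflops, run time 29h28m), ahead of Tianhe-1A.
- 2012: Titan (Jaguar with Kepler GPUs), Sequoia (BlueGene/Q), Mira (BlueGene/Q), Stampede (Dell + MIC).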
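The growth factor of 1.90 and the 93% efficiency quoted for K Computer both follow from simple ratios of the figures on this slide. A short Python sketch of the arithmetic, assuming the rounded values as printed (12 and 6.3 PFlops; 10.51 and 11.28 Petaflops); the helper names are hypothetical.

```python
def growth_factor(current: float, previous: float) -> float:
    """Ratio of this year's value to last year's value."""
    if previous <= 0:
        raise ValueError("previous must be positive")
    return current / previous


def efficiency_percent(rmax: float, rpeak: float) -> float:
    """Linpack performance as a percentage of peak performance."""
    if rpeak <= 0:
        raise ValueError("rpeak must be positive")
    return 100.0 * rmax / rpeak


# Total TOP100 Linpack performance in PFlops: 2011 versus 2010.
total_growth = growth_factor(12.0, 6.3)

# K Computer, November 2011: Linpack and peak in Petaflops.
k_eff = efficiency_percent(10.51, 11.28)

print(f"TOP100 total growth 2010 -> 2011: {total_growth:.2f}x")
print(f"K Computer Linpack efficiency: {k_eff:.1f}%")
```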
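TOP100 (2)
- No. 1, Tianhe-1A: Linpack 2.57 PFlops, peak 4.7 PFlops.
- No. 2, Sunway BlueLight: Linpack 795.9 TFlops, peak 1.07 PFlops.
- No. 3, Tianhe-1A-HN: Linpack 771.7 TFlops, peak 1.34 PFlops.
- Sunway BlueLight is built entirely on domestically designed CPUs!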
TOP100(2) Linpack22.1Tflops 2010 9.6TFlops 2.3,1.41 25.6TFlops 2010 11TFlops 2.331.36 CPU+GPU MPP 97 2010 98 104CPU+GPU 13 CPU+GPU
Tianhe-1A
- Peak performance: 4700 TFlops
- LINPACK performance: 2566 TFlops
- 23552 processors: 14336 Intel X5670 CPUs, 2048 FT-1000 CPUs, 7168 Nvidia M2050 GPUs
- Memory: 262 TB
- Storage: 2 PB
- Power: 4.04 MW
- 140
- 700
- 160
- 10 35
- 1090 14
! 1200 3200
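Sunway BlueLight overview
- Sunway BlueLight MPP
- Supported by the 863 Program of the Ministry of Science and Technology
- Built by the National Research Center of Parallel Computer Engineering and Technology
- Installed at the National Supercomputing Center in Jinan in September 2011
- Uses only self-designed and self-manufactured CPUs (ShenWei processor SW1600)
- 8704 CPUs in total
- Peak performance: 1.07016 PFlops
- Sustained performance: 795.9 TFlops
- Linpack efficiency: 74.37%
- Total power consumption: 1074 kW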
- SW1600 CPU: 16 cores, 975~1100 MHz, 124.8~140.8 Gflops
- Interconnect: QDR 4x10Gbps, MPI latency 2 us
- Software: SWCC/C++/Fortran/UPC/MPICC
- Storage: 2 PB, I/O 200 GB/s, IOR (~60 GB/s)

[Figure: system deployment diagram. Labels: remote users, local users, National Grid, cloud services, firewall, VPN, login nodes, job management nodes, IO nodes, online storage, nearline storage, offline storage, data center, Blue Light computer system, system management.]
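The 124.8~140.8 Gflops range quoted for the SW1600 matches cores times clock times floating-point operations per cycle. The sketch below assumes 8 flops per cycle per core; that value is not stated on the slide and is chosen only because it reproduces the quoted figures. The 8575-CPU count is the one given for this system in the TOP10 list.

```python
def peak_gflops(cores: int, clock_ghz: float, flops_per_cycle: int) -> float:
    """Theoretical peak of one processor: cores x clock (GHz) x flops per cycle."""
    return cores * clock_ghz * flops_per_cycle


# Assumption: 8 double-precision flops per cycle per core (not stated in the slides).
FLOPS_PER_CYCLE = 8

low = peak_gflops(16, 0.975, FLOPS_PER_CYCLE)   # SW1600 at 975 MHz
high = peak_gflops(16, 1.100, FLOPS_PER_CYCLE)  # SW1600 at 1100 MHz

# System peak for the 8575 CPUs at 975 MHz listed in the TOP10 entry.
system_peak = 8575 * low

print(f"per-CPU peak: {low:.1f} to {high:.1f} Gflops")
print(f"system peak: {system_peak:.2f} Gflops")
```

Under this assumption the system peak comes out at the 1070160.00 Gflops listed in the TOP10 entry.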
- 1024 CPU/[…]
- 741.06 MFlops/W
- SWCC/UPC/Fortran
- Online migration

[Figure: virtualization stack. Labels: APP, CNOS, vm0 to vm3, SW VMM, cn.]
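The 741.06 MFlops/W figure is consistent with dividing the sustained Linpack performance (795.9 TFlops) by the total power consumption (1074 kW) given in the system overview. A minimal Python sketch of the unit conversion; the function name is an illustrative choice.

```python
def mflops_per_watt(linpack_tflops: float, power_kw: float) -> float:
    """Energy efficiency in MFlops/W from TFlops and kW."""
    if power_kw <= 0:
        raise ValueError("power_kw must be positive")
    mflops = linpack_tflops * 1e6  # 1 TFlops = 1e6 MFlops
    watts = power_kw * 1e3         # 1 kW = 1e3 W
    return mflops / watts


# Sunway BlueLight: sustained 795.9 TFlops at 1074 kW.
bluelight = mflops_per_watt(795.9, 1074.0)
print(f"{bluelight:.2f} MFlops/W")
```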
- October 29, 2011: "China unveils supercomputer 'Sunway BlueLight MPP' based on its own chips"
- November 1, 2011, HPC Wire: "China's Indigenous Supercomputing Strategy Bears First Fruit"
- November 1, 2011: "China's Sunway BlueLight MPP Supercomputer Skyrockets On Most Powerful List"
- November 1, 2011, FierceCIO:TechWatch: "Chinese supercomputer Sunway BlueLight MPP eschews Intel, AMD for homegrown chips"
- November 2, 2011, The Wall Street Journal: […]
CLUSTER SHARING TRENDS OF CHINA HPC TOP100 (2002-2011)
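TOP100 (1)

| Vendor | Systems | Share | Rmax [TF/s] | Rpeak [TF/s] | Efficiency | Cores |
|---|---|---|---|---|---|---|
| […] | 35 | 35% | 2848.18 | 4544.56 | 61.40% | 363864 |
| […] | 7 | 7% | 306.93 | 535.39 | 60.50% | 55748 |
| […] | 5 | 5% | 1087.80 | 1404.71 | 84.34% | 165512 |
| […] | 2 | 2% | 3337.70 | 6044.20 | 56.00% | 256000 |
| […] | 1 | 1% | 496.50 | 1138.44 | 43.60% | 33120 |
| […] | 1 | 1% | 102.80 | 145.29 | 70.80% | 12160 |
| Subtotal | 51 | 51% | 8204.11 | 13812.59 | 62.90% | 886404 |
| IBM | 35 | 35% | 3264.31 | 6020.59 | 57.60% | 588524 |
| HP | 13 | 13% | 509.51 | 927.77 | 57.60% | 98056 |
| Dell | 1 | 1% | 23.40 | 44.93 | 72.43% | 6880 |
| Subtotal | 49 | 49% | 3797.22 | 6993.28 | 57.50% | 690900 |
| Total | 100 | 100% | 12001.33 | 20805.87 | 59.63% | 1577304 |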
TOP100
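TOP100 (2)
- 49% (2010: 51%) […] 6 […] TOP100 […] 43 […] HP, DELL […] TOP100! […] 9 […] HP […]
- Share of total Linpack performance: Chinese vendors 68.36% (2010: 81.08%); foreign vendors 31.64% (2010: 18.92%).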
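The 68.36% and 31.64% Linpack shares can be reproduced from the Rmax subtotals in the vendor table: 8204.11 TF/s for the 51-system group and 3797.22 TF/s for IBM, HP and Dell, against a total of 12001.33 TF/s. A small Python sketch, with variable names chosen for illustration only:

```python
def share_percent(part: float, total: float) -> float:
    """Return part as a percentage of total."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * part / total


# Rmax subtotals in TF/s, copied from the vendor table.
total_rmax = 12001.33
first_group_rmax = 8204.11    # 51 systems
ibm_hp_dell_rmax = 3797.22    # 49 systems (IBM, HP, Dell)

first_share = share_percent(first_group_rmax, total_rmax)
second_share = share_percent(ibm_hp_dell_rmax, total_rmax)

print(f"51-system group: {first_share:.2f}%")
print(f"IBM + HP + Dell: {second_share:.2f}%")
```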
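TOP100 (1)

| Category | Systems | Share | Rmax [TF/s] | Rpeak [TF/s] | Efficiency | Cores |
|---|---|---|---|---|---|---|
| […] | 21 | 21% | 2133.82 | 3963.18 | 53.30% | 404568 |
| […] | 16 | 16% | 763.91 | 1450.00 | 52.00% | 155648 |
| […] | 9 | 9% | 293.01 | 424.04 | 76.30% | 30740 |
| […] | 8 | 8% | 5333.40 | 8892.26 | 66.84% | 502616 |
| […] | 7 | 7% | 474.31 | 923.01 | 53.20% | 88192 |
| […] | 6 | 6% | 541.98 | 1026.46 | 54.10% | 95720 |
| […] | 5 | 5% | 742.70 | 1455.37 | 67.70% | 56300 |
| […] | 5 | 5% | 388.62 | 682.08 | 57.00% | 68648 |
| […] | 5 | 5% | 202.46 | 236.82 | 85.20% | 22064 |
| […] | 4 | 4% | 112.02 | 208.98 | 59.30% | 13852 |
| […] | 3 | 3% | 436.35 | 571.11 | 63.60% | 44300 |
| […] | 2 | 2% | 213.88 | 383.26 | 55.80% | 37872 |
| […] | 2 | 2% | 81.87 | 118.27 | 67.70% | 13440 |
| […] | 2 | 2% | 79.20 | 150.37 | 53.50% | 15352 |
| […] | 2 | 2% | 78.93 | 147.76 | 53.00% | 8480 |
| […] | 1 | 1% | 46.38 | 81.79 | 56.70% | 9600 |
| […] | 1 | 1% | 31.03 | 58.40 | 53.10% | 5840 |
| […] | 1 | 1% | 23.27 | 32.69 | 71.20% | 3072 |
| Total | 100 | 100% | 12001.33 | 20805.87 | 59.63% | 1577304 |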
TOP100 (2)
- Top five by number of systems: 21%, 16%, 9%, 8%, 7%.
- Top five by share of Linpack performance: 44.64%, 17.78%, 6.37%, 6.19%, 4.52%.
[Figure: TOP100 performance trend (1993-2011). Vertical axis: GFlops, logarithmic, 1 to 1E+13. Horizontal axis: year, 1993 to 2025.]
1 1993 1993-2011 199319963 199610001999I 3 199920012 200130002005 4000A 6800 21 4 2004 20073 200820103 2011 2 3
2 TOP100 100 TFlops 2007-2008 2008 10; Linpack 2008-2009 Petaflops 2008 10 Petaflops 2010-2011; 10 Petaflops 2012-2013 Linpack 2011-2012 10 Petaflops 2011 10 100 Petaflops 2014-2015 Linpack 2013-2014 100 Petaflops
THANKS! Q&A
SAMSS, CCF
HPC CHINA 2012, October 23-27, 2012
http://www.samss.org.cn