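Chapter 12

12.1

Suppose computer X needs time $T_x$ and computer Y needs time $T_y$ for the same task, with $T_x > T_y$, so that Y is the faster machine. The performance $P$ of each computer is the reciprocal of its execution time:

$$P_x = 1/T_x, \qquad P_y = 1/T_y \tag{12-1}$$

$$P_y > P_x \tag{12-2}$$

CPU = #

12.2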
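Figure 12-1a shows a CPU that processes each instruction in five steps: IF, ID, EXE, MEM and WB. Figure 12-1b shows the timing of this CPU. The time the CPU needs for one instruction, $t_{inst}$, is the sum of the step times:

$$t_{inst} = t_{IF} + t_{ID} + t_{EXE} + t_{MEM} + t_{WB} \tag{12-3}$$

The clock period $T_1$ of this CPU must cover the whole instruction:

$$T_1 \ge t_{inst} \tag{12-4}$$

so $n$ instructions take $nT_1$. In Figure 12-2a the CPU is divided into the separate stages IF, ID, EXE, MEM and WB.

Figure 12-1: a) b)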
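As a rough illustration of Equations (12-3) and (12-4), the sketch below adds up five stage delays and derives the highest clock frequency a CPU without pipelining could use if its clock period has to cover the whole instruction. The delay values and the function names are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
# Hypothetical stage delays in nanoseconds (assumed values, not from the text).
stage_delay_ns = {"IF": 2.0, "ID": 1.0, "EXE": 3.0, "MEM": 3.0, "WB": 1.0}


def instruction_time(delays):
    """Equation (12-3): t_inst is the sum of the five stage delays."""
    return sum(delays.values())


def max_clock_mhz(period_ns):
    """Highest clock frequency in MHz for a clock period given in ns."""
    return 1000.0 / period_ns


t_inst = instruction_time(stage_delay_ns)  # 10.0 ns
f_max = max_clock_mhz(t_inst)              # 100.0 MHz when T_1 = t_inst
print(f"t_inst = {t_inst} ns, f_max = {f_max} MHz")
```

With these assumed delays the instruction time is 10 ns, so the clock of the unpipelined CPU cannot run faster than 100 MHz.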
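Figure 12-2: a) b)

The stages in Figure 12-2 are driven by the clock φ. As Figure 12-2b shows, this CPU uses a clock period $T$ instead of $T_1$, and $T$ is set by the slowest stage:

$$T = \max\{t_{IF},\, t_{ID},\, t_{EXE},\, t_{MEM},\, t_{WB}\} \tag{12-5}$$

An instruction passes through all five stages, so it needs $5T$, which is at least $t_{inst}$.

Figure 12-3 shows successive clock cycles CC1, CC2, CC3, CC4, ..., during which an instruction moves through IF, ID, EXE, MEM and so on.

Figure 12-3

Figure 12-4 uses an ADD instruction as an example. Now let three instructions, Inst1, Inst2 and Inst3, enter IF one after another, as in Figure 12-5:

CC1: Inst1 is in IF.
CC2: Inst1 is in ID; Inst2 is in IF.
CC3: Inst1 is in EXE; Inst2 is in ID; Inst3 is in IF.
CC5: Inst1 is in its last stage.
CC6: Inst1 has completed; Inst6 enters.

Figure 12-4: Inst1, Inst2

Figure 12-5

In Figure 12-5, Inst1 completes after $5T$, Inst2 after $6T$ and Inst3 after $7T$. After the first $5T$, one instruction completes in every clock period.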
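The cycle-by-cycle occupancy described for Figure 12-5 can be reproduced with a short sketch. It assumes an ideal five-stage pipeline with no stalls, in which instruction i enters IF in clock cycle i; the function names are illustrative only.

```python
STAGES = ["IF", "ID", "EXE", "MEM", "WB"]


def pipeline_schedule(num_instructions, stages=STAGES):
    """Return {cycle: {instruction: stage}} for an ideal pipeline.

    Instruction i (1-based) enters the first stage in cycle i and
    advances by one stage per clock cycle.
    """
    schedule = {}
    total_cycles = num_instructions + len(stages) - 1
    for cycle in range(1, total_cycles + 1):
        active = {}
        for inst in range(1, num_instructions + 1):
            stage_index = cycle - inst
            if 0 <= stage_index < len(stages):
                active[f"Inst{inst}"] = stages[stage_index]
        schedule[cycle] = active
    return schedule


def completion_cycle(inst, stages=STAGES):
    """Clock cycle at whose end instruction `inst` leaves the last stage."""
    return inst + len(stages) - 1


schedule = pipeline_schedule(3)
for cycle, active in schedule.items():
    print(f"CC{cycle}: {active}")
```

For three instructions the sketch places Inst1 in EXE, Inst2 in ID and Inst3 in IF during CC3, and the instructions complete after 5, 6 and 7 clock periods.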
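Executing $N$ instructions one at a time, without overlap, takes

$$t_1 = N \cdot 5T = 5NT \tag{12-6}$$

With the pipeline the total time is

$$t_{pipe} = 5T + (N-1)T = (N+4)T \tag{12-7}$$

because the first instruction needs $5T$ and each of the remaining $N-1$ instructions completes one period $T$ later. For $N = 1000$:

$$t_1 = 5000T, \qquad t_{pipe} = 1004T \tag{12-8}$$

which is a speedup of almost 5.

In Figure 12-6a, EXE is split into EXE1 and EXE2, and MEM into MEM1 and MEM2. Each of the resulting stages takes a time $T_D$ with $T_D < T$. As Figure 12-6b shows, an instruction now passes through 7 stages of length $T_D$, and

$$7T_D < 5T \tag{12-9}$$

The clock period becomes $T = T_D$.

Figure 12-6: a) b)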
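Equations (12-6) to (12-8) are easy to check numerically. The sketch below expresses both times in units of the clock period T and assumes an ideal five-stage pipeline; the function names are made up for this illustration.

```python
def sequential_time(n, stages=5):
    """Equation (12-6): n instructions, each taking `stages` periods."""
    return stages * n


def pipelined_time(n, stages=5):
    """Equation (12-7): the first instruction takes `stages` periods,
    each of the remaining n - 1 instructions adds one period."""
    return stages + (n - 1)


n = 1000
t1 = sequential_time(n)     # in units of T
t_pipe = pipelined_time(n)  # in units of T
speedup = t1 / t_pipe
print(f"t1 = {t1}T, t_pipe = {t_pipe}T, speedup = {speedup:.2f}")
```

For large N the speedup approaches the number of stages, which is why the result for N = 1000 is just under 5.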
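12.2.1

A problem arises when Inst2 needs a result that Inst1 has not yet delivered through its MEM and WB stages. Consider the pair

    add R1, R8, R9    # R8 + R9 -> R1
    sub R2, R1, R6    # R1 - R6 -> R2

ADD adds R8 and R9 and places the sum in R1. SUB subtracts R6 from R1 and places the result in R2, so SUB depends on ADD.

Figure 12-7 follows the two instructions through the pipeline. In Figure 12-7a, ADD is in EXE. In Figure 12-7b, SUB computes R1 - R6, but the new value of R1 has not yet been written in WB. In Figure 12-7c, SUB has therefore used the old contents of R1.

Figure 12-8 shows the consequence. In Figure 12-8a, ADD and SUB overlap. In Figure 12-8b, SUB runs before ADD has finished, and the result is R2 = R1 - R6 = 12 - 4 = 8. If SUB waits until ADD has completed WB, the result is R2 = R8 + R9 - R6 = 30 - 4 = 26.

Figure 12-7: a) b) SUB c) ADD WB

Figure 12-8: SUB uses R1 before ADD has written it

In Figure 12-9, NOP instructions are inserted between ADD and SUB.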
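The two outcomes for R2 can be reproduced with a very small model of the ADD and SUB pair. This is only a sketch: it does not simulate pipeline stages, it just lets SUB read R1 either before or after ADD's write-back. R1 = 12 and R6 = 4 follow the text; R8 = 10 and R9 = 20 are assumed values, since the text only gives their sum, 30.

```python
def run_add_sub(regs, sub_waits_for_writeback):
    """Model `add R1, R8, R9` followed by `sub R2, R1, R6`.

    If sub_waits_for_writeback is False, SUB reads R1 before ADD has
    written it back (the hazard). If True, SUB reads R1 after ADD's WB.
    """
    regs = dict(regs)                      # work on a copy
    add_result = regs["R8"] + regs["R9"]   # ADD computes R8 + R9
    r1_read_by_sub = add_result if sub_waits_for_writeback else regs["R1"]
    regs["R1"] = add_result                # ADD write-back
    regs["R2"] = r1_read_by_sub - regs["R6"]
    return regs


# R8 and R9 are assumed; only R8 + R9 = 30 is given in the text.
initial = {"R1": 12, "R2": 0, "R6": 4, "R8": 10, "R9": 20}

hazard = run_add_sub(initial, sub_waits_for_writeback=False)
correct = run_add_sub(initial, sub_waits_for_writeback=True)
print(hazard["R2"], correct["R2"])
```

The first call gives the stale result 8 and the second the intended result 26, matching the two values worked out for Figure 12-8.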
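The NOPs delay SUB so that it reaches EXE only after ADD has completed WB. SUB then reads the value of R1 that ADD produced.

Figure 12-9: NOPs between ADD and SUB

12.2.2

NOP instructions also appear in Figure 12-10, which shows the stages IF, ID, EXE, MEM and WB.

Figure 12-10

12.3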
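A PC may have 64 MB of memory. If the memory capacity is $M$ and the cost per unit of capacity is $C$, the price of the memory is

$$\text{Price} = MC \tag{12-10}$$

Besides the cost $C$, the access time $t_{acc}$ matters. There are two kinds of RAM: SRAM and DRAM. An SRAM cell uses 4 to 6 transistors, and SRAM has a shorter $t_{acc}$ than DRAM. DRAM is supplied as SIMM and DIMM modules. Figure 12-11 shows the DRAM connected to the CPU.

A cache is built from SRAM, and its capacity is measured in KB. The cache sits between the CPU and the main memory, as shown in Figure 12-12.

Figure 12-11: CPU and DRAM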
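Figure 12-12: CPU, cache memory (fast), main memory

Figure 12-13 shows how the cache is used. When the CPU executes an lw or sw instruction, it looks in the cache first. Figure 12-13a shows the case in which the data is in the cache, and Figure 12-13b the case in which it is not. The hit rate of the cache is

$$\text{hit rate} = \frac{\text{number of hits}}{\text{number of accesses}} \times 100\% \tag{12-11}$$

Figure 12-13: a) CPU and cache b) CPU and cache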
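Assuming Equation (12-11) is the usual hit-rate definition (hits divided by total accesses, times 100%), it can be evaluated as below. The access counts are hypothetical, and the function name is chosen only for this sketch.

```python
def hit_rate_percent(hits, accesses):
    """Cache hit rate in percent: hits / accesses * 100."""
    if accesses <= 0:
        raise ValueError("accesses must be positive")
    if not 0 <= hits <= accesses:
        raise ValueError("hits must lie between 0 and accesses")
    return 100.0 * hits / accesses


# Hypothetical counts: 950 of 1000 lw/sw accesses are found in the cache.
rate = hit_rate_percent(950, 1000)
print(f"hit rate = {rate}%")
```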
Figures 12-14a and 12-14b show the CPU and the cache.

Figure 12-15 shows two cache levels: a level-1 cache (L1 Cache) and a level-2 cache (L2 Cache).

Figure 12-14: a) CPU and cache b) CPU and cache

Figure 12-15: L1 Cache and L2 Cache
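In VLSI, the L1 cache is built from SRAM cells that are selected by a word line (WL). Figure 12-16 shows a CMOS SRAM cell, which uses 6 MOS transistors per bit. Storing 8 bits therefore takes 6 × 8 = 48 transistors, and a 32-bit word takes 4 × 48 = 192 MOS transistors. A 1024 × 32 cache (1K words) needs 1024 × 192 = 196 608 transistors.

Figure 12-16: CMOS SRAM cell

Figure 12-17 shows the Intel Pentium Pro CPU in its PGA package.

Figure 12-17: Intel Pentium Pro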
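The transistor counts quoted for the SRAM array follow from one multiplication. The sketch below counts only the six-transistor storage cells and ignores decoders, sense amplifiers and other support circuitry; the function name is illustrative.

```python
TRANSISTORS_PER_CELL = 6  # six-transistor CMOS SRAM cell


def sram_cell_transistors(words, bits_per_word, per_cell=TRANSISTORS_PER_CELL):
    """Transistors in the storage cells of a words x bits_per_word array."""
    return words * bits_per_word * per_cell


byte_cells = sram_cell_transistors(1, 8)       # one 8-bit byte
word_cells = sram_cell_transistors(1, 32)      # one 32-bit word
array_cells = sram_cell_transistors(1024, 32)  # 1K words of 32 bits
print(byte_cells, word_cells, array_cells)
```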
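The CPU works with both an L1 Cache and an L2 Cache.

12.4

Figure 12-18 shows two pipelines, 1 and 2, which share the cache and each have their own MEM and WB stages. Figure 12-19 shows the PC and the cache. See also Figure 12-20.

Figure 12-18: pipelines 2 and 1

Figure 12-19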
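Figure 12-21 shows the Intel Pentium, which has two pipelines, U and V. Its labels include WB and the floating-point stages FP1, FP2, Format and Errors. The clock frequencies are 60 MHz and 66 MHz. The 486 and CMOS are also mentioned.

Figure 12-20: Cache, pipelines 1 and 2

Figure 12-21: Intel Pentium

12.5

1. IF
2. ID
3. EXE
4. STO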
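Let $T_a$ be the time for A and $T_b$ the time for B (Figure 12-22), with $T_b < T_a$. Both follow the sequence IF-ID-EXE-STO.

Figure 12-22

A processing element is abbreviated PE (Figures 12-23 and 12-24).

Figure 12-23: b) c) CPU

Figure 12-24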
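12.5.1

Machines are classified by the number of instruction streams, N, and the number of data streams, S:

Class   N      S
SISD    1      1
SIMD    1      > 1
MISD    > 1    1
MIMD    > 1    > 1

SISD (Figure 12-25): a single instruction Inst X operates on a single data item D and produces $R_X(D)$.

SIMD (Figure 12-26): four processing elements PE0, PE1, PE2 and PE3 hold the data D0, D1, D2 and D3. All PEs execute the same instruction, for example ADD, and produce

$$R_X(D_0),\; R_X(D_1),\; R_X(D_2),\; R_X(D_3) \tag{12-12}$$

Figure 12-25: SISD

Figure 12-26: SIMD

MISD (Figure 12-27): one data item D is processed by 4 different instructions, Inst A, Inst B, Inst C and Inst D, giving

$$R_A(D),\; R_B(D),\; R_C(D),\; R_D(D) \tag{12-13}$$

Figure 12-27: MISD

MIMD (Figure 12-28): different instructions operate on the different data items D0, D1, D2 and D3:

$$R_A(D_0),\; R_B(D_1),\; R_C(D_2),\; R_D(D_3) \tag{12-14}$$

Figure 12-28: MIMD

12.5.2

Vector and matrix operations are common, for example in DSP. Consider a vector [A] with 4 elements,

$$[A] = [A_1 \;\; A_2 \;\; A_3 \;\; A_4] \tag{12-15}$$

a vector [B] with 4 elements,

$$[B] = [B_1 \;\; B_2 \;\; B_3 \;\; B_4] \tag{12-16}$$

and a 4 × 4 matrix [C],

$$[C] = [C_{\alpha\beta}], \qquad \alpha, \beta = 1, 2, 3, 4 \tag{12-17}$$

where $C_{\alpha\beta}$ is the element in row α and column β. Multiplying the 4-element vector [A] by 2 gives

$$2[A] = [2A_1 \;\; 2A_2 \;\; 2A_3 \;\; 2A_4] \tag{12-18}$$

An SISD machine computes 2*[A] one element at a time: 2*A1, 2*A2, 2*A3, 2*A4.

Figure 12-29: SIMD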
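The difference between the SISD and SIMD ways of computing 2*[A] can be sketched by counting steps. The model below is deliberately crude: a "step" is one multiplication on an SISD machine, or one multiplication performed at the same time by up to four PEs on an SIMD machine. The vector values and function names are hypothetical.

```python
def scale_sisd(vector, factor):
    """One element per step, as on an SISD machine."""
    result, steps = [], 0
    for x in vector:
        result.append(factor * x)
        steps += 1
    return result, steps


def scale_simd(vector, factor, num_pe=4):
    """Up to num_pe elements per step, one element per PE."""
    result, steps = [], 0
    for start in range(0, len(vector), num_pe):
        lanes = vector[start:start + num_pe]
        # Modeled as a single parallel step: every PE multiplies its element.
        result.extend(factor * x for x in lanes)
        steps += 1
    return result, steps


A = [1, 2, 3, 4]  # hypothetical values for A1..A4
sisd_result, sisd_steps = scale_sisd(A, 2)
simd_result, simd_steps = scale_simd(A, 2)
print(sisd_result, sisd_steps, simd_steps)
```

Both versions return the same vector; only the step count differs, 4 for the SISD loop and 1 for the four-PE SIMD model.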
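An SIMD machine with 4 PEs computes 2*A1, 2*A2, 2*A3 and 2*A4 at the same time (Figure 12-29), whereas the SISD machine needs one step per element.

Next consider the product of [A] and [B] that yields a single number $P$:

$$P = \sum_{k=1}^{4} A_k B_k \tag{12-19}$$

An SISD machine first forms the products

$$A_k B_k, \qquad k = 1, 2, 3, 4 \tag{12-20}$$

and then adds them, which takes 4 multiplications and 3 additions, 7 operations in all.

In the SIMD arrangement of Figure 12-30, the PEs form the products $A_k B_k$ in parallel, and the products are then summed to give $P$.

Figure 12-30: SIMD

[Equation (12-21)]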
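The operation counts for the sum of products $A_k B_k$ can be compared with a small sketch. The SISD version performs 4 multiplications and 3 additions in sequence. The SIMD version assumes that all products are formed in one parallel step and that the partial sums are then added pairwise, halving the number of values in each further step; this pairwise scheme is an assumption made for the illustration and is not taken from Figure 12-30. Vector values and names are hypothetical.

```python
def dot_sisd(a, b):
    """Sequential sum of products; returns (value, operation count)."""
    products, ops = [], 0
    for x, y in zip(a, b):
        products.append(x * y)
        ops += 1                      # one multiplication
    total = products[0]
    for p in products[1:]:
        total += p
        ops += 1                      # one addition
    return total, ops


def dot_simd(a, b):
    """Parallel products, then pairwise additions; returns (value, steps)."""
    values = [x * y for x, y in zip(a, b)]  # one parallel multiply step
    steps = 1
    while len(values) > 1:
        paired = [values[i] + values[i + 1]
                  for i in range(0, len(values) - 1, 2)]
        if len(values) % 2:           # odd element carries over unchanged
            paired.append(values[-1])
        values = paired
        steps += 1                    # one parallel addition step
    return values[0], steps


A = [1, 2, 3, 4]  # hypothetical A1..A4
B = [5, 6, 7, 8]  # hypothetical B1..B4
p_sisd, ops_sisd = dot_sisd(A, B)
p_simd, steps_simd = dot_simd(A, B)
print(p_sisd, ops_sisd, p_simd, steps_simd)
```

Under these assumptions the four-element case needs 7 sequential operations on the SISD model and 3 parallel steps on the SIMD model, with the same numerical result.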
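The elements of the matrix [C] are

$$C_{ij} = A_i B_j, \qquad i, j = 1, 2, 3, 4 \tag{12-22}$$

An SISD machine needs 16 multiplications for this, that is, 16 steps. Figure 12-31 shows the SIMD approach.

Figure 12-31: SIMD

12.5.3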
Figures 12-32, 12-33 and 12-34 show arrangements of PEs.

Figure 12-32: PE

Figure 12-33: PE
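Figure 12-34

12.5.4

A NIC is used, as shown in Figure 12-35.

Figure 12-35

12.5.5

See Figures 12-36a and 12-36b.

Figure 12-36: a) b)

An n-dimensional cube has $2^n$ PEs, and each PE is connected to n other PEs. Figure 12-37a shows the case n = 2, and Figure 12-37b the case n = 3. Larger cubes with n > 3 are built in the same way. Figure 12-38 shows n = 4, with $2^4 = 16$ PEs, each connected to 4 other PEs.

Figure 12-37: a) n = 2 b) n = 3

Each PE has an n-bit address. For n = 3 the address has the 3 bits $n_2 n_1 n_0$, where each $n_i$ is 0 or 1. The PE with address 000, for example, is connected to the PEs 001, 010 and 100.

Figure 12-38: n = 4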
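The addressing rule illustrated by the 000 example, where connected PEs differ in exactly one address bit, can be written as a short sketch. It assumes the standard binary cube numbering; the function names are illustrative.

```python
def cube_neighbors(node, n):
    """Addresses connected to `node` in an n-dimensional cube.

    Each neighbor differs from `node` in exactly one of the n address bits.
    """
    if not 0 <= node < (1 << n):
        raise ValueError("node address out of range for this n")
    return [node ^ (1 << bit) for bit in range(n)]


def as_bits(node, n):
    """Format an address as an n-bit binary string, e.g. 5 -> '101'."""
    return format(node, f"0{n}b")


n = 3
neighbors_of_000 = [as_bits(x, n) for x in cube_neighbors(0b000, n)]
print(neighbors_of_000)

num_pe_4cube = 1 << 4                     # 2**4 PEs for n = 4
links_per_pe = len(cube_neighbors(0, 4))  # each PE has n neighbors
print(num_pe_4cube, links_per_pe)
```

Flipping one bit at a time gives exactly n neighbors per PE, which is why the n = 4 cube has 16 PEs with 4 connections each.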
12.5.6

12.5.7

LED; 10 6 (Figures 12-39a, 12-39b and 12-40)

Figure 12-39: a) b)

Figure 12-41 involves the clock φ.
See also Figure 12-42.

Figure 12-40

Figure 12-41

Figure 12-42

12.6

1. A: $T_A$ = 110 000; B: $T_B$ = 84 000.
2. Figure 12-1; 100 000 instructions.
   (a) f = 66 MHz
   (b) f = 100 MHz
   (c) f = 300 MHz
   (d) f = 333 MHz
   (e) f = 450 MHz
3.
4. Figure 12-1, with $t_{IF}$ = 1.6 ns, $t_{ID}$ = 1.4 ns, $t_{EXE}$ = 3.2 ns, $t_{MEM}$ = 3.7 ns, $t_{WB}$ = 1.8 ns.
   (a) $f_1$
   (b) Figure 12-2: $f_2$
   (c) $t_{inst}$
5. Figure 12-4; 300 MHz; 800 000 instructions.
   (a)
   (b)
   (c)
6. (a) Figure 12-1
   (b)
7. 200 MHz.
   (a) 5%
   (b)
8. L1, L2.
9. 0.
   (a)
   (b)
10. Figure 12-18; pipeline 1; IF; 88 ns; pipeline 2.
   (a) 225 000; $t_{pipe}$
   (b)
11. SIMD; [a], [x].
12. (a)
   (b)
   (c) [Z] = [X] + [Y]
13. (a) SIMD; [c] = [a] + [b]
   (b) As in (a); SIMD; [c]; k
   (c) SIMD; K; [c]
14. [u] = [u1 u2 u3]; 2[u], 3[u], 10[u], [u/2].
   (a)
   (b)
15. PE; MIMD.
16. n = 3; 0xx, 1xx.
17. n = 3; $n_2 n_1 n_0$.
   (a) 101
   (b) 100, 111
18. 2
19. [6]

12.7

[1] Don Anderson and Tom Shanley. Pentium™ Processor System Architecture, 2nd ed. Reading, MA: Addison-Wesley, 1995. (Intel Pentium)
[2] James M. Feldman and Charles T. Retter. Computer Architecture. New York: McGraw-Hill, 1994.
[3] David A. Patterson and John L. Hennessy. Computer Organization & Design, 2nd ed. San Francisco: Morgan Kaufmann Publishers, 1997. (RISC)
[4] Tom Shanley and Don Anderson. ISA System Architecture, 3rd ed. Reading, MA: Addison-Wesley, 1995. (PC)