CPU!! 00-11-8 Liping zhang, Tsinghua 1
Instruction sequence:

    ADD(r1, r, r)
    CMPLEC(r, 5, r0)
    MUL(r1, r, r)
    SUB(r1, r, r5)

ADD writes its destination register at the end of cycle t+3 (its WB stage), but CMP reads that register in cycle t+2 (its RF stage). CMP therefore reads a stale value and writes the wrong value into R0.

            t     t+1   t+2   t+3   t+4   t+5   t+6
    IF      ADD   CMP   MUL   SUB
    RF      NOP   ADD   CMP   MUL   SUB
    ALU     NOP   NOP   ADD   CMP   MUL   SUB
    WB      NOP   NOP   NOP   ADD   CMP   MUL   SUB

    R0: 1 1 1 1 1 1 0 (CMP writes the wrong value)
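The stale read in this diagram can be found mechanically. The following is a minimal Python sketch, not taken from the slides. It assumes the 4-stage timing shown in the diagram: an instruction reads its registers in RF, one cycle after IF, and writes its result at the end of WB, three cycles after IF. The names `Instr` and `find_raw_hazards` are hypothetical, and the register numbers r2 to r5 are placeholders chosen for illustration, not the slide's own operands.

```python
from typing import List, NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    op: str
    srcs: Tuple[str, ...]   # registers read in the RF stage
    dst: Optional[str]      # register written in the WB stage (None for NOP)


RF_OFFSET = 1   # instruction fetched in cycle i reads registers in cycle i + 1
WB_OFFSET = 3   # ... and writes its result at the end of cycle i + 3


def find_raw_hazards(program: List[Instr]) -> List[Tuple[int, int, str]]:
    """Return (producer index, consumer index, register) for every stale read."""
    hazards = []
    for i, producer in enumerate(program):
        if producer.dst is None:
            continue
        write_cycle = i + WB_OFFSET
        for j in range(i + 1, len(program)):
            read_cycle = j + RF_OFFSET
            if read_cycle > write_cycle:
                break  # this and every later instruction sees the new value
            if producer.dst in program[j].srcs:
                hazards.append((i, j, producer.dst))
    return hazards


# Placeholder operands (assumption): ADD produces r3, CMPLEC consumes r3.
program = [
    Instr("ADD", ("r1", "r2"), "r3"),
    Instr("CMPLEC", ("r3",), "r0"),
    Instr("MUL", ("r1", "r2"), "r4"),
    Instr("SUB", ("r1", "r2"), "r5"),
]
hazards = find_raw_hazards(program)
print(hazards)
```

With these placeholder operands the only hazard reported is between instruction 0 (ADD) and instruction 1 (CMPLEC). The sketch ignores the case where an intermediate instruction overwrites the same register.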
NOP insertion:

    ADD(r1, r, r)
    CMPLEC(r, 5, r0)
    MUL(r1, r, r)
    SUB(r1, r, r5)

[Pipeline diagram with a "Trick: RF" annotation: stages IF, RF, ALU, WB over cycles t to t+6. R0 holds 1 in every cycle shown, and CMP reads <R>.]
NOPs written into the program:

    ADD(r1, r, r)
    NOP()
    NOP()
    CMPLEC(r, 5, r0)
    MUL(r1, r, r)
    SUB(r1, r, r5)

With two NOPs after ADD, CMP reaches RF in cycle t+4, after ADD has completed WB in cycle t+3, so CMP reads the correct <R>.

            t     t+1   t+2   t+3   t+4   t+5   t+6
    IF      ADD   NOP   NOP   CMP   MUL   SUB
    RF      NOP   ADD   NOP   NOP   CMP   MUL   SUB
    ALU     NOP   NOP   ADD   NOP   NOP   CMP   MUL
    WB      NOP   NOP   NOP   ADD   NOP   NOP   CMP
    R0      1     1     1     1     1     1     1

Once CMP has written back, R0 = 0.
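The NOP placement can be derived by a simple rule: a consumer must be fetched at least three slots after the instruction that produces its operand. The sketch below is an illustration under that assumption (registers read in RF, written at the end of WB, no bypassing); `insert_nops` and the register numbers r2 to r5 are hypothetical and not taken from the slides.

```python
NOP = ("NOP", (), None)   # (opcode, source registers, destination register)


def insert_nops(program):
    """Pad the program with NOPs so no instruction reads a register too early.

    Each instruction is a tuple (op, srcs, dst). A result written by the
    instruction k slots earlier is visible in RF only when k >= 3, so a
    producer 1 slot back needs 2 NOPs and a producer 2 slots back needs 1.
    """
    out = []
    for op, srcs, dst in program:
        needed = 0
        for back in (1, 2):
            if len(out) >= back:
                prev_dst = out[-back][2]
                if prev_dst is not None and prev_dst in srcs:
                    needed = max(needed, 3 - back)
        out.extend([NOP] * needed)
        out.append((op, srcs, dst))
    return out


# Placeholder operands (assumption): ADD produces r3, CMPLEC consumes r3.
program = [
    ("ADD", ("r1", "r2"), "r3"),
    ("CMPLEC", ("r3",), "r0"),
    ("MUL", ("r1", "r2"), "r4"),
    ("SUB", ("r1", "r2"), "r5"),
]
padded_ops = [op for op, _, _ in insert_nops(program)]
print(padded_ops)
```

For this input the padded sequence has the same shape as the slide's listing: ADD, two NOPs, then CMPLEC, MUL, SUB.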
Reordering instead of inserting NOPs:

    ADD(r1, r, r)
    MUL(r1, r, r)
    SUB(r1, r, r5)
    CMPLEC(r, 5, r0)

MUL and SUB do not depend on the result of ADD, so they can run between ADD and CMP: move non-dependent code into the delay slots. This saves cycles!

            t     t+1   t+2   t+3   t+4   t+5   t+6
    IF      ADD   MUL   SUB   CMP
    RF      NOP   ADD   MUL   SUB   CMP
    ALU     NOP   NOP   ADD   MUL   SUB   CMP
    WB      NOP   NOP   NOP   ADD   MUL   SUB   CMP
    R0      1     1     1     1     1     1     1
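To compare the two orderings, one can compute the cycle in which each instruction is fetched when every operand has to be written back before it is read. This is a rough sketch under the same timing assumptions as the diagrams (RF one cycle after IF, result visible four cycles after IF); `issue_cycles` and the register numbers r2 to r5 are hypothetical, introduced only for illustration.

```python
def issue_cycles(program):
    """Return [(op, fetch cycle)] with stalls so that no stale value is read.

    program is a list of (op, srcs, dst). A result produced by an instruction
    fetched in cycle c is written at the end of cycle c + 3 and can be read
    in RF from cycle c + 4 on.
    """
    ready = {}        # register -> first cycle in which RF sees the new value
    next_fetch = 0    # earliest cycle the next instruction can enter IF
    schedule = []
    for op, srcs, dst in program:
        # RF happens one cycle after IF, so IF must be at or after ready - 1.
        fetch = max([next_fetch] + [ready.get(r, 0) - 1 for r in srcs])
        schedule.append((op, fetch))
        if dst is not None:
            ready[dst] = fetch + 4
        next_fetch = fetch + 1
    return schedule


# Placeholder operands (assumption): only CMPLEC depends on ADD.
original = [
    ("ADD", ("r1", "r2"), "r3"),
    ("CMPLEC", ("r3",), "r0"),
    ("MUL", ("r1", "r2"), "r4"),
    ("SUB", ("r1", "r2"), "r5"),
]
reordered = [original[0], original[2], original[3], original[1]]

last_original = issue_cycles(original)[-1][1]
last_reordered = issue_cycles(reordered)[-1][1]
print(last_original, last_reordered)
```

In this sketch the original order fetches its last instruction in cycle 5 (as in the NOP version), while the reordered sequence fetches its last instruction in cycle 3.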
Exercise:

    DIVC(r,,r)
    ADD(r1, r, r)
    CMPLEC(r, 5, r0)
    MUL(r, r, r)
    SUB(r1, r, r)

1a., 1b.: which values do the registers hold when this sequence runs on the pipeline as written?
Then: fix the sequence with NOPs, and finally obtain the correct final values without NOPs.

Initial values: R0 = 1.
Q#1a, Q#1b:

    DIVC(r,,r)
    ADD(r1, r, r)
    CMPLEC(r, 5, r0)
    MUL(r, r, r)
    SUB(r1, r, r)

Remember: registers are read in RF and written in WB.

            t     t+1   t+2   t+3   t+4   t+5   t+6   t+7   t+8
    IF      DIVC  ADD   CMP   MUL   SUB
    RF      NOP   DIVC  ADD   CMP   MUL   SUB
    ALU     NOP   NOP   DIVC  ADD   CMP   MUL   SUB
    WB      NOP   NOP   NOP   DIVC  ADD   CMP   MUL   SUB

Q#1b: CMP reads <R> in RF (cycle t+3) before ADD writes it in WB (cycle t+4), so the value CMP writes to R0 is computed from the old <R>.
Q#: NOP

    DIVC(r,,r)
    ADD(r1, r, r)
    NOP()
    NOP()
    CMPLEC(r, 5, r0)
    MUL(r, r, r)
    SUB(r1, r, r)

            t     t+1   t+2   t+3   t+4   t+5   t+6   t+7   t+8
    IF      DIVC  ADD   NOP   NOP   CMP   MUL   SUB
    RF      NOP   DIVC  ADD   NOP   NOP   CMP   MUL   SUB
    ALU     NOP   NOP   DIVC  ADD   NOP   NOP   CMP   MUL
    WB      NOP   NOP   NOP   DIVC  ADD   NOP   NOP   CMP
    R0      1     1     1     1     1     1     1     1     0
Q#: reordering?

    DIVC(r,,r)
    ADD(r1, r, r)
    MUL(r, r, r)
    SUB(r1, r, r)
    CMPLEC(r, 5, r0)

CMP is OK... but MUL is not: MUL reads its operands in RF in cycle t+3, the same cycle in which DIVC is still in WB.

            t     t+1   t+2   t+3   t+4   t+5   t+6   t+7   t+8
    IF      DIVC  ADD   MUL   SUB   CMP
    RF      NOP   DIVC  ADD   MUL   SUB   CMP
    ALU     NOP   NOP   DIVC  ADD   MUL   SUB   CMP
    WB      NOP   NOP   NOP   DIVC  ADD   MUL   SUB   CMP
    R0      1     1     1     1     1     1     1     1     0
A solution:

    DIVC(r,,r)
    ADD(r1, r, r)
    SUB(r1, r, r)
    MUL(r, r, r)
    CMPLEC(r, 5, r0)

            t     t+1   t+2   t+3   t+4   t+5   t+6   t+7   t+8
    IF      DIVC  ADD   SUB   MUL   CMP
    RF      NOP   DIVC  ADD   SUB   MUL   CMP
    ALU     NOP   NOP   DIVC  ADD   SUB   MUL   CMP
    WB      NOP   NOP   NOP   DIVC  ADD   SUB   MUL   CMP
    R0      1     1     1     1     1     1     1     1     0
The three versions side by side:

    Original            With NOPs           Reordered
    DIVC(r,,r)          DIVC(r,,r)          DIVC(r,,r)
    ADD(r1, r, r)       ADD(r1, r, r)       ADD(r1, r, r)
    CMPLEC(r, 5, r0)    NOP()               SUB(r1, r, r)
    MUL(r, r, r)        NOP()               MUL(r, r, r)
    SUB(r1, r, r)       CMPLEC(r, 5, r0)    CMPLEC(r, 5, r0)
                        MUL(r, r, r)
                        SUB(r1, r, r)
Branches:

    LOOP: CMPLEC(r, 100, r0)
          ADD(r1, r, r)
          SUB(r1, r, r)
          BNE(r0, LOOP)
          XOR(r1, r1, r)

Which instruction should be fetched in the cycle after BNE?

            i     i+1   i+2   i+3   i+4   i+5   i+6
    IF      CMP   ADD   SUB   BNE   ?
    RF            CMP   ADD   SUB   BNE
    ALU                 CMP   ADD   SUB   BNE
    WB                        CMP   ADD   SUB   BNE
[Datapath figure: PCSEL mux and PC in the IF stage, Instruction Memory, PC and IR pipeline registers in the RF stage, branch-offset adder using C<15:0>, Register File read ports with the RASEL mux, and the Z signal feeding PCSEL.]

The branch is decided while BNE is in RF, so XOR, the instruction after BNE, has already been fetched!

            i      i+1    i+2    i+3    i+4    i+5    i+6
    IF      CMP    ADD    SUB    BNE    XOR    CMP
    RF             CMP    ADD    SUB    BNE    XOR    CMP
    ALU                   CMP    ADD    SUB    BNE    XOR
    WB                           CMP    ADD    SUB    BNE
    PC IF   0x100  0x104  0x108  0x10C  0x110  0x100
    Z                                   0
    PCSEL                               1
NOP after the branch:

    LOOP: CMPLEC(r, 100, r0)
          ADD(r1, r, r)
          SUB(r1, r, r)
          BNE(r0, LOOP)
          NOP()
          XOR(r1, r1, r)

Trick: add NOPs up to and including the cycle where the correct PC is decided.

            i      i+1    i+2    i+3    i+4    i+5    i+6
    IF      CMP    ADD    SUB    BNE    NOP    CMP
    RF             CMP    ADD    SUB    BNE    NOP    CMP
    ALU                   CMP    ADD    SUB    BNE    NOP
    WB                           CMP    ADD    SUB    BNE
    PC IF   0x100  0x104  0x108  0x10C  0x110  0x100
    Z                                   0
    PCSEL                               1
Another example:

    LOOP: SUB(r1, r, r)
          MUL(r1, r, r)
          ADD(r1,r, r5)
          BRLT(r,r, LOOP)      (new instr.: branch if less-than)
          XOR(r1, r1, r)

Trick: add NOPs up to and including the cycle where the correct PC is decided.

            i      i+1    i+2    i+3    i+4    i+5    i+6
    IF      SUB    MUL    ADD    BRLT
    RF             SUB    MUL    ADD    BRLT
    ALU                   SUB    MUL    ADD    BRLT
    WB                           SUB    MUL    ADD    BRLT
    PC IF   0x100  0x104  0x108  0x10C  0x110
4-Stage β Pipeline

[Datapath figure. IF: PCSEL mux (inputs XAdr, ILL OP, JT, branch target), PC, Instruction Memory. RF: PC and IR pipeline registers, branch-offset adder using C<15:0>, Register File read ports selected by the ra, rb and rc fields through RASEL, Z and JT outputs, ASEL and BSEL muxes. ALU: PC, IR, A, B and D pipeline registers, ALU controlled by ALUFN; the ALU output also supplies the branch target to PCSEL. WB: PC, IR, Y and D pipeline registers, Data Memory (A, RD, WD, R/W), WDSEL and WASEL muxes, Register File write port (WA, WD, WERF).]
Another Example

    LOOP: SUB(r1, r, r)
          MUL(r1, r, r)
          ADD(r1,r, r5)
          BRLT(r,r, LOOP)      (new instr.: branch if less-than)
          NOP()
          NOP()
          XOR(r1, r1, r)

Trick: add NOPs up to and including the cycle where the correct target PC is decided.
The later we decide, the more branch delay slots.

            i      i+1    i+2    i+3    i+4    i+5    i+6
    IF      SUB    MUL    ADD    BRLT   NOP    NOP    SUB
    RF             SUB    MUL    ADD    BRLT   NOP    NOP
    ALU                   SUB    MUL    ADD    BRLT   NOP
    WB                           SUB    MUL    ADD    BRLT
    PC IF   0x100  0x104  0x108  0x10C  0x110  0x114  0x100
    ALU<0>                                     1
    PCSEL                                      5

New mux input on PCSEL: the target PC coming from the ALU.
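The relation between the deciding stage and the number of delay slots can be written down directly. The sketch below is illustrative only; `branch_delay_slots` and `fetch_trace` are hypothetical helpers, and it assumes that the branch is always taken and that every delay slot is filled with a NOP, as in the two diagrams.

```python
STAGES = ["IF", "RF", "ALU", "WB"]


def branch_delay_slots(resolve_stage: str) -> int:
    """Instructions fetched after a branch before the target PC can be used.

    A branch fetched in cycle i reaches stage number k in cycle i + k, and
    the fetches in cycles i + 1 .. i + k still use the sequential PC.
    """
    return STAGES.index(resolve_stage)


def fetch_trace(loop_body, resolve_stage, iterations):
    """Sequence of fetched instructions for a taken loop branch."""
    slots = branch_delay_slots(resolve_stage)
    trace = []
    for _ in range(iterations):
        trace.extend(loop_body)          # loop body, ending with the branch
        trace.extend(["NOP"] * slots)    # delay slots filled with NOPs
    return trace


bne_trace = fetch_trace(["CMP", "ADD", "SUB", "BNE"], "RF", 2)
brlt_trace = fetch_trace(["SUB", "MUL", "ADD", "BRLT"], "ALU", 2)
print(bne_trace)
print(brlt_trace)
```

A branch decided in RF (the BNE case) costs one slot per iteration, and one decided in ALU (the BRLT case) costs two, which matches the single NOP and the pair of NOPs in the two fetch rows.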
( ) CPU CPU? / NOP?