Computer Architecture ELEC3441
Lecture 5: Pipelining (1)
Dr. Hayden Kwok-Hay So
Department of Electrical and Electronic Engineering
2nd sem., '16-17

RISC vs CISC: Iron Law

$$\text{CPU Time} = \frac{\text{\# of instructions}}{\text{program}} \times \frac{\text{\# of cycles}}{\text{instruction}} \times \frac{\text{time}}{\text{cycle}}$$

Lecture labels on the slide: L4; L5,6

Microarchitecture                 CPI   Cycle Time
CISC                              >1    short
RISC single cycle, unpipelined    1     long
RISC pipelined                    1     short

Pipeline Motivation: Buying Food from Canteen

[Figure: 1 customer and 2 customers, each going through Order, Food, Drink]
- Getting food from the canteen involves 3 steps:
  - Place order (P)
  - Pick up food (F)
  - Pick up drink (D)
- If there is only 1 customer: P -> F -> D
- How to serve 2 customers?

Food Ordering Pipeline

[Figure: 4 customers (pipeline) vs. 4 customers (no pipeline)]
- Serving one after another -> slow
  - Assume each step takes 1 unit of time; then N customers -> 3N units of time
- Better solution: pipeline
  - Overlap different steps in parallel
  - N customers -> 2 + N units of time
- Prerequisite: all steps must be able to operate independently in parallel
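The canteen arithmetic above can be checked with a few lines of Python. This is only a sketch: the function names are made up for illustration, and it assumes every step takes exactly 1 unit of time, as the slide does.

```python
def serial_time(n_customers: int, n_steps: int = 3) -> int:
    """Units of time when each customer finishes every step before the next starts."""
    return n_steps * n_customers


def pipelined_time(n_customers: int, n_steps: int = 3) -> int:
    """Units of time when steps overlap.

    The first customer still needs n_steps units; every later customer
    finishes one unit after the previous one.
    """
    if n_customers == 0:
        return 0
    return (n_steps - 1) + n_customers


if __name__ == "__main__":
    for n in (1, 2, 4, 100):
        print(f"N={n}: serial={serial_time(n)}, pipelined={pipelined_time(n)}")
```

For large N the ratio of the two approaches the number of steps (3 here), which is the ideal speedup of a balanced pipeline.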
Pipeline Observations

[Figure: 4 customers (pipeline), 4 customers (unbalanced pipeline), 4 customers (no pipeline)]
- Pipeline: all stages (P, F, D) are busy all the time
  - No pipeline: each stage is busy 1/3 of the time
- Balanced pipeline: 1 customer per unit of time
- The longest stage dictates the overall performance
  - 1 customer per time-of-longest-stage
- Balanced delay on each stage -> best performance

2 Views of Pipeline

Timeline view: one row per customer (Customer 0, Customer 1, Customer 2, Customer 3).

Resource view (time runs left to right):
Order (P)   c0  c1  c2  c3
Food (F)        c0  c1  c2  c3
Drink (D)           c0  c1  c2  c3

Instruction Pipelining

- Recall there are 5 steps to execute 1 instruction in RISC-V:
  - Instruction Fetch
  - Instruction Decode
  - Execution
  - Memory operation
  - Write Back
- The 5 steps can be pipelined if they can operate independently
  -> Pipeline registers
  - And more

Background: Hardware Pipeline with Registers

[Figure: inputs a[i], b[i], c[i], d[i] feed a chain of arithmetic units separated by registers clocked by clk]
- Feedforward digital systems can be pipelined by introducing pipeline registers
- Pipelining increases latency
  - E.g. y[0] now appears only when i = 4
- Throughput stays the same
  - y[i] is obtained every cycle
- Cycle time shortens

Unpipelined: y[i] = (a[i] + b[i] <op> c[i]) <op> d[i]
(the operators other than "+" did not survive extraction and are shown as <op>)

Values held along the pipelined version at cycle i:
- after register 1: a[i-1] + b[i-1], c[i-1], d[i-1]
- after register 2: a[i-2] + b[i-2] <op> c[i-2], d[i-2]
- after register 3: (a[i-3] + b[i-3] <op> c[i-3]) <op> d[i-3]
- output: y[i-4] = (a[i-4] + b[i-4] <op> c[i-4]) <op> d[i-4]
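The latency and throughput claims for the registered datapath can be seen in a small clocked simulation. This is a sketch under stated assumptions: it models four register stages, and it assumes the two unreadable operators are subtraction and multiplication, i.e. y = (a + b - c) * d. That choice and the names `simulate` and `stages` are illustrative only, not taken from the slides.

```python
def simulate(samples, stages):
    """Clock a register pipeline; return the output register value seen in each cycle.

    stages[k] is the combinational logic in front of register k.
    None marks an empty (not yet filled, or drained) register.
    """
    regs = [None] * len(stages)            # all pipeline registers empty at reset
    seen = []
    # Feed the real samples, then empty slots so the pipeline drains.
    for x in list(samples) + [None] * len(stages):
        seen.append(regs[-1])              # output visible during this cycle
        arriving = [x] + regs[:-1]         # value at the input of each stage
        regs = [None if v is None else f(v) for f, v in zip(stages, arriving)]
    return seen


# Assumed operators: y = (a + b - c) * d
stages = [
    lambda v: (v[0] + v[1], v[2], v[3]),   # a + b; carry c and d along
    lambda v: (v[0] - v[1], v[2]),         # (a + b) - c; carry d along
    lambda v: v[0] * v[1],                 # (...) * d
    lambda v: v,                           # output register
]

samples = [(i, 2 * i, 1, 3) for i in range(6)]   # (a[i], b[i], c[i], d[i])
out = simulate(samples, stages)

if __name__ == "__main__":
    for cycle, y in enumerate(out):
        print(cycle, y)
```

The first four cycles show an empty output register, after which one result appears per cycle: latency grows to 4 cycles while throughput stays at one y[i] per cycle.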
Pipelined Datapath

[Figure: datapath split into Instruction Fetch; Instruction decode & Reg-fetch; Execute; Memory Access; Write-back]

5-Stage Pipelined Execution

Stages: I-Fetch (IF); Decode, Reg. Fetch (ID); Execute (EX); Memory (M); Write-Back (WB)

time            t0   t1   t2   t3   t4   t5   t6   t7   ....
instruction1    IF1  ID1  EX1  M1   WB1
instruction2         IF2  ID2  EX2  M2   WB2
instruction3              IF3  ID3  EX3  M3   WB3
instruction4                   IF4  ID4  EX4  M4   WB4
instruction5                        IF5  ID5  EX5  M5   WB5

2nd Semester 2013, ELEC3441 - HS

5-Stage Pipelined Execution: Resource Usage Diagram

time   t0   t1   t2   t3   t4   t5   t6   t7   ....
IF     I1   I2   I3   I4   I5
ID          I1   I2   I3   I4   I5
EX               I1   I2   I3   I4   I5
M                     I1   I2   I3   I4   I5
WB                         I1   I2   I3   I4   I5

Benefit of Instruction Pipelining

$$\text{CPU Time} = \frac{\text{\# of instructions}}{\text{program}} \times \frac{\text{\# of cycles}}{\text{instruction}} \times \frac{\text{time}}{\text{cycle}}$$

- When the pipeline is filled, CPI = 1
- Shorter cycle time because there is less work to do per cycle
- In fact, more pipeline stages -> shorter cycle time
  - Commercial processors can have pipelines of up to 20 stages
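The timeline and the "CPI = 1 when the pipeline is filled" claim can be reproduced with a short script. This is a minimal sketch of an ideal pipeline with no hazards; the helper names `timeline` and `total_cycles` are invented for illustration.

```python
STAGES = ["IF", "ID", "EX", "M", "WB"]


def total_cycles(n_instructions: int) -> int:
    """Cycles to run n instructions through an ideal (hazard-free) pipeline."""
    if n_instructions == 0:
        return 0
    return n_instructions + len(STAGES) - 1


def timeline(n_instructions: int):
    """Row i lists the stage occupied by instruction i in each cycle ('' if none)."""
    width = total_cycles(n_instructions)
    rows = []
    for i in range(n_instructions):
        row = [""] * width
        for s, name in enumerate(STAGES):
            row[i + s] = name              # instruction i enters IF in cycle i
        rows.append(row)
    return rows


if __name__ == "__main__":
    for i, row in enumerate(timeline(5), start=1):
        print(f"I{i}: " + " ".join(f"{cell:<3}" for cell in row))
    for n in (5, 100, 10_000):
        print(n, "instructions -> cycles per instruction:", total_cycles(n) / n)
```

The 4 fill cycles are a fixed cost, so cycles per instruction tends to 1 as the instruction count grows.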
Pipeline is Difficult

- Structural Hazard
- Data Hazard
- Control Hazard
- On every cycle, the hardware needs to detect and resolve all types of hazards, while keeping the pipeline as filled as possible to achieve CPI = 1
  - In real systems, CPI suffers slightly in return for higher clock speed
- Need to make sure the hardware adheres to the ISA contract with the programmer
- Difficult but worth it

Structural Hazard

- A structural hazard arises when more than 1 pipeline stage requires access to the same physical hardware
- Solutions:
  1. Add extra copies of the resource
  2. Change the resource so that it can handle concurrent use
  3. Require different stages to access the hardware at different times
     - Stall one (or some) of the conflicting stages
     - Avoid the concurrent use

Structural Hazard Ex 1

[Figure: IF ID EX MEM WB pipeline diagram over t0 ... t9]
- Physically there is only 1 main memory in a computer
  - It holds both instructions and data
- During run-time, both the IF and MEM stages need access to the main memory -> structural hazard
- Solution so far: replicate the memory
  - Instruction memory + Data memory
  - Reality: Inst + Data Cache

Structural Hazard Ex 2: Regfile

[Figure: IF ID EX MEM WB pipeline diagram over t0 ... t9]
- Physically there is only 1 register file
- During run-time, there are chances that both the ID and WB stages need access to the regfile
  - ID: read regfile (rs1, rs2)
  - WB: write regfile (rd)
  -> structural hazard
- Solution so far: the regfile supports concurrent read and write
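Detecting a structural hazard amounts to asking whether two busy stages claim the same resource in the same cycle. The sketch below makes that check explicit. The resource names ("memory", "imem", "dmem") and the function `structural_conflicts` are hypothetical, and it pessimistically assumes the MEM stage uses memory every cycle.

```python
from collections import defaultdict


def structural_conflicts(stage_resources, busy_stages):
    """Return {resource: [stages]} for resources claimed by more than one busy stage."""
    users = defaultdict(list)
    for stage in busy_stages:
        for resource in stage_resources.get(stage, ()):
            users[resource].append(stage)
    return {res: stages for res, stages in users.items() if len(stages) > 1}


ALL_STAGES = ["IF", "ID", "EX", "MEM", "WB"]

# One shared main memory: IF and MEM collide.
unified = {"IF": ["memory"], "MEM": ["memory"]}
# Replicated memory (instruction memory + data memory): no collision.
split = {"IF": ["imem"], "MEM": ["dmem"]}

if __name__ == "__main__":
    print(structural_conflicts(unified, ALL_STAGES))
    print(structural_conflicts(split, ALL_STAGES))
```

Replicating the resource removes the shared key from the map, which is solution 1 on the slide; a multi-ported regfile corresponds to solution 2 and would be modelled by giving reads and writes separate port names.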
Data Hazard

- A data hazard arises when pipeline stages access a data location in ways that are incompatible with the ISA contract with the programmer
- Technically there are 3 types of data hazards:
  - Read After Write hazards (RAW)
  - Write After Read hazards (WAR)
  - Write After Write hazards (WAW)
- What may go wrong?
  - RAW: a later read happens before an earlier write
  - WAR: a later write happens before an earlier read
  - WAW: a later write happens before an earlier write
- Data hazards happen on register AND memory locations
- In our 5-stage pipeline, only RAW can happen

Example (dependences marked on the slide: RAW, WAR, WAW):
  x1 <- x0 + 0
  x4 <- x1 + 7       RAW on x1 with the instruction above
  x4 <- x2 + x3      WAW on x4 with the instruction above
  x2 <- x4 + 1       WAR on x2 with the instruction above

Data Hazard Example

Stages: I-Fetch (IF); Decode, Reg. Fetch (ID); Execute (EX); Memory (M); Write-Back (WB)

time             t0   t1   t2   t3   t4   t5   t6   t7   ....
x1 <- x0 + 0     IF1  ID1  EX1  M1   WB1
x4 <- x1 + 7          IF2  ID2  EX2  M2   WB2

- t2 (EX1): new value of x1 calculated
- t2 (ID2): old value of x1 read
- t4 (WB1): I1 writes the value of x1

What is wrong with this?

[Figure: pipelined datapath with these instructions in flight]
  add x1, x2, x3
  lw  x4, 20(x5)
  ori x6, x7, 1
  sub x8, x9, x0
- The value being written back to the regfile comes from an instruction 3 cycles ago
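The three hazard definitions translate directly into set checks on destination and source registers. The following is a minimal sketch: the tuple encoding `(dest, sources)` and the function name `hazards` are assumptions for illustration, and the special case of x0 (which always reads as zero in RISC-V) is deliberately not handled.

```python
def hazards(earlier, later):
    """Classify register dependences between two instructions in program order.

    Each instruction is (dest, sources), e.g. ("x4", ["x1"]) for x4 <- x1 + 7.
    """
    e_dest, e_srcs = earlier
    l_dest, l_srcs = later
    found = set()
    if e_dest is not None and e_dest in l_srcs:
        found.add("RAW")        # later reads what earlier writes
    if l_dest is not None and l_dest in e_srcs:
        found.add("WAR")        # later writes what earlier reads
    if e_dest is not None and e_dest == l_dest:
        found.add("WAW")        # both write the same register
    return found


program = [
    ("x1", ["x0"]),             # x1 <- x0 + imm
    ("x4", ["x1"]),             # x4 <- x1 + imm
    ("x4", ["x2", "x3"]),       # x4 <- x2 + x3
    ("x2", ["x4"]),             # x2 <- x4 + imm
]

if __name__ == "__main__":
    for i in range(len(program) - 1):
        print(f"I{i + 1} -> I{i + 2}:", sorted(hazards(program[i], program[i + 1])))
```

Note that the last pair carries both a WAR dependence (on x2) and a RAW dependence (on x4). In an in-order 5-stage pipeline only the RAW cases need hardware attention, because reads always happen in ID and writes always happen in WB, in program order.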
Pipelined Execution Control: decode

[Figure: pipelined RISC-V datapath without jumps, stages F, D, E, M, W, with control signals RegWriteEn, FuncSel, MemWrite, WBSel, Op2Sel]
- Replicate the instruction register to every stage
- Distributed decoding for each stage, based on the current instruction of that stage

Control Points Need to Be Connected

[Figure: the same datapath with the control points highlighted]

Last Time

Recap of the Data Hazard slide: the three hazard types are RAW, WAR and WAW; they can occur on register and memory locations; in our 5-stage pipeline only RAW can happen.
Data Hazard Example (recap)

- x1 <- x0 + 0 (I1) is followed immediately by x4 <- x1 + 7 (I2)
- I1 calculates the new value of x1 in EX at t2 and writes it in WB at t4
- I2 reads the old value of x1 in ID at t2

Resolving Data Hazards

- Strategy 1: Stalling
  - Wait for the result to be available by freezing earlier pipeline stages -> interlocks
- Strategy 2: Forwarding
  - Route data to the earlier pipeline stage as soon as possible after it is calculated -> bypass

Feedback to Resolve Hazards

[Figure: stage 1 to stage 4 in sequence, with feedback paths FB1 to FB4 from later stages to earlier ones]
- Later stages provide dependence information to earlier stages, which can stall (or kill) instructions
- Controlling a pipeline in this manner works provided the instruction at stage i+1 can complete without any interference from instructions in stages 1 to i (otherwise deadlocks may occur)

Interlocks to resolve Data Hazards

[Figure: pipelined datapath with a Stall Condition block, executing]
  x1 <- x0 + 0
  x4 <- x1 + 7
Interlocks to resolve Data Hazards

[Figure: pipelined datapath with a Stall Condition block, executing x1 <- x0 + 0 followed by x4 <- x1 + 7]
- Send a bubble in place of the 2nd instruction
- Freeze the 2nd instruction at decode
- The 1st instruction proceeds

Stalled Stages and Pipeline Bubbles

time                   t0   t1   t2   t3   t4   t5   t6   t7   ....
(I1) x1 <- (x0) + 0    IF1  ID1  EX1  M1   WB1
(I2) x4 <- (x1) + 7         IF2  ID2  ID2  ID2  ID2  EX2  M2   WB2
(I3)                             IF3  IF3  IF3  IF3  ID3  EX3  M3   WB3
(I4)                                                 IF4  ID4  EX4  M4   WB4
(I5)                                                      IF5  ID5  EX5  M5   WB5

The repeated ID2 and IF3 entries are the stalled stages.

Resource Usage

time   t0   t1   t2   t3   t4   t5   t6   t7   ....
IF     I1   I2   I3   I3   I3   I3   I4   I5
ID          I1   I2   I2   I2   I2   I3   I4   I5
EX               I1   -    -    -    I2   I3   I4   I5
M                     I1   -    -    -    I2   I3   I4   I5
WB                         I1   -    -    -    I2   I3   I4   I5

"-" => pipeline bubble

Pipeline Bubbles

- A pipeline bubble is a logical concept
- It can be implemented using:
  - a NOP instruction
  - special control decoding
- Stall the pipeline by disabling pipeline registers
- A data hazard causes pipeline stalls
- Bubbles turn into NOPs

Program as written:
  addi x1, x0, 0
  addi x4, x1, 7
  ori  x6, x7, 1
  sub  x8, x9, x0

As executed with bubbles:
  addi x1, x0, 0
  NOP
  NOP
  NOP
  addi x4, x1, 7
  ori  x6, x7, 1
  sub  x8, x9, x0
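The stalled timeline above can be derived mechanically. The sketch below schedules an in-order 5-stage pipeline with interlocks and no bypassing. It assumes, as the diagram suggests, that a value written in WB can only be read by an ID stage in a later cycle. The function `schedule` and the `(dest, sources)` encoding are invented for illustration.

```python
def schedule(program):
    """Return, per instruction, the cycles spent in IF, ID, EX and WB.

    program is a list of (dest, sources). "ID" is a (first, last) pair;
    last - first is the number of stall cycles the instruction spends in decode.
    """
    wb_cycle = {}                         # register -> WB cycle of its latest producer
    rows = []
    prev_id = None                        # (first, last) ID cycles of the previous instruction
    for dest, sources in program:
        if prev_id is None:
            if_first, id_first = 0, 1
        else:
            if_first = prev_id[0]         # IF frees up when the predecessor enters ID
            id_first = prev_id[1] + 1     # ID frees up when the predecessor leaves ID
        # Earliest cycle in which every source value is readable from the regfile.
        ready = max((wb_cycle.get(s, -1) + 1 for s in sources), default=0)
        id_last = max(id_first, ready)
        wb = id_last + 3                  # EX, M, WB follow without further stalls
        if dest is not None:
            wb_cycle[dest] = wb
        rows.append({"IF": if_first, "ID": (id_first, id_last),
                     "EX": id_last + 1, "WB": wb})
        prev_id = (id_first, id_last)
    return rows


program = [
    ("x1", ["x0"]),          # addi x1, x0, imm
    ("x4", ["x1"]),          # addi x4, x1, imm
    ("x6", ["x7"]),          # ori  x6, x7, imm
    ("x8", ["x9", "x0"]),    # sub  x8, x9, x0
]
rows = schedule(program)

if __name__ == "__main__":
    for i, r in enumerate(rows, start=1):
        stalls = r["ID"][1] - r["ID"][0]
        print(f"I{i}: IF@t{r['IF']} ID@t{r['ID'][0]}..t{r['ID'][1]} "
              f"EX@t{r['EX']} WB@t{r['WB']} stalls={stalls}")
```

Under these assumptions the second instruction sits in ID from t2 to t5 (3 stall cycles) and enters EX at t6, and the fourth instruction is fetched at t6, matching the diagram.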
Hazards due to Loads & Stores

[Figure: pipelined datapath with a Stall Condition block, executing]
  M[x1+7] <- x2
  x4 <- M[x3+5]
- What if x1+7 = x3+5?
- Is there any possible data hazard in this instruction sequence?

Load & Store Hazards

  M[x1+7] <- x2
  x4 <- M[x3+5]
- x1+7 = x3+5 => data hazard
- However, the hazard is avoided because our memory system completes writes in one cycle
- Load/store hazards are sometimes resolved in the pipeline and sometimes in the memory system itself
- More on this later in the course

Pipeline CPI Examples

Measure from when the first instruction finishes to when the last instruction in the sequence finishes.

Completion order (time runs left to right)       Cycles   CPI
Inst 1, Inst 2, Inst 3                           3        3/3 = 1
Inst 1, Inst 2, Bubble, Inst 3                   4        4/3 = 1.33
Inst 1, Bubble 1, Inst 2, Bubble 2, Inst 3       5        5/3 = 1.67

Resolving Data Hazards

Next: Strategy 2, forwarding (bypass), as the alternative to Strategy 1, stalling (interlocks).
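The three CPI examples follow one rule: count the completion slots from the first instruction's finish to the last instruction's finish, and divide by the number of instructions. A small sketch, with `cpi_from_completions` as a made-up helper that takes the cycle in which each instruction finishes:

```python
def cpi_from_completions(finish_cycles):
    """CPI measured from the first instruction's finish to the last one's finish."""
    span = max(finish_cycles) - min(finish_cycles) + 1
    return span / len(finish_cycles)


no_bubble = [4, 5, 6]        # one instruction finishes every cycle
one_bubble = [4, 5, 7]       # one bubble before the third instruction
two_bubbles = [4, 6, 8]      # a bubble after each of the first two instructions

if __name__ == "__main__":
    for name, cycles in [("no bubble", no_bubble),
                         ("one bubble", one_bubble),
                         ("two bubbles", two_bubbles)]:
        print(f"{name}: CPI = {cpi_from_completions(cycles):.2f}")
```

The absolute cycle numbers are arbitrary; only the gaps between completions matter, which is why pipeline fill time does not enter this measure.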
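Bypassing

With interlocks only:

time                 t0   t1   t2   t3   t4   t5   t6   t7   ....
(I1) x1 <- x0 + 0    IF1  ID1  EX1  M1   WB1
(I2) x4 <- x1 + 7         IF2  ID2  ID2  ID2  ID2  EX2  M2   WB2
(I3)                           IF3  IF3  IF3  IF3  ID3  EX3  M3
(I4)                                               IF4  ID4  EX4
(I5)                                                    IF5  ID5

The repeated ID2 and IF3 entries are the stalled stages.

- Each stall or kill introduces a bubble in the pipeline => CPI > 1
- A new datapath, i.e., a bypass, can get the data from the output of the ALU to its input

With the bypass:

time                 t0   t1   t2   t3   t4   t5   t6   t7   ....
(I1) x1 <- x0 + 0    IF1  ID1  EX1  M1   WB1
(I2) x4 <- x1 + 7         IF2  ID2  EX2  M2   WB2
(I3)                           IF3  ID3  EX3  M3   WB3
(I4)                                IF4  ID4  EX4  M4   WB4
(I5)                                     IF5  ID5  EX5  M5   WB5

Adding a Bypass

[Figure: pipelined datapath (stages D, E, M, W) with a stall signal and an ASrc mux feeding the bypassed value back to the ALU input, executing x4 <- x1 ...]

When does this bypass help?

(I1)                 (I2)               Helps?
x1 <- x0 + 0         x4 <- x1 + 7       yes
x1 <- M[x0 + 0]      x4 <- x1 + 7       no
JAL 500              x4 <- x1 + 7       no

Fully Bypassed Datapath

[Figure: fully bypassed datapath for JAL, with ASrc and BSrc muxes, a stall signal, and stages D, E, M, W]
- Is there still a need for the stall signal?

Questions about LW and forwarding

[Figure: ID (Decode), EX, MEM, WB stages with WE and MemToReg controls, mux and logic blocks, and a path from WB; the instruction operands are only partly legible: ADDIU 24, OR 3,3,2, LW 28(29)]
- Do we need to stall?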
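The table and the LW question above suggest that even a fully bypassed datapath must still stall when the instruction in decode needs the result of a load that is currently in execute, because the loaded value does not exist until the memory stage. A sketch of that check follows. The field names (`rs1`, `rs2`, `re1`, `re2`, `ws`, `opcode`) and the dictionary encoding are assumptions for illustration, with `re1`/`re2` meaning "this instruction really reads rs1/rs2" and `ws` the destination register number.

```python
def needs_stall(decode, execute):
    """Load-use interlock for a pipeline with full bypassing (illustrative model).

    decode:  {"rs1": int, "rs2": int, "re1": bool, "re2": bool}
    execute: {"opcode": str, "ws": int}, or None if EX holds a bubble
    """
    if execute is None:
        return False
    # Only a load in EX is a problem; a write to register 0 has no consumer.
    load_in_ex = execute["opcode"] == "LW" and execute["ws"] != 0
    uses_rs1 = decode["re1"] and decode["rs1"] == execute["ws"]
    uses_rs2 = decode["re2"] and decode["rs2"] == execute["ws"]
    return load_in_ex and (uses_rs1 or uses_rs2)


# x4 <- x1 + imm in decode: reads rs1 = x1 only.
consumer = {"rs1": 1, "rs2": 0, "re1": True, "re2": False}

if __name__ == "__main__":
    print(needs_stall(consumer, {"opcode": "LW", "ws": 1}))    # load feeding x1
    print(needs_stall(consumer, {"opcode": "ADDI", "ws": 1}))  # ALU result, bypassable
```

An ALU producer never triggers the stall in this model because its result is available on the bypass path at the end of EX, which is the "yes" row of the table.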
Questions about LW and forwarding

[Figure: ID (Decode), EX, MEM, WB stages with WE and MemToReg controls, mux and logic blocks, and a path from WB; the instruction operands are only partly legible: ADDIU 24, LW 28(29), OR ,3,]
- Do we need to stall?

Fully Bypassed Datapath

[Figure: fully bypassed datapath for JAL, with ASrc and BSrc muxes, a stall signal, and stages D, E, M, W]
- Is there still a need for the stall signal?

$$\text{stall} = (rs1_D = ws_E) \cdot (opcode_E = LW_E) \cdot (ws_E \neq 0) \cdot re1_D + (rs2_D = ws_E) \cdot (opcode_E = LW_E) \cdot (ws_E \neq 0) \cdot re2_D$$

In Conclusions

- Pipelining is a well-studied digital system design technique
- Pipelining allows concurrent execution of multiple steps
- 5 stages of the RISC-V pipeline:
  - Instruction Fetch
  - Instruction Decode
  - Instruction Execute
  - Memory Access
  - Write Back
- 3 types of hazards:
  - Structural hazard
  - Data hazard
  - Control hazard

Acknowledgements

These slides contain material developed and copyright by:
- Arvind (MIT)
- Krste Asanovic (MIT/UCB)
- Joel Emer (Intel/MIT)
- James Hoe (CMU)
- John Kubiatowicz (UCB)
- David Patterson (UCB)

MIT material derived from course 6.823
UCB material derived from course CS152, CS252