Outline Single Cycle Processor Design Multi cycle Processor. Pipelined Processor: Hazards and Removal. Instruction Pipeline. Time

Transcription:

3/8/5
Pipelined Processor: Hazards and Their Removal
A. Sahu, CSE, IIT Guwahati
Please stay updated via http://jatinga.iitg.ernet.in/~asahu/cs/

Outline
Single-cycle processor design
Multi-cycle processor
  Merging IM and DM; removing Adder1 and Adder2
  Synchronized: introducing mux and register
Pipelined processor (data path)
Pipeline: introduction, performance, and cost
Pipelined processor: hazards and their removal
  Data hazards: bypass/forward path
  Control hazards: branch prediction

Instruction Pipeline
[Timing diagram: instructions INS1 to INS5 against time in clock cycles 1 to 8; clock period T]

Instruction Pipeline: Time to Execute N Instructions
No pipeline: Total time = T*N*5
Pipeline: Total time = T*(N+4)
All the stages work in parallel; no resource can be shared by stages.
Performance: 1 instruction per cycle.

Superscalar Pipeline: Pentium
[Diagram: a single pipeline compared with Pipeline 1, Pipeline 2, Pipeline 3: fetch 3 instructions, decode 3 instructions, execute 3 instructions]

Difficulties in Pipelining
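The two timing formulas from the instruction pipeline slides can be checked numerically. The sketch below is only an illustration under the slide's idealized assumptions (5 stages, clock period T, no stalls); the function names are hypothetical and do not come from the slides.

```python
def unpipelined_time(n, t, stages=5):
    """Total time when every instruction uses all stages before the next one starts."""
    return t * n * stages


def pipelined_time(n, t, stages=5):
    """Total time for an ideal pipeline: (stages - 1) cycles to fill, then one result per cycle."""
    if n == 0:
        return 0
    return t * (n + stages - 1)


def speedup(n, stages=5):
    """Ratio of unpipelined to pipelined time; approaches `stages` as n grows."""
    return unpipelined_time(n, 1, stages) / pipelined_time(n, 1, stages)


if __name__ == "__main__":
    for n in (1, 10, 1000):
        print(n, unpipelined_time(n, 1), pipelined_time(n, 1), round(speedup(n), 2))
```

For large N the speedup approaches the number of stages, which is the sense in which the ideal pipeline completes one instruction per cycle.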

Hazards in Pipelining
Resource conflicts => structural hazards: use of the same resource in different stages
Data dependencies => data hazards: RAW (read after write), WAR (write after read), WAW (write after write)
Procedural dependencies => control hazards: conditional and unconditional branches, calls/returns

Structural Hazards
Caused by resource conflicts: use of a hardware resource in more than one cycle.
[Resource-usage diagram: the sequence A B A C repeated for successive instructions]
Different sequences of resource usage by different instructions.
[Resource-usage diagram: A B C for one instruction, A C B for another]
Non-pipelined multi-cycle resources.
[Resource-usage diagram: F X X repeated for successive instructions]

Hazards and How to Handle Them
Resource hazards: we don't allow this to happen. Hardware dependent.
Data hazards: data dependency.
Control hazards: JMP, CALL, RTN, BEQ, BNZ, ...
Program dependent: the hardware designer doesn't have any control.

Instruction Pipeline: Data Hazards
1: LW R1, 0
2: LW R2, 4
3: ADD R3, R1, R2
4: SW R3, 8
[Pipeline timing diagram: instructions against clock cycles]
The result of the preceding load is used by 3: complete it, then start 3.
The result of 3 is used by 4: complete 3, then start 4.
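The three kinds of data hazard (RAW, WAR, WAW) can be told apart by comparing the registers each instruction reads and writes. The following is a minimal sketch, not the slides' method: the `Instr` class, its fields, and the register names are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    """Hypothetical instruction record: the registers it writes and reads."""
    name: str
    writes: frozenset
    reads: frozenset


def hazards(first, second):
    """Return the data-hazard types that arise when `second` follows `first`."""
    found = set()
    if first.writes & second.reads:
        found.add("RAW")  # second reads what first writes
    if first.reads & second.writes:
        found.add("WAR")  # second overwrites what first still has to read
    if first.writes & second.writes:
        found.add("WAW")  # both write the same register
    return found


if __name__ == "__main__":
    load = Instr("lw", frozenset({"R1"}), frozenset())
    add = Instr("add", frozenset({"R3"}), frozenset({"R1", "R2"}))
    print(hazards(load, add))
```

In a simple in-order pipeline that reads registers early and writes them late, only RAW dependences normally cause stalls; WAR and WAW matter once instructions can complete out of order.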

Data Hazards
[Diagram: previous and current instructions, each performing a register read/write]

Stalls Due to Data Hazards
lw $t,...
add $s,$t,..
[Pipeline diagram] delay = 3
Fun: not considering R in the 2nd half of the cycle and W in the 1st half of the cycle.

Data Hazards: Handling with Data Forwarding
lw $t,...
add $s,$t,..
[Pipeline diagram]
Create an extra forward path from memory to the ALU.
Fun: not considering R in the 2nd half of the cycle and W in the 1st half of the cycle.

Stalls Due to Data Hazards: Instruction View
lw $t,...
add $s,$t,..
[Pipeline diagram]
HA..HA..: considering R in the 2nd half of the cycle and W in the 1st half of the cycle.

Handling Data Hazards
[Diagram: the previous instruction writes (W), the current instruction reads (R)]
Data forwarding
Instruction reordering: insert many non-dependent instructions in between the two.
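The stall counts discussed above can be reproduced with a small cycle-counting model. This is a sketch under stated assumptions rather than the slides' own derivation: a 5-stage pipeline with stages numbered IF=0, ID=1, EX=2, DM=3, WB=4, a producer fetched in cycle 0, and a consumer fetched `distance` cycles later. The function name and parameters are hypothetical.

```python
def stalls(distance, producer_is_load=False, forwarding=False, split_cycle_regfile=False):
    """Stall cycles needed by a consumer fetched `distance` cycles after its producer."""
    if distance < 1:
        raise ValueError("consumer must follow the producer")
    if forwarding:
        # The value exists at the end of EX (ALU op) or DM (load) and is
        # forwarded straight to the consumer's EX stage.
        ready_after = 3 if producer_is_load else 2
        first_ok_cycle = ready_after + 1
        consumer_stage = 2  # EX
    else:
        # The value is written in WB (cycle 4). The consumer's ID stage can read it
        # in the same cycle only if writes use the 1st half and reads the 2nd half.
        first_ok_cycle = 4 if split_cycle_regfile else 5
        consumer_stage = 1  # ID
    unstalled_cycle = distance + consumer_stage
    return max(0, first_ok_cycle - unstalled_cycle)


if __name__ == "__main__":
    print(stalls(1))                                        # lw then add, no tricks
    print(stalls(1, split_cycle_regfile=True))              # write 1st half, read 2nd half
    print(stalls(1, producer_is_load=True, forwarding=True))
```

Under these assumptions an adjacent `lw`/`add` pair needs 3 stall cycles, 2 with a split-cycle register file, and 1 when the loaded value is forwarded from memory to the ALU.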

Control Hazards

Stalls Due to Control Hazards
beq ..., L
...
L: add ...
[Pipeline diagram: the result of the comparison becomes available late in the branch's execution; we need to wait for it]

Example
Suppose a processor has an S-stage pipeline.
When we encounter a branch instruction, the whole pipeline needs to be flushed until the branch instruction finishes execution. This takes S-1 cycles.
The probability of a branch is b.
Performance of a program with N instructions:
Execution time = N(1 + branch probability * branch penalty) = N(1 + b(S-1)) cycles

General Branch Instruction
[Timing diagram: branch, next inline, and target instructions; target address generation and condition evaluation; delay = 5]
The order of condition evaluation and target address generation may be different.
Condition evaluation may be done in the previous instruction.

Remember the BEQ Instruction
BEQ $S1 $S2 16bitLabel
if (S1==S2) PC = (PC+4) + SinX32(16BitLabel<<2);
else PC = PC+4;
Address generation: SinX32(16BitLabel<<2), the label shifted and sign-extended to 32 bits
Condition evaluation: S1==S2
[Pipeline diagram: stages annotated with S1==S2 and SinX32(Label<<2)]
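Both the branch-penalty estimate and the BEQ next-PC rule can be written out directly. The sketch below assumes a branch penalty of S-1 cycles (giving N(1 + b(S-1)) cycles in total), a 16-bit label, and a 32-bit PC; all function names are hypothetical.

```python
def execution_cycles(n, b, s):
    """Cycles for n instructions when a fraction b are branches costing s - 1 extra cycles each."""
    return n * (1 + b * (s - 1))


def sign_extend_16(value):
    """Interpret the low 16 bits of `value` as a two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def beq_next_pc(pc, s1, s2, label16):
    """Next PC for BEQ: PC+4 plus the shifted, sign-extended label if s1 == s2, else PC+4."""
    fallthrough = (pc + 4) & 0xFFFFFFFF
    if s1 == s2:  # condition evaluation
        offset = sign_extend_16(label16) << 2  # address generation
        return (fallthrough + offset) & 0xFFFFFFFF
    return fallthrough


if __name__ == "__main__":
    print(execution_cycles(100, 0.2, 5))
    print(hex(beq_next_pc(0x100, 5, 5, 3)))
```

The two branches of `beq_next_pc` correspond to the "target" and "next inline" instructions: which one is fetched is unknown until the condition has been evaluated, and that wait is the source of the penalty.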

Handling Hazards
Data hazards:
  detect instructions with data dependence
  introduce nop instructions (bubbles) in the pipeline
  more complex: data forwarding
Control hazards:
  detect branch instructions
  flush inline instructions if branching occurs
  more complex: branch prediction

Pipeline Data Hazards
Stalls due to data hazards
Control to introduce stall cycles
Detecting data hazard conditions
Data forwarding paths
Data forwarding control
Stalls with data forwarding

Stalls Due to Data Hazards: Instruction View
lw $t,...
add $s,$t,..
[Pipeline diagram]

Detecting Data Hazards
Condition to be checked: the instruction in the ID stage reads from a register that the instruction in the EX stage or the DM stage is going to write.
ID/EX.RW and (IF/ID.rs = ID/EX.rd or IF/ID.rt = ID/EX.rd)
EX/DM.RW and (IF/ID.rs = EX/DM.rd or IF/ID.rt = EX/DM.rd)
We need to ensure that the instruction in ID actually reads rs and/or rt (not taken care of here).
Note: rd is the destination address after multiplexing.

Data Forwarding Path P1
add $t,...
add $s,$t,..
[Pipeline diagram: ALU to ALU]
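The hazard-detection condition on the slides (a register-write flag RW combined with a comparison of rs and rt against rd) maps onto a short predicate. This is a sketch only: the pipeline-register classes and field names below are assumptions modeled on the slide's notation, and, as the slide notes, it does not check whether the decoding instruction really reads rs and rt.

```python
from dataclasses import dataclass


@dataclass
class FetchDecodeReg:
    """Hypothetical fetch/decode pipeline register: source register numbers."""
    rs: int
    rt: int


@dataclass
class LaterStageReg:
    """Hypothetical later pipeline register: write-enable flag and destination."""
    RW: bool
    rd: int


def data_hazard(decode, execute, memory):
    """True if the decoding instruction reads a register that a later-stage instruction will write."""
    def conflicts(stage):
        return stage.RW and (decode.rs == stage.rd or decode.rt == stage.rd)

    return bool(conflicts(execute) or conflicts(memory))


if __name__ == "__main__":
    decode = FetchDecodeReg(rs=8, rt=9)
    print(data_hazard(decode, LaterStageReg(True, 8), LaterStageReg(False, 0)))
```

When the predicate is true and no forwarding path covers the case, the control logic holds the decoding instruction and inserts a bubble.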

Data Forwarding Path P2
lw $t,...
add $s,$t,..
[Pipeline diagram: DM to ALU]

Data Forwarding Path P3
add $t,...
sw $t,..
[Pipeline diagram: ALU to DM bypassing]

Data Forwarding Path P4
lw $t,...
sw $t,..
[Pipeline diagram: DM to DM bypassing]

Data Forwarding Path List
P1: from ALU out (EX/DM) to ALU in1/2
P2: from DM/ALU out (DM/WB) to ALU in1/2
P3/P4: from DM/ALU out (DM/WB) to DM in

Actual Forwarding Paths
[Datapath diagram: forwarding multiplexers fwda, fwdb, and fwdc feeding the ALU inputs and the data memory (ad, rd, wd ports)]
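Choosing between the forwarding paths for an ALU operand amounts to a priority check: prefer the most recent producer, then the older one, then the register file. The sketch below assumes each later pipeline register is represented as a dictionary with hypothetical keys `RW`, `rd`, and `value`, and that the newer register already holds a finished ALU result (a load result is not yet available there and would still need a stall).

```python
def forward_operand(src_reg, regfile_value, newer, older):
    """Pick the value for one ALU input.

    newer: register after the ALU stage (most recent result)
    older: register after the memory stage (ALU or load result)
    Returns (source_name, value).
    """
    if newer["RW"] and newer["rd"] == src_reg:
        return "ALU-out", newer["value"]      # ALU output to ALU input
    if older["RW"] and older["rd"] == src_reg:
        return "memory-out", older["value"]   # memory-stage output to ALU input
    return "regfile", regfile_value           # no hazard: use the value read in decode


if __name__ == "__main__":
    newer = {"RW": True, "rd": 8, "value": 42}
    older = {"RW": True, "rd": 8, "value": 7}
    print(forward_operand(8, 0, newer, older))
```

Checking the newer register first matters when two in-flight instructions write the same register: the consumer must see the later value.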

Handling Control Hazards
Branch elimination: predicated instructions
Branch speed-up: early CC, delayed branch
Branch prediction: fixed, static, dynamic
Branch target capture: BTB, BTAC, BTIC

Branch Elimination
[Flowchart: condition C; if true (T), execute S; if false (F), skip it]
With a branch:
  OP1
  BC CC = Z, ...
  ADD R3, R1, R2
  OP2
Use conditional instructions (predicated execution), C : S
  OP1
  ADD R3, R1, R2, NZ
  OP2

Branch Elimination: SLT
Set on less than (SLT): SLT R1 R2 R3
Meaning: if (R1 < R2) R3 = 1
Suppose you don't have this instruction:
  CMP R1 R2
  JNZ Label
  MOV R3 1
Label:

Branch Speed-up