Surfing Interconnect

Mark R. Greenstreet and Jihong Ren
University of British Columbia, Rambus
Surfing Interconnect p.1/17

Motivation
Wires are the problem:
- Wires scale poorly with feature size:
  - Gate delays scale with feature size.
  - Wire delays are invariant under feature size scaling.
  - Long (i.e., cross-chip) wires have delays that increase quadratically with inverse feature size.
- Long wires consume substantial power.
- Long wires have serious signal integrity and timing concerns.

Outline
- Surfing
  - Surfing pipelines
  - Wire buffering
  - A surfing buffer for long wire interconnect
- The timing chain
- Comparison with traditional synchronous and asynchronous techniques

Surfing pipelines
[Figure: a chain of datapath elements (in, out) on the data path, alongside a timing path.]
- Datapath elements are faster when fast is asserted than when it is not.
- If the maximum delay of a datapath element in fast mode is less than the minimum timing chain delay, and the minimum delay of a datapath element in slow mode is greater than the maximum timing chain delay, then events in the datapath are attracted to coincide with the rising edge of fast.
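The attraction condition above can be illustrated with a toy model. This is only a sketch under simplifying assumptions: each stage switches abruptly between one fast delay and one slow delay, the timing chain has a single fixed delay, and all numbers (and the names `surf`, `d_fast`, `d_slow`, `d_timing`) are hypothetical, not taken from the slides.

```python
def surf(offset, stages, d_fast=0.8, d_slow=1.2, d_timing=1.0):
    """Track the data-event offset relative to the timing edge, stage by stage.

    offset > 0 means the data event trails the rising edge of fast, so the
    stage sees fast asserted and uses d_fast; otherwise it uses d_slow.
    All delays are hypothetical and in arbitrary units.
    """
    # The surfing condition: fast delay < timing delay < slow delay.
    assert d_fast < d_timing < d_slow
    history = [offset]
    for _ in range(stages):
        d_data = d_fast if offset > 0 else d_slow
        # Late events gain on the timing edge; early events fall back.
        offset += d_data - d_timing
        history.append(offset)
    return history


late = surf(offset=0.9, stages=20)    # data starts well behind the timing edge
early = surf(offset=-0.9, stages=20)  # data starts well ahead of the timing edge
```

In this model both a late and an early event end up within one correction step (0.2 units here) of the timing edge, which is the sense in which events are attracted to it.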

Unbuffered Interconnect
[Figure: a wire of length $l$ from source to destination.]
- Wire resistance $= r_w l$
- Wire capacitance $= c_w l$
- Wire delay $\approx \frac{r_w c_w}{2} l^2$
Wire delay grows quadratically with length.
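The quadratic growth can be checked numerically with an Elmore delay estimate of the wire modelled as an RC ladder. This is a sketch, not the authors' model: the function name `elmore_delay`, the number of sections, and the per-length values of r_w and c_w are assumptions chosen only for illustration.

```python
def elmore_delay(r_w, c_w, length, sections):
    """Elmore delay at the far end of a wire modelled as an RC ladder."""
    r_seg = r_w * length / sections  # resistance of one ladder section
    c_seg = c_w * length / sections  # capacitance of one ladder section
    # The k-th capacitor is charged through k sections of upstream resistance.
    return sum(k * r_seg * c_seg for k in range(1, sections + 1))


R_W = 1e5    # ohm per metre (hypothetical)
C_W = 2e-10  # farad per metre (hypothetical)

d_short = elmore_delay(R_W, C_W, 5e-3, sections=1000)
d_long = elmore_delay(R_W, C_W, 10e-3, sections=1000)
closed_form = R_W * C_W * (10e-3) ** 2 / 2
```

With many sections the ladder estimate approaches r_w c_w l^2 / 2, and doubling the length multiplies the delay by four.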

Buffered Interconnect
[Figure: a wire of length $l$ from source to destination, split into $n$ buffered segments numbered 1, 2, 3, ..., n.]
- Wire resistance (per segment) $= r_w \frac{l}{n}$
- Wire capacitance (per segment) $= c_w \frac{l}{n}$
- Wire delay (per segment) $\approx \frac{r_w c_w}{2} \left(\frac{l}{n}\right)^2$
- Buffer delay (per segment) $= \delta_{buf}$
- Total delay $= \frac{r_w c_w}{2n} l^2 + n \delta_{buf}$
Total delay is minimized when wire delay and buffer delay are equal.
Optimal delay grows linearly with length: $\delta_{total} \approx l \sqrt{2 r_w c_w \delta_{buf}}$
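The optimum follows from setting the derivative of the total delay with respect to n to zero, which gives n = l sqrt(r_w c_w / (2 delta_buf)). The sketch below checks this numerically, treating n as a real number; the function names and all parameter values are hypothetical.

```python
import math


def total_delay(r_w, c_w, length, n, d_buf):
    """Total delay of a wire split into n buffered segments."""
    return r_w * c_w * length ** 2 / (2 * n) + n * d_buf


def optimal_segments(r_w, c_w, length, d_buf):
    """Segment count (real-valued) that minimizes total_delay."""
    return length * math.sqrt(r_w * c_w / (2 * d_buf))


R_W = 1e5      # ohm per metre (hypothetical)
C_W = 2e-10    # farad per metre (hypothetical)
D_BUF = 50e-12 # buffer delay in seconds (hypothetical)
LENGTH = 10e-3 # wire length in metres (hypothetical)

n_opt = optimal_segments(R_W, C_W, LENGTH, D_BUF)
wire_part = R_W * C_W * LENGTH ** 2 / (2 * n_opt)  # summed wire delay
buffer_part = n_opt * D_BUF                        # summed buffer delay
best = total_delay(R_W, C_W, LENGTH, n_opt, D_BUF)
```

At the optimum the summed wire delay equals the summed buffer delay, and the total matches l sqrt(2 r_w c_w delta_buf). In practice n must be rounded to an integer, which this sketch ignores.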

Pipelined Interconnect
[Figure: a chain of latches (D, Q, en) enabled alternately by Φ1, Φ2, Φ1.]
- Total delay for a long wire can be greater than a clock period.
- Pipelining allows high throughput even with long total delay.
- Latches add extra overhead because they have larger delays than inverters.
- Handshaking alternatives are considered later.

Surfing Interconnect
[Figure: a data path and a timing path, with an edge-to-pulse converter (e2p) at each of three stages.]
- Variable-delay inverters in the data path provide surfing.
- A separate data wave surfs on each edge of the timing signal. This reduces the speed at which the timing channel must operate by a factor of two (compared with level-sensitive signaling).
- The edge-to-pulse converter provides a pulse on fast for each edge of the timing signal.

The Surfing Buffer
[Figure: surfing buffer circuit with in and out terminals.]
- Added drive when fast is asserted reduces delay.
- The circuit is fully static, with no extra short-circuit currents or charge sharing.
- Unlike other surfing circuits, this buffer does not achieve negative overhead. Thus, we also refer to it as a soft latch.

The Edge-To-Pulse Converter
[Figure: a rising edge detector and a falling edge detector, with outputs labeled risingedge and fallingedge.]
- Separate edge detectors for rising and falling edges.
- Each edge detector is self-resetting and outputs a pulse in response to the appropriate edge.
- The pulses from the two edge detectors are combined with self-resetting NOR gates.
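The behavior described above can be sketched as a discrete-time model: one pulse per edge of the timing signal, regardless of edge direction. This is a behavioral abstraction only, not the circuit. The function name `edge_to_pulse`, the sampled representation, and the pulse `width` are assumptions made for illustration.

```python
def edge_to_pulse(timing, width=2):
    """Return a pulse train that is high for `width` samples after each edge."""
    pulses = [0] * len(timing)
    remaining = 0  # samples left in the current pulse
    for i in range(1, len(timing)):
        rising = timing[i - 1] == 0 and timing[i] == 1
        falling = timing[i - 1] == 1 and timing[i] == 0
        if rising or falling:  # the two detector outputs are merged
            remaining = width
        if remaining > 0:
            pulses[i] = 1
            remaining -= 1  # the pulse ends on its own (self-resetting)
    return pulses


timing = [0, 0, 1, 1, 1, 1, 0, 0, 0, 0]
pulses = edge_to_pulse(timing)
```

Here the single rising edge and the single falling edge in `timing` each produce one two-sample pulse, so the pulse train runs at twice the frequency of the timing signal.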

Comparison: Framework
Compare surfing with existing interconnect techniques:
- Synchronous: two-phase, time-borrowing, transparent latches
- Micropipeline
- GasP with twin-control
Criteria:
- Velocity vs. throughput:
  - Velocity is distance traversed divided by latency.
  - Velocity decreases with increasing throughput because of increasing overhead for more latches.
- Energy-weighted velocity vs. throughput
Methodology:
- Optimize each approach for the given metric using Elmore delay models (logical effort with wire delays).
- Assume a wide data bus; thus control energy is dominated by datapath energy.
- Model parameters based on the TSMC 0.18 µm bulk CMOS process.

Design Margins
- Surfing: 30% difference between fast and slow delays for the datapath. Control path delay set to the midpoint.
- Handshaking: 30% timing margin between data path and control path.
- Synchronous: Assume parts are graded by speed. Thus, we only consider typical process parameters here.
- Typically, a synchronous design must achieve its target clock frequency over all temperature and voltage conditions. Thus, we report results for a derated design operating at 1.6 V and 100 °C.
- Temperature and power supply voltage change slowly enough that an asynchronous design can benefit from average-case conditions. We report results for surfing and the asynchronous designs operating at 1.8 V and 25 °C.
- We include a non-derated synchronous design operating under the same conditions for the sake of comparison.

Velocity
[Figure: velocity (m/s) versus f (GHz) for Surfing, Two Phase, GasP twin control, Micropipeline, and Two Phase derated. Higher is better.]

Energy Weighted Velocity
[Figure: energy delay product (J*s/m^2) versus f (GHz) for Surfing, Two Phase, GasP twin control, Micropipeline, and Two Phase derated. Lower is better.]

Robustness
- Verified correct operation with five-corner HSPICE simulations.
- Verified correct operation with 0.4 V peak-to-peak $V_{dd}$ noise in HSPICE simulations.
- The clock forwarding chain is the weakest link. Very long chains will drop clock edges due to drafting-induced jitter amplification.

The Edge To Pulse Converter, Timing
[Figure: timing diagram of the edge-to-pulse converter, built up step by step, with transitions labeled a through j. Two action sequences are listed: a b c d e f g, and a b h i j.]

The Event Attractor In Action
[Figure: two panels, "Delay Variation Without Surfing" and "Delay Variation With Surfing". Each plots the delay from a data edge to its corresponding edge (ns) against stage number (0 to 20), with curves for rising edge max, rising edge min, falling edge max, and falling edge min. Annotation: "Fail after the 15th stage".]

Conclusions and Future Work
What We've Shown
- Surfing works for interconnect:
  - Better performance than pure asynchronous approaches.
  - Competitive with the best synchronous approach.
  - Outperforms synchronous derated for $V_{dd}$ and temperature variation.
- A simple, fully static surfing buffer with no short-circuit currents.
- Clear reduction in timing variation.
Future Work
- Use with low voltage swing: negative overhead(?)
- Use surfing techniques for clock forwarding, not just data.
- Examine crosstalk and other signal integrity issues.