Radix-64 Floating-Point Divider

Radix-64 Floating-Point Divider
Javier D. Bruguera
ARITH25, June 25-27, 2018
2018 Arm Limited

Overview
- Main features
  - General architecture
  - Performance
- Radix-64 digit-recurrence division
  - Overlapping of three radix-4 iterations
- Microarchitecture
  - Pre-processing
  - Digit iteration
  - Digit selection
  - Next remainder calculation
- Evaluation and comparison

Main Features

Operands and Result
- Floating-point division, $q = x/d$
- Normalized operands, $x, d \in [1,2)$, although subnormal operands are accepted → operand normalization before the digit iterations
- To simplify the rounding, the result is forced to be in $q \in [1,2)$
  - If the result is $q \in (0.5,2)$ → rounding needs a guard bit and a round bit, and the result can need a 1-bit left shift:
    $q \in (0.5,1)$ if $x < d$; $q \in [1,2)$ if $x \ge d$
  - Early detection of $x < d$ → $q = 2x/d$, $q \in [1,2)$
  - Same mantissa as in $x/d$, but the exponent needs to be decremented
- Support for double, single and half-precisions
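The result-range rule above can be sketched in a few lines. This is only an illustration on exact rationals, not the hardware datapath; the function name `divide_significands` and the returned exponent adjustment are hypothetical names introduced here.

```python
from fractions import Fraction

def divide_significands(x, d):
    """Return (q, exp_adjust) with q in [1, 2) and x/d == q * 2**exp_adjust.

    Hypothetical helper: x and d are significands in [1, 2).
    """
    x, d = Fraction(x), Fraction(d)
    if not (1 <= x < 2 and 1 <= d < 2):
        raise ValueError("significands must lie in [1, 2)")
    if x < d:
        # x/d would fall in (0.5, 1): shift the dividend left by one bit
        # and compensate by decrementing the exponent.
        return (2 * x) / d, -1
    return x / d, 0

q, adj = divide_significands(Fraction(5, 4), Fraction(3, 2))
print(q, adj)  # quotient in [1, 2) and the exponent adjustment
```

Because the quotient is already in [1, 2), no post-normalization shift is needed before rounding.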

Digit-Recurrence Division Algorithm
- Radix-64, overlapping three radix-4 iterations: 6 bits of the result are obtained every cycle (6 bits/cycle)
  - Each radix-4 iteration gives 2 bits of the result
- Each radix-4 iteration:
  1. Quotient digit selection, digits {−2, −1, 0, +1, +2}
  2. Remainder update
- Operands pre-scaling to have a simple quotient-digit selection function
  - If the divisor is close enough to 1, the digit selection is independent of the divisor and depends only on the remainder
  - Dividend scaled as well to preserve the result
- The first quotient digit (integer digit) can take values {+1, +2} → simplified selection logic
  - In parallel with the pre-scaling

Early Termination Mode
- Occurs when
  - Any of the operands is NaN, ∞, 0
  - Division by a power of 2
- The result is not obtained in the digit calculation iterations
- Shorter latency
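A minimal software sketch of the early-termination conditions follows. It assumes the special operand values are NaN, infinity and zero, and it tests for a power-of-2 divisor with `math.frexp`; the function name `takes_early_exit` is hypothetical, and the actual hardware detection logic is not described in the slides.

```python
import math

def takes_early_exit(x: float, d: float) -> bool:
    """Hypothetical check: True if x/d can skip the digit iterations."""
    for v in (x, d):
        # Special operands: NaN, infinity, or zero (assumed list).
        if math.isnan(v) or math.isinf(v) or v == 0.0:
            return True
    # frexp returns a mantissa in [0.5, 1); exactly 0.5 means |d| is a power of 2.
    mantissa, _ = math.frexp(abs(d))
    return mantissa == 0.5

print(takes_early_exit(7.0, 0.25))
print(takes_early_exit(5.0, 6.0))
```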

Latency
- Number of result bits: 1 integer bit + n fractional bits
  - Integer bit is obtained in parallel with the pre-scaling
  - Fractional bits include the guard bit, n = 53 (DP), 24 (SP), 11 (HP)
- Number of digit cycles is $\lceil n/6 \rceil$ (6 bits/cycle)
  Half-precision:   2 digit cycles -> 6 radix-4 iterations -> 12 fractional bits
  Single-precision: 4 digit cycles -> 12 radix-4 iterations -> 24 fractional bits
  Double-precision: 9 digit cycles -> 27 radix-4 iterations -> 54 fractional bits
- Latency for normal operation (no subnormals, result normalized)
  Half-precision:   PSC DGT DGT RND
  Single-precision: PSC DGT DGT DGT DGT RND
  Double-precision: PSC DGT DGT DGT DGT DGT DGT DGT DGT DGT RND
- Obtaining the first digit in parallel with the pre-scaling and forcing the result into [1,2) together save 1 cycle of latency

Digit-Recurrence Division

Radix-r Digit-Recurrence Division
- Iterative algorithm
- Iteration i computes:
  - a radix-r quotient digit, $q_{i+1}$
  - a remainder, $rem[i+1]$
  - The remainder is used to get the next quotient digit
- Partial quotient before iteration i: $Q[i] = \sum_{j=0}^{i} q_j r^{-j}$
- At iteration i:
  $q_{i+1} = \mathrm{SEL}(r \cdot \widehat{rem}[i])$
  $rem[i+1] = r \cdot rem[i] - q_{i+1} d$
- Usually, $rem[i]$ is in redundant carry-save or signed-digit representation
- $\widehat{rem}[i]$ is an estimate of $rem[i]$ with few bits (6 bits in radix 4)
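The recurrence can be exercised in software with exact rationals. The sketch below is a plain radix-4 digit recurrence with digits in {−2, ..., +2}; its selection function (nearest integer to 4·rem/d, clamped) is an illustrative stand-in that does depend on the divisor, unlike the pre-scaled selection described in the slides. The name `radix4_divide` and the threshold used for the integer digit are assumptions made for this example.

```python
from fractions import Fraction
from math import floor

def radix4_divide(x, d, iterations):
    """Approximate x/d for x, d in [1, 2), with the result forced into [1, 2)."""
    x, d = Fraction(x), Fraction(d)
    if x < d:
        x *= 2                        # force the quotient into [1, 2)
    # Integer digit in {+1, +2}, chosen so that |rem| <= (2/3) * d.
    q0 = 1 if 3 * x < 5 * d else 2
    quotient = Fraction(q0)
    rem = x - q0 * d
    for j in range(1, iterations + 1):
        shifted = 4 * rem
        # Illustrative selection: nearest integer to shifted/d, clamped to [-2, 2].
        digit = max(-2, min(2, floor(shifted / d + Fraction(1, 2))))
        rem = shifted - digit * d     # rem[i+1] = 4*rem[i] - q[i+1]*d
        quotient += Fraction(digit, 4 ** j)
    return quotient, rem

q, rem = radix4_divide(Fraction(5, 4), Fraction(3, 2), 12)
print(float(q))
```

After k iterations the invariant x = Q·d + rem·4^(−k) holds exactly, so the quotient error is bounded by (2/3)·4^(−k).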

Number of Iterations and Cycles
- Radix-4 iterations
  - Number of iterations is $it = \lceil n / \log_2 4 \rceil = \lceil n/2 \rceil$
  - For example, double-precision: $n = 53$, $it = 27$
- Radix-64 division, three radix-4 iterations per cycle
  - Number of cycles for normal division is $cycles = \lceil it/3 \rceil + 2$
  - 1 pre-scaling cycle, $\lceil it/3 \rceil$ digit cycles, 1 rounding cycle
  - For example, double-precision: $it = 27$, $cycles = 9 + 2 = 11$
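As a quick check of the cycle count, the two formulas can be evaluated for the three precisions. The function names are hypothetical; n is the number of fractional result bits including the guard bit (53, 24 and 11 for double, single and half precision).

```python
def radix4_iterations(n: int) -> int:
    """Number of radix-4 iterations for n fractional bits: ceil(n / 2)."""
    return -(-n // 2)

def division_cycles(n: int) -> int:
    """Pre-scaling cycle + ceil(it / 3) digit cycles + rounding cycle."""
    return -(-radix4_iterations(n) // 3) + 2

for name, n in (("double", 53), ("single", 24), ("half", 11)):
    print(name, radix4_iterations(n), division_cycles(n))
```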

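Radix-64 Divider: Naive Implementation
[Figure: three radix-4 iterations chained in series within one cycle. Each stage selects a quotient digit from the MSBs of 4 × rem[i] and computes $rem[i+1] = 4 \cdot rem[i] - q[i+1] \cdot d$, producing rem[i+1], rem[i+2], rem[i+3] and the digits up to q[i+3].]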

Radix-64 Divider Microarchitecture

Overlapping Radix-4 Iterations with Speculation/Replication
[Figure, shown in successive builds: three overlapped radix-4 iterations within one cycle. Labels: MSBs of 4 × rem[i]; rem[i], rem[i+1], rem[i+2], rem[i+3]; q[i+3].]

Radix-64 Iteration
[Figure, shown in successive builds that highlight digit selection and remainder update: datapath of one radix-64 iteration, split into a digit selection part and a remainder calculation part. Digit selection works on the 6 MSBs of 4 × rem[i]; later digits are prepared after 2-bit left shifts using 7-bit adders on the 7 MSBs of −q·d for the five candidates (−2d̂, −d̂, 0, d̂, 2d̂). Remainder calculation forms 4·rem[i] − q·d, 4·rem[i+1] − q·d and 4·rem[i+2] − q·d. Outputs: q[i+3] and rem[i+3].]
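The idea of overlapping iterations by speculation can be illustrated in software: the selection for the next digit is evaluated for every candidate value of the current digit, and the right result is picked once the current digit is known. The sketch below is only a behavioural model under assumptions; the selection function is an illustrative one on exact rationals, and the names `select`, `update`, `radix64_step_sequential` and `radix64_step_speculative` are hypothetical.

```python
from fractions import Fraction
from math import floor

DIGITS = (-2, -1, 0, 1, 2)

def select(rem, d):
    """Illustrative radix-4 selection: nearest digit to 4*rem/d, clamped."""
    return max(-2, min(2, floor(4 * rem / d + Fraction(1, 2))))

def update(rem, digit, d):
    """Remainder update: rem[i+1] = 4*rem[i] - q[i+1]*d."""
    return 4 * rem - digit * d

def radix64_step_sequential(rem, d):
    digits = []
    for _ in range(3):
        q = select(rem, d)
        rem = update(rem, q, d)
        digits.append(q)
    return digits, rem

def radix64_step_speculative(rem, d):
    q1 = select(rem, d)
    # Evaluated for every candidate q1; in hardware this runs alongside select(rem).
    spec2 = {c: select(update(rem, c, d), d) for c in DIGITS}
    q2 = spec2[q1]                       # late selection (a mux in hardware)
    rem1 = update(rem, q1, d)
    spec3 = {c: select(update(rem1, c, d), d) for c in DIGITS}
    q3 = spec3[q2]
    rem3 = update(update(rem1, q2, d), q3, d)
    return [q1, q2, q3], rem3

print(radix64_step_speculative(Fraction(1, 4), Fraction(3, 2)))
```

Both functions return the same three digits and the same remainder; the speculative version trades replicated selection logic for a shorter dependency chain.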

First Cycle: Operands Pre-Scaling
1. Scaling of divisor and dividend
   - Divisor close to 1 to have a simpler digit selection function
   - Digit selection function depends only on the remainder; it does not depend on the divisor
   - Dividend is scaled as well to preserve the result
2. Operand comparison
   - If $x < d$ the dividend is left-shifted by 1 bit, $q = 2x/d$
   - Result in [1,2)
   - Makes rounding easier: only a guard bit is needed, there is no round bit
3. Integer quotient-digit calculation
   - Result is in [1,2), so the integer digit is in {+1, +2}
   - Simplified selection function
   - Replicated for both outcomes of the operand comparison
4. Initial remainder
   - The scaled divisor and scaled dividend are used for the initial redundant remainder
   - Includes a 1-bit left shift if $x < d$

First Cycle: Operands Pre-Scaling
[Figure, shown in successive builds: pre-scaling datapath. Inputs: dividend and divisor. Blocks: subtractor (SUB) for the operand comparison, adders forming the redundant scaled dividend and redundant scaled divisor, two replicated quotient digit selection blocks. Outputs: scaled divisor, q_1 and rem[1].]

Annotations highlighted in the builds:
- Operand pre-scaling: $M \cdot d \in [1 - 1/64,\ 1 + 1/8]$, with the scaling factor M chosen from the divisor bits 1.xxx:

  xxx   M
  000   1 + 1/2 + 1/2
  001   1 + 1/4 + 1/2
  010   1 + 1/2 + 1/8
  011   1 + 1/2 + 0
  100   1 + 1/4 + 1/8
  101   1 + 1/4 + 0
  110   1 + 0 + 1/8
  111   1 + 0 + 1/8

- Operand comparison: to save time, the non-scaled operands are compared
- Integer quotient digit
- Initial remainder: $rem[1] = M x - q_1 M d$
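The scaling-factor table can be checked numerically. The sketch below makes two loud assumptions: the divisor is read as a fraction in [1/2, 1) (the significand 1.xxx divided by 2), which is the reading under which the products land in [1 − 1/64, 1 + 1/8], and the garbled last table row is taken to be 1 + 0 + 1/8. The names `SCALING` and `scaling_factor` are hypothetical.

```python
from fractions import Fraction as F

# Scaling factor M indexed by the three divisor bits after the leading 1.
# The entry for 0b111 is an assumed repair of a garbled table row.
SCALING = {
    0b000: 1 + F(1, 2) + F(1, 2),
    0b001: 1 + F(1, 4) + F(1, 2),
    0b010: 1 + F(1, 2) + F(1, 8),
    0b011: 1 + F(1, 2) + 0,
    0b100: 1 + F(1, 4) + F(1, 8),
    0b101: 1 + F(1, 4) + 0,
    0b110: 1 + 0 + F(1, 8),
    0b111: 1 + 0 + F(1, 8),
}

def scaling_factor(d):
    """d is the divisor read as a fraction in [1/2, 1) (assumption)."""
    index = int(F(d) * 16) - 8      # the three bits after the leading 1
    return SCALING[index]

LOW, HIGH = 1 - F(1, 64), 1 + F(1, 8)
for k, m in SCALING.items():
    lo, hi = F(8 + k, 16), F(9 + k, 16)   # divisor interval for this index
    print(format(k, "03b"), m, LOW <= m * lo and m * hi <= HIGH)
```

Each row prints True, so under these assumptions every scaled divisor stays within the stated interval around 1.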

Evaluation

Evaluation and Comparison
- Evaluation
  - Latency
  - Area
- Comparison with other recent processors

  AMD K7               multiplicative division algorithm
  AMD Jaguar           multiplicative division algorithm
  IBM z13              radix-4 digit-recurrence division algorithm
  HAL Sparc            multiplicative division algorithm
  Intel 2018 (*lake)   radix-1024 digit-recurrence division algorithm

  * No information about the microarchitecture, just some notes with the radix and SP/DP latencies

Latency (cycles)

                                              Double     Single     Half
                                              precision  precision  precision
  Regular input, normalized result              11          6         4
  Regular input, subnormal result               12          7         5
  One subnormal input, normalized result        13          8         6
  One subnormal input, subnormal result         14          9         7
  Two subnormal inputs, normalized result       14          9         7

Example: single precision
  Regular input, normalized result:        PSC DGT DGT DGT DGT RND1
  Regular input, subnormal result:         PSC DGT DGT DGT DGT RND1 RND2
  1 subnormal input, normalized result:    UNP NM PSC DGT DGT DGT DGT RND1
  1 subnormal input, subnormal result:     UNP NM PSC DGT DGT DGT DGT RND1 RND2
  2 subnormal inputs, normalized result:   UNP NM NM PSC DGT DGT DGT DGT RND1

PSC -> pre-scaling, UNP -> unpacking, DGT -> digit iteration, NM -> normalization, RND1,2 -> rounding
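The latency figures follow from how the pipeline stages compose. The sketch below rebuilds the stage sequence for a given case, assuming one UNP cycle plus one NM cycle per subnormal input, and a second rounding cycle for a subnormal result; `DIGIT_CYCLES` and `stage_sequence` are hypothetical names, and the double-precision count of 9 digit cycles comes from 27 radix-4 iterations at three per cycle.

```python
DIGIT_CYCLES = {"half": 2, "single": 4, "double": 9}

def stage_sequence(precision, subnormal_inputs=0, subnormal_result=False):
    """Return the list of pipeline stages for one division (illustrative model)."""
    stages = []
    if subnormal_inputs:
        stages.append("UNP")                    # unpacking
        stages += ["NM"] * subnormal_inputs     # one normalization cycle per input
    stages.append("PSC")                        # pre-scaling
    stages += ["DGT"] * DIGIT_CYCLES[precision]
    stages.append("RND1")
    if subnormal_result:
        stages.append("RND2")                   # extra rounding cycle
    return stages

for precision in ("double", "single", "half"):
    print(precision, len(stage_sequence(precision)),
          len(stage_sequence(precision, subnormal_inputs=2)))
```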

Latency - Comparison

                       Algorithm        Half-precision  Single-precision  Double-precision
  AMD K7               multiplicative        N/A              16                20
  AMD Jaguar           multiplicative        N/A              14                19
  IBM z13              Radix-4               N/A              23                37
  HAL Sparc            multiplicative        N/A              16                19
  Intel 2018 (*lake)   Radix-1024            N/A              10                13
  ARM                  Radix-64               4                6                11

- Latencies include pre-processing (unpacking, pre-scaling), iteration cycles, and post-processing (rounding), assuming normalized input/output
- Dividers based on multiplicative algorithms:
  - Latency limited by the latency of multiplication or multiply-and-accumulate
  - Can be significant
- Radix-1024 divider (10 bits/cycle)
  - Pre-processing (probably) needs several cycles

Area
- Large area
- Radix-64 iteration
  - fifteen 58-bit CSAs (five 58-bit CSAs per radix-4 iteration)
  - five 9-bit adders
  - five 7-bit adders
  - Selection of three quotient digits
  - Muxes
- Pre-scaling
  - Three 58-bit adders
  - Two 58-bit CSAs
  - Reduced selection logic
  - Muxes
- Rounding, Normalization

                        DGT   Pre-proc   Total
  58-bit adders          --      3         3
  Small adders           10      2        12
  58-bit CSAs            15      2        17
  Selection logic         3      2         5
  58-bit 4-to-1 mux       6     --         6
  58-bit 2-to-1 mux      --      6         6
  Narrow muxes            2      1         3

Area - Comparison
- Multiplicative algorithms
  - Modest area, smaller than in the radix-64 divider
  - Reusing the existing FP multipliers
  - Look-up table for the initial seed
  - Muxes
- Radix-4 algorithm
  - Redundant remainder: 116-bit sum word, 28-bit carry word
  - The 6 most-significant bits of the remainder are non-redundant
  - Area: 3-to-2 CSA, 2 small CPAs, digit selection table
- Radix-1024 algorithm
  - Large area (probably)
  - Pre-scaling: $1/d$ approximation, 2 multipliers
  - Iteration: small adder for digit selection, rectangular multiplier for remainder update

Conclusions

Conclusions
- Radix-64 floating-point divider
  - 6 bits/cycle
  - Overlapping of three radix-4 iterations
  - Pre-scaling to have a simple selection function
  - Result in [1,2)
  - First iteration in parallel with the pre-scaling
- Low-latency FP divider
  - 11, 6, and 4 cycles for double, single, and half-precision
  - Additional cycles in case of subnormal input/output
  - Lower latency than other implementations, although the area is larger
- Integer division could be easily integrated
- Shared logic with floating-point square root

Thank You
Danke, Merci, Gracias, Kiitos, 감사합니다, धन्यवाद, תודה

Floating-Point Division
- Iterative digit-recurrence algorithm
  - In a radix-r algorithm each iteration computes a radix-r digit of the quotient
  - A radix-r digit represents $\log_2 r$ bits of the quotient
  - The number of iterations depends on the result precision and on the radix
- Several cycles
  - Unpacking
  - Pre-scaling
  - Normalization (1 cycle per subnormal input)
  - Digit calculation (several cycles)
  - Rounding (2 rounding cycles if the quotient is subnormal)